Commit Graph

124216 Commits

Author SHA1 Message Date
Navdeep Parhar
ac4031d805 cxgbe(4): Enable support for per-connection rate limiting in the default
firmware configuration files.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2018-09-26 21:16:07 +00:00
Sean Eric Fagan
a7fcb1afcb Add per-session locking to cryptosoft (swcr).
As part of ZFS Crypto, I started getting a series of panics when I did not
have AESNI loaded.  Adding locking fixed it, and I concluded that the
Reinit function altered the AES key schedule.  This locking is not as
fine-grained as it could be (AESNI uses per-cpu locking), but
it's minimally invasive.

Sponsored by: iXsystems Inc
Reviewed by: cem, mav
Approved by: re (gjb), mav (mentor)
Differential Revision: https://reviews.freebsd.org/D17307
2018-09-26 20:23:12 +00:00
Warner Losh
b28589287b Remove bogus spaces.
Spaces aren't allowed in these strings.

Approved by: re@ (glen)
2018-09-26 19:41:00 +00:00
Warner Losh
0dc34160f3 Add PNP info to PCI attachments of cbb, cxgb, ida, iwn, ixl, ixlv,
mfi, mps, mpr, mvs, my, oce, pcn, ral, rl. This only labels existing
pci device tables, and has no probe / attach code changes.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Approved by: re (glen)
2018-09-26 17:12:30 +00:00
Warner Losh
329e817fcc Reapply, with minor tweaks, r338025, from the original commit:
Remove unused and easy to misuse PNP macro parameter

Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely.  The 'table' parameter is now required to
have correct pointer (or array) type.  Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.

Mostly done with the coccinelle 'spatch' tool:

  $ cat modpnpsize0.cocci
    @normaltables@
    identifier b,c;
    expression a,d,e;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,d,
    -sizeof(d[0]),
     e);

    @singletons@
    identifier b,c,d;
    expression a;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,&d,
    -sizeof(d),
     1);

  $ rg -l MODULE_PNP_INFO -- sys | \
    xargs spatch --in-place --sp-file modpnpsize0.cocci

(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not.  So I had to link gdiff into
PATH as diff to use spatch.)

Tinderbox'd (-DMAKE_JUST_KERNELS).
Approved by: re (glen)
2018-09-26 17:12:14 +00:00
Andrey V. Elsukov
0ddfd867ed Fix witness warning in xform_init().
Do not call crypto_newsession() while holding xforms_lock mutex.
Release mutex before invoking crypto_newsession(), and use
ipsec_kmod_enter()/ipsec_kmod_exit() functions to protect from doing
access to unloaded kernel module memory.

Move xform-releated functions into subr_ipsec.c to be able use
ipsec_kmod_* functions. Also unconditionally build ipsec_kmod_*
functions, since now they are always used by IPSec code.

Add xf_cntr field to struct xformsw, it is used by ipsec_kmod_*
functions. Also constify xf_name field, since it is not expected to be
modified.

Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17302
2018-09-26 14:47:51 +00:00
Slava Shwartsman
a4ea412d4f Add PCIV_INVALID definition
From PCI Spec rev 2.2, 6.2.1. Device Identification:
Vendor ID This field identifies the manufacturer of the device. Valid
vendor identifiers are allocated by the PCI SIG to ensure uniqueness.
0FFFFh is an invalid value for Vendor ID.

MFC after:      3 days
Approved by:    re (Glen), hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-09-26 13:16:55 +00:00
Michael Tuexen
0277ec9c43 Whitespace changes and fixing a typo. No functional change.
Approved by:	re (kib@)
MFC after:	1 week
2018-09-26 10:24:50 +00:00
Navdeep Parhar
cb8dedc76e cxgbe(4): Treat base/end of firmware parameters as signed integers when
figuring out whether the range is valid or not.

Approved by:	re@ (rgrimes@)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-09-26 02:27:37 +00:00
Konstantin Belousov
05e1cca97a Fix some uses of dmaplimit.
dmaplimit is the first byte after the end of DMAP.

Reported by:	"Johnson, Archna" <Archna.Johnson@netapp.com>
Reviewed by:	alc, markj
Approved by:	re (gjb)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17318
2018-09-25 20:07:58 +00:00
Konstantin Belousov
cbe100dfca Fix an issue in r338862.
For pmap_invalidate_all_pcid(), only reset pm_gen for non-kernel
pmaps, as it was done before the conversion to ifuncs.  The reset is
useless but innocent for kernel_pmap. Coverity reported that cpuid is
used uninitialized in this case.

Reported by:	cem
Reviewed by:	alc, cem, markj
CID:	 1395807
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17314
2018-09-25 18:24:25 +00:00
Mateusz Guzik
af534f8d99 zfs: depessimize zfs_root with rmlocks
Currently vfs calls the root method on each absolute lookup and when
crossing mount points.

zfs_root ends up looking up the inode internally as if it was not
instantianted which results in significant lock contention on systems
like EPYC.

Store the vnode in the mount point and protect the access with rmlocks.
This is a temporary hack for 12.0.

Sample result:

before:
make -s -j 128 buildkernel 2778.09s user 3319.45s system 8370% cpu 1:12.85 total

after:
make -s -j 128 buildkernel 3199.57s user 1772.78s system 8232% cpu 1:00.40 total

Tested by:	pho (zfs mount/unmount tests)
Reviewed by:	kib, mav, sef (different parts)
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17233
2018-09-25 17:58:06 +00:00
Navdeep Parhar
ea710848dc cxgbe(4): Link related changes.
- Switch to using 32b port/link capabilities in the driver.  The 32b
  format is used internally by firmwares > 1.16.45.0 and the driver will
  now interact with the firmware in its native format, whether it's 16b
  or 32b.  Note that the 16b format doesn't have room for 50G, 200G, or
  400G speeds.

- Add a bit in the pause_settings knobs to allow negotiated PAUSE
  settings to override manual settings.

- Ensure that manual link settings persist across an administrative
  down/up as well as transceiver unplug/replug.

- Remove unused is_*G_port() functions.

Approved by:	re@ (gjb@)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-09-25 05:52:42 +00:00
Justin Hibbits
fd8cf3be5f powerpc: Blacklist the top 64kB range of the lower 4GB PA space
The PHB4 host bridge used by the POWER9 uses a 64kB range in 32-bit
space at the address 0xffff0000-0xffffffff.  Reserve this range so that
DMA memory cannot be allocated within this range.  This fixes seemingly
random crashes on a POWER9 system.  Ideally this range will have been
reserved by the firmware, but as of now this is not the case.

Submitted by:	git_bdragon.rtk0.net
Reviewed by:	nwhitehorn
Approved by:	re(kib)
Differential Revision:	https://reviews.freebsd.org/D17183
2018-09-25 02:34:28 +00:00
Colin Percival
189bd7a978 Recognize the Amazon PCI serial device found in i3.metal EC2 instances
as an NS8250 UART.

Reviewed by:	sbruno, imp
Approved by:	re (delphij)
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D17250
2018-09-24 22:15:04 +00:00
Mark Johnston
463406ac4a Add more NUMA-specific low memory predicates.
Use these predicates instead of inline references to vm_min_domains.
Also add a global all_domains set, akin to all_cpus.

Reviewed by:	alc, jeff, kib
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17278
2018-09-24 19:24:17 +00:00
John Baldwin
b65a9568f8 Restore the API of the kf_sa_local and kf_sa_peer members.
In 11.x and earlier these were accessible as direct members of 'struct
kinfo_file'.  Existing code already knows about the new location of
these members as well, so wrapper macros did not work for these
fields.  Instead, define an anonymous struct containing the fields
from 'struct kinfo_file' in FreeBSD 11 that were not part of the
'kf_un' union.  This anonymous struct is then placed in an anonymous
union along with the new 'kf_un' union.  This preserves the API of
both structure layouts without requiring any wrapper macros.

PR:		231525
Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17262
2018-09-24 18:20:38 +00:00
John Baldwin
7a102e0463 Implement pmap_sync_icache().
This invokes "fence" on the hart performing the write followed by an IPI
to execute "fence.i" on all harts.

This is required to support userland debuggers setting breakpoints in
user processes.

Reviewed by:	br (earlier version), markj
Approved by:	re (gjb)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17139
2018-09-24 17:41:29 +00:00
Alexander Motin
cf4a52cf67 Fix use-after-free in RAID0 error reporting of GEOM_RAID.
PR:		231510
Submitted by:	yangx92@hotmail.com
Approved by:	re (gjb)
MFC after:	1 week
2018-09-24 16:58:55 +00:00
Alan Cox
f5fbe90de4 Passing UMA_ZONE_NOFREE to uma_zcreate() for swpctrie_zone and swblk_zone is
redundant, because uma_zone_reserve_kva() is performed on both zones and it
sets this same flag on the zone.  (Moreover, the implementation of the swap
pager does not itself require these zones to be UMA_ZONE_NOFREE.)

Reviewed by:	kib, markj
Approved by:	re (gjb)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D17296
2018-09-24 16:49:02 +00:00
Mark Johnston
3d14a7bb43 Ensure that "domain" is initialized when vm_ndomains == 1.
Reported by:	alc
Approved by:	re (gjb)
2018-09-24 15:32:46 +00:00
Mateusz Guzik
9afff6b1c0 Eliminate false sharing in malloc due to statistic collection
Currently stats are collected in a MAXCPU-sized array which is not
aligned and suffers enormous false-sharing. Fix the problem by
utilizing per-cpu allocation.

The counter(9) API is not used here as it is too incomplete and does
not provide a win over per-cpu zone sized for malloc stats struct. In
particular stats are being reported for each cpu separately by just
copying what is supposed to be an array element for given cpu.

This eliminates significant false-sharing during malloc-heavy tests
e.g. on Skylake. See the review for details.

Reviewed by:	markj
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17289
2018-09-23 19:00:06 +00:00
Michael Tuexen
078a49a077 Remove the unused parameter 'locked' from the function
syncache_respond(). There is no functional change. The
parameter became unused in r313330, but wasn't removed.

Approved by:		re (kib@)
MFC after:		1 month
Sponsored by:		Netflix, Inc.
2018-09-23 16:37:32 +00:00
Konstantin Belousov
b7befdf509 Correct panic messages.
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
Approved by:	re (rgrimes)
MFC after:	1 week
2018-09-22 17:05:49 +00:00
Konstantin Belousov
108ff63e8a Further reorganize pmap_invalidate TLB code.
Split calculation of mask for shootdown IPI and local
invalidation. Reorder IPI before local.

Suggested by:	alc
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D17277
2018-09-22 17:04:39 +00:00
Mateusz Guzik
334107b91b vfs: __predict common case in VFS_EPILOGUE/PROLOGUE
NFS is the only in-tree filesystem using the feature, but all ops test
for it.

Currently the resulting sigdefer calls have to be jumped over in the
common case.

This is a bandaid, longer term fix will move this feature away.

Approved by:	re (kib)
2018-09-22 11:39:30 +00:00
Navdeep Parhar
061bbaf7e7 cxgbe(4): Reuse existing "switching" L2T entries when possible.
Approved by:	re@ (rgrimes@)
Sponsored by:	Chelsio Communications
2018-09-22 01:24:30 +00:00
Alexander Motin
ae1b0b825a MFV r338866: 9700 ZFS resilvered mirror does not balance reads
illumos/illumos-gate@82f63c3c2b

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author:     Jerry Jelinek <jerry.jelinek@joyent.com>

Approved by:	re (delphij)
2018-09-21 21:56:00 +00:00
Mark Johnston
6cdde6fda2 Use the GNU as-compatible .endm instead of .endmacro.
Approved by:	re (gjb)
2018-09-21 20:20:03 +00:00
Konstantin Belousov
d995b5b1ea Convert x86 TLB top-level invalidation functions to ifuncs.
Note that shootdown IPI handlers are already per-mode.

Suggested by:	alc
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17184
2018-09-21 17:53:06 +00:00
Mateusz Guzik
68c7542edf amd64: even up copyin/copyout with memcpy + other cleanup
- _fault handlers for both primitives are identical, provide just one
- change the copying scheme to match memcpy (in particular jump
avoidance for the most common case of multiply of 8)
- stop re-reading pcb address on exit, just store it locally (in r9)

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17265
2018-09-21 15:00:46 +00:00
Andrey V. Elsukov
50f5c94edb Fix possible NULL pointer dereference in ffec_alloc_mbufcl().
PR:		231514
Approved by:	re (kib)
MFC after:	1 week
2018-09-21 13:44:05 +00:00
Ed Maste
f789d9839a Include kernel ident in uname
In non-reproducible mode we have the kernel ident as a side effect of
including the build directory.  Explicitly add it to the ident string in
reproducible mode.

Reported by:	mjg
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
2018-09-21 13:43:06 +00:00
Mateusz Guzik
d6fda03a64 select: stop doing zero-sized memsets
Approved by:	re (kib)
2018-09-21 13:20:41 +00:00
Ed Maste
18c00da806 remove double space between branch and version in kernel ident
Reported by:	dim
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-21 13:02:25 +00:00
Mateusz Guzik
bb32268086 amd64: check for small size in memmove, memcpy and memset
If the size is 15 bytes or less avoid spinning up rep just to copy the 8
bytes. In my tests on EPYC and old Intel microarchs without ERMS (like
Westmere) it provided a nice win over the current version (e.g. for EPYC
memset with 15 bytes of size goes from 59712651 ops/s to 70600095) all
while almost not pessimizing the other cases.

Data collected during package building shows that < 16 sizes are pretty
common.

Verified with the glibc test suite.

Approved by:	re (kib)
2018-09-21 12:27:36 +00:00
Matt Macy
b08d611de8 fix vlan locking to permit sx acquisition in ioctl calls
- update vlan(9) to handle changes earlier this year in multicast locking

Tested by: np@, darkfiberu at gmail.com

PR:	230510
Reviewed by:	mjoras@, shurd@, sbruno@
Approved by:	re (gjb@)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D16808
2018-09-21 01:37:08 +00:00
Glen Barber
a128aaea95 Update head from ALPHA6 to ALPHA7 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-20 23:59:42 +00:00
Mateusz Guzik
5254f065ab amd64: macroify copyin/copyout and provide erms variants, follow up
Fix a fat-fingered typo with a "funny" side-effect: when doing copyin on a
cpu without ERMS and with size being a multiply of 8 a page fault would be
triggered resulting in EFAULT.

Pointy hat: mjg
Approved by:	re (implicit)
2018-09-20 20:32:08 +00:00
Stephen Hurd
861437f83a Add IFCAP_TSO6 for igb
It seems igb supports TSO6, but the capability got lost in
the iflib update. Restore this capability.

PR:		231476
Reported by:	lev
Reviewed by:	erj
Approved by:	re (gjb)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D17242
2018-09-20 20:06:44 +00:00
Andrey V. Elsukov
76b09d1823 Add new field max_hdrsize to struct encap_config.
It is currently unused and reserved for future use to keep KBI/KPI.
Also add several spare pointers to be able extend structure if it
will be needed.

Approved by:	re (gjb)
2018-09-20 19:45:27 +00:00
Stephen Hurd
0c919c2370 Fix capabilities handling for iflib drivers
Various capabilities were not being handled correctly in the
SIOCSIFCAP handler. Specifically:

IFCAP_RXCSUM and IFCAP_RXCSUM_IPV6 could be set even if not supported

It was impossible to disable IFCAP_RXCSUM and/or IFCAP_RXCSUM_IPV6 via
ifconfig since it does ioctl() per command-line flag rather than combine
them into a single call.

IFCAP_VLAN_HWCSUM could not be modified via the ioctl()

Setting any combination of the three IFCAP_WOL flags would set only
IFCAP_WOL_MCAST | IFCAP_WOL_MAGIC. For example, setting only
IFCAP_WOL_UCAST would result in both IFCAP_WOL_MCAST and IFCAP_WOL_MAGIC
being enabled, but IFCAP_WOL_UCAST would not be enabled.

Because if_vlancap() was called before if_togglecapenable(), vlan flags
were sometimes not applied correctly.

Interfaces were being unnecessarily stopped and restarted for WoL

PR:		231151
Submitted by:	Kaho Toshikazu <kaho@elam.kais.kyoto-u.ac.jp>
Reported by:	Shirkdog <mshirk@daemon-security.com>
Reviewed by:	galladin
Approved by:	re (gjb)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D17158
2018-09-20 19:35:35 +00:00
Mateusz Guzik
4bf0035fd1 amd64: macroify copyin/copyout and provide erms variants
Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17257
2018-09-20 18:30:17 +00:00
Mark Johnston
969e147aff Ensure that imports into per-domain kmem arenas are KVA_QUANTUM-aligned.
The old code appears to assume that vmem_alloc() would import
size-aligned KVA chunks from the parent kernel_arena, but vmem doesn't
provide this guarantee.

Also remove the unused global RWX arena and add comments explaining why
we have per-domain arenas.

Reported by:	alc
Reviewed by:	alc, kib (previous version)
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17249
2018-09-20 18:29:55 +00:00
Mateusz Guzik
c396945b74 vfs: remove lookup_shared tunable
Reviewed by:	kib, jhb
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17253
2018-09-20 18:25:26 +00:00
Bjoern A. Zeeb
6675bee81a In icmp6_rip6_input(), once we have a lock, make sure the inp is
not freed.  This can happen since the list traversal and locking
was converted to epoch(9).  If the inp is marked "freed", skip it.

This prevents a NULL pointer deref panic in ip6_savecontrol_v4()
trying to access the socket hanging off the inp, which was gone
by the time we got there.

Reported by:	andrew
Tested by:	andrew
Approved by:	re (gjb)
2018-09-20 15:45:53 +00:00
Mark Johnston
25ed23cfbb Change the domain selection policy in kmem_back().
Ensure that pages backing the same virtual large page come from the
same physical domain, as kmem_malloc_domain() does.

PR:		231038
Reviewed by:	alc, kib
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17248
2018-09-20 15:45:12 +00:00
Mateusz Guzik
51e13c93b6 fd: prevent inlining of _fdrop thorough kern_descrip.c
fdrop is used in several places in the file and almost never has to call
_fdrop. Thus inlining it is a pure waste of space.

Approved by:	re (kib)
2018-09-20 13:32:40 +00:00
Mateusz Guzik
a286a3099c amd64: move fusufault after all users
A lot of function have the following check:
        cmpq    %rax,%rdi                       /* verify address is valid */
        ja      fusufault

The label is present earlier in kernel .text, which means this is a jump
backwards. Absent any information in branch predictor, the cpu predicts it
as taken. Since it is almost never taken in practice, this results in a
completely avoidable misprediction.

Move it past all consumers, so that it is predicted as not taken.

Approved by:	re (kib)
2018-09-20 13:29:43 +00:00
John Baldwin
232d0b87e0 Various fixes for floating point on RISC-V.
- Explicitly load an empty initial state into FP registers when taking
  the fault on the first FP instruction in a thread.  Setting
  SSTATE.FS to INITIAL is just a marker to let context switch restore
  code know that it can load FP registers with zeroes instead of
  memory loads.  It does not imply that the hardware will reset all
  registers to zero on first access.  In addition, set the state to
  CLEAN instead of INITIAL after the first FP instruction.
  cpu_switch() doesn't do anything for INITIAL and only restores from
  the pcb if the state is CLEAN.  We could perhaps change cpu_switch
  to call fpe_state_clear if the state was INITIAL and leave SSTATE.FS
  set to INITIAL instead of CLEAN after the first FP instruction.
  However, adding this complexity to cpu_switch() doesn't seem worth
  the supposed gain.
- Only save the current FPU registers in fill_fpregs() if the request
  is made to save the current thread's registers.  Previously if a
  debugger requested FP registers via ptrace() it was getting a copy
  of the debugger's FP registers rather than the debugee's.
- Zero the entire FP register set structure returned for ptrace() if a
  thread hasn't used FP registers rather than leaking garbage in the
  fp_fcsr field.
- If a debugger writes FP registers via ptrace(), always mark the pcb
  as having valid FP registers and set SSTATUS.FS_MASK to CLEAN so
  that the registers will be restored when the debugged thread
  resumes.
- Be more explicit about clearing the SSTATUS.FS field before setting
  it to CLEAN on the first FP instruction trap.

Submitted by:	br, markj
Approved by:	re (rgrimes)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17141
2018-09-19 23:45:18 +00:00
John Baldwin
31ce875385 Clear all of the VFP state in fill_fpregs().
Zero the entire FP register set structure returned for ptrace() if a
thread hasn't used FP registers rather than leaking garbage in the
fp_sr and fp_cr fields.

Reviewed by:	emaste, andrew
Approved by:	re (rgrimes)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17140
2018-09-19 22:53:52 +00:00
Konstantin Belousov
d12c446550 Convert x86 cache invalidation functions to ifuncs.
This simplifies the runtime logic and reduces the number of
runtime-constant branches.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D16736
2018-09-19 19:35:02 +00:00
Mark Johnston
1aed6d48a8 Move kernel vmem arena initialization to vm_kern.c.
This keeps the initialization coupled together with the kmem_* KPI
implementation, which is the main user of these arenas.

No functional change intended.

Reviewed by:	alc
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17247
2018-09-19 19:13:43 +00:00
Bjoern A. Zeeb
997fecb5c2 Update udp6_output() inp locking to avoid concurrency issues with
route cache updates.

Bring over locking changes applied to udp_output() for the route cache
in r297225 and fixed in r306559 which achieve multiple things:
(1) acquire an exclusive inp lock earlier depending on the expected
    conditions; we add a comment explaining this in udp6,
(2) having acquired the exclusive lock earlier eliminates a slight
    possible chance for a race condition which was present in v4 for
    multiple years as well and is now gone, and
(3) only pass the inp_route6 to ip6_output() if we are holding an
    exclusive inp lock, so that possible route cache updates in case
    of routing table generation number changes can happen safely.
In addition this change (as the legacy IP counterpart) decomposes the
tracking of inp and pcbinfo lock and adds extra assertions, that the
two together are acquired correctly.

PR:		230950
Reviewed by:	karels, markj
Approved by:	re (gjb)
Pointyhat to:	bz (for completely missing this bit)
Differential Revision:	https://reviews.freebsd.org/D17230
2018-09-19 18:49:37 +00:00
Konstantin Belousov
ad8edd79cc Convert i386 NPX hardware context save methods to ifuncs.
Since ifunc-capable linker is now required on i386, bring this code in
line with the amd64 counterpart.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D16736
2018-09-19 16:37:43 +00:00
Mateusz Guzik
c035292545 vm: check for empty kstack cache before locking
The current cache logic checks the total number of stacks in the kernel,
which even on small boxes significantly exceeds the 128 limit (e.g. an
8-way box with zfs has almost 800 stacks allocated).

Stacks are cached earlier for each main thread.

As a result the code is rarely executed, but when it is then (on boxes like
the above) it always fails. Since there are no provisions made for NUMA and
release time is approaching, just do a quick check to avoid acquiring the
lock.

Approved by:	re (kib)
2018-09-19 16:02:33 +00:00
Konstantin Belousov
215aa93033 amd64 pmap: remove tautological assert.
pm_pcid is unsigned.

Reviewed by:	cem, markj
CID:	1395727
Noted by:	cem
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D17235
2018-09-19 15:39:16 +00:00
Konstantin Belousov
a6ade1a07b Fix ZFS VFS op quotactl to follow busy protocol.
Reviewed by:	avg, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17208
2018-09-19 14:38:01 +00:00
Konstantin Belousov
bf94d6c78b Fix state of dquot-less vnodes after failed quotaoff.
UFS quotaoff iterates over all mp vnodes, and derefences and clears
the pointers to corresponding dquots. If SU work items transiently
reference some of dquots,quotaoff() would eventually fail, but all
processed vnodes are already stripped from dquots.  The state is
problematic, since quotas are left enabled, but there is no dquots
where blocks and inodes can be accounted.  The result is assertion
failures and NULL pointer dereferences.

Fix it by suspending writes around quotaoff() call.  Since the
filesystem is synced, no dandling references to dquots from SU
workitems can left behind, which means that quotaoff succeeds.

The complication there is that quotaoff VFS op is performed with the
mount point busied, while to suspend, we need to start write on the
mp.  If vn_start_write() is called on busied mp, system might deadlock
against parallel unmount request.  Handle this by unbusy-ing mp before
starting write, which in turn requires changing the quotaoff()
interface to return with the mount point not busied, same as was done
for quotaon().

Reviewed by:	mckusick
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17208
2018-09-19 14:36:57 +00:00
Jung-uk Kim
9c40dcbe5f Make geli(8) buildable. 2018-09-19 07:08:04 +00:00
Navdeep Parhar
5f65b4cab2 cxgbe(4): Enable TXRTLMT by default when the feature is available in the
kernel (options RATELIMIT) and provisioned in the driver's configuration
file (nethofld > 0).

Submitted by:	gallatin@
Approved by:	re@ (kib@)
2018-09-18 21:34:37 +00:00
Mark Johnston
26fe2217bf Only update the domain cursor once in keg_fetch_slab().
We drop the keg lock when we go to actually allocate the slab, allowing
other threads to advance the cursor.  This can cause us to exit the
round-robin loop before having attempted allocations from all domains,
resulting in a hang during a subsequent blocking allocation attempt from
a depleted domain.

Reported and tested by:	Jan Bramkamp <crest@bultmann.eu>
Reviewed by:	alc, cem
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17209
2018-09-18 17:51:45 +00:00
Ed Maste
9ed6559e3e Require ifunc-capable linker for i386
The amd64 kernel started using ifunc for a variety of functions with
arch-specific implementations, and we would like to make use of the
same functionality on i386 and as much as possible avoid divergence
between i386 and amd64.  In particular, future changes for security
improvements and mitigations may rely on ifunc support.

Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-18 15:01:21 +00:00
Michael Tuexen
ba4704a278 Remove unused code.
Approved by:	re (kib@)
MFC after:	1 week
2018-09-18 10:53:07 +00:00
Mateusz Guzik
2554f86a8d vm: stop taking proc lock in mmap to satisfy racct if it is disabled
Limits can be safely obtained with lim_cur from the thread. racct is compiled
in but disabled by default. Note that racct enablement is a boot-only tunable.

This eliminates second most common place of taking the lock while pkg building.

While here don't take the lock in mlockall either.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17210
2018-09-18 01:24:30 +00:00
David C Somayajulu
9d50798c61 Fixed isses:
State check before enqueuing transmit task in bxe_link_attn() routine.
 State check before invoking bxe_nic_unload in bxe_shutdown().

Submitted by:Vaishali.Kulkarni@cavium.com
Approved by:re(gjb)
2018-09-17 20:15:18 +00:00
Konstantin Belousov
a408841593 Do not upgrade the vnode lock to call getinoquota().
Doing so can deadlock when the thread already owns another vnode lock,
e.g. during a rename, as was demonstrated by the reporter.  In fact,
there seems to be no need to force the call to getinoquota() always,
because vn_open() locks vnode exclusively, and this is the most
important case.  To add to the point, directories where the dirent is
added or removed, are locked exclusively as well.

Reported by:	bwidawsk
Tested by:	bwidawsk, pho (as part of the larger patch)
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
MFC after:	1 week
2018-09-17 19:38:43 +00:00
John Baldwin
87bdca8290 Fix a regression in r338360 when booting an x86 machine without APIC.
The atpic_register_sources callback tries to avoid registering interrupt
sources that would collide with an I/O APIC.  However, the previous
implementation was failing to register IRQs 8-15 since the slave PIC
saw valid IRQs from the master and assumed an I/O APIC was present.  To
fix, go back to registering all 8259A interrupt sources in one loop when
the master's register_sources method is invoked.

PR:		231291
Approved by:	re (kib)
MFC after:	1 month
2018-09-17 17:18:54 +00:00
Mark Johnston
6368b4e471 Fix an nvpair leak in vdev_geom_read_config().
Also change the behaviour slightly: instead of freeing "config" if the
last nvlist doesn't pass the tests, return the last config that did pass
those tests.  This matches the comment at the beginning of the function.

PR:		230704
Diagnosed by:	avg
Reviewed by:	asomers, avg
Tested by:	Mark Martinec <Mark.Martinec@ijs.si>
Approved by:	re (gjb)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential revision:  https://reviews.freebsd.org/D17202
2018-09-17 16:16:57 +00:00
Konstantin Belousov
3c022be2ca Use ifunc to resolve context switching mode on amd64.
Patch removes all checks for pti/pcid/invpcid from the context switch
path. I verified this by looking at the generated code, compiling with
the in-tree clang.  The invpcid_works1 trick required inline attribute
for pmap_activate_sw_pcid_pti() to work.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17181
2018-09-17 15:52:19 +00:00
Mateusz Guzik
d6943c5804 amd64: tidy up kernel memmove, take 2
There is no need to use %rax for temporary values and avoiding doing
so shortens the func.
Handle the explicit 'check for tail' depessimisization for backwards copying.

This reduces the diff against userspace.

Tested with the glibc test suite.

Approved by:	re (kib)
2018-09-17 15:51:49 +00:00
Konstantin Belousov
09a6ada991 Calculate PTI, PCID and INVPCID modes earlier, before ifuncs are resolved.
This will be used in following conversion of pmap_activate_sw().

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17181
2018-09-17 15:34:19 +00:00
Konstantin Belousov
76ed0c542f Make the PTI violation check to follow style of the SMAP check.
No functional changes.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D17181
2018-09-17 14:59:05 +00:00
Andrey V. Elsukov
c6851ad04c Restore outbound packets capturing for if_gre(4). It was missed in r335048.
Also clear M_MCAST and M_BCAST flags for encapsulated datagram, since it
will have new IP header.

Approved by:	re (kib)
2018-09-17 10:10:14 +00:00
Mateusz Guzik
9d1b868da0 Revert amd64: tidy up kernel memmove
There is a braino in the non-erms variant which breaks the
functionality.

Will be fixed at a later time with a different patch.

Reported by:	Manfred Antar
Approved by:	re (implicit)
2018-09-16 21:46:27 +00:00
Oleksandr Tymoshenko
234afdb9cc [ig4] Fix device description for Kaby Lake systems
Kaby Lake I2C controller is Intel Sunrise Point-H not Intel Sunrise Point-LP.

Submitted by:	Dmitry Luhtionov
Approved by:	re (kib)
2018-09-16 21:44:36 +00:00
Mateusz Guzik
17f67f63b9 amd64: tidy up kernel memmove
There is no need to use %rax for temporary values and avoiding doing
so shortens the func.
Handle the explicit 'check for tail' depessimisization for backwards copying.

This reduces the diff against userspace.

Approved by:	re (kib)
2018-09-16 19:28:27 +00:00
Konstantin Belousov
bd6c14afa7 Remove unneeded new line from the panic string.
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D17181
2018-09-16 18:36:42 +00:00
John-Mark Gurney
032d3aaa96 Significantly improve pf purge cpu usage by only taking locks
when there is work to do.  This reduces CPU consumption to one
third on systems.  This will help keep the thread CPU usage under
control now that the default hash size has increased.

Reviewed by:	kp
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17097
2018-09-16 00:44:23 +00:00
Mark Johnston
d5089b3aed Log a message after a successful boot-time microcode update.
Reviewed by:	kib
Approved by:	re (delphij)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17135
2018-09-14 17:04:36 +00:00
Bjoern A. Zeeb
6bfb487b6e Set ident for GENERIC-MMCCAM to not announce itself as
GENERIC anymore.

Reviewed by:	andrew
Approved by:	re (gjb)
2018-09-14 15:46:31 +00:00
Mateusz Guzik
c51b7ab9e3 amd64: implement pagezero_erms
Intel docs claim such a memset (rep stosb + 4096 bytes) is
special-cased by microarchs. They also switched Linux to use
it for this purpose.

Approved by:	re (gjb)
2018-09-14 15:29:35 +00:00
Matt Macy
0204d85a62 hwpmc: set default rate if event description lacks one / filter rate against misuse
Not all event descriptions have a sample rate (such as inst_retired.any)
this will restore the legacy behavior of using 65536 in that case. It also
prevents accidental API misuse that could lead to panic.

PR:	230985
Reported by:	markj
Reviewed by:	markj
Approved by:	re (gjb)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D16958
2018-09-14 01:30:05 +00:00
Glen Barber
b79672bb08 Update head from ALPHA5 to ALPHA6 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-13 23:59:59 +00:00
Navdeep Parhar
4bb64e96e4 cxgbe(4): Use the correct number of parameters when querying the tid
range for hashfilters.

Approved by:	re@ (gjb@)
2018-09-13 22:58:13 +00:00
Ed Maste
f2990e6c19 Enable Capsicum on armv6/armv7
We ought to be consistent across our Tier-1 and nearly-Tier-1
architectures, so enable Capsicum for 32-bit armv6/armv7 by default.

PR:		204008
Reviewed by:	ian, oshogbo
Approved by:	re (gjb)
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17023
2018-09-13 21:00:17 +00:00
Eric van Gyzen
73511c241b Set zfs_arc_meta_strategy to metadata only
The previous default of "balanced" appears to have caused pathological
behavior, including very poor performance and 100% CPU load in the
arc_reclaim_thread.

The symptoms appeared when the daily periodic run started.
With this change, the system--and the ARC in particular--behaved
normally during a manual daily periodic run.

From Mark Johnston:  The port of the balanced strategy is incomplete,
since arc_prune_async() is a no-op on FreeBSD.  (This also seems
to imply that r337653 is a no-op.)  After 12 is branched we can
port the remaining bits and consider changing the default back.

Submitted by:	markj (essentially)
Reviewed by:	markj
Approved by:	re (gjb)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D17156
2018-09-13 17:56:48 +00:00
Oleksandr Tymoshenko
6fb3c89473 [ig4] Add PCI IDs for I2C controller on Intel Kaby Lake systems
PR:	221777
Approved by:	re (kib)
Submitted by:	marc.priggemeyer@gmail.com
2018-09-13 17:36:55 +00:00
Navdeep Parhar
6f3a49c317 cxgbe/iw_cxgbe: Fix reported build breakage when the kernel
configuration has "device cxgbe' but no VIMAGE.

Reported by:	mav@
Approved by:	re@ (kib@)
2018-09-13 16:27:21 +00:00
Mateusz Guzik
13ea074dc3 amd64: implement ERMS-based memmove, memcpy and memset
Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17124
2018-09-13 14:53:51 +00:00
Ed Maste
cc41fd23ea Enable reproducible builds in advance of 12.0-REL
r338642 toggled the REPRODUCIBLE_BUILD knob but missed the
corresponding kern.opts.mk change.

We want to build the 12.0 release artifacts with reproducible builds
mode enabled. Switch it on in HEAD now to enable testing with upcoming
ALPHA builds. We can revisit the default setting for HEAD after the
branch is created.

This change eliminates the build metadata (user, hostname, timestamp,
etc.) from the kernel and loader.  If the src tree is a git, svn or p4
checkout with changes then the metadata is retained.

The WITHOUT_REPRODUCIBLE_BUILD src.conf(5) knob can be used to revert
to the previous behaviour.

Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
2018-09-13 14:52:59 +00:00
Emmanuel Vadot
f9d40f5cca arm64: Make aw_sid and aw_thermal depend on nvmem
Both drivers use this interface so add a dependancy on it.
Since awg uses aw_sid for generating the MAC address, make it
depend on both aw_sid and nmvem so when only removing nvmem from
kernel config it will not include this driver.

Reported by:	sbruno
Approved by:	re (gjb)
2018-09-13 14:08:10 +00:00
Roger Pau Monné
5ff6c7f363 xen: temporary disable SMAP when forwarding hypercalls from user-space
The Xen page-table walker used to resolve the virtual addresses in the
hypercalls will refuse to access user-space pages when SMAP is enabled
unless the AC flag in EFLAGS is set (just like normal hardware with
SMAP support would do).

Since privcmd allows forwarding hypercalls (and buffers) from
user-space into Xen make sure SMAP is temporary disabled for the
duration of the hypercall from user-space.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:15:02 +00:00
Roger Pau Monné
a74cdf4e74 xen: legacy PVH fixes for the new interrupt count
Register interrupts using the PIC pic_register_sources method instead
of doing it in apic_setup_io. This is now required, since the internal
interrupt structures are not yet setup when calling apic_setup_io.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:14:11 +00:00
Roger Pau Monné
d7627401ec lapic: skip setting intrcnt if lapic is not present
Instead of panicking. Legacy PVH mode doesn't provide a lapic, and
since native_lapic_intrcnt is called unconditionally this would cause
the assert to trigger. Change the assert into a continue in order to
take into account the possibility of systems without a lapic.

Reviewed by:		jhb
Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
Differential revision:	https://reviews.freebsd.org/D17015
2018-09-13 07:13:13 +00:00
Roger Pau Monné
4edbde911b xen: fix setting legacy PVH vcpu id
The recommended way to obtain the vcpu id is using the cpuid
instruction with a specific leaf value. This leaf value must be
obtained at runtime, and it's done when populating the hypercall page.

Legacy PVH however will get the hypercall page populated by the
hypervisor itself before booting, so the cpuid leaf was not actually
set, thus preventing setting the vcpu id value from cpuid.

Fix this by making sure the cpuid leaf has been probed before
attempting to set the vcpu id.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:12:16 +00:00
Roger Pau Monné
4fcd5f3003 xen: limit the usage of PIRQs to a legacy PVH Dom0
That's the only mode in FreeBSD that requires the usage of PIRQs, so
there's no need to attach the PIRQ PIC when running in other modes.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:11:11 +00:00
Roger Pau Monné
ddbc1b4387 xen: fix initial kenv setup for legacy PVH
When adding support for the new PVH mode the kenv handling was
switched to use a boot time allocated scratch space, however the
legacy PVH early boot code was not modified to allocate such space.

Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:09:41 +00:00
Roger Pau Monné
c9a591b0f6 xen: remove xenpv_set_ids
The vcpu_id for legacy PVH mode can be set from the output of cpuid,
so there's no need to have a special function to set it.

Also note that xenpv_set_ids should have been executed only for PV
guests, but was executed for all guests types and vcpu_id was later
fixed up for HVM guests.

Reported by:		cperciva
Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
2018-09-13 07:08:31 +00:00
Roger Pau Monné
fae9a0cb9b xen: fix PV IPI setup
So that it's done when the vcpu_id has been set. For the BSP the
vcpu_id is set at SUB_INTR, while for the APs it's done in
init_secondary_tail that's called at SUB_SMP order FIRST.

Reported and tested by:	cperciva
Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
Differential revision:	https://reviews.freebsd.org/D17013
2018-09-13 07:07:13 +00:00
Roger Pau Monné
a515acf7bb msi: remove the check that interrupt sources have been added
When running as a specific type of Xen guest the hypervisor won't
provide any emulated IO-APICs or legacy PICs at all, thus hitting the
following assert in the MSI code:

panic: Assertion num_io_irqs > 0 failed at /usr/src/sys/x86/x86/msi.c:334
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff826ffa70
vpanic() at vpanic+0x1a3/frame 0xffffffff826ffad0
panic() at panic+0x43/frame 0xffffffff826ffb30
msi_init() at msi_init+0xed/frame 0xffffffff826ffb40
apic_setup_io() at apic_setup_io+0x72/frame 0xffffffff826ffb50
mi_startup() at mi_startup+0x118/frame 0xffffffff826ffb70
start_kernel() at start_kernel+0x10

Fix this by removing the assert in the MSI code, since it's possible
to get to the MSI initialization without having registered any other
interrupt sources.

Reviewed by:		jhb
Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
Differential revision:	https://reviews.freebsd.org/D17001
2018-09-13 07:05:51 +00:00
Roger Pau Monné
d01d12de14 x86bios: use M_NOWAIT with mallocs
Or else it triggers the following bug:

APIC: CPU 6 has ACPI ID 6
APIC: CPU 7 has ACPI ID 7
panic: vm_wait in early boot
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xffffffff826ff8d0
vpanic() at vpanic+0x1a3/frame 0xffffffff826ff930
panic() at panic+0x43/frame 0xffffffff826ff990
vm_wait_domain() at vm_wait_domain+0xf9/frame 0xffffffff826ff9c0
kmem_alloc_contig_domain() at kmem_alloc_contig_domain+0x252/frame 0xffffffff826ffa50
kmem_alloc_contig() at kmem_alloc_contig+0x6c/frame 0xffffffff826ffad0
contigmalloc() at contigmalloc+0x2e/frame 0xffffffff826ffb00
x86bios_modevent() at x86bios_modevent+0x225/frame 0xffffffff826ffb20
module_register_init() at module_register_init+0xc0/frame 0xffffffff826ffb50
mi_startup() at mi_startup+0x118/frame 0xffffffff826ffb70
start_kernel() at start_kernel+0x10

While there also make x86bios_unmap_mem idempotent.

Reviewed by:		kib
Approved by:		re (gjb)
Sponsored by:		Citrix Systems R&D
Differential revision:	https://reviews.freebsd.org/D17000
2018-09-13 07:04:00 +00:00
Michael Tuexen
a8a8a8a808 Fix TCP Fast Open for the TCP RACK stack.
* Fix a bug where the SYN handling during established state was
  applied to a front state.
* Move a check for retransmission after the timer handling.
  This was suppressing timer based retransmissions.
* Fix an off-by one byte in the sequence number of retransmissions.
* Apply fixes corresponding to
  https://svnweb.freebsd.org/changeset/base/336934

Reviewed by:		rrs@
Approved by:		re (kib@)
MFC after:		1 month
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D16912
2018-09-12 10:27:58 +00:00
Ruslan Bukin
bd528a398e Enable VIMAGE support for RISC-V.
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:13:54 +00:00
Ruslan Bukin
752a8ea48e Use elf_relocaddr() to find the address for R_RISCV_RELATIVE
relocation.

elf_relocaddr() has a hook to handle VIMAGE data addresses.

This fixes VIMAGE support for RISC-V when built as a module.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:12:34 +00:00
Ruslan Bukin
86c5937532 Don't mark module data as static on RISC-V.
Similar to arm64, riscv compiler uses PC-relative loads/stores,
and with static data compiler does not emit relocations.
In result, kernel module linker has nothing to fix and data accessed
from the wrong location.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:05:33 +00:00
Gordon Tetlow
c9e562b188 Correct ELF header parsing code to prevent invalid ELF sections from
disclosing memory.

Submitted by:	markj
Reported by:	Thomas Barabosch, Fraunhofer FKIE
Approved by:	re (implicit)
Approved by:	so
Security:	FreeBSD-SA-18:12.elf
Security:	CVE-2018-6924
Sponsored by:	The FreeBSD Foundation
2018-09-12 04:57:34 +00:00
Mateusz Guzik
e382dd47aa amd64: enable options NUMA in GENERIC and MINIMAL
Reviewed by:	gallatin, cem, scottl
Approved by:	re (kib)
Relnotes:	yes
Sponsored by:	Dell EMC Isilon, Netflix
Differential Revision:	https://reviews.freebsd.org/D17059
2018-09-11 23:54:31 +00:00
Ed Maste
08d0704d74 Switch reproducible builds to unmodified src tree mode
newvers.sh supports two modes for reproducible builds:

 -r    Reproducible build.  Do not embed directory names, user
       names, time stamps or other dynamic information into
       the output file.  This is intended to allow two builds
       done at different times and even by different people on
       different hosts to produce identical output.

 -R    Reproducible build if the tree represents an unmodified
       checkout from a version control system.  Metadata is
       included if the tree is modified.

Switch to the second mode when reproducible builds are enabled.
The value of a reproducible build is much less when building from an
uncontrolled, modified src tree, and -R likely provides the best
compromise in allowing the REPRODUCIBLE_BUILD knob to be enabled by
default for the release.

Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-11 19:19:07 +00:00
Eric Joyner
de35521a3b ix(4), ixv(4): VLAN tag stripping fixes for Amazon EC2 Enhanced Networking
From Piotr:

ix(4), ixv(4): Add VLAN tag strip check when receiving packets
ixv(4): Fix support for VLAN_HWTAGGING and VLAN_HWFILTER flags

This change will prevent driver from passing VLAN tags when
interface configuration is not expecting them. VF driver will
check for VLAN_HWTAGGING and VLAN_HWFILTER flags and act adequately.

This patch resolves problem occuring on EC2 platforms.

Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reported by:	cperciva@
Reviewed by:	cperciva@, Intel Networking
Approved by:	re
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D17061
2018-09-11 18:33:43 +00:00
Edward Tomasz Napierala
d4eab13738 Make the wait in cfiscsi_offline() interruptible. This is the second half
of the fix/workaround for the "ctld hanging on reload" problem.

PR:		220175
Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
Approved by:	re (kib)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-09-11 11:39:59 +00:00
Mark Johnston
54af3d0dac Fix synchronization of LB group access.
Lookups are protected by an epoch section, so the LB group linkage must
be a CK_LIST rather than a plain LIST.  Furthermore, we were not
deferring LB group frees, so in_pcbremlbgrouphash() could race with
readers and cause a use-after-free.

Reviewed by:	sbruno, Johannes Lundberg <johalun0@gmail.com>
Tested by:	gallatin
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17031
2018-09-10 19:00:29 +00:00
Mark Johnston
7a364d458a Split some checks in vm_page_activate() to make it easier to read.
No functional change intended.

Reviewed by:	alc, kib
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17028
2018-09-10 18:59:23 +00:00
Xin LI
7a8d266139 random(4): Squash non-error timeout codes from tsleep(9).
In both scenarios a timeout (EWOULDBLOCK) is considered as a
normal condition and the error should not pop up to upper layers.

PR:		231181
Submitted by:	cem
Reported by:	lev
Reviewed by:	vangyzen, markm, delphij
Approved by:	re (kib)
Approved by:	secteam (delphij)
Differential Revision:	https://reviews.freebsd.org/D17049
2018-09-09 17:12:31 +00:00
Hans Petter Selasky
7877f59352 Introduce and use sgid_index in CM requests in ibcore.
For RoCE, when CM requests are received for RC and UD connections,
netdevice of the incoming request is unavailable. Because of that CM
requests are always forwarded to init_net namespace.

Now that we have the GID index available, introduce SGID index in
incoming CM requests and refer to the netdevice of it.

While at it fix some incorrect uses of init_net and make sure
the rdma_create_id() function stores the VNET it is passed.

Based on linux commit:
cee104334c98dd04e9dd4d9a4fa4784f7f6aada9

MFC after:		3 days
Approved by:		re (gjb)
Sponsored by:		Mellanox Technologies
2018-09-09 07:20:15 +00:00
Mark Johnston
9c78fa0a61 Fix the 32-bit arm build.
X-MFC with:     r338537
Approved by:	re (rgrimes)
Sponsored by:	The FreeBSD Foundation
2018-09-08 23:39:26 +00:00
Mark Johnston
31184bcd68 Exclude the EFI framebuffer from phys_avail[] on arm64.
On the ThunderX the region occupied by the framebuffer is included in
the EFI map, so explicitly add it to the set of regions that aren't
managed by the physical memory allocator.

PR:		231064
Reviewed by:	andrew
Approved by:	re (gjb)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17073
2018-09-08 21:52:44 +00:00
Mark Johnston
88db0afa82 Bump MAX_HWCNT and MAX_EXCNT.
These limits are hit on the ThunderX.  Also make
arm_physmem_exclude_region() panic rather than fail silently if the
limit on excluded regions is reached.

PR:		231064
Reviewed by:	andrew
Approved by:	re (kib)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17073
2018-09-08 21:51:47 +00:00
Mark Johnston
5a7f993702 Relax an assertion in vm_pqbatch_process_page().
While executing vm_pqbatch_process_page(m), m->queue may change to
PQ_NONE if the page daemon is concurrently freeing the page.  In this
case m's queue state flags must be clear, so vm_pqbatch_process_page()
will be a no-op, but the race could cause spurious assertion failures.
Correct the assertion which assumed that m->queue's value does not
change while the page queue lock is held.

Reviewed by:	alc, kib
Reported and tested by:	pho
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17027
2018-09-08 21:49:43 +00:00
Konstantin Belousov
fc71a66be0 intelspi: don't leak spibus children on detach.
Submitted by:	Yuri Pankov
MFC after:	1 week
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17076
2018-09-08 18:57:29 +00:00
Mark Johnston
9ef958c702 Specify the correct resource type in teardown paths.
Submitted by:	Yuri Pankov <yuripv@yuripv.net>
Approved by:	re (kib)
MFC after:	1 week
2018-09-07 21:12:37 +00:00
Mark Johnston
a7026c7fd9 Use ratecheck(9) in in_pcbinslbgrouphash().
Reviewed by:	bz, Johannes Lundberg <johalun0@gmail.com>
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D17065
2018-09-07 21:11:41 +00:00
Hans Petter Selasky
86156495f4 Implement get network interface by params function in ipoib.
Also fix the validate_ipv4_net_dev() and validate_ipv6_net_dev() functions
which had source and destination addresses swapped, and didn't set the
scope ID for IPv6 link-local addresses.

This allows applications like krping to work using IPoIB devices.

MFC after:		3 days
Approved by:		re (gjb)
Sponsored by:		Mellanox Technologies
2018-09-07 18:05:09 +00:00
Glen Barber
dac18e8ed5 Update head from ALPHA4 to ALPHA5 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-07 00:00:31 +00:00
Kirk McKusick
4b6a2c497e The Call For Testing had no reports of operational problems and
found that performance was no worse and usually better when running
with TRIM consolidation. Performance improvement was most noticable
when multiple large files are released in a short period of time.

Thus, TRIM consolidation is being enabled by default. Should
operational problems be found, it can be disabled using the command
`sysctl vfs.ffs.dotrimcons=0'. This variable can also be set as a
tunable if early disabling is necessary.

Approved by:  re (gjb)
Sponsored by: Netflix
2018-09-06 23:28:35 +00:00
Marius Strobl
ecdc974571 Avoid uninitialized read of ext_csd.
Reported by:    Coverity
CID:            1395275
Approved by:	re (gjb, kib)
2018-09-06 21:24:14 +00:00
Marius Strobl
0519c933fd - Explicitly compare a pointer to NULL. The __builtin_expect() of clang
3.4.1 otherwise isn't able to cope with the expression.
- Fix a nearby whitespace bug.

Approved by:	re (gjb, kib)
2018-09-06 21:09:54 +00:00
Mark Johnston
c56c7299c2 Use the correct terminology.
Reported by:    kib
Approved by:	re (gjb)
Differential revision:  https://reviews.freebsd.org/D16191
2018-09-06 20:02:19 +00:00
Bjoern A. Zeeb
113c4fad55 The inp_lle field to struct inpcb, along with two "valid" flags
for the rt and lle cache were added in r191129 (2009).
To my best knowledge they have never been used and route caching
has converted the inp_rt field from that commit to inp_route
rendering this field and these flags obsolete.

Convert the pointer into a spare pointer to not change the size of
the structure anymore (and to have a spare pointer) and mark the
two fields as unused.

Reviewed by:	markj, karels
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17062
2018-09-06 19:55:40 +00:00
Mateusz Guzik
12360b3079 amd64: depessimize copyinstr_smap
The stac/clac combo around each byte copy is causing a measurable
slowdown in benchmarks. Do it only before and after all data is
copied. While here reorder the code to avoid a forward branch in
the common case.

Note the copying loop (originating from copyinstr) is avoidably slow
and will be fixed later.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17063
2018-09-06 19:42:40 +00:00
Mark Johnston
23984ce5cd Avoid resource deadlocks when one domain has exhausted its memory. Attempt
other allowed domains if the requested domain is below the minimum paging
threshold.  Block in fork only if all domains available to the forking
thread are below the severe threshold rather than any.

Submitted by:	jeff
Reported by:	mjg
Reviewed by:	alc, kib, markj
Approved by:	re (rgrimes)
Differential Revision:	https://reviews.freebsd.org/D16191
2018-09-06 19:28:52 +00:00
John Baldwin
0c0c965a8f Re-enable kernel modules for the MALTA64EL kernel configuration.
Update the BOOTSTRAPPING check for libelf to require the fix for
mips64el object files committed in r338478 and re-enable kernel
modules in the MALTA64EL config file.

Reviewed by:	emaste
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17054
2018-09-06 19:21:31 +00:00
Stephen Hurd
64e6fc1379 Clean up iflib sysctls
Remove sysctls:
txq_drain_encapfail - now a duplicate of encap_txd_encap_fail
intr_link - was never incremented
intr_msix - was never incremented
rx_zero_len - was never incremented

The following were not incremented in all code-paths that apply:
m_pullups, mbuf_defrag, rxd_flush, tx_encap, rx_intr_enables, tx_frees,
encap_txd_encap_fail.

Fixes:
Replace the broken collapse_pkthdr() implementation with an MPASS().
fl_refills and fl_refills_large were not incremented when using netmap.

Reviewed by:	gallatin
Approved by:	re (marius)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D16733
2018-09-06 18:51:52 +00:00
Andrew Turner
fe3ddcf2d5 Fix the GIC ACPI cross reference value.
To support INTRNG with ACPI we need to set a non-zero cross reference value
for the interrupt controller. The GICv3 driver already had this value set,
however it was missed in the GICv2 driver. Fix this by setting xref to the
correct value.

Approved by:	re (gjb)
2018-09-06 17:25:50 +00:00
Andrew Turner
ddb7437378 Remove the check that the Arm generic interrupt controller variant is
non-zero. This is the case on qemu, so remove it to allow us to boot there.
This change is needed to boot on qemu with ACPI.

Approved by:	re (gjb)
2018-09-06 17:25:01 +00:00
Breno Leitao
03e83a838e powerpc64: Add initial support for HTM (kABI)
This patch adds the very initial support for HTM that might come at FreeBSD
version 12.1. This basic support defines a new kABI, so, we do not need to change
it later during 12.1 time frame, when the full implementation will come.

Reviewed by: jhibbits
Approved by: re(marius), jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D16889
2018-09-06 17:07:21 +00:00
Mark Johnston
21f01f4584 Remove vm_page_remque().
Testing m->queue != PQ_NONE is not sufficient; see the commit log
message for r338276.  As of r332974 vm_page_dequeue() handles
already-dequeued pages, so just replace vm_page_remque() calls with
vm_page_dequeue() calls.

Reviewed by:	kib
Tested by:	pho
Approved by:	re (marius)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17025
2018-09-06 16:17:45 +00:00
Bjoern A. Zeeb
6d2b0c0166 Make tcp_hpts.c compile a LINT kernel with options RSS and PCBGROUPS added by
adding the missing include files and changing a the type of cpuid which
would otherwise cause a false comparison with NETISR_CPUID_NONE.

Reviewed by:	rrs
Approved by:	re (marius)
Differential Revision:	https://reviews.freebsd.org/D16891
2018-09-06 16:11:24 +00:00
Mark Johnston
49365eb433 Define sctp probes only when SCTP is configured.
Otherwise the "depends_on provider" guard in sctp.d does not work as
intended.

Reported by:	mjg
Reviewed by:	tuexen
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17057
2018-09-06 14:15:03 +00:00
Hans Petter Selasky
58c277d846 Add proper support for VIMAGE to krping.
Make sure we pass the correct VNET when allocating the RDMA ID.

MFC after:		3 days
Approved by:		re (gjb)
Sponsored by:		Mellanox Technologies
2018-09-06 14:03:11 +00:00
Alexander Motin
cae8b43e5c Add missing copyin() to access LUN and port ioctl arguments.
Somehow this was working even after PTI in, at least on amd64, and got
broken by something only very recently.

Reviewed by:	araujo
Approved by:	re (gjb)
2018-09-06 14:03:10 +00:00
Hans Petter Selasky
6ed134c41b Make the MSIX module parameter limit per device, in mlx5en(4).
MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:41:09 +00:00
Hans Petter Selasky
16ae32f927 Add support for receive side scaling stride, RSSS, in mlx5en(4).
The receive side scaling stride parameter is a value which define the interval
between active receive side queues. The traffic for the inactive queues is
redirected to the nearest active queue by use of modulus. The default value
of this parameter is one, which means all receive side queues are used.

The point of this feature is to redirect more traffic to fewer receive side
queues in order to take more advantage of sorted large receive offload,
sorted LRO. The sorted LRO works better when more packets are accumulated
per service interval.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:28:06 +00:00
Slava Shwartsman
b4df6efb4e ibcore: Fix endless loop in searching for matching VLAN device
In r337943 ifnet's if_pcp was set to the PCP value in use
instead of IFNET_PCP_NONE.
Current ibcore code assumes that if_pcp is IFNET_PCP_NONE with
VLAN interfaces so it can identify prio-tagged traffic.
Fix that by explicitly verifying that that the if_type is IFT_ETHER
and not IFT_L2VLAN.

MFC after:      3 days
Approved by:    re (Marius), hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-09-06 12:26:57 +00:00
Hans Petter Selasky
0be5034007 Don't stall transmit queue on drops in mlx5en(4).
When a transmitted packet is dropped don't stall the transmit queue.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:19:36 +00:00
Hans Petter Selasky
2d32b0a304 Maximum number of mbuf frags is off-by-one for worst case scenario in mlx5en(4).
Inspecting the PRM no more than 0x3F data segments, DS, of size 16 bytes is
allowed.

Worst case scenario summary of DS usage:
Header is fixed:	2 DS
Maximum inlining:	98 => (98 - 2) / 16 = 6 DS
Remainder:		0x3F - 2 - 6 = 55 DS (mbuf frags)

Previously a value of 56 DS was used and this would work in the
normal case because not all inline data area was used up.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 11:06:07 +00:00
Mark Johnston
cc4f3d0ae2 Rename hardclock_cnt() to hardclock() and remove the old implementation.
Also remove some related and unused subroutines.  They have long been
replaced by variants that handle multiple coalesced events with a single
call.

No functional change intended.

Reviewed by:	cem, kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17029
2018-09-06 02:10:59 +00:00
John Baldwin
eb81f38a62 Fix objcopy for little-endian MIPS64 objects.
MIPS64 does not store the 'r_info' field of a relocation table entry as
a 64-bit value consisting of a 32-bit symbol index in the high 32 bits
and a 32-bit type in the low 32 bits as on other architectures.  Instead,
the 64-bit 'r_info' field is really a 32-bit symbol index followed by four
individual byte type fields.  For big-endian MIPS64, treating this as a
64-bit integer happens to be compatible with the layout expected by other
architectures (symbol index in upper 32-bits of resulting "native" 64-bit
integer).  However, for little-endian MIPS64 the parsed 64-bit integer
contains the symbol index in the low 32 bits and the 4 individual byte
type fields in the upper 32-bits (but as if the upper 32-bits were
byte-swapped).

To cope, add two helper routines in gelf_getrel.c to translate between the
correct native 'r_info' value and the value obtained after the normal
byte-swap translation.  Use these routines in gelf_getrel(), gelf_getrela(),
gelf_update_rel(), and gelf_update_rela().  This fixes 'readelf -r' on
little-endian MIPS64 objects which was previously decoding incorrect
relocations as well as 'objcopy: invalid symbox index' warnings from
objcopy when extracting debug symbols from kernel modules.

Even with this fixed, objcopy was still crashing when trying to extract
debug symbols from little-endian MIPS64 modules.  The workaround in
gelf_*rel*() depends on the current ELF object having a valid ELF header
so that the 'e_machine' field can be compared against EM_MIPS.  objcopy
was parsing the relocation entries to possibly rewrite the 'r_info' fields
in the update_relocs() function before writing the initial ELF header to
the destination object file.  Move the initial write of the ELF header
earlier before copy_contents() so that update_relocs() uses the correct
symbol index values.

Note that this change should really go upstream.  The binutils readelf
source has a similar hack for MIPS64EL though I implemented this version
from scratch using the MIPS64 ABI PDF as a reference.

Discussed with:	jkoshy
Reviewed by:	emaste, imp
Approved by:	re (gjb, kib)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15734
2018-09-05 20:51:53 +00:00
Mark Johnston
ce9eea6425 Correct the condition under which we allocate a terminator node.
We will have last_block < blocks if the block count is divisible
by BLIST_BMAP_RADIX, but a terminator node is still needed if the
tree isn't balanced.  In this case we were overruning the blist
array by 16 bytes during initialization.

While here, add a check for the invalid blocks == 0 case.

PR:		231116
Reviewed by:	alc, kib (previous version), Doug Moore <dougm@rice.edu>
Approved by:	re (gjb)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17020
2018-09-05 19:05:30 +00:00
Mark Johnston
8be02ee4da Fix style bugs in in_pcblookup_lbgroup().
No functional change intended.

Reviewed by:	bz, Johannes Lundberg <johalun0@gmail.com>
Approved by:	re (rgrimes)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17030
2018-09-05 15:04:11 +00:00
Eugene Grosbein
d5d21ad932 Fix "ipfw fwd" to work for incoming IPv4 packets when ip_tryforward() chooses
fast forwarding path, as it already works for IPv6 and for both of them
on old slow path.

PR:			231143
Reviewed by:		ae
Approved by:		re (gjb)
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D17039
2018-09-05 13:59:36 +00:00
Ruslan Bukin
157654d0c4 Permit supervisor to access user VA space for certain functions only.
This is done by setting SUM (permit Supervisor User Memory access)
bit in sstatus register.

The functions we allow access for are routines in assembly that
explicitly handle crossing the user kernel boundary.

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 11:34:58 +00:00
Ruslan Bukin
93952a8b1b Fix bug: compare uaddr to VM_MAXUSER_ADDRESS, not to a tmp value
left by SET_FAULT_HANDLER().

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 09:53:55 +00:00
Konstantin Belousov
8e6e1ba843 Assign to correct structure members.
Reported by:	cem from Coverity
Sponsored by:	The FreeBSD Foundation
MFC after:	6 days
Approved by:	re (gjb)
2018-09-04 19:28:46 +00:00
Konstantin Belousov
20df4f456d amd64: Properly re-merge r334537 into SMAP-ified copyin(9) and copyout(9).
Also this fixes the eflags.ac leak from copyin_smap() when the copied
data length is multiple of eight bytes.

Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
2018-09-04 19:27:53 +00:00
Konstantin Belousov
e21c5abc2a amd64: For non-PTI mode, do not initialize PCPU kcr3 to KPML4phys.
Non-PTI mode does not switch kcr3, which means that kcr3 is almost
always stale.  This is important for the NMI handler, which reloads
%cr3 with PCPU(kcr3) if the value is different from PMAP_NO_CR3.

The end result is that curpmap in NMI handler does not match the page
table loaded into hardware.  The manifestation was copyin(9) looping
forever when a usermode access page fault cannot be resolved by
vm_fault() updating a different page table.

Reported by:	mmacy
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Approved by:	re (gjb)
2018-09-04 19:26:54 +00:00
Vladimir Kondratyev
316086bd6c wmt(4): Fix regression introduced in r337289
r337289 has a side effect of reducing usb frame 0 buffer size down to
touch report size. That broke some devices e.g. "Raydium Touch System"
which are capable of generating non-touch frames of bigger length.
Fix it with enlarging frame 0 buffer up to internal wmt(4) buffer size.

Reported by:	Roberto Fernandez Cueto <roberfern@gmail.com>
Tested by:	Roberto Fernandez Cueto <roberfern@gmail.com>
Approved by:	re (gjb)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16772
2018-09-04 19:22:31 +00:00
Emmanuel Vadot
79f29de22e regulator: Use bool values instead of 0/1
While here do not attempt to disable regulators if they are meant
to be always on.

Reviewed by:    mmel
Approved by:    re (kib)
2018-09-04 19:18:55 +00:00
Bjoern A. Zeeb
ec86402ecd Replicate r328271 from legacy IP to IPv6 using a single macro
to clear L2 and L3 route caches.
Also mark one function argument as __unused.

Reviewed by:	karels, ae
Approved by:	re (rgrimes)
Differential Revision:	https://reviews.freebsd.org/D17007
2018-09-03 22:27:27 +00:00
Bjoern A. Zeeb
f6aeb1eee5 Replicate r307234 from legacy IP to IPv6 code, using the RO_RTFREE()
macro rather than hand crafted code.
No functional changes.

Reviewed by:	karels
Approved by:	re (rgrimes)
Differential Revision:	https://reviews.freebsd.org/D17006
2018-09-03 22:14:37 +00:00
Bjoern A. Zeeb
bc11a8829e As discussed in D6262 post-commit review, change inp_route to
inp_route6 for IPv6 code after r301217.
This was most likely a c&p error from the legacy IP code, which
did not matter as it is a union and both structures have the same
layout at the beginning.
No functional changes.

Reviewed by:	karels, ae
Approved by:	re (rgrimes)
Differential Revision:	https://reviews.freebsd.org/D17005
2018-09-03 22:12:48 +00:00
Bjoern A. Zeeb
d4672c61d3 Rather than duplicating the functionality of a macro after r322866
use the already existing one.  No functional changes.

Reviewed by:	karels, ae
Approved by:	re (rgrimes)
Differential Revision:	https://reviews.freebsd.org/D17004
2018-09-03 22:10:49 +00:00
Mark Johnston
73ad0b6abf Use the correct malloc type in in_pcblbgroup_free().
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-03 17:39:09 +00:00
Ruslan Bukin
888c8381ad Enable 'C'-compressed ISA extension.
This was disabled recently due to lack of support in KDB disassembler
and DTrace FBT provider. Support for 'C'-extension to both of these was
added, so we can now enable 'C'-extension.

This reduces size of the kernel important for low-end embedded devices,
and saves cache footprint for high perfomance machines.

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-03 14:43:16 +00:00
Ruslan Bukin
378a495661 Add support for 'C'-compressed ISA extension to DTrace FBT provider.
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-03 14:34:09 +00:00
Robert Watson
deea362c80 The kernel DTrace audit provider (dtaudit) relies on auditd(8) to load
/etc/security/audit_event to provide a list of audit event-number <->
name mappings.  However, this occurs too late for anonymous tracing.
With this change, adding 'audit_event_load="YES"' to /boot/loader.conf
will cause the boot loader to preload the file, and then the kernel
audit code will parse it to register an initial set of audit event-number
<-> name mappings.  Those mappings can later be updated by auditd(8) if
the configuration file changes.

Reviewed by:	gnn, asomers, markj, allanjude
Discussed with:	jhb
Approved by:	re (kib)
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16589
2018-09-03 14:26:43 +00:00
Konstantin Belousov
50cd0be78f Catch exceptions during EFI RT calls on amd64.
This appeared to be required to have EFI RT support and EFI RTC
enabled by default, because there are too many reports of faulting
calls on many different machines.  The knob is added to leave the
exceptions unhandled to allow to debug the actual bugs.

Reviewed by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:    re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D16972
2018-09-02 21:37:05 +00:00
Konstantin Belousov
1565fb29a7 Add amd64 mdthread fields needed for the upcoming EFI RT exception
handling.

This is split into a separate commit from the main change to make it
easier to handle possible revert after upcoming KBI freeze.

Reviewed by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:    re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D16972
2018-09-02 21:16:43 +00:00
Konstantin Belousov
420382368a Improve error messages from clock_if.m method failures.
Print error message in verbose mode when CLOCK_SETTIME() clock_if.m
method failed.  For EFIRT RTC clock, add error code for the failure of
CLOCK_GETTIME() report.

Reviewed by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:    re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D16972
2018-09-02 20:17:51 +00:00
Konstantin Belousov
9eb958988a Swap order of dererencing PCPU curpmap and checking for usermode in
trap_pfault() KPTI violation check.

EFI RT may set curpmap to NULL for the duration of the call for some
machines (PCID but no INVPCID).  Since apparently EFI RT code must be
ready for exceptions from the calls, avoid dereferencing curpmap until
we know that this call does not come from usermode.

Reviewed by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:    re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D16972
2018-09-02 20:07:36 +00:00
Konstantin Belousov
d4be3789fe Normalize use of semicolon with EFI_TIME_LOCK macros.
Reviewed by:	kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:    re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D16972
2018-09-02 19:48:41 +00:00
Alan Cox
72aebdd742 Recent changes have created, for the first time, physical memory segments
that can be coalesced.  To be clear, fragmentation of phys_avail[] is not
the cause.  This fragmentation of vm_phys_segs[] arises from the "special"
calls to vm_phys_add_seg(), in other words, not those that derive directly
from phys_avail[], but those that we create for the initial kernel page
table pages and now for the kernel and modules loaded at boot time.  Since
we sometimes iterate over the physical memory segments, coalescing these
segments at initialization time is a worthwhile change.

Reviewed by:	kib, markj
Approved by:	re (rgrimes)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D16976
2018-09-02 18:29:38 +00:00
Mark Johnston
a9d49f9e64 Fix the hash table lookup in fbt_destroy().
Reported and tested by:	pho
Approved by:	re (kib)
X-MFC with:	r338359
2018-09-02 17:02:13 +00:00
Edward Tomasz Napierala
d783154e46 Try harder in cfiscsi_offline(). This is believed to be the workaround
for the "ctld hanging on reload" problem observed in same cases under
high load.  I'm not 100% sure it's _the_ fix, as the issue is rather hard
to reproduce, but it was tested as part of a larger path and the problem
disappeared.  It certainly shouldn't break anything.

Now, technically, it shouldn't be needed.  Quoting mav@, "After
ct->ct_online == 0 there should be no new sessions attached to the target.
And if you see some problems abbout it, it may either mean that there are
some races where single cfiscsi_session_terminate(cs) call may be lost,
or as a guess while this thread was sleeping target was reenabbled and
redisabled again".  Should such race be discovered and properly fixed
in the future, than this and the followup two commits can be backed out.

PR:		220175
Reported by:	Eugene M. Zheganin <emz at norma.perm.ru>
Tested by:	Eugene M. Zheganin <emz at norma.perm.ru>
Discussed with:	mav
Approved by:	re (gjb)
MFC after:	2 weeks
Sponsored by:	playkey.net
2018-09-01 16:16:40 +00:00
Glen Barber
bf466ddcff Revert r338423, reapplying r338422, which did get approval but
communication lines got crossed.

Apologies to avatar@ for the confusion.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-01 15:07:38 +00:00
Glen Barber
b5a3b728c5 Revert r338422, which was did not get official approval from re@.
Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-01 13:43:14 +00:00
Tai-hwa Liang
8578932ddb Adding support for CS46xx MIDI output. With this patch, users can
play the MIDI files through /dev/sequencer device with tools like
playmidi. The audio output will go through the external MIDI device
such like wavetable synthesis card.

Reviewed by:	matk (a long time ago), kib
Approved by:	re (kib)
Tested with: 	Terratec SiXPack 5.1+ + Yamaha DB50XG
MFC after:	4 weeks
2018-09-01 11:26:53 +00:00
Mark Johnston
d7965243c1 Re-compute the ARC size before computing the MFU target.
This fixes an upstream regression introduced in r331404, causing overly
aggressive reclamation of the ARC when under pressure.

Diagnosed by:	Paul <devgs@ukr.net>
Approved by:	re (gjb)
MFC after:	3 days
2018-08-31 21:45:05 +00:00
John Baldwin
cdb6aa7e47 Fix build of x86 UP kernels after dynamic IRQ changes in r338360.
Reported by:	Ian FREISLICH <ian.freislich@capeaugusta.com>
Approved by:	re (gjb)
MFC after:	2 weeks
2018-08-31 18:26:37 +00:00
Glen Barber
5d2346efea Update head from ALPHA3 to ALPHA4 as part of the 12.0-RELEASE
cycle.  The i386 build failure appears to be transient, and
now becoming more difficult to reliably reproduce to identify
the cause.  I will continue to investigate this, however.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-08-31 16:29:36 +00:00
Ruslan Bukin
0f669630ac Fix an integer overflow while setting the kernel address (MODINFO_ADDR).
This eliminates build warning and makes kldstat happy.

Approved by:	re (marius)
2018-08-31 16:15:46 +00:00
John Baldwin
74aa2d49d6 Don't directly dereference a user pointer in the VPD ioctl.
The PCIOCLISTVPD ioctl on /dev/pci is used to fetch a list of VPD
key-value pairs for a specific PCI function.  It is used by
'pciconf -l -V'.  The list is stored in a userland-supplied buffer as
an array of variable-length structures where the key and data length
are stored in a fixed-size header followed by the variable-length
value as a byte array.  To facilitate walking this array in userland,
<sys/pciio.h> provides a PVE_NEXT() helper macro to return a pointer
to the next array element by reading the the length out of the current
header and using it to compute the address of the next header.

To simplify the implementation, the ioctl handler was also using
PVE_NEXT() when on the user address of the user buffer to compute the
user address of the next array element.  However, the PVE_NEXT() macro
when used with a user address was reading the value's length by
indirecting the user pointer.  The value was ready after the current
record had been copied out to the user buffer, so it appeared to work
on architectures where user addresses are directly dereferencable from
the kernel (all but powerpc and i386 after the 4:4 split).  The recent
enablement of SMAP on amd64 caught this violation however.  To fix,
add a variant of PVE_NEXT() for use in the ioctl handler that takes an
explicit value length.

Reported by:	Jeffrey Pieper @ Intel
Reviewed by:	kib
Approved by:	re (gjb)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D16800
2018-08-31 16:10:01 +00:00
Kristof Provost
505e91f500 frag6: Fix fragment reassembly
r337776 started hashing the fragments into buckets for faster lookup.

The hashkey is larger than intended. This results in random stack data being
included in the hashed data, which in turn means that fragments of the same
packet might end up in different buckets, causing the reassembly to fail.

Set the correct size for hashkey.

PR:		231045
Approved by:	re (kib)
MFC after:	3 days
2018-08-31 08:37:15 +00:00
Glen Barber
f488983bdd Revert r338401, as the i386 build is broken.
Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-08-31 02:22:33 +00:00
Warner Losh
91eeadc516 Don't load ccp automatically with devmatch
Remove the PNP info for the moment from the driver. It's an
experimental driver (as noted in r328150). It's performance is about
1/10th that of aesni. It will often panic when used with GELI (PR
2279820).  It's not in our best interest to have such a driver be
autoloaded by default.

Approved by: re@ (rgrimes)
Reviewed By: cem@
Differential Review: https://reviews.freebsd.org/D16959
2018-08-31 01:01:16 +00:00
Glen Barber
7c0f8d816a Update head from ALPHA3 to ALPHA4 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-08-31 00:05:38 +00:00
Emmanuel Vadot
6830111909 omap4_prcm: Delay the frequencies read check
Same as r333305, with Linux 4.17 dts the compatible for the prcm added
'simplebus', it mean that the simplebus driver will attach to it
at the BUS_PASS_BUS pass.
Change the pass for the prcm driver to be at BUS_PASS_BUS so we will win
the attach.
This introduce a problem as this driver needs the omap_scm one to be already
attached. omap_scm also attach at BUS_PASS_BUS but after the prcm one as it is
after in the dtb and the simplebus driver simpy walk the tree to attach it's
children.
Use the bus_new_pass method to defer the frequencies read at BUS_PASS_TIMER.
This fixes booting on pandaboard

Approved by:	re (rgrimes)
2018-08-30 14:32:47 +00:00
Emmanuel Vadot
522a17253e dtb: Add LINKS for omap4 DTBs
We do not use the upstream name so add links so u-boot can load the dtb
for us.

Approved by:	re (rgrimes)
2018-08-30 14:32:10 +00:00
Mark Johnston
f554293615 Re-add kstat.zfs.misc.arcstats.other_size under COMPAT_FREEBSD11.
It is used by a number of applications, notably top(1).

Reported by:	netchild
Reviewed by:	allanjude
Approved by:	re (delphij)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16943
2018-08-30 13:42:01 +00:00
Stephen Hurd
bc0e855bd9 Fix compile error due to missing parenthesis in r338372
Approved by:	re (gjb)
2018-08-29 16:21:34 +00:00
Stephen Hurd
a520f8b6fe Fix potential data corruption in iflib
The MP ring may have txq pointers enqueued.  Previously, these were
passed to m_free() when IFC_QFLUSH was set.  This patch checks for
the value and doesn't call m_free().

Reviewed by:	gallatin
Approved by:	re (gjb)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D16882
2018-08-29 15:55:25 +00:00
Emmanuel Vadot
b83d10091f arm64: GENERIC-MMCCAM: Fix build and module depend
Fix the build of the GENERIC-MMCCAM kernel config after the sdhci_xenon
driver was commited.
While here correct sdhci_fdt and tegra_sdhci, even with MMCCAM they do
need to depend on sdhci(4)

Reported by:	Reshetnikov Dmitriy <genserg@hotmail.com>
Approved by:	re (kib)
Sponsored by:	Rubicon Communications, LLC ("NetGate")
2018-08-29 14:01:27 +00:00
Konstantin Belousov
f0165b1ca6 Remove {max/min}_offset() macros, use vm_map_{max/min}() inlines.
Exposing max_offset and min_offset defines in public headers is
causing clashes with variable names, for example when building QEMU.

Based on the submission by:	royger
Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation (kib)
MFC after:	1 week
Approved by:	re (marius)
Differential revision:	https://reviews.freebsd.org/D16881
2018-08-29 12:24:19 +00:00
Navdeep Parhar
e32cd65c5b cxgbe/iw_cxgbe: Fix iWARP RDMA + VIMAGE operation by setting the VNET
properly in a couple of places in the driver.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Approved by:	re@ (rgrimes@)
Sponsored by:	Chelsio Communications
2018-08-29 04:37:53 +00:00
Mark Johnston
394e8d20d9 Add a sysctl for the ZFS abd_scatter_enabled setting.
Submitted by:	Yamagi Burmeister <lists@yamagi.org> (original version)
Approved by:	re (rgrimes)
MFC after:	3 days
2018-08-29 02:49:18 +00:00
John Baldwin
fd036deac1 Dynamically allocate IRQ ranges on x86.
Previously, x86 used static ranges of IRQ values for different types
of I/O interrupts.  Interrupt pins on I/O APICs and 8259A PICs used
IRQ values from 0 to 254.  MSI interrupts used a compile-time-defined
range starting at 256, and Xen event channels used a
compile-time-defined range after MSI.  Some recent systems have more
than 255 I/O APIC interrupt pins which resulted in those IRQ values
overflowing into the MSI range triggering an assertion failure.

Replace statically assigned ranges with dynamic ranges.  Do a single
pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to
determine the total number of IRQs required.  Allocate the interrupt
source and interrupt count arrays dynamically once this pass has
completed.  To minimize runtime complexity these arrays are only sized
once during bootup.  The PIC range is determined by the PICs present
in the system.  The MSI and Xen ranges continue to use a fixed size,
though this does make it possible to turn the MSI range size into a
tunable in the future.

As a result, various places are updated to use dynamic limits instead
of constants.  In addition, the vmstat(8) utility has been taught to
understand that some kernels may treat 'intrcnt' and 'intrnames' as
pointers rather than arrays when extracting interrupt stats from a
crashdump.  This is determined by the presence (vs absence) of a
global 'nintrcnt' symbol.

This change reverts r189404 which worked around a buggy BIOS which
enumerated an I/O APIC twice (using the same memory mapped address for
both entries but using an IRQ base of 256 for one entry and a valid
IRQ base for the second entry).  Making the "base" of MSI IRQ values
dynamic avoids the panic that r189404 worked around, and there may now
be valid I/O APICs with an IRQ base above 256 which this workaround
would incorrectly skip.

If in the future the issue reported in PR 130483 reoccurs, we will
have to add a pass over the I/O APIC entries in the MADT to detect
duplicates using the memory mapped address and use some strategy to
choose the "correct" one.

While here, reserve room in intrcnts for the Hyper-V counters.

PR:		229429, 130483
Reviewed by:	kib, royger, cem
Tested by:	royger (Xen), kib (DMAR)
Approved by:	re (gjb)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16861
2018-08-28 21:09:19 +00:00
Mark Johnston
c208cb9923 Allow multiple FBT probes to share a tracepoint.
With GNU ifuncs, multiple FBT probes may correspond to the same
instruction.  fbt_invop() assumed that this could not happen and
would return after the first probe found in the global FBT hash
table, which might not be the one that's enabled.  Fix the problem
on x86 by linking probes that share a tracepoint and having each
linked probe fire when the tracepoint is hit.

PR:		230846
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16921
2018-08-28 20:21:36 +00:00
Konstantin Belousov
4ec2e460b5 Regen after r338357.
Approved by:	re (gjb)
2018-08-28 18:50:34 +00:00
Konstantin Belousov
1444bf7c81 Fix compat32 ftruncate cap mode after ino64.
Reported by:    asomers
PR:     230120
Sponsored by:   The FreeBSD Foundation
Approved by:	re (gjb)
2018-08-28 18:49:39 +00:00
Konstantin Belousov
d367236183 Several bug fixes and robustness improvements for the AP boot page
table allocation.

At the time that mp_bootaddress() is called, phys_avail[] array does
not reflect some memory reservations already done, like kernel
placement.  Recent changes to DMAP protection which make kernel text
read-only in DMAP revealed this, where on some machines AP boot page
tables selection appears to intersect with the kernel itself.

Fix this by checking the addresses selected using the same algorithm
as bootaddr_rwx().  Also, try to chomp pages for the page table not
only at the start of the contiguous range, but also at the end.  This
should improve robustness when the only suitable range is already
consumed by the kernel.

Reported and tested by: Michael Gmelin <freebsd@grem.de>
Reviewed by:    jhb
MFC after:      1 week
Sponsored by:   The FreeBSD Foundation
Approved by:    re (gjb)
Differential revision:  https://reviews.freebsd.org/D16907
2018-08-28 18:47:02 +00:00
Navdeep Parhar
d6ddb0848c cxgbe/tom: Unregister shared CPL handlers on module unload. This fixes
a panic with INVARIANTS that occurs when t4_tom is unloaded and reloaded.

Approved by:	re@ (kib@)
2018-08-28 18:16:02 +00:00
Marcin Wojtas
0b5cab4e0f Use ip/ipv6 structures in al_eth only if they are supported
The ip/ipv6 header files are included only if the appropriate definition
exists, but the driver was missing similar checks when using the
ip and ip6_hdr structures.

If the kernel was not built with the INET or INET6 option, the driver
was preventing kernel from being built.

To fix that, the missing ifdef checks were added to the driver.

PR: Bug 230886
Submitted by: Michal Krawczyk <mk@semihalf.com>
Reported by: O. Hartmann
Approved by: re (gjb)
Obtained from: Semihalf
MFC after: 1 week
Sponsored by: Amazon, Inc.
2018-08-28 17:09:41 +00:00
Warner Losh
264d4ffdf1 Add big, nasty abandonware tags to this code.
This code works for some people, but hasn't been updated in a long
time. Still allow people to use this code for the moment, but put a
big, nasty obsolete message to inform and encourage people to move to
the port.

Approved by: re@ (gjb)
Differential Review: https://reviews.freebsd.org/D16894
2018-08-28 14:46:55 +00:00
Warner Losh
c0386fa3af Put building of drm and drm2 modules behind options.
Make the building of drm dependent on MK_MODULE_DRM and the building
of module drm2 on MK_MODULE_DRM2. The defaults are unchanged.

Approved by: re@ (gjb)
Differential Review: https://reviews.freebsd.org/D16894
2018-08-28 14:46:49 +00:00
Andrew Gallatin
4b82a7b62f Reject IPv4 SO_REUSEPORT_LB groups when looking up an IPv6 listening socket
Similar to how the IPv4 code will reject an IPv6 LB group,
we must ignore IPv4 LB groups when looking up an IPv6
listening socket.   If this is not done, a port only match
may return an IPv4 socket, which causes problems (like
sending IPv6 packets with a hopcount of 0, making them unrouteable).

Thanks to rrs for all the work to diagnose this.

Approved by:	re (rgrimes)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D16899
2018-08-27 18:13:20 +00:00
Andrew Turner
9ea0458663 Use the correct register when storing the arm VFP state.
Previously we have been lucky where the state was already in r0, however
this is not guaranteed. Use the passed in register as the location to
store the upper half of the arm VFP registers rather than relying on it
being r0.

Approved by:	re (kib)
2018-08-27 10:08:27 +00:00
Xin LI
a29173be53 Remove arc4random_stir and arc4random_addrandom from stdlib.h.
Users of arc4random(3) should never call them directly.

All ports tree usage was fixed as part of bug 230756.

Relnotes:       yes
Approved by:    re (marius), exp-run (bug 230756 by portmgr antoine)
2018-08-26 18:04:54 +00:00
Mark Murray
19fa89e938 Remove the Yarrow PRNG algorithm option in accordance with due notice
given in random(4).

This includes updating of the relevant man pages, and no-longer-used
harvesting parameters.

Ensure that the pseudo-unit-test still does something useful, now also
with the "other" algorithm instead of Yarrow.

PR:		230870
Reviewed by:	cem
Approved by:	so(delphij,gtetlow)
Approved by:	re(marius)
Differential Revision:	https://reviews.freebsd.org/D16898
2018-08-26 12:51:46 +00:00
Alan Cox
49bfa624ac Eliminate the arena parameter to kmem_free(). Implicitly this corrects an
error in the function hypercall_memfree(), where the wrong arena was being
passed to kmem_free().

Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are
mapped in kmem with execute permissions.  Use this flag to determine which
arena the kmem virtual addresses are returned to.

Eliminate UMA_SLAB_KRWX.  The introduction of VPO_KMEM_EXEC makes it
redundant.

Update the nearby comment for UMA_SLAB_KERNEL.

Reviewed by:	kib, markj
Discussed with:	jeff
Approved by:	re (marius)
Differential Revision:	https://reviews.freebsd.org/D16845
2018-08-25 19:38:08 +00:00
Colin Percival
ee97b2336a Speed up vt(4) by keeping a record of the most recently drawn character and
the foreground and background colours.  In bitblt_text functions, compare
values to this cache and don't re-draw the characters if they haven't changed.
When invalidating the display, clear this cache in order to force characters
to be redrawn; also force full redraws between suspend/resume pairs since odd
artifacts can otherwise result.

When scrolling the display (which is where most time is spent within the vt
driver) this yields a significant performance improvement if most lines are
less than the width of the terminal, since this avoids re-drawing blanks on
top of blanks.

(Note that "re-drawing" here includes writing to the VGA text mode buffer; on
virtualized systems this can be extremely slow since it triggers a glyph
being rendered onto a 640x480 screen).

On a c5.4xlarge EC2 instance (with emulated text mode VGA) this cuts the time
spent in vt(4) during the kernel boot from 1200 ms to 700ms; on my laptop
(with a 3200x1800 display) the corresponding time is reduced from 970 ms down
to 155 ms.

Reviewed by:	imp, cem
Approved by:	re (gjb)
Relnotes:	Significant speedup in vt(4) and the system boot generally.
Differential Revision:	https://reviews.freebsd.org/D16723
2018-08-25 16:14:56 +00:00
Konstantin Belousov
23c97bcba1 Remove dead code in i386 cpu_throw().
Curpmap must be already valid when cpu_throw() is called, even for early
AP startup.

Suggested by:	alc
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (marius)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D16893
2018-08-25 15:31:23 +00:00
Konstantin Belousov
60b7423434 Unify amd64 and i386 vmspace0 pmap activation.
Add pmap_activate_boot() for i386, move the invocation on APs from MD
init_secondary() to x86 init_secondary_tail().

Suggested by:	alc
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
Approved by:	re (marius)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D16893
2018-08-25 15:21:28 +00:00
Bjoern A. Zeeb
e67e4c6392 Unbreak RSS builds after r338257. Folding both RSS blocks together
I missed the closing } of the new combined block.

Pointyhat to:	bz
Reported by:	np
Approved by:	re (kib)
2018-08-24 21:49:21 +00:00
Navdeep Parhar
47c39432b1 Unbreak VLANs after r337943.
ether_set_pcp should not be called from ether_output_frame for VLAN
interfaces -- the vid + pcp will be inserted during vlan_transmit in
that case. r337943 sets the VLAN's ifnet's if_pcp to a proper PCP value
and this led to double encapsulation (once with vid 0 and second time
with vid+pcp).

PR: 230794
Reviewed by:	kib@
Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D16887
2018-08-24 21:48:13 +00:00
Marius Strobl
7f6921ae18 The read accessors generated by __BUS_ACCESSOR() have the problem that
they don't check the result of BUS_READ_IVAR(9) and silently return stack
garbage on failure in case a bus doesn't implement a particular instance
variable for example. With MMC bridges not providing MMCBR_IVAR_RETUNE_REQ,
yet, this in turn can cause mmc(4) to get into a state in which re-tuning
seems to be necessary but is inappropriate, causing mmc_wait_for_request()
to fail. Thus, don't use __BUS_ACCESSOR() for mmcbr_get_retune_req() and
instead provide a version of the latter which returns retune_req_none if
reading MMCBR_IVAR_RETUNE_REQ fails.
One more straight-forward solution would have been to change mmc(4) to not
call mmcbr_get_retune_req() if the current transfer mode doesn't require
re-tuning to begin with. However, for modes such as SDR50, it depends on
the controller whether periodic re-tuning is need. Therefore, knowledge of
whether a particular transfer mode does require re-tuning should be kept
to the bridge drivers.
This change is the generic version of r338271, as intended not requiring
bridge drivers to be touched (unless transfer modes beyond high speed are
to be supported that is).

Approved by:	re (gjb)
2018-08-24 21:08:05 +00:00
Gleb Smirnoff
306abf0f35 Either "free" or "allocated" is misleading here, since an item
in a bucket is free from perspective of UMA consumer, and it is
allocated from perspective of keg.

Discussed with:	markj
Approved by:	re (kib)
2018-08-24 18:47:50 +00:00
Emmanuel Vadot
a9e5047fa6 arm64: Add DTS overlays for A64
- sun50i-a64-sid.dtso registers the Security ID node, needed for thermal
 - sun50i-a64-ths.dtso registers the thermal node, for which we already have a
driver
 - sun50i-a64-timer.dtso registers the timer node, needed as the generic timer
 glitch on A64 SoC.

Approved by:    re (gjb)
2018-08-24 15:00:36 +00:00
Mark Murray
29f9b2a93e Limit the amount of "fast" entropy. We don't need nearly as much
for security, and the excess just slows things down badly.

PR:             230808
Submitted by:   rwmaillists@googlemail.com, but tweeked by me
Reported by:    Danilo Egea Gondolfo <danilo@FreeBSD.org>
Reviewed by:	cem,delphij
Approved by:	re(rgrimes)
Approved by:	so(delphij)
MFC after:      1 Month
Differential Revision:	https://reviews.freebsd.org/D16873
2018-08-24 14:53:46 +00:00
Mark Murray
27064b30f2 Fix braino of mine where the reseeds would happen far too often,
making the kernel process way too busy.

PR:             230808
Submitted by:   Conrad Meyer <cem@FreeBSD.org>
Reported by:    Danilo Egea Gondolfo <danilo@FreeBSD.org>
Reviewed by:	cem,delphij
Approved by:	re(rgrimes)
Approved by:	so(delphij)
MFC after:      1 Month
Security:	Yes
Differential Revision:	https://reviews.freebsd.org/D16872
2018-08-24 14:53:42 +00:00
Michael Tuexen
c6c0be2765 Fix a shadowed variable warning.
Thanks to Peter Lei for reporting the issue.

Approved by:		re(kib@)
MFH:			1 month
Sponsored by:		Netflix, Inc.
2018-08-24 10:50:19 +00:00
Alexander Motin
cb892a4117 Unblock speculative prefetcher also on pool creation.
Fix at r331950 appeared to be incomplete, fixing only case of pool
import, but not pool creation, leaving prefetcher still blocked for
newly created pools.

Approved by:	re (gjb)
MFC after:	1 week
2018-08-24 01:59:25 +00:00
Glen Barber
cd371ac1a2 Update head from ALPHA2 to ALPHA3 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-08-24 00:25:25 +00:00
Warner Losh
592ffb2175 Revert drm2 removal.
Revert r338177, r338176, r338175, r338174, r338172

After long consultations with re@, core members and mmacy, revert
these changes. Followup changes will be made to mark them as
deprecated and prent a message about where to find the up-to-date
driver.  Followup commits will be made to make this clear in the
installer. Followup commits to reduce POLA in ways we're still
exploring.

It's anticipated that after the freeze, this will be removed in
13-current (with the residual of the drm2 code copied to
sys/arm/dev/drm2 for the TEGRA port's use w/o the intel or
radeon drivers).

Due to the impending freeze, there was no formal core vote for
this. I've been talking to different core members all day, as well as
Matt Macey and Glen Barber. Nobody is completely happy, all are
grudgingly going along with this. Work is in progress to mitigate
the negative effects as much as possible.

Requested by: re@ (gjb, rgrimes)
2018-08-24 00:02:00 +00:00
Gleb Smirnoff
a307fb5b0c Fix comment. The actual meaning of ub_cnt is the opposite. 2018-08-23 23:24:28 +00:00
Kirk McKusick
a9c2220f5c Proper spelling of consolidation.
Submitted by: Dimitry Andric
2018-08-23 22:35:14 +00:00
Marius Strobl
602a05b0da - Use le32dec(9) for decoding EXT_CSD values where it makes sense. [1]
- Locally cache some instance variable values in mmc_discover_cards()
  in order to improve the code readability a bit.

Obtained from:	NetBSD [1]
2018-08-23 21:26:58 +00:00
Mark Johnston
899fe184c7 Add a per-pagequeue pdpages counter.
Expose these counters under the vm.domain sysctl node.  The existing
vm.stats.vm.v_pdpages sysctl is preserved.

Reviewed by:	alc (previous version)
Differential Revision:	https://reviews.freebsd.org/D14666
2018-08-23 21:03:45 +00:00
Mark Johnston
99d92d732f Ensure that queue state is cleared when vm_page_dequeue() returns.
Per-page queue state is updated non-atomically, with either the page
lock or the page queue lock held.  When vm_page_dequeue() is called
without the page lock, in rare cases a different thread may be
concurrently dequeuing the page with the pagequeue lock held.  Because
of the non-atomic update, vm_page_dequeue() might return before queue
state is completely updated, which can lead to race conditions.

Restrict the vm_page_dequeue() interface so that it must be called
either with the page lock held or on a free page, and busy wait when
a different thread is concurrently updating queue state, which must
happen in a critical section.

While here, do some related cleanup: inline vm_page_dequeue_locked()
into its only caller and delete a prototype for the unimplemented
vm_page_requeue_locked().  Replace the volatile qualifier for "queue"
added in r333703 with explicit uses of atomic_load_8() where required.

Reported and tested by:	pho
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D15980
2018-08-23 20:34:22 +00:00
Marius Strobl
608226d559 Obtain the bus mode (MMC or SD) from the directly superordinated
bus rather than reaching up to the bridge and use the cached mode
in mmcsd_delete(), too.
2018-08-23 20:25:27 +00:00
Mark Johnston
84c68e5820 Configure -zifunc-noplt for amd64 kernels.
Per r338251, this ensures that ifunc calls have the same ordinary
function calls.

Reviewed by:	emaste (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16750
2018-08-23 19:58:24 +00:00
Michael Tuexen
90ab3571d8 Use arc4rand() instead of read_random() in the SCTP and TCP code.
This was suggested by jmg@.

Reviewed by:		delphij@, jmg@, jtl@
MFC after:		1 month
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D16860
2018-08-23 19:10:45 +00:00
Emmanuel Vadot
4ca213c07a a10_timer: Update the driver so we can use it on other SoC
a10_timer is currently use in UP allwinner SoC (A10 and A13).
Those don't have the generic arm timer.
The arm generic timecounter is broken in the A64 SoC, some attempts have
been made to fix the glitch but users still reported some minor ones.
Since the A64 (and all Allwinner SoC) still have this timer controller, rework
the driver so we can use it in any SoC.
Since it doesn't have the 64 bits counter on all SoC, use one of the
generic 32 bits counter as the timecounter source.

PR:	229644
2018-08-23 18:46:05 +00:00
Emmanuel Vadot
55f3f71ca0 aw_mmc: Handle MMCBR_IVAR_RETUNE_REQ
Without this the mmc stack sometimes think that we are in in a retune
operation and some command like switch the bus width to 4 bits failed.
We now switch correctly to 4 bits mode for sd card.

Reported by:	jmg, others in pine64 irc channel
2018-08-23 18:33:42 +00:00
Marius Strobl
6daf59aa79 Remove a duplicated interface capability bit missed in r336313. 2018-08-23 18:11:55 +00:00
Marius Strobl
6dea80e699 - According to section 2.2.5 of the SDHCI specification version 4.20,
SDHCI_TRNS_ACMD12 is to be set only for multiple-block read/write
  commands without data length information, so don't unconditionally
  set this bit. The result matches what e. g. Linux does.
- Section 2.2.19 of the SDHCI specification version 4.20 states that
  SDHCI_ACMD12_ERR should be only valid if SDHCI_INT_ACMD12ERR is set
  and hardware may clear SDHCI_ACMD12_ERR when SDHCI_INT_ACMD12ERR is
  cleared (differing silicon behavior is specifically allowed, though).
  Thus, read SDHCI_ACMD12_ERR before clearing SDHCI_INT_ACMD12ERR.
  While at it, use the 16-bit accessor rather than the 32-bit one for
  reading the 16-bit SDHCI_ACMD12_ERR.
- SDHCI_INT_TUNEERR isn't one of the ROC bits in SDHCI_INT_STATUS so
  clear it explicitly.
- Add missing prototypes and sort them.
2018-08-23 17:50:41 +00:00
Bjoern A. Zeeb
20cc39d085 MFp4 bz_ipv6_fast:
Migrate udp6_send() v4mapped code to udp6_output() saving us a re-lock and
further simplifying the address-family handling code by eliminating
AF_INET checks and almost all v4mapped handling right after the start
as cases could actually not happen anymore.

Rework output path locking similar to UDP4 allowing for better
parallelism (see r222488, and later versions).

Sponsored by: The FreeBSD Foundation (2012)
Sponsored by: iXsystems (2012)
Differential Revision:	https://reviews.freebsd.org/D3721
2018-08-23 16:54:22 +00:00
Kristof Provost
903eaa68f1 xen/netfront: Ensure curvnet is set
netfront_backend_changed() is called from the xenwatch_thread(), which means
that the curvnet is not set. We have to set it before we can call things like
arp_ifinit().

PR:		230845
2018-08-23 16:52:52 +00:00
Navdeep Parhar
becda72184 cxgbe(4): Use fcmpset instead of cmpset when appropriate.
Suggested by:	mjg@
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-08-23 16:24:27 +00:00
Mark Johnston
5b27058e96 Remove a redundant #ifdef _KERNEL.
Submitted by:	Sebastian Huber <sebastian.huber@embedded-brains.de>
MFC after:	3 days
2018-08-23 15:01:27 +00:00
Emmanuel Vadot
668085e884 arm64: GENERIC: Compile allwinner dtbs 2018-08-23 13:25:32 +00:00
Emmanuel Vadot
26b29cd7aa dtb: Allwinner: Add aarch64 dts files 2018-08-23 13:24:28 +00:00
Emmanuel Vadot
81b5157b94 dtb: aarch64 uses vendor subdirectories, handle that 2018-08-23 13:23:54 +00:00
Emmanuel Vadot
5695558451 make_dtb: Always add root directory in the include path
Some arch like arm and arm64 share DTS files, add the root dts directory
to the include paths.
2018-08-23 13:23:21 +00:00
Emmanuel Vadot
c3f6b2cf9f dts: Import DTS for arm64
- Most of the boards are using U-Boot, u-boot embed a DTB that isn't
compiled with -@ (overlay ready) so we cannot use overlays. We want
overlays, overlays are nice.
 - The DTS life is going to linux, then sometimes it's imported in
U-Boot but it depend on the SoC family, U-Boot doesn't batch import
every DTS like we do. So sometimes to U-Boot DTS are very old. Or when
an interesting patch in commited upstream it is in Linux X+2 (roughly 4
months from now), we then have to wait for U-Boot to catch up, that
give us between 4 and 6 months to have an update.
 - Some boards like the Marvell ones have 3 DTS, the one in the
vendor U-Boot made by Marvell themselves, the one in u-boot mainline
and the one in Linux. I found that the DTS in the Marvell U-Boot have
some problem with FreeBSD (especially the macchiatobin that declare
node with the same address but not the same size, that is not something
that the rman code can handle, it could be modified, I don't know the
code well enough). Also some compatible are used when they shouldn't,
for example they declare the gpio being orion-gpio while this binding
requires interrupts supports, which the node doesn't have.
 - The above situation is mostly the same with RockChip SoCs (possibly
others, those are the only SoCs I work on that have this problem).

 Note that importing the DTS doesn't mean that every board will use
them, I don't intend to copy the DTB to the GENERIC memstick image for
the Overdrive 1000/3000 for example, the ones provided by the firmware
works fine.
 RPI3 will still stay an exception as we use the DTB provided by the
rpi-firmware package, so they come from the rpi foundation linux fork.
2018-08-23 13:21:01 +00:00
Michael Tuexen
4ba1513d1a Don't use the explicit number 32 for the length of the secrets,
use sizeof() or explicit #definesi instead. No functional change.
This was suggested by jmg@.

MFC after:		1 month
XMFC with:		r338053
Sponsored by:		Netflix, Inc.
2018-08-23 06:03:59 +00:00
Warner Losh
d36967bd2b Add a new device flag: DF_ATTACHED_ONCE
This flag is set once the device has been successfully attached. When
set, it inhibits devmatch from trying to match the device. This in
turn allows kldunload to work as expected. Prior to the change, the
driver would immediately reload because devmatch had no notion that
the driver had once been attached, and therefore shouldn't participate
in further matching.

Differential Revision: https://reviews.freebsd.org/D16735
2018-08-23 05:06:16 +00:00
Warner Losh
5fa2979791 Create devctl freeze/thaw.
This adds it to devctl, libdevctl, defines the two IOCTLs and
implements the kernel bits. causes any new drivers that are added via
kldload to be deferred until a 'thaw' comes in. These do not stack: it
is an error to freeze while frozen, or thaw while thawed.

Differential Revision: https://reviews.freebsd.org/D16735
2018-08-23 05:05:47 +00:00
Conrad Meyer
b6c7d9c345 devstat(9): Constify function parameters that can be const
No functional change.

When attempting to document the changed argument types in devstat.9, I
discovered the 20 year old manual page severely mismatched reality even
prior to my simple change.  So I took a first cut pass cleaning that up to
match reality.  I'm sure I've missed some things; the goal was just to leave
it better than when I started.

Sponsored by:	Dell EMC Isilon
2018-08-23 01:42:45 +00:00
Navdeep Parhar
b8bfcb71fd cxgbev(4): Updates to the VF driver to cope with recent ifmedia and
ctrlq changes in the base driver.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-08-23 00:58:10 +00:00
Oleksandr Tymoshenko
ad4c75f74a [ig4] Fix I/O timeout issue with Designware I2C controller on AMD platforms
Due to hardware limitation AMD I2C controller can't trigger pending
interrupt if interrupt status has been changed after clearing
interrupt status bits.  So, I2C will lose the interrupt and IO will be
timed out. Implements a workaround to disable I2C controller interrupt
and re-enable I2C interrupt before existing interrupt handler.

Submitted by:	rajfbsd@gmail.com
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16720
2018-08-22 22:56:01 +00:00