Commit Graph

129054 Commits

Author SHA1 Message Date
Mark Johnston
4013d72684 Fix handling of empty SCM_RIGHTS messages.
As unp_internalize() processes the input control messages, it builds
an output mbuf chain containing the internalized representations of
those messages.  In one special case, that of an empty SCM_RIGHTS
message, the message is simply discarded.  However, the loop which
appends mbufs to the output chain assumed that each iteration would
produce an mbuf, resulting in a null pointer dereference if an empty
SCM_RIGHTS message was followed by a non-empty message.

Fix this by advancing the output mbuf chain tail pointer only if an
internalized control message was produced.

Reported by:	syzbot+1b5cced0f7fad26ae382@syzkaller.appspotmail.com
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-10-08 23:34:48 +00:00
John Baldwin
4f13842f75 Add support for KTLS in the Chelsio TOE module.
This adds a TOE hook to allocate a KTLS session.  It also recognizes
TLS mbufs in the socket buffer and sends those to the NIC using a TLS
work request to encrypt the record before segmenting it.

TOE TLS support must be enabled via the dev.t6nex.<N>.tls sysctl in
addition to enabling KTLS.

Reviewed by:	np, gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21891
2019-10-08 21:40:42 +00:00
John Baldwin
9e14430d46 Add a TOE KTLS mode and a TOE hook for allocating TLS sessions.
This adds the glue to allocate TLS sessions and invokes it from
the TLS enable socket option handler.  This also adds some counters
for active TOE sessions.

The TOE KTLS mode is returned by getsockopt(TLSTX_TLS_MODE) when
TOE KTLS is in use on a socket, but cannot be set via setsockopt().

To simplify various checks, a TLS session now includes an explicit
'mode' member set to the value returned by TLSTX_TLS_MODE.  Various
places that used to check 'sw_encrypt' against NULL to determine
software vs ifnet (NIC) TLS now check 'mode' instead.

Reviewed by:	np, gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21891
2019-10-08 21:34:06 +00:00
Mateusz Guzik
fa43c5d49e amd64: plug spurious cld instructions
ABI already guarantees the direction is forward. Note this does not take care
of i386-specific cld's.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21906
2019-10-08 21:14:11 +00:00
John Baldwin
c59050aab5 Set the FID field in lookaside crypto requests to the rx queue ID.
The PCI block in the adapter requires this field to be set to a valid
queue ID.  It is not clear why it did not fail on all machines, but
the effect was that crypto operations reading input data via DMA
failed with an internal PCI read error on machines with 128G or more
of RAM.

Reported by:	gallatin
Reviewed by:	np
MFC after:	3 days
Sponsored by:	Chelsio Communications
2019-10-08 20:22:05 +00:00
Hans Petter Selasky
08650d170c Fix regression issue after r352989:
As noted by the commit message, callouts are now persistant
and should not be in the auto-zero section of the RQ's and SQ's.
This fixes an assert when using the TX completion event
factor feature with mlx5en(4).

Found by:	gallatin@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-08 19:49:25 +00:00
Dimitry Andric
063e3a6dcc Prepare for merging back to head:
* Set tentative merge date
* Add UPDATING entry
* Bump __FreeBSD_version
* Bump FREEBSD_CC_VERSION
* Bump LLD_REVISION
2019-10-08 18:21:33 +00:00
Dimitry Andric
8b3bc70a2b Merge ^/head r352764 through r353315. 2019-10-08 18:17:02 +00:00
Gleb Smirnoff
1e80e4f26c Remove epoch assertion from if_setlladdr(). Originally this function was
protected by IF_ADDR_LOCK(), which was a mutex, so that two simultaneous
if_setlladdr() can't execute. Later it was switched to IF_ADDR_RLOCK(),
likely by a mistake. Later it was switched to NET_EPOCH_ENTER(). Then I
incorrectly added NET_EPOCH_ASSERT() here.

In reality ifp->if_addr never goes away and never changes its length. So,
doing bcopy() in it is always "safe", meaning it won't dereference a wrong
pointer or write into someone's else memory. Of course doing two bcopy() in
parallel would result in a mess of two addresses, but net epoch doesn't
protect against that, neither IF_ADDR_RLOCK() did.

So for now, just remove the assertion and leave for later a proper fix.

Reported by:	markj
2019-10-08 17:55:45 +00:00
Gleb Smirnoff
e4c40a8a71 Quickly plug another regression from r353292. Again, multicast locking needs
lots of work...

Reported by:	pho
2019-10-08 16:59:17 +00:00
Gleb Smirnoff
e9dc46cc30 In DIAGNOSTIC block of if_delmulti_ifma_flags() enter the network epoch.
This quickly plugs the regression from r353292. The locking of multicast
definitely needs a broader review today...

Reported by:	pho, dhw
2019-10-08 16:45:56 +00:00
Mark Johnston
8f52459857 Simplify pmap_page_array_startup() a bit.
No functional change intended.

Sponsored by:	The FreeBSD Foundation
2019-10-08 16:42:50 +00:00
Mark Johnston
b0fd461524 Avoid erroneously clearing PGA_WRITEABLE in riscv's pmap_enter().
During a CoW fault, we must check for both 4KB and 2MB mappings before
clearing PGA_WRITEABLE on the old mapping's page.  Previously we were
only checking for 4KB mappings.  This was missed in r344106.

MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-10-08 15:03:48 +00:00
Mateusz Guzik
06d25b07c6 amd64 pmap: allocate pv table entries for gaps in PA
This matches the state prior to r353149 and fixes crashes with DRM
modules.

Reported and tested by:	cy, garga, Krasznai Andras
Fixes: r353149 ("amd64 pmap: implement per-superpage locks")
Sponsored by:	The FreeBSD Foundation
2019-10-08 14:59:50 +00:00
Mark Johnston
7992921bfe Clear PGA_WRITEABLE in riscv's pmap_remove_l3().
pmap_remove_l3() may remove the last mapping of a page, in which case
it must clear PGA_WRITEABLE.

Reported by:	Jenkins, via lwhsu
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-10-08 14:54:35 +00:00
Andriy Gapon
ac99b25298 zfs: use atomic_load_64 to read atomic variable in dmu_object_alloc_impl
As long as we support ZFS on 32-bit platforms we should do this for all
64-bit variables that are modified in a lockless fashion using atomic
operations.  Otherwise, there is a risk of a reading a torn value.

Here is a rationale for why I am doing this in dmu_object_alloc_impl:
- it's very recent code
- the code deals with object IDs and a number of objects in a file
  system can overflow 32 bits
- incorrect allocation of an object ID may result in hard to debug
  problems
- fixing all plain reads of 64-bit atomic variables is not a trivial
  undertaking to do in one shot, so I chose to do it incrementally

MFC after:	3 weeks
X-MFC after:	r353301, r353176
2019-10-08 11:27:48 +00:00
Michael Tuexen
953b78bed9 Validate length before use it, not vice versa.
r353060 should have contained this...
This fixes
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18070
MFC after:		3 days
2019-10-08 11:07:16 +00:00
Hans Petter Selasky
a362cf527e Fix regression issue after r353274:
Make sure the vnet_shutdown field is not set until after all
VNET_SYSUNINIT()'s in the SI_SUB_VNET_DONE subsystem have been
executed. Especially the vnet_if_return() functions requires that
if_move() is still operational.

Reported by:	lwhsu@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-08 11:06:24 +00:00
Andriy Gapon
db8bee42ce i386: hide more of atomic 64-bit definitions under _KERNEL
At the moment i386 does not provide 64-bit atomic operations in
userland.  Exposing some atomic_*_64 defines can cause unnecessary
confusion.

Discussed with:	kib
MFC after:	2 weeks
2019-10-08 10:50:16 +00:00
Doug Moore
2288078c5e Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map.
In case the implementation ever changes from using a chain of next pointers,
then changing the macro definition will be necessary, but changing all the
files that iterate over vm_map entries will not.

Drop a counter in vm_object.c that would have an effect only if the
vm_map entry count was wrong.

Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision:	https://reviews.freebsd.org/D21882
2019-10-08 07:14:21 +00:00
Justin Hibbits
84046d16eb powerpc: Implement atomic_(f)cmpset_ for short and char
|
This adds two implementations for each atomic_fcmpset_ and atomic_cmpset_
short and char functions, selectable at compile time for the target
architecture.  By default, it uses a generic shift-and-mask to perform atomic
updates to sub-components of 32-bit words from <sys/_atomic_subword.h>.
However, if ISA_206_ATOMICS is defined it uses the ll/sc instructions for
halfword and bytes, introduced in PowerISA 2.06.  These instructions are
supported by all IBM processors from POWER7 on, as well as the Freescale/NXP
e6500 core.  Although the e5500 and e500mc both implement PowerISA 2.06 they
do not implement these instructions.

As part of this, clean up the atomic_(f)cmpset_acq and _rel wrappers, by
using macros to reduce code duplication.

ISA_206_ATOMICS requires clang or newer binutils (2.20 or later).

Differential Revision:	https://reviews.freebsd.org/D21682
2019-10-08 01:36:34 +00:00
Mark Johnston
cb49ec5431 Improve locking in the IPV6_V6ONLY socket option handler.
Acquire the inp lock before checking whether the socket is already bound,
and around updates to the inp_vflag field.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21867
2019-10-07 23:35:23 +00:00
Mark Johnston
4090e2170d Assert that the PGA_{WRITEABLE,EXECUTABLE} flags do not leak.
Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21783
2019-10-07 23:31:17 +00:00
Mateusz Guzik
7b1fbc424a vm: stop trylocking page queues in vm_page_pqbatch_submit
About 11 minutes of poudriere -s -j 104 and probing on return value of
trylocks reveals that over 10% of attempts fail, which in turn means
there are more atomics performed than necessary.

Trylocking was there to try preventing migration, but it's not very likely
to happen if the lock is uncontested.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21925
2019-10-07 23:19:09 +00:00
Gleb Smirnoff
b8a6e03fac Widen NET_EPOCH coverage.
When epoch(9) was introduced to network stack, it was basically
dropped in place of existing locking, which was mutexes and
rwlocks. For the sake of performance mutex covered areas were
as small as possible, so became epoch covered areas.

However, epoch doesn't introduce any contention, it just delays
memory reclaim. So, there is no point to minimise epoch covered
areas in sense of performance. Meanwhile entering/exiting epoch
also has non-zero CPU usage, so doing this less often is a win.

Not the least is also code maintainability. In the new paradigm
we can assume that at any stage of processing a packet, we are
inside network epoch. This makes coding both input and output
path way easier.

On output path we already enter epoch quite early - in the
ip_output(), in the ip6_output().

This patch does the same for the input path. All ISR processing,
network related callouts, other ways of packet injection to the
network stack shall be performed in net_epoch. Any leaf function
that walks network configuration now asserts epoch.

Tricky part is configuration code paths - ioctls, sysctls. They
also call into leaf functions, so some need to be changed.

This patch would introduce more epoch recursions (see EPOCH_TRACE)
than we had before. They will be cleaned up separately, as several
of them aren't trivial. Note, that unlike a lock recursion the
epoch recursion is safe and just wastes a bit of resources.

Reviewed by:	gallatin, hselasky, cy, adrian, kristof
Differential Revision:	https://reviews.freebsd.org/D19111
2019-10-07 22:40:05 +00:00
Michael Tuexen
746c7ae563 In r343587 a simple port filter as sysctl tunable was added to siftr.
The new sysctl was not added to the siftr.4 man page at the time.
This updates the man page, and removes one left over trailing whitespace.

Submitted by:		Richard Scheffenegger
Reviewed by:		bcr@
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D21619
2019-10-07 20:35:04 +00:00
Edward Tomasz Napierala
1a13f2e6b4 Introduce stats(3), a flexible statistics gathering API.
This provides a framework to define a template describing
a set of "variables of interest" and the intended way for
the framework to maintain them (for example the maximum, sum,
t-digest, or a combination thereof).  Afterwards the user
code feeds in the raw data, and the framework maintains
these variables inside a user-provided, opaque stats blobs.
The framework also provides a way to selectively extract the
stats from the blobs.  The stats(3) framework can be used in
both userspace and the kernel.

See the stats(3) manual page for details.

This will be used by the upcoming TCP statistics gathering code,
https://reviews.freebsd.org/D20655.

The stats(3) framework is disabled by default for now, except
in the NOTES kernel (for QA); it is expected to be enabled
in amd64 GENERIC after a cool down period.

Reviewed by:	sef (earlier version)
Obtained from:	Netflix
Relnotes:	yes
Sponsored by:	Klara Inc, Netflix
Differential Revision:	https://reviews.freebsd.org/D20477
2019-10-07 19:05:05 +00:00
Hans Petter Selasky
4715738b12 Compile time assert a valid subsystem for all VNET init and uninit functions.
Using VNET init and uninit functions outside the given range has undefined
behaviour.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-07 14:24:59 +00:00
Hans Petter Selasky
204e2f30d9 Factor out VNET shutdown check into an own vnet structure field.
Remove the now obsolete vnet_state field. This greatly simplifies the
detection of VNET shutdown and avoids code duplication.

Discussed with:	bz@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-07 14:15:41 +00:00
Hans Petter Selasky
ac66be4122 Make control endpoint quirk for xhci(4) configurable.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-10-07 13:40:29 +00:00
Andriy Gapon
3251c5ae51 fix up r353168, add atomic_swap_64 to i386 version of opensolaris_atomic.S
The compatibility code for the atomic operations in ZFS code is a bit
messy.  In some cases the native definitions are directly made
available, in some cases there are emulated operations in
opensolaris_atomic.c and in yet other cases there are atomic operations
implemented in assembly that were obtained from OpenSolaris / illumos.

This commit adds atomic_swap_64 for use with i386 userland.
The code is copied from illumos.

I am not sure why FreeBSD does not provide that operation natively.
Maybe because we try (or pretend) to support processors that did not
have the necessary instructions.

While here I also added atomic_load_64 for the same reasons.
This is original code based on iilumos atomic_swap_64 and FreeBSD
atomic_load_acq_64_i586.

Pointyhat to:	avg
MFC after:	1 week
2019-10-07 12:53:27 +00:00
Andriy Gapon
862c20fd89 MFV r350898, r351075: 8423 8199 7432 Implement large_dnode pool feature
8423 8199 7432 Implement large_dnode pool feature

7432 Large dnode pool feature
8199 multi-threaded dmu_object_alloc()
8423 Implement large_dnode pool feature
10406 large_dnode changes broke zfs recv of legacy stream

llumos/illumos-gate@54811da5ac
54811da5ac
https://www.illumos.org/issues/8423
https://www.illumos.org/issues/8199
https://www.illumos.org/issues/7432

illumos/illumos-gate@811964cd9f
811964cd9f
https://www.illumos.org/issues/10406

  ZoL issues:
  Improved dnode allocation #6564
  Clean up large dnode code #6262
  Fix dnode_hold() freeing dnode behavior #8172
  Fix dnode allocation race #6414, #6439
  Partial: Raw sends must be able to decrease nlevels #6821, #6864
  Remove unnecessary txg syncs from receive_object() Closes #7197

This updates FreeBSD large_dnode code (that was imported from ZoL) to a
version that was committed to illumos.  It has some cleanups,
improvements and fixes comparing to what we have in FreeBSD now.
I think that the most significant update is 8199 multi-threaded
dmu_object_alloc().

This commit reverts r351077 that was a revert of r351074 and r351076 and
restores those changes.  Required atomic operations should be available
now on all platforms where we build ZFS.

Obtained from:	illumos
MFC after:	3 weeks
2019-10-07 08:14:45 +00:00
Emmanuel Vadot
0a457350fd arm: dts: ti: Fix mmc3 instance by setting it to disabled
DTS Import of Linux 5.3 added a patch that rework the L3 mmc instance
in the AM335x SoC but removed the status = 'disabled' on the node.
This cause the kernel to probe the device even if the board doesn't
have this mmc used and since we don't correctly activate the clock
for this module we panic with an external data abort.
Beaglebone(s) don't have this device anyway so simply disabling it.
Patch for the DTS was sent upstream.
https://patchwork.kernel.org/patch/11176921/

PR:		241089
Reported by:	phk
2019-10-07 08:11:49 +00:00
Andriy Gapon
f1fd3654e7 ZFS: unconditionally use atomic_swap_64
Previously, the code used a plain store on platforms that lacked
atomic_swap_64 and possibly some other platforms as the condition worked
only if atomic_swap_64 was a macro.

MFC after:	1 week
X-MFC after:	r353166, r353167
2019-10-07 08:00:54 +00:00
Andriy Gapon
cf78c4beae ZFS: add emulation of atomic_swap_64 and atomic_load_64
Some 32-bit platforms do not provide 64-bit atomic operations that ZFS
requires, either in userland or at all.  We emulate those operations for
those platforms using a mutex.  That is not entirely correct and it's
very efficient.  Besides, the loads are plain loads, so torn values are
possible.

Nevertheless, the emulation seems to work for some definition of work.

This change adds atomic_swap_64, which is already used in ZFS code, and
atomic_load_64 that can be used to prevent torn reads.

MFC after:	1 week
2019-10-07 07:54:34 +00:00
Andriy Gapon
eab7984cfe add atomic_load_64 for mipsn32
It's just an alias for atomic_load_acq_64 (same as on i386).

MFC after:	1 week
2019-10-07 07:42:26 +00:00
Andriy Gapon
80475f9523 align use of cp15_pmccntr_get with its availability
According to ian, the only armv6 cpu we support is the 1176, so this
change is effectively a no-op.
The change is just to make the code more self-consistent.
The issue was noticed by a standalone module build for armv6.

Reviewed by:	ian
MFC after:	3 weeks
2019-10-07 07:37:42 +00:00
Alan Cox
bc285d6a8f Eliminate an unused declaration. The variable in question is only defined
and used on sparc64.

MFC after:	1 week
2019-10-07 04:22:03 +00:00
Alan Cox
e227f09282 Eliminate a redundant bzero(). The l0 page table page was already zeroed
by efi_1t1_page().

MFC after:	1 week
2019-10-07 03:37:28 +00:00
Justin Hibbits
02e7952133 powerpc64/pmap: Fix release order to match lock order in moea64_enter()
Page PV lock is always taken first, so should be released last.  This also
(trivially) shortens the hold time of the pmap lock.

Submitted by:	mjg
2019-10-07 02:36:42 +00:00
Randall Stewart
5b63b22075 Brad Davis identified a problem with the new LRO code, VLAN's
no longer worked. The problem was that the defines used the
same space as the VLAN id. This commit does three things.
1) Move the LRO used fields to the PH_per fields. This is
   safe since the entire PH_per is used for IP reassembly
   which LRO code will not hit.
2) Remove old unused pace fields that are not used in mbuf.h
3) The VLAN processing is not in the mbuf queueing code. Consequently
   if a VLAN submits to Rack or BBR we need to bypass the mbuf queueing
   for now until rack_bbr_common is updated to handle the VLAN properly.

Reported by:	Brad Davis
2019-10-06 22:29:02 +00:00
Mateusz Guzik
e35cd9e38f ufs: add root vnode caching
See r353150.

Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D21646
2019-10-06 22:18:03 +00:00
Mateusz Guzik
d511f93e45 nfsclient: add root vnode caching
See r353150.

Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D21646
2019-10-06 22:17:29 +00:00
Mateusz Guzik
7682d0be2b tmpfs: add root vnode caching
See r353150.

Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D21646
2019-10-06 22:17:11 +00:00
Mateusz Guzik
559ac49d41 devfs: add root vnode caching
See r353150.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21646
2019-10-06 22:16:55 +00:00
Mateusz Guzik
818631b634 zfs: add root vnode caching
This replaces the approach added in r338927.

See r353150.

Sponsored by:	The FreeBSD Foundation
2019-10-06 22:16:00 +00:00
Mateusz Guzik
dc20b834ca vfs: add optional root vnode caching
Root vnodes looekd up all the time, e.g. when crossing a mount point.
Currently used routines always perform a costly lookup which can be
trivially avoided.

Reviewed by:	jeff (previous version), kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21646
2019-10-06 22:14:32 +00:00
Mateusz Guzik
6114fc8b85 amd64 pmap: implement per-superpage locks
The current 256-lock sized array is a problem in the following ways:
- it's way too small
- there are 2 locks per cacheline
- it is not NUMA-aware

Solve these issues by introducing per-superpage locks backed by pages
allocated from respective domains.

This significantly reduces contention e.g. during poudriere -j 104.
See the review for results.

Reviewed by:	kib
Discussed with:	jeff
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21833
2019-10-06 22:13:35 +00:00
Justin Hibbits
d137ff5521 powerpc/pmap64: Properly parenthesize PV_LOCK_COUNT macros
As pointed out by mjg, without the parentheses the calculations done against
these macros are incorrect, resulting in only 1/3 of locks being used.

Reported by:	mjg
2019-10-06 19:11:01 +00:00
Michael Tuexen
63fb39ba7b Plumb an mbuf leak in a code path that should not be taken. Also avoid
that this path is taken by setting the tail pointer correctly.
There is still bug related to handling unordered unfragmented messages
which were delayed in deferred handling.
This issue was found by OSS-Fuzz testing the usrsctp stack and reported in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=17794

MFC after:		3 days
2019-10-06 08:47:10 +00:00
Kyle Evans
29a5f63951 riscv: use the common sub-word {,f}cmpset implementation
Reviewed by:	mhorne
Differential Revision:	https://reviews.freebsd.org/D21888
2019-10-06 01:35:31 +00:00
Kyle Evans
e3f35d562f Remove the remnants of SI_CHEAPCLONE
SI_CHEAPCLONE was introduced in r66067 for use with cloned bpfs. It was
later also used in tty, tun, tap at points. The rough timeline for being
removed in each of these is as follows:

- r181690: bpf switched to use cdevpriv API by ed@
- r181905: ed@ rewrote the TTY later to be mpsafe
- r204464: kib@ removes it from tun/tap, declaring it unused

I've not yet been able to dig up any other consumers in the intervening 9
years. It is no longer set on any devices in the tree and leaves an
interesting situation in make_dev_sv where we're ok with the device already
being set SI_NAMED.
2019-10-05 21:52:06 +00:00
Kyle Evans
d42fecb5c1 kern_conf: fully initialize cloned devices with make_dev_args, too
Attempting to initialize si_drv{1,2} with mda_si_drv{1,2} does not work if
you are operating on cloned devices.

clone_create must be called prior to the make_dev* family to create/return
the device on the clonelist as needed. This device is later returned early
in newdev(), prior to si_drv{0,1,2} initialization.

This patch simply breaks out of the loop if we've found a device and
finishes init.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21904
2019-10-05 21:44:18 +00:00
Mateusz Guzik
dfa8dae493 devfs: plug redundant bwillwrite avoidance
vn_write already checks for vnode type to see if bwillwrite should be called.

This effectively reverts r244643.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21905
2019-10-05 17:44:33 +00:00
Emmanuel Vadot
c36e2d14ac arm64: rockchip: usb2phy: Add set/get mode
We only support host mode so those functions are just added so
we won't panic when generic-{e,o}hci will set the phy to host mode.

MFC after:	1 month
X-MFC-With:	r353062
2019-10-05 17:36:33 +00:00
Michael Tuexen
2560cb1eb8 Fix a use after free bug when removing remote addresses.
This bug was found by OSS-Fuzz and reported in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=18004

MFC after:		3 days
2019-10-05 13:28:01 +00:00
Michael Tuexen
0941b9dc37 Plumb an mbuf leak found by Mark Wodrich from Google by fuzz testing the
userland stack and reporting it in:
https://github.com/sctplab/usrsctp/issues/396

MFC after:		3 days
2019-10-05 12:34:50 +00:00
Michael Tuexen
44f788d793 Fix the adding of padding to COOKIE-ECHO chunks.
Thanks to Mark Wodrich who found this issue while fuzz testing the
usrsctp stack and reported the issue in
https://github.com/sctplab/usrsctp/issues/382

MFC after:		3 days
2019-10-05 09:46:11 +00:00
Cy Schubert
289850db1c Add missing definition in DEBUG code.
MFC after:	3 days
2019-10-04 22:10:38 +00:00
Conrad Meyer
c76e96edf6 nvdimm(4): Fix Clang build after r353110
Clang spuriously warns about some well-defined C99 static initializers.
Mute it.

X-MFC-With:	r353110
2019-10-04 21:47:09 +00:00
Eric van Gyzen
a912616493 Make the hw.intrs sysctl OID read-only
The handler ignores the new value, so make the OID read-only.

I found this while working on r353111.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2019-10-04 21:46:11 +00:00
Eric van Gyzen
fdd888dee3 Add CTLFLAG_STATS to several debug.softdep sysctl OIDs
Refer to r353111.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2019-10-04 21:44:52 +00:00
Eric van Gyzen
e61e783b83 Add CTLFLAG_STATS to some vfs sysctl OIDs
Add CTLFLAG_STATS to the following OIDs:

vfs.altbufferflushes
vfs.recursiveflushes
vfs.barrierwrites
vfs.flushwithdeps
vfs.reassignbufcalls

Refer to r353111.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2019-10-04 21:43:43 +00:00
Eric van Gyzen
886614922d Add CTLFLAG_STATS to all COUNTER_U64* sysctl OIDs
CTLFLAG_STATS identifies a sysctl OID as statistical or informational,
as opposed to a configurable/tunable OID that changes behavior.
This can be used, for example, to verfiy that the kyua tests do not
modify configurable OIDs when allow_sysctl_side_effects is true.

Add CTLFLAG_STATS to all COUNTER_U64* OIDs.

I will add the flag to more OIDs in a few subsequent commits, to
facilitate MFC.  The flag should be added to many more OIDs.  I plan to
add it those that my test found and some nearby that looked obvious.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2019-10-04 21:39:11 +00:00
Conrad Meyer
cbd974b4b9 nvdimm(4): Add nvdimm_e820 pseudo-bus
nvdimm_e820 is a newbus pseudo driver that looks for "legacy" e820 PRAM
spans and creates ordinary-looking SPA devfs nodes for them
(/dev/nvdimm_spaN).

As these legacy regions lack real NFIT SPA regions and namespace
definitions, they must be administratively sliced up externally using
device.hints.  This is similar in purpose to the Linux memmap= mechanism.

It is assumed that systems with working NFIT tables will not have any use
for this driver, and that that will be the prevailing style going forward,
so if there are no explicit hints provided, this driver does not
automatically create any devices.

Reviewed by:	kib (previous version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D21885
2019-10-04 18:38:47 +00:00
Mariusz Zaborski
5eb65c4ce5 dtrace: 64-bits registers support
The registers in ilumos and FreeBSD have a different number.
In the illumos, last 32-bits register defined is SS an in FreeBSD is GS.
While translating register we should comper it to the highest one.

PR:             240358
Reported by:    lwhsu@
MFC after:      2 weeks
2019-10-04 16:17:00 +00:00
Kyle Evans
291287667c tuntap(4): loosen up tunclose restrictions
Realistically, this cannot work. We don't allow the tun to be opened twice,
so it must be done via fd passing, fork, dup, some mechanism like these.
Applications demonstrably do not enforce strict ordering when they're
handing off tun devices, so the parent closing before the child will easily
leave the tun/tap device in a bad state where it can't be destroyed and a
confused user because they did nothing wrong.

Concede that we can't leave the tun/tap device in this kind of state because
of software not playing the TUNSIFPID game, but it is still good to find and
fix this kind of thing to keep ifconfig(8) up-to-date and help ensure good
discipline in tun handling.

MFC after:	 3 days
2019-10-04 13:43:07 +00:00
Kirk McKusick
44d37182ce Update ffs_getcg() function to accept a flags parameter to be passed
to breadn_flags() in preparation for later need when doing forcible
unmount when disk dies or is removed.

No functional change.

Sponsored by: Netflix
2019-10-04 05:28:36 +00:00
Alan Cox
f4ddd49973 The implementation of arm64_tlb_flushID_SE() was removed from cpufunc_asm.S
in r313347.  Eliminate its declaration from this file.

MFC after:	1 week
2019-10-04 03:55:53 +00:00
John Baldwin
b1e0d0db25 Remove aw_ehci from NOTES to fix LINT kernel builds after r353063.
Reported by:	Jenkins
2019-10-03 21:37:01 +00:00
Michael Tuexen
aac50dab6d When skipping the address parameter, take the padding into account.
MFC after:		3 days
2019-10-03 20:47:57 +00:00
Michael Tuexen
5989470c37 Cleanup sctp_asconf_error_response() and ensure that the parameter
is padded as required. This fixes the followig bug reported by
OSS-Fuzz for the usersctp stack:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=17790

MFC after:		3 days
2019-10-03 20:39:17 +00:00
Konstantin Belousov
c5dac63c15 tmpfs_readdir(): unlock the locked node.
During readdir() we guarantee that the tn_dir.tn_parent does not go
away, but it might be replaced by a parallel rename.  Read tn_parent
only once, then use the cached value.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-03 19:55:05 +00:00
Konstantin Belousov
f7e69a6fa0 tmpfs_rename: style.
Reformat multi-line comments to follow style.
Also fix some typos.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-03 19:51:56 +00:00
Emmanuel Vadot
29ee738052 allwinner: Remove a10_ehci driver
We have generic-ehci since r353062 so use it.

MFC after:	1 month
X-MFC-With:	r353062
2019-10-03 18:58:15 +00:00
Emmanuel Vadot
7a58744fd0 Split out the attachment from the generic-ehci driver
Create an attachment file for the existing ACPI attachment, and create a
new FDT attachment for the generic-ehci driver.

Submitted by:	andrew (Original version)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D19389
2019-10-03 18:53:03 +00:00
Michael Tuexen
967e1a5333 Add missing input validation. This could result in reading from
uninitialized memory.
The issue was found by OSS-Fuzz for usrsctp  and reported in
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=17780

MFC after:		3 days
2019-10-03 18:36:54 +00:00
Kyle Evans
59997c3c46 if_tuntap: create /dev aliases when a tuntap device gets renamed
Currently, if you do:

$ ifconfig tun0 create
$ ifconfig tun0 name wg0
$ ls -l /dev | egrep 'wg|tun'

You will see tun0, but no wg0. In fact, it's slightly more annoying to make
the association between the new name and the old name in order to open the
device (if it hadn't been opened during the rename).

Register an eventhandler for ifnet_arrival_events and catch interface
renames. We can determine if the ifnet is a tun easily enough from the
if_dname, which matches the cevsw.d_name from the associated tuntap_driver.

Some locking dance is required because renames don't require the device to
be opened, so it could go away in the middle of handling the ioctl, but as
soon as we've verified this isn't the case we can attempt to busy the tun
and either bail out if the tun device is dying, or we can proceed with the
rename.

We only create these aliases on a best-effort basis. Renaming a tun device
to "usbctl", which doesn't exist as an ifnet but does as a /dev, is clearly
not that disastrous, but we can't and won't create a /dev for that.
2019-10-03 17:54:00 +00:00
Kyle Evans
c4cad1549e if_tuntap: add a busy/unbusy mechanism, replace destroy OPEN check
A future commit will create device aliases when a tuntap device is renamed
so that it's still easily found in /dev after the rename.  Said mechanism
will want to keep the tun alive long enough to either realize that it's
about to go away or complete the alias creation, even if the alias is about
to get destroyed.

While we're introducing it, using it to prevent open devices from going away
makes plenty of sense and keeps the logic on waking up tun_destroy clean, so
we don't have multiple places trying to cv_broadcast unless it's still in
use elsewhere.
2019-10-03 17:46:27 +00:00
Ed Maste
9923b64177 Remove host binary object drivers from GENERIC
Four drivers (hpt27xx, hptmv, hptnr, hptrr, hpt27xx) include precompiled
binary objects; have users load them as modules if they are needed.

Additional work (i.e., integrating devmatch) required before MFC.

Reviewed by:	markj
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21865
2019-10-03 12:51:57 +00:00
Andriy Gapon
912c3fe715 ZFS: add bookmark renaming
The feature is implemented as an extension of the existing
ZFS_IOC_RENAME ioctl.  Both the userland and the DSL interfaces support
renaming only a single bookmark at a time.  As of now, there is no ZCP
interface to the new functionality.  I am going to add it once the DSL
interface passes a test of time.

This change picks up support for zfs_ioc_namecheck_t::ENTITY_NAME that
was added to ZoL as part of Redacted Send/Receive feature by Paul
Dagnelie <pcd@delphix.com>.  This is needed to allow a bookmark name in
zc_name.

Discussed with:	mahrens
Reviewed by:	bcr (man page)
Sponsored by:	CyberSecure
Differential Revision: https://reviews.freebsd.org/D21795
2019-10-03 11:08:45 +00:00
Konstantin Belousov
d60ac9d561 Remove unnecessary vm/vm_page.h and vm/vm_pager.h includes from
tmpfs/tmpfs_vnodes.c.

Submitted by:	ota@j.email.ne.jp
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21881
2019-10-03 08:25:09 +00:00
Gleb Smirnoff
0b951c55a1 Fix build failure from r353026. Somehow module build allowed this.
Pointy hat to:	glebius
2019-10-03 04:41:57 +00:00
Gleb Smirnoff
30b7addf5a Protect access to seq->xwin[] with the seq mutex.
MFC after:	5 weeks
2019-10-03 02:34:51 +00:00
Gleb Smirnoff
631cabba47 - Remove the compile time limit for number of links a ng_bridge node
can handle.  Instead using an array on node private data, use per-hook
  private data.
- Use NG_NODE_FOREACH_HOOK() to traverse through hooks instead of array.

PR:		240787
Submitted by:	Lutz Donnerhacke <lutz donnerhacke.de>
Differential Revision:	  https://reviews.freebsd.org/D21803
2019-10-03 02:32:55 +00:00
John Baldwin
5c7a45bb14 Fix the EMBEDFS_FORMAT helper variable for riscv64.
It was defined with the wrong MACHINE_ARCH previously.  This permits
using an MFS image without defining MD_ROOT_SIZE which has various
benefits (one being that the build is able to treat the MFS image as
a dependency and properly re-link the kernel with the new image when
building with NO_CLEAN).

MFC after:	2 weeks
Sponsored by:	DARPA
2019-10-02 21:49:39 +00:00
Ed Maste
f91dd6091b simplify path handling in sysctl_try_reclaim_vnode
MAXPATHLEN / PATH_MAX includes space for the terminating NUL, and namei
verifies the presence of the NUL.  Thus there is no need to increase the
buffer size here.

The sysctl passes the string excluding the NUL, so req->newlen equal to
PATH_MAX is too long.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21876
2019-10-02 21:01:23 +00:00
Conrad Meyer
31f1c8fc84 nvdimm: Fix error path mis-free
Regression introduced in r343629 when malloc result was renamed from spa to
spa_mapping and the 'spa' name was instead used to iterate a table, but the
free() target was not updated.

Reviewed by:	kib, scottph
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D21871
2019-10-02 19:13:35 +00:00
Kyle Evans
55c4535d81 sparc64: use generic sub-word atomic *cmpset
Most of this diff is refactoring to reduce duplication between the different
acq_ and rel_ variants.

Differential Revision:	https://reviews.freebsd.org/D21822
2019-10-02 17:08:20 +00:00
Kyle Evans
281ec62c97 mips: use generic sub-word atomic *cmpset
Most of this diff is refactoring to reduce duplication between the different
acq_ and rel_ variants.

Differential Revision:	https://reviews.freebsd.org/D21822
2019-10-02 17:07:59 +00:00
Kyle Evans
b6c5d1ef76 Provide generic sub-word atomic *cmpset
Provide *cmpset_{8,16} as wrappers around atomic_fcmpset_32. Initial users
will be mips and sparc64, and perhaps parts of powerpc.

This are not for general consumption; machine/atomic.h should include this
header as needed to provide atomic_{,f}cmpset_{8,16} and machine/atomic.h
should provide acq_ and rel_ variants.

Reviewed by:	jhibbits, kib
Differential Revision:	https://reviews.freebsd.org/D21822
2019-10-02 17:06:28 +00:00
Mark Johnston
5131cba6d6 Use OBJT_PHYS VM objects for kernel modules.
OBJT_DEFAULT incurs some unnecessary overhead given that kernel module
pages cannot be paged out.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21862
2019-10-02 16:34:42 +00:00
Mark Johnston
551caa8741 Harmonize the hptmv blob's build rule with that of other hpt* drivers.
No functional change intended.

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21866
2019-10-02 16:18:50 +00:00
Mark Johnston
4a7b33ecf4 Disallow fcntl(F_READAHEAD) when the vnode is not a regular file.
The mountpoint may not have defined an iosize parameter, so an attempt
to configure readahead on a device file can lead to a divide-by-zero
crash.

The sequential heuristic is not applied to I/O to or from device files,
and posix_fadvise(2) returns an error when v_type != VREG, so perform
the same check here.

Reported by:	syzbot+e4b682208761aa5bc53a@syzkaller.appspotmail.com
Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21864
2019-10-02 15:45:49 +00:00
Kyle Evans
22c2c971a6 mips: fcmpset: do not spin on sc failure
For ll/sc architectures, atomic(9) allows failure modes where *old == val
due to write failure and callers should compensate for this. Do not retry on
failure, just leave 0 in ret and fail the operation if we couldn't sc it.
This lets the caller determine if it should retry or not.

Reviewed by:	kib
Looks ok:	imp
Differential Revision:	https://reviews.freebsd.org/D21836
2019-10-02 15:13:40 +00:00
Hans Petter Selasky
bb2d815739 Fix build failure for gcc after r352983, due to
not using static variable declared by net/sff8472.h .

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 12:45:39 +00:00
Hans Petter Selasky
67a254a074 Fix build failure for 32-bit platforms after r352991, due to
incorrect printf() formatter string.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 12:02:14 +00:00
Hans Petter Selasky
d3d47408fc Bump driver version for mlx5core, mlx5en(4) and mlx5ib(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:15:35 +00:00
Hans Petter Selasky
fedc7bd218 Print numeric error_type and module_status in mlx5core
in case the strings are not available.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:06:01 +00:00
Hans Petter Selasky
02fd8723c2 Add print to show user a reason for rejecting buffer size change in mlx5en(4).
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:05:05 +00:00
Hans Petter Selasky
e525a7f0eb Only update lossy buffers config when manual PFC configuration was done
in mlx5en(4).

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:02:54 +00:00
Hans Petter Selasky
63dd1be931 Improve mlx5_fwdump_prep logging in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:01:05 +00:00
Hans Petter Selasky
7bf46a63ce Randomize the delay when waiting for VSC flag in mlx5core.
The PRM suggests random 0 - 10ms to prevent multiple waiters on the same
interval in order to avoid starvation.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:59:44 +00:00
Hans Petter Selasky
59efbf791d Wait for FW readiness before initializing command interface in mlx5core.
Before attempting to initialize the command interface we must wait till
the fw_initializing bit is clear.

If we fail to meet this condition the hardware will drop our
configuration, specifically the descriptors page address.  This scenario
can happen when the firmware is still executing an FLR flow and did not
finish yet so the driver needs to wait for that to finish.

Linux commits:
6c780a0267b8
b8a92577f4be.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:53:28 +00:00
Hans Petter Selasky
c84e0068ce Fix regression issue about bad refcounting of unlimited send tags
in mlx5en(4) after r348254.

The unlimited send tags are shared amount multiple connections and are
not allocated per send tag allocation request. Only increment the refcount.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:46:57 +00:00
Hans Petter Selasky
eeb1ff9800 Seal transmit path with regards to using destroyed mutex in mlx5en(4).
It may happen during link down that the running state may be observed
non-zero in the transmit routine, right before the running state is
cleared. This may end up using a destroyed mutex.

Make all channel mutexes and callouts persistant.

Preserve receive and send queue statistics during link toggle.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:43:49 +00:00
Hans Petter Selasky
b4672400e6 Remove unused cpu field from channel structure in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:26:26 +00:00
Hans Petter Selasky
a2d65bfd8f Remove mkey_be from channel structure in mlx5en(4).
Use value from priv structure instead.
This saves some space in the channel structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:25:47 +00:00
Hans Petter Selasky
3e40712eb0 Return an error from ioctl(MLX5_FW_RESET) if reset was rejected in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:24:13 +00:00
Hans Petter Selasky
96425f44c9 Add sysctl(8) to get and set forward error correction, FEC, configuration
in mlx5en(4).

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:22:15 +00:00
Hans Petter Selasky
048ddb58bc Move EEPROM information query from a sysctl in mlx5en(4) to an ioctl
in mlx5core. The EEPROM information is not only a property of the
mlx5en(4) driver.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:14:55 +00:00
Hans Petter Selasky
6deb0b1e94 Add support for buffer parameter manipulations in mlx5en(4).
The following sysctls are added:
dev.mce.N.conf.qos.cable_length
dev.mce.N.conf.qos.buffers_size
dev.mce.N.conf.qos.buffers_prio

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:08:04 +00:00
Hans Petter Selasky
c28ef24918 Import Linux code to query/set buffer state in mlx5en(4).
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:05:34 +00:00
Hans Petter Selasky
d585ff62c4 Add mlx5e_dbg() compatibility macro.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:59:42 +00:00
Hans Petter Selasky
006ae571da Update definitons for PPTB and PBMC registers layouts in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:58:00 +00:00
Hans Petter Selasky
207ff00e26 Add definition for the Port Buffer Status Register in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:57:12 +00:00
Hans Petter Selasky
8ae1c36f8b Sort the ports registers definitions numerically in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:56:27 +00:00
Hans Petter Selasky
6b4040d8ff Unify prints in mlx5en(4).
All prints in mlx5en(4) should use on of the macros:
mlx5_en_err/dbg/warn

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:49:44 +00:00
Hans Petter Selasky
a2f4f59ca8 Unify prints in mlx5core.
All prints in mlx5core should use on of the macros:
mlx5_core_err/dbg/warn

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:48:01 +00:00
Hans Petter Selasky
c9bb26aef1 Add proper print in case of 0x0 health syndrome in mlx5core.
In case of health counter fails to increment it indicates a bad device health.
In case when the syndrome indicated by firmware is 0x0, this indicates that
firmware is unable to respond to initialization segment reads.
Add proper print in this case.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:46:14 +00:00
Hans Petter Selasky
95c05e056f Add missing blank line at the end of the print in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:45:07 +00:00
Hans Petter Selasky
4bc8507b82 Remove no longer needed fwdump register tables from mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:43:48 +00:00
Hans Petter Selasky
4745819044 Read rege map from crdump scan space in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:40:23 +00:00
Hans Petter Selasky
f29c160ef3 Define MLX5_VSC_DOMAIN_SCAN_CRSPACE.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:34:34 +00:00
Hans Petter Selasky
4d98df72cd Use the MLX5_VSC_DOMAIN_SEMAPHORES constant instead of hand-rolled symbol
in mlx5core.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:33:38 +00:00
Hans Petter Selasky
04910901bc Move mlx5_ifc_vsc_space_bits and mlx5_ifc_vsc_addr_bits to mlx5_ifc.h.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:32:41 +00:00
Hans Petter Selasky
e456decc55 Make the mlx5_vsc_wait_on_flag(9) function global.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:31:36 +00:00
Hans Petter Selasky
111b57c359 Add port module event software counters in mlx5core.
While at it, fixup PME based on latest PRM defines.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:29:55 +00:00
Hans Petter Selasky
55221653c0 Correct and update some counter names in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:27:56 +00:00
Hans Petter Selasky
2110251a39 Export channel IRQ number as part of the "hw_ctx_debug" sysctl(8) in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:27:08 +00:00
Hans Petter Selasky
6226306b47 Cleanup naming of IRQ vectors in mlx5en.
Remove unused IRQ naming functions and arrays.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:23:33 +00:00
Hans Petter Selasky
66b38bfe3d Add support for Multi-Physical Function Switch, MPFS, in mlx5en.
MPFS is a logical switch in the Mellanox device which forward packets
based on a hardware driven L2 address table, to one or more physical-
or virtual- functions. The physical- or virtual- function is required
to tell the MPFS by using the MPFS firmware commands, which unicast
MAC addresses it is requesting from the physical port's traffic.
Broadcast and multicast traffic however, is copied to all listening
physical- and virtual- functions and does not need a rule in the MPFS
switching table.

Linux commit:	eeb66cdb682678bfd1f02a4547e3649b38ffea7e
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:22:22 +00:00
Hans Petter Selasky
2db3dd5061 Implement macro for asserting priv lock in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:16:17 +00:00
Hans Petter Selasky
2e16440940 Fix for missing cleanup code in error case in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:15:07 +00:00
Hans Petter Selasky
53784e3632 Check return value of mlx5_vector2eqn() function in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:14:01 +00:00
Hans Petter Selasky
8e773e55f4 Make sure the number of IRQ vectors doesn't exceed 256 in mlx5core.
The "intr" field in "struct mlx5_ifc_eqc_bits" is only 8 bits wide.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:12:53 +00:00
Hans Petter Selasky
b632492966 Update warning and error print formats in mlx5ib.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:11:01 +00:00
Hans Petter Selasky
c788dcead1 Fix reported max SGE calculation in mlx5ib.
Add the 512 bytes limit of RDMA READ and the size of remote address to the max
SGE calculation.

Submitted by:	slavash@
Linux commit:	288c01b746aa
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:09:28 +00:00
Hans Petter Selasky
6fe20cefa3 Make sure the transmit loop doesn't get starved in ipoib.
When the software send queue gets filled up, callbacks to
if_transmit will stop. Make sure the transmit callback
routine checks the send queue and outputs any remaining
mbufs. Else the remaining mbufs may simply sit in the
output queue blocking the transmit path.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:06:13 +00:00
Hans Petter Selasky
e8f206cc60 Notify all sleeping threads of device removal in krping.
Implement d_purge for krping_cdevsw.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:03:48 +00:00
Kyle Evans
5a391b572b shm_open2(2): completely unbreak
kern_shm_open2(), since conception, completely fails to pass the mode along
to kern_shm_open(). This breaks most uses of it.

Add tests alongside this that actually check the mode of the returned
files.

PR:		240934 [pulseaudio breakage]
Reported by:	ler, Andrew Gierth [postgres breakage]
Diagnosed by:	Andrew Gierth (great catch)
Tested by:	ler, tmunro
Pointy hat to:	kevans
2019-10-02 02:37:34 +00:00
Emmanuel Vadot
150c95edfe generic_ehci: Enable all phys and resets
The number of phys and resets is not defined and it controller dependent
so enable/disable every one of them.
2019-10-01 22:20:03 +00:00
Emmanuel Vadot
fbf39b9767 arm: allwinner: a10_ehci: Enable all phys
Even if there should be only one phy enable all the ones declared in
the dts just to be sure.
2019-10-01 22:19:12 +00:00
Emmanuel Vadot
ecd012f975 arm: allwinner: a10_ehci: Look for the phy based on the id
phy-names was never in the bindings schema even if it was present
in some DTS. Get the optional phy based on its ID.

PR:		240978
2019-10-01 20:22:54 +00:00
Emmanuel Vadot
4dbb3f478b generic_ohci: Look for the phy based on the id
phy-names was never in the bindings schema even if it was present
in some DTS. Get the optional phy based on its ID.
2019-10-01 20:21:49 +00:00
Alexander Motin
e9f4580d92 Improve latency of synchronous 128KB writes.
Before my ZIL space optimization few years ago 128KB writes were logged
as two 64KB+ records in two 128KB log blocks.  After that change it became
~124KB+/4KB+ in two 128KB log blocks to free space in the second block
for another record.  Unfortunately in case of 128KB only writes, when space
in the second block remained unused, that change increased write latency by
imbalancing checksum computation time between parallel threads.

This change introduces new 68KB log block size, used for both writes below
67KB and 128KB-sharp writes.  Writes of 68-127KB are still using one 128KB
block to not increase processing overhead.  Writes above 131KB are still
using full 128KB blocks, since possible saving there is small.  Mixed loads
will likely also fall back to previous 128KB, since code uses maximum of
the last 10 requested block sizes.

On a simple 128KB write test with queue depth of 1 this change demonstrates
~15-20% performance improvement.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2019-10-01 20:09:25 +00:00
Ian Lepore
044477e294 Add 8 and 16 bit versions of atomic_cmpset and atomic_fcmpset for arm.
This adds 8 and 16 bit versions of the cmpset and fcmpset functions. Macros
are used to generate all the flavors from the same set of instructions; the
macro expansion handles the couple minor differences between each size
variation (generating ldrexb/ldrexh/ldrex for 8/16/32, etc).

In addition to handling new sizes, the instruction sequences used for cmpset
and fcmpset are rewritten to be a bit shorter/faster, and the new sequence
will not return false when *dst==*old but the store-exclusive fails because
of concurrent writers. Instead, it just loops like ldrex/strex sequences
normally do until it gets a non-conflicted store. The manpage allows LL/SC
architectures to bogusly return false, but there's no reason to actually do
so, at least on arm.

Reviewed by:	cognet
2019-10-01 19:39:00 +00:00
Emmanuel Vadot
9f45d455d7 syr827: Switch to iicdev_{readfrom,writeto}
Also use IIC_INTRWAIT as we need this to work with the rockchip i2c driver.
2019-10-01 18:32:27 +00:00
Emmanuel Vadot
361a394828 arm64: rockchip: rk805: Switch to iicdev_{readfrom,writeto}
This simpify the code a bit.
2019-10-01 18:30:06 +00:00
Ed Maste
f403831e6c sysalls.master: remove superfluous ellipsis in comment
A single period is sufficient in this comment, and making this change
lets us find references to varargs syscalls by searching for ...
2019-10-01 17:05:21 +00:00