Commit Graph

137136 Commits

Author SHA1 Message Date
Emmanuel Vadot
7cbdf8a05d dwmmc: Add \n to a debug printf 2021-04-27 19:01:09 +02:00
Emmanuel Vadot
f1cc48e5da mmc: dwmmc: Convert driver to use the mmc_sim interface
A lot more generic cam related things are done in mmc_sim so this simplify
the driver a lot.

Differential Revision:	https://reviews.freebsd.org/D27487
Reviewed by:	kibab
2021-04-27 19:00:47 +02:00
Emmanuel Vadot
2671bdb540 allwinner: aw_mmc: Convert driver to use the mmc_sim interface
A lot more generic cam related things are done in mmc_sim so this simplify
the driver a lot.

Differential Revision:	https://reviews.freebsd.org/D27486
Reviewed by:	imp
2021-04-27 19:00:42 +02:00
Emmanuel Vadot
47bde7925b mmccam: Add mmc_sim, a generic sim for mmc driver to use
This adds a generic sim that abstract a lot of what needs to be implemented
in a driver for mmccam support.
A new interface with three methods is added :

 - mmc_sim_get_tran_settings: Use to get what the controller supports in term
   of capabilities, freq etc ...
 - mmc_sim_set_tran_settings: Use to change the speed/freq/etc of the
   sdcard host controller
 - mmc_sim_cam_request: Used for MMCIO requests

Differential Revision:	https://reviews.freebsd.org/D27485
Reviewed by:	kibab
2021-04-27 19:00:38 +02:00
Ruslan Bukin
4c1ecf5502 Consider the broken card detect flag that comes from 'broken-cd;'
dts property.

This fixes operation on Intel Stratix 10 devices.

Tested on Terasic DE10-Pro.

Reviewed by: manu
Sponsored by: UKRI
Differential revision: https://reviews.freebsd.org/D29999
2021-04-27 12:19:05 +01:00
Michael Tuexen
059ec2225c sctp: cleanup verification of INIT and INIT-ACK chunks 2021-04-27 12:45:43 +02:00
Alexander V. Chernikov
439d087d0b [fib algo] always commit static routes synchronously.
Modular fib lookup framework features logic that allows
 route update batching for the algorithms that cannot easily
 apply the routing change without rebuilding. As a result,
 dataplane lookups may return old data until the the sync
 takes place. With the default sync timeout of 50ms, it is
 possible that new binary like ping(8) executed exactly after
 route(8) will still use the old fib data.

To address some aspects of the problem, framework executes
 all rtable changes without RTF_GATEWAY synchronously.

To fix the aforementioned problem, this diff extends sync
 execution for all RTF_STATIC routes (e.g. ones maintained by
 route(8).
This fixes a bunch of tests in the networking space.

Reported by:	ci, arichardson
MFC after:	2 weeks
2021-04-27 08:31:40 +00:00
Alexander V. Chernikov
25682e6a49 Fix rtsock sockaddr alignment.
b31fbebeb3 introduced alloc_sockaddr_aligned() which, in fact,
 failed to produce aligned addresses.

Reported by:	Oskar Holmlund <oskar.holmlund at yahoo.com>
MFC after:	immediately
2021-04-27 08:04:19 +00:00
Alexander V. Chernikov
bc5ef45aec Fix drace CTF for the rib_head.
33cb3cb2e3 introduced an `rib_head` structure field under the
FIB_ALGO define. This may be problematic for the CTF, as some
 of the files including `route_var.h` do not have `fib_algo`
 defined.

Make dtrace happy by making the field unconditional.

Suggested by:	markj
2021-04-27 07:47:53 +00:00
Rick Macklem
f5ff282bc0 nfscl: fix the handling of NFSERR_DELAY for Open/LayoutGet RPCs
For a pNFS mount, the NFSv4.1/4.2 client uses compound RPCs that
have both Open and LayoutGet operations in them.
If the pNFS server were tp reply NFSERR_DELAY for one of these
compounds, the retry after a delay cannot be handled by
newnfs_request(), since there is a reference held on the open
state for the Open operation in them.

Fix this by adding these RPCs to the "don't do delay here"
list in newnfs_request().

This patch is only needed if the mount is using pNFS (the "pnfs"
mount option) and probably only matters if the MDS server
is issuing delegations as well as pNFS layouts.

Found by code inspection.

MFC after:	2 weeks
2021-04-26 17:48:21 -07:00
Rick Macklem
61aea7fa3c param.h: bump __FreeBSD_version for commit 8759773148
Commit 8759773148 changed the internal KPI between the
nfsd and nfscommon modules, so both need to be rebuilt
from sources.
2021-04-26 16:35:18 -07:00
Rick Macklem
8759773148 nfsd: fix the slot sequence# when a callback fails
Commit 4281bfec36 patched the server so that the
callback session slot would be free'd for reuse when
a callback attempt fails.
However, this can often result in the sequence# for
the session slot to be advanced such that the client
end will reply NFSERR_SEQMISORDERED.

To avoid the NFSERR_SEQMISORDERED client reply,
this patch negates the sequence# advance for the
case where the callback has failed.
The common case is a failed back channel, where
the callback cannot be sent to the client, and
not advancing the sequence# is correct for this
case.  For the uncommon case where the client's
reply to the callback is lost, not advancing the
sequence# will indicate to the client that the
next callback is a retry and not a new callback.
But, since the FreeBSD server always sets "csa_cachethis"
false in the callback sequence operation, a retry
and a new callback should be handled the same way
by the client, so this should not matter.

Until you have this patch in your NFSv4.1/4.2 server,
you should consider avoiding the use of delegations.
Even with this patch, interoperation with the
Linux NFSv4.1/4.2 client in kernel versions prior
to 5.3 can result in frequent 15second delays if
delegations are enabled.  This occurs because, for
kernels prior to 5.3, the Linux client does a TCP
reconnect every time it sees multiple concurrent
callbacks and then it takes 15seconds to recover
the back channel after doing so.

MFC after:	2 weeks
2021-04-26 16:24:10 -07:00
Navdeep Parhar
43bbae1948 cxgbe(4): Separate the sw- and hw-specific parts of resource allocations
The driver uses both software resources (locks, callouts, memory for
descriptors and for bookkeeping, sysctls, etc.) and hardware resources
(VIs, DMA queues, TCAM entries, etc.) to operate the NIC.  This commit
splits the single *_ALLOCATED flag used to track all these resources
into separate *_SW_ALLOCATED and *_HW_ALLOCATED flags.

This is the simplified pseudocode that now applies to most queues (foo
can be ctrlq/txq/rxq/ofld_txq/ofld_rxq):

/* Idempotent */
alloc_foo
{
	if (!SW_ALLOCATED)
		init_iq/init_eq/init_fl		no-fail sw init
		alloc_iq_fl/alloc_eq/alloc_wrq	may-fail sw alloc
		add_foo_sysctls, etc.		no-fail post-alloc items
	if (!HW_ALLOCATED)
		alloc_iq_fl_hwq/alloc_eq_hwq	hw resource allocation
}

/* Idempotent */
free_foo
{
	if (!HW_ALLOCATED)
		free_iq_fl_hwq/free_eq_hwq	release hw resources
	if (!SW_ALLOCATED)
		free_iq_fl/free_eq/free_wrq	release sw resources
}

The routines that take the driver to FULL_INIT_DONE and VI_INIT_DONE and
back are now all idempotent.  The quiesce routines pay attention to the
HW_ALLOCATED flag and will not wait on the hardware for pidx/cidx
updates and other completions if this flag is not set.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2021-04-26 14:09:59 -07:00
Michael Tuexen
c70d1ef15d sctp: improve handling of illegal packets containing INIT chunks
Stop further processing of a packet when detecting that it
contains an INIT chunk, which is too small or is not the only
chunk in the packet. Still allow to finish the processing
of chunks before the INIT chunk.

Thanks to Antoly Korniltsev and Taylor Brandstetter for reporting
an issue with the userland stack, which made me aware of this
issue.

MFC after:	3 days
2021-04-26 10:43:58 +02:00
Martin Matuska
9f1dc86c46 zfs: restore copyright disclaimer change from 4b84b4cca
The change will be pull-requested to upstream.

X-MFC-with:	4b84b4cca4
2021-04-26 22:16:50 +02:00
Mark Johnston
409ab7e109 imgact_elf: Ensure that the return value in parse_notes is initialized
parse_notes relies on the caller-supplied callback to initialize "res".
Two callbacks are used in practice, brandnote_cb and note_fctl_cb, and
the latter fails to initialize res.  Fix it.

In the worst case, the bug would cause the inner loop of check_note to
examine more program headers than necessary, and the note header usually
comes last anyway.

Reviewed by:	kib
Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29986
2021-04-26 14:53:16 -04:00
Warner Losh
099919b76d newbus: remove support for SINGLETON
Revert rest of de8dd262c4 since it's now unused.

jhibbits@ introduced this to give powerpc MMU functions IFUNC like
performance while retaining the kobj interface, speeding up operations
10-20%. Since there was only ever one instance of the mmu interface
active at any given time, we could cache the looked up results more
agressively.

powerpc migrated to using IFUNCs to get an even larger performance boost
in 45b69dd63e, deleting the two files it was added to in de8dd262c4.

However, there's few, if any, other potential applications of this to
the tree today. It's now unused and undocumented. Retire it to eliminate
this wart and to preclude the need to document it. Should a simmilar
case arise in the future, the code is in git...

Discusssed with:	jhibbits@
Reviewed by:		jhb@
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29997
2021-04-26 11:41:08 -06:00
Kevin Bowling
ba7b31b3e9 e1000: Fix register name in reg_dump sysctl
The correct name of this register is CTRL_EXT.

Approved by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29967
2021-04-26 09:30:54 -07:00
Kristof Provost
402dfb0a8d pf: Fix parsing of long table names
When parsing the nvlist for a struct pf_addr_wrap we unconditionally
tried to parse "ifname". This broke for PF_ADDR_TABLE when the table
name was longer than IFNAMSIZ. PF_TABLE_NAME_SIZE is longer than
IFNAMSIZ, so this is a valid configuration.

Only parse (or return) ifname or tblname for the corresponding
pf_addr_wrap type.

This manifested as a failure to set rules such as these, where the pfctl
optimiser generated an automatic table:

	pass in proto tcp to 192.168.0.1 port ssh
	pass in proto tcp to 192.168.0.2 port ssh
	pass in proto tcp to 192.168.0.3 port ssh
	pass in proto tcp to 192.168.0.4 port ssh
	pass in proto tcp to 192.168.0.5 port ssh
	pass in proto tcp to 192.168.0.6 port ssh
	pass in proto tcp to 192.168.0.7 port ssh

Reported by:	Florian Smeets
Tested by:	Florian Smeets
Reviewed by:	donner
X-MFC-With:	5c11c5a365
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29962
2021-04-26 18:08:15 +02:00
Neel Chauhan
e657f3de6d linuxkpi: Remove unneeded {} in atomic_dec_and_lock_irqsave() 2021-04-26 08:25:33 -07:00
Neel Chauhan
c8de6e2015 linuxkpi: Elimiate brackets on return in spinlock.h 2021-04-26 08:16:48 -07:00
Neel Chauhan
ce65353ac1 linuxkpi: Implement atomic_dec_and_lock_irqsave()
This is needed by the drm-kmod 5.5 update.

Reviewed by:		hselasky, manu
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D29988
2021-04-26 08:15:49 -07:00
Neel Chauhan
057f145aae linuxkpi: Implement the wait_event_interruptible macro
This is needed by the drm-kmod 5.5 update and is similar in logic to the
existing wait_event_killable macro.

Reviewed by:		hselasky, manu
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D29987
2021-04-26 08:12:18 -07:00
Kristof Provost
5f5bf88949 pfsync: Expose PFSYNCF_OK flag to userspace
Add 'syncok' field to ifconfig's pfsync interface output. This allows
userspace to figure out when pfsync has completed the initial bulk
import.

Reviewed by:	donner
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29948
2021-04-26 14:31:17 +02:00
Kristof Provost
6fcc8e042a pf: Allow multiple labels to be set on a rule
Allow up to 5 labels to be set on each rule.
This offers more flexibility in using labels. For example, it replaces
the customer 'schedule' keyword used by pfSense to terminate states
according to a schedule.

Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29936
2021-04-26 14:14:21 +02:00
Michael Tuexen
163153c2a0 sctp: small cleanup, no functional change
MFC:		3 days
2021-04-26 02:56:48 +02:00
Kevin Bowling
0f6bea61ed e1000: Improve device name strings
This is just clerical work to ease bug triage and may be used to set
expectations around the ability for anyone in the community to perform
testing and development on older parts (this driver covers over 20 years
of silicon)

Reviewed by:	erj
Approved by:	markj
Sponsored by:	Pink Floyd - Any Colour You Like (in kind)
Differential Revision:	https://reviews.freebsd.org/D29872
2021-04-25 22:08:54 -07:00
Patrick Kelsey
ca7005f189 iflib: Improve mapping of TX/RX queues to CPUs
iflib now supports mapping each (TX,RX) queue pair to the same CPU
(default), to separate CPUs, or to a pair of physical and logical CPUs
that share the same L2 cache.  The mapping mechanism supports unequal
numbers of TX and RX queues, with the excess queues always being
mapped to consecutive physical CPUs.  When the platform cannot
distinguish between physical and logical CPUs, all are treated as
physical CPUs.  See the comment on get_cpuid_for_queue() for the
entire matrix.

The following device-specific tunables influence the mapping process:
dev.<device>.<unit>.iflib.core_offset       (existing)
dev.<device>.<unit>.iflib.separate_txrx     (existing)
dev.<device>.<unit>.iflib.use_logical_cores (new)

The following new, read-only sysctls provide visibility of the mapping
results:
dev.<device>.<unit>.iflib.{t,r}xq<n>.cpu

When an iflib driver allocates TX softirqs without providing reference
RX IRQs, iflib now binds those TX softirqs to CPUs using the above
mapping mechanism (that is, treats them as if they were TX IRQs).
Previously, such bindings were left up to the grouptaskqueue code and
thus fell outside of the iflib CPU mapping strategy.

Reviewed by:	kbowling
Tested by:	olivier, pkelsey
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D24094
2021-04-26 01:06:34 -04:00
Martin Matuska
4b84b4cca4 zfs: fix non-functional mismerges from vendor/openzfs
- fix copyright in module/os/freebsd/spl/spl_acl.c
- fix mismerge in non-processed module/os/linux/zfs/zfs_uio.c

MFC after:      3 days
Obtained from:  OpenZFS
2021-04-26 03:05:13 +02:00
Rick Macklem
aad780464f nfscl: return delegations in the NFS VOP_RECLAIM()
After a vnode is recycled it can no longer be
acquired via vfs_hash_get() and, as such,
a delegation for the vnode cannot be recalled.

In the unlikely event that a delegation still
exists when the vnode is being recycled, return
the delegation since it will no longer be
recallable.

Until you have this patch in your NFSv4 client,
you should consider avoiding the use of delegations.

MFC after:	2 weeks
2021-04-25 17:57:55 -07:00
Rick Macklem
02695ea890 nfscl: fix delegation recall when the file is not open
Without this patch, if a NFSv4 server recalled a
delegation when the file is not open, the renew
thread would block in the NFS VOP_INACTIVE()
trying to acquire the client state lock that it
already holds.

This patch fixes the problem by delaying the
vrele() call until after the client state
lock is released.

This bug has been in the NFSv4 client for
a long time, but since it only affects
delegation when recalled due to another
client opening the file, it got missed
during previous testing.

Until you have this patch in your client,
you should avoid the use of delegations.

MFC after:	2 weeks
2021-04-25 12:55:00 -07:00
Alexander V. Chernikov
7d222ce3c1 Fix NOINET[6],!VIMAGE builds after FIB_ALGO addition to GENERIC
Reported by:	jbeich
PR:		255390
2021-04-21 05:53:42 +01:00
Edward Tomasz Napierala
5d1d844a77 kern_linkat: modify to accept AT_ flags instead of FOLLOW/NOFOLLOW
This makes this API match other kern_xxxat() functions.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29776
2021-04-25 14:13:12 +01:00
Alexander V. Chernikov
67372fb3e0 Fix NOINET[6] build after enabling FIB_ALGO in GENERIC.
Submitted by:	jbeich
PR:		255389
2021-04-21 02:49:18 +01:00
Alexander V. Chernikov
c23385612d [fib algo] Do not print algo attach/detach message on boot
MFC after:	1 day
2021-04-25 08:58:06 +00:00
Alexander V. Chernikov
a81e2e7890 Make gcc happy by initializing error in rib_handle_ifaddr_info(). 2021-04-25 08:44:59 +00:00
Stefan Eßer
6409e59427 Fix build with gcc
Correctly declare function without arguments as f(void) instead of f().
2021-04-25 10:15:17 +02:00
Alexander V. Chernikov
6993187a8c Add FIB_ALGO to GENERIC on amd64/arm64.
Option `FIB_ALGO` gates new modular fib lookup functionality,
 enabling more performant routing table lookups and improving
 control plane convergence under the load.

Detailed feature description is available in D27401.

Reviewed By: olivier, gnn
Differential Revision: https://reviews.freebsd.org/D28434
2021-04-24 23:22:58 +00:00
Alexander V. Chernikov
5d1403a79a [rtsock] Enforce netmask/RTF_HOST consistency.
Traditionally we had 2 sources of information whether the
 added/delete route request targets network or a host route:
netmask (RTA_NETMASK) and RTF_HOST flag.

The former one is tricky: netmask can be empty or can explicitly
 specify the host netmask. Parsing netmask sockaddr requires per-family
 parsing and that's what rtsock code traditionally avoided. As a result,
 consistency was not enforced and it was possible to specify network with
 the RTF_HOST flag and vice versa.

Continue normalization efforts from D29826 and D29826 and ensure that
 RTF_HOST flag always reflects host/network data from netmask field.

Differential Revision: https://reviews.freebsd.org/D29958
MFC after:	2 days
2021-04-24 22:41:27 +00:00
Robert Watson
af14713d49 Support run-time configuration of the PIPE_MINDIRECT threshold.
PIPE_MINDIRECT determines at what (blocking) write size one-copy
optimizations are applied in pipe(2) I/O.  That threshold hasn't
been tuned since the 1990s when this code was originally
committed, and allowing run-time reconfiguration will make it
easier to assess whether contemporary microarchitectures would
prefer a different threshold.

(On our local RPi4 baords, the 8k default would ideally be at least
32k, but it's not clear how generalizable that observation is.)

MFC after:	3 weeks
Reviewers:	jrtc27, arichardson
Differential Revision: https://reviews.freebsd.org/D29819
2021-04-24 20:04:28 +01:00
Vladimir Kondratyev
e68d76c054 hkbd: Fix typo which disables keyboard input in kdb
Reported by:	Greg V
MFC after:	1 week
2021-04-24 22:01:14 +03:00
Edward Tomasz Napierala
77651151f3 linux: make ptrace(2) return EIO when trying to peek invalid address
Previously we've returned the error from native ptrace(2), ENOMEM.
This confused Linux strace(2).

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29925
2021-04-24 11:37:50 +01:00
Hans Petter Selasky
a9b66dbd91 Allow the tcp_lro_flush_all() function to be called when the control
structure is zeroed, by setting the VNET after checking the mbuf count
for zero. It appears there are some cases with early interrupts on some
network devices which still trigger page-faults on accessing a NULL "ifp"
pointer before the TCP LRO control structure has been initialized.
This basically preserves the old behaviour, prior to
9ca874cf74 .

No functional change.

Reported by:	rscheff@
Differential Revision:	https://reviews.freebsd.org/D29564
MFC after:	2 weeks
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-04-24 12:23:42 +02:00
Alexander Motin
b99419aee4 mpr/mps(4): Make device mapping some more robust.
Allow new enclosure to replace previously existing one if there is
no completely unused table entry, same as it is done for devices.

If we can not process DPM due to corruption -- wipe it and restart
from scratch.  Otherwise I don't see a way to recover persistence if
something go wrong and there is no BIOS to recover it for us.

Together this solves a problem that appeared when 9300-8i firmware
update to 16.00.10.00 somehow switched its mapping mode from Device
Persistence to Enclosure/Slot without wiping the DPM table.  It made
HBA completely unusable, since overflowed and conflicting mapping
table was unable to map any of enclosures and so devices.

Also while there make some enclosure mapping errors more informative.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2021-04-23 23:36:51 -04:00
Tai-hwa Liang
2acbe67787 sound(4): fixing panic for INVARIANTS kernel
3e7bae0821 turns the BUS_READ_IVAR() failure from a warning into a
KASSERT.  For certain PCI audio devices such like snd_csa(4) and
snd_emu10kx(4), the ac97_create() keeps the device handler generated
by device_add_child(pci_dev, "pcm"), which is not really a PCI device
handler.  This in turn causes the subsequent pci_get_subdevice()
inside ac97_initmixer() triggering a panic.

This patch tries to put a bandaid for the aforementioned pcm device
children such that they can use the correct PCI handler(from parent)
to avoid a KASSERT panic in the INVARIANTS kernel.

Tested with:	snd_csa(4), snd_ich(4), snd_emu10kx(4)
Reviewed by:	imp
MFC after:	1 month
2021-04-24 03:27:43 +00:00
Rick Macklem
4281bfec36 nfsd: fix session slot handling for failed callbacks
When the NFSv4.1/4.2 server does a callback to a client
on the back channel, it will use a session slot in the
back channel session. If the back channel has failed,
the callback will fail and, without this patch, the
session slot will not be released.
As more callbacks are attempted, all session slots
can become busy and then the nfsd thread gets stuck
waiting for a back channel session slot.

This patch frees the session slot upon callback
failure to avoid this problem.

Without this patch, the problem can be avoided by leaving
delegations disabled in the NFS server.

MFC after:	2 weeks
2021-04-23 15:24:47 -07:00
Navdeep Parhar
50f5d13eeb cxgbe(4): hw.cxgbe.panic_on_fatal_err can be changed any time.
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2021-04-23 12:17:54 -07:00
Mark Johnston
8e8f1cc9bb Re-enable network ioctls in capability mode
This reverts a portion of 274579831b ("capsicum: Limit socket
operations in capability mode") as at least rtsol and dhcpcd rely on
being able to configure network interfaces while in capability mode.

Reported by:	bapt, Greg V
Sponsored by:	The FreeBSD Foundation
2021-04-23 09:22:49 -04:00
Andrew Gallatin
3183d0b680 iflib: initialize LRO unconditionally
Changes to the LRO code have exposed a bug in iflib where devices
which are not capable of doing LRO are still calling
tcp_lro_flush_all(), even when they have not initialized the LRO
context. This used to be mostly harmless, but the LRO code now sets
the VNET based on the ifp in the lro context and will try to access it
through a NULL ifp resulting in a panic at boot.

To fix this, we unconditionally initializes LRO so that we have a
valid LRO context when calling tcp_lro_flush_all(). One alternative is
to check the device capabilities before calling tcp_lro_flush_all() or
adding a new state flag in the ctx. However, it seems unwise to add an
extra, mostly useless test for higher performance devices when we can
just initialize LRO for all devices.

Reviewed by: erj, hselasky, markj, olivier
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29928
2021-04-23 05:55:20 -04:00
Navdeep Parhar
5f00292fe3 cxgbe(4): Move the hw-specific parts of VXLAN setup to a separate function.
It can be called to (re)apply the settings in the driver softc to the
hardware.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2021-04-23 00:26:47 -07:00
Navdeep Parhar
b47b28e5b2 cxgbe(4): Add flag to reliably stop the driver from accessing hw stats.
There are two kinds of routines in the driver that read statistics from
the hardware: the cxgbe_* variants read the per-port MPS/MAC registers
and the vi_* variants read the per-VI registers.  They can be called
from the 1Hz callout or if_get_counter.  All stats collection now takes
place under the callout lock and there is a new flag to indicate that
these routines should not access any hardware register.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2021-04-22 17:45:52 -07:00
Navdeep Parhar
dc77e79296 cxgbe(4): Fix minor nit in the display of MPS TCAM entries.
MFC after:	3 days
2021-04-22 15:36:51 -07:00
Navdeep Parhar
8f1bc78ef7 cxgbe(4): make the logging helpers a little more robust.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2021-04-22 15:28:43 -07:00
Vladimir Kondratyev
55eb41bb1f hv_kbd: Fix build with EVDEV_SUPPORT kernel option disabled.
Reported by:	olivier
MFC with:	e4643aa4c4
2021-04-23 01:13:25 +03:00
Navdeep Parhar
557c4521bb cxgbe/t4_tom: Implement tod_pmtu_update.
tod_pmtu_update was added to the kernel in 01d74fe1ff.

Sponsored by:	Chelsio Communications
2021-04-22 14:48:57 -07:00
Warner Losh
2183bfcce4 newvers.sh: better regexp for the FreeBSD_version line
Tested with:		cirrus-ci https://cirrus-ci.com/build/6012323274948608
Reviewed by:		emaste@, rgrimes@
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29869
2021-04-22 11:45:12 -06:00
Warner Losh
9a5a5c1576 pvscsi: Advertise maxio of 256k.
While the PV SCSI SG list can handle 512k of SG entries, it can only do
so for I/O that's aligned to 4k or better. newfs_msdos does unaligned
I/O, so triggers too long for host errors in cam when a 512k I/O is
attempted. Prefer power of 2 256k to the absolute maximum 508k, though
that can be revisited should the latter show to give significant
performance improvement.

MFC After:		3 days
Tested by:		darius on discord (508k version of patch)
Sponsored by:		Netflix
2021-04-22 11:23:29 -06:00
Mateusz Guzik
7ea3223c78 zfs: use vn_seqc_read_notmodify for racing .. lookups
Catching an in-flight unlocked vnode is fine here.

Reported by;	pho
2021-04-22 13:18:39 +00:00
Ram Kishore Vegesna
fc620f9782 ocs_fc: Fix memory leak in ocs_scsi_io_alloc()
PR: 254690
Approved by: mav(mentor)
MFC after: 2 weeks
2021-04-22 17:48:37 +05:30
Hans Petter Selasky
47bc8fc9ae Add more USB quirks for Kingston devices.
PR:		253855
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-04-22 12:40:29 +02:00
Hans Petter Selasky
28af0c4814 Add more USB quirks for Garmin devices.
Sort the Garmin products while at it.

PR:		254664
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-04-22 12:35:07 +02:00
Hans Petter Selasky
d2c8714064 Remove USB device ID added by SVN r150701 in the CDC USB ethernet driver.
Since then, the FreeBSD USB stack has got proper USB RNDIS support.

PR:		254345
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-04-22 12:23:36 +02:00
Ka Ho Ng
7c707c7c25 __FreeBSD_version: update the references to the doc tree
Update the reference of which file to update in the doc tree when
bumping __FreeBSD_version.

This change is to catch up with commit f8fed61b80 in the doc repository.

MFC after:	3 days
Approved by:	lwhsu (mentor)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29920
2021-04-22 17:36:22 +08:00
Alexander V. Chernikov
4044af03a4 Fix vtnet TCP lro panic
Differential Revision: https://reviews.freebsd.org/D29900
Reviewed by:	hps, kp
2021-04-19 17:06:34 +01:00
Warner Losh
df456a1fcf newbus: style nit (align comments)
Sponsored by:		Netflix
2021-04-21 15:37:24 -06:00
Warner Losh
1eebd6158c newbus: Optimize/Simplify kobj_class_compile_common a little
"i" is not used in this loop at all. There's no need to initialize and
increment it.

Reviewed by:		markj@
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29898
2021-04-21 15:37:24 -06:00
John Baldwin
6a3a6fe34b riscv: Assert that SUM is not set in SSTATUS for exceptions.
Reviewed by:	mhorne
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29764
2021-04-21 13:57:20 -07:00
John Baldwin
753bcca440 riscv: Clear SUM in SSTATUS for supervisor mode exceptions.
Previously, a page fault taken during copyin/out and related functions
would run the entire fault handler while permitting direct access to
user addresses.  This could also leak across context switches (e.g. if
the page fault handler was preempted by an interrupt or slept for disk
I/O).

To fix, clear SUM in assembly after saving the original version of
SSTATUS in the supervisor mode trapframe.

Reviewed by:	mhorne, jrtc27
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29763
2021-04-21 13:57:04 -07:00
Navdeep Parhar
01d74fe1ff Path MTU discovery hooks for offloaded TCP connections.
Notify the TOE driver when when an ICMP type 3 code 4 (Fragmentation
needed and DF set) message is received for an offloaded connection.
This gives the driver an opportunity to lower the path MTU for the
connection and resume transmission, much like what the kernel does for
the connections that it handles.

Reviewed by:	glebius@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D29755
2021-04-21 13:00:16 -07:00
Mark Johnston
652908599b Add required checks for unmapped mbufs in ipdivert and ipfw
Also add an M_ASSERTMAPPED() macro to verify that all mbufs in the chain
are mapped.  Use it in ipfw_nat, which operates on a chain returned by
m_megapullup().

PR:		255164
Reviewed by:	ae, gallatin
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29838
2021-04-21 15:47:05 -04:00
Mateusz Guzik
9c651561a2 zfs: damage control racing .. lookups in face of mkdir/rmdir
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29769
2021-04-21 15:25:32 +00:00
Konstantin Belousov
54f98c4dbf vn_open_vnode(): handle error when fp == NULL
If VOP_ADD_WRITECOUNT() or adv locking failed, so VOP_CLOSE() needs to
be called, we cannot use fp fo_close() when there is no fp.  This occurs
when e.g. kernel code directly calls vn_open() instead of the open(2)
syscall.

In this case, VOP_CLOSE() can be called directly, after possible lock
upgrade.

Reported by:	nvass@gmx.com
PR:	255119
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29830
2021-04-21 18:06:51 +03:00
Ka Ho Ng
a3a02acde1 param.h: bump __FreeBSD_version for commit 4ce1ba6523
Commit 4ce1ba6523 changed the sndstat(4) ioctls nvlist schema and
definitions.

Approved by:	philip (mentor)
2021-04-21 16:21:08 +08:00
Ka Ho Ng
4ce1ba6523 sndstat: nvlist schema and API definition changes
- SNDSTAT_LABEL_* are renamed to SNDST_DSPS_*, and SNDSTAT_LABEL_DSPS
  becomes SNDST_DSPS.
- Centralize channel number/rate/formats into a single nvlist
  The above nvlist is named "info_play" and "info_rec"
- Expose only encoding format in pfmts/rfmts. Userland has no direct
  access to AFMT_ENCODING/CHANNEL/EXTCHANNEL macros, thus it serves no
  meaning to expose too much information through this pair of labels.
  However pminrate/rminrate, pmaxrate/rmaxrate, pfmts/rfmts are
  deprecated and will be removed in future.

This commit keeps ioctls ABI compatibility with __FreeBSD_version
1400006 for now. In future the compat ABI with 1400006 will be removed
once audio/virtual_oss is rebuilt.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	hselasky
Approved by:	philip (mentor)
Differential Revision:	https://reviews.freebsd.org/D29770
2021-04-21 16:19:15 +08:00
Xin LI
438b553207 arcmsr(4): Fix SCSI command timeout on ARC-1886.
Many thanks to Areca for continuing to support FreeBSD.

Submitted by:	黃清隆 <ching2048 areca com tw>
MFC after:	2 weeks
2021-04-21 01:03:54 -07:00
Alexander V. Chernikov
33cb3cb2e3 Fix rib generation count for fib algo.
Currently, PCB caching mechanism relies on the rib generation
 counter (rnh_gen) to invalidate cached nhops/LLE entries.

With certain fib algorithms, it is now possible that the
 datapath lookup state applies RIB changes with some delay.
In that scenario, PCB cache will invalidate on the RIB change,
 but the new lookup may result in the same nexthop being returned.
When fib algo finally gets in sync with the RIB changes, PCB cache
 will not receive any notification and will end up caching the stale data.

To fix this, introduce additional counter, rnh_gen_rib, which is used
 only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time when fib algo
 synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.

Differential Revision: https://reviews.freebsd.org/D29812
Reviewed by:	donner
MFC after:	2 weeks
2021-04-20 22:02:41 +00:00
Alexander V. Chernikov
b31fbebeb3 Relax rtsock message restrictions.
Address multiple issues with strict rtsock message validation.

D28668 "normalisation" approach was based on the assumption that
 we always have at least "standard" sockaddr len.
It turned out to be false - certain older applications like quagga
 or routed abuse sin[6]_len field and set it to the offset to the
 first fully-zero bit in the mask. It is impossible to normalise
 such sockaddrs without reallocation.

With that in mind, change the approach to use a distinct memory
 buffer for the altered sockaddrs. This allows supporting the older
 software while maintaining the guarantee on the "standard" sockaddrs.

PR:	255273,255089
Differential Revision:	https://reviews.freebsd.org/D29826
MFC after:	3 days
2021-04-20 21:34:19 +00:00
Sofian Brabez
561d34d705 iwnstats: fix build with clang and allow install under /usr/local/sbin
iwnstats was not compiling because of some issues raised by the clang
compiler due to -Werror. As a tool it is not connected to world build.

Add missing field "barker_mrc" initialization in struct
iwn_sensitivity_limits for -Wmissing-field-initializers, remove unused
pointer *is on iwn_stats_*_print functions and unused variables for
-Wunused-parameter and -Wunused-variable.

The value for field "barker_mrc" of struct iwn2030_sensitivity_limits
was obtained from linux 3.2 wireless/iwlwifi driver code (iwl-2000.c:115
.barker_corr_th_min_mrc = 390).

Also set BINDIR in Makefile to make it possible to install under
/usr/local/sbin/iwnstats as it require super user.

Reviewed by:	adrian
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29800
2021-04-20 18:07:56 +00:00
Gleb Smirnoff
d554522f6e tcp_hostcache: use SMR for lookups, mutex(9) for updates.
In certain cases, e.g. a SYN-flood from a limited set of hosts,
the TCP hostcache becomes the main contention point. To solve
that, this change introduces lockless lookups on the hostcache.

The cache remains a hash, however buckets are now CK_SLIST. For
updates a bucket mutex is obtained, for read an SMR section is
entered.

Reviewed by:	markj, rscheff
Differential revision:	https://reviews.freebsd.org/D29729
2021-04-20 10:02:20 -07:00
Gleb Smirnoff
1db08fbe3f tcp_input: always request read-locking of PCB for any pure SYN segment.
This is further rework of 08d9c92027.  Now we carry the knowledge of
lock type all the way through tcp_input() and also into tcp_twcheck().
Ideally the rlocking for pure SYNs should propagate all the way into
the alternative TCP stacks, but not yet today.

This should close a race when socket is bind(2)-ed but not yet
listen(2)-ed and a SYN-packet arrives racing with listen(2), discovered
recently by pho@.
2021-04-20 10:02:20 -07:00
Gleb Smirnoff
7b5053ce22 tcp_input: remove comments and assertions about tcpbinfo locking
They aren't valid since d40c0d47cd.
2021-04-20 10:02:20 -07:00
Richard Scheffenegger
a649f1f6fd tcp: Deal with DSACKs, and adjust rescue hole on success.
When a rescue retransmission is successful, rather than
inserting new holes to the left of it, adjust the old
rescue entry to cover the missed sequence space.

Also, as snd_fack may be stale by that point, pull it forward
in order to never create a hole left of snd_una/th_ack.

Finally, with DSACKs, tcp_sack_doack() may be called
with new full ACKs but a DSACK block. Account for this
eventuality properly to keep sacked_bytes >= 0.

MFC after: 3 days
Reviewed By: kbowling, tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29835
2021-04-20 14:54:28 +02:00
Hans Petter Selasky
9ca874cf74 Add TCP LRO support for VLAN and VxLAN.
This change makes the TCP LRO code more generic and flexible with regards
to supporting multiple different TCP encapsulation protocols and in general
lays the ground for broader TCP LRO support. The main job of the TCP LRO code is
to merge TCP packets for the same flow, to reduce the number of calls to upper
layers. This reduces CPU and increases performance, due to being able to send
larger TSO offloaded data chunks at a time. Basically the TCP LRO makes it
possible to avoid per-packet interaction by the host CPU.

Because the current TCP LRO code was tightly bound and optimized for TCP/IP
over ethernet only, several larger changes were needed. Also a minor bug was
fixed in the flushing mechanism for inactive entries, where the expire time,
"le->mtime" was not always properly set.

To avoid having to re-run time consuming regression tests for every change,
it was chosen to squash the following list of changes into a single commit:
- Refactor parsing of all address information into the "lro_parser" structure.
  This easily allows to reuse parsing code for inner headers.
- Speedup header data comparison. Don't compare field by field, but
  instead use an unsigned long array, where the fields get packed.
- Refactor the IPv4/TCP/UDP checksum computations, so that they may be computed
  recursivly, only applying deltas as the result of updating payload data.
- Make smaller inline functions doing one operation at a time instead of
  big functions having repeated code.
- Refactor the TCP ACK compression code to only execute once
  per TCP LRO flush. This gives a minor performance improvement and
  keeps the code simple.
- Use sbintime() for all time-keeping. This change also fixes flushing
  of inactive entries.
- Try to shrink the size of the LRO entry, because it is frequently zeroed.
- Removed unused TCP LRO macros.
- Cleanup unused TCP LRO statistics counters while at it.
- Try to use __predict_true() and predict_false() to optimise CPU branch
  predictions.

Bump the __FreeBSD_version due to changing the "lro_ctrl" structure.

Tested by:	Netflix
Reviewed by:	rrs (transport)
Differential Revision:	https://reviews.freebsd.org/D29564
MFC after:	2 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-04-20 13:36:22 +02:00
Kristof Provost
586aab9e0a pf: Refactor state killing
Extract the state killing code from pfioctl() and rephrase the filtering
conditions for readability.

No functional change intended.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29795
2021-04-20 09:30:23 +02:00
Rick Macklem
78ffcb86d9 nfscommon: fix function name in comment
MFC after:	2 weeks
2021-04-19 20:09:46 -07:00
Greg V
32231805fb linker_set: fix globl/weak symbol redefinitions to work on clang 12
In clang 12.0.0.rc2, going from weak to global is now a hard error:

```
/usr/src/stand/libsa/amd64/_setjmp.S:67:25: error: _longjmp changed binding to STB_GLOBAL
.text; .p2align 4,0x90; .globl _longjmp; .type _longjmp,@function; _longjmp:; .cfi_startproc
```

And the other way is a warning, but we have -Werror:

```
error: __start_set_Xcommand_set changed binding to STB_WEAK [-Werror,-Winline-asm]
error: __stop_set_Xcommand_set changed binding to STB_WEAK [-Werror,-Winline-asm]
```

ref: https://reviews.llvm.org/D90108

Reviewed By:	arichardson
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D29159
2021-04-20 02:56:24 +01:00
Rick Macklem
5a89498d19 nfsd: fix stripe size reply for the File Layout pNFS server
At a recent testing event I found out that I had misinterpreted
RFC5661 where it describes the stripe size in the File Layout's
nfl_util field. This patch fixes the pNFS File Layout server
so that it returns the correct value to the NFSv4.1/4.2 pNFS
enabled client.

This affects almost no one, since pNFS server configurations
are rare and the extant pNFS aware NFS clients seemed to
function correctly despite the erroneous stripe size.
It *might* be needed for correct behaviour if a recent
Linux client mounts a FreeBSD pNFS server configuration
that is using File Layout (non-mirrored configuration).

MFC after:	2 weeks
2021-04-19 17:54:54 -07:00
Gleb Smirnoff
faa9ad8a90 Fix off-by-one error in KASSERT from 02f26e98c7. 2021-04-19 17:20:19 -07:00
Kevin Bowling
59690eab57 e1000: Add support for [Tiger, Alder, Meteor] Lake
Add support for current and future client platform PCI IDs. These are
all I219 variants and have no known driver changes versus previous
generation client platform I219 variants.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29801
2021-04-19 14:32:59 -07:00
Alexander V. Chernikov
758c9d54d4 Improve error reporting in rtsock.c
MFC after:	3 days
2021-04-19 20:36:41 +00:00
Kevin Bowling
4b38eed76d e1000: Correct promisc multicast filter handling
There are a number of issues in the e1000 multicast filter handling
that have been present for a long time. Take the updated approach from
ixgbe(4) which does not have the issues.

The issues are outlined in the PR, in particular this solves crossing
over and under the hardware's filter limit, not programming the
hardware filter when we are above its limit, disabling SBP (show bad
packets) when the tunable is enabled and exiting promiscuous mode, and
an off-by-one error in the em_copy_maddr function.

PR:		140647
Reported by:	jtl
Reviewed by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29789
2021-04-19 12:49:55 -07:00
Kevin Bowling
deecaa1445 ixgbe: Clean up unneeded set in ixgbe_if_multi_set
We don't need to set the bits here since the if/else if/else statements
fully cover setting these bit pairs.

Reported by:	markj
Reviewed by:	markj, erj
Approved by:	#intel_networking
MFC aftter:	1 week
Differential Revision:	https://reviews.freebsd.org/D29827
2021-04-19 12:37:30 -07:00
Konstantin Belousov
fad437ba61 linuxkpi: reduce number of stray mm_struct allocations
Only allocate struct_mm after we checked that other threads do not carry
useful mm_struct.  If they don't, drop process lock, allocate, and recheck.

Note that for M_NOWAIT allocations we could avoid dropping process lock,
but I do not think that this increased complexity is useful.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:08 +03:00
Konstantin Belousov
165ba13fb8 linuxkpi: guarantee allocations of task and mm for interrupt threads
Create and use zones for task and mm.  Reserve items in zones based on the
estimation of the max number of interrupts in the system.  Use M_USE_RESERVE
to allow to take reserved items when allocation occurs from the interrupt
thread context.

Of course, this would only work first time we allocate the task for
interrupt thread. If interrupt is deallocated and allocated anew,
creating a new thread, it might be that zone is depleted. It still
should be good enough for practical uses.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:08 +03:00
Konstantin Belousov
4ce1f6162e linuxkpi: some style, wrap too long lines
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:07 +03:00
Konstantin Belousov
ecfbddf0cd sysctl vm.objects: report backing object and swap use
For anonymous objects, provide a handle kvo_me naming the object,
and report the handle of the backing object.  This allows userspace
to deconstruct the shadow chain.  Right now the handle is the address
of the object in KVA, but this is not guaranteed.

For the same anonymous objects, report the swap space used for actually
swapped out pages, in kvo_swapped field.  I do not believe that it is
useful to report full 64bit counter there, so only uint32_t value is
returned, clamped to the max.

For kinfo_vmentry, report anonymous object handle backing the entry,
so that the shadow chain for the specific mapping can be deconstructed.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29771
2021-04-19 21:32:01 +03:00
Konstantin Belousov
4342ba184c sysctl_handle_string: do not malloc when SYSCTL_IN cannot fault
In particular, this avoids malloc(9) calls when from early tunable handling,
with no working malloc yet.

Reported and tested by:	mav
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-04-19 21:32:01 +03:00
Konstantin Belousov
578c26f31c linkat(2): check NIRES_EMPTYPATH on the first fd arg
Reported by:	arichardson
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29834
2021-04-19 21:32:01 +03:00
Kristof Provost
42ec75f83a pf: Optionally attempt to preserve rule counter values across ruleset updates
Usually rule counters are reset to zero on every update of the ruleset.
With keepcounters set pf will attempt to find matching rules between old
and new rulesets and preserve the rule counters.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29780
2021-04-19 14:31:47 +02:00
Kristof Provost
8bb0f1b87b pf: Remove PFRULE_REFS from userspace
PFRULE_REFS should never be used by userspace, so hide it behind #ifdef
_KERNEL.

MFC after:	never
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29779
2021-04-19 14:31:47 +02:00
Kristof Provost
4f1f67e888 pf: PFRULE_REFS should not be user-visible
Split the PFRULE_REFS flag from the rule_flag field. PFRULE_REFS is a
kernel-internal flag and should not be exposed to or read from
userspace.

MFC after:	4 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D29778
2021-04-19 14:31:47 +02:00
Jonah Caplan
0e4025bffa bridgestp: validate timer values in config BPDU
IEEE Std 802.1D-2004 Section 17.14 defines permitted ranges for timers.
Incoming BPDU messages should be checked against the permitted ranges.
The rest of 17.14 appears to be enforced already.

PR:		254924
Reviewed by:	kp, donner
Differential Revision:	https://reviews.freebsd.org/D29782
2021-04-19 12:09:18 +02:00
Edward Tomasz Napierala
156da725d3 linux(4): bump osrelease to 4.4.0.
This is required for the current Arch Linux binaries to work.

PR:		254112
Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29218
2021-04-19 11:37:58 +01:00
Ka Ho Ng
6fe60f1d5c AMD-vi: Fortify IVHD device_identify process
- Use malloc(9) to allocate ivhd_hdrs list. The previous assumption
  that there are at most 10 IVHDs in a system is not true. A counter
  example would be a system with 4 IOMMUs, and each IOMMU is related
  to IVHDs type 10h, 11h and 40h in the ACPI IVRS table.
- Always scan through the whole ivhd_hdrs list to find IVHDs that has
  the same DeviceId but less prioritized IVHD type.

Sponsored by:	The FreeBSD Foundation
MFC with:	74ada297e8
Reviewed by:	grehan
Approved by:	lwhsu (mentor)
Differential Revision:	https://reviews.freebsd.org/D29525
2021-04-19 16:08:13 +08:00
Adrian Chadd
61c83c4e8b [ath] Add ath_hal_getnav and ath_hal_setnav so the driver layer
can check the NAV as appropriate.
2021-04-18 22:59:28 -07:00
Adrian Chadd
bed90bf8ed [ath_hal] Add get/set NAV functions
The NAV (network allocation vector) register reflects the current MAC
tracking of NAV - when it will stay quiet before transmitting.

Other devices transmit their frame durations in their 802.11 PHY headers
and all devices that hear a frame - even if it's one in an encoding
they don't understand - will understand the low bitrate PHY header that
includes the frame duration.  So, they'll set NAV to this value so
they'll stay quiet until the transmit completes.

Anyway, sometimes the PHY NAV header is garbled and sometimes, notably
older broadcom devices, will fake a long NAV so they can get "cleaner" air
for local calibration.  When this happens, the hardware will stay quiet
for quite some time and this can lead to missed/stuck beacons, or
(for Very Large Values) a MAC hang.

This code just adds the ability to get/set the NAV; the driver will
need to take care of using it during transmit hangs and beacon misses
to see if it's due to a trash looking NAV.
2021-04-18 22:52:31 -07:00
Adrian Chadd
dead34f822 [ath_hal] ar9300: save TSF across full chip reset
This saves the TSF across a a full reset.  The TSF is otherwise cleared
and subsequent beaconing stops until the TSF catches up to nexttbtt.
2021-04-18 22:49:54 -07:00
Richard Scheffenegger
b87cf2bc84 tcp: keep SACK scoreboard sorted when doing rescue retransmission
Reviewed By: tuexen, kbowling, #transport
MFC after: 3 days
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29825
2021-04-18 23:11:10 +02:00
Warner Losh
571a1a64b1 Minor style tidy: if( -> if (
Fix a few 'if(' to be 'if (' in a few places, per style(9) and
overwhelming usage in the rest of the kernel / tree.

MFC After:		3 days
Sponsored by:		Netflix
2021-04-18 11:19:15 -06:00
Warner Losh
f1f9870668 Minor style cleanup
We prefer 'while (0)' to 'while(0)' according to grep and stlye(9)'s
space after keyword rule. Remove a few stragglers of the latter.
Many of these usages were inconsistent within the file.

MFC After:		3 days
Sponsored by:		Netflix
2021-04-18 11:14:17 -06:00
Justin Hibbits
6525c2d4de mips/octeon SDK: Fix __cvmx_cmd_queue_lock asm for clang 11
The 'ticket' and 'my_ticket' arguments are both read and written within
the same asm block.  Clang is stricter with the constraints than gcc4
was, so accepts the '=r' at face value and will happily overwrite
registers that "should" be preserved.

Mark these operands to not clobber other operands, so they get their own
registers.

This fixes a panic on bringing up the octe interfaces.
2021-04-18 12:05:55 -05:00
Alexander V. Chernikov
0abb6ff590 fib algo: do not reallocate datapath index for datapath ptr update.
Fib algo uses a per-family array indexed by the fibnum to store
 lookup function pointers and per-fib data.

Each algorithm rebuild currently requires re-allocating this array
 to support atomic change of two pointers.

As in reality most of the changes actually involve changing only
 data pointer, add a shortcut performing in-flight pointer update.

MFC after:	2 weeks
2021-04-18 16:12:13 +01:00
Alexander V. Chernikov
e2f79d9e51 Fib algo: extend KPI by allowing algo to set datapath pointers.
Some algorithms may require updating datapath and control plane
 algo pointers after the (batched) updates.

Export fib_set_datapath_ptr() to allow setting the new datapath
 function or data pointer from the algo.
Add fib_set_algo_ptr() to allow updating algo control plane
 pointer from the algo.
Add fib_epoch_call() epoch(9) wrapper to simplify freeing old
 datapath state.

Reviewed by:		zec
Differential Revision: https://reviews.freebsd.org/D29799
MFC after:		1 week
2021-04-18 16:12:12 +01:00
Michael Tuexen
9e644c2300 tcp: add support for TCP over UDP
Adding support for TCP over UDP allows communication with
TCP stacks which can be implemented in userspace without
requiring special priviledges or specific support by the OS.
This is joint work with rrs.

Reviewed by:		rrs
Sponsored by:		Netflix, Inc.
MFC after:		1 week
Differential Revision:	https://reviews.freebsd.org/D29469
2021-04-18 16:16:42 +02:00
Vincenzo Maffione
f4a54f4333 netmap: use safer defaults for hwbuf_len
We must make sure that incoming packets will never overflow the netmap
buffers, even when the user is using the offset feature. In the typical
scenario, the netmap buffer is 2KiB and, with an MTU of 1500, there are
~500 bytes available for user offsets.

Unfortunately, some NICs accept incoming packets even when they are
larger then the MTU. This means that the only way to stop DMA from
overflowing the netmap buffers, when offsets are allowed, is to choose
a hardware buffer length which is smaller than the netmap buffer
length. For most NICs and for 2KiB netmap buffers, this means 1024
bytes, which is unconveniently small.

The current code will select the small hardware buf size even when
offsets are not     in use. The main purpose of this change is to
fix this bug by returning to the normal behavior for the no-offsets
case.

At the same time, the patch pushes the handling of the offset case
to the lower level driver code, so that it can be made NIC-specific
(in future patches).
2021-04-18 13:39:15 +00:00
Warner Losh
09da6ffa55 newbus: style nit: use while<space>(0)
Sponsored by:		Netflix
2021-04-17 23:46:18 -06:00
Warner Losh
e81b14633b newbus: Minor update fix.
driver_t was supposed to just be a quick hack for 4.x
compatibility. However, it's been documented now as the preferred API
rather than the replacement kobj_class_t. Drop the note about 4.x since
it's clear we're a bit late to retiring its use through the tree with
almost 1500 references to driver_t.

Sponsored by:		Netflix
2021-04-17 13:56:28 -06:00
Richard Scheffenegger
2e97826052 rack: Fix ECN on finalizing session.
Maintain code similarity between RACK and base stack
for ECN. This may not strictly be necessary, depending
when a state transition to FIN_WAIT_1 is done in RACK
after a shutdown() or close() syscall.

MFC after: 3 days
Reviewed By: tuexen, #transport
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D29658
2021-04-17 20:16:42 +02:00
Alexander Motin
0f29396e49 mpt(4): Remove incorrect S/G segments limits.
First, two of those four checks are unreachable.
Second, I don't believe there should be ">=" instead of ">".
Third, bus_dma(9) already returns the same EFBIG if ">".

This fixes false I/O errors in worst S/G cases with maxphys >= 2MB.

MFC after:	1 week
2021-04-17 10:49:44 -04:00
Cy Schubert
b51f459a20 wpa: Import wpa_supplicant/hostapd commit f91680c15
This is the April update to vendor/wpa committed upstream
2021/04/07.

This is MFV efec822389.

Suggested by:		philip
Reviewed by:		philip
MFC after:		2 months
Differential Revision:	https://reviews.freebsd.org/D29744
2021-04-17 07:21:12 -07:00
Vincenzo Maffione
13c4641188 netmap: make sure rings are disabled during resets
Explicitly disable ring synchronization before calling
callbacks that may result in a hardware reset.

Before this patch we relied on capturing the down/up events which,
however, may not be issued by all drivers.
2021-04-17 14:02:47 +00:00
Richard Scheffenegger
d1de2b05a0 tcp: Rename rfc6675_pipe to sack.revised, and enable by default
As full support of RFC6675 is in place, deprecating
net.inet.tcp.rfc6675_pipe and enabling by default
net.inet.tcp.sack.revised.

Reviewed By: #transport, kbowling, rrs
Sponsored by: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28702
2021-04-17 14:59:45 +02:00
Kevin Bowling
21afed4b1d ixgbe: Clarify index name in ixgbe_mc_filter_apply
"It looks like it would be less confusing to rename 'count' to
something like 'idx', since that's what it's used for in this
function."

Reviewed by:	erj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29798
2021-04-16 18:20:41 -07:00
Gleb Smirnoff
86046cf55f tcp_respond(): fix assertion, should have been done in 08d9c92027. 2021-04-16 15:39:51 -07:00
Warner Losh
ef29dd1fec newbus: style nit
Sponsored by:		Netflix
2021-04-16 15:14:29 -06:00
Alfredo Dal'Ava Junior
b8bc6b7954 opal_console: fix serial console output corruption on powerpc64
Adds OPAL_CONSOLE_WRITE error handling and implements a call to
OPAL_CONSOLE_WRITE_BUFFER_SPACE to verify if there's enough space
before writing to console.

This fixes serial port output getting corrupted on fast writes, like
on "dmesg" output.

Tested on Raptor Blackbird running powerpc64 BE kernel

Reviewed by:	luporl
Sponsored by:	Eldorado Reserach Institute (eldorado.org.br)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29063
2021-04-16 20:10:09 -03:00
Alexander Motin
3e34783420 pms(4): Limit maximum I/O size to 256KB instead of 1MB.
There is a weird limit of AGTIAPI_MAX_DMA_SEGS (128) S/G segments per
I/O since the initial driver import.  I don't know why it was added,
can only guess some hardware limitation, but in worst case it means
maximum I/O size of 508KB.  Respect it to be safe, rounding to 256KB.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2021-04-16 15:50:34 -04:00
Alexander Motin
8434a65ce4 pms(4): Do not return CAM_REQ_CMP on errors.
It is a direct request for data corruptions, one report of which we
have received.  I am very surprised that only one.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2021-04-16 15:50:33 -04:00
Edward Tomasz Napierala
e47823b831 linux: support AT_EMPTY_PATH flag in fchownat(2)
This fixes rsyslog package installation scripts in Bionic.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29108
2021-04-16 16:27:20 +01:00
Edward Tomasz Napierala
4b45c2bb83 linux: make fstatat(2) handle AT_EMPTY_PATH
Without it, Qt5 apps from Focal fail to start, being unable to load
their plugins.  It's also necessary for glibc 2.33, as found in recent
Arch snapshots.

PR:		254112
Reviewed By:	kib
Sponsored by:	The FreeBSD Foundation, EPSRC
Differential Revision:	https://reviews.freebsd.org/D28192
2021-04-16 08:56:19 +01:00
Andrey V. Elsukov
9bacbf1ae2 ipfw: do not use sleepable malloc in callout context.
Use M_NOWAIT flag when hash growing is called from callout.

PR:             255041
Reviewed by:	kevans
MFC after:	10 days
Differential Revision: https://reviews.freebsd.org/D29772
2021-04-16 10:22:44 +03:00
Kyle Evans
77c89fa6f5 modules: remove stale if_wg reference
This variable isn't being used anywhere, remove it.
2021-04-15 19:59:13 -05:00
Tai-hwa Liang
bdf316e892 fwip(4): fixing kernel panic when receiving unicast packet
Wrapping fwip_unicast_input() with NET_EPOCH_{ENTER,EXIT} to avoid a
NET_EPOCH_ASSERT() in netisr_dispatch().

Reviewed by:	hselasky
MFC after:	2 weeks
2021-04-15 22:56:07 +00:00
Gleb Smirnoff
cb8d7c44d6 tcp_syncache: add net.inet.tcp.syncache.see_other sysctl
A security feature from c06f087ccb appeared to be a huge bottleneck
under SYN flood. To mitigate that add a sysctl that would make
syncache(4) globally visible, ignoring UID/GID, jail(2) and mac(4)
checks. When turned on, we won't need to call crhold() on the listening
socket credential for every incoming SYN packet.

Reviewed by:	bz
2021-04-15 15:26:48 -07:00
Rick Macklem
34256484af Revert "nfsd: cut the Linux NFSv4.1/4.2 some slack w.r.t. RFC5661"
This reverts commit 9edaceca81.

It turns out that the Linux client intentionally does an NFSv4.1
RPC with only a Sequence operation in it and with "seqid + 1"
for the slot.  This is used to re-synchronize the slot's seqid
and the client expects the NFS4ERR_SEQ_MISORDERED error reply.

As such, revert the patch, so that the server remains RFC5661
compliant.
2021-04-15 14:08:40 -07:00
Alexander V. Chernikov
6b8ef0d428 Add batched update support for the fib algo.
Initial fib algo implementation was build on a very simple set of
 principles w.r.t updates:

1) algorithm is ether able to apply the change synchronously (DIR24-8)
 or requires full rebuild (bsearch, lradix).
2) framework falls back to rebuild on every error (memory allocation,
 nhg limit, other internal algo errors, etc).

This changes brings the new "intermediate" concept - batched updates.
Algotirhm can indicate that the particular update has to be handled in
 batched fashion (FLM_BATCH).
The framework will write this update and other updates to the temporary
 buffer instead of pushing them to the algo callback.
Depending on the update rate, the framework will batch 50..1024 ms of updates
 and submit them to a different algo callback.

This functionality is handy for the slow-to-rebuild algorithms like DXR.

Differential Revision:	https://reviews.freebsd.org/D29588
Reviewed by:	zec
MFC after:	2 weeks
2021-04-14 23:54:11 +01:00
Kevin Bowling
68a46f11ea e1000: Restore VF interface random MAC
Restore 525e07418c after the iflib conversion of igb(4). This
reenables random MAC address generation when attaching to a VF with a
zeroed MAC.

PR:		253535
Reported by:	Balaev PA <mail@void.so>
Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29785
2021-04-15 11:45:02 -07:00
Kevin Bowling
bb1b375fa7 e1000: fix em_mac_min and 82547 packet buffer
The boundary differentiating "lem" vs "em" class devices was wrong
after the iflib conversion of lem(4).

The Packet Buffer size for 82547 class chips was not set correctly
after the iflib conversion of lem(4).

These changes restore functionality on an 82547 for the submitter.

PR:		236119
Reported by:	Jeff Gibbons <jgibbons@protogate.com>
Reviewed by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29766
2021-04-15 10:19:30 -07:00
Kevin Bowling
548d8a131d e1000: disable hw.em.sbp debug setting
This is a debugging tunable that shouldn't have retained this setting
after the initial iflib conversion of the driver

PR:		248934
Reported by:	Franco Fichtner <franco@opnsense.org>
Reviewed by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D29768
2021-04-15 09:48:41 -07:00
Edward Tomasz Napierala
1663120ae4 linux: implement O_PATH
Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29773
2021-04-15 15:30:59 +01:00
Vladimir Kondratyev
788a171c77 ng_ubt: Block attachment of uninitialized Intel Wireless 7265
As this controller requires firmware patch downloading to operate.
"Intel Wireless 7265" support in iwmbtfw(8) is yet to be done.

Tested by:	arrowd et al
PR:		228787
MFC after:	2 weeks
2021-04-15 17:26:32 +03:00
Vladimir Kondratyev
d605d72948 ng_ubt: Use DEFINE_CLASS_1 macro for kobj inheritance.
MFC after:	2 weeks
2021-04-15 17:25:50 +03:00
Vladimir Kondratyev
52489f2a55 ng_ubt: Do not clear stall before receiving of HCI command response.
Unconditional execution of "clear feature" request at SETUP stage was
workaround for probe failures on ng_ubt.ko re-kldloading which is
unnecessary now.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D29775
2021-04-15 17:25:00 +03:00
Edward Tomasz Napierala
1b11173c00 linux: extend the LINUX_O_ constants to make room for O_PATH
No functional changes.

Sponsored By:	EPSRC
2021-04-15 15:04:44 +01:00
Konstantin Belousov
e3d6759585 b_vflags update requries bufobj lock
The trunc_dependencies() issue was reported by	Alexander Lochmann
<alexander.lochmann@tu-dortmund.de>, who found the problem by performing
lock analysis using LockDoc, see https://doi.org/10.1145/3302424.3303948.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-04-15 15:47:42 +03:00
Konstantin Belousov
bbf7a4e878 O_PATH: allow vnode kevent filter on such files
if VREAD access is checked as allowed during open

Requested by:	wulf
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29323
2021-04-15 12:49:18 +03:00
Konstantin Belousov
f9b923af34 O_PATH: Allow to open symlink
When O_NOFOLLOW is specified, namei() returns the symlink itself.  In
this case, open(O_PATH) should be allowed, to denote the location of symlink
itself.

Prevent O_EXEC in this case, execve(2) code is not ready to try to execute
symlinks.

Reported by:	wulf
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29323
2021-04-15 12:49:09 +03:00
Konstantin Belousov
a5970a529c Make files opened with O_PATH to not block non-forced unmount
by only keeping hold count on the vnode, instead of the use count.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29323
2021-04-15 12:48:27 +03:00
Konstantin Belousov
8d9ed174f3 open(2): Implement O_PATH
Reviewed by:	markj
Tested by:	pho
Discussed with:	walker.aj325_gmail.com, wulf
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29323
2021-04-15 12:48:24 +03:00
Konstantin Belousov
509124b626 Add AT_EMPTY_PATH for several *at(2) syscalls
It is currently allowed to fchownat(2), fchmodat(2), fchflagsat(2),
utimensat(2), fstatat(2), and linkat(2).

For linkat(2), PRIV_VFS_FHOPEN privilege is required to exercise the flag.
It allows to link any open file.

Requested by:	trasz
Tested by:	pho, trasz
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D29111
2021-04-15 12:48:11 +03:00