139234 Commits

Author SHA1 Message Date
Bartlomiej Grzesik
9dfc8606eb ipsec: Add support for PMTUD for IPv6 tunnels
Discard the packet and send ICMPv6 Packet Too Big to the sender when we try to
encapsulate and forward a packet whose total length exceeds the PMTU.
Logic is based on the IPv4 implementation.
Common code was moved to a separate function.

Differential revision:	https://reviews.freebsd.org/D31771
Obtained from:		Semihalf
Sponsored by:		Stormshield
2021-09-24 10:27:21 +02:00
Bartlomiej Grzesik
b4220bf387 ipsec: If no PMTU in hostcache assume it's equal to link's MTU
If we fail to find the PMTU in hostcache, we assume it's equal
to link's MTU.

This patch prevents packets larger than the link's MTU from being dropped
silently if there is no PMTU in hostcache.

Differential revision:	https://reviews.freebsd.org/D31770
Obtained from:		Semihalf
Sponsored by:		Stormshield
2021-09-24 10:25:53 +02:00
Bartlomiej Grzesik
4f3376951d ipsec: Add PMTUD support for IPsec IPv4 over IPv6 tunnel
Add support for checking PMTU for IPv4 packets encapsulated in IPv6 tunnels.

Differential revision:	https://reviews.freebsd.org/D31769
Sponsored by:		Stormshield
Obtained from:		Semihalf
2021-09-24 10:17:11 +02:00
Jason A. Harmening
f9e28f9003 unionfs: lock newly-created vnodes before calling insmntque()
This fixes an insta-panic when attempting to use unionfs with
DEBUG_VFS_LOCKS.  Note that unionfs still has a long way to
go before it's generally stable or usable.

Reviewed by:	kib (prior version), markj
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D31917
2021-09-23 19:20:30 -07:00
Nathaniel Wesley Filardo
0321a7990b kqueue: Add EV_KEEPUDATA flag
When this flag is set, operations that update an existing kevent will
not change the udata field.  This can be used to NOTE_TRIGGER or
EV_{EN,DIS}ABLE events without overwriting the stashed pointer.

Reviewed by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
Obtained from:	CheriBSD
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D30286
2021-09-23 17:31:39 -07:00
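The effect of the flag described in the commit above can be modeled in a few lines of plain C. This is a toy userspace model, not the kernel code: `struct toy_knote`, `toy_update()` and the `TOY_EV_*` values are hypothetical names and numbers chosen for illustration, and they do not match `<sys/event.h>`.

```c
#include <stddef.h>

/* Hypothetical flag bits for this toy model; not the <sys/event.h> values. */
#define TOY_EV_ENABLE    0x1u
#define TOY_EV_DISABLE   0x2u
#define TOY_EV_KEEPUDATA 0x4u

/* Toy stand-in for a registered event: an enabled bit and the user's pointer. */
struct toy_knote {
	int   enabled;
	void *udata;
};

/*
 * Apply an update to an already registered event.  Without
 * TOY_EV_KEEPUDATA the stored udata is overwritten with the caller's
 * value; with it, the stored pointer is left alone.
 */
static void
toy_update(struct toy_knote *kn, unsigned flags, void *udata)
{
	if (flags & TOY_EV_ENABLE)
		kn->enabled = 1;
	if (flags & TOY_EV_DISABLE)
		kn->enabled = 0;
	if ((flags & TOY_EV_KEEPUDATA) == 0)
		kn->udata = udata;
}
```

In this model, a caller that only wants to disable an event can pass `TOY_EV_DISABLE | TOY_EV_KEEPUDATA` with a NULL `udata` and the stashed pointer survives.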
Konstantin Belousov
45c2c7c484 aio_aqueue(): avoid ucred leak on failure path
PR:	258698
Submitted by:	sigsys@gmail.com
MFC after:	1 week
2021-09-24 03:18:34 +03:00
Warner Losh
502dc84a8b nvme: Use shared timeout rather than timeout per transaction
Keep track of the approximate time commands are 'due' and the next
deadline for a command. Twice a second, wake up to see if any commands
have entered timeout. If so, quiesce and then enter a recovery mode
half the timeout further in the future to allow the ISR to
complete. Once we exit recovery mode, we go back to operations as
normal.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D28583
2021-09-23 16:42:08 -06:00
Kristof Provost
cb13059663 pf: fix pagefault in pf_getstatus()
We can't copyout() while holding a lock, in case it triggers a page
fault.
Release the lock before copyout, which is safe because we've already
copied all the data into the nvlist.

PR:		258601
Reviewed by:	mjg
MFC after:	1 week
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D32076
2021-09-23 21:56:59 +02:00
Wenzhuo Lu
d5ad2f2a67 e1000: fix K1 configuration
This patch is for the following updates to the K1 configurations:
Tx idle period for entering K1 should be 128 ns.
Minimum Tx idle period in K1 should be 256 ns.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>

PR:		258153
Reviewed by:	erj
Tested by:	iron.udjin@gmail.com
Approved by:	imp
Obtained from:	DPDK (6f934fa24dfd437c90ead96bc7598ee77a117ede)
MFC after:	1 week
2021-09-23 12:41:37 -07:00
Alexander Motin
ef50d5fbc3 x86: Add NUMA nodes into CPU topology.
Depending on hardware, NUMA nodes may match last level caches, or
they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC).
This information is provided by ACPI instead of CPUID, and it is
provided for each CPU individually instead of mask widths, but
this code should be able to properly handle all the above cases.

This change should immediately allow idle stealing in sched_ule(4)
to prefer load from NUMA-local CPUs to remote ones when the node
does not match LLC.  Later we may think of how to better handle it
on sched_pickcpu() side.

MFC after:	1 month
2021-09-23 14:31:38 -04:00
Randall Stewart
1ca931a540 tcp: Rack compressed ack path updates the recv window too easily
The compressed ack path of rack is not following proper procedures in updating
the peer's window. It should be checking the seq and ack values before updating and
instead it is blindly updating the values. This could in theory get the wrong window
in the connection for some length of time.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32082
2021-09-23 11:43:29 -04:00
Randall Stewart
fd69939e79 tcp: Two bugs in rack one of which can lead to a panic.
In extensive testing in NF we have found two issues inside
the rack stack.

1) An incorrect offset is being generated by the fast send path when a fast send is initiated on
   the end of the socket buffer and, before the fast send runs, the sb_compress macro adds data to the trailing socket.
   This fools the fast send code into thinking the sb offset changed, and it miscalculates an "updated offset".
   It should only do that when the mbuf in question got smaller, i.e. an ack was processed. This can lead to
   a panic dereferencing a NULL mbuf if that packet is ever retransmitted. In the best case it leads to invalid data being
   sent to the client, which usually terminates the connection. The fix is to have the proper logic (that is in the rsm fast path)
   to make sure we only update the offset when the mbuf shrinks.
2) The other issue is more bothersome. The timestamp check in rack needs to use the msec timestamp when
   comparing the timestamp echo to now. It was using a microsecond timestamp, which gives error-prone
   results but causes only small harm in trying to identify which send to use in RTT calculations if it's a retransmit.

Reviewed by: tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32062
2021-09-23 10:54:23 -04:00
Ed Maste
dbc7ca5945 vt: bound buffer access in redraw optimization
PR:		248628
Reported by:	oleg
Reviewed by:	cem, oleg (both earlier)
Fixes:		ee97b2336a ("Speed up vt(4) by keeping...")
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32059
2021-09-23 09:51:36 -04:00
Michael Tuexen
414499b3f9 sctp: Cleanup stream schedulers.
No functional change intended.

MFC after:	1 week
2021-09-23 14:16:56 +02:00
Arnaud Ysmal
0b92a7fe47 LACP: Do not wait response for marker messages not sent
The error returned when a marker message can not be emitted on a port is not handled.

This causes LACP to block all emissions until the timeout of 3 seconds is reached.

To fix this issue, I just clear the LACP_PORT_MARK flag when the packet could not be emitted.

Differential revision:	https://reviews.freebsd.org/D30467
Obtained from:		Stormshield
2021-09-23 10:57:11 +02:00
Kyle Evans
5e79bba562 kern: random: collect ~16x less from fast-entropy sources
Previously, we were collecting at a base rate of:

64 bits x 32 pools x 10 Hz = 2.5 kB/s

This change drops it to closer to 64-ish bits per pool per second, to
work a little better with entropy providers in virtualized environments
without compromising the security goals of Fortuna.

Reviewed by:	#csprng (cem, delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D32021
2021-09-23 01:03:02 -05:00
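The base-rate figure quoted in the commit above can be checked with a short calculation. The sketch below only reproduces the arithmetic from the message (64 bits x 32 pools x 10 Hz); the macro and function names are made up for this example and are not taken from the random(4) sources.

```c
#include <stdint.h>

/* Figures quoted in the commit message; the names are hypothetical. */
#define TOY_BITS_PER_POOL 64u
#define TOY_NUM_POOLS     32u
#define TOY_HARVEST_HZ    10u

/* Old base collection rate in bits per second: 64 * 32 * 10. */
static uint32_t
toy_old_rate_bits_per_sec(void)
{
	return (TOY_BITS_PER_POOL * TOY_NUM_POOLS * TOY_HARVEST_HZ);
}

/* The same rate expressed in bytes per second. */
static uint32_t
toy_old_rate_bytes_per_sec(void)
{
	return (toy_old_rate_bits_per_sec() / 8);
}
```

The result is 20480 bits per second, which is 2560 bytes per second, matching the "2.5 kB/s" in the message when a kB is taken as 1024 bytes.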
Kyle Evans
6895cade94 kern: random: drop read_rate and associated functionality
Refer to discussion in PR 230808 for a less incomplete discussion, but
the gist of this change is that we currently collect orders of magnitude
more entropy than we need.

The excess comes from bytes being read out of /dev/*random.  The default
rate at which we collect entropy without the read_rate increase is
already more than we need to recover from a compromise of an internal
state.

Reviewed by:	#csprng (cem, delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D32021
2021-09-23 01:03:01 -05:00
Wojciech Macek
7bc13692a2 hwpmc: fix performance issues
Differential revision:	https://reviews.freebsd.org/D32025

Avoid using atomics as it_wait is guarded by td_lock.

Report threshold calculation is done only if at least one PMC hook
is installed

Fixes:
* avoid unnecessary branching (if frame != null ...)
  by having PMC_HOOK_INSTALLED_ANY
  condition on the top of them, which should hint
  the core not to execute speculatively anything
  which is underneath;
* access intr_hwpmc_waiting_report_threshold cacheline
  only if at least one hook is loaded;
2021-09-23 07:15:42 +02:00
Konstantin Belousov
e36d0e86e3 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
This reverts commit 0f6829488e.
Also it changes the type of md_usr_fpu_save struct mdthread member
to void *, which is what uncovered this trouble.  Now the save area
is untyped, but since it is hidden behind accessors, it is not too
significant.  Since apparently there are consumers affected outside
the tree, this hack is better than one from the reverted revision.

PR:	258678
Reported by:	cy
Reviewed by:	cy, kevans, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32060
2021-09-22 23:17:47 +03:00
Alexander Motin
884f38590c Fix false device_set_unit() error.
It should silently succeed if the current unit number is the same as
requested, not fail immediately.

MFC after:	1 week
2021-09-22 08:44:39 -04:00
Alexander Motin
8db1669959 Fix build without SMP.
MFC after:	1 month
2021-09-21 22:13:33 -04:00
Alexander Motin
e745d729be sched_ule(4): Improve long-term load balancer.
Before this change long-term load balancer was unable to migrate
running threads, only ones waiting on run queues.  But with growing
number of CPU cores it is quite typical now for system to not have
many waiting threads.  But at the same time, if due to some coincidence two
long-running CPU-bound threads ended up sharing same physical CPU
core, they could suffer from the SMT penalty indefinitely, and the
load balancer couldn't help.

Improve that by teaching the load balancer to hint running threads
to migrate by marking them with TDF_NEEDRESCHED and new TDF_PICKCPU
flag, making sched_pickcpu() to search for better CPU later, when
it is convenient.

Fix CPU search logic when balancing to limit round-robin migrations
in case of almost equal load to the group of physical cores.  The
previous code bounced threads across all the system, that should be
pretty bad for caches and NUMA affinity, while additional fairness
was almost invisible, diminishing with number of cores in the group.

MFC after:	1 month
2021-09-21 18:19:20 -04:00
Olivier Houchard
a342ecd326 arm: Handle thumb2 thread entry point.
In cpu_set_upcall(), if the thread startup routine is a thumb routine, make
sure to set PSR_T, so that the CPU will run in thumb mode.

MFC After:      1 week
2021-09-21 23:20:27 +02:00
Olivier Houchard
2191473724 arm64: Handle thumb2 thread entry point.
In cpu_set_upcall(), if the thread startup routine is a thumb routine, make
sure to set PSR_T, so that the CPU will run in thumb mode.

MFC After:	1 week
2021-09-21 23:20:27 +02:00
Konstantin Belousov
397f188936 Remove SV_CAPSICUM
It was only needed for cloudabi

Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Konstantin Belousov
cf0ee8738e Drop cloudabi
According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by:	ed (private mail)
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Konstantin Belousov
c2ee4dfd04 ia32_get_fpcontext(): xfpusave can be legitimately NULL
Reported by:	cy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Fixes:	bd9e0f5df6
2021-09-22 00:17:06 +03:00
Alexander Motin
bd84094a51 sched_ule(4): Fix interactive threads stealing.
In scenarios when the first thread in the queue can migrate to the specified
CPU, but later ones can't, runq_steal_from() incorrectly returned NULL.

MFC after:	2 weeks
2021-09-21 16:03:32 -04:00
Alan Somers
4f917847c9 fusefs: don't panic if FUSE_GETATTR fails during VOP_GETPAGES
During VOP_GETPAGES, fusefs needs to determine the file's length, which
could require a FUSE_GETATTR operation.  If that fails, it's better to
SIGBUS than panic.

MFC after:	1 week
Sponsored by:	Axcient
Reviewed by: 	markj, kib
Differential Revision: https://reviews.freebsd.org/D31994
2021-09-21 14:01:06 -06:00
Mark Johnston
bcdc599dc2 Revert "cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it"
This reverts commit 9068f6ea69.

The underlying macro needs to be reworked to avoid problems with control
flow statements.

Reported by:	rlibby
2021-09-21 13:51:42 -04:00
Konstantin Belousov
2e79a21632 amd64: consistently use uprintf() to report weird situations in sigreturn
Reviewed by:	jhb
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
bd9e0f5df6 amd64: eliminate td_md.md_fpu_scratch
For signal send, copyout from the user FPU save area directly.

For sigreturn, we are in sleepable context and can do a temporary
allocation of the transient save area.  We cannot copy from userspace
directly to the user save area because XSAVE state needs to be validated;
also, partial copyins can corrupt it.

Requested by:	jhb
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
df8dd6025a amd64: stop using top of the thread's kernel stack for FPU user save area
Instead do one more allocation at the thread creation time.  This frees
a lot of space on the stack.

Also do not use alloca() for temporary storage in signal delivery sendsig()
function and signal return syscall sys_sigreturn().  This saves an equal
amount of space, again at the cost of one more allocation at the thread
creation time.

A useful experiment now would be to reduce KSTACK_PAGES.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
0f6829488e linux32: add a hack to avoid redefining the type of the savefpu tag
when compiling in amd64 kernel environment with -m32.  This is a temporary
workaround for some future proper (but unclear) fix.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
9151abe323 exec_machdep.c: some style, use ANSI C definition for sys_sigreturn()
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
12ca33f44f amd64: move signal handling and register structures manipulations into exec_machdep.c
from machdep.c, which is too large a pile of unrelated things.
Some ptrace functions are moved from machdep.c to ptrace_machdep.c.

Now machdep.c contains code mostly related to the low level initialization
and regular low level operation of the architecture, while signal MD code
and registers handling is placed in exec_machdep.c.

Reviewed by:	jhb, markj
Discussed with:	jrtc27
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
a42d362bb5 amd64: centralize definitions of CS_SECURE and EFL_SECURE
Requested by:	markj
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:14 +03:00
Mateusz Guzik
590d0715b3 ipsec: enter epoch before calling into ipsec_run_hhooks
pfil_run_hooks which eventually can get called asserts on it.

Reviewed by:	ae
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D32007
2021-09-21 17:02:41 +00:00
Emmanuel Vadot
559f60214b dwmmc: Remove dwmmc_setup_bus call from start_cmd
There is no need to re-setup the bus before each command.
Tested-on:  Rock64, RockPro64
Reported by:	    avg
2021-09-21 18:17:20 +02:00
Emmanuel Vadot
af32e2cc32 dwmmc: Properly implement power_off/power_up
Write to the PWREN register should be done in update_ios based
on the power_mode value in the ios struct.
Also, neither the manuals (RockChip and Altera) nor Linux mention
the need for an inverted PWREN value, so just remove this.
This fixes eMMC (and possibly SD) when u-boot didn't setup the controller.

Reported by:	avg
Tested-on:	Rock64, RockPro64
2021-09-21 18:17:20 +02:00
Mark Johnston
9068f6ea69 cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-09-21 12:07:47 -04:00
Mark Johnston
dfd3bde577 bitset(9): Introduce BIT_FOREACH_ISSET and BIT_FOREACH_ISCLR
These allow one to non-destructively iterate over the set or clear bits
in a bitset.  The motivation is that we have several code fragments
which iterate over a CPU set like this:

while ((cpu = CPU_FFS(&cpus)) != 0) {
	cpu--;
	CPU_CLR(cpu, &cpus);
	<do something>;
}

This is slow since CPU_FFS begins the search at the beginning of the
bitset each time.  On amd64 and arm64, CPU sets have size 256, so there
are four limbs in the bitset and we do a lot of unnecessary scanning.

A second problem is that this is destructive, so code which needs to
preserve the original set has to make a copy.  In particular, we have
quite a few functions which take a cpuset_t parameter by value, meaning
that each call has to copy the 32 byte cpuset_t.

The new macros address both problems.

Reviewed by:	cem, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32028
2021-09-21 12:07:39 -04:00
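The difference between the two iteration styles in the commit above can be sketched in portable userspace C. This is an illustration of the idea only and makes no claim about how `BIT_FOREACH_ISSET` is actually implemented; `struct toy_bitset` and every `toy_*` function are hypothetical names, and the 4 x 64-bit layout simply mirrors the "size 256, four limbs" description.

```c
#include <stdint.h>

#define TOY_NLIMBS 4	/* 4 x 64 bits = 256 bits */

struct toy_bitset {
	uint64_t limb[TOY_NLIMBS];
};

/* Index of the lowest set bit in a nonzero word (portable, no builtins). */
static int
toy_ctz64(uint64_t w)
{
	int n = 0;

	while ((w & 1) == 0) {
		w >>= 1;
		n++;
	}
	return (n);
}

/* 1-based index of the first set bit, scanning from limb 0; 0 if empty. */
static int
toy_ffs(const struct toy_bitset *s)
{
	for (int i = 0; i < TOY_NLIMBS; i++)
		if (s->limb[i] != 0)
			return (i * 64 + toy_ctz64(s->limb[i]) + 1);
	return (0);
}

static void
toy_clr(struct toy_bitset *s, int bit)
{
	s->limb[bit / 64] &= ~((uint64_t)1 << (bit % 64));
}

/* Old pattern: consumes a by-value copy and rescans from limb 0 each time. */
static int
toy_sum_destructive(struct toy_bitset s)
{
	int bit, sum = 0;

	while ((bit = toy_ffs(&s)) != 0) {
		bit--;
		toy_clr(&s, bit);
		sum += bit;
	}
	return (sum);
}

/* New pattern: visit each limb once and peel bits off a local word copy. */
static int
toy_sum_foreach(const struct toy_bitset *s)
{
	int sum = 0;

	for (int i = 0; i < TOY_NLIMBS; i++)
		for (uint64_t w = s->limb[i]; w != 0; w &= w - 1)
			sum += i * 64 + toy_ctz64(w);
	return (sum);
}
```

Both functions visit the same bits, but the second one takes the set by const pointer and leaves it unchanged, so the caller needs no copy.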
Michael Tuexen
762ae0ec8d sctp: Simplify stream scheduler usage
Callers are getting the stcb send lock, so just KASSERT that.
No need to signal this when calling stream scheduler functions.
No functional change intended.

MFC after:	1 week
2021-09-21 17:13:57 +02:00
Olivier Houchard
2734050154 arm64: Handle 32bits breakpoint exception.
A different exception is raised when we hit a 32bits breakpoint, rather than
a 64bits one, so handle those as well when COMPAT_FREEBSD32 is defined.
This should fix SIGBUS at least when using breakpoints with thumb2 code.

PR:		256468
MFC After:	1 week
2021-09-21 15:52:42 +02:00
Andrew Turner
5a619ca07a Fix the arm64 L2_BLOCK_MASK definition
It was missing the top 16 bits.

Sponsored by:	The FreeBSD Foundation
2021-09-21 13:47:34 +00:00
Mitchell Horne
806ebc9eba bcm2835_sdhci: don't use DMA for kernel dumps
When handling a data irq, the sdhci driver calls the
sdhci_platform_will_handle() method, to determine if it should allow the
platform driver to handle the transfer or fall back to programmed I/O.
While dumping, the data irq path may be invoked directly (not from an
interrupt context), which the bcm2835_sdhci DMA code is not prepared to
handle. Return early in this case, to force the fallback to PIO.

Otherwise, the KASSERT that follows will be triggered, and the dump will
fail. On non-INVARIANTS kernels, the system will hang, waiting for a DMA
interrupt that will never arrive.

Reviewed by:	kevans
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31893
2021-09-21 10:08:39 -03:00
Michael Tuexen
0b79a76f84 sctp: improve consistency when calling stream scheduler
Hold always the stcb send lock when calling sctp_ss_init() and
sctp_ss_remove_from_stream().

MFC after:	1 week
2021-09-21 00:54:13 +02:00
Wojciech Macek
5572fda3a2 mvneta: split to FDT and generic part
Split some missing routines.

Obtained from:	Semihalf
2021-09-21 09:38:38 +02:00
Warner Losh
44fb3c695f endian.h: Use the __bswap* versions
Make it possible to have all these macros work without bswap* being
defined. bswap* is part of the application namespace and applications
are free to redefine those functions.

Reviewed by:		emaste,jhb,markj
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D31964
2021-09-20 22:02:35 -06:00
Warner Losh
da73926566 libcam: Define depop structures and introduce scsi_wrap
Define structures related to the depop set of commands (GET PHYSICAL ELEMENT
STATUS, REMOVE ELEMENT AND TRUNCATE, and RESTORE ELEMENT AND REBUILD) as
well as the CDB construction routines.

Also create scsi_wrap.c. This will have convenience routines that will do all
the elements of allocating the ccb, generating the CDB, sending the command
(looping as necessary for cases where data is returned, but its size isn't
known up front), etc. As this functionality is fleshed out, calling many
camcontrol commands programmatically gets much easier.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D29017
2021-09-20 16:27:59 -06:00
Konstantin Belousov
2933a7ca03 aio_fsync_vnode: handle ERELOOKUP after VOP_FSYNC()
Reported by:	tmunro
Reviewed by:	jhb, tmunro
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32023
2021-09-20 21:40:17 +03:00
Konstantin Belousov
922bee44e4 aio_fsync_vnode: use for(;;) loop instead of label
Reviewed by:	jhb, tmunro
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32023
2021-09-20 21:39:46 +03:00
Greg V
c937a405bd vt: call driver's postswitch when panicking on ttyv0
In vt_kms, the postswitch callback restores fbdev mode when
panicking or entering the debugger. This ensures that even when
a graphical application was running on the first tty, simple framebuffer
mode would be restored and the panic would be visible instead
of the frozen GUI. But vt wouldn't call the postswitch callback
when we're already on the first tty, so running a GUI on it
would prevent you from reading any panics.

Reviewed by:	tsoome
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D29961
2021-09-20 20:29:37 +03:00
Mark Johnston
9e0c051249 opencrypto: Allow kern.crypto.allow_soft to be specified as a tunable
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-20 12:07:29 -04:00
Bartlomiej Grzesik
8a8166e5bc mmc: switch mmc_helper to device_ api
Add generic mmc_helper which uses newly introduced device_*_property
api. Thanks to this change the sd/mmc drivers will be capable
of parsing both DT and ACPI description.

Ensure backward compatibility for all mmc_fdt_helper users.

Reviewed by: manu, mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31598
2021-09-20 17:18:02 +02:00
Bartlomiej Grzesik
3f9a00e3b5 device: add device_get_property and device_has_property
Generalize bus specific property accessors. Those functions allow driver code
to access device specific information.

Currently there is only support for FDT and ACPI buses.

Reviewed by: manu, mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31597
2021-09-20 17:17:57 +02:00
Bartlomiej Grzesik
b91fc6c43a acpica: add ACPI_GET_PROPERTY to access Device Specific Data (DSD)
Add lazy acquiring of DSD package, which allows accessing Device
Specific Data.

Reviewed by: manu, mw
Sponsored by: Semihalf
Differential revision: https://reviews.freebsd.org/D31596
2021-09-20 16:31:08 +02:00
Michael Tuexen
34b1efcea1 sctp: use a valid outstream when adding it to the scheduler
Without holding the stcb send lock, the outstreams might get
reallocated if the number of streams is increased.

Reported by:	syzbot+4a5431d7caa666f2c19c@syzkaller.appspotmail.com
Reported by:	syzbot+aa2e3b013a48870e193d@syzkaller.appspotmail.com
Reported by:	syzbot+e4368c3bde07cd2fb29f@syzkaller.appspotmail.com
Reported by:	syzbot+fe2f110e34811ea91690@syzkaller.appspotmail.com
Reported by:	syzbot+ed6e8de942351d0309f4@syzkaller.appspotmail.com
MFC after:	1 week
2021-09-20 15:52:10 +02:00
Andrew Turner
b94d360e4a Add ELF macros found in the aaelf64 spec
The arm64 aaelf64 spec [0] has DT_AARCH64_ that could be used with
dynamic linking. It also adds GNU_PROPERTY_AARCH64_FEATURE_1_AND used
to tell the kernel which CPU features the binary is compatible with,
but does not require to execute correctly.

Add these values so the kernel and elf tools can make use of them.

[0] https://github.com/ARM-software/abi-aa/blob/2021Q1/aaelf64/aaelf64.rst

Sponsored by:	The FreeBSD Foundation
2021-09-20 10:34:04 +00:00
Hubert Mazur
b831f9ce70 if_mvneta: Build the driver as a kernel module
Fix device detach and attach routine. Add required Makefile
to build as a module. Remove entry from GENERIC, since now
it can be loaded automatically.

Tested on EspressoBin.

Obtained from:		Semihalf
Reviewed by:		manu
Differential revision:	https://reviews.freebsd.org/D31581
2021-09-20 10:58:58 +02:00
Marko Zec
2ac039f7be [fib_algo][dxr] Merge adjacent empty range table chunks.
MFC after:	3 days
2021-09-20 06:30:45 +02:00
Konstantin Belousov
bd3a668087 vm_page_startup: correct calculation of the starting page
Also avoid unneeded calculations when phys segment end is the phys_avail[]
start.

Submitted by:	alc
Reviewed by:	markj
MFC after:	1 week
Fixes:	181bfb42fd
Differential revision:	https://reviews.freebsd.org/D32009
2021-09-19 21:27:55 +03:00
Alexander Motin
5f8cb13cfb ciss(4): Fix typo.
2021-09-19 14:08:22 -04:00
Alexander Motin
e8144a13e0 ciss(4): Properly handle data underrun.
For SCSI data underrun is a part of normal life.  It should not be
reported as error.  This fixes MODE SENSE used by modern CAM.

MFC after:	1 month
2021-09-19 14:08:22 -04:00
Mark Johnston
fea1a98ead freebsd32: Fix a double copyin in sendmsg() and recvmsg()
freebsd32_sendmsg() and freebsd32_recvmsg() both copyin the message
header twice, once directly and once in freebsd32_copyinmsghdr().  The
iovec length from the former is used when copying in msg_iov, but the
rest of the kernel uses the iovec length from the latter.  When
kern_sendit() and kern_recvit() iterate over the iovec to compute the
residual for I/O, they can therefore end up walking past the end of the
copied in iovec, either resulting in a system call error, userspace
memory corruption from uiomove() with invalid iovecs, or a kernel page
fault if the copied-in iovec is followed by an unmapped KVA region.

Reported by:	syzbot+7cc64cd0c49605acd421@syzkaller.appspotmail.com
Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32010
2021-09-19 13:54:16 -04:00
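The double-fetch hazard described in the commit above can be shown with a small simulation. This is a toy model under stated assumptions: `toy_user_hdr` stands in for user memory, `toy_copyin()` stands in for copyin(), and the "racing" write is done inline rather than by a second thread. None of these names come from the FreeBSD sources.

```c
#include <stddef.h>
#include <string.h>

struct toy_msghdr {
	size_t iovlen;
};

/* Which length sized the copied-in array, and which one was used to walk it. */
struct toy_lengths {
	size_t copied;
	size_t walked;
};

/* Simulated user memory that userspace may rewrite between two reads. */
static struct toy_msghdr toy_user_hdr;

/* Stand-in for copyin(): snapshot user memory into a kernel-side copy. */
static void
toy_copyin(struct toy_msghdr *dst)
{
	memcpy(dst, &toy_user_hdr, sizeof(*dst));
}

/* Buggy shape: two fetches, so the two lengths can disagree. */
static struct toy_lengths
toy_double_fetch(size_t racing_value)
{
	struct toy_msghdr first, second;
	struct toy_lengths r;

	toy_copyin(&first);			/* sizes the iovec copy */
	toy_user_hdr.iovlen = racing_value;	/* simulated concurrent write */
	toy_copyin(&second);			/* later used to walk the iovec */
	r.copied = first.iovlen;
	r.walked = second.iovlen;
	return (r);
}

/* Fixed shape: one fetch, and the same snapshot serves both purposes. */
static struct toy_lengths
toy_single_fetch(size_t racing_value)
{
	struct toy_msghdr snap;
	struct toy_lengths r;

	toy_copyin(&snap);
	toy_user_hdr.iovlen = racing_value;	/* the write no longer matters */
	r.copied = snap.iovlen;
	r.walked = snap.iovlen;
	return (r);
}
```

With a starting length of 2 and a racing value of 1024, the double-fetch version ends up walking 1024 entries of a 2-entry array, which corresponds to the out-of-bounds walk the message describes.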
Mark Johnston
4bda16ff18 freebsd32: Provide an ANSI definition for freebsd32_recvmsg()
Fix style in the freebsd32_sendmsg() definition.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-19 13:53:57 -04:00
Michael Tuexen
e19d93b19d sctp: fix FCFS stream scheduler
Reported by:	syzbot+c6793f0f0ce698bce230@syzkaller.appspotmail.com
MFC after:	1 week
2021-09-19 11:56:26 +02:00
Kirk McKusick
d7770a5495 Eliminate snaplk / bufwait LOR when creating UFS snapshots
Each vnode has an embedded lock that controls access to its contents.
However vnodes describing a UFS snapshot all share a single snapshot
lock to coordinate their access and update. As part of creating a
new UFS snapshot, it has to have its individual vnode lock replaced
with the filesystem's snapshot lock.

The lock order for regular vnodes with respect to buffer locks is that
they must first acquire the vnode lock, then a buffer lock. The order
for the snapshot lock is reversed: a buffer lock must be acquired before
the snapshot lock.

When creating a new snapshot, the snapshot file must retain its vnode
lock until it has allocated all the blocks that it needs before
switching to the snapshot lock. This update moves one final piece of
the initial snapshot block allocation so that it is done before the
newly created snapshot is switched to use the snapshot lock.

Reported by:  Witness code
MFC after:    1 week
Sponsored by: Netflix
2021-09-18 17:02:30 -07:00
Rick Macklem
ad6dc36520 nfscl: Use vfs.nfs.maxalloclen to limit Deallocate RPC RTT
Unlike Copy, the NFSv4.2 Allocate and Deallocate operations do not
allow a reply with partial completion.  As such, the only way to
limit the time the operation takes to provide a reasonable RPC RTT
is to limit the size of the allocation/deallocation in the NFSv4.2
client.

This patch uses the sysctl vfs.nfs.maxalloclen to set
the limit on the size of the Deallocate operation.
There is no way to know how long a server will take to do a
deallocate operation, but 64Mbytes results in a reasonable
RPC RTT for the slow hardware I test on.

For an 8Gbyte deallocation, the elapsed time for doing it in 64Mbyte
chunks was the same (within margin of variability) as the
elapsed time taken for a single large deallocation
operation for a FreeBSD server with a UFS file system.
2021-09-18 14:38:43 -07:00
Mateusz Guzik
7b2ac8eb9b vfs: add missing VIRF_MOUNTPOINT in vfs_mountroot_shuffle
Reported by:	mav
2021-09-18 21:13:51 +02:00
Mateusz Guzik
0d9e99ce3b vfs: add the missing vnode interlock in vfs_mountroot_shuffle
Around v_mountedhere assignment.
2021-09-18 21:13:51 +02:00
Mark Johnston
50b07c1f71 unix: Fix a use-after-free in unp_drop()
We need to load the socket pointer after locking the PCB, otherwise
the socket may have been detached and freed by the time that unp_drop()
sets so_error.

This previously went unnoticed as the socket zone was _NOFREE.

Reported by:	pho
MFC after:	1 week
2021-09-18 10:38:39 -04:00
Franco Fichtner
8e496ea1df pf: always log nat rule and do it pre-rewrite
See also https://github.com/opnsense/core/issues/5005

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D31504
2021-09-18 13:43:41 +02:00
Mateusz Guzik
f902e4bb04 lockmgr: fix lock profiling of face adaptive spinning
2021-09-18 10:16:58 +00:00
Mateusz Guzik
a2cb65b8fe cache: count vnodes in cache_purgevfs
2021-09-18 10:16:50 +00:00
Mateusz Guzik
5d8e32a66c vfs: retire VNODE_REFCOUNT_FENCE_* macros
They are unused as of last year.
2021-09-18 10:16:00 +00:00
Warner Losh
4b977e6dda nvme/nda: Fail all nvme I/Os after controller fails
Once the controller has failed, fail all I/O w/o sending it to the
device. The reset of the nvme driver won't schedule any I/O to the
failed device, and the controller is in an indeterminate state and can't
accept I/O. Fail both at the top end of the sim and the bottom
end. Don't bother queueing up the I/O for failure in a different task.

Reviewed by:		chuck
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D31341
2021-09-17 16:09:21 -06:00
Kevin Bowling
e05d9788b7 e1000: Consistently use FALLTHROUGH
Approved by:	imp
MFC after:	1 week
2021-09-17 14:36:46 -07:00
Kevin Bowling
1bbdc25fc1 e1000: Use C99 bool types
Approved by:	imp
MFC after:	1 week
2021-09-17 14:29:12 -07:00
Kevin Bowling
984d1616be e1000: Catch up commit with DPDK
Various syncs with the e1000 shared code from DPDK:
"cid-gigabit.2020.06.05.tar.gz released by ND"

Approved by:	imp
Obtained from:	DPDK
MFC after:	1 week
2021-09-17 14:26:01 -07:00
Wenzhuo Lu
40fa6e53f5 e1000: prevent ULP flow if cable connected
Enabling ULP on link down while the cable is connected caused an infinite
loop of link up/down indications in the NDIS driver.
After discussion, the correct flow is to enable ULP only when the cable is
disconnected.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>

Approved by:	imp
Obtained from:	DPDK (4bff263d54d299269966365f9697941eecaa241b)
MFC after:	1 week
2021-09-17 14:25:38 -07:00
Andrzej Ostruszka
089cdb3990 e1000: clean LTO warnings
During an LTO build the compiler reports some 'false positive' warnings
about variables possibly being used uninitialized.  This patch silences
these warnings.

Example compiler warning to suppress (with LTO enabled):
error: 'link' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
  if (link) {

Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com>

Approved by:	imp
Obtained from:	DPDK (46136031f19107f4e9b6b3a952cb7f57877a7f0f)
MFC after:	1 week
2021-09-17 14:24:54 -07:00
Yong Wang
ecf2a89a99 e1000: fix multicast setting in VF
In function e1000_update_mc_addr_list_vf(), "msgbuf[0]" is used prior
to initialization at "msgbuf[0] |= E1000_VF_SET_MULTICAST_OVERFLOW".
And "msgbuf[0]" is overwritten at "msgbuf[0] = E1000_VF_SET_MULTICAST".

Fix it by moving the second line before the first one mentioned
above.

Fixes: dffbaf7880a8 ("e1000: revert fix for multicast in VF")
Cc: stable@dpdk.org

Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>

Approved by:	imp
Obtained from:	DPDK (f58ca2f9ef6)
MFC after:	1 week
2021-09-17 14:24:44 -07:00
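The ordering problem described in the VF multicast fix above can be shown in isolation. This is a sketch only: the `MODEL_*` constants and `model_build_header()` are hypothetical, and their values are illustrative rather than taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values; the real E1000_VF_* constants may differ. */
#define MODEL_SET_MULTICAST		0x02u
#define MODEL_SET_MULTICAST_OVERFLOW	(0x80u << 16)
#define MODEL_MAX_ADDRS			30u

/*
 * Build the first word of the mailbox message.  The plain assignment must
 * come first: OR-ing the overflow flag into msgbuf[0] before assigning it
 * would read whatever the buffer held before, and the later assignment
 * would then discard the flag anyway.
 */
static void
model_build_header(uint32_t *msgbuf, uint32_t addr_count)
{
	assert(msgbuf != NULL);
	msgbuf[0] = MODEL_SET_MULTICAST;
	if (addr_count > MODEL_MAX_ADDRS)
		msgbuf[0] |= MODEL_SET_MULTICAST_OVERFLOW;
}
```

With this order the result does not depend on the previous contents of the buffer.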
Chengwen Feng
f6517a7e69 e1000: fix timeout for shadow RAM write
This fixes the case where a timeout of the shadow RAM write (EEWR) could
not be detected.

Fixes: 5a32a257f957 ("e1000: more NICs in base driver")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>

Approved by:	imp
Obtained from:	DPDK (4a8ab48ec47b3616272e50620b8e1a9599358ea6)
MFC after:	1 week
2021-09-17 14:24:29 -07:00
Guinan Sun
9c4a0fabc8 e1000: cleanup pre-processor tags
The code has been exposed correctly, so remove the pre-processor tags.

Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (a50e998a0fd94e5db508710868a3417b1846425c)
MFC after:	1 week
2021-09-17 14:24:23 -07:00
Guinan Sun
7fb2111413 e1000: introduce DPGFR register
Defined DPGFR, Dynamic Power Gate Force Control Register.

Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (1469e5aceffbdcebe834292aadb40b1bd1602867)
MFC after:	1 week
2021-09-17 14:24:07 -07:00
Guinan Sun
de965d042f e1000: expose FEXTNVM registers and masks
Adding defines for FEXTNVM8 and FEXTNVM12 registers with new masks for
future use.

Signed-off-by: Nir Efrati <nir.efrati@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (6d208ec099cd870a73c6b444b350a82c7a26c5e4)
MFC after:	1 week
2021-09-17 14:23:26 -07:00
Guinan Sun
a8bb4ab7cf e1000: add missed define for VFTA
VLAN filtering uses the VFTA (VLAN Filter Table Array), which
should be initialized prior to setting rx mode.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (fc9933953c90e99970aa867c38f9c6e6c5d0488d)
MFC after:	1 week
2021-09-17 14:23:19 -07:00
Guinan Sun
e8e3171d99 e1000: increase timeout for ME ULP exit
Due to timing issues in WHL, and since recovery by the host is
not always supported, increase the timeout for the Manageability Engine (ME)
to finish the Ultra Low Power (ULP) exit flow for Nahum before timer expiration.

Signed-off-by: Nir Efrati <nir.efrati@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (cf1f3ca45d33e793ca581200b4000c39a798113e)
MFC after:	1 week
2021-09-17 14:23:07 -07:00
Guinan Sun
09888d4bc1 e1000: add missing register defines
Added defines for the EEC, SHADOWINF and FLFWUPDATE registers needed for
the nvmupd_validate_offset function to correctly validate the NVM update
offset.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (2c7fe65ab9a31e6ebf438dad7ccc59bcde83a89f)
MFC after:	1 week
2021-09-17 14:23:00 -07:00
Guinan Sun
a6f0cc373f e1000: add PCIm function state
Added a define for the PCIm function state.

Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (7ee1a3b273c7f321b50e6ba17c3d9537b1b08347)
MFC after:	1 week
2021-09-17 14:21:22 -07:00
Guinan Sun
d1c37752e2 e1000: expose MAC functions
Now that the functions are accessed outside of the file, we need
to properly expose them for silicon families to use.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (df01c0ee277d51f81d7d72501dba97550d3b6c4a)
MFC after:	1 week
2021-09-17 14:19:22 -07:00
Guinan Sun
1883a6ff3b e1000: update for i210 slow system clock
This code is required for the update for system clock.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (3f0188c8f29847038bc9f306b2570ace57e3811c)
MFC after:	1 week
2021-09-17 14:18:25 -07:00
Guinan Sun
6b9d35fac1 e1000: remove duplicated phy codes
Add two files base.c and base.h to reduce the redundancy
in the silicon family code.
Remove the code duplication from e1000_82575 files.
Clean family specific functions from base.
Fix up a stray and duplicate function declaration.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (44dddd14059f151f39f7e075b887decfc9a10f11)
MFC after:	1 week
2021-09-17 14:17:15 -07:00
Guinan Sun
d50f362b50 e1000: modify HW level time sync mechanisms
Add additional configuration space access to allow HW
level time sync mechanism.

Signed-off-by: Evgeny Efimov <evgeny.efimov@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (d53391f1fe2e0eba8818517fdf285f893d95dcc8)
MFC after:	1 week
2021-09-17 14:16:15 -07:00
Guinan Sun
6c59e1866c e1000: fix minor issues and improve code style
Fix a typo in the NVM access code for SPT.
Clean up the remaining instances in the shared code
that did not adhere to the Linux code standard.
Fix the wrong descriptions found in the mentioned file.
Remove shadowing variable declarations.

Address issues relating to operands in bitwise operations having different sizes.
Remove code that is unreachable since *clock_in_i2c_* always returns success.
Don't return unused s32 and don't check for constants.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com>
Signed-off-by: Robert Konklewski <robertx.konklewski@intel.com>
Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (b8592c89c8fbc871d22313dcac0b86c89a7d5a62)
MFC after:	1 week
2021-09-17 14:14:34 -07:00
Guinan Sun
5b426b3e8c e1000: add function parameter descriptions
Add function parameter descriptions to address gcc 7 warnings.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (1bf35d435c9764e83be76042fa6489dd127b6c40)
MFC after:	1 week
2021-09-17 14:13:37 -07:00
Guinan Sun
da24467c7a e1000: expose xMDIO methods
Move read and write xmdio methods to e1000_phy.

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (b14d20f1b2bb0e6d95f19963c5d7f55374e0ead9)
MFC after:	1 week
2021-09-17 14:10:02 -07:00
Guinan Sun
82a9d0c2c1 e1000: add missing device ID
Adding Intel(R) I210 Gigabit Network Connection 15F6 device ID for SGMII
flashless automotive device.

Signed-off-by: Kamil Bednarczyk <kamil.bednarczyk@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (586d770bfefc01d4af97c0ddf17c960c3e49ec22)
MFC after:	1 week
2021-09-17 14:09:32 -07:00
Guinan Sun
de0ae5d1cb e1000: support flashless i211 PBA
Add support to print PBA when using flashless.

Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Signed-off-by: Guinan Sun <guinanx.sun@intel.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>

Approved by:	imp
Obtained from:	DPDK (d3c41d90dfd5b39dec14c74cf53086f4e6634aed)
MFC after:	1 week
2021-09-17 14:07:27 -07:00