Commit Graph

141143 Commits

Author SHA1 Message Date
Doug Moore
0ce7909cd0 vm_phys: add essential segment bounds check
A lower-bound segment check is necessary in vm_phys_alloc_seg_contig.
Add one.

Reported by:	jenkins
Reviewed by:	alc
Fixes:	da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33945
2022-01-19 00:42:39 -06:00
Alan Somers
89d57b94d7 fusefs: implement VOP_DEALLOCATE
MFC after:	Never
Reviewed by:	khng
Differential Revision: https://reviews.freebsd.org/D33800
2022-01-18 21:13:02 -07:00
Alexander Motin
b7ff445ffa Reduce bufdaemon/bufspacedaemon shutdown time.
Before this change bufdaemon and bufspacedaemon threads used
kthread_shutdown() to stop activity on system shutdown.  The problem is
that kthread_shutdown() has no idea about the wait channel and lock used
by specific thread to wake them up reliably.  As result, up to 9 threads
could consume up to 9 seconds to shutdown for no good reason.

This change introduces specific shutdown functions, knowing how to
properly wake up specific threads, reducing wait for those threads on
shutdown/reboot from average 4 seconds to effectively zero.

MFC after:	2 weeks
Reviewed by:	kib, markj
Differential Revision:  https://reviews.freebsd.org/D33936
2022-01-18 19:26:16 -05:00
John Baldwin
dd2f7a4b45 Bump __FreeBSD_version for the addition of <crypto/chacha20_poly1305.h>.
Sponsored by:	The FreeBSD Foundation
2022-01-18 14:49:24 -08:00
John Baldwin
42876a039e crypto: Stop compiling in chacha20poly1305 AEAD ciphers from libsodium.
These ciphers are now supported via OCF or 'struct enc_xform'.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33889
2022-01-18 14:48:40 -08:00
John Baldwin
e71680049b crypto: Add a simple API for [X]ChaCha20-Poly1035 on flat buffers.
This is a synchronous software API which wraps the existing software
implementation shared with OCF.  Note that this will not currently
use optimized backends (such as ossl(4)) but may be appropriate for
operations on small buffers.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33524
2022-01-18 14:47:13 -08:00
Vladimir Kondratyev
89889ab470 LinuxKPI: Allow wake_up to be executed within a critical section
by replaceing of spin_lock() call with spin_lock_irqsave()

This fixes following panic in drm-kmod:

panic: mi_switch: switch in a critical section
cpuid = 2
time = 1636939794
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b
vpanic() at vpanic+0x187
panic() at panic+0x43
mi_switch() at mi_switch+0x198
__mtx_lock_sleep() at __mtx_lock_sleep+0x1c9
__mtx_lock_flags() at __mtx_lock_flags+0xa2
linux_wake_up() at linux_wake_up+0x38
__active_retire() at __active_retire+0xb7
dma_fence_signal() at dma_fence_signal+0x100
dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0x96
i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x11d0
i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x19a
drm_ioctl_kernel() at drm_ioctl_kernel+0x72
drm_ioctl() at drm_ioctl+0x2c4
linux_file_ioctl() at linux_file_ioctl+0x297
kern_ioctl() at kern_ioctl+0x1dc
sys_ioctl() at sys_ioctl+0x124
amd64_syscall() at amd64_syscall+0x124
fast_syscall_common() at fast_syscall_common+0xf8
--- syscall (54, FreeBSD ELF64, sys_ioctl)

MFC after:	1 week
Reviewed by:	manu
Reported by:	Graham Perrin <grahamperrin_AT_gmail_DOT_com>
PR:		261166
Differential Revision:	https://reviews.freebsd.org/D33888
2022-01-18 23:14:13 +03:00
Vladimir Kondratyev
02ea603302 LinuxKPI: Allow spin_lock_irqsave to be called within a critical section
with spinning on spin_trylock. dma-buf part of drm-kmod depends on this
property and absence of it support results in "mi_switch: switch in a
critical section" assertions [1][2].

[1] https://github.com/freebsd/drm-kmod/issues/116
[2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=261166

MFC after:	1 week
Reviewed by:	manu
Differential Revision:	https://reviews.freebsd.org/D33887
2022-01-18 23:14:12 +03:00
Robert Wing
a50e92cc20 geom: add kqfilter support for geom dev
The only event hooked up is NOTE_ATTRIB, which is triggered when the
device is resized. Support for other NOTE_* events to follow.

Reviewed by:	kib, jhb
Differential Revision:	https://reviews.freebsd.org/D33402
2022-01-18 10:54:59 -09:00
Kenneth D. Merry
6e8a2f0400 Update sa(4) comments and man page after review.
sys/cam/scsi/scsi_sa.c:
	Add comments explaining the priority order of the various
	sources of timeout values.  Also, explain that the probe
	that pulls in drive recommended timeouts via the REPORT
	SUPPORTED OPERATION CODES command is in a race with the
	thread that creates the sysctl variables.  Because of that
	race, it is important that the sysctl thread not load any
	timeout values from the kernel environment.

share/man/man4/sa.4:
	Use the Sy macro to emphasize thousandths of a second
	instead of capitalizing it.

Requested by:	Warner Losh <imp@freebsd.org>
Requested by:	Daniel Ebdrup Jensen <debdrup@freebsd.org>
Sponsored by:	Spectra Logic
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33883
2022-01-18 13:50:31 -05:00
Kenneth D. Merry
5719b5a1bb Switch to using drive-supplied timeouts for the sa(4) driver.
Summary:
The sa(4) driver has historically used tape drive timeouts that
were one-size fits all, with compile-time options to adjust a few
of them.

LTO-9 drives (and presumably other tape drives in the future)
implement a tape characterization process that happens the first
time a tape is loaded.  The characterization process formats the
tape to account for the temperature and humidity in the environment
it is being used in.  The process for LTO-9 tapes can take from 20
minutes (I have observed 17-18 minutes) to 2 hours according to the
documentation.

As a result, LTO-9 drives have significantly longer recommended
load times than previous LTO generations.

To handle this, change the sa(4) driver over to using timeouts
supplied by the tape drive using the timeout descriptors obtained
through the REPORT SUPPORTED OPERATION CODES command.  That command
was introduced in SPC-4.  IBM tape drives going back to at least
LTO-5 report timeout values.  Oracle/Sun/StorageTek tape drives
going back to at least the T10000C report timeout values.  HP LTO-5
and newer drives report timeout values.  The sa(4) driver only
queries drives that claim to support SPC-4.

This makes the timeout settings automatic and accurate for newer
tape drives.

Also, add loader tunable and sysctl support so that the user can
override individual command type timeouts for all tape drives in
the system, or only for specific drives.

The new global (these affect all tape drives) loader tunables are:

	kern.cam.sa.timeout.erase
	kern.cam.sa.timeout.load
	kern.cam.sa.timeout.locate
	kern.cam.sa.timeout.mode_select
	kern.cam.sa.timeout.mode_sense
	kern.cam.sa.timeout.prevent
	kern.cam.sa.timeout.read
	kern.cam.sa.timeout.read_position
	kern.cam.sa.timeout.read_block_limits
	kern.cam.sa.timeout.report_density
	kern.cam.sa.timeout.reserve
	kern.cam.sa.timeout.rewind
	kern.cam.sa.timeout.space
	kern.cam.sa.timeout.tur
	kern.cam.sa.timeout.write
	kern.cam.sa.timeout.write_filemarks

The new per-instance loader tunable / sysctl variables are:

	kern.cam.sa.%d.timeout.erase
	kern.cam.sa.%d.timeout.load
	kern.cam.sa.%d.timeout.locate
	kern.cam.sa.%d.timeout.mode_select
	kern.cam.sa.%d.timeout.mode_sense
	kern.cam.sa.%d.timeout.prevent
	kern.cam.sa.%d.timeout.read
	kern.cam.sa.%d.timeout.read_position
	kern.cam.sa.%d.timeout.read_block_limits
	kern.cam.sa.%d.timeout.report_density
	kern.cam.sa.%d.timeout.reserve
	kern.cam.sa.%d.timeout.rewind
	kern.cam.sa.%d.timeout.space
	kern.cam.sa.%d.timeout.tur
	kern.cam.sa.%d.timeout.write
	kern.cam.sa.%d.timeout.write_filemarks

The values are reported and set in units of thousandths of a
second.

share/man/man4/sa.4:
	Document the new loader tunables in the sa(4) man page.

sys/cam/scsi/scsi_sa.c:
	Add a new timeout_info array to the softc.

	Add a default timeouts array, along with descriptions.

	Add a new sysctl tree to the softc to handle the timeout
	sysctl values.

	Add a new function, saloadtotunables(), that will load
	the global loader tunables first and then any per-instance
	loader tunables second.

	Add creation of the new timeout sysctl variables in
	sasysctlinit().

	Add a new, optional probe state to the sa(4) driver.  We
	previously didn't do any probing, but now we probe for
	timeout descriptors if the drive claims to support SPC-4 or
	later.  In saregister(), we check the SCSI revision and
	either launch the probe state machine, or announce the
	device and become ready.

	In sastart() and sadone(), add support for the new
	SA_STATE_PROBE.  If we're probing, we don't go through
	saerror(), since that is currently only written to handle
	I/O errors in the normal state.

	Change every place in the sa(4) driver that fills in
	timeout values in a CCB to use the new timeout_info[] array
	in the softc.

	Add a new saloadtimeouts() routine to parse the returned
	timeout descriptors from a completed REPORT SUPPORTED
	OPERATION CODES command, and set the values for the
	commands we support.

MFC after:	1 week
Sponsored by:	Spectra Logic

Test Plan:
Try this out with a variety of tape drives and make sure the timeouts that
result (sysctl kern.cam.sa to see them) are reasonable.

Reviewers: #manpages, #cam

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D33883
2022-01-18 13:50:30 -05:00
Doug Moore
da92ecbc0d vm_phys: fix seg->end test in alloc_seg_contig
In vm_phys_alloc_seg_contig, in allocating multiple memory blocks for
a huge allocation, ensure that the end of the allocated range does not
exceed the upper segment limit.

Reorder a couple of checks to improve code layout.

Reviewed by:	alc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33870
2022-01-18 12:49:09 -06:00
John Baldwin
da7fc5c33f freebsd32: Fix layout of struct shmid_kernel32.
The kernel pointers in this structure need to be 32-bit pointers,
not native pointers to 32-bit integers.

Reviewed by:	kib
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33905
2022-01-18 10:42:21 -08:00
John Baldwin
a3af69fa81 iscsi: Abort fewer data-out tasks on a terminating session.
Only abort tasks queued for datamove after
cfiscsi_sesssion_terminate_tasks has posted its internal
CTL_TASK_I_T_NEXUS_RESET task.

Reported by:	Jithesh Arakkan @ Chelsio
Reviewed by:	mav
Fixes:		0cd6e85e24 iscsi: Abort data-out tasks queued on a terminating session.
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D33747
2022-01-18 09:28:43 -08:00
Cy Schubert
64e33c5cb1 Revert "wpa: Import wpa 2.10."
This reverts commit 5eb81a4b40, reversing
changes made to c6806434e7 and
this reverts commit 679ff61123.

What happend is git rebase --rebase-merges doesn't do what is expected.
2022-01-18 08:10:33 -08:00
Mark Johnston
46d35d415a fork: Copy the vm_stacktop field into the new vmspace
Fixes:	1811c1e957 ("exec: Reimplement stack address randomization")
Reported by:	pho
Reported by:	syzbot+0446312a51bc13ead834@syzkaller.appspotmail.com
Sponsored by:	The FreeBSD Foundation
2022-01-18 10:51:49 -05:00
Cy Schubert
5eb81a4b40 wpa: Import wpa 2.10.
The long awaited hostapd 2.10 is finally here.

MFC after:	3 weeks
2022-01-18 07:45:39 -08:00
Randall Stewart
aac52f94ea tcp: Warning cleanup from new compiler.
The clang compiler recently got an update that generates warnings of unused
variables where they were set, and then never used. This revision goes through
the tcp stack and cleans all of those up.

Reviewed by: Michael Tuexen, Gleb Smirnoff
Sponsored by: Netflix Inc.
Differential Revision:
2022-01-18 07:41:18 -05:00
Roger Pau Monné
e0516c7553 x86/apic: remove apic_ops
All supported Xen instances by FreeBSD provide a local APIC
implementation, so there's no need to replace the native local APIC
implementation anymore.

Leave just the ipi_vectored hook in order to be able to override it
with an implementation based on event channels if the underlying local
APIC is not virtualized by hardware. Note the hook cannot use ifuncs,
because at the point where ifuncs are resolved the kernel doesn't yet
know whether it will benefit from using the optimization.

Sponsored by: Citrix Systems R&D
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D33917
2022-01-18 10:19:04 +01:00
Roger Pau Monné
2450da6776 x86/xen: use x{2}APIC if virtualized by hardware
Instead of using event channels or hypercalls to deal with IPIs and
NMIs.

Using a hardware virtualized APIC should be faster than using any PV
interface, since the VM exit can be avoided.

Xen exposes whether the domain is using hardware assisted x{2}APIC
emulation in a CPUID bit.

Sponsored by: Citrix Systems R&D
2022-01-18 10:18:22 +01:00
Mark Johnston
5072251428 cryptosoft: Avoid referencing end-of-buffer cursors
Once a crypto cursor has reached the end of its buffer, it is invalid to
call crypto_cursor_segment() for at least some crypto buffer types.
Reorganize loops to avoid this.

Fixes:	cfb7b942be ("cryptosoft: Use multi-block encrypt/decrypt for non-AEAD ciphers.")
Fixes:	a221a8f4a0 ("cryptosoft: Use multi-block encrypt/decrypt for AES-GCM.")
Fixes:	f8580fcaa1 ("cryptosoft: Use multi-block encrypt/decrypt for AES-CCM.")
Fixes:	5022c68732 ("cryptosoft: Use multi-block encrypt/decrypt for ChaCha20-Poly1305.")
Reported and tested by:	madpilot
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
2022-01-17 19:01:24 -05:00
Mark Johnston
3ce04aca49 proc: Add a sysctl to fetch virtual address space layout info
This provides information about fixed regions of the target process'
user memory map.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33708
2022-01-17 16:12:43 -05:00
Mark Johnston
1811c1e957 exec: Reimplement stack address randomization
The approach taken by the stack gap implementation was to insert a
random gap between the top of the fixed stack mapping and the true top
of the main process stack.  This approach was chosen so as to avoid
randomizing the previously fixed address of certain process metadata
stored at the top of the stack, but had some shortcomings.  In
particular, mlockall(2) calls would wire the gap, bloating the process'
memory usage, and RLIMIT_STACK included the size of the gap so small
(< several MB) limits could not be used.

There is little value in storing each process' ps_strings at a fixed
location, as only very old programs hard-code this address; consumers
were converted decades ago to use a sysctl-based interface for this
purpose.  Thus, this change re-implements stack address randomization by
simply breaking the convention of storing ps_strings at a fixed
location, and randomizing the location of the entire stack mapping.
This implementation is simpler and avoids the problems mentioned above,
while being unlikely to break compatibility anywhere the default ASLR
settings are used.

The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
and is re-enabled by default.

PR:		260303
Reviewed by:	kib
Discussed with:	emaste, mw
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:12:36 -05:00
Mark Johnston
758d98debe exec: Remove the stack gap implementation
ASLR stack randomization will reappear in a forthcoming commit.  Rather
than inserting a random gap into the stack mapping, the entire stack
mapping itself will be randomized in the same way that other mappings
are when ASLR is enabled.

No functional change intended, as the stack gap implementation is
currently disabled by default.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston
706f4a81a8 exec: Introduce the PROC_PS_STRINGS() macro
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro.  With stack address randomization the
ps_strings address is no longer fixed.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston
5a8413e779 setrlimit: Remove special handling for RLIMIT_STACK with a stack gap
This will not be required with a forthcoming reimplementation of ASLR
stack randomization.  Moreover, this change was not sufficient to enable
the use of a stack size limit smaller than the stack gap itself.

PR:		260303
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:42:13 -05:00
Mark Johnston
3fc21fdd5f sysent: Add a sv_psstringssz field to struct sysentvec
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:42:07 -05:00
Mark Johnston
1544f5add8 Revert "kern_exec: Add kern.stacktop sysctl."
The current ASLR stack gap feature will be removed, and with that the
need for the kern.stacktop sysctl is gone.  All consumers have been
removed.

This reverts commit a97d697122.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:41:58 -05:00
Mark Johnston
dc7526170d posixshm: Report output buffer truncation from kern.ipc.posix_shm_list
PR:		240573
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33912
2022-01-17 08:35:19 -05:00
Roger Pau Monné
ad15eeeaba x86/xen: fallback when VCPUOP_send_nmi is not available
It has been reported that on some AWS instances VCPUOP_send_nmi
returns -38 (ENOSYS). The hypercall is only available for HVM guests
in Xen 4.7 and newer. Add a fallback to use the native NMI sending
procedure when VCPUOP_send_nmi is not available, so that the NMI is
not lost.

Reported and Tested by: avg
MFC after: 1 week
Fixes: b2802351c1 ('xen: fix dispatching of NMIs')
Sponsored by: Citrix Systems R&D
2022-01-17 11:06:40 +01:00
Alexander V. Chernikov
1f70a85b4c linux: add sysctl to pass untranslated interface names
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33792
2022-01-17 09:35:15 +00:00
Alexander V. Chernikov
96c524d8b2 linux: fix linux_recvmsg() MSG_PEEK flag handling
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33790
2022-01-17 09:35:15 +00:00
Edward Tomasz Napierala
b896bdb86d linux: Make compat.linux.preserve_vstatus default to 1
From a user point of view, this makes ^T work out of the box.

Reviewed By:	debdrup (man page)
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D33842
2022-01-17 08:45:01 +00:00
Bjoern A. Zeeb
c3db9d4a14 net80211: ieee80211_dump_node() cosmetics
Printing %p does not need the 0x prefix and while here mark the
ieee80211_node_table argument unused given we do not need it in the
current incarnation of the function.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-17 00:01:46 +00:00
Marko Zec
e7abe200c2 fib_algo: shift / mask by constants in dxr_lookup()
Since trie configuration remains invariant during each DXR instance
lifetime, instead of shifting and masking lookup keys by values
computed at runtime, compile upfront several dxr_lookup()
configurations with hardcoded shift / mask constants, and choose the
apropriate lookup function version after each DXR instance rebuild.

In synthetic tests this yields small but measurable (5-10%) lookup
throughput improvement, depending on FIB size and  prefix patterns.

MFC after:	3 days
2022-01-17 00:13:47 +01:00
Bjoern A. Zeeb
d7ce88aafc LinuxKPI: 802.11 correct enum ieee80211_channel_flags
enum ieee80211_channel_flags are used as bit fields and not as 1..n.
Correct the values using BIT(n).

This is also hoped to fix problems with 7260 cards which come up and
panic due to an empty channel list as all channels are set disabled [1].
It will hopefully also fix the one or other oddity.

Reported by:	ambrisko, Mike Tancsa (mike sentex.net) [1]
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-16 22:30:38 +00:00
Kristof Provost
e5ca5e801d pf: ensure we don't destroy an uninitialised lock
The new lock introduced in 5f5e32f1b3 needs to be initialised early so
that it can be safely destroyed if we error out.

Reported-by: syzbot+d76113e9a4ae0c0fcac2@syzkaller.appspotmail.com
MFC after:	3 weeks
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-01-16 09:04:57 +01:00
Alexander Motin
e76c010899 Fix inverse sleep logic in buf_daemon().
Before commit 3cec5c77d6 buf_daemon() went to longer 1s sleep if
numdirtybuffers <= lodirtybuffers.  After that commit new condition
!BIT_EMPTY(BUF_DOMAINS, &bdlodirty) got opposite -- true when one
or more more domains is above lodirtybuffers.  As result, on freshly
booted system with no dirty buffers buf_daemon() wakes up 10 times
per second and probably only 1 time per second when there is actual
work to do.

MFC after:	1 week
Reviewed by:	kib, markj
Tested by:	pho
Differential revision:	https://reviews.freebsd.org/D33890
2022-01-15 19:32:36 -05:00
Bjoern A. Zeeb
c8dafefaee LinuxKPI: 802.11 Refine/add DTIM/TSF handling
Correct data types related to delivery traffic indication map (DTIM)/
timing synchronization function (TSF) and implement/refine their
handling.  This information is used/needed by iwlwifi to set a station
as associated.  This will hopefully avoid more "no beacon heard"
time event failures.

The recording of the Linux specific sync_device_ts is done in the
receive path for now in case we do have the right information
available.  I need to investigate as to how-much it may make sense
to also migrate it into net80211 in the future depending on the
usage in other drivers (or how we did handle this in the past in
natively ported versions, e.g. iwm).

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-15 22:22:30 +00:00
Bjoern A. Zeeb
f3229b62a1 LinuxKPI: 802.11 handle connection loss differently
Rather than just bouncing back to SCAN bounce to INIT on connection
loss.  This is should be refined in the future as the comment already
indicates but we need to tie two different worlds together.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-15 22:18:58 +00:00
Jessica Clarke
4e3a43905e ofw_pci: Fix incorrectly sized softc causing pci(4) out-of-bounds reads
We do not include sys/rman.h and so machine/resource.h ends up not being
included by the time pci_private.h is included. This means PCI_RES_BUS
is never defined, and so the sc_bus member of pci_softc is not present
when compiling ofw_pci, resulting in the wrong softc size being passed
to DEFINE_CLASS_1 and thus any attempts by pci(4) to access that member
are out-of-bounds reads or writes.

This is pretty fragile; arguably pci_private.h should be including
sys/rman.h, but this is the minimal needed change to fix the bug whilst
maintaining the status quo.

Found by:	CHERI
Reported by:	andrew
2022-01-15 19:03:53 +00:00
Colin Percival
de1292c6ff Use CPUID leaf 0x40000010 for local APIC freq
Some VM systems announce the frequency of the local APIC via the
CPUID leaf 0x40000010.  Using this allows us to boot slightly
faster by avoiding the need for timer calibration.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva
2022-01-14 17:30:17 -08:00
Colin Percival
4a432614f6 TSC: Use 0x40000010 CPUID leaf for all VM types
While this CPUID leaf was originally only used by VMWare, other
hypervisors now also use it to announce the TSC frequency to guests.

This speeds up the boot process by 100 ms in EC2 and other systems,
by allowing the early calibration DELAY to be skipped.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva
2022-01-14 17:30:17 -08:00
Colin Percival
fd980feb57 Detect CPU type before asking VMWare for TSC freq
This allows us to set tsc_is_invariant and select appropriately
fenced versions of RDTSC based on the CPU type.

Reviewed by:	markj
Sponsored by:	https://www.patreon.com/cperciva
2022-01-14 17:30:17 -08:00
Navdeep Parhar
a727d9531a cxgbe(4): Fix bad races between sysctl and driver detach.
The default sysctl context setup by newbus for a device is eventually
freed by device_sysctl_fini, which runs after the device driver's detach
routine.  sysctl nodes associated with this context must not use any
resources (like driver locks, hardware access, counters, etc.) that are
released by driver detach.

There are a lot of sysctl nodes like this in cxgbe(4) and the fix is to
hang them off a context that is explicitly freed by the driver before it
releases any resource that might be used by a sysctl.

This fixes panics when running "sysctl dev.t6nex dev.cc" in a tight loop
and loading/unloading the driver in parallel.

Reported by:	Suhas Lokesha
MFC after:	1 week
Sponsored by:	Chelsio Communications
2022-01-14 16:44:57 -08:00
Ed Maste
301b2b02df snd_hda: restore pin patch for headphones on Lenovo X1 7th Gen
Fixes:		ef790cc740 ("hdaa: update pin patch configurations")
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33900
2022-01-14 19:23:15 -05:00
Warner Losh
4762aa57ad ata_xpt: Rename probe_softc to aprobe_softc
Since both scsi_xpt and ata_xpt use the same name for the softc, this
can lead to problems in gdb. Avoid the issue by renaming the ata
probe_softc to aprobe_softc as has been done for the aprobe in
0f280cbd0a. This was overlooked at the time.

Sponsored by:		Netflix
MFC After:		2 weeks
2022-01-14 17:21:09 -07:00
Simon J. Gerraty
bacb140f31 Ignore calcru: runtime went backwards for vm_guest
VM's have little control over CPU speed, don't make matters worse
by constantly spaming console.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D33902
2022-01-14 16:07:43 -08:00
Alexander Motin
a9a2cdaf3c cam: Optimize write protection MODE SENSE in da(4).
Before this change on every open da(4) driver read all mode pages to
use only one bit.  It was done so to not depend on the list of pages
supported by the disk.  But I've found that at least for SATL of LSI/
Broadcom HBAs with WD HDDs Power Condition mode page reading may take
significant amount of time, much more than any other mode page, that
visibly increased disk retaste time by GEOM.

Address that by using data returned by the first MODE SENSE request
to limit the following ones to only one (the first for now) mode page.

With the change simultaneous retaste of 39 SATA disks takes about 2.5s
instead of more than 4s before, and I no longer see "dareprobe" status
on GEOM event thread.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2022-01-14 18:24:52 -05:00
Vincenzo Maffione
e0e1240528 netmap: fix LOR in iflib_netmap_register
In iflib_device_register(), the CTX_LOCK is acquired first and then
IFNET_WLOCK is acquired by ether_ifattach(). However, in netmap_hw_reg()
we do the opposite: IFNET_RLOCK is acquired first, and then CTX_LOCK
is acquired by iflib_netmap_register(). Fix this LOR issue by wrapping
the CTX_LOCK/UNLOCK calls in iflib_device_register with an additional
IFNET_WLOCK. This is safe since the IFNET_WLOCK is recursive.

MFC after:	1 month
2022-01-14 21:09:04 +00:00