269991 Commits

Author SHA1 Message Date
Damian Szuberski
8ac58c3f56
Fix zfs:AUTO autodetection in initramfs scripts
Don't exit early in find_rootfs() when zpool.bootfs
is set to `zfs:AUTO`.

Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes #12658
2021-11-13 08:02:50 -07:00
Pawel Jakub Dawidek
ac32854a6e
Remove (now unused) td argument from zfs_lookup()
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes #12748
2021-11-12 17:06:44 -08:00
George Amanakis
c9d62d1356
Introduce a tunable to exclude special class buffers from L2ARC
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.

Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #11761 
Closes #12285
2021-11-11 12:52:16 -08:00
наб
420b44488f
Remove basename(1). Clean up/shorten some coreutils pipelines
Basenames that remain, in cmd/zed/zed.d/statechange-led.sh:
	dev=$(basename "$(echo "$therest" | awk '{print $(NF-1)}')")
	vdev=$(basename "$ZEVENT_VDEV_PATH")
I don't wanna interfere with #11988

scripts/zfs-tests.sh:
	SINGLETESTFILE=$(basename "$SINGLETEST")
tests/zfs-tests/tests/functional/cli_user/zfs_list/zfs_list.kshlib:
	ACTUAL=$(basename $dataset)
	ACTUAL=$(basename $dataset)
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
	zpool_iostat_-c_homedir.ksh:
	typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_iostat/
	zpool_iostat_-c_searchpath.ksh:
	typeset CMD_1=$(basename "$SCRIPT_1")
	typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
	zpool_status_-c_homedir.ksh:
	typeset USER_SCRIPT=$(basename "$USER_SCRIPT_FULL")
tests/zfs-tests/tests/functional/cli_user/zpool_status/
	zpool_status_-c_searchpath.ksh
	typeset CMD_1=$(basename "$SCRIPT_1")
	typeset CMD_2=$(basename "$SCRIPT_2")
tests/zfs-tests/tests/functional/migration/migration.cfg:
	export BNAME=`basename $TESTFILE`
tests/zfs-tests/tests/perf/perf.shlib:
	typeset logbase="$(get_perf_output_dir)/$(basename \
tests/zfs-tests/tests/perf/perf.shlib:
	typeset logbase="$(get_perf_output_dir)/$(basename \

These are potentially Of Directories, where basename is actually
useful

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes #12652
2021-11-11 13:27:37 -07:00
Fedor Uporov
49d42425d6
Check l2cache vdevs pending list inside the vdev_inuse()
The l2cache device could be added twice because vdev_inuse() does not
check spa_l2cache for added devices. Make l2cache vdevs inuse checking
logic more closer to spare vdevs.

Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #9153 
Closes #12689
2021-11-11 11:54:15 -08:00
Fedor Uporov
d04b5c9e87
zhack: Add repair label option
In case if all label checksums will be invalid on any vdev, the pool
will become unimportable. The zhack with newly added cli options could
be used to restore label checksums and make pool importable again.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #2510
Closes #12686
2021-11-11 11:26:18 -08:00
Palash Gandhi
637771a066
ZTS: zfs_list_004_neg should not check paths that belong to ZFS
When ZFS is on root, /tmp is a ZFS. This causes zfs_list_004_neg to
fail since `zfs list` on /tmp passes when the test expects it not to.
The fix is to exclude paths that belong to ZFS.

Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Palash Gandhi <pbg4930@rit.edu>
Closes #12744
2021-11-11 08:46:44 -07:00
Brian Behlendorf
c23803be84
Restore dirty dnode detection logic
In addition to flushing memory mapped regions when checking holes,
commit de198f2d95 modified the dirty dnode detection logic to check
the dn->dn_dirty_records instead of the dn->dn_dirty_link.  Relying
on the dirty record has not be reliable, switch back to the previous
method.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900 
Closes #12745
2021-11-10 16:14:32 -08:00
Brian Behlendorf
371e0f7754
Exclude zfs_copies_003_pos on Linux
This test case may fail on 5.13 and newer Linux kernels if the
/dev/zvol/ device is not created by udev.

Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes  #12738
2021-11-10 13:56:01 -07:00
Fedor Uporov
2a9c572059
zdb: Report bad label checksum
In case if all label checksums will be invalid on any vdev, the pool
will become unimportable. From other side zdb with -l option will not
provide any useful information why it happened. Add notifications
about corrupted label checksums.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes #2509
Closes #12685
2021-11-10 12:22:00 -07:00
George V. Neville-Neil
06a8ffd4cd Fix two more nits from 0mp 2021-11-10 13:09:19 -05:00
George V. Neville-Neil
409d1bf7d6 Address review comments from 0mp, debdrup and oshogbo 2021-11-10 13:09:18 -05:00
George V. Neville-Neil
406feaa862 Initial clean up the language in the manual pages.
Summary: The manual pages need a bit of editing for language and clarity.

Reviewers: oshogbo, #manpages

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D32976
2021-11-10 13:09:18 -05:00
Emmanuel Vadot
123b5b8763 loader: Do not force comconsole for arm and arm64
This makes GOP not probed on some situation (AMD Card on PCIe slot
with EDK2 as we have a SERIAL_IO_PROTOCOL compatible uart).

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D32992
Sponsored by: Beckhoff Automation GmbH & Co. KG
2021-11-16 10:11:56 +01:00
Emmanuel Vadot
2e0d67c3ed loader: lsefi: Print more information
Printing the EFI_HANDLE pointer isn't very useful.
If the handle have a IMAGE_DEVICE_PATH or a DEVICE_PATH protocol print it.
This makes it easier to see which devices are present and what protocol they
expose.

Reviewed by:	imp, tsoome
Differential Revision:	https://reviews.freebsd.org/D32991
Sponsored by: Beckhoff Automation GmbH & Co. KG
2021-11-16 10:11:53 +01:00
Warner Losh
2bbaed4d7f mpr: Minor formatting changes to match mps.
Minor reformatting nits to make mprsas_scsiio_timeout match
mpssas_scsiio_timeout more closely. The differences aren't necessary and
are distracting when comparing the routines. No functional changes.

Sponsored by:		Netflix
2021-11-15 21:27:15 -07:00
Warner Losh
b086bc0bf1 mps: Fix debugging line
Print cm instead of sc here, as is done in mpr. We can get the sc from
cm, but not vice versa.

Sponsored by:		Netflix
2021-11-15 21:27:14 -07:00
Kirk McKusick
9b8eb1c5b6 Followup to f2b391528ad9 to improve printed message.
Sponsored by: Netflix
2021-11-15 16:10:02 -08:00
Kirk McKusick
9e9dcac95a Allow forced r/w mount of UFS/FFS filesystem with a bad check hash.
Normally a UFS/FFS filesystem with a bad check hash can only be
mounted read only. With this commit the mount(8) -f (force) option
can be used to force a read-write mount of a UFS/FFS filesystem with
a bad check hash. Conveniently the filesystem will proceed to
update its on-disk superblock with a corrected check hash.

Sponsored by: Netflix
2021-11-15 16:03:47 -08:00
Kirk McKusick
f10a8d0971 Allow the MNT_FORCE flag to be passed through to an initial mount.
When doing an initial mount(8) with its -f (force) flag, the MNT_FORCE
flag is not passed through to the underlying filesystem mount routine.
MNT_FORCE is only passed through on later updates to an existing
mount. With this commit the MNT_FORCE flag is now passed through on the
initial mount.

Sanity check: kib
Sponsored by: Netflix
2021-11-15 15:45:56 -08:00
Mark Johnston
22875f8879 x86: Implement deferred TSC calibration
There is no universal way to find the TSC frequency.  Newer Intel CPUs
may report it via CPUID leaves 0x15 and 0x16.  Sometimes it can be
obtained from the PLATFORM_INFO MSR as well, though we never use that.
On older platforms we derive the frequency using a DELAY(1000000) call,
which uses the 8254 PIT.  On some newer platforms the 8254 is apparently
non-functional, leading to bogus calibration results.  On such platforms
the TSC frequency must be available from CPUID.  It is also possible to
disable calibration with a tunable, in which case we try to parse the
brand string if the TSC freq is not available from CPUID.

CPUID 0x15 provides an authoritative TSC frequency value, but even that
is not always available on new Intel platforms.  CPUID 0x16 provides the
specified processor base frequency, which is not the same as the TSC
frequency.  Empirically, it is close enough for early boot, but too far
off for timekeeping: on a Comet Lake NUC, CPUID 0x16 yields 1600MHz but
the TSC frequency is rougly 1608MHz, leading to frequent clock stepping
when NTP is in use.

Thus we have a situation where we cannot calibrate using the PIT and
cannot obtain a precise frequency from CPUID (or MSRs).  This change
seeks to address that by using the CPUID 0x16 value during early boot
and refining the calibration later once ACPI-based timecounters are
available.  TSC frequency detection is thus split into two phases:

Early phase:
- On Intel platforms, query CPUID 0x15 and 0x16 and use that value
  initially if available.
- Otherwise, get an estimate using the PIT, reducing the delay loop to
  100ms from 1s.
- Continue to register the TSC as the CPU ticks provider early, even
  though the frequency may be off.  Otherwise any code executed during
  boot that uses cpu_ticks() (e.g., context switching) gets tripped up
  when the ticks provider changes.

Later phase:
- In SI_SUB_CLOCKS, once the timehands are initialized, load the current
  TSC and timecounter (sbinuptime()) values at the beginning and end of
  a 1s interval and use the timecounter frequency (typically from
  kvmclock, HPET or the ACPI PM timer) to estimate the TSC frequency.
- Update the TSC timecounter, global tsc_freq and CPU ticker with the
  new frequency and finally register the TSC as a timecounter.

Reviewed by:	kib, jhb (previous version)
Discussed with:	imp, cperciva
MFC after:	6 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32512
2021-11-15 16:13:24 -05:00
Mark Johnston
2287ced2f5 clock: Group the "clocks" SYSINIT with the function definition
This is how most SYSINITs are defined.  Also annotate the dummy
parameter with __unused.  No functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-15 16:13:24 -05:00
Konstantin Belousov
7d20a08076 ldd: also use exec mode for -a
The -a option also requires passing specific environment variables to
instance of rtld doing tracing.

PR:	259069
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-11-15 22:49:33 +02:00
Ed Maste
3f9acedb02 growfs: do not error if filesystem is already requested size
For some cloud/virtualization use cases it can be convenient to grow the
filesystem on boot any time the disk/partition happens to be larger, but
not fail if it remains the same size.

Continue to emit a message if we have no action to take, but exit with
status 0 if the size remains the same.

Reviewed by:	trasz
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32856
2021-11-15 15:40:57 -05:00
Navdeep Parhar
a8eacf9329 cxgbe(4): Change the way t4_shutdown_adapter brings the link(s) down.
Modify the GPIO pins only on the Base-T cards and even there drive all
of them low instead of putting them in hi-z state.  For the rest (this
is the common case), directly power off the PLLs of the high speed
serdes.  This is the simplest method that does not involve or conflict
with the firmware but still works with all T4-T6 cards regardless of
what's plugged into the port.

This fixes a problem where the peer wouldn't always see a link down if
it is connected to the device using a -CR4 copper cable.

MFC after:	3 weeks
Sponsored by:	Chelsio Communications
2021-11-15 12:17:26 -08:00
John Baldwin
83a54b582f ktls: Add tests ensuring unsupported receive cipher suites are rejected.
Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32982
2021-11-15 11:32:49 -08:00
John Baldwin
233ce578a4 ktls: Add tests ensuring invalid receive cipher suites are rejected.
Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32981
2021-11-15 11:32:15 -08:00
John Baldwin
3e7f8a8da2 ktls: Add simple receive tests of kernel TLS.
Similar to the simple transmit tests added in
a10482ea7476d68d1ab028145ae6d97cef747b49, these tests test the kernel
TLS functionality directly by manually encrypting TLS records using
randomly generated keys and writing them to a socket to be processed
by the kernel.

Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32980
2021-11-15 11:31:16 -08:00
John Baldwin
d1c369f926 ktls: Add tests ensuring various invalid cipher suites are rejected.
Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32843
2021-11-15 11:30:48 -08:00
John Baldwin
900a28fe33 ktls: Reject some invalid cipher suites.
- Reject AES-CBC cipher suites for TLS 1.0 and TLS 1.1 using auth
  algorithms other than SHA1-HMAC.

- Reject AES-GCM cipher suites for TLS versions older than 1.2.

Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32842
2021-11-15 11:30:12 -08:00
John Baldwin
0ff2a12ae3 ktls: Add tests for sending empty fragments for TLS 1.0 connections.
Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32841
2021-11-15 11:28:12 -08:00
John Baldwin
44265dc3da ktls: Add padding tests for AES-CBC MTE cipher suites.
For each AES-CBC MTE cipher suite, test sending records with 1 to 16
bytes of payload.  This ensures that all of the potential padding
values are covered.

Reviewed by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D32840
2021-11-15 11:26:45 -08:00
Brooks Davis
9df4e7e2e9 syscalls: regen 2021-11-15 18:34:28 +00:00
Brooks Davis
01ce7fca44 ommap: fix signed len and pos arguments
4.3 BSD's mmap took an int len and long pos.  Reject negative lengths
and in freebsd32 sign-extend pos correctly rather than mis-handling
negative positions as large positive ones.

Reviewed by:	kib
2021-11-15 18:34:28 +00:00
Brooks Davis
3f8ced5bce syscalls: regen 2021-11-15 18:34:27 +00:00
Brooks Davis
8e4a3add99 struct kevent_freebsd11 -> struct freebsd11_kevent
Rename to match the naming of syscalls and allow 32 to be appended
without making an ugly name like kevent_freebsd1132.

While here, make the kevent changelist argument const.

Reviewed by:	kib
2021-11-15 18:34:27 +00:00
Brooks Davis
f0da2a1467 syscalls: unwrap a long line
Style dictates that each variable is on a single line

Reviewed by:	kib
2021-11-15 18:34:27 +00:00
Navdeep Parhar
8e76bae0b7 cxgbe(4): Keep link configuration compatible with really old firmwares.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2021-11-15 10:18:04 -08:00
Mark Johnston
ab12e8db29 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:31 -05:00
Mark Johnston
71e6e9da22 amd64: Initialize kernel_pmap's active CPU set to all_cpus
This is in preference to simply filling the cpuset, and allows the
conditional in pmap_invalidate_cpu_mask() to be elided.

Also export pmap_invalidate_cpu_mask() outside of pmap.c for use in a
subsequent commit.

Suggested by:	kib
Reviewed by:	alc, kib
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:30 -05:00
Mark Johnston
42c2cd1ffb amd64: Annotate an unlikely condition in smp_targeted_tlb_shootdown()
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-15 13:01:30 -05:00
Mark Johnston
d28af1abf0 vm: Add a mode to vm_object_page_remove() which skips invalid pages
This will be used to break a deadlock in ZFS between the per-mountpoint
teardown lock and page busy locks.  In particular, when purging data
from the page cache during dataset rollback, we want to avoid blocking
on the busy state of invalid pages since the busying thread may be
blocked on the teardown lock in zfs_getpages().

Add a helper, vn_pages_remove_valid(), for use by filesystems.  Bump
__FreeBSD_version so that the OpenZFS port can make use of the new
helper.

PR:		258208
Reviewed by:	avg, kib, sef
Tested by:	pho (part of a larger patch)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32931
2021-11-15 13:01:30 -05:00
Mark Johnston
a2665158d0 vm_page: Remove vm_page_sbusy() and vm_page_xbusy()
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32948
2021-11-15 13:01:30 -05:00
Mark Johnston
87b646630c vm_page: Consolidate page busy sleep mechanisms
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
  a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
  up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
  object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
  instead.
- Remove vm_page_sleep_if_(x)busy().

No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32947
2021-11-15 13:01:30 -05:00
Mark Johnston
b0acc3f11b vm_pager: Optimize an assertion
Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32946
2021-11-15 13:01:30 -05:00
Mark Johnston
1fb99e97e9 bus: Make BUS_TRANSLATE_RESOURCE behave more like other bus methods
- Return an errno value upon failure, instead of 1.
- Provide a bus_translate_resource() wrapper.
- Implement the generic version, which traverses the hierarchy until a
  bus driver with a non-trivial implementation is found, in subr_bus.c
  like other similar default implementations.
- Make ofw_pcib_translate_resource() return an error if a matching PCI
  address range is not found.
- Make generic_pcie_translate_resource_common() return an int instead of
  a bool.  Fix up callers.

No functional change intended.

Reviewed by:	imp, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32855
2021-11-15 13:01:30 -05:00
Kirk McKusick
728f2c6131 Suppress UFS/FFS superblock check-hash failure messages when identifying
disk labels.

When the geom label subsystem is checking labels to discover if they
are UFS/FFS filesystems, do not print a kernel error message if a
superblock is found with a check-hash error. That issue is best
handled later if an attempt is made to actually use the filesystem.

Sponsored by: Netflix
2021-11-15 09:26:21 -08:00
Kirk McKusick
f2b391528a Add ability to suppress UFS/FFS superblock check-hash failure messages.
When reading UFS/FFS superblocks that have check hashes, both the kernel
and libufs print an error message if the check hash is incorrect. This
commit adds the ability to request that the error message not be made.
It is intended for use by programs like fsck that wants to print its
own error message and by kernel subsystems like glabel that just wants
to check for possible filesystem types.

This capability will be used in followup commits.

Sponsored by: Netflix
2021-11-15 09:11:54 -08:00
Baptiste Daroussin
6d38604fc5 mandoc: import version 1.14.6
MFC after: 3 weeks
2021-11-15 16:58:58 +01:00
Baptiste Daroussin
e9bf778aef mandoc: import v1.14.6 2021-11-15 16:35:39 +01:00