Don't exit early in find_rootfs() when zpool.bootfs
is set to `zfs:AUTO`.
Reviewed-by: Richard Laager <rlaager@wiktel.com>
Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes#12658
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Closes#12748
Special allocation class or dedup vdevs may have roughly the same
performance as L2ARC vdevs. Introduce a new tunable to exclude those
buffers from being cacheable on L2ARC.
Reviewed-by: Don Brady <don.brady@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes#11761Closes#12285
The l2cache device could be added twice because vdev_inuse() does not
check spa_l2cache for added devices. Make l2cache vdevs inuse checking
logic more closer to spare vdevs.
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes#9153Closes#12689
In case if all label checksums will be invalid on any vdev, the pool
will become unimportable. The zhack with newly added cli options could
be used to restore label checksums and make pool importable again.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes#2510Closes#12686
When ZFS is on root, /tmp is a ZFS. This causes zfs_list_004_neg to
fail since `zfs list` on /tmp passes when the test expects it not to.
The fix is to exclude paths that belong to ZFS.
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Palash Gandhi <pbg4930@rit.edu>
Closes#12744
In addition to flushing memory mapped regions when checking holes,
commit de198f2d95 modified the dirty dnode detection logic to check
the dn->dn_dirty_records instead of the dn->dn_dirty_link. Relying
on the dirty record has not be reliable, switch back to the previous
method.
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #11900Closes#12745
This test case may fail on 5.13 and newer Linux kernels if the
/dev/zvol/ device is not created by udev.
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #12301
Closes #12738
In case if all label checksums will be invalid on any vdev, the pool
will become unimportable. From other side zdb with -l option will not
provide any useful information why it happened. Add notifications
about corrupted label checksums.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: John Kennedy <john.kennedy@delphix.com>
Signed-off-by: Fedor Uporov <fuporov.vstack@gmail.com>
Closes#2509Closes#12685
Summary: The manual pages need a bit of editing for language and clarity.
Reviewers: oshogbo, #manpages
Subscribers: imp
Differential Revision: https://reviews.freebsd.org/D32976
This makes GOP not probed on some situation (AMD Card on PCIe slot
with EDK2 as we have a SERIAL_IO_PROTOCOL compatible uart).
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D32992
Sponsored by: Beckhoff Automation GmbH & Co. KG
Printing the EFI_HANDLE pointer isn't very useful.
If the handle have a IMAGE_DEVICE_PATH or a DEVICE_PATH protocol print it.
This makes it easier to see which devices are present and what protocol they
expose.
Reviewed by: imp, tsoome
Differential Revision: https://reviews.freebsd.org/D32991
Sponsored by: Beckhoff Automation GmbH & Co. KG
Minor reformatting nits to make mprsas_scsiio_timeout match
mpssas_scsiio_timeout more closely. The differences aren't necessary and
are distracting when comparing the routines. No functional changes.
Sponsored by: Netflix
Normally a UFS/FFS filesystem with a bad check hash can only be
mounted read only. With this commit the mount(8) -f (force) option
can be used to force a read-write mount of a UFS/FFS filesystem with
a bad check hash. Conveniently the filesystem will proceed to
update its on-disk superblock with a corrected check hash.
Sponsored by: Netflix
When doing an initial mount(8) with its -f (force) flag, the MNT_FORCE
flag is not passed through to the underlying filesystem mount routine.
MNT_FORCE is only passed through on later updates to an existing
mount. With this commit the MNT_FORCE flag is now passed through on the
initial mount.
Sanity check: kib
Sponsored by: Netflix
There is no universal way to find the TSC frequency. Newer Intel CPUs
may report it via CPUID leaves 0x15 and 0x16. Sometimes it can be
obtained from the PLATFORM_INFO MSR as well, though we never use that.
On older platforms we derive the frequency using a DELAY(1000000) call,
which uses the 8254 PIT. On some newer platforms the 8254 is apparently
non-functional, leading to bogus calibration results. On such platforms
the TSC frequency must be available from CPUID. It is also possible to
disable calibration with a tunable, in which case we try to parse the
brand string if the TSC freq is not available from CPUID.
CPUID 0x15 provides an authoritative TSC frequency value, but even that
is not always available on new Intel platforms. CPUID 0x16 provides the
specified processor base frequency, which is not the same as the TSC
frequency. Empirically, it is close enough for early boot, but too far
off for timekeeping: on a Comet Lake NUC, CPUID 0x16 yields 1600MHz but
the TSC frequency is rougly 1608MHz, leading to frequent clock stepping
when NTP is in use.
Thus we have a situation where we cannot calibrate using the PIT and
cannot obtain a precise frequency from CPUID (or MSRs). This change
seeks to address that by using the CPUID 0x16 value during early boot
and refining the calibration later once ACPI-based timecounters are
available. TSC frequency detection is thus split into two phases:
Early phase:
- On Intel platforms, query CPUID 0x15 and 0x16 and use that value
initially if available.
- Otherwise, get an estimate using the PIT, reducing the delay loop to
100ms from 1s.
- Continue to register the TSC as the CPU ticks provider early, even
though the frequency may be off. Otherwise any code executed during
boot that uses cpu_ticks() (e.g., context switching) gets tripped up
when the ticks provider changes.
Later phase:
- In SI_SUB_CLOCKS, once the timehands are initialized, load the current
TSC and timecounter (sbinuptime()) values at the beginning and end of
a 1s interval and use the timecounter frequency (typically from
kvmclock, HPET or the ACPI PM timer) to estimate the TSC frequency.
- Update the TSC timecounter, global tsc_freq and CPU ticker with the
new frequency and finally register the TSC as a timecounter.
Reviewed by: kib, jhb (previous version)
Discussed with: imp, cperciva
MFC after: 6 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32512
This is how most SYSINITs are defined. Also annotate the dummy
parameter with __unused. No functional change intended.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The -a option also requires passing specific environment variables to
instance of rtld doing tracing.
PR: 259069
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
For some cloud/virtualization use cases it can be convenient to grow the
filesystem on boot any time the disk/partition happens to be larger, but
not fail if it remains the same size.
Continue to emit a message if we have no action to take, but exit with
status 0 if the size remains the same.
Reviewed by: trasz
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32856
Modify the GPIO pins only on the Base-T cards and even there drive all
of them low instead of putting them in hi-z state. For the rest (this
is the common case), directly power off the PLLs of the high speed
serdes. This is the simplest method that does not involve or conflict
with the firmware but still works with all T4-T6 cards regardless of
what's plugged into the port.
This fixes a problem where the peer wouldn't always see a link down if
it is connected to the device using a -CR4 copper cable.
MFC after: 3 weeks
Sponsored by: Chelsio Communications
Similar to the simple transmit tests added in
a10482ea7476d68d1ab028145ae6d97cef747b49, these tests test the kernel
TLS functionality directly by manually encrypting TLS records using
randomly generated keys and writing them to a socket to be processed
by the kernel.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32980
For each AES-CBC MTE cipher suite, test sending records with 1 to 16
bytes of payload. This ensures that all of the potential padding
values are covered.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32840
4.3 BSD's mmap took an int len and long pos. Reject negative lengths
and in freebsd32 sign-extend pos correctly rather than mis-handling
negative positions as large positive ones.
Reviewed by: kib
Rename to match the naming of syscalls and allow 32 to be appended
without making an ugly name like kevent_freebsd1132.
While here, make the kevent changelist argument const.
Reviewed by: kib
We use pmap_invalidate_cpu_mask() to get the set of active CPUs. This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.
Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers. Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().
Reviewed by: alc, kib
Tested by: pho
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32792
This is in preference to simply filling the cpuset, and allows the
conditional in pmap_invalidate_cpu_mask() to be elided.
Also export pmap_invalidate_cpu_mask() outside of pmap.c for use in a
subsequent commit.
Suggested by: kib
Reviewed by: alc, kib
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32792
This will be used to break a deadlock in ZFS between the per-mountpoint
teardown lock and page busy locks. In particular, when purging data
from the page cache during dataset rollback, we want to avoid blocking
on the busy state of invalid pages since the busying thread may be
blocked on the teardown lock in zfs_getpages().
Add a helper, vn_pages_remove_valid(), for use by filesystems. Bump
__FreeBSD_version so that the OpenZFS port can make use of the new
helper.
PR: 258208
Reviewed by: avg, kib, sef
Tested by: pho (part of a larger patch)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32931
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32948
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
instead.
- Remove vm_page_sleep_if_(x)busy().
No functional change intended.
Obtained from: jeff (object_concurrency patches)
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32947
- Return an errno value upon failure, instead of 1.
- Provide a bus_translate_resource() wrapper.
- Implement the generic version, which traverses the hierarchy until a
bus driver with a non-trivial implementation is found, in subr_bus.c
like other similar default implementations.
- Make ofw_pcib_translate_resource() return an error if a matching PCI
address range is not found.
- Make generic_pcie_translate_resource_common() return an int instead of
a bool. Fix up callers.
No functional change intended.
Reviewed by: imp, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32855
disk labels.
When the geom label subsystem is checking labels to discover if they
are UFS/FFS filesystems, do not print a kernel error message if a
superblock is found with a check-hash error. That issue is best
handled later if an attempt is made to actually use the filesystem.
Sponsored by: Netflix
When reading UFS/FFS superblocks that have check hashes, both the kernel
and libufs print an error message if the check hash is incorrect. This
commit adds the ability to request that the error message not be made.
It is intended for use by programs like fsck that wants to print its
own error message and by kernel subsystems like glabel that just wants
to check for possible filesystem types.
This capability will be used in followup commits.
Sponsored by: Netflix