When __thr_pshared_offpage() is called for allocation, it must not use
the cached offpage for the key. Instead, the cached offpage must be
unmapped and removed from the cache, if any.
It is legitimate for the user code to unmap the shared lock object without
destroying it, and then mapping something over the freed VA to carry
another shared lock. In this case the cached offpage must be un-cached.
PR: 269277
Reported by: rau8344@gmail.com
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38345
Remove the hard-coded dependency on HYPERV being only x86. Instead, 100%
rely on MK_HYPERV. It's always right (since it's marked BROKEN (so set
to "no") on architectures we don't support).
Sponsored by: Netflix
Reviewed by: bz
Differential Revision: https://reviews.freebsd.org/D38306
Under Linux to sched_[g|s]etaffinity() functions the value returned from a call
to gettid(2) (thread id) can be passed in the argument pid. Specifying pid as 0
will set the attribute for the calling thread, and passing the value returned
from a call to getpid(2) (process id) will set the attribute for the main thread
of the thread group.
Native cpuset(2) family of system calls has "which" argument to determine how
the value of id argument is interpreted, i.e., CPU_WHICH_TID is used to pass
a thread id and CPU_WHICH_PID - to pass a process id.
For now native sched_[g|s]etaffinity() implementation is wrong as uses "which"
CPU_WHICH_PID to pass both (process and thread id) to the kernel. To fix this
adding a new "which" CPU_WHICH_TIDPID intended to handle both id's.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D38209
MFC after: 1 week
Since f35093f8 semantics of a thread affinity functions is changed to be a
compatible with Linux:
In case of getaffinity(), the minimum cpuset_t size that the kernel permits is
the maximum CPU id, present in the system, / NBBY bytes, the maximum size is not
limited.
In case of setaffinity(), the kernel does not limit the size of the user-provided
cpuset_t, internally using only the meaningful part of the set, where the upper
bound is the maximum CPU id, present in the system, no larger than the size of
the kernel cpuset_t.
To match pthread_attr_[g|s]etaffinity_np checks of the user-provided cpusets to
the kernel behavior export the minimum cpuset_t size allowed by running kernel
via new sysctl kern.sched.cpusetsizemin and use it in checks.
Reviewed by:
Differential Revision: https://reviews.freebsd.org/D38112
MFC after: 1 week
The PCBGROUP option and KPI were removed entirely in 93c67567e0.
Reviewed by: pauamma (manpages), glebius, melifaro
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38187
Several important base system components are written in C++, and the
WITHOUT_CXX option produced a system that was not fully functional.
Just accept this, and remove the option to build without C++ support.
This reverts commit adc3c128c6.
Reviewed by: brooks, kevans, jhb (earlier)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D33108
Upstream luaconf.h is contrib/lua/src/luaconf.h.dist, while userland lua
and loader lua have copies in lib/liblua/luaconf.h and
stand/liblua/luaconf.h.
Adjust whitespace, VCS tags, etc. to match upstream's version, for ease
of comparison.
Reviewed By: imp
Sponsored By: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38206
Notable upstream pull request merges:
#13805 Configure zed's diagnosis engine with vdev properties
#14110 zfs list: Allow more fields in ZFS_ITER_SIMPLE mode
#14121 Batch enqueue/dequeue for bqueue
#14123 arc_read()/arc_access() refactoring and cleanup
#14159 Bypass metaslab throttle for removal allocations
#14243 Implement uncached prefetch
#14251 Cache dbuf_hash() calculation
#14253 Allow reciever to override encryption property in case of replication
#14254 Restrict visibility of per-dataset kstats inside FreeBSD jails
#14255 Zero end of embedded block buffer in dump_write_embedded()
#14263 Cleanups identified by CodeQL and Coverity
#14264 Miscellaneous fixes
#14272 Change ZEVENT_POOL_GUID to ZEVENT_POOL to display pool names
#14287 FreeBSD: Remove stray debug printf
#14288 Colorize zfs diff output
#14289 deadlock between spa_errlog_lock and dp_config_rwlock
#14291 FreeBSD: Fix potential boot panic with bad label
#14292 Add tunable to allow changing micro ZAP's max size
#14293 Turn default_bs and default_ibs into ZFS_MODULE_PARAMs
#14295 zed: add hotplug support for spare vdevs
#14304 Activate filesystem features only in syncing context
#14311 zpool: do guid-based comparison in is_vdev_cb()
#14317 Pack zrlock_t by 8 bytes
#14320 Update arc_summary and arcstat outputs
#14328 FreeBSD: catch up to 1400077
#14376 Use setproctitle to report progress of zfs send
#14340 Remove some dead ARC code
#14358 Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole
#14360 libzpool: fix ddi_strtoull to update nptr
#14364 Fix unprotected zfs_znode_dmu_fini
#14379 zfs_receive_one: Check for the more likely error first
#14380 Cleanup of dead code suggested by Clang Static Analyzer
#14397 Avoid passing an uninitialized index to dsl_prop_known_index
#14404 Fix reading uninitialized variable in receive_read
#14407 free_blocks(): Fix reports from 2016 PVS Studio FreeBSD report
#14418 Introduce minimal ZIL block commit delay
#14422 x86 assembly: fix .size placement and replace .align with .balign
Obtained from: OpenZFS
OpenZFS commit: 9cd71c8604
The .align directive used to align storage locations is
ambiguous. On some platforms and assemblers it takes a byte count,
on others the argument is interpreted as a shift value. The current
usage expects the first interpretation.
Replace it with the unambiguous .balign directive which always
expects a byte count, regardless of platform and assembler.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes#14422
In the zstream code, Coverity reported:
"The argument could be controlled by an attacker, who could invoke the
function with arbitrary values (for example, a very high or negative
buffer size)."
It did not report this in the kernel. This is likely because the
userspace code stored this in an int before passing it into the
allocator, while the kernel code stored it in a uint32_t.
However, this did reveal a potentially real problem. On 32-bit systems
and systems with only 4GB of physical memory or less in general, it is
possible to pass a large enough value that the system will hang. Even
worse, on Linux systems, the kernel memory allocator is not able to
support allocations up to the maximum 4GB allocation size that this
allows.
This had already been limited in userspace to 64MB by
`ZFS_SENDRECV_MAX_NVLIST`, but we need a hard limit in the kernel to
protect systems. After some discussion, we settle on 256MB as a hard
upper limit. Attempting to receive a stream that requires more memory
than that will result in E2BIG being returned to user space.
Reported-by: Coverity (CID-1529836)
Reported-by: Coverity (CID-1529837)
Reported-by: Coverity (CID-1529838)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14285
Introduce four new vdev properties:
checksum_n
checksum_t
io_n
io_t
These properties can be used for configuring the thresholds of zed's
diagnosis engine and are interpeted as <N> events in T <seconds>.
When this property is set to a non-default value on a top-level vdev,
those thresholds will also apply to its leaf vdevs. This behavior can be
overridden by explicitly setting the property on the leaf vdev.
Note that, these properties do not persist across vdev replacement. For
this reason, it is advisable to set the property on the top-level vdev
instead of the leaf vdev.
The default values for zed's diagnosis engine (10 events, 600 seconds)
remains unchanged.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Sponsored-by: Seagate Technology LLC
Closes#13805
If zfs_receive_one() gets back EINVAL, check for the more likely case,
embedded block pointers + encryption and return that error, before
falling back to the less likely case, a resumable stream when the
kernel has not been upgraded to support resume.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Sponsored-by: rsync.net
Sponsored-by: Klara Inc.
Closes#14379
Add new macro ASMABI used by Windows to change
calling API to "sysv_abi".
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Jorgen Lundman <lundman@lundman.net>
Closes#14228
This allows parsing of zfs send progress by checking the process
title.
Doing so requires some changes to the send code in libzfs_sendrecv.c;
primarily these changes move some of the accounting around, to allow
for the code to be verbose as normal, or set the process title. Unlike
BSD, setproctitle() isn't standard in Linux; thus, borrowed it from
libbsd with slight modifications.
Authored-by: Sean Eric Fagan <sef@FreeBSD.org>
Co-authored-by: Ryan Moeller <ryan@iXsystems.com>
Co-authored-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#14376
There's no reason to pass the mixer as an argument since we can fetch it
from mix_dev. No functional change intended.
Reviewed by: markj, hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38076
This allows us to use the TAILQ_PREV and TAILQ_FOREACH_REVERSE_* macros,
useful for an out-of-tree consumer.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38055
Cast char's through unsigned char before storing as an integer in
xdr_char(), this ensures that the encoded form is consistently not
sign-extended following Open Solaris's example.
Prior to this change, platforms with signed chars would sign extend
values with the high bit set but ones with unsigned chars would not
so 0xff would be stored as 0x000000ff on unsigned char platforms and
0xffffffff on signed char platforms. Decoding has the same
result for either form so this is a largely cosmetic change, but it
seems best to produce consistent output.
For more discussion, see https://github.com/openzfs/zfs/issues/14173
Reviewed by: mav, imp
Differential Revision: https://reviews.freebsd.org/D37992
* Replay 2010[acflm] which had been merged but not recorded.
* Merge 2010n.
* Reorganize (unsplit) the code to match the upstream layout.
* Merge 2022[cdefg].
MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Charles Suh <charles.suh@gmail.com>
Closes#14360
This commit supports for spare vdev hotplug. The
spare vdev associated with all the pools will be
marked as "Removed" when the drive is physically
detached and will become "Available" when the
drive is reattached. Currently, the spare vdev
status does not change on the drive removal and
the same is the case with reattachment.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#14295
These architectures fail to handle this special case, and will cause the
corresponding setjmp/_setjmp to return 0 rather than 1. Fix this and add
regression tests (also committed upstream).
PR: 268684
Reviewed by: arichardson, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D29363
Do not cache PID for a process that does not fabricate it,
calls openlog() before forking and does not call exec() thereafter.
PR: 268666
Fixes: e9ae9fa937
Tested by: kp
MFC after: 3 days
There is a lock order inversion deadlock between `spa_errlog_lock` and
`dp_config_rwlock`:
A thread in `spa_delete_dataset_errlog()` is running from a sync task.
It is holding the `dp_config_rwlock` for writer (see
`dsl_sync_task_sync()`), and waiting for the `spa_errlog_lock`.
A thread in `dsl_pool_config_enter()` is holding the `spa_errlog_lock`
(see `spa_get_errlog_size()`) and waiting for the `dp_config_rwlock` (as
reader).
Note that this was introduced by #12812.
This commit address this by defining the lock ordering to be
dp_config_rwlock first, then spa_errlog_lock / spa_errlist_lock.
spa_get_errlog() and spa_get_errlog_size() can acquire the locks in this
order, and then process_error_block() and get_head_and_birth_txg() can
verify that the dp_config_rwlock is already held.
Additionally, a buffer overrun in `spa_get_errlog()` is corrected. Many
code paths didn't check if `*count` got to zero, instead continuing to
overwrite past the beginning of the userspace buffer at `uaddr`.
Tested by having some errors in the pool (via `zinject -t data
/path/to/file`), one thread running `zpool iostat 0.001`, and another
thread runs `zfs destroy` (in a loop, although it hits the first time).
This reproduces the problem easily without the fix, and works with the
fix.
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: George Amanakis <gamanakis@gmail.com>
Reviewed-by: George Wilson <gwilson@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes#14239Closes#14289
- provide various missing MLINKS for library functions
- update various SEE ALSO section to include the
new linked manual pages
- add various definitions of new functions like isideogram_l(3)
- document COMPATIBILITY for some functions
- bump man page dates
Reviewed by: gbe, bcr
MFC after: 1 week
Pull Request: https://github.com/freebsd/freebsd-src/pull/621
Differential Revision: https://reviews.freebsd.org/D37203
This adds support to color zfs diff (in the style of git diff)
conditional on the ZFS_COLOR environment variable.
Signed-off-by: Ethan Coe-Renner <coerenner1@llnl.gov>
Currently, the receiver fails to override the encryption
property for the plain replicated dataset with the error:
"cannot receive incremental stream: encryption property
'encryption' cannot be set for incremental streams.". The
problem is resolved by allowing the receiver to override
the encryption property for plain replicated send.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes#14253Closes#13533
If the fields to be listed and sorted by are constrained to those
populated by dsl_dataset_fast_stat(), then zfs list is much faster,
as it does not need to open each objset and reads its properties.
A previous optimization by Pawel Dawidek
(0cee24064a) took advantage
of this to make listing snapshot names sorted only by name much faster.
However, it was limited to `-o name -s name`, this work extends this
optimization to work with:
- name
- guid
- createtxg
- numclones
- inconsistent
- redacted
- origin
and could be further extended to any other properties supported by
dsl_dataset_fast_stat() or similar, that do not require extra locking
or reading from disk.
This was committed before (9a9e2e343dfa2af28bf7910de77ae73aa006de62),
but was reverted due to a regression when used with an older kernel.
If the kernel does not populate zc->zc_objset_stats, we now fallback
to getting the properties via the slower interface, to avoid problems
with newer userland and older kernels.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Closes#14110
Libarchive 3.6.2
Important bug fixes:
rar5 reader: fix possible garbled output with bsdtar -O (#1745)
mtree reader: support reading mtree files with tabs (#1783)
various small fixes for issues found by CodeQL
MFC after: 2 weeks
PR: 286306 (exp-run)
Reverting because of issue in Makefile.inc1 during native builds:
make[1]: “.../freebsd/Makefile.inc1" line 163: Unknown target aarch64:aarch64.
Since I only tested this patch with make universe on amd64, this issue wasn't caught.
This reverts commit 83bf6ab568.
On powerpc64, powerpc64le and riscv64 some software wrongly assumes that
it runs on powerpc or riscv (32-bit).
Differential revision: https://reviews.freebsd.org/D35962
Approved by: alfredo, imp
`zfs_send_cb_impl()` calls `dump_filesystems()`, which calls
`dump_filesystem()`, which will return `-1` as an error when
`zfs_open()` returns `NULL`.
This will be passed to `zfs_standard_error()`, which passes it to
`zfs_standard_error_fmt()`, which passes it to `strerror()`.
To fix this, we modify zfs_open() to set `errno` whenever it returns
NULL. Most of the cases already have `errno` set (since they pass it to
`zfs_standard_error_fmt()`, which makes this easy. Then we modify
`dump_filesystem()` to pass `errno` instead of `-1`.
Reported-by: Coverity (CID-1524598)
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14264
* Correct some function prototypes which were documented with the wrong
pointer type.
* Clarify return values and requirements for freeing the limit handle.
MFC after: 1 week
Sponsored by: Axcient
Reviewed by: oshogbo
Differential Revision: https://reviews.freebsd.org/D37586
GCC 12 defaults to C++17 which removes (not just deprecates)
std::auto_ptr<>. Trying to use CXXSTD of c++03 doesn't work with
libc++ headers, but c++11 does.
Reviewed by: brooks, imp, emaste
Differential Revision: https://reviews.freebsd.org/D37531
The comment indicated -Wno-deprecated-declarations was used to avoid
warnings about deprecated auto_ptr and various deprecated function
objects from <functional>. libdevdctl (now) does not use auto_ptr,
so don't mention it in the comment.
Sponsored by: The FreeBSD Foundation
libpmc used -Wno-deprecated-declarations to silence warnings about usage
of deprecated std::auto_ptr, but there is (now) now use of auto_ptr in
libpmc.
Reviewed by: mhorne
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37576
lib/googletest used -Wno-deprecated-declarations to silence warnings
about usage of deprecated std::auto_ptr, but auto_ptr is not (now) used
anywhere in googletest.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D37561
Squelch false positives reported by GCC 12 with UBSan.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes#14150
When local properties (e.g., from -o and -x) are provided, don't leak
the packed representation of the received properties due to variable
reuse.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14197
According to the UNIX standard, <pthread.h> does not include some
PTHREAD_* values which are included in <limits.h>. OpenZFS uses
some of these values in its code, and this might cause build failure on
systems that do not have these PTHREAD_* values in <pthread.h>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Damian Szuberski <szuberskidamian@gmail.com>
Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
Closes#14225
Previously when using NO_ROOT we recorded a METALOG entry for the
pam.d/ftp hard link with a different file mode than the link target
pam.d/ftpd, which is not permitted.
This change is similar to 1dbb9994d4 for .profile
Sponsored by: The FreeBSD Foundation
It should pass `MNT_LINE_MAX`, but passes `sizeof (mntpt)`. This is
harmless because the strlen is not actually used by the helper, but
FreeBSD's Coverity scans complained about it.
This was missed in my audit of various string functions since it is not
actually passed to a string function.
Upon review, it was noticed that the helper function does not need to be
a separate function, so I have inlined it as cleanup.
Reported-by: Coverity (CID 1432079)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: szubersk <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14136
FreeBSD's Coverity scans complain that we ignore the return value. There
is no need to check the return value so we cast it to (void) to suppress
further complaints by static analyzers.
Reported-by: Coverity (CID 1018175)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: szubersk <szuberskidamian@gmail.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14136
Suppress a false positive found by new Cppcheck version.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: George Melikov <mail@gmelikov.ru>
Signed-off-by: szubersk <szuberskidamian@gmail.com>
Closes#14148
This passes struct vcpu down in place of struct vm and and integer
vcpu index through the in-kernel instruction emulation code. To
minimize userland disruption, helper macros are used for the vCPU
arguments passed into and through the shared instruction emulation
code.
A few other APIs used by the instruction emulation code have also been
updated to accept struct vcpu in the kernel including
vm_get/set_register and vm_inject_fault.
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D37161
The arguments identifying the VM and vCPU are only needed for
vm_copy_setup.
Reviewed by: corvink, markj
Differential Revision: https://reviews.freebsd.org/D37158
GCC 12 warns that passing "" (a constant of char[1]) to a parameter of
type char[33] could potentially overread. It is not clear from the
context that c->qops can never be "auth-int" (and if it can't, then
the "auth-int" handling in DigestCalcResponse is dead code that should
be removed since this is the only place the function is called).
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D36825
Without this patch, the code in the rpcbind client forces
the use of UDP. A comment notes that some rpcbind servers
only support UDP. This makes NFSv3 mounts to Azure servers
impossible, since they require use of TCP for rpcbind.
Since the comment is very old (imported from NetBSD in 2001)
and I do not believe any UDP only rpcbind servers will
still exist, this patch comments out the code that forces
use of UDP, so that NFSv3 mounts to Azure servers can work.
For an NFSv3 mount, the "udp" mount option will still
make mount_nfs use UDP for rpcbind so that can be used
as a workaround for any old NFSv3 server that only
supports rpcbind over UDP (if any such server still exists).
I asked if doing this change is appropriate on freebsd-fs@
and I only got one reply (off list) that supported doing
the change.
PR: 267301
MFC after: 1 month
The reported bug is UFS: bad file descriptor: soft update journaling
can not be enabled on some FreeBSD-provided disk images – failed
to write updated cg.
The UFS library (libufs(3)) failed to reopen its disk descriptor
when first attempting to update a cylinder group. The error only
occurred when trying to add journaling to a filesystem whose first
cylinder group was too full to hold the journal.
PR: 259090
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Linux defaults to setting "failfast" on BIOs, so that the OS will not
retry IOs that fail, and instead report the error to ZFS.
In some cases, such as errors reported by the HBA driver, not
the device itself, we would wish to retry rather than generating
vdev errors in ZFS. This new property allows that.
This introduces a per vdev option to disable the failfast option.
This also introduces a global module parameter to define the failfast
mask value.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Co-authored-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
Sponsored-by: Seagate Technology LLC
Submitted-by: Klara, Inc.
Closes#14056
Most of this file was a pile of defines, apparently from Solaris that
controlled nothing in the source tree. A few things controlled the
definition of unused types or macros which I have removed.
Considerable further cleanup is possible including removal of
architectures FreeBSD never supported. This file should likely converge
with the Linux version to the extent possible.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14127
Require that ZFS_LEGACY_SUPPORT be defined for legacy ioctl support to
be built. For now, define it in zfs_ioctl_compat.h so support is always
built. This will allow systems that need never support pre-openzfs
tools a mechanism to remove support at build time. This code should
be removed once the need for tool compatability is gone.
No functional change at this time.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14127
This fixes -Wsingle-bit-bitfield-constant-conversion warning from
clang-16 like:
lib/libzfs/libzfs_dataset.c:4529:19: error: implicit truncation
from 'int' to a one-bit wide bit-field changes value from
1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
flags.nounmount = B_TRUE;
^ ~~~~~~
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14125
uu_avl and uu_list stored internal next/prev pointers and parent
pointers (unused) obfuscated (byte swapped) to hide them from a long
forgotten leak checker (No one at the 2022 OpenZFS developers meeting
could recall the history.) This would break on CHERI systems and adds
no obvious value. Rename the members, use proper types rather than
uintptr_t, and eliminate the related macros.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14126
Avoid assuming than a uint64_t can hold a pointer.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14131
Avoid assuming than a uint64_t can hold a pointer and reduce the
number of casts in the process.
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14131
Check __riscv_xlen == 64 rather than _LP64 and define _LP64 if missing.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes#14128
Refer to sockets rather than processes, since one can have multiple
sockets in a load-balancing group within the same process.
MFC after: 1 week
Sponsored by: Modirum MDPay
Sponsored by: Klara, Inc.
With the change to return EAI_ADDRFAMILY from getaddrinfo(), fetch
would print "Unknown resolver error" for that error. Add that error
and its string to libfetch's table, using an #ifdef just in case.
Correct error strings for EAI_NODATA (although it is currently unused)
and EAI_NONAME. Should maybe rework the code to use gai_strerror(3),
but that doesn't map directly, and the current strings are shortened.
Reviewed in https://reviews.freebsd.org/D37139 with related changes.
Reviewed by: bz
MFC after: 1 month
Rework getaddrinfo(3) to return different error values for unresolvable
names (same as before, EAI_NONAME) and those without a requested addr
(EAI_ADDRFAMILY) when using DNS. This is implemented via an added
error in the nsswitch layer, NS_ADDRFAMILY, which is used only by
getaddrinfo(). The error is passed through nsdispatch(3), but that
routine has no changes to handle this error. The error originates in
the getaddrinfo DNS layer called via nsdispatch(), and is processed
by the search layer that calls nsdispatch().
While here, add a little style to returns near those that were
modified.
Reviewed in https://reviews.freebsd.org/D37139 with related changes.
Reviewed by: bz
MFC after: 1 month
gai_strerror.c still has messages for EAI_ADDRFAMILY and EAI_NODATA,
but not the man page. Re-add to the man page, and update comments
in the source. Document the errors that are not in RFC 3493 or
POSIX.
Reviewed in https://reviews.freebsd.org/D37139 with related changes.
Reviewed by: bz, pauamma
MFC after: 1 month
Allow pf (l2) to be used to redirect ethernet packets to a different
interface.
The intended use case is to send 802.1x challenges out to a side
interface, to enable AT&T links to function with pfSense as a gateway,
rather than the AT&T provided hardware.
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D37193
kmem_scnprintf() is only available in libzpool. Recent buildbot issues
with showing FreeBSD results kept us from seeing this before
97143b9d31 was merged.
The code has been changed to sanitize the output from `kmem_scnprintf()`.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14111
When syncookies are in adaptive mode they may be active or inactive.
Expose this status to users.
Suggested by: Guido van Rooij
Sponsored by: Rubicon Communications, LLC ("Netgate")
4170ae4ea6 was intended to tackle TOCTOU
race conditions reported by CodeQL, but as an oversight, a file
descriptor was not closed and some comments were not updated.
Interestingly, CodeQL did not complain about the file descriptor leak,
so there is room for improvement in how we configure it to try to detect
this issue so that we get early warning about this.
In addition, an optimization opportunity was missed by mistake in
lib/libshare/os/linux/smb.c, which prevented us from truly closing the
TOCTOU race. This was also caught by Coverity.
Reported-by: Coverity (CID 1524424)
Reported-by: Coverity (CID 1526804)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14109
phantom@'s HDD crashed with the final version of strfmon.c, as explained
in 9d430a5991.
Now there are tests in place that cover these code paths.
Reviewed by: kib
PR: 267410
Github PR: #620
MFC after: 1 week
strfmon_l does not take fully into consideration the explicitly passed
locale to perform the formatting.
Parallel universe bug report: https://sourceware.org/bugzilla/show_bug.cgi?id=19633
Obtained from: Darwin
Reviewed by: kib
PR: 267410
Github PR: #620
MFC after: 1 week
Attempt to test the correctness of strfmon_l(3).
Items marked with XXX represent an invalid output.
Obtained from: e7eba0044f
Reviewed by: kib
PR: 267410
Github PR: #620
MFC after: 1 week
Otherwise strfmon(3) could overflow the buffer.
Here is mostly done for correctness and illustrative purposes, as there
is no chance it could actually happen.
Reviewed by: kib
PR: 267410
Github PR: #620
MFC after: 1 week
`snprintf()` is meant to protect against buffer overflows, but operating
on the buffer using its return value, possibly by calling it again, can
cause a buffer overflow, because it will return how many characters it
would have written if it had enough space even when it did not. In a
number of places, we repeatedly call snprintf() by successively
incrementing a buffer offset and decrementing a buffer length, by its
return value. This is a potentially unsafe usage of `snprintf()`
whenever the buffer length is reached. CodeQL complained about this.
To fix this, we introduce `kmem_scnprintf()`, which will return 0 when
the buffer is zero or the number of written characters, minus 1 to
exclude the NULL character, when the buffer was too small. In all other
cases, it behaves like snprintf(). The name is inspired by the Linux and
XNU kernels' `scnprintf()`. The implementation was written before I
thought to look at `scnprintf()` and had a good name for it, but it
turned out to have identical semantics to the Linux kernel version.
That lead to the name, `kmem_scnprintf()`.
CodeQL only catches this issue in loops, so repeated use of snprintf()
outside of a loop was not caught. As a result, a thorough audit of the
codebase was done to examine all instances of `snprintf()` usage for
potential problems and a few were caught. Fixes for them are included in
this patch.
Unfortunately, ZED is one of the places where `snprintf()` is
potentially used incorrectly. Since using `kmem_scnprintf()` in it would
require changing how it is linked, we modify its usage to make it safe,
no matter what buffer length is used. In addition, there was a bug in
the use of the return value where the NULL format character was not
being written by pwrite(). That has been fixed.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14098
CodeQL and Coverity both complained about:
* lib/libshare/os/linux/smb.c
* tests/zfs-tests/cmd/mmapwrite.c
* twice
* tests/zfs-tests/tests/functional/tmpfile/tmpfile_002_pos.c
* tests/zfs-tests/tests/functional/tmpfile/tmpfile_stat_mode.c
* coverity had a second complaint that CodeQL did not have
* tests/zfs-tests/cmd/suid_write_to_file.c
* Coverity had two complaints and CodeQL had one complaint, both
differed. The CodeQL complaint is about the main point of the
test, so it is not fixable without a hack involving `fork()`.
The issues reported by CodeQL are fixed, with the exception of the last
one, which is deemed to be a false positive that is too much trouble to
wrokaround. The issues reported by Coverity were only fixed if CodeQL
complained about them.
There were issues reported by Coverity in a number of other files that
were not reported by CodeQL, but fixing the CodeQL complaints is
considered a priority since we want to integrate it into a github
workflow, so the remaining Coverity complaints are left for future work.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14098
This fixes the instances of the "Multiplication result converted to
larger type" alert that codeQL scanning found.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Andrew Innes <andrew.c12@gmail.com>
Closes#14094
Windows port frees memory that was alloc'd aligned in a different way
then alloc'd memory. So changing frees to be specific.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Andrew Innes <andrew.c12@gmail.com>
Co-Authored-By: Jorgen Lundman <lundman@lundman.net>
Closes#14059
devmatch is useful on standalone machine but not on jails.
Put devinfo(8) and libdevinfo there too.
Differential Revision: https://reviews.freebsd.org/D36229
It's not really useful in a jail or in a mdroot or even if a users
wants to do a full zfs machine.
Reviewed by: mckusick
Differential Revision: https://reviews.freebsd.org/D36227
It is useful to have zfs utilities and lib in a separate package as
it allow users to create image that can support ZFS (i.e. not with
WITHOUT_ZFS in src.conf set) without bloating the default image with
all zfs tools (for example for jails).
Differential Revision: https://reviews.freebsd.org/D36225
For most users it's not needed to boot and they are also
available in the FreeBSD-rescue package in case an update
break and FreeBSD-geom package isn't updated correctly.
Differential Revision: https://reviews.freebsd.org/D36224
There's only one value that specifies the number of digits after the
decimal point (oh, sorry, the "radix character") the other specifies the
number before...
While here, add a little more info on the effects of using the #n value.
Obtained from: d1dd1a0864
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
There is a bug when formatting two consecutive values using fixed-widths
and the values need padding. This was because the value of pad_size
was zeroed only every other time.
Format Before After
[%8n] [%8n] [ $123.45] [ $123.45] [ $123.45] [ $123.45]
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
Fix an edge case by printing the required space when, the currency
symbol succeeds the value, a space separates the sign from the value and
the sign position precedes the quantity and the currency symbol.
In other words:
n_cs_precedes = 0
n_sep_by_space = 2
n_sign_posn = 1
From The Open Group's localeconv[1]:
> When {p,n,int_p,int_n}_sep_by_space is 2:
> If the currency symbol and sign string are adjacent, a space separates
> them; otherwise, a space separates the sign string from the value.
Format Before After
[%n] [-123.45¤] [- 123.45¤]
[1]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/localeconv.html
Obtained from: Darwin
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
Take into consideration the possibility of quantities enclosed by
parentheses when aligning.
Matches the examples from The Open Group's:
Format Before After
%(#5n [$ 123.45] [ $ 123.45 ] Use an alternative pos/neg style
[($ 123.45)] [($ 123.45)]
[$ 3,456.78] [ $ 3,456.78 ]
%!(#5n [ 123.45] [ 123.45 ] Disable the currency symbol
[( 123.45)] [( 123.45)]
[ 3,456.78] [ 3,456.78 ]
https://pubs.opengroup.org/onlinepubs/9699919799/functions/strfmon.html
SD5-XSH-ERN-29 is applied, updating the examples for %(#5n and %!(#5n.
Obtained from: Darwin
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
The international currency symbol (int_curr_symbol) has a mandatory
SPACE character as the last character.
Trim this space after reading it, otherwise this extra space will always
be printed when displaying the int_curr_symbol.
Fixes the output when the international currency format is selected
(%i).
Locale Format Before After
en_US.UTF-8 [%i] [USD 123.45] [USD123.45]
fr_FR.UTF-8 [%i] [123,45 EUR ] [123,45 EUR]
Note that the en_US.UTF-8 locale states that no space should be printed
between the currency symbol and the value (sep_by_space = 0).
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
Avoid an out-of-bounds access when trying to set the space_char using an
international currency format (%i) and the C/POSIX locale.
The current code tries to read the SPACE from int_curr_symbol[3]:
currency_symbol = strdup(lc->int_curr_symbol);
space_char = *(currency_symbol+3);
But on C/POSIX locales, int_curr_symbol is empty.
Three implementations have been examined: NetBSD[1], Darwin[2], and
Illumos[3]. Only NetBSD has fixed it[4].
Darwin and NetBSD also trim the mandatory final SPACE character after
reading it.
Locale Format Darwin/NetBSD FreeBSD/Illumos
en_US.UTF-8 [%i] [USD123.45] [USD 123.45]
fr_FR.UTF-8 [%i] [123,45 EUR] [123,45 EUR ]
This commit only fixes the out-of-bounds access.
[1]: https://github.com/NetBSD/src/blob/trunk/lib/libc/stdlib/strfmon.c
[2]: https://opensource.apple.com/source/Libc/Libc-1439.141.1/stdlib/NetBSD/strfmon.c.auto.html
[3]: https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/locale/strfmon.c
[4]: 3d7b5d498a
Reviewed by: kib
PR: 267282
Github PR: #619
MFC after: 1 week
Currently libvmmapi provides a way to get a list of the allowed ioctls
on the vmm device file, so that bhyve can limit rights on the device
file fd. The interface is rather strange: it allocates a copy of the
list but returns a const pointer, so the caller has to cast away the
const in order to free it without aggravating the compiler.
As far as I can see, there's no reason to make a copy of the array, but
changing vm_get_ioctls() to not do that would break compatibility. So
this change just introduces a better interface: move all rights-limiting
logic into libvmmapi.
Any new operations on the fd should be wrapped by libvmmapi, so also
discourage use of vm_get_device_fd(). Currently bhyve uses it only when
limiting rights on the device fd.
No functional change intended.
Reviewed by: jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D37098
Coverity made two complaints about this function. The first is that we
ignore the number of bytes read. The second is that we have a sizeof
mismatch.
On 64-bit systems, long is a 64-bit type. Paradoxically, the standard
says that hostid is 32-bit, yet is also a long type. On 64-bit big
endian systems, reading into the long would cause us to return 0 as our
hostid after the mask. This is wrong.
Also, if a partial read were to happen (it should not), we would return
a partial hostid, which is also wrong.
We introduce a uint32_t system_hostid stack variable and ensure that the
read is done into it and check the read's return value. Then we set the
value based on whether the read was successful. This should fix both of
coverity's complaints.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Neal Gompa <ngompa@datto.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#13968
Clang's static analyzer complains about this.
In get_configs(), if we have an invalid configuration that has no top
level vdevs, we can read a couple of uninitialized variables. Aborting
upon seeing this would break the userland tools for healthy pools, so we
instead initialize the two variables to 0 to allow the userland tools to
continue functioning for the pools with valid configurations.
In zfs_do_wait(), if no wait activities are enabled, we read an
uninitialized error variable.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14043
This confused Clang's static analyzer, making it think there was a
possible NULL pointer dereference. There is no NULL pointer dereference.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14042
Both Coverity and Clang's static analyzer caught this.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14044
Otherwise we do not fall back to sysctls if the auxv entries are not
defined by the kernel. Arguably this is not a bug since we do not
support newer libc running on an older kernel, but we can be a bit more
gentle for the benefit of Valgrind or any other software which
synthesizes the auxv for virtualization purposes.
Reported by: Paul Floyd <paulf2718@gmail.com>
MFC after: 1 week
Reviewed by: brooks, kib
Differential Revision: https://reviews.freebsd.org/D37036
These were categorized as the following:
* Dead assignment 23
* Dead increment 4
* Dead initialization 6
* Dead nested assignment 18
Most of these are harmless, but since actual issues can hide among them,
we correct them.
That said, there were a few return values that were being ignored that
appeared to merit some correction:
* `destroy_callback()` in `cmd/zfs/zfs_main.c` ignored the error from
`destroy_batched()`. We handle it by returning -1 if there is an
error.
* `zfs_do_upgrade()` in `cmd/zfs/zfs_main.c` ignored the error from
`zfs_for_each()`. We handle it by doing a binary OR of the error
value from the subsequent `zfs_for_each()` call to the existing
value. This is how errors are mostly handled inside `zfs_for_each()`.
The error value here is passed to exit from the zfs command, so doing
a binary or on it is better than what we did previously.
* `get_zap_prop()` in `module/zfs/zcp_get.c` ignored the error from
`dsl_prop_get_ds()` when the property is not of type string. We
return an error when it does. There is a small concern that the
`zfs_get_temporary_prop()` call would handle things, but in the case
that it does not, we would be pushing an uninitialized numval onto
the lua stack. It is expected that `dsl_prop_get_ds()` will succeed
anytime that `zfs_get_temporary_prop()` does, so that not giving it a
chance to fix things is not a problem.
* `draid_merge_impl()` in `tests/zfs-tests/cmd/draid.c` used
`nvlist_add_nvlist()` twice in ways in which errors are expected to
be impossible, so we switch to `fnvlist_add_nvlist()`.
A few notable ones did not merit use of the return value, so we
suppressed it with `(void)`:
* `write_free_diffs()` in `lib/libzfs/libzfs_diff.c` ignored the error
value from `describe_free()`. A look through the commit history
revealed that this was intentional.
* `arc_evict_hdr()` in `module/zfs/arc.c` did not need to use the
returned handle from `arc_hdr_realloc()` because it is already
referenced in lists.
* `spa_vdev_detach()` in `module/zfs/spa.c` has a comment explicitly
saying not to use the error from `vdev_label_init()` because whatever
causes the error could be the reason why a detach is being done.
Unfortunately, I am not presently able to analyze the kernel modules
with Clang's static analyzer, so I could have missed some cases of this.
In cases where reports were present in code that is duplicated between
Linux and FreeBSD, I made a conscious effort to fix the FreeBSD version
too.
After this commit is merged, regressions like dee8934 should become
extremely obvious with Clang's static analyzer since a regression would
appear in the results as the only instance of unused code. That assumes
that Coverity does not catch the issue first.
My local branch with fixes from all of my outstanding non-draft pull
requests shows 118 reports from Clang's static anlayzer after this
patch. That is down by 51 from 169.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Cedric Berger <cedric@precidata.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#13986
Users are allowed to pass NULL to resultp, but we unconditionally assume
that they never do. When an external user does pass NULL to resultp, we
dereference a NULL pointer.
Clang's static analyzer complained about this.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14008
This reverts commit 8534e6be81, and adds
a cautionary note that there are dragons about that should be considered
when changing it.
PR: 267026
Reviewed by: dim, imp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D36981
This reverts commit 76e6e4d72f.
Several programs in the tree use -1 instead of INT_MAX to use
the maximum value. Thanks to Eugene Grosbein for pointing this
out.
Ensure that a negative backlog argument is handled as it if was 0.
Reviewed by: markj@, glebius@
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D31821