44c125c4cebc2fd87c6260b90eddae11201f5232 switched the nvlist allocations
to be M_WAITOK, but this precludes the use in non-sleepable contexts.
(E.g. with a nonsleepable lock held).
All callers for these allocation functions already cope with memory
alloation failures, so there's no reason to allow sleeping during
allocations.
Reviewed by: melifaro, oshogbo
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29556
Upstream commit message:
Support running FreeBSD buildworld on Arm-based macOS hosts
Arm-based Macs are like FreeBSD and provide a full 64-bit stat from the
start, so have no stat64 variants. Thus, define stat64 and fstat64 as
aliases for the normal versions.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Closes#11771
MFC after: 1 week
ipf_proxy_check() returns -1 for an error and 0 or 1 for success.
ipf_proxy_check()'s callers check for error and if the return code
is 0, they change it to 1 prior to returning to their callers. Simply
by returning -1 or 1 we reduce complexity and cycles burned changing
0 to 1.
MFC after: 1 week
The global list has a marker with an invariant that free vnodes are
placed somewhere past that. A caller which performs filtering (like ZFS)
can move said marker all the way to the end, across free vnodes which
don't match. Then a caller which does not perform filtering will fail to
find them. This makes vn_alloc_hard sleep for 1 second instead of
reclaiming, resulting in significant stalls.
Fix the problem by requiring an explicit marker by callers which do
filtering.
As a temporary measure extend vnlru_free to restart if it fails to
reclaim anything.
Big thanks go to the reporter for testing several iterations of the
patch.
Reported by: Yamagi <lists yamagi.org>
Tested by: Yamagi <lists yamagi.org>
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29324
Notable upstream pull request merges:
#11153 Scalable teardown lock for FreeBSD
#11651 Don't bomb out when using keylocation=file://
#11667 zvol: call zil_replaying() during replay
#11683 abd_get_offset_struct() may allocate new abd
#11693 Intentionally allow ZFS_READONLY in zfs_write
#11716 zpool import cachefile improvements
#11720 FreeBSD: Clean up zfsdev_close to match Linux
#11730 FreeBSD: bring back possibility to rewind the
checkpoint from bootloader
Obtained from: OpenZFS
MFC after: 2 weeks
Add parsing of the rewind options.
When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewind. When the FreeBSD repo has
synced with the OpenZFS, this part of the code was removed.
[1] FreeBSD repo: 277f38abffc6a8160b5044128b5b2c620fbb970c
[2] OpenZFS repo: f2c027bd6a003ec5793f8716e6189c389c60f47a
Originally reviewed by: tsoome, allanjude
Originally reviewed by: kevans (ok from high-level overview)
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
PR: 254152
Reported by: Zhenlei Huang <zlei.huang at gmail.com>
Obtained from: https://github.com/openzfs/zfs/pull/11730
The current preference number were copied from IPv4 code,
assuming 500k routes to be the full-view. Adjust with the current
reality (100k full-view).
Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>
MFC after: 3 days
After the merge of OpenZFS master-9312e0fd1 it has become possible to
import ZFS pools witn an active org.illumos:edonr feature on FreeBSD,
leading to a panic.
In addition, "zpool status" reported all pools without edonr as upgradable
and "zpool upgrade -v" lists edonr in the list of upgradable features.
This is an accepted but not yet included bugfix by upstream.
Obtained from: https://github.com/openzfs/zfs/pull/11653
Differential Revision: https://reviews.freebsd.org/D28935
Reported by: garga (on freebsd-current@)
Reviewed by: freqlabs
X-MFC-with: ba27dd8be821792e15bdabfac69fd6cab0cf9dd3
This package is intended to be used with ice(4) version 0.28.1-k.
That update will happen in a forthcoming commit.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation
LARGE_NAT is a C macro that increases
NAT_SIZE from 127 to 2047,
RDR_SIZE from 127 to 2047,
HOSTMAP_SIZE from 2047 to 8191,
NAT_TABLE_MAX from 30000 to 180000, and
NAT_TABLE_SZ from 2047 to 16383.
These values can be altered at runtime using the ipf -T command however
some adminstrators of large firewalls rebuild the kernel to enable
LARGE_NAT at boot. This revision adds the tunable net.inet.ipf.large_nat
which allows an administrator to set this option at boot instead of build
time. Setting the LARGE_NAT macro to 1 is unaffected allowing build-time
users to continue using the old way.
Notable upstream changes:
778869fa1 Fix reporting of mount progress
e7adccf7f Disable use of hardware crypto offload drivers on FreeBSD
03e02e5b5 Fix checksum errors not being counted on repeated repair
64e0fe14f Restore FreeBSD resource usage accounting
11f2e9a49 Fix panic if scrubbing after removing a slog device
MFC after: 2 weeks
Notable upstream changes:
bf156c966 Remove unused abd_alloc_scatter_offset_chunkcnt
658fb8020 Add "compatibility" property for zpool feature sets
This update introduces a new pool property called "compatibility"
that can be used to enable a limited set of pool features on pool
creation and "stick" to it, so the "zpool upgrade" does not
accidentally enable features that are not desired. The value of
this property may then be changed later.
See zpool-features(5) for more information about the "compatibility"
pool property.
Obtained from: OpenZFS
MFC after: 2 weeks
If the ksh files are not executable then the tests are not run
and reported as failed.
MFC after: 2 weeks
X-MFC-with: 6b52139eb8e8eda0ea263b24735556194f918642
From openzfs-master 0ae184a6b commit message:
If we do not write any buffers to the cache device and the evict hand
has not advanced do not update the cache device header.
Cherry-picked from openzfs 0ae184a6baaf71e155e9b19af81b75474622ff58
Patch Author: George Amanakis <gamanakis@gmail.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28682
From openzfs-master 62d4287f2 commit message:
When scrubbing, (non-sequential) resilvering, or correcting a checksum
error using RAIDZ parity, ZFS should heal any incorrect RAIDZ parity by
overwriting it. For example, if P disks are silently corrupted (P being
the number of failures tolerated; e.g. RAIDZ2 has P=2), `zpool scrub`
should detect and heal all the bad state on these disks, including
parity. This way if there is a subsequent failure we are fully
protected.
With RAIDZ2 or RAIDZ3, a block can have silent damage to a parity
sector, and also damage (silent or known) to a data sector. In this
case the parity should be healed but it is not.
Cherry-picked from openzfs 62d4287f279a0d184f8f332475f27af58b7aa87e
Patch Author: Matthew Ahrens <matthew.ahrens@delphix.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28681
57785538c6e0d7e8ca0f161ab95bae10fd304047 change the test for FreeBSD
from __FreeBSD_version to __FreeBSD__. However this test was performed
before sys/param.h was included, therefore __FreeBSD_version was never
defined. As the test was never true opt_random_ip_id.h was never included.
Submitted by: bdragon
Reported by: bdragon
MFC after: 1 week
X-MFC with: 57785538c6e0d7e8ca0f161ab95bae10fd304047
Apply https://github.com/openzfs/zfs/pull/11576
Direct commit from upstream openzfs. Full commit message below:
Set file mode during zfs_write
3d40b65 refactored zfs_vnops.c, which shared much code verbatim between
Linux and BSD. After a successful write, the suid/sgid bits are reset,
and the mode to be written is stored in newmode. On Linux, this was
propagated to both the in-memory inode and znode, which is then updated
with sa_update.
3d40b65 accidentally removed the initialization of newmode, which
happened to occur on the same line as the inode update (which has been
moved out of the function).
The uninitialized newmode can be saved to disk, leading to a crash on
stat() of that file, in addition to a merely incorrect file mode.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes#11474Closes#11576
Obtained from: openzfs/zfs@f8ce8aed0
MFC after: 0 days
Sponsored by: iXsystems, Inc.
The ipfilter NAT table host map size is a tunable that defaults to
a macro value defined at build time. HOSTMAP_SIZE is saved in softn
(the ipnat softc) at initialization. It can be tuned (changed) at runtime
using the ipf -T command. If the hostmap_size tunable is adjusted the
calculation to determine where to put new entries in the table was
incorrect. Use the tunable in the NAT softc instead of the static build
time value.
MFC after: 1 week
The conscious decision was made not to perform any indentation or
whitespace cleanup while cleaning out old redunant #ifdefs. The
reason for this was to avoid confusing future readers of history and
diffs with cosmetic changes, making bisection of any possible bugs
introduced more difficult. This commit cleans up the whitespace
detritus left behind from the previous #ifdef cleanup commits.
MFC after: 1 week
In the old days when K&R C and STD C were each in use a workaround
(read hack) was required to allow the same code to work on each
without modification. All C compilers support STD C. We can finally
put the __P prototype to rest.
MFC after: 1 week
All C compilers in 2021 support standard C and architectures that did
not were retired long ago. Simplify by removing now redundant
pre-standard C code.
MFC after: 1 week
Pass the structure offset in arg2 instead of arg1. This avoids
having to undo the pointer arithmetic on arg1. Instead arg2 can
be used directly as an offset relative to the desired structure.
Reviewed by: cy
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27961
It changed the #pinctrl-cells value to be equal to 2 and the macro
that generates the values.
Based on the bindings docs a value of 2 is only acceptable if the node
used pinctrl-single,bits and not pinctrl-single,pins
This allow booting further on the beaglebone black with 5.9 DTS
Don't try to cleanup the zpool if we couldn't create a zpool in the
first place.
Submitted by: tmunro
MFC-with: 022ca2fc7fe08d51f33a1d23a9be49e6d132914e
This change introduces loadable fib lookup modules based on
DPDK rte_lpm lib targeted for high-speed lookups in large-scale tables.
It is based on the lookup framework described in D27401.
IPv4 module is called dpdk_lpm4. It wraps around rte_lpm [1] library.
This library implements variation of DIR24-8 [2] lookup algorithm.
Module provide lockless route lookups and in-place incremental updates,
allowing for good RIB performance.
IPv6 module is called dpdk_lpm6. It wraps around rte_lpm6 [3] library.
Implementation can be seen as multi-bit trie where the stride or number of bits
inspected on each level varies from level to level.
It can vary from 1 to 14 memory accesses, with 5 being the average value
for the lengths that are most commonly used in IPv6.
Module provide lockless route lookups for global unicast addresses
and in-place incremental updates, allowing for good RIB performance.
Implementation details:
* wrapper code lives in `sys/contrib/dpdk_rte_lpm/dpdk_lpm[6].c`.
* rte_lpm[6] implementation contains both RIB and FIB code.
. RIB ("rule_") code, backed by array of hash tables part has been commented out,
as base radix already provides all the necessary primitives.
* link-local lookups are currently implemented as base radix lookup.
This part should be converted to something like read-only radix trie.
Usage detail:
Compile kernel with option FIB_ALGO and load dpdk_lpm4/dpdk_lpm6
module at any time. They will be picked up automatically when
amount of routes raises to several thousand.
[1]: https://doc.dpdk.org/guides/prog_guide/lpm_lib.html
[2]: http://yuba.stanford.edu/~nickm/papers/Infocom98_lookup.pdf
[3]: https://doc.dpdk.org/guides/prog_guide/lpm6_lib.html
Differential Revision: https://reviews.freebsd.org/D27412