This version is intended to be used with the 0.29.4 version of the
ice(4) driver, which will be be committed afterwards.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reviewed by: stallamr_netapp.com
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D30887
Update the driver in order not to break its compilation
and make use of the new ENA logging system
Migrate platform code to the new logging system provided by ena_com
layer.
Make ENA_INFO the new default log level.
Remove all explicit use of `device_printf`, all new logs requiring one
of the log macros to be used.
If we had a multiple nvlist, during nvlist_pack, we calculated the size
of every nvlist separately. For example, if we had a nvlist with three
nodes each containing another (A contains B, and B contains C), we first
calculated the size of nvlist A (which contains B, C), then we calculate
the size of B (which contains C, notice that we already did the
calculation of B, when we calculate A), and finally C. This means that
this calculation was O(N!). This was done because each time we pack
nvlist, we have to put its size in the header
(the separate header for A, B, and C).
To not break the ABI and to reduce the complexity of nvlist_size,
instead of calculating the nvlist size when requested,
we track the size of each nvlist.
Reported by: pjd, kp
Tested by: kp
Notable upstream pull request merges:
#11710 Allow zfs to send replication streams with missing snapshots
#11751 Avoid taking global lock to destroy zfsdev state
#11786 Ratelimit deadman zevents as with delay zevents
#11803 ZFS traverse_visitbp optimization to limit prefetch
#11813 Allow pool names that look like Solaris disk names
#11822 Atomically check and set dropped zevent count
#11822 Don't scale zfs_zevent_len_max by CPU count
#11833 Refactor zfsdev state init/destroy to share common code
#11837 zfs get -p only outputs 3 columns if "clones" property is empty
#11843 libzutil: zfs_isnumber(): return false if input empty
#11849 Use dsl_scan_setup_check() to setup a scrub
#11861 Improvements to the 'compatibility' property
#11862 cmd/zfs receive: allow dry-run (-n) to check property args
#11864 receive: don't fail inheriting (-x) properties on wrong dataset type
#11877 Combine zio caches if possible
#11881 FreeBSD: use vnlru_free_vfsops if available
#11883 FreeBSD: add support for lockless symlink lookup
#11884 FreeBSD: add missing seqc write begin/end around zfs_acl_chown_setattr
#11896 Fix crash in zio_done error reporting
#11905 zfs-send(8): Restore sorting of flags
#11926 FreeBSD: damage control racing .. lookups in face of mkdir/rmdir
#11930 vdev_mirror: don't scrub/resilver devices that can't be read
#11938 Fix AVX512BW Fletcher code on AVX512-but-not-BW machines
#11955 zfs get: don't lookup mount options when using "-s local"
#11956 libzfs: add keylocation=https://, backed by fetch(3) or libcurl
#11959 vdev_id: variable not getting expanded under map_slot()
#11966 Scale worker threads and taskqs with number of CPUs
#11994 Clean up use of zfs_log_create in zfs_dir
#11997 FreeBSD: Don't force xattr mount option
#11997 FreeBSD: Implement xattr=sa
#11997 FreeBSD: Use SET_ERROR to trace xattr name errors
#11998 Simplify/fix dnode_move() for dn_zfetch
#12003 FreeBSD: Initialize/destroy zp->z_lock
#12010 Fix dRAID self-healing short columns
#12033 Revert "Fix raw sends on encrypted datasets when copying back snapshots"
#12040 Reinstate the old zpool read label logic as a fallback
#12046 Improve scrub maxinflight_bytes math
#12049 FreeBSD: avoid memory allocation in arc_prune_async
#12052 FreeBSD: incorporate changes to the VFS_QUOTACTL(9) KPI
#12061 Fix dRAID sequential resilver silent damage handling
#12072 Let zfs diff be more permissive
#12077 FreeBSD: Retry OCF ENOMEM errors.
#12088 Propagate vdev state due to invalid label corruption
#12091 libzfs: On FreeBSD, use MNT_NOWAIT with getfsstat
#12097 FreeBSD: Update dataset_kstats for zvols in dev mode
#12104 FreeBSD boot code reminder after zpool upgrade
#12114 Introduce write-mostly sums
Obtained from: OpenZFS
OpenZFS commit: 75b4cbf625
uni_msg_extend() may fail due to a memory allocation failure. In this
case, though, the message is freed, so callers shouldn't touch it.
PR: 255861
Reviewed by: harti
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30611
Instead of requiring all implementations of vfs_quotactl to unbusy
the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param
to VFS_QUOTACTL(9). The implementation may then indicate to the caller
whether it needed to unbusy the mount.
Also, add stbool.h to libprocstat modules which #define _KERNEL
before including sys/mount.h. Otherwise they'll pull in sys/types.h
before defining _KERNEL and therefore won't have the bool definition
they need for mp_busy.
Reviewed By: kib, markj
Differential Revision: https://reviews.freebsd.org/D30556
Parts of libprocstat like to pretend they're kernel components for the
sake of including mount.h, and including sys/types.h in the _KERNEL
case doesn't fix the build for some reason. Revert both the
VFS_QUOTACTL() change and the follow-up "fix" for now.
Instead of requiring all implementations of vfs_quotactl to unbusy
the mount for Q_QUOTAON and Q_QUOTAOFF, add an "mp_busy" in/out param
to VFS_QUOTACTL(9). The implementation may then indicate to the caller
whether it needed to unbusy the mount.
Reviewed By: kib, markj
Differential Revision: https://reviews.freebsd.org/D30218
Unfortunately the wrong elemet is freed, also resulting in use-after-free.
PR: 255859
Submitted by: lylgood@foxmail.com
Reported by: lylgood@foxmail.com
MFC after: 3 days
While here, fix all links to older en_US.ISO8859-1 documentation
in the src/ tree.
PR: 255026
Reported by: Michael Büker <freebsd@michael-bueker.de>
Reviewed by: dbaio
Approved by: blackend (mentor), re (gjb)
MFC after: 10 days
Differential Revision: https://reviews.freebsd.org/D30265
The NAV (network allocation vector) register reflects the current MAC
tracking of NAV - when it will stay quiet before transmitting.
Other devices transmit their frame durations in their 802.11 PHY headers
and all devices that hear a frame - even if it's one in an encoding
they don't understand - will understand the low bitrate PHY header that
includes the frame duration. So, they'll set NAV to this value so
they'll stay quiet until the transmit completes.
Anyway, sometimes the PHY NAV header is garbled and sometimes, notably
older broadcom devices, will fake a long NAV so they can get "cleaner" air
for local calibration. When this happens, the hardware will stay quiet
for quite some time and this can lead to missed/stuck beacons, or
(for Very Large Values) a MAC hang.
This code just adds the ability to get/set the NAV; the driver will
need to take care of using it during transmit hangs and beacon misses
to see if it's due to a trash looking NAV.
The 'ticket' and 'my_ticket' arguments are both read and written within
the same asm block. Clang is stricter with the constraints than gcc4
was, so accepts the '=r' at face value and will happily overwrite
registers that "should" be preserved.
Mark these operands to not clobber other operands, so they get their own
registers.
This fixes a panic on bringing up the octe interfaces.
Notable upstream pull request merges:
#11742 When specifying raidz vdev name, parity count should match
#11744 Use a helper function to clarify gang block size
#11771 Support running FreeBSD buildworld on Arm-based macOS hosts
This is the last update that will be MFCed into stable/13.
From now on, the tracking of OpenZFS branches will be different:
- main continues tracking openzfs/zfs/master
- stable/13 is going to track openzfs/zfs/zfs-2.1-release
Obtained from: OpenZFS
MFC after: 1 week
44c125c4ce switched the nvlist allocations
to be M_WAITOK, but this precludes the use in non-sleepable contexts.
(E.g. with a nonsleepable lock held).
All callers for these allocation functions already cope with memory
alloation failures, so there's no reason to allow sleeping during
allocations.
Reviewed by: melifaro, oshogbo
MFC after: 1 week
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D29556
Upstream commit message:
Support running FreeBSD buildworld on Arm-based macOS hosts
Arm-based Macs are like FreeBSD and provide a full 64-bit stat from the
start, so have no stat64 variants. Thus, define stat64 and fstat64 as
aliases for the normal versions.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Closes#11771
MFC after: 1 week
ipf_proxy_check() returns -1 for an error and 0 or 1 for success.
ipf_proxy_check()'s callers check for error and if the return code
is 0, they change it to 1 prior to returning to their callers. Simply
by returning -1 or 1 we reduce complexity and cycles burned changing
0 to 1.
MFC after: 1 week
The global list has a marker with an invariant that free vnodes are
placed somewhere past that. A caller which performs filtering (like ZFS)
can move said marker all the way to the end, across free vnodes which
don't match. Then a caller which does not perform filtering will fail to
find them. This makes vn_alloc_hard sleep for 1 second instead of
reclaiming, resulting in significant stalls.
Fix the problem by requiring an explicit marker by callers which do
filtering.
As a temporary measure extend vnlru_free to restart if it fails to
reclaim anything.
Big thanks go to the reporter for testing several iterations of the
patch.
Reported by: Yamagi <lists yamagi.org>
Tested by: Yamagi <lists yamagi.org>
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29324
Notable upstream pull request merges:
#11153 Scalable teardown lock for FreeBSD
#11651 Don't bomb out when using keylocation=file://
#11667 zvol: call zil_replaying() during replay
#11683 abd_get_offset_struct() may allocate new abd
#11693 Intentionally allow ZFS_READONLY in zfs_write
#11716 zpool import cachefile improvements
#11720 FreeBSD: Clean up zfsdev_close to match Linux
#11730 FreeBSD: bring back possibility to rewind the
checkpoint from bootloader
Obtained from: OpenZFS
MFC after: 2 weeks
Add parsing of the rewind options.
When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewind. When the FreeBSD repo has
synced with the OpenZFS, this part of the code was removed.
[1] FreeBSD repo: 277f38abff
[2] OpenZFS repo: f2c027bd6a
Originally reviewed by: tsoome, allanjude
Originally reviewed by: kevans (ok from high-level overview)
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
PR: 254152
Reported by: Zhenlei Huang <zlei.huang at gmail.com>
Obtained from: https://github.com/openzfs/zfs/pull/11730
The current preference number were copied from IPv4 code,
assuming 500k routes to be the full-view. Adjust with the current
reality (100k full-view).
Reported by: Marek Zarychta <zarychtam at plan-b.pwste.edu.pl>
MFC after: 3 days
After the merge of OpenZFS master-9312e0fd1 it has become possible to
import ZFS pools witn an active org.illumos:edonr feature on FreeBSD,
leading to a panic.
In addition, "zpool status" reported all pools without edonr as upgradable
and "zpool upgrade -v" lists edonr in the list of upgradable features.
This is an accepted but not yet included bugfix by upstream.
Obtained from: https://github.com/openzfs/zfs/pull/11653
Differential Revision: https://reviews.freebsd.org/D28935
Reported by: garga (on freebsd-current@)
Reviewed by: freqlabs
X-MFC-with: ba27dd8be8
This package is intended to be used with ice(4) version 0.28.1-k.
That update will happen in a forthcoming commit.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Sponsored by: Intel Corporation
LARGE_NAT is a C macro that increases
NAT_SIZE from 127 to 2047,
RDR_SIZE from 127 to 2047,
HOSTMAP_SIZE from 2047 to 8191,
NAT_TABLE_MAX from 30000 to 180000, and
NAT_TABLE_SZ from 2047 to 16383.
These values can be altered at runtime using the ipf -T command however
some adminstrators of large firewalls rebuild the kernel to enable
LARGE_NAT at boot. This revision adds the tunable net.inet.ipf.large_nat
which allows an administrator to set this option at boot instead of build
time. Setting the LARGE_NAT macro to 1 is unaffected allowing build-time
users to continue using the old way.
Notable upstream changes:
778869fa1 Fix reporting of mount progress
e7adccf7f Disable use of hardware crypto offload drivers on FreeBSD
03e02e5b5 Fix checksum errors not being counted on repeated repair
64e0fe14f Restore FreeBSD resource usage accounting
11f2e9a49 Fix panic if scrubbing after removing a slog device
MFC after: 2 weeks
Notable upstream changes:
bf156c966 Remove unused abd_alloc_scatter_offset_chunkcnt
658fb8020 Add "compatibility" property for zpool feature sets
This update introduces a new pool property called "compatibility"
that can be used to enable a limited set of pool features on pool
creation and "stick" to it, so the "zpool upgrade" does not
accidentally enable features that are not desired. The value of
this property may then be changed later.
See zpool-features(5) for more information about the "compatibility"
pool property.
Obtained from: OpenZFS
MFC after: 2 weeks
From openzfs-master 0ae184a6b commit message:
If we do not write any buffers to the cache device and the evict hand
has not advanced do not update the cache device header.
Cherry-picked from openzfs 0ae184a6ba
Patch Author: George Amanakis <gamanakis@gmail.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28682
From openzfs-master 62d4287f2 commit message:
When scrubbing, (non-sequential) resilvering, or correcting a checksum
error using RAIDZ parity, ZFS should heal any incorrect RAIDZ parity by
overwriting it. For example, if P disks are silently corrupted (P being
the number of failures tolerated; e.g. RAIDZ2 has P=2), `zpool scrub`
should detect and heal all the bad state on these disks, including
parity. This way if there is a subsequent failure we are fully
protected.
With RAIDZ2 or RAIDZ3, a block can have silent damage to a parity
sector, and also damage (silent or known) to a data sector. In this
case the parity should be healed but it is not.
Cherry-picked from openzfs 62d4287f27
Patch Author: Matthew Ahrens <matthew.ahrens@delphix.com>
MFC after: 3 days
Reviewed by: delphij
Differential Revision: https://reviews.freebsd.org/D28681
57785538c6 change the test for FreeBSD
from __FreeBSD_version to __FreeBSD__. However this test was performed
before sys/param.h was included, therefore __FreeBSD_version was never
defined. As the test was never true opt_random_ip_id.h was never included.
Submitted by: bdragon
Reported by: bdragon
MFC after: 1 week
X-MFC with: 57785538c6
Apply https://github.com/openzfs/zfs/pull/11576
Direct commit from upstream openzfs. Full commit message below:
Set file mode during zfs_write
3d40b65 refactored zfs_vnops.c, which shared much code verbatim between
Linux and BSD. After a successful write, the suid/sgid bits are reset,
and the mode to be written is stored in newmode. On Linux, this was
propagated to both the in-memory inode and znode, which is then updated
with sa_update.
3d40b65 accidentally removed the initialization of newmode, which
happened to occur on the same line as the inode update (which has been
moved out of the function).
The uninitialized newmode can be saved to disk, leading to a crash on
stat() of that file, in addition to a merely incorrect file mode.
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Antonio Russo <aerusso@aerusso.net>
Closes#11474Closes#11576
Obtained from: openzfs/zfs@f8ce8aed0
MFC after: 0 days
Sponsored by: iXsystems, Inc.
The ipfilter NAT table host map size is a tunable that defaults to
a macro value defined at build time. HOSTMAP_SIZE is saved in softn
(the ipnat softc) at initialization. It can be tuned (changed) at runtime
using the ipf -T command. If the hostmap_size tunable is adjusted the
calculation to determine where to put new entries in the table was
incorrect. Use the tunable in the NAT softc instead of the static build
time value.
MFC after: 1 week
The conscious decision was made not to perform any indentation or
whitespace cleanup while cleaning out old redunant #ifdefs. The
reason for this was to avoid confusing future readers of history and
diffs with cosmetic changes, making bisection of any possible bugs
introduced more difficult. This commit cleans up the whitespace
detritus left behind from the previous #ifdef cleanup commits.
MFC after: 1 week
In the old days when K&R C and STD C were each in use a workaround
(read hack) was required to allow the same code to work on each
without modification. All C compilers support STD C. We can finally
put the __P prototype to rest.
MFC after: 1 week
All C compilers in 2021 support standard C and architectures that did
not were retired long ago. Simplify by removing now redundant
pre-standard C code.
MFC after: 1 week
Pass the structure offset in arg2 instead of arg1. This avoids
having to undo the pointer arithmetic on arg1. Instead arg2 can
be used directly as an offset relative to the desired structure.
Reviewed by: cy
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27961
It changed the #pinctrl-cells value to be equal to 2 and the macro
that generates the values.
Based on the bindings docs a value of 2 is only acceptable if the node
used pinctrl-single,bits and not pinctrl-single,pins
This allow booting further on the beaglebone black with 5.9 DTS
This change introduces loadable fib lookup modules based on
DPDK rte_lpm lib targeted for high-speed lookups in large-scale tables.
It is based on the lookup framework described in D27401.
IPv4 module is called dpdk_lpm4. It wraps around rte_lpm [1] library.
This library implements variation of DIR24-8 [2] lookup algorithm.
Module provide lockless route lookups and in-place incremental updates,
allowing for good RIB performance.
IPv6 module is called dpdk_lpm6. It wraps around rte_lpm6 [3] library.
Implementation can be seen as multi-bit trie where the stride or number of bits
inspected on each level varies from level to level.
It can vary from 1 to 14 memory accesses, with 5 being the average value
for the lengths that are most commonly used in IPv6.
Module provide lockless route lookups for global unicast addresses
and in-place incremental updates, allowing for good RIB performance.
Implementation details:
* wrapper code lives in `sys/contrib/dpdk_rte_lpm/dpdk_lpm[6].c`.
* rte_lpm[6] implementation contains both RIB and FIB code.
. RIB ("rule_") code, backed by array of hash tables part has been commented out,
as base radix already provides all the necessary primitives.
* link-local lookups are currently implemented as base radix lookup.
This part should be converted to something like read-only radix trie.
Usage detail:
Compile kernel with option FIB_ALGO and load dpdk_lpm4/dpdk_lpm6
module at any time. They will be picked up automatically when
amount of routes raises to several thousand.
[1]: https://doc.dpdk.org/guides/prog_guide/lpm_lib.html
[2]: http://yuba.stanford.edu/~nickm/papers/Infocom98_lookup.pdf
[3]: https://doc.dpdk.org/guides/prog_guide/lpm6_lib.html
Differential Revision: https://reviews.freebsd.org/D27412
POSIX O_DSYNC means that writes include an implicit fdatasync(2), just
as O_SYNC implies fsync(2).
VOP_WRITE() functions that understand the new IO_DATASYNC flag can act
accordingly, but we'll still pass down IO_SYNC so that file systems that
don't understand it will continue to provide the stronger O_SYNC
behaviour.
Flag also applies to fcntl(2).
Reviewed by: kib, delphij
Differential Revision: https://reviews.freebsd.org/D25090
Fix non-FreeBSD CI build after v1.4.8. This definition was only used in
zstd(1), which isn't part of non-FreeBSD CI (I guess). The ifdef was
added in v1.4.5 import.
Upstream does not currently support shared-linked zstd(1), but I have
proposed https://github.com/facebook/zstd/pull/2450 . If that is
adopted, we can add -DZSTD_PROGRAMS_LINK_SHARED to our libzstd build and
drop some diffs.
Reported by: uqs
lua: avoid gcc -Wreturn-local-addr bug
Avoid a bug with gcc's -Wreturn-local-addr warning with some
obfuscation. In buggy versions of gcc, if a return value is an
expression that involves the address of a local variable, and even if
that address is legally converted to a non-pointer type, a warning may
be emitted and the value of the address may be replaced with zero.
Howerver, buggy versions don't emit the warning or replace the value
when simply returning a local variable of non-pointer type.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90737
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes#11337
spa: avoid type narrowing warning
Building the spa module for i386 caused gcc to emit
-Wint-to-pointer-cast "cast to pointer from integer of different size"
because spa.spa_did was uint64_t but pthread_join (via thread_join in
spa_deactivate) takes a pointer (32-bit on i386). Define spa_did to be
pointer-size instead. For now spa_did is in fact never non-zero and the
thread_join could instead be ifdef'd out, but changing the size of
spa_did may be more useful for the future.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes#11336
FreeBSD libzfs: gcc requires __thread after static
Building libzfs with gcc on FreeBSD failed because gcc is picky about
the order of keywords in declarations with __thread, whereas clang is
more relaxed.
https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Ryan Libby <rlibby@FreeBSD.org>
Closes#11331
Fix compiling on FreeBSD + gcc - don't assume illmnos bits
This looks like it was once from the illumnos compat code.
FreeBSD doesn't have cmn_err as a compiler format attribute, so
it definitely errors out.
It doesn't show up on LLVM because it doesn't trigger at all.
Add in the format flags but keep them behind #if 0 for now;
there are too many format issues that trigger when one does
format checking in the shared code.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: adrian chadd <adrian@freebsd.org>
Closes#11068Closes#11069
Fix pointer-is-uint64_t-sized assumption in the ioctl path
This shows up when compiling freebsd-head on amd64 using gcc-6.4.
The lib32 compat build ends up tripping over this assumption.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: adrian chadd <adrian@freebsd.org>
Closes#11068Closes#11069
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
* Use the new API of ena_trace_*
* Fix typo syndrom --> syndrome
* Remove validation of the Rx req ID (already performed in the ena-com)
* Remove usage of deprecated ENA_ASSERT macro
Submitted by: Ido Segev <idose@amazon.com>
Submitted by: Michal Krawczyk <mk@semihalf.com>
Obtained from: Semihalf
Sponsored by: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27115
The latest generation hardware requires IO CQ (completion queue)
descriptors memory to be aligned to a 4K. It needs that feature for
the best performance.
Allocating unaligned descriptors will have a big performance impact as
the packet processing in a HW won't be optimized properly. For that
purpose adjust ena_dma_alloc() to support it.
It's a critical fix, especially for the arm64 EC2 instances.
Submitted by: Ido Segev <idose@amazon.com>
Obtained from: Amazon, Inc
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D27114
They are only there to provide less innacurate statistics for debuggers.
However, this is quite heavy-weight and instead it would be better to
teach debuggers how to obtain the necessary information.
This deduplicates 2 sets of caches using the same sizes.
Memory savings fluctuate a lot, one sample result is buildworld on zfs
saving ~180MB RAM in reduced page count associated with zio caches.
The ZIL will be opened on the first write, not earlier.
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Mariusz Zaborski <oshogbo@vexillium.org>
OpenZFS Pull Request: https://github.com/openzfs/zfs/pull/11152
PR: 250934
My script to convert git commits to svn patch does not handle binary
files correctly, and r367387 committed a set of empty files as a result.
MFC with: r367387
Sponsored by: Rubicon Communications, LLC (Netgate)
lz4 port from illumos to Linux added a 16KB per-CPU cache to accommodate for
the missing 16KB malloc. FreeBSD supports this size, making the extra cache
harmful as it can't share buckets.
The use of atomic_sub_64() in zfs_zstd.c was breaking the 32-bit build on
platforms without native 64-bit atomics due to atomic_sub_64() not being
available, and no fallback being provided in _STANDALONE.
Provide a standalone stub to match atomic_add_64() using simple math.
While this is not actually atomic, it does not matter in libsa context,
since it always runs single-threaded and does not run under a scheduler.
Reviewed by: mjg (in email)
This applies:
commit c4ede65bdf
Author: Mateusz Guzik <mjguzik@gmail.com>
Date: Fri Oct 30 23:26:10 2020 +0100
zstd: track allocator statistics
Note that this only tracks sizes as requested by the caller.
Actual allocated space will almost always be bigger (e.g., rounded up to
the next power of 2 or page size). Additionally the allocated buffer may
be holding other areas hostage. Nonetheless, this is a starting point
for tracking memory usage in zstd.
from openzfs
Foundation copyrights, approved by emaste@. It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make similar change.
Reviewed by: emaste, imp, gbe (manpages)
Differential Revision: https://reviews.freebsd.org/D26980
hese kstats are often expensive to compute so we want to avoid them
unless specifically requested.
The following kstats are affected by this change:
kstat.zfs.${pool}.multihost
kstat.zfs.${pool}.misc.state
kstat.zfs.${pool}.txgs
kstat.zfs.misc.fletcher_4_bench
kstat.zfs.misc.vdev_raidz_bench
kstat.zfs.misc.dbufs
kstat.zfs.misc.dbgmsg
PR: 249258
Reported by: mjg
Reviewed by: mjg, allanjude
Obtained from: https://github.com/openzfs/zfs/pull/11099
Sponsored by: iXsystems, Inc.
- fix panic due to tqid overflow
- Improve libzfs_error_init messages
- Expose zfetch_max_idistance tunable
- Make dbufstat work on FreeBSD
- Fix EIO after resuming receive of new dataset over an existing one
Some vnodes come with a hack which inherits the fplookup flag despite having vops
which don't provide the routine.
Reported by: YAMAMOTO Shigeru <shigeru@os-hackers.jp>
The 32-bit counter eventually wraps to 0 which is a sentinel for invalid
id.
Make it 64-bit on LP64 platforms and 0-check otherwise.
Note: Linux counterpart uses id stored per queue instead of a global.
I did not check going that way is feasible with the goal being the
minimal fix doing the job.
Reported by: YAMAMOTO Shigeru <shigeru@os-hackers.jp>
Reviewed by: mav
Differential Revision: https://reviews.freebsd.org/D26759
Add support to the _STANDALONE environment enough bits of the kernel
that we can compile it. We still have a small zstd_shim.c since there
were 3 items that were a bit hard to nail down and may be cleaned up
in the future. These go hand in hand with a number of commits to
sys/sys in the past weeks, should this need be MFCd.
Discussed with: mmacy (in review and on IRC/Slack)
Reviewed by: freqlabs (on openzfs repo)
Differential Revision: https://reviews.freebsd.org/D26218
Dynamically created OIDs automatically get this flag set.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D26561
- Annotate FreeBSD sysctls with CTLFLAG_MPSAFE
- Reduce stack usage of Lua
- Don't save user FPU context in kernel threads
- Add support for procfs_list
- Code cleanup in zio_crypt
- Add DB_RF_NOPREFETCH to dbuf_read()s in dnode.c
- Drop references when skipping dmu_send due to EXDEV
- Eliminate gratuitous bzeroing in dbuf_stats_hash_table_data
- Fix legacy compat for platform IOCs
This was introduced when I merged r361287 to OpenZFS and has been fixed
there already, commit 3f6bb6e43fd68e.
Reported by: swills
Reviewed by: allanjude, freqlabs, mmacy
An upcoming change to include bitset(9) macros from vm_page.h
causes a macro name collision with vchi's custom bitset macros.
This change was performed mechanically by:
sed -i .orig s/BITSET/VCHI_BITSET/g $(grep -rl BITSET sys/contrib/vchiq)
Reviewed by: andrew
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26177
The pointer to vnode is already stored into f_vnode, so f_data can be
reused. Fix all found users of f_data for DTYPE_VNODE.
Provide finit_vnode() helper to initialize file of DTYPE_VNODE type.
Reviewed by: markj (previous version)
Discussed with: freqlabs (openzfs chunk)
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D26346
This package is intended to be used with ice(4) version 0.26.16. That
update will happen in a forthcoming commit.
MFC after: 3 days
Sponsored by: Intel Corporation
Make the inlines static to avoid kernel build failure with Clang 11 on i386.
(The issue was not observed with Clang 10, currently in tree; reproduction
depends on compiler inlining choices.)
The compiler may choose not to inline 'bare' C inlines, and in that case
expects a symbol of the same name will be available. It does not
automatically define that symbol at use, because of traditional C linking
semantics. (In contrast, C++ does define it, and then deduplicates redundant
definitions at link). As we do not instantiate the C99 inline ('extern
inline ...;'), the linker errors with "undefined symbol."
Reported by: dim
Tested by: dim
Fixes: r364219
Add prng(9) as a replacement for random(9) in the kernel.
There are two major differences from random(9) and random(3):
- General prng(9) APIs (prng32(9), etc) do not guarantee an
implementation or particular sequence; they should not be used for
repeatable simulations.
- However, specific named API families are also exposed (for now: PCG),
and those are expected to be repeatable (when so-guaranteed by the named
algorithm).
Some minor differences from random(3) and earlier random(9):
- PRNG state for the general prng(9) APIs is per-CPU; this eliminates
contention on PRNG state in SMP workloads. Each PCPU generator in an
SMP system produces a unique sequence.
- Better statistical properties than the Park-Miller ("minstd") PRNG
(longer period, uniform distribution in all bits, passes
BigCrush/PractRand analysis).
- Faster than Park-Miller ("minstd") PRNG -- no division is required to
step PCG-family PRNGs.
For now, random(9) becomes a thin shim around prng32(). Eventually I
would like to mechanically switch consumers over to the explicit API.
Reviewed by: kib, markj (previous version both)
Discussed with: markm
Differential Revision: https://reviews.freebsd.org/D25916
It was pointed out to me that this is the convention for documenting upgrade
instructions, rather than just leaving the instructions in the commit message.
It's possible these commands won't be used again before we transition to git,
but then at least they'll give a path forward for whoever touches this next.
Suggested by: lwhsu
We use these to compile libefivar. The particular motivation for this update is
the inclusion of the RISC-V machine definitions that allow us to build the
library on the platform. This support could easily have been submitted as a
small local diff, but the timing of the release coincided with this work, and
it has been over 3 years since these sources were initially imported.
Note that this comes with a license change from regular BSD 2-clause to the
BSD+Patent license. This has been approved by core@ for this particular
project [1].
As with the original import, we retain only the subset of headers that we
actually need to build libefivar. I adapted imp@'s process slightly for this
update:
# Generate list of the headers needed to build
cp -r ../vendor/edk2/dist/MdePkg/Include sys/contrib/edk2
cd lib/libefivar
make
pushd `make -V .OBJDIR`
cat .depend*.o | grep sys/contrib | cut -d' ' -f 3 |
sort -u | sed -e 's=/full/path/sys/contrib/edk2/==' > /tmp/xxx
popd
# Merge the needed files
cd ../../sys/contrib/edk2
svn revert -R .
for i in `cat /tmp/xxx`; do
svn merge -c VendorRevision svn+ssh://repo.freebsd.org/base/vendor/edk2/dist/MdePkg/$i $i
done
svn merge -c VendorRevision svn+ssh://repo.freebsd.org/base/vendor/edk2/dist/MdePkg/MdePkg.dec MdePkg.dec
[1] https://www.freebsd.org/internal/software-license.html