When needed, -fPIC is added in defs.mk. While not in main, mips on
stable/13 can't tolerate it. Remove it here.
MFC After: now (it's a build issue)
Sponsored by: Netflix
The scmi driver in its current form requires the arm_doorbell
driver to communicate with the firmware.
The arm_doorbell is only found in ARM Juno reference board (and
apparently on Morello too).
If we want to use scmi on other platform (like some rockchip or imx
soc), the driver needs to be updated to support svc/shmem communication
with the firmware.
For now since it can be only used with arm_doorbell move the device to
std.arm otherwise kernel configs like ALLWINNER or ROCKCHIP fails to build.
Reviewed by: br, imp
Sponsored by: Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37953
Ever since gtaskqueue_drain() was added to iflib_stop(), a kernel panic
occurs when the ice(4) driver is in recovery mode. Queues are not
initialized in this mode, so gt_taskqueue is not initialized, and
gtaskqueue_drain() will panic.
Fix this by only doing a drain if an RX queue's gt_taskqueue is
initialized.
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
Reviewed by: erj@
MFC after: 1 week
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D37892
This updated DDP is intended to be used with the forthcoming ice(4)
driver update to 1.37.7-k. (But it will still work with the current
version.)
Co-authored-by: Piotr Kubaj <pkubaj@FreeBSD.org>
Signed-off-by: Eric Joyner <erj@FreeBSD.org>
MFC after: 1 week
Sponsored by: Intel Corporation
As part of the effort to hide the internals of the ifnet struct, convert
the DEBUGNET_SET() macro to use an accessor instead of directly touching
the methods member.
Reviewed by: glebius (older version)
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38105
Hide the ifnet structure definition, no user serviceable parts inside,
it's a netstack implementation detail. Include it temporarily in
<net/if_var.h> until all drivers are updated to use the accessors
exclusively.
Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38046
Some drivers, like iflib drivers, take a 'context' argument instead of a
ifnet argument, as a single interface may have multiple contexts.
Follow this scheme by passing the context argument down. Most drivers
will likely pass 'ifp' as the context.
Reviewed by: glebius
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38102
Linux 6.2 changes the second argument of the set_acl operation to be a
"struct dentry *" rather than a "struct inode *". The inode* parameter
is still available as dentry->d_inode, so adjust the call to the _impl
function call to dereference and pass that pointer to it.
Also document that the get_acl -> get_inode_acl member name change from
commit 884a693 was an API change also introduced in Linux 6.2.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Coleman Kane <ckane@colemankane.org>
Closes#14415
Despite all optimizations, tests on actual hardware show that FreeBSD
kernel can't sleep for less then ~2us. Similar tests on Linux show
~50us delay at least from nanosleep() (haven't tested inside kernel).
It means that on very fast log device ZIL may not be able to satisfy
zfs_commit_timeout_pct block commit timeout, increasing log latency
more than desired.
Handle that by introduction of zil_min_commit_timeout parameter,
specifying minimal timeout value where additional delays to aggregate
writes may be skipped. Also skip delays if the LWB is more than 7/8
full, that often happens if I/O sizes are constant and match one of
LWB sizes. Both things are applied only if there were no already
outstanding log blocks, that may indicate single-threaded workload,
that by definition can not benefit from the commit delays.
While there, add short time moving average to zl_last_lwb_latency to
make it more stable.
Tests of single-threaded 4KB writes to NVDIMM SLOG on FreeBSD show IOPS
increase by 9% instead of expected 5%. For zfs_commit_timeout_pct of
1 there IOPS increase by 5.5% instead of expected 1%.
Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes#14418
The .align directive used to align storage locations is
ambiguous. On some platforms and assemblers it takes a byte count,
on others the argument is interpreted as a shift value. The current
usage expects the first interpretation.
Replace it with the unambiguous .balign directive which always
expects a byte count, regardless of platform and assembler.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes#14422
The .size directive used by the SET_SIZE C macro uses the special
dot symbol to calculate the size of a function. The dot symbol
refers to the current address, so for the calculation to be
meaningful the SET_SIZE macro must be placed immediately after the
end of the function the size is calculated for.
Reviewed-by: Jorgen Lundman <lundman@lundman.net>
Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes#14422
Overview:
Intel(R) QuickAssist Technology (Intel(R) QAT) provides hardware
acceleration for offloading security, authentication and compression
services from the CPU, thus significantly increasing the performance and
efficiency of standard platform solutions.
This commit introduces:
- Intel® 4xxx Series platform support.
- QuickAssist kernel API implementation update for Generation 4 device.
Enabled services: symmetric cryptography and data compression.
- Increased default number of crypto instances in static configuration
for performance purposes.
OCF backend changes:
- changed GCM/CCM MAC validation policy to generate MAC by HW
and validate by SW due to the QAT HW limitations.
Patch co-authored by: Krzysztof Zdziarski <krzysztofx.zdziarski@intel.com>
Patch co-authored by: Michal Jaraczewski <michalx.jaraczewski@intel.com>
Patch co-authored by: Michal Gulbicki <michalx.gulbicki@intel.com>
Patch co-authored by: Julian Grajkowski <julianx.grajkowski@intel.com>
Patch co-authored by: Piotr Kasierski <piotrx.kasierski@intel.com>
Patch co-authored by: Adam Czupryna <adamx.czupryna@intel.com>
Patch co-authored by: Konrad Zelazny <konradx.zelazny@intel.com>
Patch co-authored by: Katarzyna Rucinska <katarzynax.kargol@intel.com>
Patch co-authored by: Lukasz Kolodzinski <lukaszx.kolodzinski@intel.com>
Patch co-authored by: Zbigniew Jedlinski <zbigniewx.jedlinski@intel.com>
Sponsored by: Intel Corporation
Reviewed by: markj, jhb
Differential Revision: https://reviews.freebsd.org/D36254
Using '%r0' in efunc causes it to parse %r as a 'r' specifier.
This diff just adds a '%' in front of '%r0' in order to create the
correct output.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D38176
Arguments follow primaries, not the other way around.
MFC after: 1 week
Sponsored by: Klara, Inc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D38173
When dtrace starts, it tries to detect if the dtrace klds are loaded,
and if not, it loads them by loading the dtraceall kld. This module
depends on most dtrace modules, including systrace for the native
freebsd and freebsd32 ABIs. However, it does not depend on the
systrace_linux klds, as they in turn depend on the linux ABI klds, and
we don't want to load an ABI module that the user has not explicitly
requested. This can leave a naive user in a state where they think all
syscall providers have been loaded, yet linux ABI syscalls are
"invisible" to dtrace.
To fix this, check to see if the linux ABI modules are loaded. If they
are, then load their systrace klds.
Reviewed by: markj, (emaste & jhb, earlier versions)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D37986
Only 64bit architectures can be supported this way, because libcxx
defines __cxx_contention_t to be int64_t for FreeBSD, and 32bit arches
do not have a kind of UMTX_OP_WAIT_INT64_PRIVATE operation.
LLVM review: https://reviews.llvm.org/D142134
Tested by: emaste
Reviewed by: dim, emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D38132
If we receive a DRR_FREEOBJECTS as the first entry in an object range,
this might end up producing a hole if the freed objects were the
only existing objects in the block.
If the txg starts syncing before we've processed any following
DRR_OBJECT records, this leads to a possible race where the backing
arc_buf_t gets its psize set to 0 in the arc_write_ready() callback
while still being referenced from a dirty record in the open txg.
To prevent this, we insert a txg_wait_synced call if the first
record in the range was a DRR_FREEOBJECTS that actually
resulted in one or more freed objects.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: David Hedberg <david.hedberg@findity.com>
Sponsored by: Findity AB
Closes#11893Closes#14358
In the zstream code, Coverity reported:
"The argument could be controlled by an attacker, who could invoke the
function with arbitrary values (for example, a very high or negative
buffer size)."
It did not report this in the kernel. This is likely because the
userspace code stored this in an int before passing it into the
allocator, while the kernel code stored it in a uint32_t.
However, this did reveal a potentially real problem. On 32-bit systems
and systems with only 4GB of physical memory or less in general, it is
possible to pass a large enough value that the system will hang. Even
worse, on Linux systems, the kernel memory allocator is not able to
support allocations up to the maximum 4GB allocation size that this
allows.
This had already been limited in userspace to 64MB by
`ZFS_SENDRECV_MAX_NVLIST`, but we need a hard limit in the kernel to
protect systems. After some discussion, we settle on 256MB as a hard
upper limit. Attempting to receive a stream that requires more memory
than that will result in E2BIG being returned to user space.
Reported-by: Coverity (CID-1529836)
Reported-by: Coverity (CID-1529837)
Reported-by: Coverity (CID-1529838)
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14285
Introduce four new vdev properties:
checksum_n
checksum_t
io_n
io_t
These properties can be used for configuring the thresholds of zed's
diagnosis engine and are interpeted as <N> events in T <seconds>.
When this property is set to a non-default value on a top-level vdev,
those thresholds will also apply to its leaf vdevs. This behavior can be
overridden by explicitly setting the property on the leaf vdev.
Note that, these properties do not persist across vdev replacement. For
this reason, it is advisable to set the property on the top-level vdev
instead of the leaf vdev.
The default values for zed's diagnosis engine (10 events, 600 seconds)
remains unchanged.
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Allan Jude <allan@klarasystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Sponsored-by: Seagate Technology LLC
Closes#13805
In 2016, the authors of PVS Studio ran it on the FreeBSD kernel, which
identified a number of bugs / cleanup opportunities in the FreeBSD ZFS kernel
code. A few of them persist to the present day:
https://reviews.freebsd.org/D5245
Note that the scan was done against
freebsd/freebsd-src@46763fd4ca.
In particular, we have the following in free_blocks():
\sys\cddl\contrib\opensolaris\uts\common\fs\zfs\dnode_sync.c (174): error V547: Expression '__left >= __right' is always true. Unsigned type value is always >= 0.
\sys\cddl\contrib\opensolaris\uts\common\fs\zfs\dnode_sync.c (171): error V634: The priority of the '*' operation is higher than that of the '<<' operation. It's possible that parentheses should be used in the expression.
\sys\cddl\contrib\opensolaris\uts\common\fs\zfs\dnode_sync.c (175): error V547: Expression '__left >= __right' is always true. Unsigned type value is always >= 0.
A couple of assertions accidentally typecast the arguments they check to
unsigned in such a way that the result is always true. Also, parentheses
are missing around `1<<epbs` in `(db->db_blkid * 1<<epbs)`. This works
out to be okay due to multiplication not caring what order of operations
we use, but it is better to fix it to be `(db->db_blkid << epbs)`.
A few of the function local variables probably never should have been
32-bit in the first place, so we make them 64-bit. We also replace the
existing assertions with additional assertions to ensure that 64-bit
unsigned arithmetic is safe.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes#14407
.git-blame-ignore-revs lists commit hashes that should be skipped by
`git blame` e.g. non-functional whitespace or style cleanup.
The file is populated with a few sample entries.
Reviewed by: brooks, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36525
fixup ignore revs
Right now we have little visibility into packet drops within netmap.
Start trying to make packet loss issues more visible by counting queue
drops in the transmit path, and in the input path for interfaces running
in emulated mode, where we place received packets in a bounded software
queue that is processed by rxsync.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38064
The check is ok by default, since the default value of
netmap_generic_ringsize is 1024. But we should check against the
configured "ring" size.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38062
Per the removed comments these fields should be loaded only once, since
they can in principle be modified concurrently, though this would be a
violation of the userspace contract with netmap.
No functional change intended.
Reviewed by: vmaffione
MFC after: 1 week
Sponsored by: Zenarmor
Sponsored by: OPNsense
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D38061