This merge brings in a couple new files, which needed to be attached to the
build; a new dependency on <limits.h>, which must be stubbed; and a name
change in the Context parameter constants, from ZSTD_p_foo to ZSTD_c_foo.
Significantly, it fixes a kernel build error with GCC where floating-point
functions were included in the kernel build, by hiding them under the same
compile-time #ifdef that already covered their invocation. That issue was
introduced to FreeBSD in the 1.3.7 update and tracked upstream here:
https://github.com/facebook/zstd/issues/1386
The full 1.3.8 release notes can be found on Github:
https://github.com/facebook/zstd/releases/tag/v1.3.8
Relnotes: yes
If the memory size does not fit into u_long, current code truncates
the returned value and returns complete nonsense. Make the result
slightly more useful by clamping it at ULONG_MAX.
Reported and tested : pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Due to the typo, it shared the frame with the CMAP1 transient mapping.
In collaboration with: pho
MFC after: 3 days
Sponsored by: The FreeBSD Foundation (kib)
Kernel now includes jail ID when logging a process exit. jid is 0 for unjailed
processes.
Submitted by: Marie Helene Kvello-Aune <freebsd@mhka.no>
Relnotes: yes
Sponsored by: Modirum MDPay
Differential Revision: https://reviews.freebsd.org/D18618
SVN r340744 erroneously changed pfind() to return any process including
zombies and pfind_any() to return only non-zombie processes.
In particular, this caused kill() on a zombie process to fail with [ESRCH].
There is no direct test case for this but /usr/tests/bin/sh/builtins/kill1.0
occasionally triggers it (as reported by lwhsu).
Conversely, returning zombies from pfind() seems likely to violate
invariants and cause panics, but I have not looked at this.
PR: 233646
Reviewed by: mjg, kib, ngie
Differential Revision: https://reviews.freebsd.org/D18665
Mutexes in I/O path there were used twice per I/O to atomically access
several variables to close and/or destroy the device on last request
completion. I found the way to fit all required info into one integer,
suitable for atomic operations. It opened race window on device close,
but addition of timeout to the msleep() there should cover it.
Profiling shows removal of significant spinning time on those mutexes
and IOPS increase from ~600K to >800K to NVMe on 72-core systems.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Previous code typically crashed in case of NVMe device unplug or even clean
detach while some I/Os are still in flight. To fix this the new code calls
disk_gone() and waits for confirmation of all references gone before calling
disk_destroy(), freeing other resources and allowing controller detach.
While there, fix disk lists locking and reimplement unit numbers assignment.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
We need to tell vm_fault the reason for the fault was because we tried to
execute from the memory location. Without this it may return with success
as we only request read-only memory, then we return to the same location
and try to execute from the same memory address. This leads to an infinite
loop raising the same fault and returning to the same invalid location.
MFC after: 1 week
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18511
If invalid, return EINVAL. Note that inode check-hashes greatly
reduce the chance that these errors will go undetected.
Reported by: Christopher Krah <krah@protonmail.com>
Reported as: FS-5-UFS-2: Denial Of Service in nmount-3 (ffs_read)
Reviewed by: kib
MFC after: 1 week
Sponsored by: Netflix
M sys/fs/ext2fs/ext2_vnops.c
M sys/kern/vfs_subr.c
M sys/ufs/ffs/ffs_snapshot.c
M sys/ufs/ufs/ufs_vnops.c
Note that this commit brings only formatting changes that were done
during the final review of the illumos change, because FreeBSD got the
main changes before illumos.
illumos/illumos-gate@04e563565204e5635652https://www.illumos.org/issues/5882
This is an import of the temporary pool names functionality from ZoL:
e2282ef57e26b42f3f9d2f3ec9006100d2a8c92f83e9986f6e023bbe6f01
It is intended to assist the creation and management of virtual machines
that have their rootfs on ZFS on hosts that also have their rootfs on
ZFS. These situations cause SPA namespace collisions when the standard
name rpool is used in both cases. The solution is either to give each
guest pool a name unique to the host, which is not always desireable, or
boot a VM environment containing an ISO image to install it, which is
cumbersome.
MFC after: 1 week
Sponsored by: Panzura
Due to hardware errata in Aero controllers, reads to certain
fusion registers could intermittently return all zeroes.
This behavior is transient in nature and subsequent reads will return
valid value.
Fix:
For Aero controllers, any read will retry the read operations
from certain registers for maximum three times, if read returns zero.
Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
For Aero adapters-
1. Driver will use 32 bit atomic descriptor to fire IOs and DCMDs.
2. Driver will use 64 bit request descriptor to fire IOC INIT.
3. If Aero firmware supports 32 bit atomic descriptor, then only driver will use it
otherwise driver will use 64 bit request descriptor.
For rest of adapters(Ventura, Invader and Thunderbolt), driver will use 64 bit request
descriptors only.
Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
Driver will throw a warning message when a Configurable secure type controller is
encountered.
Submitted by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
Due to HW Errta on Aero/Sea A0 chipset on secure boot mode & on heavy IO load,
sometimes read operation on MPT Fusion registers will give zero value,
So, as a workaround driver will retry the MPT Fusion register
read operation for max three times upon reading zero value form these
registers.
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
Enable atomic type descriptor support only for Sea & Aero cards,
due to HW errata this atomic descriptor support has to be disabled
on Ventura cards.
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
Added deviceID's for Sea,Aero to mpr Driver
Aero:
0x00E0 Invalid
0x00E1 Configurable Secure
0x00E2 Hard Secure
0x00E3 Tampered
Sea:
0x00E4 Invalid
0x00E5 Configurable Secure
0x00E6 Hard Secure
0x00E7 Tampered
For Tampered & Invalid type cards, driver will claim the device & quit the probe function with below error message,
"HBA is in Non Secure mode"
for Configurable Secure type cards, driver will display below message in .probe() callback function,
"HBA is in Configurable Secure mode"
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
Following list of changes done in the driver as a part of TM handling on the NVMe drives.
Below changes are only applicable on NVMe drives and only when custom NVMe TM handling bit is set to zero by IOC.
1. Issue LUN reset & Target reset TMs with Target reset method field set to Protocol Level reset (0x3),
2. For LUN & target reset TMs use the timeout value as ControllerResetTO value provided by firmware using PCie Device Page 0,
3. If LUN reset fails to terminates the IO then directly escalate to host reset instead of going for target reset TM,
4. For Abort TM use the timeout value as NVMeAbortTO value given by the IOC using Manufacturing Page 11,
5. Log message "PCie Host Reset failed" message up on receiving P
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
typedef struct mps_pass_thru
{
uint64_t PtrRequest;
uint64_t PtrReply;
uint64_t PtrData;
uint32_t RequestSize;
uint32_t ReplySize;
uint32_t DataSize;
uint32_t DataDirection;
uint64_t PtrDataOut;
uint32_t DataOutSize;
uint32_t Timeout;
} mps_pass_thru_t, * ptrmpssas_pass_thru_t;
In the above mps_pass_thru structure; Application expects PrtReply buffer
should contain both MPI reply followed by sense data. So, updated driver
to copy sense data at PtrReply + sizeof(MPI2 reply) location where
application wants the driver to copy back the sense data info.
Submitted by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed by: Kashyap Desai <Kashyap.Desai@broadcom.com>
Approved by: ken
MFC after: 3 days
Sponsored by: Broadcom Inc
This value remained unchanged for 15 years, and now this bump reduces
lock spinning in GEOM and BIO layers while doing ~1.6M IOPS to 4 NVMe
on 72-core system from ~25% to ~5% by the cost of additional 28KB RAM.
While there, align struct mtx_pool fields to cache lines.
MFC after: 1 month
CAM does not require SIM lock since FreeBSD 10.4, and NVMe code never
required it at all, using per-queue locks instead. This formally allows
parallel request submission in CAM mode as much as single per-device and
per-queue locks of CAM allow.
MFC after: 1 month
The interesting thing is that looking through Darren's commit logs,
the line containing an extern ppsratecheck() definition was removed
from the v5-1-RELEASE branch but not from HEAD (I have taken his
CVS tree and converted it to GIT). There is a commit adding an
additional #if defined to the empty block. I can only assume that
this was intentional for something later. Looking through HEAD the
extern ppsratecheck() is there. However if we put it back it would
conflict with a static ppsratecheck() definition in fil.c when
building ipftest.
Therefore we remove this empty block.
ppsratecheck() is a function in the FreeBSD kernel. However ipftest
cannot call the ppsratecheck() in the kernel. Therefore one exists in
fil.c for use when building the userland ipftest utility which
approximates the packet filter in userland for testing of ipfilter
rules against packets captured with tcpdump.
MFC after: 1 week
The vnode is not opened, so it ends up with the malloced buffers otherwise.
Reported and tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The presence of allocated v_object does not imply that the buffer is
necessary VMIO kind. Buffer might has been allocated before the
object created, then the buffer is malloced. Although we try to avoid
such situation, it seems to be still legitimate.
Reported and tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
framework is available. pfil(9) has been in FreeBSD since FreeBSD 5
and according to svn log was first committed to HEAD in 2000, therefore
it is safe to say the check is no longer needed in FreeBSD.
pfil(9) first appeared in NetBSD 1.3 (hence the name NETBSD_PF).
Therefore it is safe to say that it is supported by every NetBSD system
today. The framework also exists in illumos.
As ipfilter code is shared and exchanged between FreeBSD and NetBSD, and
at some point in the future illumos too, and as all three platforms have
pfil(9), the redundant NETBSD_PF #defines and #ifdefs are removed.
MFC after: 1 week
g_io_deliver() finishing initialization of the bio, but g_io_deliver()
actually destroys the bio. INVARIANTS makes the bug obvious by
overwriting the bio with garbage.
Restore the old order for calling devstat (except don't restore not calling
it for the error case), and translate to the devstat KPI so that this order
works.
Reviewed by: kib
To check if txsync can be skipped, it is necessary to look for
unseen TX space. However, this means comparing ring->cur
against ring->tail, rather than ring->head against ring->tail
(like nm_ring_empty() does).
This change also adds some more comments to explain the optimization
performed at the beginning of netmap_poll().
MFC after: 3 days
Sponsored by: Sunny Valley Networks
The bug was introduced by r339639, although it is present in the upstream
netmap code since 2015. It is due to resetting the want_rx variable to
POLLIN, rather than resetting it to POLLIN|POLLRDNORM.
It only affects select(), which uses POLLRDNORM. poll() is not affected,
because it uses POLLIN.
Also, it only affects FreeBSD, because Linux skips the optimization
implemented by the piece of code where the bug occurs.
MFC after: 3 days
Sponsored by: Sunny Valley Networks
clustering is not done. The bug caused extreme slowness for large
files in some cases.
There is no way to tell VOP_BMAP() how many blocks are wanted, so for
all file systems it has to waste time in some cases by searching for
more contiguous blocks than will be accessed. For msdosfs, it also
clobbered the fatchain cache in these cases by advancing the cache to
point to the chain entry for block that won't be read. This makes
the cache useless for the next sequential i/o (or VOP_BMAP()), so the
fat chain is searched from the beginning. The cache only has 1 relevant
entry, so it is similarly useless for random i/o.
Fix this by only advancing the cache to point to the chain entry for
the first block that will be read. Clustering uses results from
VOP_BMAP(), so when more than 1 block is read by clustering, the cache
is not advanced as optimally as before, but it is at most 1 cluster
size behind and searching the chain through the blocks for this cluster
doesn't take too long.
Add a generic mechanism to override mp?_wait_command's timeout behavior,
which continues to invoke reinit by default. Invokers who set
cm_timeout_handler may avoid automatic reinit and do their own handling.
Adapt mp?sas_get_sata_identify to this mechanism and remove its callout
hack.
Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18614
In the event that the ID command timed out, mps(4)/mpr(4) did not free the
command until it could be cancelled. However, it freed the associated
buffer (cm_data). Fix the lifetime issue by freeing the associated buffer
only after Abort Task or controller reset.
Reviewed by: scottl
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D18612