Kegs with no items reserved have uk_reserve = 0. So the check
keg->uk_reserve >= dom->ud_free_items will be true once all slabs are
depleted. Then, rather than go and allocate a fresh slab, we return to
the cache layer.
The intent was to do this only when the keg actually has a reserve, so
modify the check to verify this first. Another approach would be to
make uk_reserve signed and set it to -1 until uma_zone_reserve() is
called, but this requires a few casts elsewhere.
Fixes: 1b2dcc8c54a8 ("uma: Avoid depleting keg reserves when filling a bucket")
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32516
M_USE_RESERVE is used in a couple of places in the VM to avoid unbounded
recursion when the direct map is not available, as is the case on 32-bit
platforms or when certain kernel sanitizers (KASAN and KMSAN) are
enabled. For example, to allocate KVA, the kernel might allocate a
kernel map entry, which might require a new slab, which requires KVA.
For these zones, we use uma_prealloc() to populate a reserve of items,
and then in certain serialized contexts M_USE_RESERVE can be used to
guarantee a successful allocation. uma_prealloc() allocates the
requested number of items, distributing them evenly among NUMA domains.
Thus, in a first-touch zone, to satisfy an M_USE_RESERVE allocation we
might have to check the slab lists of other domains than the current one
to provide the semantics expected by consumers.
So, try harder to find an item if M_USE_RESERVE is specified and the keg
doesn't have anything for current (first-touch) domain. Specifically,
fall back to a round-robin slab allocation. This change fixes boot-time
panics on NUMA systems with KASAN or KMSAN enabled.[1]
Alternately we could have uma_prealloc() allocate the requested number
of items for each domain, but for some existing consumers this would be
quite wasteful. In general I think keg_fetch_slab() should try harder
to find free slabs in other domains before trying to allocate fresh
ones, but let's limit this to M_USE_RESERVE for now.
Also fix a separate problem that I noticed: in a non-round-robin slab
allocation with M_WAITOK, rather than sleeping after a failed slab
allocation we simply try again. Call vm_wait_domain() before retrying.
Reported by: mjg, tuexen [1]
Reviewed by: alc
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32515
Pick up changes in option dependencies (WITHOUT_OPENSSL and WITHOUT_CXX)
and the addition of WITH_DETECT_TZ_CHANGES and WITH_LLVM_BINUTILS.
Sponsored by: The FreeBSD Foundation
In 7ec86b6609912 ("Also print symbols when printing arm64 registers")
a new function was created to print most registers. Unfortunately the
Link Register (LR) was being printed when we should have printed the
Exception Link Register (ELR).
Fix this by adding the missing 'e'.
Sponsored by: The FreeBSD Foundation
Use types from sys/stat.h for the filesystem and inode numbers for extra
safety.
PR: 259504
Reported by: Markus Wild <freebsd-bugs@virtualtec.ch>
MFC after: 1 week
The iicoc driver supports the OpenCores I2C IP. This is included in at
least the SiFive "Unleashed" and "Unmatched" cores and probably others.
Suggested by: jrtc27
Only build on RISC-V for now, since we're not aware of any other cores
with this IP supported by FreeBSD.
Reviewed by: jrtc27, philip
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D32737
Mark Murray graciously provided access to an aarch64 system
to test the ld128 implementations. This patch address
* Misuses of copysignl() in sinpil() and tanpil().
* Redo the splitting of argument 'x' into an integer part and
remainder. The remainder must satify 0 <= r < 1.
* Update the reduction of the integer part to something that can
easily be seen as even or odd, e.g., sin(pi*x) = (-1)^n*sin(pi*r)
with n <= 2^112 and we an reduce n by subtracting integer powers
of 2.
* In s_cospil.c, fix typos where 'x' is used where 'ax', the
remainder, is required.
* In tanpil(), fix the use of an uninitialized variable, ax = fabsl(ax),
ax should be x in fabsl().
One item of note, in the limited tested on aarch64, the max ULP
for sinpil() and cospil() were less than 1.1 ULP, which is higher
that the desired max ULP less than 1. This was traced to the
kernel for cosl() in the fundamental interval [0,pi/4].
The coefficients in the minmax polynomial likely need refinement.
PR: 218514
MFC after: 1 week
For pNFS servers that specify that Layouts are to be returned
upon close, they may expect that LayoutReturn to happen before
the associated Close.
This patch modifies the NFSv4.1/4.2 pNFS client so that this
is done. This only affects a pNFS mount against a non-FreeBSD
NFSv4.1/4.2 server that specifies return_on_close in LayoutGet
replies.
Found during a recent IETF NFSv4 working group testing event.
MFC after: 2 weeks
Add a void *ni_drv_data field to struct ieee80211_node that drivers
can use to backtrack to their internal state from a net80211 node.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
X-Differential Revision: https://reviews.freebsd.org/D30654 (abandoned)
Use proc_get_binpath() to get the hardlink right.
PR: 248184
Reviewed by: emaste, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D32738
Similar to commit 2be417843a, I believe there could be a race between
the NFS client VOP_LOOKUP() and file Writing that could result in stale
file attributes being loaded into the NFS vnode by VOP_LOOKUP().
I have not been able to reproduce a failure due to this race, but
I believe that there are two possibilities:
The Lookup RPC happens while VOP_WRITE() is being executed and loads
stale file attributes after VOP_WRITE() returns when it has already
completed the Write/Commit RPC(s).
--> For this case, setting the local modify timestamp at the end of
VOP_WRITE() should ensure that stale file attributes are not loaded.
The Lookup RPC occurs after VOP_WRITE() has returned, while
asynchronous Write/Commit RPCs are in progress and then is
blocked by the vnode held by VOP_OPEN/VOP_CLOSE/VOP_FSYNC which
will flush writes via ncl_flush() or ncl_vinvalbuf(), clearing the
NMODIFIED flag (which indicates Writes-in-progress). The VOP_LOOKUP()
then acquires the NFS vnode lock and fills in stale file attributes.
--> Setting the local modify timestamp in ncl_flsuh() and ncl_vinvalbuf()
when they clear NMODIFIED should ensure that stale file attributes
are not loaded.
This patch does the above.
PR: 259071
Reviewed by: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32677
Commit 2be417843a04 added n_localmodtime, which is used by Lookup
and ReaddirPlus to check to see if the file attributes in an RPC
reply might be stale. This patch sets n_localmodtime in Deallocate.
Done as a separate commit, since Deallocate is not in stable/13.
PR: 259071
Reviewed by: asomers
Differential Revision: https://reviews.freebsd.org/D32635
Testing with it, there appears to be a race between Lookup
and VOPs like Setattr-of-size, where Lookup ends up loading
stale attributes (including what might be the wrong file size)
into the NFS vnode's attribute cache.
The race occurs when the modifying VOP (which holds a lock
on the vnode), blocks the acquisition of the vnode in Lookup,
after the RPC (with now potentially stale attributes).
Here's what seems to happen:
Child Parent
does stat(), which does
VOP_LOOKUP(), doing the Lookup
RPC with the directory vnode
locked, acquiring file attributes
valid at this point in time
blocks waiting for locked file does ftruncate(), which
vnode does VOP_SETATTR() of Size,
changing the file's size
while holding an exclusive
lock on the file's vnode
releases the vnode lock
acquires file vnode and fills in
now stale attributes including
the old wrong Size
does a read() which returns
wrong data size
This patch fixes the problem by saving a timestamp in the NFS vnode
in the VOPs that modify the file (Setattr-of-size, Allocate).
Then lookup/readdirplus compares that timestamp with the time just
before starting the RPC after it has acquired the file's vnode.
If the modifying RPC occurred during the Lookup, the attributes
in the RPC reply are discarded, since they might be stale.
With this patch the test program works as expected.
Note that the test program does not fail on a July stable/12,
although this race is in the NFS client code. I suspect a
fairly recent change to the name caching code exposed this
bug.
PR: 259071
Reviewed by: asomers
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D32635
Note that this is largely untested at this point, as was
the previous version; I'm committing this mostly to get
rid of `struct linux_pt_reg`.
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D32735
In 6e66030c4c0, additional ptracestop was added in order
to implement PTRACE_EVENT_EXEC. Make it only apply to cases
where the debugger is a Linux processes; native FreeBSD
debuggers can trace Linux processes too, but they don't
expect that additonal ptracestop.
Fixes: 6e66030c4c0
Reported By: kib
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D32726
The man page states that "-t %+" prints time information in the same
format as date with no format specifier.
This was not the case, the format used was always that of date for the
POSIX locale.
The fix suggested by the reporter leads to output that matches the
documentation.
Reported by: Jamie Landeg-Jones <jamie@catflap.org>
MFC after: 3 days
Commit 5e5ca4c8fc53 added a NFSMNTP_DELEGISSUED flag to indicate when
a delegation has been issued to the mount. For the common case
where an NFSv4 server is not issuing delegations, this flag
can be checked to avoid acquisition of the NFSCLSTATEMUTEX.
This patch adds checks for NFSMNTP_DELEGISSUED being set
to two more functions.
This change appears to be performance neutral for a small number
of opens, but should reduce lock contention for a large number of opens
for the common case where server is not issuing delegations.
MFC after: 2 week
If we get a Sacked peer with an MTU change we can retransmit forever if the
last bytes are sacked and the client goes away (think power off). Then we
never see the end condition and continually retransmit.
Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32671
bwsrawdata() is supposed to return the string buffer.
PR: 259451
Reported by: sigsys@gmail.com
Fixes: d053fb22f6d3 ("usr.bin/sort: Avoid UBSan errors")
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Previously it returned a shorter struct. I can't find any
modern software that uses it, but tests/ptrace from strace(1)
repo complained.
Differential Revision: https://reviews.freebsd.org/D32601
This fixes ./waitid.gen.test from the strace(1) test suite.
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D32617
libatf-c++ requires C++ support.
From jrtc27: bit slightly odd this isn't gated by MK_TESTS (which itself
depends on MK_CXX), but this makes sense given the current behaviour.
Reported by: Michael Dexter, Build Option Survey
Reviewed by: imp, jrtc27
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32732
OFED, OPENMP, and PMC depend on C++ support. Force them off when
building WITHOUT_CXX.
Reported by: Michael Dexter, Build Option Survey
Reviewed by: imp, jrtc27
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32730
In the past we built the sanitizer runtimes when building Clang
(and using Clang as the compiler) but 7676b388adbc changed this to
be conditional only on using Clang, to make the runtimes available
for external Clang.
They fail to build when WITHOUT_CXX is set though, so add MK_CXX
as part of the condition.
Reported by: Michael Dexter, Build Option Survey
Reviewed by: imp, jrtc27
Fixes: 7676b388adbc ("Always build the sanitizer runtimes...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32731
Translate ERESTART into Linux "internal" errno ERESTARTSYS.
This fixes the erestartsys.gen.test from strace(1).
Reviewed By: kib
Sponsored By: EPSRC
Differential Revision: https://reviews.freebsd.org/D32623
The pic_* interface was used.
Only edge interrupts are supported by this controller.
Driver mutex had to be converted to a spin lock so that it can
be used in the interrupt filter context.
Two types of intr_map_data are supported - INTR_MAP_DATA_GPIO and
INTR_MAP_DATA_FDT. This way interrupts can be allocated using the
userspace gpio interrupt allocation method, as well as directly from
simplebus. The latter can be used by devices that have its irq routed
to a GPIO pin.
Obtained from: Semihalf
Sponsored by: Alstom Group
Differential revision: https://reviews.freebsd.org/D32587
Driver polls status of all PHYs connected to the switch in a
fixed interval.
Add a sysctl that allows to control frequency of that.
The value is expressed in ticks and defaults to "hz", or 1 second.
Obtained from: Semihalf
Sponsored by: Alstom Group
It was previously used by felix(4) for PHY communication.
Since that is not the case anymore this driver is now left unused.
Obtained from: Semihalf
Sponsored by: Alstom Group
Previously we would use an external MDIO device found on the PCI bus.
Switch to using MDIO mapped in a separate BAR of the switch device.
It is much easier this way since we don't have to depend on another
driver anymore.
Obtained from: Semihalf
Sponsored by: Alstom Group
Some BIOSes protect memory region they reside in by using DMAR to
prevent devices from doing any DMA transactions to that part of RAM.
AMI refers to this as "DMA Control Guarantee".
Disable the protection when address translation is enabled.
I stumbled upon this while investigation a failing coredump on a device
which has this feature enabled.
Sponsored by: Stormshield
Obtained from: Semihalf
Reviewed by: kib
Differential revision: https://reviews.freebsd.org/D32591
In some cases we might have to create DMAR context before the
corresponding device has been enumerated by the PCI bus.
In that case we get called with NULL dev, because of that trying
to reserve PCI regions causes a NULL pointer dereference in
pci_find_pcie_root_port.
Sponsored by: Stormshield
Obtained from: Semihalf
MFC after: 2 weeks
Reviewed by: kib, rlibby
Differential revision: https://reviews.freebsd.org/D32589
Turns out that if a peer sends in a window update right after rack fires off
a persists probe, we can mis-interpret the window update and calculate
a bogus RTT (very short). We still process the window update and send
the data but we incorrectly generate an RTT. We should be only doing
the RTT stuff if the rwnd is still small and has not changed.
Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D32717
Geom utilities (geli(8), glabel(8), gmirror(8), gpart(8), gmirror(8),
gmountver(8), etc) all use the geom(8) utility as their back end
to process their commands and pass them into the kernel. Creating
a new utility requires no more than filling out a template describing
the commands and arguments that the utility supports. Consider the
specification for the very simple gmountver(8) utility:
struct g_command class_commands[] = {
{ "create", G_FLAG_VERBOSE | G_FLAG_LOADKLD, NULL,
{
G_OPT_SENTINEL
},
"[-v] prov ..."
},
{ "destroy", G_FLAG_VERBOSE, NULL,
{
{ 'f', "force", NULL, G_TYPE_BOOL },
G_OPT_SENTINEL
},
"[-fv] name"
},
G_CMD_SENTINEL
};
It has just two commands of its own: "create" and "destroy" along
with the four standard commands "list", "status", "load", and
"unload" provided by the base geom(8) utility. The base geom(8)
utility allows each command to use the G_FLAG_VERBOSE flag to specify
that a command should accept the -v flag and when the -v flag is
given the utility prints "Done." if the command completes successfully.
In the above example, both of the commands set the G_FLAG_VERBOSE,
so have the -v option available. In addition the "destroy" command
accepts the -f boolean flag to force the destruction.
If the "destroy" command wanted to also print out verbose information,
it would need to explicitly declare its intent by adding a line:
{ 'v', "verbose", NULL, G_TYPE_BOOL },
Before this change, the geom utility would silently ignore the above
line in the configuration file, so it was impossible for the utility
to know that the -v flag had been set on the command. With this
change a geom command can explicitly specify a -v option with a
line as given above and handle it as it would any other option. If
both a -v option and G_FLAG_VERBOSE are specified for a command
then both types of verbose information will be output when that
command is run with -v.
MFC after: 1 week
Sponsored by: Netflix
PASN requires CRYPT and when built WITHOUT_CRYPT buildworld
fails. Only enable PASN when MK_CRYPT is enabled (default).
PR: 259517
Reported by: emaste
Fixes: c1d255d3ffdbe447de3ab875bf4e7d7accc5bfc5
MFC after: 1 week