Quiet a variety of Wwrite-strings warnings in sys/kern at low-impact
sites. This patch avoids addressing certain others which would need to
plumb const through structure definitions.
Reviewed by: kib, markj
Differential Revision: https://reviews.freebsd.org/D23798
If node attribute returned in the reply for read rpc indicate
truncation, and it happens that the vnode is exclusively locked,
update of the node attributes would try to shrink vnode size. Since
during the read some vnode pages were busied by the reading thread,
vnode_pager_setsize() deadlocks waiting for the busy state owned by
the caller.
Use a thread-local flag to indicate that NFS read owns some (s)busy
pages states and postpone the call to vnode_pager_setsize() until the
thread relinguishes the ownership.
Diagnosed by: rlibby
Tested by: pho, rlibby
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This provides the needed hint to GCC and offers an annotation for readers to
observe that it's in-fact impossible to hit this point. We'll get hit with a
a -Wswitch error if the enum applicable to the switch above were to get
expanded without the new value(s) being handled.
MACHINE_ARCH sets the hw.machine_arch sysctl in the kernel. In userspace
it sets MACHINE_ARCH in bmake, which bsd.cpu.mk uses to configure the
target ABI for ports.
For riscv64sf builds (i.e. soft-float) that needs to be riscv64sf, but
the sysctl didn't reflect that. It is static.
Set the define from the riscv makefile so that we correctly reflect our
actual build (i.e. riscv64 or riscv64sf), depending on what TARGET_ARCH
we were built with.
That still doesn't satisfy userspace builds (e.g. bmake), so check if
we're building with a software-floating point toolchain there. That
check doesn't work in the kernel, because it never uses floating point.
Reviewed by: philip (previous version), mhorne
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D23741
We've grown to also require libthr and libprivatestd to be explicitly linked
in here, so do this now to fix freebsd-wifi-build.
Submitted by: Pavel Timofeev <timp87 gmail com>
This enables very cheap read sections with free-to-use latencies and memory
overhead similar to epoch. On a recent AMD platform a read section cost
1ns vs 5ns for the default SMR. On Xeon the numbers should be more like 1
ns vs 11. The memory consumption should be proportional to the product
of the free rate and 2*1/hz while normal SMR consumption is proportional
to the product of free rate and maximum read section time.
While here refactor the code to make future additions more
straightforward.
Name the overall technique Global Unbound Sequences (GUS) and adjust some
comments accordingly. This helps distinguish discussions of the general
technique (SMR) vs this specific implementation (GUS).
Discussed with: rlibby, markj
Specifically, any system with a 32-bit size_t; -residue is calculated as a
32-bit *then* promoted to the 64-bit off_t and the result is ultimately
wrong. This resulted in what would appear to be truncated output, as only
the first line would be read.
Correct it by just making residue an off_t to begin with, since this is what
lseek will take anyways.
Reported by: antoine, dim
Triaged by: cem
Tested by: kevans
X-MFC-With: r358152
ptbl_alloc() is expected to return with the pvh_global_lock and pmap
lock held. However, it will return with them unlocked if nosleep is
specified.
Along with this, fix lock ordering of pvh_global_lock with respect to
the pmap lock in other places.
Differential Revision: https://reviews.freebsd.org/D23692
handler to accept a poitner to a u_int. To make the type of the softc flags
stable and defined, make it a u_int. Cast the enum types to u_int for arg2 so
when passing to dabitsysctl it's a u_int.
Noticed by: emax@
Differential Revision: https://reviews.freebsd.org/D23785
In the successful case, sockshost is not freed prior to return.
The failure case can now be hit after fetch_reopen(), which was not true
before. Thus, we need to make sure to clean up all of the conn resources
which will also close sd. For all of the points prior to fetch_reopen(), we
continue to just close sd.
CID: 1419598, 1419616
This is SVM features word, the bit is defined in "PPR for AMD Family
17h Model 31h B0", document 55803 Rev 0.54.
N.B. GuesSpecCtl (no 't') is the spelling from the document.
Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after: 3 days
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.
Mark all low hanging fruits as MPSAFE.
Reviewed by: markj
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23626
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.
Mark all nodes in pf, pfsync and carp as MPSAFE.
Reviewed by: kp
Approved by: kib (mentor, blanket)
Differential Revision: https://reviews.freebsd.org/D23634
The index should be computed as distance from arg[0] and not
the beginning of struct mlx5_ib_congestion .
While at it fix a use of zero length array to avoid depending
on undefined compiler behaviour.
MFC after: 1 week
Sponsored by: Mellanox Technologies
While at it update the sysctl(9) description for the lid state.
Differential Revision: https://reviews.freebsd.org/D23724
PR: 240881
Submitted by: Yuri Pankov <yuripv@yuripv.me>
MFC after: 1 week
Sponsored by: Mellanox Technologies
For drmkpi (D23085) we don't want the Linux struct file as we don't emulate
everything. Also the prototypes should be in shmem_fs.h to have 100%
compatibility with Linux.
Reviewed by: hselasky
MFC after: Maybe
Differential Revision: https://reviews.freebsd.org/D23764
Duplicating the work was putting an avoidable requirement that the filedesc
lock is held across the entire operation (otherwise by the time audit reads
vnode pointers another thread in the same process can chdir somewhere else,
making audit log things using different vnode than the one which will be
used for actual lookup).
Do the obvious thing and pass down vnodes which will be used.
If the configured compression level for kernel dumps
it outside the supported range, clamp it to the closest
supported level. Previously, dumpon would fail.
zstd already does this internally, so the compressor
needs no change.
Reviewed by: cem markj
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23765
Update the man page to mention that extending a file with truncate(2)
is required by POSIX as of 2008.
Reviewed by: bcr
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D23354
Since we don't set opterr to 0, getopt prints a message when it
encounters an unknown/invalid option. We therefore don't need to
print our own message in the default handler.
Reviewed by: kevans, theraven
Differential Revision: https://reviews.freebsd.org/D23662
Currently, the size of the swap device is unconditionally reported using
blocks, even if -h has been used.
- While here, switch to CONVERT_BLOCKS() instead of CONVERT() which will
avoid overflowing size counters (in human readable form see: r196244)
- Update the column headers to reflect that a size is being reported instead
of the block size units being used
Before:
$ swapinfo
Device 1K-blocks Used Avail Capacity
/dev/gpt/swapfs 1048576 0 1048576 0%
$
After:
$ swapinfo -h
Device Size Used Avail Capacity
/dev/gpt/swapfs 1.0G 0B 1.0G 0%
$
Differential Revision: https://reviews.freebsd.org/D23758
Reviewed by: kevans
MFC after: 3 weeks
This patch adds a new netbe_peek_recvlen() function to the net
backend API. The new function allows the virtio-net receive code
to know in advance how many virtio descriptors chains will be
needed to receive the next packet. As a result, the implementation
of the virtio-net mergeable rx buffers feature becomes efficient,
so that we can enable it also with the tap(4) backend. For the
tap(4) backend, a bounce buffer is introduced to implement the
peeck_recvlen() callback, which implies an additional packet copy
on the receive datapath. In the future, it should be possible to
remove the bounce buffer (and so the additional copy), by
obtaining the length of the next packet from kevent data.
Reviewed by: grehan, aleksandr.fedorov@itglobal.com
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23472
When we register an interrupt handler we need to pass the intr_type along in
bus_setup_intr().
The interrupt type matters because it is used to decide if we need to enter
NET_EPOCH. That meant that vtmmio-based if_vtnet did not, which led to panics
with INVARIANTS set.
Sponsored by: Axiado
This function test if the string str begins with the string pointed
at by prefix.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23767
This function just test if the element is the first of the list.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23766
realpath(3) is used a lot e.g., by clang and is a major source of getcwd
and fstatat calls. This can be done more efficiently in the kernel.
This works by performing a regular lookup while saving the name and found
parent directory. If the terminal vnode is a directory we can resolve it using
usual means. Otherwise we can use the name saved by lookup and resolve the
parent.
See the review for sample syscall counts.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23574
On machines with SMAP, fueword executes two serializing instructions
which can be seen in microbenchmarks.
As a measure to restore microbenchmark numbers, only read the word on
the attempt to deliver signal in ast(). If the word is set, signal is
not delivered and word is kept, preventing interruption of
interruptible sleeps by signals until userspace calls
sigfastblock(UNBLOCK) which clears the word.
This way, the spurious EINTR that userspace can see while in critical
section is on first interruptible sleep, if a signal is pending, and
on signal posting. It is believed that it is not important for rtld
and lbithr critical sections. It might be visible for the application
code e.g. for the callback of dl_iterate_phdr(3), but again the belief
is that the non-compliance is acceptable. Most important is that the
retry of the sleeping syscall does not interrupt unless additional
signal is posted.
For now I added the knob kern.sigfastblock_fetch_always to enable the
word read on syscall entry to be able to diagnose possible issues due
to spurious EINTR.
While there, do some code restructuting to have all sigfastblock()
handling located in kern_sig.c.
Reviewed by: jeff
Discussed with: mjg
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D23622
Move IPv6 source address checks from after extension header heandling
to the top of the function. If we do not pass these checks there is
no reason to do a lot of work upfront.
Fold extension header preparations and length calculations together into
a single branch and macro rather than doing them sequentially.
Likewise move extension header concatination into a single branch block
only doing it if we recorded any extension header length length.
Reviewed by: melifaro (earlier version), markj, gallatin
Sponsored by: Netflix (partially, originally)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23740