Now, vm_page_busy_sleep() expects the page's object to be locked.
vm_object_unwire() does some unusual lazy locking of the object chain
and keeps objects locked until a busy page is encountered or the loop
terminates. When a busy page is encountered, rather than unlocking all
but the "bottom-level" object, we must instead skip the object to which
"tm" belongs.
Reported and tested by: pho
Reviewed by: kib
Discussed with: jeff
Sponsored by: Intel, Netflix
Differential Revision: https://reviews.freebsd.org/D21790
No functional change intended.
The intent is to add a "legacy" e820 pmem newbus bus for nvdimm device in a
subsequent revision, and it's a little more clear if the parent buses get
independent source files.
Quite a lot of ACPI-specific logic is left in nvdimm.c; disentangling that
is a much larger change (and probably not especially useful).
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21813
This will allow feature control bits (e.g. for ASLR, PROT_MAX) to be
inspected or modified.
Some clean-up and additional work is likely still required, but we can
iterate on this in the tree.
Submitted by: Bora Özarslan <borako.ozarslan@gmail.com>
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19290
When a VFS option passed to nmount is present but NULL the kernel will
place an empty option in its internal list. This will have a NULL
pointer and a length of 0. When we come to read one of these the kernel
will try to load from the last address of virtual memory. This is
normally invalid so will fault resulting in a kernel panic.
Fix this by checking if the length is valid before dereferencing.
MFC after: 3 days
Sponsored by: DARPA, AFRL
promoted to ints).
- `mode_t` is `uint16_t` (`sys/sys/_types.h`)
- `openat` takes variadic args
- variadic args cannot be 16-bit, and indeed the code uses int
- the manpage currently kinda implies the argument is 16-bit by saying `mode_t`
Prompted by Rust things: https://github.com/tailhook/openat/issues/21
Submitted by: Greg V at unrelenting
Differential Revision: https://reviews.freebsd.org/D21816
The value of struct_kernel_stat64_sz introduced by review D5021 for
RISC-V was incorrect.
Also add a __riscv_xlen == 64 conditional as the 32-bit ABI is not yet
finalized.
Submitted by: Luís Marques
Differential Revision: https://reviews.freebsd.org/D21684
Those functions are used by kernel, and we can't check all possible argument
errors in production kernel. Plus according to docs many of those errors
are checked by hardware. Assertions should just help with code debugging.
MFC after: 2 weeks
from the command line. Prior to this the functionality was mostly there
however since the pool type (-t) was not recognized by the -A and -R
command options -- not recognized by getopt(). Additionally the code to
implement the dynamic add and removal of pools didn't work.
When dynamically adding (-A) a pool a type (-t) to specify if the pool
is a tree or hash pool must be specified. When dynamically removing (-R)
a pool, omitting -t will cause a search-and-destroy which will remove
both types of pools matching the name given (-m).
PR: 218433
MFC after: 1 week
conflicts with the command option of the same name (also -R).
Remove the superfluous and confusing non-global non-command -R option.
PR: 218433
MFC after: 1 week
Only a role of "ipf" is currentlysupported as the other documented
(and undocumented) roles are #ifdef'd out.
The plan is to complete ippool(8) as it is even in its current state
a powerful feature/tool.
PR: 218433
MFC after: 1 month
use epoch don't need Makefile tweaks.
The downside is that any developer who wants EPOCH_TRACE needs to
rebuild kernel in full, but that's fine.
Reviewed by: imp
exec_map_first_page() would unconditionally free an unbacked, invalid
page from the executable image. However, it is possible that the page
is wired, in which case it is incorrect to free the page, so check for
additional wirings first.
Reported by: syzkaller
Tested by: pho
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21767
Add an atomic shm rename operation, similar in spirit to a file
rename. Atomically unlink an shm from a source path and link it to a
destination path. If an existing shm is linked at the destination
path, unlink it as part of the same atomic operation. The caller needs
the same permissions as shm_unlink to the shm being renamed, and the
same permissions for the shm at the destination which is being
unlinked, if it exists. If those fail, EACCES is returned, as with the
other shm_* syscalls.
truss support is included; audit support will come later.
This commit includes only the implementation; the sysent-generated
bits will come in a follow-on commit.
Submitted by: Matthew Bryan <matthew.bryan@isilon.com>
Reviewed by: jilles (earlier revision)
Reviewed by: brueffer (manpages, earlier revision)
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21423
syn cache overflows. Whether this is due to an attack or due to the system
having more legitimate connections than the syn cache can hold, this
situation can quickly impact performance.
To make the system perform better during these periods, the code will now
switch to exclusively using cookies until the syn cache stops overflowing.
In order for this to occur, the system must be configured to use the syn
cache with syn cookie fallback. If syn cookies are completely disabled,
this change should have no functional impact.
When the system is exclusively using syn cookies (either due to
configuration or the overflow detection enabled by this change), the
code will now skip acquiring a lock on the syn cache bucket. Additionally,
the code will now skip lookups in several places (such as when the system
receives a RST in response to a SYN|ACK frame).
Reviewed by: rrs, gallatin (previous version)
Discussed with: tuexen
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21644
rather than indirectly through the backpointer to the tcp_syncache
structure stored in the hashtable bucket.
This also allows us to remove the requirement in syncookie_generate()
and syncookie_lookup() that the syncache hashtable bucket must be
locked.
Reviewed by: gallatin, rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21644
use of this parameter was removed in r313330. This commit now removes
passing this now-unused parameter.
Reviewed by: gallatin, rrs
Sponsored by: Netflix, Inc.
Differential Revision: https://reviews.freebsd.org/D21644
Introduce a new add_off_t static function that exits with an error
message if there's an overflow, otherwise returns their sum. Use this
when adding values obtained from the input patch.
Reviewed by: delphij, allanjude (earlier)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7897
We have two ways to check if kenv variable exists - either we check return
value from TUNABLE_INT_FETCH, or we pre-initialize the variable and check
if this value did change. In terminal_init() it is more convinient to
use pre-initialized variables.
Problem was revealed by older loader.efi, which did not set teken.* variables.
Reported by: tuexen
The TUNABLE_INT_FETCH is macro around getenv_int() and we will get
return value 0 or 1 for failure or success, we can use it to decide
which background color to use.
To be consistent with replacing the mtx_lock()/mtx_unlock() calls on
the NFS node mutex (n_mtx) and ncl_iod_mutex, this patch replaces
all mtx_assert() calls on these mutexes with macros as well.
This will simplify changing these locks to sx locks in a future commit.
However, this change may be delayed indefinitely, since it appears there
is a deadlock when vnode_pager_setsize() is called to shrink the size
and the NFS node lock is held.
There is no semantic change as a result of this commit.
Suggested by: kib
MFC after: 1 week
Sync libarchive with vendor.
Relevant vendor changes:
Issue #1237: Fix integer overflow in archive_read_support_filter_lz4.c
PR #1249: Correct some typographical and grammatical errors.
PR #1250: Minor corrections to the formatting of manual pages
MFC after: 1 week
In a few cases, the symbol lookup is missing before attempting to
perform the relocation. While the relocation types affected are
currently unused, this results in an uninitialized variable warning,
that is escalated to an error when building with clang.
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21773
Fix some style(9) violations.
This also changes the name of the machine-dependent sysctl kern.debug_kld to
debug.kld_reloc, and changes its type from int to bool. This is acceptable
since we are not currently concerned with preserving the RISC-V ABI.
Reviewed by: markj, kp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21772
Use of CPU_FFS() to implement CPUSET_FOREACH() allows to save up to ~0.5%
of CPU time on 72-thread SMT system doing 80K IOPS to NVMe from one thread.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
- split synopsis into separate options that can't be used together
- sort options
- fix (style) issues reported by mandoc lint
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D21710
I've noticed that I missed intr check at one more SCHED_AFFINITY(),
so instead of adding one more branching I prefer to remove few.
Profiler shows the function CPU time reduction from 0.24% to 0.16%.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
Described in [1], signal handlers running in a vfork child have
opportunities to corrupt the parent's state. Address this by adding a new
rfork(2) flag, RFSPAWN, that has vfork(2) semantics but also resets signal
handlers in the child during creation.
x86 uses rfork_thread(3) instead of a direct rfork(2) because rfork with
RFMEM/RFSPAWN cannot work when the return address is stored on the stack --
further information about this problem is described under RFMEM in the
rfork(2) man page.
Addressing this has been identified as a prerequisite to using posix_spawn
in subprocess on FreeBSD [2].
[1] https://ewontfix.com/7/
[2] https://bugs.python.org/issue35823
Reviewed by: jilles, kib
Differential Revision: https://reviews.freebsd.org/D19058
When RFSPAWN is passed, rfork exhibits vfork(2) semantics but also resets
signal handlers in the child during creation to avoid a point of corruption
of parent state from the child.
This flag will be used by posix_spawn(3) to handle potential signal issues.
Reviewed by: jilles, kib
Differential Revision: https://reviews.freebsd.org/D19058
C/C++) in exp(3), expf(3), expm1(3) and expm1f(3) during intermediate
computations that compute the IEEE-754 bit pattern for |2**k| for
integer |k|.
The implementations of exp(3), expf(3), expm1(3) and expm1f(3) need to
compute IEEE-754 bit patterns for 2**k in certain places. (k is an
integer and 2**k is exactly representable in IEEE-754.)
Currently they do things like 0x3FF0'0000+(k<<20), which is to say they
take the bit pattern representing 1 and then add directly to the
exponent field to get the desired power of two. This is fine when k is
non-negative.
But when k<0 (and certain classes of input trigger this), this
left-shifts a negative number -- an operation with undefined behavior in
C and C++.
The desired semantics can be achieved by instead adding the
possibly-negative k to the IEEE-754 exponent bias to get the desired
exponent field, _then_ shifting that into its proper overall position.
(Note that in case of s_expm1.c and s_expm1f.c, there are SET_HIGH_WORD
and SET_FLOAT_WORD uses further down in each of these files that perform
shift operations involving k, but by these points k's range has been
restricted to 2 < k <= 56, and the shift operations under those
circumstances can't do anything that would be UB.)
Submitted by: Jeff Walden, https://github.com/jswalden
Obtained from: https://github.com/freebsd/freebsd/pull/411
Obtained from: https://github.com/freebsd/freebsd/pull/412
MFC after: 3 days
properly nested and warns about recursive entrances. Unlike with locks,
there is nothing fundamentally wrong with such use, the intent of tracer
is to help to review complex epoch-protected code paths, and we mean the
network stack here.
Reviewed by: hselasky
Sponsored by: Netflix
Pull Request: https://reviews.freebsd.org/D21610