Allow the location of capabilities.conf to be configured.
Also allow a per-abi syscall prefix to be configured with the
abi_func_prefix syscalls.conf variable and check syscalls against
entries in capabilities.conf with and without the prefix amended.
Take advantage of these two features to allow use shared capabilities.conf
between the default syscall vector and the freebsd32 compatability
layer. We've been inconsistent about keeping the two in sync as
evidenced by the bugs fixed in r340294. This eliminates that problem
going forward.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17932
mutexes but now are converted to epoch(9) use thread-private epoch_tracker.
Embedding tracker into ifnet(9) or ifnet derived structures creates a non
reentrable function, that will fail miserably if called simultaneously from
two different contexts.
A thread private tracker will provide a single tracker that would allow to
call these functions safely. It doesn't allow nested call, but this is not
expected from compatibility KPIs.
Reviewed by: markj
They only showed up after I redefined LOCKSTAT_ENABLED to 0.
doing_lockprof in mutex.c is a real (but harmless) bug. Should the
value be non-zero it will do checks for lock profiling which would
otherwise be skipped.
state in rwlock.c is a wart from the compiler, the value can't be
used if lock profiling is not enabled.
Sponsored by: The FreeBSD Foundation
Change the assert paths in rm, rw, and sx locks to match the lock
and unlock paths. I did this for mutexes in r306346.
Reported by: Travis Lane <tlane@isilon.com>
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
processors would benefit from avoiding a function call, but bloating
code. In fact, clang created an uninlined real function for many
object files in the network stack.
- Move epoch_private.h into subr_epoch.c. Code copied exactly, avoiding
any changes, including style(9).
- Remove private copies of critical_enter/exit.
Reviewed by: kib, jtl
Differential Revision: https://reviews.freebsd.org/D17879
Remove restrictions that prevent allocation requests to cross the
boundary between two meta nodes.
Replace the bmu_avail field in meta nodes with a bitmap that identifies
which subtrees have some free memory, and iterate over the nonempty
subtrees only in blst_meta_alloc. If free memory is scarce, this should
make searching for it faster.
Put the code for handling the next-leaf allocation in a separate
function. When taking blocks from the next leaf empties the leaf, be
sure to clear the appropriate bit in its parent, and so on, up to the
least-common ancestor of this leaf and the next.
Eliminate special terminator nodes, and rely instead on the fact that
there is a 0-bit at the end of the bitmask at the root of the tree that
will stop a meta_alloc search, or a next-leaf search, before the search
falls off the end of the tree. Make sure that the tree is big enough to
have space for that 0-bit.
Eliminate special all-free indicators. Lazy initialization of subtrees
stands in the way of having an allocation span a meta-node boundary, so
a subtree of all free blocks is not treated specially. Subtrees of
all-allocated blocks are still recognized by looking at the bitmask at
the root and finding 0.
Don't print all-allocated subtrees. Do print the bitmasks for meta
nodes, when tree-printing.
Submitted by: Doug Moore <dougm@rice.edu>
Reviewed by: alc
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D12635
Both to formally document the requirement that this not be called after the
dynamic kenv is setup, and to perhaps help static analyzers figure out
what's going on. While calling init_static_kenv this late isn't fatal, there
are some caveats that the caller should be aware of:
- Late calls are effectively a no-op, as far as default FreeBSD is
concerned, as everything will switch to searching the dynamic kenv once it's
available.
- Each of the kern_getenv calls will leak memory, as it's assumed that
these are searching static environment and allocations will not be made.
As such, this usage is not sensible and should be detected.
The vlan interfaces can be created from vnet jails, it seems, so it
sounds logical to allow pcp configuration as well.
Reviewed by: bz, hselasky (previous version)
Sponsored by: Mellanox Technologies
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17777
Correct boneheaded assertion I added in r339501. Mea culpa.
The intent is to notice when an M_WAITOK zone allocation would fail during
netdump, not to prevent all use of mbufs during netdump.
Reviewed by: markj
X-MFC-With: r339501
Differential Revision: https://reviews.freebsd.org/D17957
This also removes a lot of #ifdefs and cleans up a warning when the
AUDIT kernel option is defined, but neither KDTRACE_HOOKS nor MAC are.
Reported and tested by: danger
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The path must have a tail which does not escape starting/topping
directory. The documentation will come shortly, see the man pages
commit message for the reason of separate commit.
Reviewed by: jilles (previous version)
Discussed with: emaste
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17714
As dev_t is now a 64-bit integer, it requires special handling as a
system call argument. 64-bit arguments are split between two 64-bit
integers due to the way arguments are promoted to allow reuse of most
system call implementations. They must be reassembled before use.
Further, 64-bit arguments at an odd offset (counting from zero) are
padded and slid to the next slot on powerpc and mips. Fix the
non-COMPAT11 system call by adding a freebsd32_mknodat() and
appropriately padded declerations.
The COMPAT11 system calls are fully compatible with the 64-bit
implementations so remove the freebsd32_ versions.
Use uint32_t consistently as the type of the old dev_t. This matches
the old definition.
Reviewed by: kib
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17928
The previous code required that the return type be a single word. This
allows it to be a pointer without using a typedef.
Update the return types of break, mmap, and shmat to be void * as
declared. This only effects systrace output in-tree, but can aid in
generating system call wrappers from syscalls.master.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17873
Different compilation units may otherwise get a different view of the
layout of struct tty depending on whether they include opt_printf.h.
This caused a blowup in the number of types defined in the kernel's
CTF file after r339468; thanks to dim@ for bisecting down to that
revision.
PR: 232675
Reported by: dim
Reviewed by: cem (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17877
These submaps are used for mapping pipe buffers and execv() argument
strings respectively, so there's no need for such mappings to have
execute permissions.
Reported by: jhb
Reviewed by: alc, jhb, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17827
Leave ptrace(2) alone for the moment as it's defined to take a caddr_t.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17852
This allows us to build the ubsan code added in r340189 into the kernel
with the KUBSAN option. This will report when undefined behaviour is
detected in the currently running kernel.
As it can be large, the kernel is 65MB on arm64, loader may not be able to
load the kernel on all architectures so is disabled by default for now.
Sponsored by: DARPA, AFRL
This imports revision 1.3 of common/lib/libc/misc/ubsan.c from NetBSD, the
micro-ubsan code. It is an implementation of the Undefined Behavior
Sanitizer runtime for use with recent clang and gcc.
The uubsan code will be used in a later commit to implement kubsan to help
find undefined behavior in the kernel.
Sponsored by: DARPA, AFRL
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI. Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms. However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.
Reviewed by: kib
MFC after: 3 days
We already allow to use poll(2). There is no reason to disallow ppoll(2).
PR: 232495
Submitted by: Stefan Grundmann <sg2342@googlemail.com>
Reviewed by: cem, oshogbo
MFC after: 2 weeks
In discussing D17503 "Run epoch calls sooner and more reliably" with
sbahra@ we came to the conclusion that epoch is currently misusing the
ck_epoch API. It isn't safe to do a "write side" operation (ck_epoch_call
or ck_epoch_poll) in the middle of a "read side" section. Since, by definition,
it's possible to be preempted during the middle of an EPOCH_PREEMPT
epoch the GC task might call ck_epoch_poll or another thread might call
ck_epoch_call on the same section. The right solution is ultimately to change
the way that ck_epoch works for this use case. However, as a stopgap for
12 we agreed to simply have separate records for each use case.
Tested by: pho@
MFC after: 3 days
These arguments are mostly paths handled by NAMEI*() macros which already
take const char * arguments.
This change improves the match between syscalls.master and the public
declerations of system calls.
Reviewed by: kib (prior version)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17812
Initializing the eflags field of the map->header entry to a value with a
unique new bit set makes a few comparisons to &map->header unnecessary.
Submitted by: Doug Moore <dougm@rice.edu>
Reviewed by: alc, kib
Tested by: pho
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D14005
This will enable callers to take const paths as part of syscall
decleration improvements.
Where doing so is easy and non-distruptive carry the const through
implementations. In UFS the value is passed to an interface that must
take non-const values. In ZFS, const poisoning would touch code shared
with upstream and it's not worth adding diffs.
Bump __FreeBSD_version for external API consumers.
Reviewed by: kib (prior version)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17805
The comment isn't stale. The check is bogus in the sense that poll(2)
does not require pollfd entries to be unique in fd space, so there is no
reason there cannot be more pollfd entries than open or even allowed
fds. The check is mostly a seatbelt against accidental misuse or
abuse. FD_SETSIZE, while usually unrelated to poll, is used as an
arbitrary floor for systems with very low kern.maxfilesperproc.
Additionally, document this possible EINVAL condition in the poll.2
manual.
No functional change.
Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17671
This is more clear and produces better results when generating function
stubs from syscalls.master.
Reviewed by: kib, emaste
Obtained from: CheribSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17784
Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'. Disable the new knob by default. Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob. However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.
Reviewed by: kib, avg
MFC after: 2 months
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D17768
This takes advantage of two recents changes to makesyscalls.sh:
r328598: Permit a range of syscall numbers for UNIMPL
r339624: Remove the need for backslashes in syscalls.master
Syscall declerations are now split across multiple lines with the
syscall name and variables each on seperate lines (with an exception for
syscalls taking no arguments.)
Reviewed by: imp
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17706
I erroneously thought that it was two 64bit platforms which use link_elf_obj.c.
PR: 228854
Reported by: ci.f.o.
MFC after: 3 days
X-MFC with: r339931
Pointyhat to: bz
we fail during module load because the pcpu or vnet module sections are
full. We did return a proper error but not leaving any indication to
the user as to what the actual problem was.
Even worse, on 12/13 currently we are seeing an unrelated error (ENOSYS
instead of ENOSPC, which gets skipped over in kern_linker.c) to be
printed which made problem diagnostics even harder.
PR: 228854
MFC after: 3 days
Remove malloc_domain(9) and most other _domain KPIs added in r327900.
The new functions allow the caller to specify a general NUMA domain
selection policy, rather than specifically requesting an allocation from
a specific domain. The latter policy tends to interact poorly with
M_WAITOK, resulting in situations where a caller is blocked indefinitely
because the specified domain is depleted. Most existing consumers of
the _domain KPIs are converted to instead use a DOMAINSET_PREF() policy,
in which we fall back to other domains to satisfy the allocation
request.
This change also defines a set of DOMAINSET_FIXED() policies, which
only permit allocations from the specified domain.
Discussed with: gallatin, jeff
Reported and tested by: pho (previous version)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17418
- In uma_prealloc(), we need to check for an empty domain before the
first allocation attempt, not after. Fix this by switching
uma_prealloc() to use a vm_domainset iterator, which addresses the
secondary issue of using a signed domain identifier in round-robin
iteration.
- Don't automatically create a page daemon for domain 0.
- In domainset_empty_vm(), recompute ds_cnt and ds_order after
excluding empty domains; otherwise we may frequently specify an empty
domain when calling in to the page allocator, wasting CPU time.
Convert DOMAINSET_PREF() policies for empty domains to round-robin.
- When freeing bootstrap pages, don't count them towards the per-domain
total page counts for now: some vm_phys segments are created before
the SRAT is parsed and are thus always identified as being in domain 0
even when they are not. Then, when bootstrap pages are freed, they
are added to a domain that we had previously thought was empty. Until
this is corrected, we simply exclude them from the per-domain page
count.
Reported and tested by: Rajesh Kumar <rajfbsd@gmail.com>
Reviewed by: gallatin
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17704
Set curthread->td_stopsched when entering kdb via any vector.
Previously, it was only set when entering via panic, so when
entering kdb another way, mutexes and such were still "live",
and an attempt to lock an already locked mutex would panic.
Reviewed by: kib, cem
Discussed with: jhb
Tested by: pho
MFC after: 2 months
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17687
taskqgroup_detach() would remove the task even if it was running or
enqueued, which could lead to panics (see D17404). With this change,
taskqgroup_detach() drains the task and sets a new flag which prevents the
task from being scheduled again.
I've added grouptask_block() and grouptask_unblock() to allow control
over the flag from other locations as well.
Reviewed by: Jeffrey Pieper <jeffrey.e.pieper@intel.com>
MFC after: 1 week
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D17674
Some of the poll code used 'fds' and some used 'ufds' to refer to the
uap->fds userspace pointer that was passed around to subroutines. Some of
the poll code used 'fds' to refer to the kernel memory pollfd arrays, which
seemed unnecessarily confusing.
Unify on 'ufds' to refer to the userspace pollfd array.
Additionally, 'bits' is not an accurate description of the kernel pollfd
array in kern_poll, so rename that to 'kfds'. Finally, clean up some logic
with mallocarray() and nitems().
No functional change.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D17670
ioctl(2) commands only have meaning in the context of a file descriptor
so translating them in the syscall layer is incorrect.
The new handler users an accessor to retrieve/construct a pointer from
the last member of the passed structure and relies on type punning to
access the other member which requires no translation.
Unlike r339174 this change supports both places FIODGNAME is handled.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17475
Flags prevent open(2) and *at(2) vfs syscalls name lookup from
escaping the starting directory. Supposedly the interface is similar
to the same proposed Linux flags.
Reviewed by: jilles (code, previous version of manpages), 0mp (manpages)
Discussed with: allanjude, emaste, jonathan
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17547
Use bypass to catch any NFS VOP dispatch and route it through the
wrapper which does sigdeferstop() and then dispatches original
VOP. NFS does not need a bypass below it, which is not supported.
The vop offset in the vop_vector is added since otherwise it is
impossible to get vop_op_t from the internal table, and I did not
wanted to create the layered fs only to wrap NFS VOPs.
VFS_OP()s wrap is straightforward.
Requested and reviewed by: mjg (previous version)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17658
That commit is causing kernel panics in em(4), so this will be reverted
until those are fixed.
Reported by: ae@, pho@, et al
Sponsored by: Intel Corporation
The taskqgroup_detach function does not check if task is already enqueued when
detaching it. This may lead to kernel panic if enqueued task starts after
context state lock is destroyed. Ensure that the already enqueued admin tasks
are executed before detaching them.
The issue was discovered during validation of D16429. Unloading of if_ixlv
followed by immediate removal of VFs with iovctl -D may lead to panic on
NODEBUG kernel.
As well, check if iflib is in detach before enqueueing new admin or iov
tasks, to prevent new tasks from executing while the taskqgroup tasks
are being drained.
Submitted by: Krzysztof Galazka <krzysztof.galazka@intel.com>
Reviewed by: shurd@, erj@
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D17404
Join non-special lines together until we hit a line containing a '}'
character. This allows the function declaration body to be split
across multiple lines without backslash continuation characters.
Continue to join lines ending with backslashes to allow gradual
migration and to support out-of-tree syscall vectors
Reviewed by: emaste, kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17488
The restruct qualifier is intended to aid code generation in the
compiler, but the only access to storage through these pointers is via
structs using copyin/copyout and the like which can not be written in C
or C++ and thus the compiler gains nothing from the qualifiers.
As such, the qualifiers add no value in current usage.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17574
This provides a chicken switch for anyone negatively impacted by
enabling NUMA in the amd64 GENERIC kernel configuration. With
NUMA disabled at boot-time, information about the NUMA topology
is not exposed to the rest of the kernel, and all of physical
memory is viewed as coming from a single domain.
This method still has some performance overhead relative to disabling
NUMA support at compile time.
PR: 231460
Reviewed by: alc, gallatin, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17439
vmem uses UMA cache zones to implement the quantum cache. Since
uma_zalloc() returns 0 (NULL) to signal an allocation failure, UMA
should not be used to cache resource 0. Fix this by ensuring that 0 is
never cached in UMA in the first place, and by modifying vmem_alloc()
to fall back to a search of the free lists if the cache is depleted,
rather than blocking in qc_import().
Reported by and discussed with: Brett Gutstein <bgutstein@rice.edu>
Reviewed by: alc
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D17483
The buslogic scsi driver has been tagged as gone in 12 for some time
now. Remove it. The nycbug dmesg database shows only one sighting in 6
for this driver. It was very popular in the early days of the project,
but that popularity seems to have died by 2004 when the nycbug
database started up.
Relnotes: yes
Remove the advanssy drivers (both adv and adw). They were tagged as
gone in 12 a while qgo. The nycbug dmesg database shows this was last
seen in 6 and there were only a few adv sightings then (none for adw).
Relnotes: yes
We tagged aha as gone in 12 a while ago. Proceed with its removal.
Data from nycbug's database shows the last sighting of this driver in
6, with the prior one in 4.x show its popularity had died prior to
4.x.
Relnotes: yes
Prior to this revision, we allocated sufficient context space for 'level'
but never actually set the compress level parameter, so we would always get
the default '3'.
Reviewed by: markj, vangyzen
MFC after: 12 hours
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17144
Like the companion API devvn_refthread, leave *ref uninitialized when a
reference was not acquired. Initializing to 1 provides a vaguely
correct-looking but bogus value for broken callers to (mistakenly) pass to
dev_relthread() when refthread fails.
Make it even more clear to consumers that dev_relthread is only valid when
dev_refthread succeeds.
Reviewed by: kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D16885
It is often useful for developers and administrators to determine a running
thread's stack for debugging purposes. With this feature, using ^T will
print that information
For now, the feature is disabled by default. Enable with sysctl
kern.tty_info_kstacks=1.
Discussed with: markj
Reviewed by: oshogbo
Relnotes: yes
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17621
Some best-effort consumers may find trylock behavior for stack(9) symbol
resolution acceptable. Expose that behavior to such consumers.
This API is ugly. If in the future the modules and linker file list locking
is cleaned up such that the linker_files list can be iterated safely without
acquiring a sleepable lock, this API should be removed. However, most of
the time nothing will be holding the linker files lock exclusive and the
acquisition can proceed.
Reviewed by: markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D17620
Pre-defined policies are useful when integrating the domainset(9)
policy machinery into various kernel memory allocators.
The refactoring will make it easier to add NUMA support for other
architectures.
No functional change intended.
Reviewed by: alc, gallatin, jeff, kib
Tested by: pho (part of a larger patch)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17416
returns the section start and stop locations as well as a count if the
caller asks for them.
There was only one out-of-file consumer of count which did not actually
use it and hence was eliminated in r339407.
In r194784 parse_dpcpu(), and in r195699 parse_vnet() (a copy of the
former) started to use the link_elf_lookup_set() interface internally
also asking for the count.
count is computed as the difference of the void **stop - void **start
locations and as such, if the absoulte numbers
(stop - start) % sizeof(void *) != 0
a round-down happens, e.g., **stop 0x1003 - **start 0x1000 => count 0.
To get the section size instead of "count is the number of pointer
elements in the section", the parse_*() functions do a
count *= sizeof(void *).
They use the result to allocate memory and copy the section data
into the "master" and per-instance memory regions with a size of
count.
As a result of count possibly round-down this can miss the last
bytes of the section. The good news is that we do not touch
out of bounds memory during these operations (we may at a later stage
if the last bytes would overflow the master sections).
Given relocation in elf_relocaddr() works based on the absolute
numbers of start and stop, this means that we can possibly try to
access relocated data which was never copied and hence we get
random garbage or at best zeroed memory.
Stop the two (last) consumers of count (the parse_*() functions)
from using count as well, and calculate the section size based on
the absolute numbers of stop and start and use the proper size for
the memory allocation and data copies. This will make the symbols
in the last bytes of the pcpu or vnet sections be presented as
expected.
PR: 232289
Approved by: re (gjb)
MFC after: 2 weeks
was really a "socket close" callback.
Update the socket destructor functionality to run when a socket is
destroyed (rather than when it is closed). The original submitter has
confirmed that this change satisfies the intended use case.
Suggested by: rwatson
Submitted by: Michio Honda <micchie at sfc.wide.ad.jp>
Tested by: Michio Honda <micchie at sfc.wide.ad.jp>
Approved by: re (kib)
Differential Revision: https://reviews.freebsd.org/D17590
can see the dmesg buffer (this is the current behavior). When false (the
new default), dmesg will be unavailable to jailed users, whether root or
not.
The security.bsd.unprivileged_read_msgbuf sysctl still works as before,
controlling system-wide whether non-root users can see the buffer.
PR: 211580
Submitted by: bz
Approved by: re@ (kib@)
MFC after: 3 days
Unconditionally reparenting to PID 1 breaks the procctl(2) reaper
functionality.
Add a regression test for this case.
Reviewed by: kib
Approved by: re (gjb)
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17589
Both ^/sys/compat/freebsd32/syscalls.master and ^/sys/kern/syscalls.master
cited "COMPAT[n] #ifdef" instead of "COMPAT_FREEBSD[n] #ifdef" in places.
Approved by: re (glebius)
Reading caps is in the hot path (on each successful fd lookup), but
completely unnecessarily requires a function call.
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
The change is a no-op for architectures which don't ifunc memset,
memcpy nor memmove.
Convert places which need them. Xen bits by royger.
Reviewed by: kib
Approved by: re (gjb)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17487
It's not supposed to be legal for two jails to contain the same IP address,
unless both jails contain only that one address. This is the behavior
documented in jail(8), and is there to prevent confusion when multiple
jails are listening on IADDR_ANY.
VIMAGE jails (now the default for GENERIC kernels) test this correctly,
but non-VIMAGE jails have been performing an incomplete test when nested
jails are used.
Approved by: re@ (kib@)
MFC after: 5 days
Change swap_reserve and swap_total to be in units of pages so that
swap reservations can be done using only atomics instead of using a single
global mutex for swap_reserve and a single mutex for all processes running
under the same uid for uid accounting.
Results in mmap speed up and a 70% increase in brk calls / second.
Reviewed by: alc@, markj@, kib@
Approved by: re (delphij@)
Differential Revision: https://reviews.freebsd.org/D16273
shutdown() to wakeup another thread blocked on a stream listen socket.
This code is failing, while it used to work on FreeBSD 10 and still
works on Linux.
It seems reasonable to add another exception to support something users are
actually doing, which used to work on FreeBSD 10, and still works on Linux.
And, it seems like it should be acceptable to POSIX, as we still return
ENOTCONN.
This patch is different to what had been committed to stable/11, since
code around listening sockets is different. Patch in D15019 is written
by jtl@, slightly modified by me.
PR: 227259
Obtained from: jtl
Approved by: re (kib)
Differential Revision: D15019
Tested with ifunc resolvers in the kernel and module with calls from
kernel to kernel, module to kernel, and module to module.
Reviewed by: kib (previous version)
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17370
The AMD Threadripper 2990WX is basically a slightly crippled Epyc.
Rather than having 4 memory controllers, one per NUMA domain, it has
only 2 memory controllers enabled. This means that only 2 of the
4 NUMA domains can be populated with physical memory, and the
others are empty.
Add support to FreeBSD for empty NUMA domains by:
- creating empty memory domains when parsing the SRAT table,
rather than failing to parse the table
- not running the pageout deamon threads in empty domains
- adding defensive code to UMA to avoid allocating from empty domains
- adding defensive code to cpuset to avoid binding to an empty domain
Thanks to Jeff for suggesting this strategy.
Reviewed by: alc, markj
Approved by: re (gjb@)
Differential Revision: https://reviews.freebsd.org/D1683
This is mostly a cosmetic change except that obsolete system calls are
assigned meaningful names in the names arrays which means that using
tools like kdump or truss against binaries invoking these system calls
will print out the name instead of the number. The script I use to
generate the XML list of syscalls for GDB also ignores UNIMPL but not
OBSOL entries. In general UNIMPL should only be used to reserve
placeholders for system calls that have never been implemented while
system calls that existed at one time in FreeBSD but were removed
should be marked OBSOL instead.
Reviewed by: brooks, kib, imp
Approved by: re (gjb)
Differential Revision: https://reviews.freebsd.org/D17344