amd64 miniport drivers are allowed to use FPU which triggers "Unregistered use
of FPU in kernel" panic.
Wrap all variants of MSCALL with fpu_kern_enter/fpu_kern_leave. To reduce
amount of allocations/deallocations done via
fpu_kern_alloc_ctx/fpu_kern_free_ctx maintain cache of fpu_kern_ctx elements.
Based on the patch by Paul B Mahol
PR: 165622
Submitted by: Vlad Movchan <vladislav.movchan@gmail.com>
MFC after: 1 month
Most siginfo_to_lsiginfo callers already zeroed the l_siginfo_t before
callit it, but linux_waitid did not. Instead of zeroing in the called
function to address linux_waitid (as in commit 2e6ebe70), just do it in
linux_waitid.
admbugs: 765
Reported by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Reviewed by: Andrew
MFC after: 1 day
Security: Kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
An integrity check such as a check-hash or a cross-correlation failed.
The integrity error falls between EINVAL that identifies errors in
parameters to a system call and EIO that identifies errors with the
underlying storage media. EINTEGRITY is typically raised by intermediate
kernel layers such as a filesystem or an in-kernel GEOM subsystem when
they detect inconsistencies. Uses include allowing the mount(8) command
to return a different exit value to automate the running of fsck(8)
during a system boot.
These changes make no use of the new error, they just add it. Later
commits will be made for the use of the new error number and it will
be added to additional manual pages as appropriate.
Reviewed by: gnn, dim, brueffer, imp
Discussed with: kib, cem, emaste, ed, jilles
Differential Revision: https://reviews.freebsd.org/D18765
atomic updates and reduces amount of data protected by zone lock.
During startup point these fields to EARLY_COUNTER. After startup
allocate them for all early zones.
Tested by: pho
- Remove macros that covertly create epoch_tracker on thread stack. Such
macros a quite unsafe, e.g. will produce a buggy code if same macro is
used in embedded scopes. Explicitly declare epoch_tracker always.
- Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list
IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read
locking macros to what they actually are - the net_epoch.
Keeping them as is is very misleading. They all are named FOO_RLOCK(),
while they no longer have lock semantics. Now they allow recursion and
what's more important they now no longer guarantee protection against
their companion WLOCK macros.
Note: INP_HASH_RLOCK() has same problems, but not touched by this commit.
This is non functional mechanical change. The only functionally changed
functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter
epoch recursively.
Discussed with: jtl, gallatin
The check was not introduced in r342628, but the subsequent unchecked access to
refs was added then, prompting a Coverity warning about "Null pointer
dereferences (FORWARD_NULL)." The warning is bogus due to M_WAITOK, but so is
the NULL check that hints it, so just remove it.
CID: 1398588
Reported by: Coverity
the destroying cdev.
Currently linux_destroy_dev() waits for the reference count on the
linux cdev to drain, and each open file hold the reference.
Practically it means that linux_destroy_dev() is blocked until all
userspace processes that have the cdev open, exit. FreeBSD devfs does
not have such problem, because device refcount only prevents freeing
of the cdev memory, and separate 'active methods' counter blocks
destroy_dev() until all threads leave the cdevsw methods. After that,
attempts to enter cdevsw methods are refused with an error.
Implement somewhat similar mechanism for LinuxKPI cdevs. Demote cdev
refcount to only mean a hold on the linux cdev memory. Add sirefs
count to track both number of threads inside the cdev methods, and for
single-bit indicator that cdev is being destroyed. In the later case,
the call is redirected to the dummy cdev.
Reviewed by: markj
Discussed with: hselasky
Tested by: zeising
MFC after: 1 week
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D18606
Driver description should be set by core and not by the Ethernet driver.
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
Currently we always return false if for PCI offline query.
Try to read PCI config, if the return value if 0xffff probably the
PCI is offline.
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
Make sure we hold a reference on the character device for every opened file
to prevent the character device to be freed prematurely.
Submitted by: hselasky@
Approved by: hselasky (mentor)
MFC after: 1 week
Sponsored by: Mellanox Technologies
If there is a vnode attached to the linux file, use it to fill
kinfo_file. Otherwise, report a new KF_TYPE_DEV file type, without
supplying any type-specific information.
KF_TYPE_DEV is supposed to be used by most devfs-specific file types.
Sponsored by: Mellanox Technologies
MFC after: 1 week
Given a zeroed struct image_args with an allocated buf member,
exec_args_add_fname() must be called to install a file name (or NULL).
Then zero or more calls to exec_args_add_env() followed by zero or
more calls to exec_args_add_env(). exec_args_adjust_args() may be
called after args and/or env to allow an interpreter to be prepended to
the argument list.
To allow code reuse when adding arg and env variables, begin_envv
should be accessed with the accessor exec_args_get_begin_envv()
which handles the case when no environment entries have been added.
Use these functions to simplify exec_copyin_args() and
freebsd32_exec_copyin_args().
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15468
Some kevent functions have a boolean "waitok" parameter for use when
calling malloc(9). Replace them with the corresponding malloc() flags:
the desired behaviour is known at compile-time, so this eliminates a
couple of conditional branches, and makes the code easier to read.
No functional change intended.
Reviewed by: kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18318
According to markj@:
pageproc contains the page daemon and laundry threads, which are
responsible for managing the LRU page queues and writing back dirty
pages. vmproc's main task is to swap out kernel stacks when the system
is under memory pressure, and swap them back in when necessary. It's a
somewhat legacy component of the system and isn't required. You can
build a kernel without it by specifying "options NO_SWAPPING" (which is
a somewhat misleading name), in which vm_swapout_dummy.c is compiled
instead of vm_swapout.c.
Based on this, we want pageproc to emulate kswapd, not vmproc.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D18061
These are used by kms-drm to determine various heuristics relate
memory conditions.
The number of free swap pages is just a variable, and it can be
much cheaper by either adding a new getter, or simply extern'ing
swap_total. However, this patch opts to use the more expensive,
existing interface - since this isn't an operation in a high per
path.
This allows us to remove some more gpl linuxkpi and do the follo
kms-drm:
git rm linuxkpi/gplv2/include/linux/swap.h
Reviewed by: mmacy, Johannes Lundberg <johalun0@gmail.com>
Approved by: emaste (mentor)
Differential Revision: https://reviews.freebsd.org/D18052
This was hidden behind the LINUX_CMSG_NXTHDR macro which dereferences its
second argument. Stop using the macro as well as LINUX_CMSG_FIRSTHDR. Use
the size field of the kernel copy of the control message header to obtain
the next control message.
PR: 217901
MFC after: 2 days
X-MFC-With: r340631
Instead of calling m_append with a user address, allocate an mbuf cluster
and copy data into it using copyin. For the SCM_CREDS case, instead of
zeroing a stack variable and appending that to the mbuf, zero part of the
mbuf cluster directly. One mbuf cluster is also the size limit used by
the FreeBSD sendmsg syscall (uipc_syscalls.c:sockargs()).
PR: 217901
Reviewed by: kib
MFC after: 3 days
Doing so removes the dependency on proctree lock from sysctl process list
export which further reduces contention during poudriere -j 128 runs.
Reviewed by: kib (previous version)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D17825
Allow the location of capabilities.conf to be configured.
Also allow a per-abi syscall prefix to be configured with the
abi_func_prefix syscalls.conf variable and check syscalls against
entries in capabilities.conf with and without the prefix amended.
Take advantage of these two features to allow use shared capabilities.conf
between the default syscall vector and the freebsd32 compatability
layer. We've been inconsistent about keeping the two in sync as
evidenced by the bugs fixed in r340294. This eliminates that problem
going forward.
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17932
As dev_t is now a 64-bit integer, it requires special handling as a
system call argument. 64-bit arguments are split between two 64-bit
integers due to the way arguments are promoted to allow reuse of most
system call implementations. They must be reassembled before use.
Further, 64-bit arguments at an odd offset (counting from zero) are
padded and slid to the next slot on powerpc and mips. Fix the
non-COMPAT11 system call by adding a freebsd32_mknodat() and
appropriately padded declerations.
The COMPAT11 system calls are fully compatible with the 64-bit
implementations so remove the freebsd32_ versions.
Use uint32_t consistently as the type of the old dev_t. This matches
the old definition.
Reviewed by: kib
MFC after: 3 days
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17928
Bugs range from failure to update after changing syscall implementaion
names to using the wrong name. Somewhat confusingly, the name in
capabilities.conf is exactly the string that appears in syscalls.master,
not the name with a COMPAT* prefix which is the actual function name.
Found while making a change to use the default capabilities.conf.
Fixes: r335177, r336980, r340272, r340274, others
Reviewed by: kib, emaste
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D17925