Commit Graph

3673 Commits

Author SHA1 Message Date
Vladimir Kondratyev
b3c6fe663b epoll: Store epoll_event udata member in ext member of kevent.
Current epoll implementation stores udata fields of epoll_event
structure in special dynamically-sized table rather than in udata field
of backing kevent structure because of 2 reasons:
1. Kevent's udata size is smaller than epoll's on 32-bit archs.
2. Kevent's udata can be clobbered on execution EPOLL_CTL_ADD as kqueue
   modifies existing event while epoll returns error in this case.

After r320043 has introduced four new 64bit user data members (ext[]),
we can store epoll udata in one of them and drop aforementioned table.
According to kqueue_register() source code ext members are not updated
when existing kevent is modified that fixes p.2.

As a side effect the patch fixes PR/252582.

Reviewed by:	trasz
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D28169
2021-02-08 02:46:14 +03:00
Edward Tomasz Napierala
e44a78ce6f linux: add support for SO_PEERSEC getsockopt
It returns "unconfined", like Linux without SELinux would.

Sponsored By:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28164
2021-02-07 20:42:04 +00:00
Edward Tomasz Napierala
f6e8256a96 linux: fix handling of flags for 32 bit send(2) syscall
Previously the flags were passed as-is, which could resulted
in spurious EAGAIN returned for non-blocking sockets, which
broke some Steam games.

PR:		248065
Reported By:	Alex S <iwtcex@gmail.com>
Tested By:	Alex S <iwtcex@gmail.com>
Reviewed By:	emaste
MFC After:	3 days
Sponsored By:	The FreeBSD Foundation
2021-02-06 23:21:27 +00:00
Ryan Stone
b58cf1cb35 Fix race condition in linuxkpi workqueue
Consider the following scenario:

1. A delayed_work struct in the WORK_ST_TIMER state.
2. Thread A calls mod_delayed_work()
3. Thread B (a callout thread) simultaneously calls
linux_delayed_work_timer_fn()

The following sequence of events is possible:

A: Call linux_cancel_delayed_work()
A: Change state from TIMER TO CANCEL
B: Change state from CANCEL to TASK
B: taskqueue_enqueue() the task
A: taskqueue_cancel() the task
A: Call linux_queue_delayed_work_on().  This is a no-op because the
state is WORK_ST_TASK.

As a result, the delayed_work struct will never be invoked.  This is
causing address resolution in ib_addr.c to stop permanently, as it
never tries to reschedule a task that it thinks is already scheduled.

Fix this by introducing locking into the cancel path (which
corresponds with the lock held while the callout runs).  This will
prevent the callout from changing the state of the task until the
cancel is complete, preventing the race.

Differential Revision:	https://reviews.freebsd.org/D28420
Reviewed by: hselasky
MFC after: 2 months
2021-02-04 13:54:53 -05:00
shu
14c40d2c29 linux: remove locks around callout_drain in timerfd_close()
The lock around callout_drain() is unnecessary and may cause
deadlock when one closes a timer descriptor during timer execution.

Reviewed By:	delphij
Submitted By:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28148
2021-02-03 19:47:38 +00:00
shu
ae71b794cb linux: make timerfd_settime(2) set expirations count to zero
On Linux, read(2) from a timerfd file descriptor returns an unsigned
8-byte integer (uint64_t) containing the number of expirations
that have occurred, if the timer has already expired one or more
times since its settings were last modified using timerfd_settime(),
or since the last successful read(2).  That's to say, once we do
a read or call timerfd_settime(), timer fd's expiration count should
be zero.  Some Linux applications create timerfd and add it to epoll
with LT mode, when event comes, they do timerfd_settime instead
of read to stop event source from trigger.  On FreeBSD,
timerfd_settime(2) didn't set the count to zero, which caused high
CPU utilization.

Submitted by:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28231
2021-02-03 19:08:40 +00:00
Bjoern A. Zeeb
4a26380ba6 LinuxKPI: add module dependency on firmware(9)
In a6c2507d1b support for LinuxKPI
firmware loading was added.  Record the dependency on firmware(9)
as otherwise (if built as module) linuxkpi will no longer load.

Reported-by:	tijl
MFC after:	1 day
X-MFC-with:	a6c2507d1b
Sponsored-by:	The FreeBSD Foundation
2021-01-30 17:50:26 +00:00
Bjoern A. Zeeb
fa765ca73e LinuxKPI: implement devres() framework parts and two examples
This code implements a version of the devres framework found
working for various iwlwifi use cases and also providing functions
for ttm_page_alloc_dma.c from DRM.

Part of the framework replicates the consumed KPI, while others
are internal helper functions.

In addition the simple devm_k*malloc() consumers were implemented
and kvasprintf() was enhanced to also work for the devm_kasprintf()
case.
Addmittingly lkpi_devm_kmalloc_release() could be avoided but for
the overall understanding of the code and possible memory tracing
it may still be helpful.

Further devsres consumer are implemented for iwlwifi but will follow
later as the main reason for this change is to sort out overlap with
DRM.

Sponsored-by:	The FreeBSD Foundation
Obtained-from:	bz_iwlwifi
MFC After:	3 days
Reviewed-by:	hselasky, manu
Differential Revision:	https://reviews.freebsd.org/D28189
2021-01-28 16:32:43 +00:00
Bjoern A. Zeeb
1fac2cb4d6 LinuxKPI: enhance PCI bits for DRM
In pci_domain_nr() directly return the domain which got set in
lkpifill_pci_dev() in all cases.  This was missed between D27550
and 105a37cac7 .

In order to implement pci_dev_put() harmonize further code
(which was started in the aforementioned commit) and add kobj
related bits (through the now common lkpifill_pci_dev() code)
to the DRM specific calls without adding the DRM allocated
pci devices to the pci_devices list.
Add a release for the lkpinew_pci_dev() (DRM) case so freeing
will work.
This allows the DRM created devices to use the normal kobj/refcount
logic and work with, e.g., pci_dev_put().
(For a slightly more detailed code walk see the review).

Sponsored-by:	The FreeBSD Foundation
Obtained-from:	bz_iwlwifi (partially)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28188
2021-01-28 16:23:19 +00:00
Bjoern A. Zeeb
4abbf816bf LinuxKPI: upstream a collection of drm-kmod conflicting changes
The upcoming in-kernel implementations for LinuxKPI based on work on
iwlwifi (and other wireless drivers) conflicts in a few places with
the drm-kmod graphics work outside the base system.

In order to transition smoothly extract the conflicting bits.
This included "unaligned" accessor functions, sg_pcopy_from_buffer(),
IS_*() macros (to be further restricted in the future), power management
bits (possibly no longer conflicting with DRM), and other minor changes.

Obtained-from:  bz_iwlwifi
Sponsored-by:   The FreeBSD Foundation
MFC after:	3 days
Reviewed by:	kib, hselasky, manu, bdragon (looked at earlier versions)
Differential Revision: https://reviews.freebsd.org/D26598
2021-01-28 16:15:12 +00:00
Bjoern A. Zeeb
a6c2507d1b LinuxKPI: add firmware loading support
Implement linux firmware KPI compat code.
This includes: request_firmware() request_firmware_nowait(),
request_firmware_direct(), firmware_request_nowarn(),
and release_firmware().

Given we will try to map requested names from natively ported
or full-linuxkpi-using drivers to a firmware(9) auto-loading
name format (.ko file name and image name matching),
we quieten firmware(9) and print success or failure (unless
the _nowarn() version was called) in the linuxkpi implementation.
At the moment we try up-to 4 different naming combinations,
with path stripped, original name, and requested name with '/'
or '.' replaced.

We do not currently defer loading in the "nowait" case.

Sponsored-by:	The FreeBSD Foundation
Sponsored-by:	Rubicon Communications, LLC ("Netgate")
		(firmware(9) nowarn update from D27413)
MFC after:	3 days
Reviewed by:	kib, manu (looked at older versions)
Differential Revision:	https://reviews.freebsd.org/D27414
2021-01-28 16:05:32 +00:00
Brooks Davis
7a1591c1b6 Rename kern_mmap_req to kern_mmap
Replace all uses of kern_mmap with kern_mmap_req move the old kern_mmap.
Reand rename kern_mmap_req to kern_mmap                                .

The helper saved some code churn initially, but having multiple
interfaces is sub-optimal.

Obtained from:	CheriBSD
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D28292
2021-01-25 21:50:37 +00:00
Brooks Davis
bfc99943b0 ndis(4): remove as previous announced
nids(4) was a clever idea in the early 2000's when the market was
flooded with 10/100 NICs with Windows-only drivers, but that hasn't been
the case for ages and the driver has had no meaningful maintenance in
ages. It only supports Windows-XP era drivers.

Also remove:
 - ndis support from wpa_supplicant
 - ndiscvt(8)

Reviewed By:	emaste, bcr (manpages)
Differential Revision:	https://reviews.freebsd.org/D27609
2021-01-25 21:45:03 +00:00
Edward Tomasz Napierala
7d3310c4fc linux: remove spurious newline.
Sponsored by:	The FreeBSD Foundation
2021-01-19 09:56:45 +00:00
Mark Johnston
4af9323542 linuxkpi: Fix the shrinker scan target
Use the number of items scanned to control the duration of the shrink
loop.  Otherwise, if a consumer like TTM is not able to free the number
of items requested for some reason, the shrinker keeps looping forever.

Reviewed by:	manu
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28224
2021-01-18 17:07:55 -05:00
Vladimir Kondratyev
ec25b6fa5f LinuxKPI: Reimplement irq_work queue on top of fast taskqueue
Summary:
Linux's irq_work queue was created for asynchronous execution of code from contexts where spin_lock's are not available like "hardware interrupt context". FreeBSD's fast taskqueues was created for the same purposes.

Drm-kmod 5.4 uses irq_work_queue() at least in one place to schedule execution of task/work from the critical section that triggers following INVARIANTS-induced panic:

```
panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) linuxkpi_short_wq @ /usr/src/sys/kern/subr_taskqueue.c:281
cpuid = 6
time = 1605048416
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe006b538c90
vpanic() at vpanic+0x182/frame 0xfffffe006b538ce0
panic() at panic+0x43/frame 0xfffffe006b538d40
witness_checkorder() at witness_checkorder+0xf3e/frame 0xfffffe006b538f00
__mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfffffe006b538f50
taskqueue_enqueue() at taskqueue_enqueue+0x42/frame 0xfffffe006b538f70
linux_queue_work_on() at linux_queue_work_on+0xe9/frame 0xfffffe006b538fb0
irq_work_queue() at irq_work_queue+0x21/frame 0xfffffe006b538fd0
semaphore_notify() at semaphore_notify+0xb2/frame 0xfffffe006b539020
__i915_sw_fence_notify() at __i915_sw_fence_notify+0x2e/frame 0xfffffe006b539050
__i915_sw_fence_complete() at __i915_sw_fence_complete+0x63/frame 0xfffffe006b539080
i915_sw_fence_complete() at i915_sw_fence_complete+0x8e/frame 0xfffffe006b5390c0
dma_i915_sw_fence_wake() at dma_i915_sw_fence_wake+0x4f/frame 0xfffffe006b539100
dma_fence_signal_locked() at dma_fence_signal_locked+0x105/frame 0xfffffe006b539180
dma_fence_signal() at dma_fence_signal+0x72/frame 0xfffffe006b5391c0
dma_fence_is_signaled() at dma_fence_is_signaled+0x80/frame 0xfffffe006b539200
dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0xb3/frame 0xfffffe006b539270
i915_vma_move_to_active() at i915_vma_move_to_active+0x18a/frame 0xfffffe006b5392b0
eb_move_to_gpu() at eb_move_to_gpu+0x3ad/frame 0xfffffe006b539320
eb_submit() at eb_submit+0x15/frame 0xfffffe006b539350
i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x7d4/frame 0xfffffe006b539570
i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1c1/frame 0xfffffe006b539600
drm_ioctl_kernel() at drm_ioctl_kernel+0xd9/frame 0xfffffe006b539670
drm_ioctl() at drm_ioctl+0x5cd/frame 0xfffffe006b539820
linux_file_ioctl() at linux_file_ioctl+0x323/frame 0xfffffe006b539880
kern_ioctl() at kern_ioctl+0x1f4/frame 0xfffffe006b5398f0
sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe006b5399c0
amd64_syscall() at amd64_syscall+0x121/frame 0xfffffe006b539af0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe006b539af0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800a6f09a, rsp = 0x7fffffffe588, rbp = 0x7fffffffe640 ---
KDB: enter: panic
```
Here, the  dma_resv_add_shared_fence() performs a critical_enter() and following call of schedule_work() from semaphore_notify() triggers 'acquiring blockable sleep lock with spinlock or critical section held' panic.

Switching irq_work implementation to fast taskqueue fixes the panic for me.

Other report with the similar bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247166

Reviewed By: hselasky
Differential Revision: https://reviews.freebsd.org/D27171
2021-01-17 12:47:28 +01:00
Edward Tomasz Napierala
feb96ee9c8 linux: mute "unsupported socket(AF_NETLINK, 3, NETLINK_AUDIT)" warnings
They are way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-14 09:16:28 +00:00
Edward Tomasz Napierala
ec2700e015 linux: mute the "unsupported prctl option 23" warnings
Make the PR_CAPBSET_READ prctl(2) return EINVAL without logging
any warnings; this is way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-13 10:31:56 +00:00
Edward Tomasz Napierala
a339b4223a linux: bump the default version from 3.10.0 to 3.17.0
This is required for Qt5, as found in Ubuntu Focal.  The library contains
the minimum kernel version encoded in an ELF note; this makes rtld ignore
it altogether, with a confusing error message.  Without it, things fail
like this:

$ konsole: error while loading shared libraries: libQt5Core.so.5: cannot
open shared object file: No such file or directory

For reference, the Qt kernel version requirements can be found at:
https://github.com/qt/qtbase/blob/dev/src/corelib/global/minimum-linux_p.h

Sponsored by:	The FreeBSD Foundation
Reviewed By:	emaste
Differential Revision:	https://reviews.freebsd.org/D28105
2021-01-13 10:02:16 +00:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Emmanuel Vadot
11d62b6f31 linuxkpi: add kernel_fpu_begin/kernel_fpu_end
With newer AMD GPUs (>=Navi,Renoir) there is FPU context usage in the
amdgpu driver.
The `kernel_fpu_begin/end` implementations in drm did not even allow nested
begin-end blocks.

Submitted by: Greg V
Reviewed By: manu, hselasky
Differential Revision: https://reviews.freebsd.org/D28061
2021-01-12 12:31:00 +01:00
Emmanuel Vadot
2c95fb753f linuxkpi: Add shrinker support
A driver can register a shrinker that will be called when the kernel
wants to free some memory.
Add support for that in linuxkpi and call the registered shrinkers
when the lowmem event is triggered.

Reviewed by:	bz
Differential Revision:	 https://reviews.freebsd.org/D27728
2021-01-12 12:31:00 +01:00
Emmanuel Vadot
105a37cac7 linuxkpi: Add more pci functions needed by DRM
-pci_get_class : This function search for a matching pci device based on
   the class/subclass and returns a newly created pci_dev.
 - pci_{save,restore}_state : This is analogous to ours with the same name
 - pci_is_root_bus : Return true if this is the root bus
 - pci_get_domain_bus_and_slot : This function search for a matching pci
   device based on domain, bus and slot/function concat into a single
   unsigned int (devfn) and returns a newly created pci_dev
 - pci_bus_{read,write}_config* : Read/Write to the config space.

While here add some helper function to alloc and fill the pci_dev struct.

Reviewed by:   hselasky, bz (older version)
Differential Revision:	   https://reviews.freebsd.org/D27550
2021-01-12 12:31:00 +01:00
Neel Chauhan
408c514f73 linuxkpi: Fix the "error: unknown type name 'u32'" compilation issue when
building the Intel QAT/QuickAssist driver.

Approved by:		hselasky, kib
Differential Revision:	https://reviews.freebsd.org/D28055
2021-01-09 15:27:04 -08:00
Konstantin Belousov
de27805fee linuxkpi: handle ARI
Stop trying to manually calculate RID, which cannot be done correctly
by PCI_DEVFN().  Use PCI_GET_RID() method instead.

Do not use pci_find_dbsf() to go from the linux pci_dev to freebsd
device_t.  First, device is readily available as dev.bsddev.  Second,
using pci_find_dbsf() fails for ARI-enabled functions with large
function numbers, because PCI_SLOT()/PCI_FUNC() are for non-ARI.

Reviewed by:	bz, hselasky, manu
Tested by:	manu (drm)
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27960
2021-01-08 23:17:21 +02:00
Alan Somers
20321e6225 Regenerate syscall files after reallocation of aio_writev/aio_readv 2021-01-07 19:50:32 -07:00
Alan Somers
b3286afae3 Reallocate syscall numbers for aio_writev and aio_readv
The originally chosen numbers interfere with downstream projects'
syscalls.  Move them to the end of the syscall table instead.

Reported by:	jrtc27
Reviewed by:	brooks
MFC-With:	022ca2fc7f
Differential Revision:	022ca2fc7f
2021-01-07 19:49:27 -07:00
Alan Somers
1868a91fac Regenerate syscall files after addition of aio_writev/aio_readv 2021-01-02 19:57:58 -07:00
Alan Somers
022ca2fc7f Add aio_writev and aio_readv
POSIX AIO is great, but it lacks vectored I/O functions. This commit
fixes that shortcoming by adding aio_writev and aio_readv. They aren't
part of the standard, but they're an obvious extension. They work just
like their synchronous equivalents pwritev and preadv.

It isn't yet possible to use vectored aiocbs with lio_listio, but that
could be added in the future.

Reviewed by:    jhb, kib, bcr
Relnotes:       yes
Differential Revision: https://reviews.freebsd.org/D27743
2021-01-02 19:57:58 -07:00
Konstantin Belousov
9dd48b87e6 Regen. 2020-12-27 12:57:27 +02:00
Konstantin Belousov
7a202823aa Expose eventfd in the native API/ABI using a new __specialfd syscall
eventfd is a Linux system call that produces special file descriptors
for event notification. When porting Linux software, it is currently
usually emulated by epoll-shim on top of kqueues.  Unfortunately, kqueues
are not passable between processes.  And, as noted by the author of
epoll-shim, even if they were, the library state would also have to be
passed somehow.  This came up when debugging strange HW video decode
failures in Firefox.  A native implementation would avoid these problems
and help with porting Linux software.

Since we now already have an eventfd implementation in the kernel (for
the Linuxulator), it's pretty easy to expose it natively, which is what
this patch does.

Submitted by:   greg@unrelenting.technology
Reviewed by:    markj (previous version)
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D26668
2020-12-27 12:57:26 +02:00
Konstantin Belousov
7cb901bf22 Remove useless ARGUSED annotations.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Konstantin Belousov
11c9f2ff1a Add SPDX tag.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Konstantin Belousov
673e2dd652 Add ELF flag to disable ASLR stack gap.
Also centralize and unify checks to enable ASLR stack gap in a new
helper exec_stackgap().

PR:	239873
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-12-18 23:14:39 +00:00
John Baldwin
de66c9a118 Cleanups to *ERR* compat shims.
- Use [u]intptr_t casts to convert pointers to integers.

- Change IS_ERR* to return bool instead of long.

Reviewed by:	manu
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27577
2020-12-17 20:28:53 +00:00
John Baldwin
ce8395ecfd Use the 't' modifier to print a ptrdiff_t.
Reviewed by:	imp
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27576
2020-12-16 00:11:30 +00:00
Hans Petter Selasky
b8b3f4fdc3 Improve handling of alternate settings in the USB stack.
Allow setting the alternate interface number to fail when there is only
one alternate setting present, to comply with the USB specification.

Refactor how iface->num_altsetting is computed.

Bump the __FreeBSD_version due to change of core USB structure.

PR:		251856
MFC after:	1 week
Submitted by:	Ma, Horse <Shichun.Ma@dell.com>
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-15 12:05:07 +00:00
Bryan Drewery
f6e7d67a43 linux_dma: Ensure proper flags pass to allocators.
Possibly fixes the wrong flags being passed to the kernel
allocators in linux_dma_alloc_coherent() and linux_dma_pool_alloc().

Reviewed by:	hps
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D27508
2020-12-10 20:45:08 +00:00
Hans Petter Selasky
a399cf139b Prefer using the MIN() function macro over the min() inline function
in the LinuxKPI. Linux defines min() to be a macro, while in FreeBSD
min() is a static inline function clamping its arguments to
"unsigned int".

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-07 09:48:06 +00:00
Tijl Coosemans
77fb6b6644 Move V4L feature declarations and DTrace provider definitions from
linux_common.c to linux_util.c so they become available on i386.

linux_common.c defines the linux_common kernel module but this module does
not exist on i386 and linux_common.c is not included in the linux module.
linux_util.c is included in the linux_common module on amd64 and the linux
module on i386.

Remove linux_common.c from files.i386 again.  It was added recently in
r367433 when the DTrace provider definitions were moved.

The V4L feature declarations were moved to linux_common in r283423.
2020-12-06 10:58:55 +00:00
Konstantin Belousov
d7d95c3ff8 Regen 2020-12-04 18:58:27 +00:00
Konstantin Belousov
31df9c26c5 Fix compat32 for ntp_adjtime(2).
struct timex is not 32-bit safe, it uses longs for members.
Provide translation.

Reviewed by:	brooks, cy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27471
2020-12-04 18:57:58 +00:00
Hans Petter Selasky
ff15f3f133 Allow the rbtree header file in the LinuxKPI to be used in standalone code.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-04 15:50:44 +00:00
Hans Petter Selasky
01fdacdbc7 Allow the list header file in the LinuxKPI to be used in standalone code.
Some style and spelling nits while at it.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-04 15:46:48 +00:00
Hans Petter Selasky
ed2b70e8af Use function macro for sema_init() in the LinuxKPI to limit macro expansion scope.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-11-30 09:47:53 +00:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Konstantin Belousov
4815f175d0 Linuxolator: Replace use of eventhandlers by sysent hooks.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27309
2020-11-23 18:18:16 +00:00
Kyle Evans
60e60e73fd freebsd32: take the _umtx_op struct definitions back
Providing these in freebsd32.h facilitates local testing/measuring of the
structs rather than forcing one to locally recreate them. Sanity checking
offsets/sizes remains in kern_umtx.c where these are typically used.
2020-11-23 00:58:14 +00:00
Kyle Evans
15eaec6a5c _umtx_op: move compat32 definitions back in
These are reasonably compact, and a future commit will blur the compat32
lines by supporting 32-bit operations with the native _umtx_op.
2020-11-22 05:34:51 +00:00
Hans Petter Selasky
99f20bdc47 Allow LinuxKPI types to be used in bootloaders, by checking for the
_STANDALONE definition.

No functional change intended.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-11-18 13:47:11 +00:00