became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)
Reviewed by: kib, markj
Discussed with: jeff, re@
Differential Revision: https://reviews.freebsd.org/D16825
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.
Our kernel currently defines two-argument versions of timespecadd and
timespecsub. NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions. Solaris also defines a three-argument version, but
only in its kernel. This revision changes our definition to match the
common three-argument version.
Bump _FreeBSD_version due to the breaking KPI change.
Discussed with: cem, jilles, ian, bde
Differential Revision: https://reviews.freebsd.org/D14725
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.
In preparation to fix this create two macros depending on if the data is
global or static.
Reviewed by: bz, emaste, markj
Sponsored by: ABT Systems Ltd
Differential Revision: https://reviews.freebsd.org/D16140
While at it rename hlist_add_after() into hlist_add_behind().
Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
the LinuxKPI. Add a comment saying in which Linux version this change was made.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
in the LinuxKPI. While at it document when to use the "virtual_address" or
the "address" field in the "vm_fault" structure.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
sg_alloc_table_from_pages() function in the LinuxKPI.
This basically allow segments to have a limit, max_segment.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
with upstream Linux by returning the pointer to the removed element.
Submitted by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Add macros to allow preinitialization of cap_rights_t.
- Convert most commonly used code paths to use preinitialized cap_rights_t.
A 3.6% speedup in fstat was measured with this change.
Reported by: mjg
Reviewed by: oshogbo
Approved by: sbruno
MFC after: 1 month
- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.
- Avoid unlocking Giant right before aquiring the sleepqueue lock.
This can save a task switch.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.
MFC after: 1 week
Sponsored by: Mellanox Technologies
When complete_all() is called there might be multiple waiters. The
current implementation could only handle one waiter. Make sure the
completion is sticky when complete_all() is called to be compatible
with Linux.
Found by: Johannes Lundberg <johalun0@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
instead of pause() in the msleep() function to avoid rounding errors when
converting delay values forth and back. Add a guard for a delay value
of zero milliseconds which is undefined.
MFC after: 1 week
Requested by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
delayed_work in the LinuxKPI. This allows the timer_pending() function
macro to be used with delayed work structures.
No functional nor structural change.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
LinuxKPI to comply more with Linux. This fixes an issue when these functions
are used in waiting loops.
MFC after: 1 week
Sponsored by: Mellanox Technologies
signal in the LinuxKPI.
The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass. If there is no object
associated with the wait, use curthread' policy domainset. The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.
Eliminate pagedaemon_wait(). vm_domain_clear() handles the same
operations.
Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.
Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.
Scetched and reviewed by: jeff
Tested by: pho
Sponsored by: The FreeBSD Foundation, Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D14384
Make sure all RCU dereferencing use the READ_ONCE() function macro.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Sponsored by: Limelight Networks
printk_ratelimited() function macro to return a boolean stating if there
was a printout, true, or not, false.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
LinuxKPI to be compatible with Linux. No functional change.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
handler in the LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.
MFC after: 1 week
Submitted by: Johannes Lundberg <johalun0@gmail.com>
Sponsored by: Mellanox Technologies
Older versions of GCC don't allow flexible array members in a union.
Use a zero length array instead.
MFC after: 1 week
Reported by: jbeich@
Sponsored by: Mellanox Technologies
When the owner of the wire reference releases the last reference, it
might be that the page was already attempted to be freed (but free
cannot be performed at that time due to wire). Check that the page
was removed from the object as the indicator of the free attempt and
finish the free operation if so.
Reported and tested by: Slava Shwartsman
Reviewed by: hselasky
Sponsored by: Mellanox Technologies
MFC after: 1 week
actually return the computed result instead of the input value.
This is a regression issue after r289572.
Found by: gcc6
MFC after: 3 days
Sponsored by: Mellanox Technologies
1) The OPW() function macro should have the same return type like the
function it executes.
2) The DEVFS I/O-limit should be enforced for all character device reads
and writes.
3) The character device file handle should be passable, same as for
DEVFS based file handles.
Reported by: jbeich @
MFC after: 1 week
Sponsored by: Mellanox Technologies
in the LinuxKPI. This is done by calling finit() just before returning a magic
value of ENXIO in the "linux_dev_fdopen" function.
The Linux file structure should mimic the BSD file structure as much as
possible. This patch decouples the Linux file structure from the belonging
character device right after the "linux_dev_fdopen" function has returned.
This fixes an issue which allows a Linux file handle to exist after a
character device has been destroyed and removed from the directory index
of /dev. Only when the reference count of the BSD file handle reaches zero,
the Linux file handle is destroyed. This fixes use-after-free issues related
to accessing the Linux file structure after the character device has been
destroyed.
While at it add a missing NULL check for non-present file operation.
Calling a NULL pointer will result in a segmentation fault.
Reviewed by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
than virtual
Summary:
Some architectures have physical/bus addresses that are much larger
than virtual addresses. This change just quiets a warning, as DMAP is not used
on those architectures, and on 64-bit platforms uintptr_t is the same size as
vm_paddr_t and void *.
Reviewed By: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D14043
in the LinuxKPI. The old implementation assumed only one IDR layer was present.
Take additional IDR layers into account when computing the "id" value.
MFC after: 1 week
Found by: Karthik Palanichamy <karthikp@chelsio.com>
Tested by: Karthik Palanichamy <karthikp@chelsio.com>
Sponsored by: Mellanox Technologies
(i386 and arm) that never implement them. This allows the removal of
#ifdef PHYS_TO_DMAP on code otherwise protected by a runtime check on
PMAP_HAS_DMAP. It also fixes the build on ARM and i386 after I forgot an
#ifdef in r328168.
Reported by: Milan Obuch
Pointy hat to: me
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.
As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.
Reviewed by: kib
Suggestions from: marius, alc, kib
Runtime tested on: amd64, powerpc64, powerpc, mips64
Email address has changed, uses consistent name (Matthew, not Matt)
Reported by: Matthew Macy <mmacy@mattmacy.io>
Differential Revision: https://reviews.freebsd.org/D13537
in the LinuxKPI returns NULL. This happens when the VM area's private
data handle already exists and could cause a so-called NULL pointer
dereferencing issue prior to this fix.
Found by: greg@unrelenting.technology
MFC after: 1 week
Sponsored by: Mellanox Technologies
LinuxKPI task struct. Change type of "state" variable from "int" to
"atomic_t" to simplify code and avoid unneccessary casting.
MFC after: 1 week
Sponsored by: Mellanox Technologies
This pointer is checked during the linux_dev_open() callback and does
not need to be NULL checked again. It should always be set for
character devices belonging to the "linuxcdevsw" and technically
there is no need to NULL check this pointer at all.
Suggested by: kib @
MFC after: 1 week
Sponsored by: Mellanox Technologies
After some in-progress work is committed, this would otherwise be the only
instance of #if(n)def NO_SWAPPING in the tree. Moreover, the requisite
opt_vm.h include was missing, so the PHOLD/PRELE calls were always being
compiled in anyway.
MFC after: 1 week
then td->td_sel is NULL and this will result in a segfault inside selrecord().
This happens when only using kqueue() to poll for read and write events.
If select() and kqueue() is mixed there won't be a segfault.
Reported by: Johannes Lundberg
MFC after: 1 week
Sponsored by: Mellanox Technologies
gets drained before invoking the work function. Else the timer
mutex may still be in use which can lead to use-after-free situations,
because the work function might free the work structure before returning.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Bump the FreeBSD version to force recompilation of external
kernel modules due to structure change.
PR: 222504
Submitted by: Greg V <greg@unrelenting.technology>
MFC after: 1 week
Sponsored by: Mellanox Technologies
have a scope ID. Change size of the searched scope ID to the full
16-bits. There can typically be more than 255 interfaces.
Suggested by: ae @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Workaround problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a maching IPv6 address. For now
simply try all valid scope IDs until a match is found.
MFC after: 1 week
Sponsored by: Mellanox Technologies
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.
Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.
Bump the FreeBSD version to force recompilation of external kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Deadlock condition:
The return value of TDQ_LOCKPTR(td) is the same for two threads.
1) The first thread signals a wakeup while keeping the rcu_read_lock().
This invokes sched_add() which in turn will try to lock TDQ_LOCK().
2) The second thread is calling synchronize_rcu() calling mi_switch() over
and over again trying to yield(). This prevents the first thread from running
and releasing the RCU reader lock.
Solution:
Release the thread lock while yielding to allow other threads to acquire the
lock pointed to by TDQ_LOCKPTR(td).
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies