While there, switch to FreeBSD internal callout active status.
Reviewed by: markj, hselasky
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11900
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)
This is required to support new GCC 7.1 compiler.
This is compatible with current GCC 6.1 compiler.
RISC-V is extensible ISA and the idea here is to have built-in define
per each extension, so together with __riscv we will have some subset
of these as well (depending on -march string passed to compiler):
__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen
Reviewed by: ngie
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11901
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.
While at it remove superfluous use of parenthesis inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Use them in some existing code that is vulnerable to roundoff errors.
The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call. The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the trucated value of 4 in SBT_1NS. (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
Instead of mapping a dummy page upon a page fault, map the page
pointed to by the physical address given by IDX_TO_OFF(vmap->vm_pfn).
To simplify the implementation use OBJT_DEVICE to implement our own
linux_cdev_pager_fault() instead of using the existing
linux_cdev_pager_populate().
Some minor code factoring while at it.
Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Set "td_pinned" to zero after "sched_unbind()" to prevent "td_pinned"
from temporarily becoming negative during "sched_bind()". This can
happen if "sched_bind()" uses "sched_pin()" and "sched_unpin()".
MFC after: 1 week
Sponsored by: Mellanox Technologies
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.
MFC after: 1 week
Sponsored by: Mellanox Technologies
In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
calling thread to sleep.
This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.
Reviewed by: hselasky
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D10986
ARM and MIPS fail universe builds.
ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING
Pointy-hat to: jhibbits
arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks. powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well. Not tested except by
compiling powerpc.
Reviewed by: markj
CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.
MFC after: 1 week
Sponsored by: Mellanox Technologies
kqueue() does not set non-blocking I/O mode for event driven read of
file descriptors. This means the LinuxKPI internal kqueue read and
write event flags must be updated before the next read and/or write
system call. Else the read and/or write system call may block. This
can happen when there is no more data to read following a previous
read event. Then the application also gets blocked from processing
other events. This situation can also be solved by the applications
setting and using non-blocking I/O mode.
MFC after: 1 week
Sponsored by: Mellanox Technologies
character devices. In Linux the FIONBIO IOCTL is handled by the kernel
and not the drivers. Also need return success for the FIOASYNC ioctl
due to existing logic in kern_fcntl() even though it is not supported
currently.
MFC after: 1 week
Sponsored by: Mellanox Technologies
polling contexts in the LinuxKPI.
After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
ioctl(), read() and write() system call handlers. This error code is
internal to the kernel and should not be seen by user-space programs
according to Linux.
Submitted by: Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.
MFC after: 1 week
Sponsored by: Mellanox Technologies
task structure to avoid deadlock when tearing down the VM object
during a process exit.
Found by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
it shouldn't be called.
Background:
The Linux VM open operation is called when a new VMA is
created on top of the current VMA. This is done through either mremap
flow or split_vma, usually due to mlock, madvise, munmap and so
on. This is currently not supported by the LinuxKPI.
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Allow "struct linux_file" to be refcounted when its "_file" member
is NULL by using its "f_count" field. The reference counts are
transferred to the file structure when the file descriptor is
installed.
- Add missing vdrop() calls for error cases during open().
- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.
MFC after: 1 week
Sponsored by: Mellanox Technologies
kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.
MFC after: 1 week
CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Background:
The same VM object might be shared by multiple processes and the
mm_struct is usually freed when a process exits.
Grab a reference on the mm_struct while the vmap is in the
linux_vma_head list in case the first process which inserted a VM
object has exited.
Tested by: kwm @
MFC after: 1 week
Sponsored by: Mellanox Technologies
linux_page_address() function in the LinuxKPI. This solves an issue
where the return value from linux_page_address() is passed to
kmem_free().
MFC after: 1 week
Sponsored by: Mellanox Technologies
kit, CK, in the LinuxKPI.
When threads are pinned to a CPU core or when there is only one CPU,
it can happen that a higher priority thread can call the CK
synchronize function while a lower priority thread holds the read
lock. Because the CK's synchronize is a simple wait loop this can lead
to a deadlock situation. To solve this problem use the recently
introduced CK's wait callback function.
When detecting a CK blocking condition figure out the lowest priority
among the blockers and update the calling thread's priority and
yield. If another CPU core is holding the read lock, pin the thread to
the blocked CPU core and update the priority. The calling threads
priority and CPU bindings are restored before return.
If a thread holding a CK read lock is detected to be sleeping, pause()
will be used instead of yield().
MFC after: 1 week
Sponsored by: Mellanox Technologies
CPUs when allocating a LinuxKPI workqueue. This also ensures that the
created taskqueue always have a non-zero number of worker threads.
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Move all bitmap related functions from bitops.h to bitmap.h, similar
to what Linux does.
- Apply some minor code cleanup and simplifications to optimize the
generated code when using static inline functions.
- Implement the following list of bitmap functions which are needed by
drm-next and ibcore:
- bitmap_find_next_zero_area_off()
- bitmap_find_next_zero_area()
- bitmap_or()
- bitmap_and()
- bitmap_xor()
- Add missing include directives to the qlnxe driver
(davidcs@ has been notified)
MFC after: 1 week
Sponsored by: Mellanox Technologies
In FreeBSD thread IDs and procedure IDs have distinct number
spaces. When asking for the group leader task ID in the LinuxKPI,
return the procedure ID and let this resolve to the first task in the
procedure having a valid LinuxKPI task structure pointer.
MFC after: 1 week
Sponsored by: Mellanox Technologies
like open, close and fault using the character device pager.
Some notes about the implementation:
1) Linux drivers set the vm_ops and vm_private_data fields during a
mmap() call to indicate that the driver wants to use the LinuxKPI VM
operations. Else these operations are not used.
2) The vm_private_data pointer is associated with a VM area structure
and inserted into an internal LinuxKPI list. If the vm_private_data
pointer already exists, the existing VM area structure is used instead
of the allocated one which gets freed.
3) The LinuxKPI's vm_private_data pointer is used as the callback
handle for the FreeBSD VM object. The VM subsystem in FreeBSD has a
similar list to identify equal handles and will only call the
character device pager's close function once.
4) All LinuxKPI VM operations are serialized through the mmap_sem
sempaphore, which is per procedure, which prevents simultaneous access
to the shared VM area structure when receiving page faults.
Obtained from: kmacy @
MFC after: 1 week
Sponsored by: Mellanox Technologies