have a scope ID. Change size of the searched scope ID to the full
16-bits. There can typically be more than 255 interfaces.
Suggested by: ae @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Workaround problem that ifa_ifwithaddr() also matches the scope ID of
the IPv6 address when searching for a maching IPv6 address. For now
simply try all valid scope IDs until a match is found.
MFC after: 1 week
Sponsored by: Mellanox Technologies
use of the linux_poll_wakeup() function from unsafe contexts, which
can lead to use-after-free issues.
Instead of calling linux_poll_wakeup() directly use the wake_up()
family of functions in the LinuxKPI to do this.
Bump the FreeBSD version to force recompilation of external kernel modules.
MFC after: 1 week
Sponsored by: Mellanox Technologies
and convert any messages of types SCM_BINTIME, SCM_TIMESTAMP,
SCM_REALTIME and SCM_MONOTONIC from 64-bit to its 32-bit
representation. Otherwise we either run out of user-supplied
buffer to copy those out resulting in the MSG_CTRUNC or simply
return values that the userland 32-bit code is not going
to parse correctly. This fixes at least two regression tests
failing to function properly in 32-bit compat mode:
tools/regression/sockets/udp_pingpong
tools/regression/sockets/unix_cmsg
PR: kern/222039
MFC after: 30 days
Now that CloudABI's sockets API has been changed to be addressless and
only connected socket instances are used (e.g., socket pairs), they have
become fairly similar to pipes. The only differences on CloudABI is that
socket pairs additionally support shutdown(), send() and recv().
To simplify the ABI, we've therefore decided to remove pipes as a
separate file descriptor type and just let pipe() return a socket pair
of type SOCK_STREAM. S_ISFIFO() and S_ISSOCK() are now defined
identically.
While I am here, declare struct md_ioctl32 as packed which allows
us to stop playing tricks with sizeof(md_ioctl32)+y as well as
simplifies md_pad handling. Both were necessary because of different
alignment preferences on amd64 vs i386.
MFC after: 4 weeks
Now that all of the packaged software has been adjusted to either use
Flower (https://github.com/NuxiNL/flower) for making incoming/outgoing
network connections or can have connections injected, there is no longer
need to keep accept() around. It is now a lot easier to write networked
services that are address family independent, dual-stack, testable, etc.
Remove all of the bits related to accept(), but also to
getsockopt(SO_ACCEPTCONN).
With Flower (CloudABI's network connection daemon) becoming more
complete, there is no longer any need for creating any unconnected
sockets. Socket pairs in combination with file descriptor passing is all
that is necessary, as that is what is used by Flower to pass network
connections from the public internet to listening processes.
Remove all of the kernel bits that were used to implement socket(),
listen(), bindat() and connectat(). In principle, accept() and
SO_ACCEPTCONN may also be removed, but there are still some consumers
left.
Obtained from: https://github.com/NuxiNL/cloudabi
MFC after: 1 month
Deadlock condition:
The return value of TDQ_LOCKPTR(td) is the same for two threads.
1) The first thread signals a wakeup while keeping the rcu_read_lock().
This invokes sched_add() which in turn will try to lock TDQ_LOCK().
2) The second thread is calling synchronize_rcu() calling mi_switch() over
and over again trying to yield(). This prevents the first thread from running
and releasing the RCU reader lock.
Solution:
Release the thread lock while yielding to allow other threads to acquire the
lock pointed to by TDQ_LOCKPTR(td).
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Drop the EARLY_AP_STARTUP gtaskqueue code, as gtaskqueues are now
initialized before APs are started.
Reviewed by: hselasky@, jhb@
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12054
in the VM area structure in the LinuxKPI when doing mmap() and that
unsupported bits are masked away.
While at it fix some redundant use of parenthesing inside some related
macros.
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Such drivers attach to a vgapci bus rather than directly to a pci bus. For
the rest of the LinuxKPI to work correctly in this case, we override the
vgapci bus' ivars with those of the grandparent.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11932
after r319757.
1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.
2) Make sure the "timeout" argument is casted to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.
3) The wait_event() function should not have a return value.
Found by: KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.
While at it change the type of returned variable from "long" to "int" to
match the actual return type.
MFC after: 1 week
Sponsored by: Mellanox Technologies
LinuxKPI workqueue wrappers reported "successful" cancellation for works
already completed in normal way. This change brings reported status and
real cancellation fact into sync. This required for drm-next operation.
Reviewed by: hselasky (earlier version)
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11904
While there, switch to FreeBSD internal callout active status.
Reviewed by: markj, hselasky
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D11900
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)
This is required to support new GCC 7.1 compiler.
This is compatible with current GCC 6.1 compiler.
RISC-V is extensible ISA and the idea here is to have built-in define
per each extension, so together with __riscv we will have some subset
of these as well (depending on -march string passed to compiler):
__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen
Reviewed by: ngie
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D11901
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.
While at it remove superfluous use of parenthesis inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Use them in some existing code that is vulnerable to roundoff errors.
The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call. The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the trucated value of 4 in SBT_1NS. (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:
- mlock()/munlock():
These calls tend to be used for two different purposes: real-time
support and handling of sensitive (cryptographic) material that
shouldn't end up in swap. The former use case is out of scope for
CloudABI. The latter may also be handled by encrypting swap.
Removing this has the advantage that we no longer need to worry about
having resource limits put in place.
- SOCK_SEQPACKET:
Support for SOCK_SEQPACKET is rather inconsistent across various
operating systems. Some operating systems supported by CloudABI (e.g.,
macOS) don't support it at all. Considering that they are rarely used,
remove support for the time being.
- getsockname(), getpeername(), etc.:
A shortcoming of the sockets API is that it doesn't allow you to
create socket(pair)s, having fake socket addresses associated with
them. This makes it harder to test applications or transparently
forward (proxy) connections to them.
With CloudABI, we're slowly moving networking connectivity into a
separate daemon called Flower. In addition to passing around socket
file descriptors, this daemon provides address information in the form
of arbitrary string labels. There is thus no longer any need for
requesting socket address information from the kernel itself.
This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.
Obtained from: https://github.com/NuxiNL/cloudabi
It looks like the __acquire and __release macros are for the consumption
of static analysis tools and have no semantic effect. Transform the
definitions from constant expressions to empty statements in order to
avoid -Wunused-value from gcc.
Likewise avoid future warnings for __chk_{user,io}_ptr, but with a cast
to void, because it looks like some linux kernel code may use those in
expression contexts.
Reviewed by: hselasky, markj
Approved by: markj (mentor)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11695
Using the https://github.com/google/capsicum-test/ suite, the
PosixMqueue.CapModeForked test was failing due to an ECAPMODE after
calling kmq_notify(). On further inspection, the dynamically
loaded syscall entry was initialized with sy_flags zeroed out, since
SYSCALL_INIT_HELPER() left sysent.sy_flags with the default value.
Add a new helper SYSCALL{,32}_INIT_HELPER_F() which takes an
additional argument to specify the sy_flags value.
Submitted by: Siva Mahadevan <smahadevan@freebsdfoundation.org>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D11576
Also add some checks for overflow to existing functions.
Reviewed by: hselasky
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D11533
Instead of mapping a dummy page upon a page fault, map the page
pointed to by the physical address given by IDX_TO_OFF(vmap->vm_pfn).
To simplify the implementation use OBJT_DEVICE to implement our own
linux_cdev_pager_fault() instead of using the existing
linux_cdev_pager_populate().
Some minor code factoring while at it.
Reviewed by: markj @
MFC after: 1 week
Sponsored by: Mellanox Technologies
Set "td_pinned" to zero after "sched_unbind()" to prevent "td_pinned"
from temporarily becoming negative during "sched_bind()". This can
happen if "sched_bind()" uses "sched_pin()" and "sched_unpin()".
MFC after: 1 week
Sponsored by: Mellanox Technologies