output and replace it with a new visible sysctl kern.ipc.acceptqueue
of the same functionality. It specifies the maximum length of the
accept queue on a listen socket.
The old kern.ipc.somaxconn remains available for reading and writing
for compatibility reasons so that existing programs, scripts and
configurations continue to work. There are no plans to ever remove the
original and now hidden kern.ipc.somaxconn.
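A minimal sketch of how a program could read the limit while staying
compatible with both names. The helper name accept_queue_limit() is
hypothetical; it tries the new sysctl first and falls back to the old one,
and on systems other than FreeBSD it simply reports failure.

```c
#include <sys/types.h>
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/sysctl.h>
#endif

/*
 * Hypothetical helper: return the accept queue limit, or -1 if neither
 * sysctl can be read (always -1 when not built on FreeBSD).
 */
static int
accept_queue_limit(void)
{
#ifdef __FreeBSD__
	int value;
	size_t len;

	/* Prefer the new, visible name from this commit. */
	len = sizeof(value);
	if (sysctlbyname("kern.ipc.acceptqueue", &value, &len, NULL, 0) == 0)
		return (value);

	/* Fall back to the old name, which stays readable and writable. */
	len = sizeof(value);
	if (sysctlbyname("kern.ipc.somaxconn", &value, &len, NULL, 0) == 0)
		return (value);
#endif
	return (-1);
}
```

Existing programs that only know kern.ipc.somaxconn need no change; the
fallback order above only matters for code that wants to prefer the new name.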
GIANT from VFS. In addition, disconnect also netsmb, which is a base
requirement for SMBFS.
In the meantime, regular SMBFS users can use the FUSE interface and the
smbnetfs port to work with their SMBFS partitions.
Also, there are ongoing efforts by the vendor to support in-kernel smbfs,
so there is a good chance that it will get relinked once properly locked.
This is not targeted for MFC.
GIANT from VFS. This code is particularly broken and fragile, and the other
in-kernel implementations found in other operating systems
don't really seem clean and solid enough to be imported at all.
If someone wants to reconsider an in-kernel NTFS implementation for
inclusion again, a fair effort for completely fixing and cleaning it
up is expected.
In the meantime, regular NTFS users can use the FUSE interface and the
ntfs-3g port to work with their NTFS partitions.
This is not targeted for MFC.
GIANT from VFS. In addition, disconnect also netncp, which is a base
requirement for NWFS.
In case the code is maintained in the future and later re-added to the
FreeBSD base, maybe we should think about a better location
for netncp. I'm not entirely sure the / top location is actually right,
however I will let network people comment on that more specifically.
This is not targeted for MFC.
counter, without actually allocating the vnodes. The intended use of
getnewvnode_reserve(9) is to reclaim enough free vnodes while the
code still does not hold any resources that might be needed during the
reclamation, and to consume the slack later for getnewvnode() calls
made from the innards. After the critical block is finished, the
caller shall free any remaining reserve by calling
getnewvnode_drop_reserve(9).
Reviewed by: avg
Tested by: pho
MFC after: 1 week
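The reserve/consume/drop pattern described in this entry can be modelled in
userland. This is only a toy sketch: struct vnode_pool and the pool_*()
functions are hypothetical names and do not mirror the kernel's actual
implementation of getnewvnode_reserve(9).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland model; not the kernel data structures. */
struct vnode_pool {
	int	limit;		/* maximum number of vnodes */
	int	allocated;	/* vnodes actually handed out */
	int	reserved;	/* slack set aside but not yet consumed */
};

/* Bump the counter by 'count' without allocating anything. */
static bool
pool_reserve(struct vnode_pool *p, int count)
{
	assert(count >= 0);
	if (p->allocated + p->reserved + count > p->limit)
		return (false);	/* the kernel would reclaim free vnodes here */
	p->reserved += count;
	return (true);
}

/* Allocate one vnode, consuming the reserve first if there is any. */
static bool
pool_alloc(struct vnode_pool *p)
{
	if (p->reserved > 0)
		p->reserved--;
	else if (p->allocated + 1 > p->limit)
		return (false);
	p->allocated++;
	return (true);
}

/* Give back whatever part of the reserve was not consumed. */
static void
pool_drop_reserve(struct vnode_pool *p)
{
	p->reserved = 0;
}
```

The point of the model is the ordering: pool_reserve() is called before the
critical block takes any resources, pool_alloc() cannot fail inside the block
while reserve remains, and pool_drop_reserve() runs once the block is done.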
division by zero later if event timer's minimal period is above one second.
For now it is just a theoretical possibility.
Found by: Clang Static Analyzer
instruction loads/stores at its will.
The macro __compiler_membar() is currently supported for both gcc and
clang, but kernel compilation will fail otherwise.
Reviewed by: bde, kib
Discussed with: dim, theraven
MFC after: 2 weeks
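For illustration, a compiler-only barrier of this kind can be sketched as
below. The macro compiler_membar() is a local stand-in written for this
example, and it assumes a gcc- or clang-compatible compiler; it constrains
only the compiler and is not a CPU memory barrier.

```c
/*
 * Local stand-in for the barrier (assumption: gcc/clang inline asm).
 * The empty asm with a "memory" clobber tells the compiler that memory
 * may have changed, so it must not move loads or stores across it.
 */
#define	compiler_membar()	__asm__ __volatile__("" ::: "memory")

static int shared_data;
static int shared_ready;

/* Store the payload, then the flag, in program order as far as the
 * compiler is concerned. */
static void
publish(int value)
{
	shared_data = value;
	compiler_membar();
	shared_ready = 1;
}
```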
.. when deciding whether to continue tracing across suid/sgid exec.
Otherwise, if root ktrace-d an unprivileged process and the process
exec-ed a suid program, then tracing didn't continue across exec.
Reviewed by: bde, kib
MFC after: 22 days
When performing a non-blocking read(2) on a TTY while no data is
available, we should return EAGAIN. But if there's a modem disconnect,
we should return 0. Right now we only return 0 when doing a blocking
read, which is wrong.
MFC after: 1 month
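The distinction between "no data yet" and "the other side is gone" can be
observed from userland. The sketch below uses a pipe rather than a TTY,
purely as an analogy that runs anywhere: closing the write end plays the
role of the modem disconnect. Both helper names are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Classify the result of one non-blocking read on fd:
 *   1  data was read,
 *   0  end of file (for a TTY: the line was disconnected),
 *  -1  no data available yet (EAGAIN/EWOULDBLOCK),
 *  -2  any other error.
 */
static int
classify_nonblocking_read(int fd)
{
	char buf[16];
	ssize_t n;

	n = read(fd, buf, sizeof(buf));
	if (n > 0)
		return (1);
	if (n == 0)
		return (0);
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return (-1);
	return (-2);
}

/* Create a pipe whose read end is non-blocking; returns 0 on success. */
static int
make_nonblocking_pipe(int fds[2])
{
	int flags;

	if (pipe(fds) != 0)
		return (-1);
	flags = fcntl(fds[0], F_GETFL);
	if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1)
		return (-1);
	return (0);
}
```

A caller that treats every failed non-blocking read as "try again later"
would spin forever after a disconnect, which is why the 0 return matters in
the non-blocking case as well.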
If you have a binary on a filesystem which is also mounted over by
nullfs, you could execute the binary from the lower filesystem, or
from the nullfs mount. When executed from the lower filesystem, the lower
vnode gets the VV_TEXT flag set, and the file cannot be modified while the
binary is active. But, if executed as the nullfs alias, only the
nullfs vnode gets VV_TEXT set, and you can still open the lower vnode
for write.
Add a set of VOPs for the VV_TEXT query, set and clear operations,
which are correctly bypassed to the lower vnode.
Tested by: pho (previous version)
MFC after: 2 weeks
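The effect of bypassing the query, set and clear operations can be shown
with a small userland model. Everything here is hypothetical (struct
vnode_model and the model_*() functions are invented for the example); it
only illustrates that the flag ends up on the lowest vnode no matter which
alias was used to execute the binary.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: an alias vnode points at the vnode it covers. */
struct vnode_model {
	struct vnode_model *lower;	/* NULL for a real (lowest) vnode */
	bool	text;			/* models VV_TEXT */
};

/* Follow the bypass chain down to the vnode that really owns the file. */
static struct vnode_model *
model_lowest(struct vnode_model *vp)
{
	while (vp->lower != NULL)
		vp = vp->lower;
	return (vp);
}

static void
model_set_text(struct vnode_model *vp)
{
	model_lowest(vp)->text = true;
}

static void
model_unset_text(struct vnode_model *vp)
{
	model_lowest(vp)->text = false;
}

static bool
model_is_text(struct vnode_model *vp)
{
	return (model_lowest(vp)->text);
}

/* A file may be opened for write only while it is not being executed. */
static bool
model_can_open_for_write(struct vnode_model *vp)
{
	return (!model_is_text(vp));
}
```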
I have to note that POSIX is simply stupid in how it describes O_EXEC/fexecve
and friends. Yes, not only inconsistent, but stupid.
In the open(2) description, O_RDONLY flag is described as:
O_RDONLY Open for reading only.
Taken from:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html
Note "for reading only". Not "for reading or executing"!
In the fexecve(2) description you can find:
The fexecve() function shall fail if:
[EBADF]
The fd argument is not a valid file descriptor open for executing.
Taken from:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/exec.html
As you can see the function shall fail if the file was not open with O_EXEC!
And yet, if you look closer you can find this mess in the exec.html:
Since execute permission is checked by fexecve(), the file description
fd need not have been opened with the O_EXEC flag.
Yes, the O_EXEC flag doesn't have to be specified after all. You can open a
file with O_RDONLY and you will still be able to fexecve(2) it.
global variables are placed. When a module is loaded by the link_elf
linker, its variables from the "set_vnet" linker set are copied to the
kernel "set_vnet" ("modspace") and all references to these variables
inside the module are relocated accordingly.
The issue is when a module is loaded that has references to global
variables from another, previously loaded module: these references are
not relocated so an invalid address is used when the module tries to
access the variable. An example is V_layer3_chain, defined in the ipfw
module and accessed from ipfw_nat.
The same issue exists with DPCPU variables, which use the "set_pcpu"
linker set.
Fix this by making the link_elf linker, on a module load, recognize
"external" DPCPU/VNET variables defined in the previously loaded
modules and relocate them accordingly. For this, set_pcpu_list and
set_vnet_list are used, where the addresses of modules' "set_pcpu" and
"set_vnet" linker sets are stored.
Note, archs that use link_elf_obj (amd64) were not affected by this
issue.
Reviewed by: jhb, julian, zec (initial version)
MFC after: 1 month
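The idea of relocating references into another module's linker set can be
sketched as a range lookup. This is a simplified userland model, not the
link_elf code: struct set_entry and relocate_external() are hypothetical
names, and the list stands in for what set_vnet_list or set_pcpu_list
would record.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one previously loaded module's linker set. */
struct set_entry {
	uintptr_t start;	/* where the module's own set was loaded */
	uintptr_t stop;		/* one past the end of that set */
	uintptr_t base;		/* where its copy lives in the kernel set */
};

/*
 * If addr points into the linker set of an already loaded module, return
 * the matching address inside the kernel's copy; otherwise return addr
 * unchanged.
 */
static uintptr_t
relocate_external(const struct set_entry *list, size_t n, uintptr_t addr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (addr >= list[i].start && addr < list[i].stop)
			return (list[i].base + (addr - list[i].start));
	}
	return (addr);
}
```

Without such a lookup, a reference from one module to a variable in another
module's set keeps pointing at the unrelocated address, which is the invalid
access this entry describes.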
getmq_read() and getmq_write() respectively, just like sys_kmq_timedreceive()
and sys_kmq_timedsend().
Sponsored by: FreeBSD Foundation
MFC after: 2 weeks
Well, in theory we can pass those two flags, because O_RDONLY is 0,
but we won't be able to read from a descriptor opened with O_EXEC.
Update the comment.
Sponsored by: FreeBSD Foundation
MFC after: 2 weeks
If O_EXEC is provided, don't require CAP_READ/CAP_WRITE, as O_EXEC
is mutually exclusive with O_RDONLY/O_WRONLY/O_RDWR.
Without this change the CAP_FEXECVE capability right is not enforced.
Sponsored by: FreeBSD Foundation
MFC after: 3 days
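A sketch of why O_EXEC has to be checked before the access mode. All
constants below are hypothetical stand-ins (prefixed XO_ and XCAP_) chosen
so the example builds anywhere; the only property taken from the text above
is that the read-only access mode is 0, so an O_EXEC open looks like a
read-only open if only the access mode bits are examined.

```c
/* Hypothetical open flags; the read-only mode is 0, as with O_RDONLY. */
enum {
	XO_RDONLY	= 0x0,
	XO_WRONLY	= 0x1,
	XO_RDWR		= 0x2,
	XO_ACCMODE	= 0x3,
	XO_EXEC		= 0x40000
};

/* Hypothetical capability rights bits. */
enum {
	XCAP_READ	= 0x1,
	XCAP_WRITE	= 0x2,
	XCAP_FEXECVE	= 0x4
};

/* Return the rights that an open with 'flags' should require. */
static unsigned
rights_for_open(int flags)
{
	/* Must come first: (flags & XO_ACCMODE) is 0 for an exec-only open. */
	if ((flags & XO_EXEC) != 0)
		return (XCAP_FEXECVE);

	switch (flags & XO_ACCMODE) {
	case XO_RDONLY:
		return (XCAP_READ);
	case XO_WRONLY:
		return (XCAP_WRITE);
	case XO_RDWR:
		return (XCAP_READ | XCAP_WRITE);
	default:
		return (0);
	}
}
```

If the XO_EXEC test were missing, an exec-only open would be checked
against XCAP_READ and XCAP_FEXECVE would never be demanded.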
"genunix". This will require us to modify externally created
DTrace scripts but makes logical sense for FreeBSD.
Requested by: rpaulo
MFC after: 2 weeks
as controlled by kern.random.sys.harvest.swi. SWI harvesting feeds into
the interrupt FIFO and each event is estimated as providing a single bit of
entropy.
Reviewed by: markm, obrien
MFC after: 2 weeks
slot. This eventually results in exhaustion of the tid space, causing
new threads to get tid -1 as their identifier.
The bad effect of having the thread id equal to -1 is that
UMTX_OP_UMUTEX_WAIT returns EFAULT for a lock owned by such a thread,
because casuword cannot distinguish between the literal value -1 read from
the address and -1 returned as an indication of a faulted
access. The _thr_umutex_lock() helper from libthr does not check for
errors from _umtx_op_err(2), causing an infinite loop in
mutex_lock_sleep().
We observed JVM processes hanging and consuming an enormous amount of
system time on machines with approximately 100 days of uptime.
Reported by: Mykola Dzham <freebsd levsha org ua>
MFC after: 1 week
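The ambiguity of an in-band -1 can be shown in a few lines. This is a
userland illustration only: fetch_inband() and fetch_outofband() are
hypothetical functions, and a NULL pointer stands in for a faulting user
address.

```c
#include <stddef.h>

/* In-band error reporting: -1 means "fault", but -1 is also a legal value. */
static long
fetch_inband(const long *addr)
{
	if (addr == NULL)	/* stands in for a faulting access */
		return (-1);
	return (*addr);
}

/*
 * Out-of-band error reporting: the status is the return value and the
 * fetched word is stored through valp, so no value is ambiguous.
 */
static int
fetch_outofband(const long *addr, long *valp)
{
	if (addr == NULL)
		return (-1);
	*valp = *addr;
	return (0);
}
```

With the in-band variant, a lock word holding an owner id of -1 is
indistinguishable from a fault, which matches the EFAULT behaviour described
in this entry.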
trap checks (eg. printtrap()).
Generally this check is not needed anymore, as there is no legitimate
case where curthread == NULL after the pcpu 0 area has been properly
initialized.
Reviewed by: bde, jhb
MFC after: 1 week
set p_xstat to the signal that triggered the stop, but p_xstat is also
used to hold the exit status of an exiting process. Without this change,
a stop signal that arrived after a process was marked P_WEXIT but before
it was marked a zombie would overwrite the exit status with the stop signal
number.
Reviewed by: kib
MFC after: 1 week
Idle threads are not allowed to acquire any lock but spinlocks.
Deny any attempt to do so by panicking at the locking operation
when INVARIANTS is on. Then, remove the check on blocking on a
turnstile.
The check in sleepqueues is left in place because idle threads are not
allowed to use tsleep() either, which could still happen.
Reviewed by: bde, jhb, kib
MFC after: 1 week
with TDP_NOSLEEPING on.
The current message has no information on the thread and wchan
involved, which may be useful in cases where dumps have mangled DWARF
information.
Reported by: kib
Reviewed by: bde, jhb, kib
MFC after: 1 week
about vnode reclamation. Typical use is for the bypass mounts like
nullfs to get a notification about the lower vnode going away.
Now, vgone() calls the new VFS op vfs_reclaim_lowervp() with the
lowervp being reclaimed as its argument. It is possible to register several
reclamation event listeners, to correctly handle the case of several
nullfs mounts over the same directory.
For a filesystem that has no nullfs mounts over it, the overhead
added is a single mount interlock lock/unlock in the vnode reclamation
path.
In collaboration with: pho
MFC after: 3 weeks
lookup code that dotdot lookups shall override any shared lock
requests with an exclusive one. The flag is useful for filesystems
which sometimes need to upgrade a shared lock to exclusive inside
VOP_LOOKUP or later, which cannot be done safely for dotdot because
dvp is also locked, which would cause a LOR.
In collaboration with: pho
MFC after: 3 weeks
TDP_NOSLEEPING leaking from syscallret() to userret() so that
trap handling is also covered. Also, the check on td_locks is not duplicated
trap handling is covered. Also, the check on td_locks is not duplicated
between the two functions.
Reported by: avg
Reviewed by: kib
MFC after: 1 week