changes in r246417 were incomplete as they did not add explicit calls to
sigdeferstop() around all the places that previously passed SBDRY to
_sleep(). In addition, nfs_getcacheblk() could trigger a write RPC from
getblk() resulting in sigdeferstop() recursing. Rather than manually
deferring stop signals in specific places, change the VFS_*() and VOP_*()
methods to defer stop signals for filesystems which request this behavior
via a new VFCF_SBDRY flag. Note that this has to be a VFC flag rather than
a MNTK flag so that it works properly with VFS_MOUNT() when the mount is
not yet fully constructed. For now, only the NFS clients are set this new
flag in VFS_SET().
A few other related changes:
- Add an assertion to ensure that TDF_SBDRY doesn't leak to userland.
- When a lookup request uses VOP_READLINK() to follow a symlink, mark
the request as being on behalf of the thread performing the lookup
(cnp_thread) rather than using a NULL thread pointer. This causes
NFS to properly handle signals during this VOP on an interruptible
mount.
PR: kern/176179
Reported by: Russell Cattelan (sigdeferstop() recursion)
Reviewed by: kib
MFC after: 1 month
until child performs exec(). The behaviour is reasonable when a
debugger is the real parent, because the parent is stopped until
exec(), and sending a debugging event to the debugger would deadlock
both parent and child.
On the other hand, when debugger is not the parent of the vforked
child, not sending debugging signals makes it impossible to debug
across vfork.
Fix the issue by declining generating debug signals only when vfork()
was done and child called ptrace(PT_TRACEME). Set a new process flag
P_PPTRACE from the attach code for PT_TRACEME, if P_PPWAIT flag is
set, which indicates that the process was created with vfork() and
still did not execed. Check P_PPTRACE from issignal(), instead of
refusing the trace outright for the P_PPWAIT case. The scope of
P_PPTRACE is exactly contained in the scope of P_PPWAIT.
Found and tested by: zont
Reviewed by: pluknet
MFC after: 2 weeks
195702, 195703, and 195821 prevented a thread from suspending while holding
locks inside of NFS by forcing the thread to fail sleeps with EINTR or
ERESTART but defer the thread suspension to the user boundary. However,
this had the effect that stopping a process during an NFS request could
abort the request and trigger EINTR errors that were visible to userland
processes (previously the thread would have suspended and completed the
request once it was resumed).
This change instead effectively masks stop signals while in the NFS client.
It uses the existing TDF_SBDRY flag to effect this since SIGSTOP cannot
be masked directly. Also, instead of setting PBDRY on individual sleeps,
the NFS client now sets the TDF_SBDRY flag around each NFS request and
stop signals are masked for all sleeps during that region (the previous
change missed sleeps in lockmgr locks). The end result is that stop
signals sent to threads performing an NFS request are completely
ignored until after the NFS request has finished processing and the
thread prepares to return to userland. This restores the behavior of
stop signals being transparent to userland processes while still
preventing threads from suspending while holding NFS locks.
Reviewed by: kib
MFC after: 1 month
only returns name, but also vnode of corefile to use.
This simplifies the code and closes few races, especially in %I handling.
Reviewed by: kib
Obtained from: WHEEL Systems
This fixed panic where we hold mutex (process lock) and try to obtain sleepable
lock (vnode lock in expand_name()). The panic could occur when %I was used
in kern.corefile.
Additionally we avoid expand_name() overhead when coredumps are disabled.
Obtained from: WHEEL Systems
to himself. For example abort(3) at first tries to do kill(getpid(), SIGABRT)
which was failing in capability mode, so the code was failing back to exit(1).
Reviewed by: rwatson
Obtained from: WHEEL Systems
MFC after: 2 weeks
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
Return EPERM if processes were found but they
were unable to be signaled.
Return the first error from p_cansignal if no signal was successful.
Reviewed by: jilles
Approved by: cperciva
MFC after: 1 week
Return EPERM if processes were found but they
were unable to be signaled.
Return the first error from p_cansignal if no signal was successful.
Reviewed by: jilles
Approved by: cperciva
MFC after: 1 week
set p_xstat to the signal that triggered the stop, but p_xstat is also
used to hold the exit status of an exiting process. Without this change,
a stop signal that arrived after a process was marked P_WEXIT but before
it was marked a zombie would overwrite the exit status with the stop signal
number.
Reviewed by: kib
MFC after: 1 week
debugger exited without calling ptrace(PT_DETACH), there is a time window
that the p_xthread may be pointing to non-existing thread, in practical,
this is not a problem because child process soon will be killed by parent
process.
in td_errno. Flag is supposed to be used by syscalls returning
EJUSTRETURN because errno was already placed into the usermode frame
by a call to set_syscall_retval(9). Both ktrace and dtrace get errno
value from td_errno if the flag is set.
Use the flag to fix sigsuspend(2) error return ktrace records.
Requested by: bde
MFC after: 1 week
The SA_PROC signal property indicated whether each signal number is directed
at a specific thread or at the process in general. However, that depends on
how the signal was generated and not on the signal number. SA_PROC was not
used.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
POSIX/SUSvN. The sigwait(2) syscall does return EINTR, and libc.so.7
contains the wrapper sigwait(3) which hides EINTR from callers. The
EINTR return is used by libthr to handle required cancellation point
in the sigwait(3).
To help the binaries linked against pre-libc.so.7, i.e. RELENG_6 and
earlier, to have right ABI for sigwait(3), transform EINTR return from
sigwait(2) into ERESTART.
Discussed with: davidxu
MFC after: 1 week
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.
Reviewed by: rwatson
Approved by: re (bz)
A "process descriptor" file descriptor is used to manage processes
without using the PID namespace. This is required for Capsicum's
Capability Mode, where the PID namespace is unavailable.
New system calls pdfork(2) and pdkill(2) offer the functional equivalents
of fork(2) and kill(2). pdgetpid(2) allows querying the PID of the remote
process for debugging purposes. The currently-unimplemented pdwait(2) will,
in the future, allow querying rusage/exit status. In the interim, poll(2)
may be used to check (and wait for) process termination.
When a process is referenced by a process descriptor, it does not issue
SIGCHLD to the parent, making it suitable for use in libraries---a common
scenario when using library compartmentalisation from within large
applications (such as web browsers). Some observers may note a similarity
to Mach task ports; process descriptors provide a subset of this behaviour,
but in a UNIX style.
This feature is enabled by "options PROCDESC", but as with several other
Capsicum kernel features, is not enabled by default in GENERIC 9.0.
Reviewed by: jhb, kib
Approved by: re (kib), mentor (rwatson)
Sponsored by: Google Inc
The code provides information on how the signal was generated.
Formerly, the code was only logged for traps, much like only signal handlers
for traps received a meaningful si_code before FreeBSD 7.0.
In rare cases, no information is available and 0 is still logged.
MFC after: 1 week
instead of using SIGISMEMBER to test every interesting signal, just
unmask the signal set and let cursig() return one, get the signal
after it returns, call reschedule_signal() after signals are blocked
again.
In kern_sigprocmask(), don't call reschedule_signal() when it is
unnecessary.
In reschedule_signal(), replace SIGISEMPTY() + SIGISMEMBER() with
sig_ffs(), rename variable 'i' to sig.
rwlock to protect the table. In old code, thread lookup is done with
process lock held, to find a thread, kernel has to iterate through
process and thread list, this is quite inefficient.
With this change, test shows in extreme case performance is
dramatically improved.
Earlier patch was reviewed by: jhb, julian
code associated with overflow or with the drain function. While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.
signals, because it is managed by debugger, however a normal signal sent to
a interruptibly sleeping thread wakes up the thread so it will handle the
signal when the process leaves the stopped state.
PR: 150138
MFC after: 1 week
use '-' in probe names, matching the probe names in Solaris.[1]
Add userland SDT probes definitions to sys/sdt.h.
Sponsored by: The FreeBSD Foundation
Discussed with: rwaston [1]