There is no reason for it to behave differently from openat(fd, NULL).
Also the handling did not worked because the substituted path was from
the system address space, causing EFAULT.
Submitted by: Jack Halford <jack@gandi.net>
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18501
Once a signal's siginfo was copied to 'td_si' as part of the signal
exchange in issignal(), it was never cleared. This caused future
thread events that are reported as SIGTRAP events without signal
information to report the stale siginfo in 'td_si'. For example, if a
debugger created a new process and used SIGSTOP to stop it after
PT_ATTACH, future system call entry / exit events would set PL_FLAG_SI
with the SIGSTOP siginfo in pl_siginfo. This broke 'catch syscall' in
current versions of gdb as it assumed PL_FLAG_SI with SIGTRAP
indicates a breakpoint or single step trap.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D18487
and includes the last block represented by the leaf. The reasoning is that,
if the last block is included, then there must be no solution before that
one in the leaf, so the leaf cannot provide an allocation that big again;
indeed, the leaf cannot provide a solution bigger than range1.
Which is all correct, except that if the value of blk passed in did not
represent the first block of the leaf, because the cursor was pointing to
the middle of the leaf, then a possible solution before the cursor may have
been ignored, and bighint cannot be updated.
Consider the sequence allocate 63 (returning address 0), free 0,63 (freeing
that same block, and allocate 1 (returning 63). The result is that one
block is allocated from the first leaf, and the value of bighint is 0, so
that nothing can be allocated from that leaf until the only block allocated
from that leaf is freed. This change detects that skipped-over solution,
and when there is one it makes sure that the value of bighint is not changed
when the last block is allocated.
Submitted by: Doug Moore <dougm@rice.edu>
Tested by: pho
X-MFC with: r340402
Differential Revision: https://reviews.freebsd.org/D18474
If all IDs from trypid to pid_max were used as pids, the code would enter
a loop which would be infinite if none of the IDs could become free (e.g.
they all belong to processes which did not transitioned to zombie).
Fixes: r341684 ("Manage process-related IDs with bitmaps")
Sponsored by: The FreeBSD Foundation
kqueue would always relock immediately afterwards.
While here drop the NULL check for list itself. The list is
always allocated.
Sponsored by: The FreeBSD Foundation
When we detected that the vnode is not symlink, return immediately.
This moves the readlink code out of else branch and unindents it.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Braces were put in the wrong place, causing failing EAGAIN check to
return zero result. Remove the problematic assignment from the
conditional expression at all.
While there, remove used once variable vp, and wrap too long line.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The kernel was already doing this prior to r329615. It was changed
to reduce contention on allproc. However, introduction of pidhash
locks and removal of proctree -> allproc ordering from fork thanks
to bitmaps fixed things enough to make this change pessimal.
waitpid takes proctree on each call and this change (now) causes
avoidable stalls if allproc is held.
Sponsored by: The FreeBSD Foundation
Currently unique pid allocation on fork often requires a full walk of
process, group, session lists to make sure it is not used by anything.
This has a side effect of requiring proctree to be held along with allproc,
which adds more contention in poudriere -j 128.
The patch below implements trivial bitmaps which gets rid of the problem.
Dedicated lock is introduced to manage IDs.
While here a bug was discovered: all processes would inherit reap id from
the first process spawned by init. This had a side effect of keeping the
ID used and when allocation rolls over to the beginning it keeps being
skipped.
The patch is loosely based on initial work by mjoras@.
Reviewed by: kib
Sponsored by: The FreeBSD Foundation
With the removal of BOOTCDROM and fastboot support, this code always
passed "-s" or "--". The latter simply terminates getopt(3) processing
in init so we only need to pass "-s" in the single user case, or nothing
in other cases.
The passing of "--" seems to have been done to ensure that the number of
arguments passed to init was always the same and thus that argc was the
same.
Also GC the write-only variable pathlen (not in reviewed version).
Reviewed by: kib, jhb
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18441
cursor == 0.
Every call to blst_meta_alloc but the one at the root is made only when the
meta-node is known to include a free block, so that either the allocation
will succeed, the node hint will be updated, or the last block of the meta-
node range is, and remains, free. But the call at the root is made without
checking that there is a free block, so in the case that every block is
allocated, there is no hint update to prevent the current code from looping
forever.
Submitted by: Doug Moore <dougm@rice.edu>
Reported by: pho
Reviewed by: pho
Tested by: pho
X-MFC with: r340402
Differential Revision: https://reviews.freebsd.org/D17999
When BOOTCDROM is defined (via CFLAGS as there is no config option)
it causes -C to be passed to init, but our init and the version of
sysinstall I glanced at in 6.x don't support -C. The last plausibly
related support was removed from the tree in 1995.
Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D18431
The flag is not used by anything for years and supporting it requires an
explicit read from the lock when entering slow path.
Flag value is left unused on purpose.
Sponsored by: The FreeBSD Foundation
This was in the orignal patch, but lost in a rebase.
Reported by: andrew
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15816
Have ogetkerninfo, ogetpagesize, ogethostname, osethostname, and oaccept
declare o<foo>_args structs rather than non-compat ones. Due to a
failure to use NOARGS in most cases this adds only one new declaration.
No changes required in freebsd32 as only ogetpagesize() is implemented
and it has a 32-bit specific implementation.
Reviewed by: kib
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15816
Construct a struct image_args with the help of new exec_args_*() helper
functions and call kern_execve().
The previous code mapped a page in userspace, copied arguments out
to it one at a time, and then constructed a struct execve_args all so
that sys_execve() can call exec_copyin_args() to copy the data back in
to a struct image_args.
Opencode the part of pre_execve()/post_execve() that releases a
reference to the initial vmspace. We don't need to stop threads like
they do.
Reviewed by: kib, jhb (prior version)
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15469
On some architectures, the structures returned by PT_GET*REGS were not
fully populated and could contain uninitialized stack memory. The same
issue existed with the register files in procfs.
Reported by: Thomas Barabosch, Fraunhofer FKIE
Reviewed by: kib
MFC after: 3 days
Security: kernel stack memory disclosure
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18421
This applies the fix in r283924 to the vm.objects sysctl
added by r283624 so the output will include the vnode
information (i.e. path) for tmpfs objects.
Reviewed by: kib, dab
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D2724
Given a zeroed struct image_args with an allocated buf member,
exec_args_add_fname() must be called to install a file name (or NULL).
Then zero or more calls to exec_args_add_env() followed by zero or
more calls to exec_args_add_env(). exec_args_adjust_args() may be
called after args and/or env to allow an interpreter to be prepended to
the argument list.
To allow code reuse when adding arg and env variables, begin_envv
should be accessed with the accessor exec_args_get_begin_envv()
which handles the case when no environment entries have been added.
Use these functions to simplify exec_copyin_args() and
freebsd32_exec_copyin_args().
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D15468
It is possible that we started with a relative path but during the
lookup, found an absolute symlink. In this case, BENEATH handling
code needs the latch, but it is too late to calculate it.
While there, somewhat improve the assertions. Clear the NI_LCF_LATCH
flag when the latch vnode is released, so that asserts know the state.
Assert that there is a latch if we entered beneath+abs path mode,
after the starting point is processed.
Reported by: wulf
With more input from: pho
Sponsored by: The FreeBSD Foundation
racct is not enabled by default and even when it is enabled processes are
typically not throttled. The order of checks is left unchanged since
racct_enable will be annotated as __read_frequently, while checking for the
flag in the processes would probably require an extra fetch.
Sponsored by: The FreeBSD Foundation
waitpid always takes proctree to evaluate the list, but only takes allproc
if it can reap. With this patch allproc is no longer taken, which helps during
poudriere -j 128.
Discussed with: kib
Sponsored by: The FreeBSD Foundation
Avoid relying on unsigned overflow for the test.
Simplify expressions to avoid duplicate check for the range.
Style.
Add herald comment.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18361
node is set, allow setting security.bsd.unprivileged_proc_debug per-jail.
In part, this is needed to create jails in which the Address Sanitizer
(ASAN) fully works as ASAN utilizes libkvm to inspect the virtual address
space. Instead of having to allow unprivileged process debugging for the
entire system, allow setting it on a per-jail basis.
The sysctl node is still security.bsd.unprivileged_proc_debug and the
jail(8) param is allow.unprivileged_proc_debug. The sysctl code is now a
sysctl proc rather than a sysctl int. This allows us to determine setting
the flag for the corresponding jail (or prison0).
As part of the change, the dynamic allow.* API needed to be modified to
take into account pr_allow flags which may now be disabled in prison0.
This prevents conflicts with new pr_allow flags (like that of vmm(4)) that
are added (and removed) dynamically.
Also teach the jail creation KPI to allow differences for certain pr_allow
flags between the parent and child jail. This can happen when unprivileged
process debugging is disabled in the parent prison, but enabled in the
child.
Submitted by: Shawn Webb <lattera at gmail.com>
Obtained from: HardenedBSD (45b3625edba0f73b3e3890b1ec3d0d1e95fd47e1, deba0b5078cef0faae43cbdafed3035b16587afc, ab21eeb3b4c72f2500987c96ff603ccf3b6e7de8)
Relnotes: yes
Sponsored by: HardenedBSD and G2, Inc
Differential Revision: https://reviews.freebsd.org/D18319
We zero the whole structure; we don't need to zero the __spare__ field again.
Remove trailing whitespace.
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
aio has two paths: an asynchronous "physio" path and a synchronous path.
Confusingly, physio(9) isn't actually used by the "physio" path, and never
has been. In fact, it may even be called by the synchronous path! Rename
the "physio" path to the "bio" path to reflect what it actually does:
directly compose BIOs and send them to character devices.
MFC after: 2 weeks
cursor lies in the middle of the space that the meta node represents, then
blanking the low bits of mask may make it zero, and break later code that
expects a nonzero value. Add a test that returns failure if the mask has
been cleared.
Submitted by: Doug Moore <dougm@rice.edu>
Reported by: pho
Tested by: pho
X-MFC with: r340402
Differential Revision: https://reviews.freebsd.org/D18058