Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.
Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)
Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.
Reviewed by: kib, ngie, jilles
MFC after: 3 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10020
Typically, when elf_load_section() unconditionally passed VM_PROT_ALL to
elf_map_insert(), it was needlessly enabling execute access on the
mapping, and it would later have to call vm_map_protect() to correct the
mapping's access rights. Now, instead, elf_load_section() always passes
its parameter "prot" to elf_map_insert(). So, elf_load_section() must
only call vm_map_protect() if it needs to remove the write access that
was temporarily granted to perform a copyout().
Reviewed by: kib
MFC after: 1 week
nanosleep() updates rmtp on EINVAL. In that case, kern_nanosleep()
has not updated rmt, so sys_nanosleep() updates the user-space rmtp
by copying garbage from its stack frame. This is not only a kernel
memory disclosure, it's also not POSIX-compliant. Fix it to update
rmtp only on EINTR.
Reviewed by: jilles (via D10020), dchagin
MFC after: 3 days
Security: possibly
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D10044
for vt. Restore syscons' rendering of background (bg) brightness as
foreground (fg) blinking and vice versa, and add rendering of blinking
as background brightness to vt.
Bright/saturated is conflated with light/white in the implementation
and in this description.
Bright colors were broken in all cases, but appeared to work in the
only case shown by "vidcontrol show". A boldness hack was applied
only in 1 layering-violation place (for some syscons sequences) where
it made some cases seem to work but was undone by clearing bold using
ANSI sequences, and more seriously was not undone when setting
ANSI/xterm dark colors so left them bright. Move this hack to drivers.
The boldness hack is only for fg brightness. Restore/add a similar hack
for bg brightness rendered as fg blinking and vice versa. This works
even better for vt, since vt changes the default text mode to give the
more useful bg brightness instead of fg blinking.
The brightness bit in colors was unnecessarily removed by the boldness
hack. In other cases, it was lost later by teken_256to8(). Use
teken_256to16() to not lose it. teken_256to8() was intended to be
used for bg colors to allow finer or bg-specific control for the more
difficult reduction to 8; however, since 16 bg colors actually work
on VGA except in syscons text mode and the conversion isn't subtle
enough to significantly in that mode, teken_256to8() is not used now.
There are still bugs, especially in vidcontrol, if bright/blinking
background colors are set.
Restore XOR logic for bold/bright fg in syscons (don't change OR
logic for vt). Remove broken ifdef on FG_UNDERLINE and its wrong
or missing bit and restore the correct hard-coded bit. FG_UNDERLINE
is only for mono mode which is not really supported.
Restore XOR logic for blinking/bright bg in syscons (in vt, add
OR logic and render as bright bg). Remove related broken ifdef
on BG_BLINKING and its missing bit and restore the correct
hard-coded bit. The same bit means blinking or bright bg depending
on the mode, and we want to ignore the difference everywhere.
Simplify conversions of attributes in syscons. Don't pretend to
support bold fonts. Don't support unusual encodings of brightness.
It is as good as possible to map 16 VGA colors to 16 xterm-16
colors. E.g., VGA brown -> xterm-16 Olive will be converted back
to VGA brown, so we don't need to convert to xterm-256 Brown. Teken
cons25 compatibility code already does the same, and duplicates some
small tables. This is mostly for the sc -> te direction. The other
direction uses teken_256to16() which is too generic.
The ptrace() user has the option of discarding the signal. In such a
case, p_ptevents should not be modified. If the ptrace() user decides to
send a SIGKILL, ptevents will be cleared in ptracestop(). procfs events
do not have the capability to discard the signal, so continue to clear
the mask in that case.
Reviewed by: jhb (initial revision)
MFC after: 1 week
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9939
uma_zcreate()'s alignment argument is supposed to be sizeof(foo) - 1,
and uma.h provides a set of helper macros for common types. Passing
sizeof(void *) results in all of the members being misaligned triggering
unaligned access faults on certain architectures (notably MIPS).
Reported by: brooks
Obtained from: CheriBSD
MFC after: 3 days
Sponsored by: DARPA / AFRL
reviewing all uses of OFF_TO_IDX(), I observed that
vm_object_page_noreuse() is requiring an exclusive lock on the object
when, in fact, a shared lock suffices.
Reviewed by: kib, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D10011
The callout may reschedule itself and execute again before callout_drain()
returns, but we should not clear CALLOUT_ACTIVE until the callout is
stopped.
Tested by: pho
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
I moved this branch from github to a private server, and pulled from the
wrong one when committing r315280, so I failed to include two recent commits.
Thankfully, they were only cosmetic and were included in the review.
Specifically:
Add documentation, polish comments, and improve style(9).
Tested by: pho (r315280)
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(),
the new value of the clock shall be used to determine the time
of expiration for absolute time services based upon the
CLOCK_REALTIME clock. This applies to the time at which armed
absolute timers expire. If the absolute time requested at the
invocation of such a time service is before the new value of
the clock, the time service shall expire immediately as if the
clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime()
shall have no effect on threads that are blocked waiting for
a relative time service based upon this clock, including the
nanosleep() function; nor on the expiration of relative timers
based upon this clock. Consequently, these time services shall
expire when the requested relative interval elapses, independently
of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3)
pthread_mutex_timedlock(3)
pthread_rwlock_timedrdlock(3)
pthread_rwlock_timedwrlock(3)
sem_timedwait(3)
sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by: jhb, kib
MFC after: 2 weeks
Relnotes: yes
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9791
When sending SIGCHLD informing reaper that a zombie was reparented to
it, we might race with the situation where the previous parent still
not finished delivering SIGCHLD and having its p_ksi structure on the
signal queue. While on queue, the ksi should not be used for another
send.
Fix this by copying p_ksi into newly allocated ksi, which is directly
put onto reaper sigqueue. The later ensures that siginfo for reaper
SIGCHLD is always present, similar to guarantees for siginfo of child.
Reported by: bdrewery
Discussed with: jilles
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
For such segments, GNU bfd linker writes knowingly incorrect value
into the the file offset field of the program header entry, with the
motivation that file should not be mapped for creation of this segment
at all.
Relax checks for the ELF structure validity when on-disk segment
length is zero, and explicitely set mapping length to zero for such
segments to avoid validating rounding arithmetic.
PR: 217610
Reported by: Robert Clausecker <fuz@fuz.su>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
overly large allocation requests.
When ktrace-ing io, sys_kevent() allocates memory to copy the
requested changes and reported events. Allocations are sized by the
incoming syscall lengths arguments, which are user-controlled, and
might cause overflow in calculations or too large allocations.
Since io trace chunks are limited by ktr_geniosize, there is no sense
it even trying to satisfy unbounded allocations. Export ktr_geniosize
and clamp the buffers sizes in advance.
PR: 217435
Reported by: Tim Newsham <tim.newsham@nccgroup.trust>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This function may be called recursively, when a module pulls its dependencies.
Under certain circumstances, e.g. quad chain of dependencies and presence
of dtrace we may run out of stack.
Suppose a traced process is stopped in ptracestop() due to receipt of a
SIGSTOP signal, and is awaiting orders from the tracing process on how
to handle the signal. Before sending any such orders, the tracing
process exits. This should kill the traced process. But suppose a second
thread handles the SIGKILL and proceeds to exit1(), calling
thread_single(). The first thread will now awaken and will have a chance
to check once more if it should go to sleep due to the SIGSTOP. It must
not sleep after P_SINGLE_EXIT has been set; this would prevent the
SIGKILL from taking effect, leaving a stopped orphan behind after the
tracing process dies.
Also add new tests for this condition.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9890
interpreter exactly matches the one requested by the activated image.
This change applies r295277, which did the same for note branding, to
the old brand selection, with the same reasoning of fixing compat32
interpreter substitution.
PR: 211837
Reported by: kenji@kens.fm
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
elf_load_section.
The values passed currently as vm_offset_t are phdr.p_offset, which
have the native Elf word size. Since elf_load_section interprets them
as the file offset, use vm object offset type.
Noted and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
stuck spinning at 100% cpu around sbcut_internal(). Inside
sbflush_internal(), sb_ccc reached to about 4GB and before passing it
to sbcut_internal(), we type-cast it from uint to int making it -ve.
The root cause of sockbuf growing this large is unknown. Correct fix
is also not clear but based on mailing list discussions, adding
KASSERTs to panic instead of looping endlessly.
Reviewed by: glebius
Sponsored by: Limelight Networks
header. This will help to correlate console server logs with dump files,
no matter how precise is clock on a console server appliance, and how
buggy the appliance is.
This KPI explicitely indicates the intent of creating the mapping at
the fixed address, and incorporates the map locking into the callee.
Suggested and reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
all the clocks that they provide.
Each clocks are exported under the node 'clock.<clkname>' and have the following
children nodes :
- frequency
- parent (The selected parent, if any)
- parents (The list of parents, if any)
- childrens (The list of childrens, if any)
- enable_cnt (The enabled counter)
This give us the possibility to examine clocks at runtime and make graph of
the clock flow.
Reviewed by: mmel
MFC after: 2 month
Differential Revision: https://reviews.freebsd.org/D9833
We may fail to reset the %CPU tracking window if a thread does not run
for over half of the ticks rollover period, resulting in a bogus %CPU
value for the thread until ticks fully rolls over. Handle this by comparing
the unsigned difference ticks - ts_ltick with SCHED_TICK_TARG instead.
Reviewed by: cem, jeff
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Elf_map_insert() needs to create mapping at the known fixed address.
Usage of vm_map_find() assumes, on the other hand, that any suitable
address space range above or equal the specified hint, is acceptable.
Due to operating on the fresh or cleared address space, vm_map_find()
usually creates mapping starting exactly at hint.
Switch to vm_map_insert() use to clearly request fixed mapping from
the VM.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
vm_map_insert() failure, drop the vnode lock around the call to
vm_object_deallocate().
Since the deallocated object is the vm object of the vnode, we might
get the vnode lock recursion there. In fact, it is almost impossible
to make vm_map_insert() failing there on stock kernel.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.
Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
barriers are in the worst case provided by turnstile unlock
I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.
ps.
.-'---`-.
,' `.
| \
| \
\ _ \
,\ _ ,'-,/-)\
( * \ \,' ,' ,'-)
`._,) -',-')
\/ ''/
) / /
/ ,'-'
Hardware provided by: IBM LTC