15426 Commits

Author SHA1 Message Date
alc
9e7f2bddf9 Use IDX_TO_OFF(), not ptoa(), when converting the difference between two
vm_pindex_t's into a vm_ooffset_t.

The length given to shm_dotruncate() must never be negative.  Assert this.

Tidy up a comment.

Reviewed by:	kib
MFC after:	1 week
2017-03-20 05:15:55 +00:00
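Illustration for 9e7f2bddf9: a minimal user-space sketch of why the conversion macro matters. It assumes a platform where addresses are 32 bits wide and object offsets are 64 bits wide. The typedefs and lowercase helper names are stand-ins invented for this example, not the kernel's definitions.

```c
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumed 4 KiB pages */

typedef uint32_t addr32_t;		/* stand-in for a 32-bit address type */
typedef uint64_t pindex_t;		/* stand-in for vm_pindex_t */
typedef int64_t ooffset_t;		/* stand-in for vm_ooffset_t */

/* ptoa-style: the shift is done in the address type and may truncate. */
static ooffset_t
pages_to_addr(pindex_t pages)
{
	return ((addr32_t)pages << PAGE_SHIFT);
}

/* IDX_TO_OFF-style: widen to the offset type first, then shift. */
static ooffset_t
idx_to_off(pindex_t pages)
{
	return ((ooffset_t)pages << PAGE_SHIFT);
}
```

For small page counts the two helpers agree. They diverge once the byte count no longer fits in the address type.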
alc
f319609095 Style fixes. In particular, the variable "bogus" is used like a Boolean.
Define it as such.

Reviewed by:	kib
MFC after:	1 week
2017-03-19 23:06:11 +00:00
vangyzen
5dc3189a1b Regenerate syscall files for r315526
Sponsored by:	Dell EMC
2017-03-19 00:54:24 +00:00
vangyzen
d6de25428d Add clock_nanosleep()
Add a clock_nanosleep() syscall, as specified by POSIX.
Make nanosleep() a wrapper around it.

Attach the clock_nanosleep test from NetBSD. Adjust it for the
FreeBSD behavior of updating rmtp only when interrupted by a signal.
I believe this to be POSIX-compliant, since POSIX mentions the rmtp
parameter only in the paragraph about EINTR. This is also what
Linux does. (NetBSD updates rmtp unconditionally.)

Copy the whole nanosleep.2 man page from NetBSD because it is complete
and closely resembles the POSIX description. Edit, polish, and reword it
a bit, being sure to keep any relevant text from the FreeBSD page.

Reviewed by:	kib, ngie, jilles
MFC after:	3 weeks
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D10020
2017-03-19 00:51:12 +00:00
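Illustration for d6de25428d: a hedged usage sketch of the POSIX clock_nanosleep() interface from user space. The helper name sleep_fully() is hypothetical. It relies on the behavior described above, that rmtp is meaningful only after an interruption by a signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Sleep for a relative interval on the given clock, restarting with the
 * remaining time whenever a signal interrupts the sleep.  Returns 0 on
 * success or an error number; clock_nanosleep() returns the error
 * directly rather than setting errno.
 */
static int
sleep_fully(clockid_t clock, struct timespec interval)
{
	struct timespec rem;
	int error;

	while ((error = clock_nanosleep(clock, 0, &interval, &rem)) == EINTR)
		interval = rem;		/* rem is valid only after EINTR */
	return (error);
}
```

A caller that wants an absolute deadline would pass TIMER_ABSTIME instead of 0 and simply retry with the same deadline after EINTR.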
alc
fb24921f88 Avoid unnecessary calls to vm_map_protect() in elf_load_section().
Typically, when elf_load_section() unconditionally passed VM_PROT_ALL to
elf_map_insert(), it was needlessly enabling execute access on the
mapping, and it would later have to call vm_map_protect() to correct the
mapping's access rights.  Now, instead, elf_load_section() always passes
its parameter "prot" to elf_map_insert().  So, elf_load_section() must
only call vm_map_protect() if it needs to remove the write access that
was temporarily granted to perform a copyout().

Reviewed by:	kib
MFC after:	1 week
2017-03-18 23:37:00 +00:00
vangyzen
207af3fa68 nanosleep: plug a kernel memory disclosure
nanosleep() updates rmtp on EINVAL.  In that case, kern_nanosleep()
has not updated rmt, so sys_nanosleep() updates the user-space rmtp
by copying garbage from its stack frame.  This is not only a kernel
memory disclosure, it's also not POSIX-compliant.  Fix it to update
rmtp only on EINTR.

Reviewed by:	jilles (via D10020), dchagin
MFC after:	3 days
Security:	possibly
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D10044
2017-03-18 20:16:23 +00:00
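Illustration for 207af3fa68: a user-space model of the fix. The function finish_nanosleep() and its parameters are hypothetical, and memcpy() stands in for copyout(). The point is only that the remaining-time buffer is written when, and only when, the sleep was interrupted.

```c
#include <errno.h>
#include <string.h>
#include <time.h>

/*
 * "error" is the result of the inner sleep and "rmt" is its output,
 * which is meaningful only when the sleep was interrupted.  "user_rmtp"
 * plays the role of the caller's buffer.
 */
static int
finish_nanosleep(int error, const struct timespec *rmt,
    struct timespec *user_rmtp)
{
	if (error == EINTR && user_rmtp != NULL)
		memcpy(user_rmtp, rmt, sizeof(*rmt));	/* models copyout() */
	return (error);
}
```

On any other error the caller's buffer is left untouched, so uninitialized stack contents never reach it.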
bde
0bc4f8ada7 Fix bright colors for syscons, and make them work for the first time
for vt.  Restore syscons' rendering of background (bg) brightness as
foreground (fg) blinking and vice versa, and add rendering of blinking
as background brightness to vt.

Bright/saturated is conflated with light/white in the implementation
and in this description.

Bright colors were broken in all cases, but appeared to work in the
only case shown by "vidcontrol show".  A boldness hack was applied
only in 1 layering-violation place (for some syscons sequences) where
it made some cases seem to work but was undone by clearing bold using
ANSI sequences, and more seriously was not undone when setting
ANSI/xterm dark colors so left them bright.  Move this hack to drivers.

The boldness hack is only for fg brightness.  Restore/add a similar hack
for bg brightness rendered as fg blinking and vice versa.  This works
even better for vt, since vt changes the default text mode to give the
more useful bg brightness instead of fg blinking.

The brightness bit in colors was unnecessarily removed by the boldness
hack.  In other cases, it was lost later by teken_256to8().  Use
teken_256to16() to not lose it.  teken_256to8() was intended to be
used for bg colors to allow finer or bg-specific control for the more
difficult reduction to 8; however, since 16 bg colors actually work
on VGA except in syscons text mode and the conversion isn't subtle
enough to matter significantly in that mode, teken_256to8() is not used now.

There are still bugs, especially in vidcontrol, if bright/blinking
background colors are set.

Restore XOR logic for bold/bright fg in syscons (don't change OR
logic for vt).  Remove broken ifdef on FG_UNDERLINE and its wrong
or missing bit and restore the correct hard-coded bit.  FG_UNDERLINE
is only for mono mode which is not really supported.

Restore XOR logic for blinking/bright bg in syscons (in vt, add
OR logic and render as bright bg).  Remove related broken ifdef
on BG_BLINKING and its missing bit and restore the correct
hard-coded bit.  The same bit means blinking or bright bg depending
on the mode, and we want to ignore the difference everywhere.

Simplify conversions of attributes in syscons.  Don't pretend to
support bold fonts.  Don't support unusual encodings of brightness.
It is as good as possible to map 16 VGA colors to 16 xterm-16
colors.  E.g., VGA brown -> xterm-16 Olive will be converted back
to VGA brown, so we don't need to convert to xterm-256 Brown.  Teken
cons25 compatibility code already does the same, and duplicates some
small tables.  This is mostly for the sc -> te direction.  The other
direction uses teken_256to16() which is too generic.
2017-03-18 11:13:54 +00:00
kib
0350bb5e92 When clearing altsigstack settings on exec, do it to the right thread.
Diagnosed by:	smh
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-17 13:37:37 +00:00
badger
643b691b05 Don't clear p_ptevents on normal SIGKILL delivery
The ptrace() user has the option of discarding the signal. In such a
case, p_ptevents should not be modified. If the ptrace() user decides to
send a SIGKILL, ptevents will be cleared in ptracestop(). procfs events
do not have the capability to discard the signal, so continue to clear
the mask in that case.

Reviewed by:	jhb (initial revision)
MFC after:	1 week
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9939
2017-03-16 13:03:31 +00:00
jhb
fd0aeed164 Use UMA_ALIGN_PTR instead of sizeof(void *) for zone alignment.
uma_zcreate()'s alignment argument is supposed to be sizeof(foo) - 1,
and uma.h provides a set of helper macros for common types.  Passing
sizeof(void *) results in all of the members being misaligned triggering
unaligned access faults on certain architectures (notably MIPS).

Reported by:	brooks
Obtained from:	CheriBSD
MFC after:	3 days
Sponsored by:	DARPA / AFRL
2017-03-15 18:23:32 +00:00
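Illustration for fd0aeed164: a small sketch of why an alignment argument expressed as a mask must be "size minus one". The helper round_up_mask() and the macro ALIGN_PTR_MASK are made up for this example. The macro only mirrors the sizeof(foo) - 1 convention the message describes.

```c
#include <stddef.h>

/* Round "size" up using an alignment mask, i.e. (alignment - 1). */
static size_t
round_up_mask(size_t size, size_t mask)
{
	return ((size + mask) & ~mask);
}

/* Mask for pointer alignment, following the sizeof(foo) - 1 convention. */
#define ALIGN_PTR_MASK	(sizeof(void *) - 1)
```

With a mask of 7, a size of 13 rounds up to 16. Passing 8 (the size itself rather than the mask) yields 21, which is not a multiple of 8, so every item placed after it would be misaligned.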
alc
89e63829ad Relax the locking requirements for vm_object_page_noreuse(). While
reviewing all uses of OFF_TO_IDX(), I observed that
vm_object_page_noreuse() is requiring an exclusive lock on the object
when, in fact, a shared lock suffices.

Reviewed by:	kib, markj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D10011
2017-03-15 17:43:45 +00:00
markj
460a69c6a0 When draining a callout, don't clear CALLOUT_ACTIVE while it is running.
The callout may reschedule itself and execute again before callout_drain()
returns, but we should not clear CALLOUT_ACTIVE until the callout is
stopped.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-03-15 00:29:27 +00:00
vangyzen
47fc9e6df6 Add missing pieces of r315280
I moved this branch from github to a private server, and pulled from the
wrong one when committing r315280, so I failed to include two recent commits.
Thankfully, they were only cosmetic and were included in the review.
Specifically:

Add documentation, polish comments, and improve style(9).

Tested by:	pho (r315280)
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9791
2017-03-14 22:02:02 +00:00
kib
8cf2af841c Use atop() instead of OFF_TO_IDX() for conversion of addresses or
address offsets, as intended.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-14 19:39:17 +00:00
vangyzen
c2172076df When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):

    If the value of the CLOCK_REALTIME clock is set via clock_settime(),
    the new value of the clock shall be used to determine the time
    of expiration for absolute time services based upon the
    CLOCK_REALTIME clock.  This applies to the time at which armed
    absolute timers expire.  If the absolute time requested at the
    invocation of such a time service is before the new value of
    the clock, the time service shall expire immediately as if the
    clock had reached the requested time normally.

    Setting the value of the CLOCK_REALTIME clock via clock_settime()
    shall have no effect on threads that are blocked waiting for
    a relative time service based upon this clock, including the
    nanosleep() function; nor on the expiration of relative timers
    based upon this clock.  Consequently, these time services shall
    expire when the requested relative interval elapses, independently
    of the new or old value of the clock.

When the real-time clock is adjusted, such as by clock_settime(3),
wake any threads sleeping until an absolute real-clock time.
Such a sleep is indicated by a non-zero td_rtcgen.  The sleep functions
will set that field to zero and return zero to tell the caller
to reevaluate its sleep duration based on the new value of the clock.

At present, this affects the following functions:

    pthread_cond_timedwait(3)
    pthread_mutex_timedlock(3)
    pthread_rwlock_timedrdlock(3)
    pthread_rwlock_timedwrlock(3)
    sem_timedwait(3)
    sem_clockwait_np(3)

I'm working on adding clock_nanosleep(2), which will also be affected.

Reported by:	Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed by:	jhb, kib
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9791
2017-03-14 19:06:44 +00:00
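Illustration for c2172076df: a sketch of the re-evaluation step a caller performs after being woken by a clock adjustment. The helper time_remaining() is hypothetical. It recomputes the time left until an absolute deadline and reports when the deadline is already in the past, in which case the sleep should expire immediately.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Compute deadline - now into "left".  Returns false, with "left" set to
 * zero, if the deadline is before the current clock value.
 */
static bool
time_remaining(const struct timespec *deadline, const struct timespec *now,
    struct timespec *left)
{
	left->tv_sec = deadline->tv_sec - now->tv_sec;
	left->tv_nsec = deadline->tv_nsec - now->tv_nsec;
	if (left->tv_nsec < 0) {		/* borrow one second */
		left->tv_sec--;
		left->tv_nsec += 1000000000L;
	}
	if (left->tv_sec < 0) {
		left->tv_sec = 0;
		left->tv_nsec = 0;
		return (false);
	}
	return (true);
}
```

Relative sleeps need no such step, since the quoted POSIX text requires them to be unaffected by the new clock value.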
kib
5ba1b0f996 Use designated initializers for kevent_copyops.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-14 09:25:01 +00:00
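Illustration for 5ba1b0f996: a generic sketch of designated initializers for a table of callbacks. The struct layout and function names below are invented for the example and are not the actual kevent_copyops definition.

```c
#include <stddef.h>

/* Hypothetical callback table, loosely modeled on a "copyops" struct. */
struct copyops {
	void	*arg;
	int	(*copyout)(void *arg, const int *src, size_t count);
	int	(*copyin)(void *arg, int *dst, size_t count);
};

static int
sample_copyout(void *arg, const int *src, size_t count)
{
	(void)arg;
	(void)src;
	return ((int)count);
}

/*
 * The positional form { NULL, sample_copyout, NULL } depends on member
 * order.  The designated form names each member, and members that are
 * not named are zero-initialized.
 */
static const struct copyops sample_ops = {
	.copyout = sample_copyout,
};
```

The designated form keeps working if members are later reordered or new ones are added to the struct.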
kib
bd6532333d Hide kev_iovlen() definition under #ifdef KTRACE, fixing build of
kernel configs without KTRACE.

Reported by:	rpokala
Sponsored by:	The FreeBSD Foundation
MFC after:	4 days
2017-03-14 08:45:52 +00:00
ian
016bffc695 Change 'Hz' back to 'HZ'... it's referring to the kernel config option
named HZ, not being used as an abbreviation of the unit of measure.
2017-03-12 18:07:03 +00:00
ian
2788e31637 Correct the abbreviations for microseconds (us, not ms), and for Hz (not HZ).
2017-03-12 17:43:45 +00:00
kib
34f9433195 Avoid reusing p_ksi while it is on queue.
When sending SIGCHLD to inform the reaper that a zombie was reparented to
it, we might race with the situation where the previous parent has still
not finished delivering SIGCHLD and has its p_ksi structure on the
signal queue.  While on the queue, the ksi should not be used for another
send.

Fix this by copying p_ksi into a newly allocated ksi, which is put
directly onto the reaper's sigqueue.  The latter ensures that siginfo for
the reaper's SIGCHLD is always present, similar to the guarantees for
siginfo of a child.

Reported by:	bdrewery
Discussed with:	jilles
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:58:51 +00:00
kib
c77fb55571 Accept the linker's representation for ELF segments with zero on-disk length.
For such segments, the GNU bfd linker writes a knowingly incorrect value
into the file offset field of the program header entry, with the
motivation that the file should not be mapped for creation of this segment
at all.

Relax checks for the ELF structure validity when the on-disk segment
length is zero, and explicitly set the mapping length to zero for such
segments to avoid validating rounding arithmetic.

PR:	217610
Reported by:	Robert Clausecker <fuz@fuz.su>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:51:13 +00:00
kib
3d6312cf4f Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:49:42 +00:00
kib
0336a66286 Ktracing kevent(2) calls with unusual arguments might lead to
overly large allocation requests.

When ktrace-ing io, sys_kevent() allocates memory to copy the
requested changes and reported events.  Allocations are sized by the
incoming syscall length arguments, which are user-controlled and
might cause overflow in calculations or overly large allocations.

Since io trace chunks are limited by ktr_geniosize, there is no sense
in even trying to satisfy unbounded allocations.  Export ktr_geniosize
and clamp the buffer sizes in advance.

PR:	217435
Reported by:	Tim Newsham <tim.newsham@nccgroup.trust>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-12 13:48:24 +00:00
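Illustration for 0336a66286: a sketch of clamping a user-controlled element count before sizing an allocation. The helper clamped_alloc_size() is hypothetical, and "limit" plays the role that ktr_geniosize plays in the message above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of bytes to allocate for "nitems" elements of
 * "elem_size" bytes each, never exceeding "limit" and never overflowing
 * the multiplication.
 */
static size_t
clamped_alloc_size(size_t nitems, size_t elem_size, size_t limit)
{
	if (elem_size == 0)
		return (0);
	if (nitems > limit / elem_size)	/* would exceed limit or overflow */
		nitems = limit / elem_size;
	return (nitems * elem_size);
}
```

Comparing against limit / elem_size instead of multiplying first is what keeps the check itself from overflowing.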
alc
8854b6f932 Simplify the control flow and tidy up a comment in map_insert.
In collaboration with:	kib
MFC after:	1 week
2017-03-11 18:57:13 +00:00
avg
2758f60873 trace thread running state when a thread is run for the first time
This applies to both KTR_SCHED and DTrace sched:::on-cpu tracing.

MFC after:	10 days
2017-03-11 15:57:41 +00:00
avg
875dfd6f55 actually implement proc:::lwp-exit probe
MFC after:	4 days
2017-03-11 15:47:27 +00:00
mmokhi
8931d7c7fb Fix NULL pointer dereference and panic with shm file pread/pwrite.
PR:		217429
Reported by:	Tim Newsham <tim.newsham@nccgroup.trust>
Reviewed by:	kib
Approved by:	dchagin
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9844
2017-03-10 10:09:44 +00:00
glebius
e5ca15252d In linker_load_file() print name of a file that failed to load.
Discussed with:	kib
2017-03-09 00:56:07 +00:00
glebius
75fbfbcc07 Reduce stack usage in link_elf_load_file(), allocating struct nameidata.
This function may be called recursively when a module pulls in its dependencies.
Under certain circumstances, e.g. a four-deep chain of dependencies and the
presence of dtrace, we may run out of stack.
2017-03-09 00:45:15 +00:00
glebius
08e629b875 m_mbuftouio() doesn't modify the mbuf.
2017-03-07 19:00:50 +00:00
badger
320c52139d don't stop in issignal() if P_SINGLE_EXIT is set
Suppose a traced process is stopped in ptracestop() due to receipt of a
SIGSTOP signal, and is awaiting orders from the tracing process on how
to handle the signal. Before sending any such orders, the tracing
process exits. This should kill the traced process. But suppose a second
thread handles the SIGKILL and proceeds to exit1(), calling
thread_single(). The first thread will now awaken and will have a chance
to check once more if it should go to sleep due to the SIGSTOP.  It must
not sleep after P_SINGLE_EXIT has been set; this would prevent the
SIGKILL from taking effect, leaving a stopped orphan behind after the
tracing process dies.

Also add new tests for this condition.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9890
2017-03-07 13:41:01 +00:00
kib
02da26ef90 When selecting brand based on old Elf branding, prefer the brand whose
interpreter exactly matches the one requested by the activated image.

This change applies r295277, which did the same for note branding, to
the old brand selection, with the same reasoning of fixing compat32
interpreter substitution.

PR:	211837
Reported by:	kenji@kens.fm
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:38:25 +00:00
kib
2594ff8ef5 Require whole brand string matching for old Elf branding.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:37:35 +00:00
kib
df94808752 Consistently use vm_ooffset_t type for the vm object offset in
elf_load_section.

The values passed currently as vm_offset_t are phdr.p_offset, which
have the native Elf word size.  Since elf_load_section interprets them
as the file offset, use vm object offset type.

Noted and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-07 13:36:43 +00:00
hiren
808c7eb482 Fix the KASSERT check from r314813.
len being 0 is valid.

Submitted by:	ngie
Reported by:	ngie (via jenkins test run)
Sponsored by:	Limelight Networks
2017-03-07 06:46:38 +00:00
hiren
7bc7c2fbf3 We've found a recurring problem where some userland process would be
stuck spinning at 100% CPU around sbcut_internal(). Inside
sbflush_internal(), sb_ccc reached about 4GB, and before passing it
to sbcut_internal(), we cast it from uint to int, making it negative.

The root cause of the sockbuf growing this large is unknown. The correct
fix is also unclear, but based on mailing list discussions, add
KASSERTs to panic instead of looping endlessly.

Reviewed by:		glebius
Sponsored by:		Limelight Networks
2017-03-07 00:20:01 +00:00
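Illustration for 7bc7c2fbf3: a sketch of how an unsigned byte count near 4GB turns negative when passed through a signed parameter. Both helper names are hypothetical. The out-of-range conversion is implementation-defined in C, and this sketch assumes the wrapping behavior of common compilers.

```c
#include <stdint.h>

/* Stand-in for a routine that takes a signed length; 0 is a valid length. */
static int
cut_len_is_valid(int32_t len)
{
	return (len >= 0);
}

/* Models passing an unsigned byte count through a signed parameter. */
static int32_t
as_signed_len(uint32_t ccc)
{
	/* Implementation-defined when ccc > INT32_MAX; wraps on gcc and clang. */
	return ((int32_t)ccc);
}
```

An assertion on the validity check turns the silent endless loop into an immediate, diagnosable failure.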
glebius
7344787f15 Fix compilation of r314784 on 32 bit.
2017-03-06 22:32:56 +00:00
glebius
31a27dce02 In panic(), print the current timestamp, which matches the timestamp in the
dump header.  This will help to correlate console server logs with dump files,
no matter how precise the clock on a console server appliance is, and how
buggy the appliance is.
2017-03-06 19:14:08 +00:00
kib
5c17fc3007 Instead of direct use of vm_map_insert(), call vm_map_fixed(MAP_CHECK_EXCL).
This KPI explicitly indicates the intent of creating the mapping at
the fixed address, and incorporates the map locking into the callee.

Suggested and reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-06 14:09:54 +00:00
alc
c1537aec36 Style and punctuation fixes.
Reviewed by:	kib
MFC after:	3 days
2017-03-05 23:59:04 +00:00
manu
0b2f987bcf Export a sysctl dev.<clkdom>.<unit>.clocks for each clock domain containing
all the clocks that it provides.
Each clock is exported under the node 'clock.<clkname>' and has the following
child nodes:
- frequency
- parent (The selected parent, if any)
- parents (The list of parents, if any)
- childrens (The list of children, if any)
- enable_cnt (The enabled counter)

This gives us the ability to examine clocks at runtime and to graph
the clock flow.

Reviewed by:	mmel
MFC after:	2 month
Differential Revision:	https://reviews.freebsd.org/D9833
2017-03-05 07:13:29 +00:00
vangyzen
894839ac8b Fix grammar in some comments in subr_sleepqueue.c
While I'm here, remove trailing whitespace.

Reviewed by:	kib, mostly, as part of a larger review
MFC after:	3 days
2017-03-03 21:03:28 +00:00
markj
8b1baed602 Fix a ticks comparison in sched_pctcpu_update().
We may fail to reset the %CPU tracking window if a thread does not run
for over half of the ticks rollover period, resulting in a bogus %CPU
value for the thread until ticks fully rolls over. Handle this by comparing
the unsigned difference ticks - ts_ltick with SCHED_TICK_TARG instead.

Reviewed by:	cem, jeff
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-03-03 20:57:40 +00:00
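Illustration for 8b1baed602: a sketch contrasting a signed comparison of tick counters with a comparison of their unsigned difference. Both function names and the exact form of the signed check are assumptions made for this example, not the scheduler's code.

```c
#include <limits.h>

/* Naive signed check (hypothetical): fails once "ticks" has wrapped negative. */
static int
window_expired_signed(int ticks, int ltick, int targ)
{
	return ((long long)ltick + targ <= (long long)ticks);
}

/* Unsigned-difference check: modular arithmetic tolerates the wrap. */
static int
window_expired_unsigned(int ticks, int ltick, unsigned int targ)
{
	return ((unsigned int)ticks - (unsigned int)ltick >= targ);
}
```

When the counter has wrapped to a negative value since the last update, the signed form reports that the window has not expired, while the unsigned difference still reflects the elapsed ticks.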
emaste
7f416bbe91 kern_sig.c: ANSIfy and remove archaic register keyword
Sponsored by:	The FreeBSD Foundation
2017-03-02 22:17:53 +00:00
kib
6a00199f55 Style.
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-03-02 17:35:13 +00:00
hselasky
df21508671 Implement taskqueue_poll_is_busy() for use by the LinuxKPI.
Refer to comment above function for a detailed description.

Discussed with:		kib @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-02 12:20:23 +00:00
sbruno
aad65e38ff Make gtaskqueue compatible with drm-next such that they can be used with the
linuxkpi tasklets.

Submitted by:	mmacy@nextbsd.org
Reported by:	hps
2017-03-01 18:37:35 +00:00
kib
4f433ead68 Use vm_map_insert() instead of vm_map_find() in elf_map_insert().
Elf_map_insert() needs to create a mapping at a known fixed address.
Usage of vm_map_find() assumes, on the other hand, that any suitable
address space range above or equal to the specified hint is acceptable.
Due to operating on a fresh or cleared address space, vm_map_find()
usually creates the mapping starting exactly at the hint.

Switch to vm_map_insert() to clearly request a fixed mapping from
the VM.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-01 10:28:15 +00:00
kib
cf53f19d89 When deallocating the vm object in elf_map_insert() due to
vm_map_insert() failure, drop the vnode lock around the call to
vm_object_deallocate().

Since the deallocated object is the vm object of the vnode, we might
get vnode lock recursion there.  In fact, it is almost impossible
to make vm_map_insert() fail there on a stock kernel.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-01 10:22:07 +00:00
mjg
ab50e3ceda locks: ensure proper barriers are used with atomic ops when necessary
Unclear how, but the locking routine for mutexes was using the *release*
barrier instead of acquire. This must have been either a copy-pasto or bad
completion.

Going through other uses of atomics shows no barriers in:
- upgrade routines (addressed in this patch)
- sections protected with turnstile locks - this should be fine as necessary
  barriers are in the worst case provided by turnstile unlock

I would like to thank Mark Millard and andreast@ for reporting the problem and
testing previous patches before the issue got identified.

ps.
  .-'---`-.
,'          `.
|             \
|              \
\           _  \
,\  _    ,'-,/-)\
( * \ \,' ,' ,'-)
 `._,)     -',-')
   \/         ''/
    )        / /
   /       ,'-'

Hardware provided by: IBM LTC
2017-03-01 05:06:21 +00:00
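Illustration for ab50e3ceda: a minimal C11 sketch of the barrier pairing that a lock needs, with acquire semantics when taking the lock and release semantics when dropping it. The tiny_lock type and its functions are hypothetical and use <stdatomic.h> rather than the kernel's atomic(9) primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal lock word: 0 = unowned, 1 = owned. */
struct tiny_lock {
	atomic_int	owned;
};

static bool
tiny_trylock(struct tiny_lock *l)
{
	int expected = 0;

	/* Acquire: later accesses cannot be reordered before taking the lock. */
	return (atomic_compare_exchange_strong_explicit(&l->owned, &expected,
	    1, memory_order_acquire, memory_order_relaxed));
}

static void
tiny_unlock(struct tiny_lock *l)
{
	/* Release: earlier accesses cannot be reordered after dropping the lock. */
	atomic_store_explicit(&l->owned, 0, memory_order_release);
}
```

Using a release barrier on the locking side would leave accesses inside the critical section free to move ahead of the lock acquisition on weakly ordered hardware.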