Commit Graph

16577 Commits

Author SHA1 Message Date
Ed Maste
0e26cd440f make sysent after r347228
Regenerate to add @generated tag in generated files.
2019-05-07 18:10:21 +00:00
Conrad Meyer
7d7db5298d device_printf: Use sbuf for more coherent prints on SMP
device_printf does multiple calls to printf allowing other console messages to
be inserted between the device name, and the rest of the message.  This change
uses sbuf to compose to two into a single buffer, and prints it all at once.

It exposes an sbuf drain function (drain-to-printf) for common use.

Update documentation to match; some unit tests included.

Submitted by:	jmg
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D16690
2019-05-07 17:47:20 +00:00
Ed Maste
5350e15d0d makesyscalls: use @generated tag in generated files
Multiple tools use @generated to identify generated files (for example,
in a review Phabricator will by default hide diffs in generated files).
Use the @generated tag in makesyscalls.sh as we've done for other
generated files.

Reviewed by:	cem
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20183
2019-05-07 16:17:33 +00:00
Mark Johnston
7d43b5c98e Simplify the test against maxproc in fork1().
Previously nprocs_new would be tested against maxprocs twice when
nprocs_new < maxprocs - 10.  Eliminate the unnecessary comparison.

Submitted by:	Wuyang Chung <wuyang.chung1@gmail.com>
GitHub PR:	https://github.com/freebsd/freebsd/pull/397
MFC after:	1 week
2019-05-07 15:03:26 +00:00
Doug Moore
27d172bb12 The intention of the blist cursor is for the search for free blocks to
resume where the last search left off. Suppose that there are no free
blocks of size 32, but plenty of size 16. If we repeatedly request
size 32 blocks, fail, and retry with size 16 blocks, then the failures
all reset the cursor to the beginning of memory, making the 16 block
allocation use a first fit, rather than next fit, strategy.

This change has blist_alloc make a copy of the cursor for its own
decision making, and only updates the real blist cursor after a
successful allocation, making those 16 block searches behave like
next-fit searches.

Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D20177
2019-05-06 22:12:15 +00:00
Conrad Meyer
6b6e2954dd List-ify kernel dump device configuration
Allow users to specify multiple dump configurations in a prioritized list.
This enables fallback to secondary device(s) if primary dump fails.  E.g.,
one might configure a preference for netdump, but fallback to disk dump as a
second choice if netdump is unavailable.

This change does not list-ify netdump configuration, which is tracked
separately from ordinary disk dumps internally; only one netdump
configuration can be made at a time, for now.  It also does not implement
IPv6 netdump.

savecore(8) is already capable of scanning and iterating multiple devices
from /etc/fstab or passed on the command line.

This change doesn't update the rc or loader variables 'dumpdev' in any way;
it can still be set to configure a single dump device, and rc.d/savecore
still uses it as a single device.  Only dumpon(8) is updated to be able to
configure the more complicated configurations for now.

As part of revving the ABI, unify netdump and disk dump configuration ioctl
/ structure, and leave room for ipv6 netdump as a future possibility.
Backwards-compatibility ioctls are added to smooth ABI transition,
especially for developers who may not keep kernel and userspace perfectly
synced.

Reviewed by:	markj, scottl (earlier version)
Relnotes:	maybe
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19996
2019-05-06 18:24:07 +00:00
Konstantin Belousov
78022527bb Switch to use shared vnode locks for text files during image activation.
kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.

The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount.  To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal

The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change.  vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP.  vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.

nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.

On the text vnode' vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuit the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.

Reviewed by:	markj, trasz
Tested by:	mjg, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D19923
2019-05-05 11:20:43 +00:00
Konstantin Belousov
2d6b8546b7 imgact_elf: do not relock the text vnode if possible.
We unlock the vnode around malloc(M_WAITOK), to make it possible for
pagedaemon to flush vnode pages for us.  Instead of doing it
unconditionally, first try M_NOWAIT allocation, which typically
succeed.  Only on failure, unlock the vnode and retry with M_WAITOK.

Reviewed by:	markj, trasz
Tested by:	mjg, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19923
2019-05-05 11:04:01 +00:00
Conrad Meyer
665919aaaf x86: Implement MWAIT support for stopping a CPU
IPI_STOP is used after panic or when ddb is entered manually.  MONITOR/
MWAIT allows CPUs that support the feature to sleep in a low power way
instead of spinning.  Something similar is already used at idle.

It is perhaps especially useful in oversubscribed VM environments, and is
safe to use even if the panic/ddb thread is not the BSP.  (Except in the
presence of MWAIT errata, which are detected automatically on platforms with
known wakeup problems.)

It can be tuned/sysctled with "machdep.stop_mwait," which defaults to 0
(off).  This commit also introduces the tunable
"machdep.mwait_cpustop_broken," which defaults to 0, unless the CPU has
known errata, but may be set to "1" in loader.conf to signal that mwait
wakeup is broken on CPUs FreeBSD does not yet know about.

Unfortunately, Bhyve doesn't yet support MONITOR extensions, so this doesn't
help bhyve hypervisors running FreeBSD guests.

Submitted by:   Anton Rang <rang AT acm.org> (earlier version)
Reviewed by:	kib
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20135
2019-05-04 20:34:26 +00:00
Mateusz Guzik
5408c6db4e sysv: get rid of fork/exit hooks if the code is compiled in
Sponsored by:	The FreeBSD Foundation
2019-05-04 19:05:30 +00:00
Mateusz Guzik
37d2b1f3e5 Annotate nprocs with __exclusive_cache_line
Sponsored by:	The FreeBSD Foundation
2019-05-04 19:04:17 +00:00
Mark Johnston
bc79b41c40 Disallow excessively small times of day in clock_settime(2).
Reported by:	syzkaller
Reviewed by:	cem, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20151
2019-05-03 21:26:44 +00:00
Mark Johnston
8e7130a8a7 Stop checking TD_IDLETHREAD() in buffer cache routines.
These predicates are vestigal and cannot be true today.  For example,
idle threads are not allowed to acquire locks.

Also cache curthread in breada().

No functional change intended.

Reviewed by:	kib, mckusick
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20066
2019-04-29 13:23:32 +00:00
Alan Somers
f841e638fb [skip ci] fix typo in comment from r59840
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-04-26 15:00:59 +00:00
Ed Maste
5803d72f7e make sysent after r346273 (readlinkat arg correction)
PR:		197915
Reminded by:	dchagin
2019-04-26 12:55:52 +00:00
John Baldwin
83bf5ec367 Remove p_code from struct proc.
Contrary to the comments, it was never used by core dumps or
debuggers.  Instead, it used to hold the signal code of a pending
signal, but that was replaced by the 'ksi_code' member of ksiginfo_t
when signal information was reworked in 7.0.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D20047
2019-04-25 18:42:07 +00:00
Andrew Gallatin
50575ce11c Track TCP connection's NUMA domain in the inpcb
Drivers can now pass up numa domain information via the
mbuf numa domain field.  This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.

Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.

Reviewed by:	markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20028
2019-04-25 15:37:28 +00:00
Warner Losh
2834e42cda When parsing command line stuff, treat tabs and spaces the same.
When creating complex config files, people like to use tabs to offset
sections. Treat them the same as spaces for delimiters.
2019-04-18 22:52:12 +00:00
Conrad Meyer
ba57dad4b0 stack_protector: Add tunable to bypass random cookies
This is a stopgap measure to unbreak installer/VM/embedded boot issues
introduced (or at least exposed by) in r346250.

Add the new tunable, "security.stack_protect.permit_nonrandom_cookies," in
order to continue boot with insecure non-random stack cookies if the random
device is unavailable.

For now, enable it by default.  This is NOT safe.  It will be disabled by
default in a future revision.

There is follow-on work planned to use fast random sources (e.g., RDRAND on
x86 and DARN on Power) to seed when the early entropy file cannot be
provided, for whatever reason.  Please see D19928.

Some better hacks may be used to make the non-random __stack_chk_guard
slightly less predictable (from delphij@ and mjg@); those suggestions are
left for a future revision.  I think it may also be plausible to move stack
guard initialization far later in the boot process; potentially it could be
moved all the way to just before userspace is started.

Reported by:	many
Reviewed by:	delphij, emaste, imp (all w/ caveat: this is a stopgap fix)
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19927
2019-04-16 18:47:20 +00:00
Ed Maste
e8ee7d9035 correct readlinkat(2) return type
r176215 corrected readlink(2)'s return type and the type of the last
argument.  readlink(2) was introduced in r177788 after being developed
as part of Google Summer of Code 2007; it appears to have inherited the
wrong return type.

Man pages and header files were already ssize_t; update syscalls.master
to match.

PR:		197915
Submitted by:	Henning Petersen <henning.petersen@t-online.de>
MFC after:	2 weeks
2019-04-16 13:26:31 +00:00
Conrad Meyer
13774e8228 random(4): Block read_random(9) on initial seeding
read_random() is/was used, mostly without error checking, in a lot of
very sensitive places in the kernel -- including seeding the widely used
arc4random(9).

Most uses, especially arc4random(9), should block until the device is seeded
rather than proceeding with a bogus or empty seed.  I did not spy any
obvious kernel consumers where blocking would be inappropriate (in the
sense that lack of entropy would be ok -- I did not investigate locking
angle thoroughly).  In many instances, arc4random_buf(9) or that family
of APIs would be more appropriate anyway; that work was done in r345865.

A minor cleanup was made to the implementation of the READ_RANDOM function:
instead of using a variable-length array on the stack to temporarily store
all full random blocks sufficient to satisfy the requested 'len', only store
a single block on the stack.  This has some benefit in terms of reducing
stack usage, reducing memcpy overhead and reducing devrandom output leakage
via the stack.  Additionally, the stack block is now safely zeroed if it was
used.

One caveat of this change is that the kern.arandom sysctl no longer returns
zero bytes immediately if the random device is not seeded.  This means that
FreeBSD-specific userspace applications which attempted to handle an
unseeded random device may be broken by this change.  If such behavior is
needed, it can be replaced by the more portable getrandom(2) GRND_NONBLOCK
option.

On any typical FreeBSD system, entropy is persisted on read/write media and
used to seed the random device very early in boot, and blocking is never a
problem.

This change primarily impacts the behavior of /dev/random on embedded
systems with read-only media that do not configure "nodevice random".  We
toggle the default from 'charge on blindly with no entropy' to 'block
indefinitely.'  This default is safer, but may cause frustration.  Embedded
system designers using FreeBSD have several options.  The most obvious is to
plan to have a small writable NVRAM or NAND to persist entropy, like larger
systems.  Early entropy can be fed from any loader, or by writing directly
to /dev/random during boot.  Some embedded SoCs now provide a fast hardware
entropy source; this would also work for quickly seeding Fortuna.  A 3rd
option would be creating an embedded-specific, more simplistic random
module, like that designed by DJB in [1] (this design still requires a small
rewritable media for forward secrecy).  Finally, the least preferred option
might be "nodevice random", although I plan to remove this in a subsequent
revision.

To help developers emulate the behavior of these embedded systems on
ordinary workstations, the tunable kern.random.block_seeded_status was
added.  When set to 1, it blocks the random device.

I attempted to document this change in random.4 and random.9 and ran into a
bunch of out-of-date or irrelevant or inaccurate content and ended up
rototilling those documents more than I intended to.  Sorry.  I think
they're in a better state now.

PR:		230875
Reviewed by:	delphij, markm (earlier version)
Approved by:	secteam(delphij), devrandom(markm)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D19744
2019-04-15 18:40:36 +00:00
Rick Macklem
eeb1f3ed51 Fix the NFSv4 client to safely find processes.
r340744 broke the NFSv4 client, because it replaced pfind_locked() with a
call to pfind(), since pfind() acquires the sx lock for the pid hash and
the NFSv4 already holds a mutex when it does the call.
The patch fixes the problem by recreating a pfind_any_locked() and adding the
functions pidhash_slockall() and pidhash_sunlockall to acquire/release
all of the pid hash locks.
These functions are then used by the NFSv4 client instead of acquiring
the allproc_lock and calling pfind().

Reviewed by:	kib, mjg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19887
2019-04-15 01:27:15 +00:00
Edward Tomasz Napierala
91ff2d4883 Remove unneeded conditionals for sv_ functions - all the ABIs
(apart from null_sysvec) define them, so the 'else' branch is
never taken.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19889
2019-04-12 14:18:16 +00:00
Edward Tomasz Napierala
4033ecc915 Use shared vnode locks for the ELF interpreter.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19874
2019-04-11 11:21:45 +00:00
Alan Somers
691d4ab6f0 fix cache_lookup's documentation
cache_lookup's documentation got dislocated by r324378. Relocate and expand
it.

Reviewed by:	jhb, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-04-10 13:02:33 +00:00
Edward Tomasz Napierala
b65ca345ef Improve vnode lock assertions.
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2019-04-10 10:21:14 +00:00
Konstantin Belousov
ae90941431 Add vn_fsync_buf().
Provide a convenience function to avoid the hack with filling fake
struct vop_fsync_args and then calling vop_stdfsync().

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-04-09 20:20:04 +00:00
Edward Tomasz Napierala
9bcd7482b2 Factor out section loading into a separate function.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19846
2019-04-09 15:24:38 +00:00
Edward Tomasz Napierala
9274fb3599 Refactor ELF interpreter loading into a separate function.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19741
2019-04-08 14:31:07 +00:00
Mariusz Zaborski
de0b14f2db In the unlinkat syscall, the operation is performed on the directory
descriptor, not the file descriptor. The file descriptor is used only for
verification so do not expect any additional capabilities on it.

Reported by:	antoine
Tested by:	antoine
Discussed with:	kib, emaste, bapt
Sponsored by:	Fudo Security
2019-04-08 14:23:52 +00:00
Mark Johnston
128c9bc05b Set the p_oppid field of orphans when exiting.
Such processes will be reparented to the reaper when the current
parent is done with them (i.e., ptrace detached), so p_oppid must be
updated accordingly.

Add a regression test to exercise this code path.  Previously it
would not be possible to reap an orphan with a stale oppid.

Reviewed by:	kib, mjg
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19825
2019-04-07 14:26:14 +00:00
Conrad Meyer
d1139b5286 kern/subr_pctrie: Fix mismatched signedness in assertion comparison
'tos' is an index into an array and never holds a negative value.  Correct
its signedness to match PCTRIE_LIMIT, which it is compared to in assertions.

No functional change (kills a warning).
2019-04-06 21:56:24 +00:00
Conrad Meyer
04f9afae18 kern/subr_pctrie: Convert old-style boolean_t to plain "bool"
No functional change.
2019-04-06 20:38:44 +00:00
Mariusz Zaborski
a489026566 Regen after r345982. 2019-04-06 09:37:10 +00:00
Mariusz Zaborski
a1304030b8 Introduce funlinkat syscall that always us to check if we are removing
the file associated with the given file descriptor.

Reviewed by:	kib, asomers
Reviewed by:	cem, jilles, brooks (they reviewed previous version)
Discussed with:	pjd, and many others
Differential Revision:	https://reviews.freebsd.org/D14567
2019-04-06 09:34:26 +00:00
Konstantin Belousov
1d1a5c2b02 Add DEV_RESET /dev/devctl2 ioctl.
It performs BUS_RESET_CHILD() on the parental bus and the specified
device.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 19:31:26 +00:00
Konstantin Belousov
c53df6da4e Provide newbus infrastructure for initiating device reset.
The methods BUS_RESET_PREPARE(), BUS_RESET(), and BUS_RESET_POST()
should be implemented by bus which can provide reset to a device.  The
methods are described in inline doxygen comments.

Code only provides BUS_RESET_PREPARE() and BUS_RESET_POST() helpers
instead of default implementation, because actual bus needs to handle
device state around reset, while helpers provide the other half of
typical prepare/post code.

Reviewed by:	imp (previous version), jhb (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D19646
2019-04-05 18:09:22 +00:00
Konstantin Belousov
4ae3f5a7fd vn_vmap_seekhole(): align running offset to the block boundary.
Otherwise we might miss the last iteration where EOF appears below
unaligned noff.

Reported and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19811
2019-04-05 16:14:16 +00:00
Konstantin Belousov
be7808dca3 Fix branding after r345661.
In particular, elf32 FreeBSD binaries were not executed on LP64 hosts.
The interp_name_len value should account for the nul terminator.  This
is needed for strncmp()s in brand checking code to work.

Reported by:	andreast
Sponsored by:	The FreeBSD Foundation
MFC after:	12 days (together with r345661)
2019-03-30 16:58:51 +00:00
Edward Tomasz Napierala
09c78d53bf Factor out retrieving the interpreter path from the main ELF
loader routine.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19715
2019-03-28 21:43:01 +00:00
Oleksandr Tymoshenko
b25ce41e33 Change default value of kern.bootfile to reflect reality
In most cases kernel.bootfile is populated from the information
provided by loader(8). There are certain scenarios when loader
is not available, for instance when kernel is loaded by u-boot
or some other BootROM directly. In this case the default value
"/kernel" points to invalid location and breaks some functinality,
like using installkernel on self-hosted system or dtrace's CTF
lookup. This can be fixed by setting the value manually but the
default that reflects correct location is better than default that
points to invalid one.

Current default was set around FreeBSD 1, when "/kernel" was the
actual path. Transition to /boot/kernel/kernel happened circa FreeBSD 3.

PR:		221550
Reviewed by:	ian, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18902
2019-03-26 18:03:18 +00:00
Edward Tomasz Napierala
20e1174a00 Factor out resource limit enforcement code in the ELF loader.
It makes the code slightly easier to follow, and might make
it easier to fix the resouce accounting to also account for
the interpreter.

The PROC_UNLOCK() is moved earlier - I don't see anything
it should protect; the lim_max() is a wrapper around lim_rlimit(),
and that, differently from lim_rlimit_proc(), doesn't require
the proc lock to be held.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19689
2019-03-26 15:35:49 +00:00
Mark Johnston
fd76e780a7 Reject F_SETLK_REMOTE commands when sysid == 0.
A sysid of 0 denotes the local system, and some handlers for remote
locking commands do not attempt to deal with local locks.  Note that
F_SETLK_REMOTE is only available to privileged users as it is intended
to be used as a testing interface.

Reviewed by:	kib
Reported by:	syzbot+9c457a6ae014a3281eb8@syzkaller.appspotmail.com
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19702
2019-03-25 21:38:58 +00:00
Ian Lepore
6d51c0fe72 Revert accidental change that should not have been included in r345475.
I had changed this value as part of a local experiment, and neglected to
change it back before committing the other changes.
2019-03-24 18:02:27 +00:00
Ian Lepore
67da50a047 Truncate a too-long interrupt handler name when there is only one handler.
There are only 19 bytes available for the name of an interrupt plus the
name(s) of handlers/drivers using it. There is a mechanism from the days of
shared interrupts that replaces some of the handler names with '+' when they
don't all fit into 19 bytes.

In modern times there is typically only one device on an interrupt, but long
device names are the norm, especially with embedded systems. Also, in systems
with multiple interrupt controllers, the names of the interrupts themselves
can be long. For example, 'gic0,s54: imx6_anatop0' doesn't fit, and
replacing the device driver name with a '+' provides no useful info at all.

When there is only one handler but its name was too long to fit, this
change truncates enough leading chars of the handler name (replacing them
with a '-' char to indicate that some chars are missing) to use all 19
bytes, preserving the unit number typically on the end of the name. Using
the prior example, this results in: 'gic0,s54:-6_anatop0' which provides
plenty of info to figure out which device is involved.

PR:		211946
Reviewed by:	gonzo@ (prior version without the '-' char)
Differential Revision:	https://reviews.freebsd.org/D19675
2019-03-24 17:53:26 +00:00
Ravi Pokala
557e162fe7 Add descriptions for sysctls in kern_mib.c and sysctl.3 which lack them.
r343532 noted the difference between "hw.realmem" and "hw.physmem", which I
was previously unaware of. I discovered that neither sysctl had a
description visible via `sysctl -d', so I found where they were defined and
added suitable descriptions. While in the file, I went ahead and added
descriptions for all the others which lacked them. I also updated sysctl.3
accordingly

Reviewed by:	kib, bcr
MFC after:	1 weeks
Sponsored by:	Panasas
Differential Revision:	https://reviews.freebsd.org/D19007
2019-03-23 19:53:15 +00:00
Edward Tomasz Napierala
545517f198 Remove trunc_page_ps() and round_page_ps() macros. This completes
the undoing of r100384.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19680
2019-03-23 13:41:14 +00:00
Andrew Gallatin
f552633918 Fix a typo introduced in r344133
The line was misedited to change tt to st instead of
changing ut to st.

The use of st as the denominator in mul64_by_fraction() will lead
to an integer divide fault in the intr proc (the process holding
ithreads) where st will be 0.  This divide by 0 happens after
the total runtime for all ithreads exceeds 76 hours.

Submitted by: bde
2019-03-18 12:41:42 +00:00
Konstantin Belousov
fd8d844f76 amd64 KPTI: add control from procctl(2).
Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting.  PTI mode cannot be
modified for existing pmap, the knob controls PTI of the new vmspace
created on exec.

Requested by:	jhb
Reviewed by:	jhb, markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:44:33 +00:00
Konstantin Belousov
6f1fe3305a amd64: Add md process flags and first P_MD_PTI flag.
PTI mode for the process pmap on exec is activated iff P_MD_PTI is set.

On exec, the existing vmspace can be reused only if pti mode of the
pmap matches the P_MD_PTI flag of the process.  Add MD
cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can
vetoed reuse of the existing vmspace.

MFC note: md_flags change struct proc KBI.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:31:01 +00:00