Commit Graph

16353 Commits

Author SHA1 Message Date
Mark Johnston
aeb7a84ee1 Remove mostly-useless proc provider probes.
For some reason the proc UMA zone's ctor, dtor and init functions are
instrumented, but these functions are always available through FBT.
Moreover, the probes are not part of the original Solaris proc
provider, aren't documented, have no uses (e.g., in dwatch(8)) and
have no clear use to begin with.  Therefore, remove them.

Reviewed by:	rpaulo
Differential Revision:	https://reviews.freebsd.org/D2169
2018-11-15 23:02:59 +00:00
Warner Losh
36173f6976 Do proper conversion to/from sbt.
Doh! sbttoX and Xtosbt were backwards. While they ran, they produced
bogus results.

Pointy hat to: imp@
2018-11-15 16:02:24 +00:00
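
As an illustration of the two conversion directions that 36173f6976 sorts out, here is a minimal user-space sketch. It assumes a signed 32.32 fixed-point time value modelled on sbintime_t; the names sketch_sbt_t, sketch_ustosbt and sketch_sbttous are hypothetical and are not the kernel's functions, and the rounding shown (truncation) is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: signed 32.32 fixed point, modelled on sbintime_t. */
typedef int64_t sketch_sbt_t;

#define	SKETCH_SBT_1S	((sketch_sbt_t)1 << 32)

/*
 * Microseconds -> fixed point.  Non-negative input only; the result
 * truncates, and the whole-second part must fit in 31 bits.
 */
static sketch_sbt_t
sketch_ustosbt(int64_t us)
{
	assert(us >= 0);
	/* Split off whole seconds so the shift below cannot overflow. */
	return ((us / 1000000) * SKETCH_SBT_1S +
	    (((us % 1000000) << 32) / 1000000));
}

/* Fixed point -> microseconds.  Non-negative input only; truncates. */
static int64_t
sketch_sbttous(sketch_sbt_t sbt)
{
	assert(sbt >= 0);
	/* Whole seconds from the high half, fraction scaled from the low half. */
	return ((sbt >> 32) * 1000000 +
	    (((sbt & 0xffffffff) * 1000000) >> 32));
}
```

Mixing up the two directions still produces a number, which matches the commit's remark that the backwards versions ran but gave bogus results.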
Gleb Smirnoff
905837ebe7 Initialize compatibility epoch tracker for thread0. Fixes
panics for drivers that call if_maddr_lock() during startup.

Reported by:	cy
2018-11-14 19:10:35 +00:00
Brooks Davis
5b1df30051 Use the main capabilities.conf for freebsd32.
Allow the location of capabilities.conf to be configured.

Also allow a per-abi syscall prefix to be configured with the
abi_func_prefix syscalls.conf variable and check syscalls against
entries in capabilities.conf with and without the prefix amended.

Take advantage of these two features to share capabilities.conf
between the default syscall vector and the freebsd32 compatibility
layer.  We've been inconsistent about keeping the two in sync as
evidenced by the bugs fixed in r340294.  This eliminates that problem
going forward.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17932
2018-11-14 00:46:02 +00:00
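
The prefix check described in 5b1df30051 could be sketched roughly as follows. This is a user-space C illustration only: the real check lives in the syscall generation script, and the function names, the in-memory list, and the choice to strip the prefix from the syscall name before the second lookup are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: is name present in the list of allowed syscalls? */
static int
sketch_in_list(const char *name, const char *const list[], size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, list[i]) == 0)
			return (1);
	return (0);
}

/*
 * Hypothetical check: accept a syscall if it is listed as-is, or if it
 * is listed once the per-ABI prefix has been stripped from its name.
 */
static int
sketch_cap_enabled(const char *name, const char *prefix,
    const char *const list[], size_t n)
{
	size_t plen;

	assert(name != NULL && prefix != NULL);
	if (sketch_in_list(name, list, n))
		return (1);
	plen = strlen(prefix);
	if (plen > 0 && strncmp(name, prefix, plen) == 0)
		return (sketch_in_list(name + plen, list, n));
	return (0);
}
```

With a single list containing "read" and "write", both "read" and a prefixed "freebsd32_read" would be accepted, which is what lets one file serve both syscall vectors.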
Gleb Smirnoff
6febf18036 Fix build on some architectures after r340413. On amd64 epoch.h
appeared to be included implicitly.
2018-11-14 00:33:03 +00:00
Matt Macy
91cf497515 epoch(9) revert r340097 - no longer a need for multiple sections per cpu
I spoke with Samy Bahra and recent changes to CK to make ck_epoch_call and
ck_epoch_poll not modify the record have eliminated the need for this.
2018-11-14 00:12:04 +00:00
Gleb Smirnoff
635c18840a style(9), mostly adjusting overly long lines.
2018-11-13 23:57:34 +00:00
Gleb Smirnoff
a760c50c9e With epoch not inlined, there is no point in using _lite KPI. While here,
remove some unnecessary casts.
2018-11-13 23:45:38 +00:00
Gleb Smirnoff
9f360eecf9 The dualism between epoch_tracker and epoch_thread is fragile and
unnecessary. So, expose CK types to kernel and use a single normal
structure for epoch_tracker.

Reviewed by:	jtl, gallatin
2018-11-13 23:20:55 +00:00
Gleb Smirnoff
b79aa45e0e For compatibility KPI functions like if_addr_rlock() that used to have
mutexes but now are converted to epoch(9) use thread-private epoch_tracker.
Embedding tracker into ifnet(9) or ifnet derived structures creates a
non-reentrant function that will fail miserably if called simultaneously from
two different contexts.
A thread-private tracker provides a single tracker that allows these
functions to be called safely. It doesn't allow nested calls, but this is not
expected from compatibility KPIs.

Reviewed by:	markj
2018-11-13 22:58:38 +00:00
Mateusz Guzik
f183fb162c locks: plug warnings about uninitialized variables
They only showed up after I redefined LOCKSTAT_ENABLED to 0.

doing_lockprof in mutex.c is a real (but harmless) bug. Should the
value be non-zero it will do checks for lock profiling which would
otherwise be skipped.

state in rwlock.c is a wart from the compiler, the value can't be
used if lock profiling is not enabled.

Sponsored by:	The FreeBSD Foundation
2018-11-13 21:29:56 +00:00
Eric van Gyzen
d54474e63b Make no assertions about lock state when the scheduler is stopped.
Change the assert paths in rm, rw, and sx locks to match the lock
and unlock paths.  I did this for mutexes in r306346.

Reported by:	Travis Lane <tlane@isilon.com>
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2018-11-13 20:48:05 +00:00
Gleb Smirnoff
a82296c2df Uninline epoch(9) entrance and exit. There is no proof that modern
processors benefit from avoiding a function call, while inlining bloats the
code. In fact, clang created an uninlined real function for many
object files in the network stack.

- Move epoch_private.h into subr_epoch.c. Code copied exactly, avoiding
  any changes, including style(9).
- Remove private copies of critical_enter/exit.

Reviewed by:	kib, jtl
Differential Revision:	https://reviews.freebsd.org/D17879
2018-11-13 19:02:11 +00:00
Mark Johnston
bb4a27f927 Allow allocations across meta boundaries.
Remove restrictions that prevent allocation requests to cross the
boundary between two meta nodes.

Replace the bmu_avail field in meta nodes with a bitmap that identifies
which subtrees have some free memory, and iterate over the nonempty
subtrees only in blst_meta_alloc.  If free memory is scarce, this should
make searching for it faster.

Put the code for handling the next-leaf allocation in a separate
function.  When taking blocks from the next leaf empties the leaf, be
sure to clear the appropriate bit in its parent, and so on, up to the
least-common ancestor of this leaf and the next.

Eliminate special terminator nodes, and rely instead on the fact that
there is a 0-bit at the end of the bitmask at the root of the tree that
will stop a meta_alloc search, or a next-leaf search, before the search
falls off the end of the tree. Make sure that the tree is big enough to
have space for that 0-bit.

Eliminate special all-free indicators.  Lazy initialization of subtrees
stands in the way of having an allocation span a meta-node boundary, so
a subtree of all free blocks is not treated specially.  Subtrees of
all-allocated blocks are still recognized by looking at the bitmask at
the root and finding 0.

Don't print all-allocated subtrees.  Do print the bitmasks for meta
nodes, when tree-printing.

Submitted by:	Doug Moore <dougm@rice.edu>
Reviewed by:	alc
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12635
2018-11-13 18:40:01 +00:00
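
The idea in bb4a27f927 of visiting only the subtrees flagged as nonempty can be sketched in user-space C. This is not the blist code: the per-child free counts, the function names, and the "first child with enough free blocks" policy are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit; the caller guarantees mask != 0. */
static int
sketch_lowest_set(uint64_t mask)
{
	int i = 0;

	assert(mask != 0);
	while ((mask & 1) == 0) {
		mask >>= 1;
		i++;
	}
	return (i);
}

/*
 * Walk only the children whose bit is set in avail_mask and return the
 * first one holding at least want free blocks, or -1 if none does.
 * free_blocks[] is a hypothetical stand-in for descending into a child.
 */
static int
sketch_find_subtree(uint64_t avail_mask, const unsigned free_blocks[],
    unsigned want)
{
	while (avail_mask != 0) {
		int child = sketch_lowest_set(avail_mask);

		if (free_blocks[child] >= want)
			return (child);
		avail_mask &= avail_mask - 1;	/* clear the bit just visited */
	}
	return (-1);
}
```

When few bits are set, the loop runs once per nonempty child rather than once per child, which is the reason the commit expects searches to be faster when free memory is scarce.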
Kyle Evans
75beb4d46a Add dynamic_kenv assertion to init_static_kenv
Both to formally document the requirement that this not be called after the
dynamic kenv is setup, and to perhaps help static analyzers figure out
what's going on. While calling init_static_kenv this late isn't fatal, there
are some caveats that the caller should be aware of:

- Late calls are effectively a no-op, as far as default FreeBSD is
concerned, as everything will switch to searching the dynamic kenv once it's
available.

- Each of the kern_getenv calls will leak memory, as it's assumed that
these are searching the static environment and allocations will not be made.

As such, this usage is not sensible and should be detected.
2018-11-13 04:34:30 +00:00
Konstantin Belousov
389474c122 Allow set ether/vlan PCP operation from the VNET jails.
The vlan interfaces can be created from vnet jails, it seems, so it
sounds logical to allow pcp configuration as well.

Reviewed by:	bz, hselasky (previous version)
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17777
2018-11-12 15:59:32 +00:00
Conrad Meyer
0d1467b199 netdump: Fix netdumping with INVARIANTS kernels
Correct boneheaded assertion I added in r339501.  Mea culpa.

The intent is to notice when an M_WAITOK zone allocation would fail during
netdump, not to prevent all use of mbufs during netdump.

Reviewed by:	markj
X-MFC-With:	r339501
Differential Revision:	https://reviews.freebsd.org/D17957
2018-11-12 05:24:20 +00:00
Konstantin Belousov
8782eef46f Remove one-use variable.
This also removes a lot of #ifdefs and cleans up a warning when the
AUDIT kernel option is defined, but neither KDTRACE_HOOKS nor MAC are.

Reported and tested by:	danger
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-11-11 00:21:28 +00:00
Konstantin Belousov
ade85c5eec Allow absolute paths for O_BENEATH.
The path must have a tail which does not escape starting/topping
directory.  The documentation will come shortly; see the man pages
commit message for the reason for the separate commit.

Reviewed by:	jilles (previous version)
Discussed with:	emaste
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17714
2018-11-11 00:04:36 +00:00
Brooks Davis
9a38df59e9 Fix freebsd32 mknod(at).
As dev_t is now a 64-bit integer, it requires special handling as a
system call argument.  64-bit arguments are split between two 64-bit
integers due to the way arguments are promoted to allow reuse of most
system call implementations.  They must be reassembled before use.
Further, 64-bit arguments at an odd offset (counting from zero) are
padded and slid to the next slot on powerpc and mips.  Fix the
non-COMPAT11 system call by adding a freebsd32_mknodat() and
appropriately padded declarations.

The COMPAT11 system calls are fully compatible with the 64-bit
implementations so remove the freebsd32_ versions.

Use uint32_t consistently as the type of the old dev_t.  This matches
the old definition.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17928
2018-11-09 21:01:16 +00:00
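
A rough user-space sketch of the reassembly and padding described in 9a38df59e9 follows. The helper names are hypothetical, and treating the first slot as the low half is an assumption; the real order depends on the platform's endianness.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Rebuild a 64-bit value from two argument slots.  Each slot is a
 * 64-bit integer of which only the low 32 bits are meaningful.
 * Assumption: the first slot carries the low half.
 */
static uint64_t
sketch_pair_to_64(uint64_t lo_slot, uint64_t hi_slot)
{
	return ((uint64_t)(uint32_t)lo_slot |
	    ((uint64_t)(uint32_t)hi_slot << 32));
}

/* Slide an odd slot index (counting from zero) up to the next even one. */
static unsigned
sketch_align_slot(unsigned idx)
{
	return ((idx + 1) & ~1u);
}

/* Fetch a 64-bit argument starting at idx, optionally applying padding. */
static uint64_t
sketch_fetch_u64(const uint64_t slots[], unsigned nslots, unsigned idx,
    int pad_odd)
{
	if (pad_odd)
		idx = sketch_align_slot(idx);
	assert(idx + 1 < nslots);
	return (sketch_pair_to_64(slots[idx], slots[idx + 1]));
}
```

The pad_odd flag stands in for the platforms the commit names (powerpc and mips), where a 64-bit argument at an odd offset leaves one slot unused.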
Brooks Davis
b34f4419fb Make freebsd32_umtx_op follow the freebsd32_foo convention.
Sponsored by:	DARPA, AFRL
2018-11-09 00:46:10 +00:00
John Baldwin
4bf4b0f139 Enable non-executable stacks by default on RISC-V.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17878
2018-11-07 18:32:02 +00:00
Brooks Davis
5577e44bf4 Regen after r340221: allow pointer return types.
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17873
2018-11-07 16:56:07 +00:00
Brooks Davis
e56ec0e519 makesyscalls.sh: allow pointer return types.
The previous code required that the return type be a single word.  This
allows it to be a pointer without using a typedef.

Update the return types of break, mmap, and shmat to be void * as
declared.  This only affects systrace output in-tree, but can aid in
generating system call wrappers from syscalls.master.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17873
2018-11-07 16:55:04 +00:00
Mark Johnston
f8a222010f Avoid fixing the tty_info() buffer size in tty.h.
Different compilation units may otherwise get a different view of the
layout of struct tty depending on whether they include opt_printf.h.
This caused a blowup in the number of types defined in the kernel's
CTF file after r339468; thanks to dim@ for bisecting down to that
revision.

PR:		232675
Reported by:	dim
Reviewed by:	cem (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17877
2018-11-06 23:41:44 +00:00
Mark Johnston
07702f72e5 Avoid specifying VM_PROT_EXECUTE in mappings from pipe_map and exec_map.
These submaps are used for mapping pipe buffers and execv() argument
strings respectively, so there's no need for such mappings to have
execute permissions.

Reported by:	jhb
Reviewed by:	alc, jhb, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17827
2018-11-06 21:57:03 +00:00
Mark Johnston
6741ea083f We need opt_stack.h after r339605.
Reviewed by:	cem
Sponsored by:	The FreeBSD Foundation
2018-11-06 21:47:22 +00:00
Brooks Davis
dd4d2f216f Update some comments made obsolete by recent commits.
2018-11-06 20:45:15 +00:00
Brooks Davis
938e8dcf60 Regen after r340199: Use declared types for caddr_t arguments.
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17852
2018-11-06 18:47:29 +00:00
Brooks Davis
318f0d7720 Use declared types for caddr_t arguments.
Leave ptrace(2) alone for the moment as it's defined to take a caddr_t.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17852
2018-11-06 18:46:38 +00:00
Mariusz Zaborski
f4a035b8df Regenerate after r340129.
Pointed out by:	brooks
2018-11-06 18:03:04 +00:00
Mark Johnston
f71ef9b686 Use plain atomic_{add,subtract} when that's sufficient.
CID:		1386920
MFC after:	2 weeks
2018-11-06 17:32:25 +00:00
Andrew Turner
4ea56599e8 Port the NetBSD ubsan runtime to the FreeBSD kernel.
This allows us to build the ubsan code added in r340189 into the kernel
with the KUBSAN option. This will report when undefined behaviour is
detected in the currently running kernel.

As it can be large (the kernel is 65MB on arm64), the loader may not be able to
load the kernel on all architectures, so KUBSAN is disabled by default for now.

Sponsored by:	DARPA, AFRL
2018-11-06 17:32:07 +00:00
Andrew Turner
0645126fae Import the NetBSD micro ubsan code for the kernel.
This imports revision 1.3 of common/lib/libc/misc/ubsan.c from NetBSD, the
micro-ubsan code. It is an implementation of the Undefined Behavior
Sanitizer runtime for use with recent clang and gcc.

The uubsan code will be used in a later commit to implement kubsan to help
find undefined behavior in the kernel.

Sponsored by:	DARPA, AFRL
2018-11-06 16:56:49 +00:00
Brooks Davis
44cbc1c2b7 Fix a couple indentation errors in r339958.
2018-11-06 00:09:43 +00:00
John Baldwin
4cbbb74888 Add a KPI for the delay while spinning on a spin lock.
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI.  Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms.  However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() that doesn't use locks.

Reviewed by:	kib
MFC after:	3 days
2018-11-05 21:34:17 +00:00
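
The shape of the change in 4cbbb74888, a spin loop that calls a replaceable delay hook instead of a fixed DELAY(1), might look like this in user-space C11. Everything here is a stand-in: sketch_lock_delay only counts calls, and sketch_spin_try is not the kernel's spin lock code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Counts hook invocations so the behaviour can be observed. */
static unsigned long sketch_delay_calls;

/*
 * Hypothetical stand-in for the per-platform delay hook.  The point of
 * a separate hook is that its implementation must not take locks.
 */
static void
sketch_lock_delay(void)
{
	sketch_delay_calls++;
}

/* Try to take the lock, calling the hook after each failed attempt. */
static int
sketch_spin_try(atomic_flag *lock, unsigned max_spins)
{
	assert(lock != NULL);
	for (unsigned i = 0; i < max_spins; i++) {
		if (!atomic_flag_test_and_set_explicit(lock,
		    memory_order_acquire))
			return (1);
		sketch_lock_delay();
	}
	return (0);
}
```

Keeping the delay behind one function means a platform whose generic delay routine takes a spin lock can substitute a lock-free variant without touching the loop.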
Mariusz Zaborski
82560231d3 capsicum: allow ppoll(2) in capability mode
We already allow the use of poll(2). There is no reason to disallow ppoll(2).

PR:		232495
Submitted by:	Stefan Grundmann <sg2342@googlemail.com>
Reviewed by:	cem, oshogbo
MFC after:	2 weeks
2018-11-04 17:12:53 +00:00
Matt Macy
10f42d244b Convert epoch to read / write records per cpu
In discussing D17503 "Run epoch calls sooner and more reliably" with
sbahra@ we came to the conclusion that epoch is currently misusing the
ck_epoch API. It isn't safe to do a "write side" operation (ck_epoch_call
or ck_epoch_poll) in the middle of a "read side" section. Since, by definition,
it's possible to be preempted during the middle of an EPOCH_PREEMPT
epoch, the GC task might call ck_epoch_poll or another thread might call
ck_epoch_call on the same section. The right solution is ultimately to change
the way that ck_epoch works for this use case. However, as a stopgap for
12 we agreed to simply have separate records for each use case.

Tested by: pho@

MFC after:	3 days
2018-11-03 03:43:32 +00:00
Brooks Davis
4e8c73eb20 Regen after r340080: Add const to input-only char * arguments.
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17812
2018-11-02 20:56:19 +00:00
Brooks Davis
12e69f96a2 Add const to input-only char * arguments.
These arguments are mostly paths handled by NAMEI*() macros which already
take const char * arguments.

This change improves the match between syscalls.master and the public
declarations of system calls.

Reviewed by:	kib (prior version)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17812
2018-11-02 20:50:22 +00:00
Warner Losh
003ffd57fe Add sysctl_usec_to_sbintime and sysctl_msec_to_sbintime.
These functions are used to present a sbintime_t as either a number of
microseconds or a number of milliseconds respectively.

Sponsored by: Netflix
2018-11-02 17:50:57 +00:00
Mark Johnston
2203c46d87 Initialize the eflags field of vm_map headers.
Initializing the eflags field of the map->header entry to a value with a
unique new bit set makes a few comparisons to &map->header unnecessary.

Submitted by:	Doug Moore <dougm@rice.edu>
Reviewed by:	alc, kib
Tested by:	pho
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14005
2018-11-02 16:26:44 +00:00
Brooks Davis
1493c2ee62 Make vop_symlink take a const target path.
This will enable callers to take const paths as part of syscall
declaration improvements.

Where doing so is easy and non-disruptive carry the const through
implementations. In UFS the value is passed to an interface that must
take non-const values. In ZFS, const poisoning would touch code shared
with upstream and it's not worth adding diffs.

Bump __FreeBSD_version for external API consumers.

Reviewed by:	kib (prior version)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17805
2018-11-02 14:42:36 +00:00
Conrad Meyer
78c2a9806e kern_poll: Restore explanatory comment removed in r177374
The comment isn't stale.  The check is bogus in the sense that poll(2)
does not require pollfd entries to be unique in fd space, so there is no
reason there cannot be more pollfd entries than open or even allowed
fds.  The check is mostly a seatbelt against accidental misuse or
abuse.  FD_SETSIZE, while usually unrelated to poll, is used as an
arbitrary floor for systems with very low kern.maxfilesperproc.

Additionally, document this possible EINVAL condition in the poll.2
manual.

No functional change.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D17671
2018-11-01 23:46:23 +00:00
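
The seatbelt check that 78c2a9806e documents can be sketched as below. This is a user-space approximation built from the commit text: the function name and parameters are hypothetical, and the limits are passed in rather than read from the system.

```c
#include <errno.h>

/*
 * Reject a poll request only when nfds exceeds both the per-process
 * file limit and the FD_SETSIZE-style floor.  Returns 0 or EINVAL.
 */
static int
sketch_poll_check_nfds(unsigned int nfds, unsigned int maxfilesperproc,
    unsigned int fd_setsize)
{
	/* The floor keeps systems with a very low file limit usable. */
	unsigned int limit = maxfilesperproc > fd_setsize ?
	    maxfilesperproc : fd_setsize;

	return (nfds > limit ? EINVAL : 0);
}
```

Because pollfd entries need not be unique, a count above the number of open descriptors is legal; the check only catches counts that are implausibly large.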
Brooks Davis
f7e5ce325f Regen after r340034: Use mode_t when the documented signature does.
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17784
2018-11-01 23:10:53 +00:00
Brooks Davis
2105ac07d7 Use mode_t when the documented signature does.
This is more clear and produces better results when generating function
stubs from syscalls.master.

Reviewed by:	kib, emaste
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17784
2018-11-01 23:06:50 +00:00
John Baldwin
b317cfd4c0 Don't enter DDB for fatal traps before panic by default.
Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'.  Disable the new knob by default.  Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob.  However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.

Reviewed by:	kib, avg
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17768
2018-11-01 21:34:17 +00:00
Brooks Davis
e3e5481326 Reformat syscalls.master for better readability.
This takes advantage of two recent changes to makesyscalls.sh:
r328598: Permit a range of syscall numbers for UNIMPL
r339624: Remove the need for backslashes in syscalls.master

Syscall declarations are now split across multiple lines with the
syscall name and variables each on separate lines (with an exception for
syscalls taking no arguments).

Reviewed by:	imp
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17706
2018-10-31 16:17:45 +00:00
Bjoern A. Zeeb
9afc56849a Fix mips build after r339931.
I erroneously thought that it was two 64bit platforms which use link_elf_obj.c.

PR:		228854
Reported by:	ci.f.o.
MFC after:	3 days
X-MFC with:	r339931
Pointyhat to:	bz
2018-10-30 21:35:56 +00:00
Bjoern A. Zeeb
0f823b6497 As a follow-up to r339930 and various reports implement logging in case
we fail during module load because the pcpu or vnet module sections are
full.  We did return a proper error but did not leave any indication for
the user as to what the actual problem was.

Even worse, on 12/13 an unrelated error (ENOSYS instead of ENOSPC,
which gets skipped over in kern_linker.c) is currently printed,
which makes diagnosing the problem even harder.

PR:		228854
MFC after:	3 days
2018-10-30 20:51:03 +00:00