Commit Graph

262906 Commits

Author SHA1 Message Date
glebius
e581476e1b Add debugging facility EPOCH_TRACE that checks that epochs entered are
properly nested and warns about recursive entrances.  Unlike with locks,
there is nothing fundamentally wrong with such use, the intent of tracer
is to help to review complex epoch-protected code paths, and we mean the
network stack here.

Reviewed by:	hselasky
Sponsored by:	Netflix
Pull Request:	https://reviews.freebsd.org/D21610
2019-09-25 18:26:31 +00:00
kevans
1d9983b221 sysent: regenerate after r352705
This also implements it, fixes kdump, and removes no longer needed bits from
lib/libc/sys/shm_open.c for the interim.
2019-09-25 18:09:19 +00:00
kevans
df8ec3c155 Mark shm_open(2) as COMPAT12, succeeded by shm_open2
Implementation and regenerated files will follow.
2019-09-25 18:06:48 +00:00
kevans
55af970b24 Adjust Makefile.inc1 syscall sub commit 2019-09-25 18:04:09 +00:00
kevans
575e351fdd Add linux-compatible memfd_create
memfd_create is effectively a SHM_ANON shm_open(2) mapping with optional
CLOEXEC and file sealing support. This is used by some mesa parts, some
linux libs, and qemu can also take advantage of it and uses the sealing to
prevent resizing the region.

This reimplements shm_open in terms of shm_open2(2) at the same time.

shm_open(2) will be moved to COMPAT12 shortly.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D21393
2019-09-25 18:03:18 +00:00
glebius
3988008507 Enhance the 'ps' command so that it prints a line per proc and a line
per thread, so that instead of repeating the same info for all threads
in proc, it would print thread specific info. Also includes thread number
that would match 'info threads' info and can be used as argument for
thread swithcing with 'thread' command.
2019-09-25 18:03:15 +00:00
kevans
dd20cd52c2 sysent: regenerate after r352700 2019-09-25 17:59:58 +00:00
kevans
61785cc3d4 Add a shm_open2 syscall to support upcoming memfd_create
shm_open2 allows a little more flexibility than the original shm_open.
shm_open2 doesn't enforce CLOEXEC on its callers, and it has a separate
shmflag argument that can be expanded later. Currently the only shmflag is
to allow file sealing on the returned fd.

shm_open and memfd_create will both be implemented in libc to use this new
syscall.

__FreeBSD_version is bumped to indicate the presence.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D21393
2019-09-25 17:59:15 +00:00
dim
3ef38d8ecf In suite.test.mk, test if ${DESTDIR} exists before attempting to run
chflags -R on it, otherwise the command will error out.  (Note that
adding -f to the chflags invocation does not help, unlike with rm.)

MFC after:	3 days
2019-09-25 17:52:59 +00:00
dim
5ae14be055 In r340411, libufs.so's major number was bumped to 7, but an entry in
ObsoleteFiles.inc was not added.  Retroactively fix that.
2019-09-25 17:35:34 +00:00
kevans
48e2e866d5 [2/3] Add an initial seal argument to kern_shm_open()
Now that flags may be set on posixshm, add an argument to kern_shm_open()
for the initial seals. To maintain past behavior where callers of
shm_open(2) are guaranteed to not have any seals applied to the fd they're
given, apply F_SEAL_SEAL for existing callers of kern_shm_open. A special
flag could be opened later for shm_open(2) to indicate that sealing should
be allowed.

We currently restrict initial seals to F_SEAL_SEAL. We cannot error out if
F_SEAL_SEAL is re-applied, as this would easily break shm_open() twice to a
shmfd that already existed. A note's been added about the assumptions we've
made here as a hint towards anyone wanting to allow other seals to be
applied at creation.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D21392
2019-09-25 17:35:03 +00:00
kevans
7013010007 Update fcntl(2) after r352695 2019-09-25 17:33:12 +00:00
kevans
13d4dfe478 [1/3] Add mostly Linux-compatible file sealing support
File sealing applies protections against certain actions
(currently: write, growth, shrink) at the inode level. New fileops are added
to accommodate seals - EINVAL is returned by fcntl(2) if they are not
implemented.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D21391
2019-09-25 17:32:43 +00:00
kevans
199eacdc01 sysent: regenerate after r352693 2019-09-25 17:30:28 +00:00
kevans
12110b8085 Add COMPAT12 support to makesyscalls.sh
Reviewed by:	kib, imp, brooks (all without syscalls.master edits)
Differential Revision:	https://reviews.freebsd.org/D21366
2019-09-25 17:29:45 +00:00
kevans
fcd2c3120c bsdgrep(1): various fixes of empty pattern/exit code/-c behavior
When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:

- The -v flag semantics were not quite right; lines matched should have been
  counted differently based on whether the -v flag was set or not. procline
  now definitively returns whether it's matched or not, and interpreting
  that result has been kicked up a level.
- Empty patterns with the -x flag was broken similarly to empty patterns
  with the -w flag. The former is a whole-line match and should be more
  strict, only matching blank lines. No -x and no -w will will match the
  empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grap will
  exit(0) if any file that didn't match was output, so our interpretation
  was simply backwards. The new interpretation makes sense to me.

Tests updated and added to try and catch some of this.

This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.

MFC after:	1 week
2019-09-25 17:14:43 +00:00
markj
98e8ac89bd Add some counters for per-VM page events.
For now, just count batched page queue state operations.
vm.stats.page.queue_ops counts the number of batch entries that
successfully completed, while queue_nops counts entries that had no
effect, which occurs when the queue operation had been completed before
the batch entry was processed.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21782
2019-09-25 17:08:35 +00:00
emaste
79a5bb91d6 remove obsolete i386 MD memchr implementation
bde reports (in a reply to r351700 commit mail):
    This uses scasb, which was last optimal on the 8086, or perhaps the
    original i386.  On freefall, it is several times slower than the
    naive translation of the naive C code.

Reported by:	bde
Reviewed by:	kib, markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21785
2019-09-25 16:49:22 +00:00
markj
fbe7e9c7e4 Complete the removal of the "wire_count" field from struct vm_page.
Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21768
2019-09-25 16:11:35 +00:00
kib
c6f3c4a8cf x86: Fall back to leaf 0x16 if TSC frequency is obtained by CPUID and
leaf 0x15 is not functional.

This should improve automatic TSC frequency determination on
Skylake/Kabylake/... families, where 0x15 exists but does not provide
all necessary information.  SDM contains relatively strong wording
against such uses of 0x16, but Intel does not give us any other way to
obtain the frequency. Linux did the same in the commit
604dc9170f2435d27da5039a3efd757dceadc684.

Based on submission by:	Neel Chauhan <neel@neelc.org>
PR:	240475
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21777
2019-09-25 13:36:56 +00:00
tsoome
c4dd153f44 vt: use colors from terminal emulator
Instead of hardcoded colors, use terminal state. This also means,
we need to record the pointer to terminal state with vtbuf.
2019-09-25 13:24:31 +00:00
tsoome
ff5d84bffd kernel: terminal_init() should check for teken colors from kenv
Check for teken.fg_color and teken.bg_color and prepare the color
attributes accordingly.

When white background is used, make it light to improve visibility.
When black background is used, make kernel messages light.
2019-09-25 13:21:07 +00:00
kevans
8f60fde790 RELNOTES: Document r352668 (crontab -n and -q options)
Suggested by:	bapt
2019-09-25 13:04:34 +00:00
mav
967c071884 Fix wrong assertion in r352658.
MFC after:	1 month
2019-09-25 11:58:54 +00:00
imp
5969ef3c8c Size is unsigned, so remove the test entirely.
The kernel won't crash if you have a bad value and I'd rather not have
nvmecontrol know the internal details about how the nvme driver limits
the transfer size.
2019-09-25 07:51:30 +00:00
tsoome
6db0bdaf31 loader: fix indentation in efi_console and vidconsole
Remove extra tab.

Reported by:	yuripv
2019-09-25 07:36:35 +00:00
tsoome
eb5956c3d8 loader: add teken.fg_color and teken.bg_color variables
Add settable variables to control teken default color attributes.
The supported colors are 0-7 or basic color names:
black, red, green, brown, blue, magenta, cyan, white.

The current implementation does add some duplication which will be addressed
later.
2019-09-25 07:09:25 +00:00
kevans
bbbbd9c361 cron: add log suppression and mail suppression for successful runs
This commit adds two new extensions to crontab, ported from OpenBSD:
- -n: suppress mail on succesful run
- -q: suppress logging of command execution

The -q option appears decades old, but -n is relatively new. The
original proposal by Job Snijder can be found here [1], and gives very
convincing reasons for inclusion in base.

This patch is a nearly identical port of OpenBSD cron for -q and -n
features. It is written to follow existing conventions and style of the
existing codebase.

Example usage:

# should only send email, but won't show up in log
* * * * * -q date

# should not send email
* * * * * -n date

# should not send email or log
* * * * * -n -q date

# should send email because of ping failure
* * * * * -n -q ping -c 1 5.5.5.5

[1]: https://marc.info/?l=openbsd-tech&m=152874866117948&w=2

PR:		237538
Submitted by:	Naveen Nathan <freebsd_t.lastninja.net>
Reviewed by:	bcr (manpages)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20046
2019-09-25 02:37:40 +00:00
jhibbits
53af0efd39 powerpc/atomic: Follow recommendations on atomic primitive comparisons
Both IBM and Freescale programming examples presume the cmpset operands will
favor equal, and pessimize the non-equal case instead.  Do the same for
atomic_cmpset_* and atomic_fcmpset_*.  This slightly pessimizes the failure
case, in favor of the success case.

MFC after:	3 weeks
2019-09-25 01:39:58 +00:00
jhibbits
c9cf854b0a powerpc: Allocate DPCPU block from domain-local memory
This should improve NUMA scalability a little, by binding to the CPU's NUMA
domain.  This matches what's done on amd64.
2019-09-25 01:23:08 +00:00
imp
dbeb66a30c After my comnd changes, the number of threads and size weren't set. In
addition, the flags are optional, but were made to be mandatory. Set
these things, as well as santiy check the specified size.

Submitted by: Stefan Rink
PR: 240798
2019-09-25 00:24:57 +00:00
rmacklem
6eaada9c0e Replace all mtx_lock()/mtx_unlock() on the iod lock with macros.
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called and the iod lock is held when the NFS node lock
is acquired, the iod mutex will need to be changed to an sx lock as well.
To simply the future commit that changes both the NFS node lock and iod lock
to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the
iod lock with macros.
There is no semantic change as a result of this commit.

I don't know when the future commit will happen and be MFC'd, so I have
set the MFC on this commit to one week so that it can be MFC'd at the same
time.

Suggested by:	kib
MFC after:	1 week
2019-09-24 23:38:10 +00:00
jkim
0d3ff78663 Fix white spaces. 2019-09-24 21:41:19 +00:00
grembo
ae9baa9499 freebsd-update: Add updatesready' and showconfig' commands
`freebsd-update updatesready' can be used to check if there are any pending
fetched updates that can be installed.

`freebsd-update showconfig' writes freebsd-update's configuration to
stdout.

This also changes the exit code of `freebsd-update install' to 2 in case
there are no updates pending to be installed and there wasn't a fetch phase
in the same invocation. This allows scripts to tell apart these error
conditions without breaking existing jail managers.

See freebsd-update(8) for details.

PR:		240757, 240177, 229346
Reviewed by:	manpages (bcr), sectam (emaste), yuripv
Differential Revision:	https://reviews.freebsd.org/D21473
2019-09-24 20:49:33 +00:00
rrs
804541b4e4 lets put (void) in a couple of functions to keep older platforms that
are stuck with gcc happy (ppc). The changes are needed in both bbr and
rack.

Obtained from:	Michael Tuexen (mtuexen@)
2019-09-24 20:36:43 +00:00
rrs
5c5e5b0ec5 don't call in_ratelmit detach when RATELIMIT is not
compiled in the kernel.
2019-09-24 20:11:55 +00:00
rrs
b84f1569d1 Fix the ifdefs in tcp_ratelimit.h. They were reversed so
that instead of functions only being inside the _KERNEL and
the absence of RATELIMIT causing us to have NULL/error returning
interfaces we ended up with non-kernel getting the error path.
opps..
2019-09-24 20:04:31 +00:00
mav
1520ff79f0 Fix/improve interrupt threads scheduling.
Doing some tests with very high interrupt rates I've noticed that one of
conditions I added in r232207 to make interrupt threads in most cases
run on local CPU never worked as expected (worked only if previous time
it was executed on some other CPU, that is quite opposite).  It caused
additional CPU usage to run full CPU search and could schedule interrupt
threads to some other CPU.

This patch removes that code and instead reuses existing non-interrupt
code path with some tweaks for interrupt case:
 - On SMT systems, if current thread is idle, don't look on other threads.
Even if they are busy, it may take more time to do fill search and bounce
the interrupt thread to other core then execute it locally, even sharing
CPU resources.  It is other threads should migrate, not bound interrupts.
 - Try hard to keep interrupt threads within LLC of their original CPU.
This improves scheduling cost and supposedly cache and memory locality.

On a test system with 72 threads doing 2.2M IOPS to NVMe this saves few
percents of CPU time while adding few percents to IOPS.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2019-09-24 20:01:20 +00:00
rrs
7648feb4d9 This commit adds BBR (Bottleneck Bandwidth and RTT) congestion control. This
is a completely separate TCP stack (tcp_bbr.ko) that will be built only if
you add the make options WITH_EXTRA_TCP_STACKS=1 and also include the option
TCPHPTS. You can also include the RATELIMIT option if you have a NIC interface that
supports hardware pacing, BBR understands how to use such a feature.

Note that this commit also adds in a general purpose time-filter which
allows you to have a min-filter or max-filter. A filter allows you to
have a low (or high) value for some period of time and degrade slowly
to another value has time passes. You can find out the details of
BBR by looking at the original paper at:

https://queue.acm.org/detail.cfm?id=3022184

or consult many other web resources you can find on the web
referenced by "BBR congestion control". It should be noted that
BBRv1 (which this is) does tend to unfairness in cases of small
buffered paths, and it will usually get less bandwidth in the case
of large BDP paths(when competing with new-reno or cubic flows). BBR
is still an active research area and we do plan on  implementing V2
of BBR to see if it is an improvement over V1.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D21582
2019-09-24 18:18:11 +00:00
erj
a9cf046cef ix, ixv: Read msix_bar from device configuration
Instead of predicting the MSI-X bar index based on the device's MAC
type, read it from the device's PCI configuration instead.

PR:		239704
Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	erj@
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21547
2019-09-24 17:06:32 +00:00
erj
ff0d702014 iflib: Remove redundant VLAN events deregistration
From Piotr:
r351152 introduced iflib_deregister() function calling
EVENTHANDLER_DEREGISTER() to unregister VLAN events. This patch removes
duplicate of EVENTHANDLER_DEREGISTER() calls placed in
iflib_device_deregister() as this function is now calling
iflib_deregister(). This is to avoid deregistering same event twice.

This patch also adds check in iflib_vlan_register() to prevent
registering VLAN while being in detach.

Patch co-authored by Krzysztof Galazka <krzysztof.galazka@intel.com>,
erj <erj@FreeBSD.org> and Jacob Keller <jacob.e.keller@intel.com>.

Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>

Submitted by:	Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by:	gallatin@, erj@
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D21711
2019-09-24 17:03:31 +00:00
olivier
43d5cab889 Fix a minor typo
Approved by:	lwhsu
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19970
2019-09-24 16:49:42 +00:00
olivier
b3eac51879 Fix coredump_phnum_test in case of kern.compress_user_cores=1
PR:		240783
Approved by:	ngie, lwhsu
MFC after:	1 month
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21776
2019-09-24 16:45:34 +00:00
tuexen
eceade6db5 Plumb a memory leak.
Thnanks to Felix Weinrank for finding this issue using fuzz testing
and reporting it for the userland stack:
https://github.com/sctplab/usrsctp/issues/378

MFC after:		3 days
2019-09-24 13:15:24 +00:00
yuripv
6f79d636c5 lib/libc/regex: fix build with REDEBUG defined
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D21760
2019-09-24 12:21:01 +00:00
rmacklem
6b0307a0a5 Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simply making the change to an sx lock in future commit.
There is no semantic change as a result of this commit.

I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.

Suggested by:	kib
MFC after:	1 week
2019-09-24 01:58:54 +00:00
lwhsu
6d012fe8e3 Clean LINT* kernel configurations for arm*
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2019-09-24 01:56:27 +00:00
markj
6f4066cdc2 ping6: Use caph_rights_limit(3) for STDIN_FILENO
Update some error messages while here.

Reported by:	olivier
MFC after:	3 days
2019-09-23 22:20:11 +00:00
mjg
65396efbad cache: tidy up handling of negative entries
- track the total count of hot entries
- pre-read the lock when shrinking since it is typically already taken
- place the lock in its own cacheline
- shorten the hold time of hot lock list when zapping

Sponsored by:	The FreeBSD Foundation
2019-09-23 20:50:04 +00:00
mav
37e8a0e005 Make nvme(4) driver some more NUMA aware.
- For each queue pair precalculate CPU and domain it is bound to.
If queue pairs are not per-CPU, then use the domain of the device.
 - Allocate most of queue pair memory from the domain it is bound to.
 - Bind callouts to the same CPUs as queue pair to avoid migrations.
 - Do not assign queue pairs to each SMT thread.  It just wasted
resources and increased lock congestions.
 - Remove fixed multiplier of CPUs per queue pair, spread them even.
This allows to use more queue pairs in some hardware configurations.
 - If queue pair serves multiple CPUs, bind different NVMe devices to
different CPUs.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2019-09-23 17:53:47 +00:00