Commit Graph

131113 Commits

Author SHA1 Message Date
Randall Stewart
481be5de9d White space cleanup -- remove trailing tab's or spaces
from any line.

Sponsored by:	Netflix Inc.
2020-02-12 13:31:36 +00:00
Randall Stewart
df341f5986 Whitespace, remove from three files trailing white
space (leftover presents from emacs).

Sponsored by:	Netflix Inc.
2020-02-12 13:07:09 +00:00
Randall Stewart
596ae436ef This small fix makes it so we properly follow
the RFC and only enable ECN when both the
CWR and ECT bits our set within the SYN packet.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D23645
2020-02-12 13:04:19 +00:00
Randall Stewart
3fba40d9f2 Remove all trailing white space from the BBR/Rack fold. Bits
left around by emacs (thanks emacs).
2020-02-12 12:40:06 +00:00
Randall Stewart
d2517ab04b Now that all of the stats framework is
in FreeBSD the bits that disabled stats
when netflix-stats is not defined is no longer
needed. Lets remove these bits so that we
will properly use stats per its definition
in BBR and Rack.

Sponsored by:	Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D23088
2020-02-12 12:36:55 +00:00
Mateusz Guzik
4602214772 vfs: refactor vputx and add more comment
Reviewed by:	jeff (previous version)
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D23530
2020-02-12 11:19:07 +00:00
Mateusz Guzik
ed67a63c39 vfs: drop remaining zpcpu casts 2020-02-12 11:18:12 +00:00
Mateusz Guzik
123c519731 vfs: switch to smp_rendezvous_cpus_retry for vfs_op_thread_enter/exit
In particular on amd64 this eliminates an atomic op in the common case,
trading it for IPIs in the uncommon case of catching CPUs executing the
code while the filesystem is getting suspended or unmounted.
2020-02-12 11:17:45 +00:00
Mateusz Guzik
00ac9d2632 rms: use smp_rendezvous_cpus_retry instead of a hand-rolled variant 2020-02-12 11:17:18 +00:00
Mateusz Guzik
e4f584971b Add smp_rendezvous_cpus_retry
This is a wrapper around smp_rendezvous_cpus which enables use of IPI
handlers which can fail and require retrying.

wait_func argument is added to to provide a routine which can be used to
poll CPU of interest for when the IPI can be retried.

Handlers which succeed must call smp_rendezvous_cpus_done to denote that
fact.

Discussed with:	 jeff
Differential Revision:	https://reviews.freebsd.org/D23582
2020-02-12 11:16:55 +00:00
Mateusz Guzik
2318ed2508 amd64: provide custom zpcpu set/add/sub routines
Note that clobbers are highly overzealous, can be cleaned up later.
2020-02-12 11:15:33 +00:00
Mateusz Guzik
bee115bc59 Dedup zpcpu assertions into one macro and guard the rest with #ifndef
Sponsored by:	The FreeBSD Foundation
2020-02-12 11:14:23 +00:00
Mateusz Guzik
fb886947d9 amd64: store per-cpu allocations subtracted by __pcpu
This eliminates a runtime subtraction from counter_u64_add.

before:
mov    0x4f00ed(%rip),%rax        # 0xffffffff80c01788 <numfullpathfail4>
sub    0x808ff6(%rip),%rax        # 0xffffffff80f1a698 <__pcpu>
addq   $0x1,%gs:(%rax)

after:
mov    0x4f02fd(%rip),%rax        # 0xffffffff80c01788 <numfullpathfail4>
addq   $0x1,%gs:(%rax)

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D23570
2020-02-12 11:12:13 +00:00
Mateusz Guzik
3acb6572fc Store offset into zpcpu allocations in the per-cpu area.
This shorten zpcpu_get and allows more optimizations.

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D23570
2020-02-12 11:11:22 +00:00
Mateusz Guzik
48baf00f54 epoch: convert zpcpu_get_cpua(.., curcpu) to zpcpu_get 2020-02-12 11:10:10 +00:00
Hans Petter Selasky
01651e9615 Add support for debugnet in mlx5en(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-12 10:03:25 +00:00
Hans Petter Selasky
f14d849862 Add support for disabling and polling MSIX interrupts in mlx5core.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-12 09:58:19 +00:00
Hans Petter Selasky
f98977b521 Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

This patch extends r357772.

Tested by:	yp@mm.st
Sponsored by:	Mellanox Technologies
2020-02-12 09:19:47 +00:00
Hans Petter Selasky
fb1a29b45e Make sure the so-called end of receive interrupts don't starve in iflib.
When the receive ring cannot be filled with mbufs, due to lack of memory,
no more interrupts may be generated to fill the receive ring later on.
Make sure to have a watchdog, to try refilling the receive ring from time
to time, hopefully when more mbufs are available.

Differential Revision:	https://reviews.freebsd.org/D23315
MFC after:	1 week
Reviewed by:	gallatin@
Sponsored by:	Mellanox Technologies
2020-02-12 08:30:07 +00:00
Navdeep Parhar
77ad00bf36 cxgbe(4): Update T4/5/6 firmwares to 1.24.12.0.
Obtained from:	Chelsio Communications
MFC after:	1 month
Sponsored by:	Chelsio Communications
2020-02-12 02:55:06 +00:00
Brooks Davis
8e2e3137a3 Mark hme(4) as deprecated.
It was saved from the initial purge of drivers in fcp-101 due to being
the onboard Ethernet device on a number of sparc64 machines.  Now that
sparc64 is gone, it serves little purpose (PCI cards exist, but are rare
and are unlikely to have been deployed outside Sun systems).

MFC after:	3 days
2020-02-12 00:58:17 +00:00
Eugene Grosbein
49f384cb47 ng_nat: avoid panic if attached directly to ng_ether and got short packet
From the beginning, ng_nat safely assumed cleansed traffic
because of limited ways it could be attached to NETGRAPH:
ng_ipfw or ng_ppp only.

Now as it may be attached with ng_ether too, the assumption proven wrong.
Add needed check to the ng_nat. Thanks for markj for debugging this.

PR:		243096
Submitted by:	Lutz Donnerhacke <lutz@donnerhacke.de>
Reported by:	Robert James Hernandez <rob@sarcasticadmin.com>
Reviewed by:	markj and others
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23091
2020-02-12 00:31:00 +00:00
Gleb Smirnoff
47602aa4e9 Remove assertion from TASK_INIT() macro, since some users of
sys/taskqueue.h may not have includes that define MPASS. It
was useful during testing of r357771, but can be omitted now.
An invalid value of priority will yield only in potential
priority inversion, not a crash.

This fixes compilation of ports/x11/nvidia-driver.
2020-02-11 20:59:41 +00:00
Mark Johnston
4ab3aee8fb Reduce lock hold time in keg_drain().
Maintain a count of free slabs in the per-domain keg structure and use
that to clear the free slab list in constant time for most cases.  This
helps minimize lock contention induced by reclamation, in preparation
for proactive trimming of excesses of free memory.

Reviewed by:	jeff, rlibby
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23532
2020-02-11 20:06:33 +00:00
Michael Tuexen
8803350d6d Revert https://svnweb.freebsd.org/changeset/base/357761
This was suggested by cem@
2020-02-11 20:02:20 +00:00
Gleb Smirnoff
301a87ac4c Mark lio taskqueue as requiring network epoch. 2020-02-11 19:13:34 +00:00
Gleb Smirnoff
6c3e93cb5a Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process
incoming packets in taskqueue context.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D23518
2020-02-11 18:57:07 +00:00
Gleb Smirnoff
4426b2e64b Add flag to struct task to mark the task as requiring network epoch.
When processing a taskqueue and a task has associated epoch, then
enter for duration of the task.  If consecutive tasks belong to the
same epoch, batch them.  Now we are talking about the network epoch
only.

Shrink the ta_priority size to 8-bits.  No current consumers use
a priority that won't fit into 8 bits.  Also complexity of
taskqueue_enqueue() is a square of maximum value of priority, so
we unlikely ever want to go over UCHAR_MAX here.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D23518
2020-02-11 18:48:07 +00:00
Mateusz Guzik
57349a4f41 vfs: fix vhold race in mnt_vnode_next_lazy_relock
vdrop can set the hold count to 0 and wait for the ->mnt_listmtx held by
mnt_vnode_next_lazy_relock caller. The routine incorrectly asserted the
count has to be > 0.

Reported by:	pho
Tested by:	pho
2020-02-11 18:19:56 +00:00
Hans Petter Selasky
b4426a7175 Add missing EPOCH(9) wrapper in ipfw(8).
Backtrace:
panic()
ip_output()
dyn_tick()
softclock_call_cc()
softclock()
ithread_loop()

Differential Revision:	https://reviews.freebsd.org/D23599
Reviewed by:	glebius@ and ae@
Found by:	mmacy@
Reported by:	jmd@
Sponsored by:	Mellanox Technologies
2020-02-11 18:16:29 +00:00
Michael Tuexen
9803f01cdb Don't start an SCTP timer using a net, which has been removed.
Submitted by:		Taylor Brandstetter
MFC after:		1 week
2020-02-11 18:15:57 +00:00
Mateusz Guzik
e7ce9c32a7 amd64: remove redundant sa->code assignment from cpu_fetch_syscall_args_fallback
It is already set in the only caller.
2020-02-11 18:15:23 +00:00
Mateusz Guzik
1b853b62f3 capsicum: restore the cap_rights_contains symbol
It is expected to be provided by libc.

PR:		244033
Reported by:	 Jan Kokemueller
2020-02-11 18:13:53 +00:00
Konstantin Belousov
5d1277ca9a if_media.h: Add 50G KR4 ethernet media type.
Submitted by:	Adam Peace <adam.e.peace@gmail.com>
Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D23620
2020-02-11 18:03:45 +00:00
Konstantin Belousov
48ad3b215c if_media.c: staticize and constify ifmedia description structures used under IFMEDIA_DEBUG.
The reason for this change is to make it clear the scope of the in-kernel usage
of IFM_TYPE_DESCRIPTIONS and IFM_SUBTYPE_ETHERNET_DESCRIPTIONS macros.  Also it
is somewhat better C.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D23620
2020-02-11 17:45:01 +00:00
Konstantin Belousov
a249895df8 if_media.c: use __FBSDID().
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D23620
2020-02-11 17:41:45 +00:00
Ruslan Bukin
667c3fc0f6 Add PCI Express driver for the ARM Neoverse N1 System Development
Platform (N1SDP).

Neoverse N1 is a high-performance ARM microarchitecture designed
by the ARM Holdings for the server market.

The PCI part on N1SDP was shipped untested and suffers from some
integration issues.

For instance accessing to not existing BDFs causes System Error
(SError) exception. To mitigate this, the firmware scans the bus,
catches SErrors and creates a table with valid BDFs. That allows
us to filter-out accesses to invalid BDFs in this driver.

Also the root complex config space (BDF == 0) has an unusual
location in memory map, so remapping accesses to it is required.

Finally, the config space is restricted to 32-bit accesses only.

This was tested on the ARM boxes kindly provided by the ARM Ltd
to the DARPA CHERI Project.

In collaboration with:	andrew
Reviewed by:	andrew
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D23349
2020-02-11 15:12:09 +00:00
Michael Tuexen
95d27478d2 Use an int instead of a bool variable, since bool is not supported
on all platforms the stack is running on in userland.
2020-02-11 14:00:27 +00:00
Mateusz Guzik
2e57c8fde7 vfs: fix device count leak on vrele racing with vgone
The race is:

CPU1                                CPU2
                                    devfs_reclaim_vchr
make v_usecount 0
                                      VI_LOCK
                                      sees v_usecount == 0, no updates
                                      vp->v_rdev = NULL;
                                      ...
                                      VI_UNLOCK
VI_LOCK
v_decr_devcount
  sees v_rdev == NULL, no updates

In this scenario si_devcount decrement is not performed.

Note this can only happen if the vnode lock is not held.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23529
2020-02-10 22:28:54 +00:00
Li-Wen Hsu
37d4ece7c5 Restore the behavior of allowing empty string in a string sysctl
Added as a special case to avoid unnecessary memory operations.

Reviewed by:	delphij
Sponsored by:	The FreeBSD Foundation
2020-02-10 20:53:59 +00:00
Hans Petter Selasky
f912e8f2ff Fix for unbalanced EPOCH(9) usage in the generic kernel interrupt
handler.

Interrupt handlers are removed via intr_event_execute_handlers() when
IH_DEAD is set. The thread removing the interrupt is woken up, and
calls intr_event_update(). When this happens, the ie_hflags are
cleared and re-built from all the remaining handlers sharing the
event. When the last IH_NET handler is removed, the IH_NET flag will
be cleared from ih_hflags (or ie_hflags may still be being rebuilt in
a different context), and the ithread_execute_handlers() may return
with ie_hflags missing IH_NET. This can lead to a scenario where
IH_NET was present before calling ithread_execute_handlers, and is not
present at its return, meaning the need for epoch must be cached
locally.

This can happen when loading and unloading network drivers. Also make
sure the ie_hflags is not cleared before being updated.

This is a regression issue after r357004.

Backtrace:
panic()
# trying to access epoch tracker on stack of dead thread
_epoch_enter_preempt()
ifunit_ref()
ifioctl()
fo_ioctl()
kern_ioctl()
sys_ioctl()
syscallenter()
amd64_syscall()

Differential Revision:	https://reviews.freebsd.org/D23483
Reviewed by:	glebius@, gallatin@, mav@, jeff@ and kib@
Sponsored by:	Mellanox Technologies
2020-02-10 20:23:08 +00:00
Jonathan T. Looney
3c200db9d2 Modify the vm.panic_on_oom sysctl to take a count of events.
Currently, the vm.panic_on_oom sysctl is a boolean which controls the
behavior of the VM system when it encounters an out-of-memory situation.
If set to 0, the VM system kills the largest process. If set to any other
value, the VM system will initiate a panic.

This change makes the sysctl a count of events. If set to 0, the VM system
kills the largest process. If set to any other value, the VM system will
kill the largest process until it has seen the specified number of
out-of-memory events. Once it reaches the specified number of events, it
will initiate a panic.

This change is helpful in capturing cores when the system is in a perpetual
cycle of out-of-memory events (as opposed to just hitting one or two
sporadic out-of-memory events).

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D23601
2020-02-10 18:06:38 +00:00
Scott Long
85eb41f751 Revert r357710 and 357711 until they can be debugged 2020-02-10 14:27:28 +00:00
Mateusz Guzik
cd951a0d8e vfs: fix lock recursion in vrele
vrele is supposed to be called with an unlocked vnode, but this was never
asserted for if v_usecount was > 0. For such counts the lock is never touched
by the routine. As a result the kernel has several consumers which expect
vunref semantics and get away with calling vrele since they happen to never do
it when this is the last reference (and for some of them this may happen to be
a guarantee).

Work around the problem by changing vrele semantics to tolerate being called
with a lock. This eliminates a possible bug where the lock is already held and
vputx takes it anyway.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D23528
2020-02-10 13:54:34 +00:00
Mateusz Guzik
d1e5538758 Tidy up zpcpu_replace*
- only compute the target address once
- remove spurious type casting, zpcpu_get already return the correct type

While here add missing newlines to other routines.
2020-02-10 13:52:25 +00:00
Edward Tomasz Napierala
0b40dcbe32 Make linux(4) use kern_socketpair(9) instead of going through
sys_socketpair().  It's a cleanup; no functional changes.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22814
2020-02-10 13:24:14 +00:00
Hans Petter Selasky
d82c0ebc69 Add USB host controller PCI ID's for Hygon.
Differential Revision:	https://reviews.freebsd.org/D23564
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-10 11:09:56 +00:00
Scott Long
9ce150463c Missed a file in r357710, add it here. 2020-02-10 00:26:41 +00:00
Scott Long
7d99bda79e Add rudamentary support for UFS to probe whether a block device supports the
BIO_SPEEDUP command.  Add complimentary support to the CAM periphs that
support it.
2020-02-10 00:23:20 +00:00
Ian Lepore
39c614c6b7 Implement atomic_testandclear_{32,int,long} for 32-bit arm. Also, replace
the existing implementation of atomic_testandset with the same new algorithm,
which uses fewer instructions and fewer registers.
2020-02-10 00:05:04 +00:00