the RFC and only enable ECN when both the
CWR and ECT bits our set within the SYN packet.
Sponsored by: Netflix Inc.
Differential Revision: https://reviews.freebsd.org/D23645
in FreeBSD the bits that disabled stats
when netflix-stats is not defined is no longer
needed. Lets remove these bits so that we
will properly use stats per its definition
in BBR and Rack.
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D23088
In particular on amd64 this eliminates an atomic op in the common case,
trading it for IPIs in the uncommon case of catching CPUs executing the
code while the filesystem is getting suspended or unmounted.
This is a wrapper around smp_rendezvous_cpus which enables use of IPI
handlers which can fail and require retrying.
wait_func argument is added to to provide a routine which can be used to
poll CPU of interest for when the IPI can be retried.
Handlers which succeed must call smp_rendezvous_cpus_done to denote that
fact.
Discussed with: jeff
Differential Revision: https://reviews.freebsd.org/D23582
When the receive ring cannot be filled with mbufs, due to lack of memory,
no more interrupts may be generated to fill the receive ring later on.
Make sure to have a watchdog, to try refilling the receive ring from time
to time, hopefully when more mbufs are available.
Differential Revision: https://reviews.freebsd.org/D23315
MFC after: 1 week
Reviewed by: gallatin@
Sponsored by: Mellanox Technologies
It was saved from the initial purge of drivers in fcp-101 due to being
the onboard Ethernet device on a number of sparc64 machines. Now that
sparc64 is gone, it serves little purpose (PCI cards exist, but are rare
and are unlikely to have been deployed outside Sun systems).
MFC after: 3 days
From the beginning, ng_nat safely assumed cleansed traffic
because of limited ways it could be attached to NETGRAPH:
ng_ipfw or ng_ppp only.
Now as it may be attached with ng_ether too, the assumption proven wrong.
Add needed check to the ng_nat. Thanks for markj for debugging this.
PR: 243096
Submitted by: Lutz Donnerhacke <lutz@donnerhacke.de>
Reported by: Robert James Hernandez <rob@sarcasticadmin.com>
Reviewed by: markj and others
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D23091
sys/taskqueue.h may not have includes that define MPASS. It
was useful during testing of r357771, but can be omitted now.
An invalid value of priority will yield only in potential
priority inversion, not a crash.
This fixes compilation of ports/x11/nvidia-driver.
Maintain a count of free slabs in the per-domain keg structure and use
that to clear the free slab list in constant time for most cases. This
helps minimize lock contention induced by reclamation, in preparation
for proactive trimming of excesses of free memory.
Reviewed by: jeff, rlibby
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D23532
When processing a taskqueue and a task has associated epoch, then
enter for duration of the task. If consecutive tasks belong to the
same epoch, batch them. Now we are talking about the network epoch
only.
Shrink the ta_priority size to 8-bits. No current consumers use
a priority that won't fit into 8 bits. Also complexity of
taskqueue_enqueue() is a square of maximum value of priority, so
we unlikely ever want to go over UCHAR_MAX here.
Reviewed by: hselasky
Differential Revision: https://reviews.freebsd.org/D23518
vdrop can set the hold count to 0 and wait for the ->mnt_listmtx held by
mnt_vnode_next_lazy_relock caller. The routine incorrectly asserted the
count has to be > 0.
Reported by: pho
Tested by: pho
The reason for this change is to make it clear the scope of the in-kernel usage
of IFM_TYPE_DESCRIPTIONS and IFM_SUBTYPE_ETHERNET_DESCRIPTIONS macros. Also it
is somewhat better C.
Reviewed by: hselasky
Sponsored by: Mellanox Technologies
Differential revision: https://reviews.freebsd.org/D23620
Platform (N1SDP).
Neoverse N1 is a high-performance ARM microarchitecture designed
by the ARM Holdings for the server market.
The PCI part on N1SDP was shipped untested and suffers from some
integration issues.
For instance accessing to not existing BDFs causes System Error
(SError) exception. To mitigate this, the firmware scans the bus,
catches SErrors and creates a table with valid BDFs. That allows
us to filter-out accesses to invalid BDFs in this driver.
Also the root complex config space (BDF == 0) has an unusual
location in memory map, so remapping accesses to it is required.
Finally, the config space is restricted to 32-bit accesses only.
This was tested on the ARM boxes kindly provided by the ARM Ltd
to the DARPA CHERI Project.
In collaboration with: andrew
Reviewed by: andrew
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D23349
The race is:
CPU1 CPU2
devfs_reclaim_vchr
make v_usecount 0
VI_LOCK
sees v_usecount == 0, no updates
vp->v_rdev = NULL;
...
VI_UNLOCK
VI_LOCK
v_decr_devcount
sees v_rdev == NULL, no updates
In this scenario si_devcount decrement is not performed.
Note this can only happen if the vnode lock is not held.
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D23529
handler.
Interrupt handlers are removed via intr_event_execute_handlers() when
IH_DEAD is set. The thread removing the interrupt is woken up, and
calls intr_event_update(). When this happens, the ie_hflags are
cleared and re-built from all the remaining handlers sharing the
event. When the last IH_NET handler is removed, the IH_NET flag will
be cleared from ih_hflags (or ie_hflags may still be being rebuilt in
a different context), and the ithread_execute_handlers() may return
with ie_hflags missing IH_NET. This can lead to a scenario where
IH_NET was present before calling ithread_execute_handlers, and is not
present at its return, meaning the need for epoch must be cached
locally.
This can happen when loading and unloading network drivers. Also make
sure the ie_hflags is not cleared before being updated.
This is a regression issue after r357004.
Backtrace:
panic()
# trying to access epoch tracker on stack of dead thread
_epoch_enter_preempt()
ifunit_ref()
ifioctl()
fo_ioctl()
kern_ioctl()
sys_ioctl()
syscallenter()
amd64_syscall()
Differential Revision: https://reviews.freebsd.org/D23483
Reviewed by: glebius@, gallatin@, mav@, jeff@ and kib@
Sponsored by: Mellanox Technologies
Currently, the vm.panic_on_oom sysctl is a boolean which controls the
behavior of the VM system when it encounters an out-of-memory situation.
If set to 0, the VM system kills the largest process. If set to any other
value, the VM system will initiate a panic.
This change makes the sysctl a count of events. If set to 0, the VM system
kills the largest process. If set to any other value, the VM system will
kill the largest process until it has seen the specified number of
out-of-memory events. Once it reaches the specified number of events, it
will initiate a panic.
This change is helpful in capturing cores when the system is in a perpetual
cycle of out-of-memory events (as opposed to just hitting one or two
sporadic out-of-memory events).
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D23601
vrele is supposed to be called with an unlocked vnode, but this was never
asserted for if v_usecount was > 0. For such counts the lock is never touched
by the routine. As a result the kernel has several consumers which expect
vunref semantics and get away with calling vrele since they happen to never do
it when this is the last reference (and for some of them this may happen to be
a guarantee).
Work around the problem by changing vrele semantics to tolerate being called
with a lock. This eliminates a possible bug where the lock is already held and
vputx takes it anyway.
Reviewed by: kib
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D23528
- only compute the target address once
- remove spurious type casting, zpcpu_get already return the correct type
While here add missing newlines to other routines.