We need to set the current vnet before iterating over the global
interface list. Because the dump device may only be set from the host,
only proceed with configuration if the thread belongs to the default
vnet. [1]
Also fix a resource leak that occurs if the priv_check() in set_dumper()
fails.
Reported by: mmacy, sbruno [1]
Reviewed by: sbruno
X-MFC with: r333283
Differential Revision: https://reviews.freebsd.org/D15449
Each TCP connection that uses the system default cc_newreno(4) congestion
control algorithm module leaks a "struct newreno" (8 bytes of memory) at
connection initialisation time. The NULL-pointer dereference is only germane
when using the ABE feature, which is disabled by default.
While at it:
- Defer the allocation of memory until it is actually needed given that ABE is
optional and disabled by default.
- Document the ENOMEM errno in getsockopt(2)/setsockopt(2).
- Document ENOMEM and ENOBUFS in tcp(4) as being synonymous given that they are
used interchangeably throughout the code.
- Fix a few other nits also accidentally omitted from the original patch.
Reported by: Harsh Jain on freebsd-net@
Tested by: tjh@
Differential Revision: https://reviews.freebsd.org/D15358
provisioned for NIC_ETHOFLD and the kernel has option RATELIMIT.
It is possible to use the chip's offload queues for normal NIC Tx and
not just TOE Tx. The difference is that these queues support out of
order processing of work requests and have a per-"flowid" mechanism for
tracking credits between the driver and hardware. This allows Tx for
any number of flows bound to different rate limits to be submitted to a
single Tx queue and the work requests for slow flows won't cause HOL
blocking for the rest.
Sponsored by: Chelsio Communications
by default.
This is the first of a series of commits that will add support for
RATELIMIT kernel option to the base if_cxgbe driver, for use with
ordinary NIC traffic "flows". RATELIMIT is already supported by t4_tom
for the fully-offloaded TCP connections that it handles.
Sponsored by: Chelsio Communications
Add epoch section to struct thread. We can use this to
ennable epoch counter to advance even if a section is
perpetually occupied by a thread.
Approved by: sbruno
(https://svnweb.freebsd.org/base?view=revision&revision=304232)
converting clrbuf() (which clears the entire buffer) to vfs_bio_clrbuf()
(which clears only the new pages that have been added to the buffer).
Failure to properly remove pages from the buffer cache can make
pages that appear not to need clearing to actually have bad random
data in them. See for example base r304232
(https://svnweb.freebsd.org/base?view=revision&revision=304232)
which noted the need to set B_INVAL and B_NOCACHE as well as clear
the B_CACHE flag before calling brelse() to release the buffer.
Rather than trying to find all the incomplete brelse() calls, it
is simpler, though more slightly expensive, to simply clear the
entire buffer when it is newly allocated.
PR: 213507
Submitted by: Damjan Jovanovic
Reviewed by: kib
This implements per-thread counters for PMC sampling. The thread
descriptors are stored in a list attached to the process descriptor.
These thread descriptors can store any per-thread information necessary
for current or future features. For the moment, they just store the counters
for sampling.
The thread descriptors are created when the process descriptor is created.
Additionally, thread descriptors are created or freed when threads
are started or stopped. Because the thread exit function is called in a
critical section, we can't directly free the thread descriptors. Hence,
they are freed to a cache, which is also used as a source of allocations
when needed for new threads.
Approved by: sbruno
Obtained from: jtl
Sponsored by: Juniper Networks, Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15335
When poll() is called via netmap, txsync is initially called,
and if there are no available buffers to reclaim, it waits for the driver
to notify of new buffers. Since the TX IRQ is generally not used in iflib
drivers, this ends up causing a timeout.
Work around this by having the reclaim DELAY(1) if it's initially unable
to reclaim anything, then schedule the tx task, which will spin by
continuously rescheduling itself until some buffers are reclaimed. In
general, the delay is enough to allow some buffers to be reclaimed, so
spinning is minimized.
Reported by: Johannes Lundberg <johalun0@gmail.com>
Reviewed by: sbruno
Sponsored by: Limelight Networks
Differential Revision: https://reviews.freebsd.org/D15455
if an error is reported while pre-processing the configuration file that
the driver attempted to use.
Also, allow the user to explicitly use the built-in configuration with
hw.cxgbe.config_file="built-in"
MFC after: 2 days
Sponsored by: Chelsio Communications
There is no need to try to resume it after each smaller operations
(putchar, cursor_position, copy, fill).
The resume function already checks if the timer is armed before doing
anything, but it uses an atomic cmpset which is expensive. And resuming
the timer at the end of input processing is enough.
While here, we also skip timer resume if the input is for another
windows than the currently displayed one. I.e. if `ttyv0` is currently
displayed, any changes to `ttyv1` shouldn't resume the timer (which
would refresh `ttyv0`).
By doing the same benchmark as r333669, I get:
* vt(4), before r333669: 1500 ms
* vt(4), with this patch: 760 ms
* syscons(4): 700 ms
... to process input, instead of inside each smaller operations such as
appending a character or moving the cursor forward.
In other words, before we were doing (oversimplified):
teken_input()
<for each input character>
vtterm_putchar()
VTBUF_LOCK()
VTBUF_UNLOCK()
vtterm_cursor_position()
VTBUF_LOCK()
VTBUF_UNLOCK()
Now, we are doing:
vtterm_pre_input()
VTBUF_LOCK()
teken_input()
<for each input character>
vtterm_putchar()
vtterm_cursor_position()
vtterm_post_input()
VTBUF_UNLOCK()
The situation was even worse when the vtterm_copy() and vtterm_fill()
callbacks were involved.
The new callbacks are:
* struct terminal_class->tc_pre_input()
* struct terminal_class->tc_post_input()
They are called in teken_input(), surrounding the while() loop.
The goal is to improve input processing speed of vt(4). As a benchmark,
here is the time taken to write a text file of 360 000 lines (26 MiB) on
`ttyv0`:
* vt(4), unmodified: 1500 ms
* vt(4), with this patch: 1200 ms
* syscons(4): 700 ms
This is on a Haswell laptop with a GENERIC-NODEBUG kernel.
At the same time, the locking is changed in the vt_flush() function
which is responsible to draw the text on screen. So instead of
(indirectly) using VTBUF_LOCK() just to read and reset the dirty area
of the internal buffer, the lock is held for about the entire function,
including the drawing part.
The change is mostly visible while content is scrolling fast: before,
lines could appear garbled while scrolling because the internal buffer
was accessed without locks (once the scrolling was finished, the output
was correct). Now, the scrolling appears correct.
In the end, the locking model is closer to what syscons(4) does.
Differential Revision: https://reviews.freebsd.org/D15302
This change updates arm, arm64 and mips achitectures. Additionally, it
removes redundant checks for kdb_active where it already results in
kdb_reenter() and adds kdb_reenter() calls where they were missing.
Some architectures check the return value of kdb_trap(), but some don't.
I haven't changed any of that.
Some trap handling routines have a return code. I am not sure if I
provided correct ones for returns after kdb_reenter(). kdb_reenter
should never return unless kdb_jmpbufp is NULL for some reason.
Only compile tested for all affected architectures. There can be bugs
resulting from my poor understanding of architecture specific details.
Reported by: jhb
Reviewed by: jhb, eadler
MFC after: 4 weeks
Differential Revision: https://reviews.freebsd.org/D15431
Perhaps RB_MUTE could mute user startup (rc) output as well, but right
now it mutes only kernel console output, so make the documentation match
reality.
PR: 228193
Sponsored by: The FreeBSD Foundation
The Intel CPU "Platform Id" is a 3-bit integer reported by a given MSR.
Intel microcode updates have an 8-bit field to indicate Platform Id
compatibility - one bit in the mask for each of the possible Platform Id
values. To simplify interpretation, report the Platform Id mask also as
a list.
freebsd-update depends on phttpget from portsnap. We could move phttpget
out of portsnap and build it as long as WITHOUT_FREEBSD_UPDATE and
WITHOUT_PORTSNAP are not both set, but for now just make the dependency
explicit.
PR: 228220
Reported by: Dries Michiels
Sponsored by: The FreeBSD Foundation
Some of the DEBUG_BUFRING checks are racy, and can lead to
spurious assertions when run under high load. Unhook these
from INVARIANTS until the author can fix or remove them.
Reviewed by: mmacy
Sponsored by: Netflix
Libcasper and its modules have no static libraries so don't define
paths to them. This fixes LIBADD automatically adding DPADD
entries for casper.
Reported by: sbruno
Sponsored by: Dell EMC Isilon
When a disk disappears and the periph is invalidated, any I/Os that
are pending with the controller can cause a crash when they
complete. Move to holding the softc reference count taken in dastart()
until the I/O is complete rather than only until xpt_action()
returns. (This approach was suggested by Ken Merry.) This extends
the method used in da to ada, nda, and mda.
Sponsored by: Netflix
Submitted by: Chuck Silvers
Intel now releases microcode updates in files named after
<family>-<model>-<stepping>. In some cases a single file may include
microcode for multiple Platform Ids for the same family, model, and
stepping. Our current microcode update tooling (/usr/sbin/cpucontrol)
only processes the first microcode update in the file.
This tool splits combined files into individual files with one microcode
update each, named as
<family>-<model>-<stepping>.<platform_id_mask>.
Adding this to tools/ for experimentation and testing. In the future
we'll want to have cpucontrol or other tooling work directly with the
Intel-provided microcode files.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D15433
When a disk disappears and the periph is invalidated, any I/Os that
are pending with the controller can cause a crash when they
complete. Move to holding the softc reference count taken in dastart()
until the I/O is complete rather than only until xpt_action()
returns. (This approach was suggested by Ken Merry.)
Sponsored by: Netflix
Submitted by: Chuck Silvers
Differential Revision: https://reviews.freebsd.org/D15435
later devices. These caches work akin to the ones found in HDDs/SSDs
that ada(4)/da(4) also enable if existent, but likewise increase the
likelihood of data loss in case of a sudden power outage etc. On the
other hand, write performance is up to twice as high for e. g. 1 GiB
files depending on the actual chip and transfer mode employed.
For maximum data integrity, the usage of eMMC caches can be disabled
via the hw.mmcsd.cache tunable.
- Get rid of the NOP mmcsd_open().
The NFSv4 protocol requires that the server only allow reclaim of state
and not issue any new open/lock state for a grace period after booting.
The NFSv4.0 protocol required this grace period to be greater than the
lease duration (over 2minutes). For NFSv4.1, the client tells the server
that it has done reclaiming state by doing a ReclaimComplete operation.
If all NFSv4 clients are NFSv4.1, the grace period can end once all the
clients have done ReclaimComplete, shortening the time period considerably.
This patch does this. If there are any NFSv4.0 mounts, the grace period
will still be over 2minutes.
This change is only an optimization and does not affect correct operation.
Tested by: andreas.nagy@frequentis.com
MFC after: 2 months