In case of health counter fails to increment it indicates a bad device health.
In case when the syndrome indicated by firmware is 0x0, this indicates that
firmware is unable to respond to initialization segment reads.
Add proper print in this case.
Submitted by: slavash@
MFC after: 3 days
Sponsored by: Mellanox Technologies
MPFS is a logical switch in the Mellanox device which forward packets
based on a hardware driven L2 address table, to one or more physical-
or virtual- functions. The physical- or virtual- function is required
to tell the MPFS by using the MPFS firmware commands, which unicast
MAC addresses it is requesting from the physical port's traffic.
Broadcast and multicast traffic however, is copied to all listening
physical- and virtual- functions and does not need a rule in the MPFS
switching table.
Linux commit: eeb66cdb682678bfd1f02a4547e3649b38ffea7e
MFC after: 3 days
Sponsored by: Mellanox Technologies
Add the 512 bytes limit of RDMA READ and the size of remote address to the max
SGE calculation.
Submitted by: slavash@
Linux commit: 288c01b746aa
MFC after: 3 days
Sponsored by: Mellanox Technologies
Currently only suspend requests are acknowledged by writing an empty
string back to the xenstore control node, but poweroff or reboot
requests are not acknowledged and FreeBSD simply proceeds to perform
the desired action.
Fix this by acknowledging all requests, and remove the suspend specific
ack done in the handler.
Sponsored by: Citrix Systems R&D
MFC after: 3 days
As of r347221 the iflib legacy interrupt mode setup assumes that drivers
perform both receive and transmit processing from the interrupt handler.
This assumption is invalid in the vmxnet3 driver, so introduce the
IFLIB_SINGLE_IRQ_RX_ONLY flag to make iflib avoid tx processing in the
interrupt handler.
PR: 239118
Reported and tested by: Juraj Lutter <otis@sk.freebsd.org>
Obtained from: marius
Reviewed by: gallatin
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21831
r347183 bumped GEOM classes to SI_ORDER_SECOND to resolve a race between
them and the initialization of devsoftc.mtx in devinit, but missed this
dependency on g_flashmap that may now lose the race against GEOM
classes/g_init.
There's a great comment that describes the situation that has also been
updated with the new ordering of GEOM classes.
Reported by: bdragon
MFC after: 4 days
On rockchip board it seems that the value in the DTS
are not enough for reseting the chip, I don't know if
the value are really incorrect or if DELAY is not precise
enough or if the rockchip gpio driver have some "lag" of some
kind or not.
For now just add more delay.
No functional change intended.
The intent is to add a "legacy" e820 pmem newbus bus for nvdimm device in a
subsequent revision, and it's a little more clear if the parent buses get
independent source files.
Quite a lot of ACPI-specific logic is left in nvdimm.c; disentangling that
is a much larger change (and probably not especially useful).
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D21813
Those functions are used by kernel, and we can't check all possible argument
errors in production kernel. Plus according to docs many of those errors
are checked by hardware. Assertions should just help with code debugging.
MFC after: 2 weeks
The TUNABLE_INT_FETCH is macro around getenv_int() and we will get
return value 0 or 1 for failure or success, we can use it to decide
which background color to use.
Convert all remaining references to that field to "ref_count" and update
comments accordingly. No functional change intended.
Reviewed by: alc, kib
Sponsored by: Intel, Netflix
Differential Revision: https://reviews.freebsd.org/D21768
Instead of predicting the MSI-X bar index based on the device's MAC
type, read it from the device's PCI configuration instead.
PR: 239704
Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com>
Reviewed by: erj@
MFC after: 3 days
Sponsored by: Intel Corporation
Differential Revision: https://reviews.freebsd.org/D21547
- For each queue pair precalculate CPU and domain it is bound to.
If queue pairs are not per-CPU, then use the domain of the device.
- Allocate most of queue pair memory from the domain it is bound to.
- Bind callouts to the same CPUs as queue pair to avoid migrations.
- Do not assign queue pairs to each SMT thread. It just wasted
resources and increased lock congestions.
- Remove fixed multiplier of CPUs per queue pair, spread them even.
This allows to use more queue pairs in some hardware configurations.
- If queue pair serves multiple CPUs, bind different NVMe devices to
different CPUs.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
above 1Kbyte. It might look like some XHCI(4) controllers do not
support when the USB control transfer is split using a link TRB. The
next NORMAL TRB after the link TRB is simply failing with XHCI error
code 4. The quirk ensures we allocate a 64Kbyte buffer so that the
data stage TRB is not broken with a link TRB.
Found at: EuroBSDcon 2019
MFC after: 1 week
Sponsored by: Mellanox Technologies
libusb. This is useful for speeding up large data transfers while reducing
the interrupt rate.
Found at: EuroBSDcon 2019
MFC after: 1 week
Sponsored by: Mellanox Technologies
Allocate ioat->ring memory from the device domain.
Schedule ioat->poll_timer to the first CPU of the device domain.
According to pcm-numa tool from intel-pcm port, this reduces number of
remote DRAM accesses while copying data by 75%. And unless it is a noise,
I've noticed some speed improvement when copying data to other domain.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
If there is an attempt to switch from a process-owned VT to a closed VT,
then vt(4) first requests the process to release its VT and only then
realizes that the target VT is closed and, so, the switch is not
possible. So, the driver does not actually do any switch, but at the
same time the owning process is not notified about that and it does not
re-acquire the VT.
This change adds an early check for the target VT state, so that the
switch can be refused before the process coordination dance.
On top of that, the code now checks for a failure of vt_window_switch()
and calls vt_window_postswitch() for the current VT if it is in the
process mode.
Test Plan:
- configure VT1 - VT8 (ttyv0 - ttyv7) to be text consoles (run getty)
- configure VT9 (ttyv8) to rn X server
- make sure that the X server configuration allows VT switching
- leave VT10 - VT12 unconfigured
- while in the X server press Ctrl+Alt+F10
- without the patch, observe strange screen content and problems with
keyboard input
- with the patch, observe that nothing happens
The problem has been observed and the fix has been tested with an nVidia
graphics card and the proprietary nvidia driver.
Not sure if that matters.
Reviewed by: ray
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D21704
BERI stands for Bluespec Extensible RISC Implementation, based on MIPS.
BERI has not implemented standard MIPS perfomance monitoring counters,
instead it provides statistical counters.
BERI statcounters have a several limitations:
- They can't be written
- They don't support start/stop operation
- None of hardware interrupt is provided on a counter overflow.
So make it separate to hwpmc_mips module and support process/system
counting mode only.
Sponsored by: DARPA, AFRL
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.
Reported by: alc
Reviewed by: alc, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21639
Since TX interrupt is generated when THRE is set, wait for TEMT set means
wait for full character transmission time. At low speeds that may take
awhile, burning CPU time while holding sc_hwmtx lock, also congested.
This is partial revert of r317659.
PR: 240121
MFC after: 2 weeks
Errors are communicated between the i2c controller layer and upper layers
(iicbus and slave device drivers) using a set of IIC_Exxxxxx constants which
effectively define a private number space separate from (and having values
that conflict with) the system errno number space. Sometimes it is necessary
to report a plain old system error (especially EINTR) from the controller or
bus layer and have that value make it back across the syscall interface
intact.
I initially considered replicating a few "crucial" errno values with similar
names and new numbers, e.g., IIC_EINTR, IIC_ERESTART, etc. It seemed like
that had the potential to grow over time until many of the errno names were
duplicated into the IIC_Exxxxx space.
So instead, this defines a mechanism to "encode" an errno into the IIC_Exxxx
space by setting the high bit and putting the errno into the lower-order
bits; a new errno2iic() function does this. The existing iic2errno()
recognizes the encoded values and extracts the original errno out of the
encoded value. An interesting wrinkle occurs with the pseudo-error values
such as ERESTART -- they aleady have the high bit set, and turning it off
would be the wrong thing to do. Instead, iic2errno() recognizes that lots of
high bits are on (i.e., it's a negative number near to zero) and just
returns that value as-is.
Thus, existing drivers continue to work without needing any changes, and
there is now a way to return errno values from the lower layers. The first
use of that is in iicbus_poll() which does mtx_sleep() with the PCATCH flag,
and needs to return the errno from that up the call chain.
Differential Revision: https://reviews.freebsd.org/D20975