Overview:
This is the first stage of a RDMA stack upgrade introducing kernel
changes only based on Linux 5.7-rc1.
This patch is based on about four main areas of work:
- Update of the IB uobjects system:
- The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects
is now managed by ibcore. This also require some changes in the
kernel verbs API. The updated verbs changes are typically about
initialize and deinitialize objects, and remove allocation and
free of memory.
- Update of the uverbs IOCTL framework:
- The parsing and handling of user-space commands has been
completely refactored to integrate with the updated IB uobjects
system.
- Various changes and updates to the generic uverbs interfaces in
device drivers including the new uAPI surface.
- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.
Dependencies:
- The mlx4ib driver code has been updated with the minimum changes
needed.
- The mlx5ib driver code has been updated with the minimum changes
needed including DV support.
Compatibility:
- All user-space facing APIs are backwards compatible after this
change.
- All kernel-space facing RDMA APIs are backwards compatible after
this change, with exception of ib_create_ah() and ib_destroy_ah()
which takes a new flag.
- The "ib_device_ops" structure exist, but only contains the driver ID
and some structure sizes.
Differences from Linux:
- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set
the sizes needed for allocating various IB objects, when adding
IB device instances.
Security:
- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.
Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429
MFC after: 1 week
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D31149
Sponsored by: NVIDIA Networking
Buggy SD card drivers may attach and detach a mmc(4) driver instance in
quick succession. In this case mmc(4) must disestablish its intrhook
callback during detach. Thus, this change adds a call to
config_intrhook_drain(), which blocks or does nothing if the intrhook is
running or has already ran (the SD card was plugged in), and
disestablishes the hook if it hasn't ran yet (the SD card was not
plugged in).
PR: 254373
Reviewed by: imp, manu, markj
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31262
When using SDIO the block size if per function and most of the time
not equal to MMC_SECTOR_SIZE, fix sdio on dwmmc by setting the correct
block size in the mmc registers.
MFC after: 1 month
Sponsored by: Diablotin Systems
This make it easier for script to get the hardware on which they are running.
MFC after: 1 month
Sponsored by: Diablotin Systems
Differential Revision: https://reviews.freebsd.org/D31205
Reviewed by: imp
Should be ok on powerpc: jhibbits (over irc)
Due to a mis-merge, the changes committed to libpmc never called
pmu_parse_event(), or set pm->pm_ev. However, this field shouldn't be
used to carry the actual pmc event code anyway, as it is expected to
contain the index into the pmu event array (otherwise, it breaks event
name lookup in pmclog_get_event()). Add a new MD field,
pm_md.pm_md_config, to pass the raw event code to arm64_allocate_pmc().
Additionally, the change made to pmc_md_op_pmcallocate was incorrect, as
this is a union, not a struct. Restore the proper padding size.
Reviewed by: luporl, ray, andrew
Fixes: 28dd6730a5d6 ("libpmc: enable pmu_utils on arm64")
Fixes: 8cc3815f02be ("hwpmc_arm64: accept raw event codes...")
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31221
These two ioctls are not part of the current version of OSS and were
considered obsolete. However, their behaviour is not the same as their
old one, so this implementation is specific to FreeBSD.
Older OSS versions had the MUTE ioctls take and return an integer with
a value of 0 or 1, which meant that the _whole_ mixer is unmuted or
muted respectively. In my implementation, the ioctl takes and returns
a bitmask that tells us which devices are muted.
This allows us to mute and unmute only the devices we want, instead of the
whole mixer. The bitmask works the same way as in DEVMASK, RECMASK and
RECSRC.
Integrated the hardware volume feature with the new mute system.
Submitted by: Christos Margiolis <christos@freebsd.org>
Differential Revision: https://reviews.freebsd.org/D31130
MFC after: 1 week
Sponsored by: NVIDIA Networking
Currently we use the num-viewports property to decide how many outbound
regions there are we can use, defaulting to 2. However, Linux has
stopped using that and so it no longer appears in new device trees, such
as for the SiFive FU740. Instead, it's possible to just probe the
hardware directly.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31030
This supersedes the old legacy mode where a viewport register was used
to mux multiple regions behind a single set of registers, and is used on
the SiFive FU740.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31029
Currently we assume there is only one memory and one prefetch memory
window, and ignore the latter. However, the SiFive FU740 has two normal
memory windows.
As part of this, the viewports are rearranged. Previously the viewports
were memory, config then optionally I/O. Both to simplify the config
index calculation and to ensure it can always be mapped even if we have
too many memory windows for the number of viewports, config is moved to
being the first viewport.
This generalisation now also naturally supports mapping prefetch memory
windows.
Reviewed by: mmel
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31028
Note that currently Linux's device tree uses the FU540's compatible
string, as does upstream U-Boot, but the U-Boot shipped with the board
based on an older patch series has the correct FU740 name. Thankfully
they are the same, at least as far as software is concerned.
Whilst here, fix a style(9) nit.
Reviewed by: philip, kp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31034
The intention here is to reduce differences between em, igb, igc, ixgbe.
The main functional change is logical simplification in igb_rx_checksum
and getting interface caps from scctx instead of the ifp.
Reviewed by: gallatin, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30073
If a data PDU encounters an error such as a digest error, the firmware
will report that data PDU when completion moderation is active even if
it is not the final data PDU in a burst.
Sponsored by: Chelsio Communications
A non-placed PDU can be delivered by CPL_RX_ISCSI_CMP in the middle of
a burst of placed PDUs (received via DDP) in which case the rcv_nxt
will not match the start of the non-placed PDU.
Reported by: Jithesh Arakkan @ Chelsio
Sponsored by: Chelsio Communications
The intention here is to reduce differences with D30072.
The only functional change is logical simplification in
ixgbe_rx_checksum.
Reviewed by: gallatin
Differential Revision: https://reviews.freebsd.org/D30074
It can be useful for system operators to see this kind of information
when correlating issues or requesting support from the OEM or Intel for
hardware and firmware issues.
Reviewed by: gallatin
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D30178
This fixes a kernel panic when probing for vmd_bus on Intel TigerLake on
14-CURRENT. Apparently, vmd_bus is a type of PCI bus, but was registered
as a separate device class.
PR: 256915
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D31071
To guard against the ill effects of a spurious interrupt during
construction (or one that was bogusly pending), enable interrupts after
the qpair is completely constructed. Otherwise, we can die with null
pointer dereferences in nvme_qpair_process_completions. This has been
observed in at least one pre-release NVMe drive where the MSIX interrupt
fired while the queue was being created, before we'd started the NVMe
controller card.
The alternative of only turning on the interrupts after the rest was
tried, but was insufficient to work around this bug and made the code
more complicated w/o benefit.
Reviewed by: mav, chuck
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31182
port devices. If it gets eaten it is fine. Many USB device side implementations
don't properly support the clear endpoint halt command and if they do, data is lost
because the transmit FIFO is typically reset when this command is received.
Tested by: jmg
MFC after: 1 week
Sponsored by: NVIDIA Networking
if it is only multiplexed device. Also enable syncbit checks for them.
This fixes touchpad recognition on Panasonic Toughbook CF-MX4 laptop.
Reported by: Tomasz "CeDeROM" CEDRO <tomek_AT_cedro_DOT_info>
MFC after: 1 month
PR: 253279
Differential revision: https://reviews.freebsd.org/D28502
These registers have read side effects and a read at just the right
(wrong?) time can trash some internal hw state.
Obtained from: Chelsio Communications
MFC after: 1 week
Sponsored by: Chelsio Communications
The problem is that ns8250_bus_probe() accesses a field from the
ns8250_softc, which embeds the generic UART softc, but the ns8250_softc
hasn't yet been allocated because we're still probing.
This is a regression from commit 0aefb0a63c50. This fixed a problem
where one of the upper four IER bits, which are usually reserved, needs
to be set in order to get RX interrupts before the RX FIFO is full. At
the same time, we avoid clearing those reserved bits (see commit
58957d87173, though other UART drivers I looked at do not bother with
this).
So, copy what ns8250_init() does to disable interrupts, since we don't
know what the "right" mask is at this point.
Reported by: syzbot+f256beefd0df9eb796e7@syzkaller.appspotmail.com
Reviewed by: imp
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31124
Changes since 1.25.6.0 are listed here. This list comes from the
Release Notes for "Chelsio Unified Wire 3.14.0.4 for Linux" dated
2021-07-08.
Fixes
-----
BASE:
- Wait 5ms before and after the i2c command that clears the mod_select.
This fixes incorrect port module type read from i2c.
Obtained from: Chelsio Communications
MFC after: 1 week
Sponsored by: Chelsio Communications
Properly allocate all mlx5en(4) structures from correct numa domain.
While at it cleanup unused numa domain integers deriving from the
Linux version of mlx5en(4).
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
Make sure the "uid" field gets properly set when destroying DCT and QP
objects by making a copy of the field when creating such objects.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
Currently when we map the hca_core_clock page to the user space,
there are vulnerable registers, one of which is semaphore, on
this page as well. If user read the wrong offset, it can modify the
above semaphore and hang the device.
Hence, mapping the hca_core_clock page to the user space only when
user required it specifically.
After this patch, mlx4 core_clock won't be mapped to user space by
default. Oppose to current state, where mlx4 core_clock is always mapped
to user space.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
To avoid congestion on the same PCI memory register space when
traffic consists mostly of small packets.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
The driver expects all TLS tags to be returned to the driver before
it can free the UMA zone where the TLS tags reside.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
IB spec says that a lid should be ignored when link layer is Ethernet,
for example when building or parsing a CM request message (CA17-34).
However, since ib_lid_be16() and ib_lid_cpu16() validates the slid,
not only when link layer is IB, we set the slid to zero to prevent
false warnings in the kernel log.
Linux commit:
65389322b28f81cc137b60a41044c2d958a7b950
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
RoCE is short for Remote direct memory access over Converged Ethernet.
ECN is short for Explicit Congestion Notification.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking
Querying the PCI config space for offline for every firmware command blocks
the PCI bus and affects performance. Especially for packet pacing and TLS
when objects are frequently created and destroyed.
MFC after: 1 week
Reviewed by: kib
Sponsored by: Mellanox Technologies // NVIDIA Networking