The hardware rate limiting feature is enabled by the RATELIMIT kernel
option. Please refer to the txrtlmt option in ifconfig(8) and the
SO_MAX_PACING_RATE socket option for more information. This feature is
compatible with hardware TCP segmentation offload, TSO.
A set of sysctl(8) knobs under dev.mce.<N>.rate_limit is provided to
set up the rate limit table and to fine-tune various rate limit
related parameters.
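As a rough illustration of the socket side, the sketch below requests a
pacing rate with setsockopt(2). It assumes SO_MAX_PACING_RATE takes an
unsigned 32-bit value in bytes per second. The helper names
mbit_to_bytes_per_sec() and set_pacing_rate() are made up for this
example and are not part of the driver.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: convert Mbit/s to bytes per second, clamping to
 * the largest value a 32-bit rate can express.
 */
uint32_t
mbit_to_bytes_per_sec(uint32_t mbit)
{
	uint64_t bytes = (uint64_t)mbit * 1000000ULL / 8ULL;

	return (bytes > UINT32_MAX ? UINT32_MAX : (uint32_t)bytes);
}

/*
 * Hypothetical helper: request a maximum pacing rate on a socket.
 * Returns 0 on success and -1 on failure, like setsockopt(2).
 */
int
set_pacing_rate(int fd, uint32_t bytes_per_sec)
{
#ifdef SO_MAX_PACING_RATE
	return (setsockopt(fd, SOL_SOCKET, SO_MAX_PACING_RATE,
	    &bytes_per_sec, sizeof(bytes_per_sec)));
#else
	(void)fd;
	(void)bytes_per_sec;
	return (-1);	/* option not available on this platform */
#endif
}
```

A caller would typically do something like
set_pacing_rate(fd, mbit_to_bytes_per_sec(100)) right after creating the
socket.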
Sponsored by: Mellanox Technologies
According to the 802.1Q-2014 9.6 VLAN Tag Control Information, VID value 0
means that there is no VLAN tag assigned to the packet, and only PCP and
DEI values from the tag are meaningful. The current flow table programming
filters out such packets.
When programming the VLAN filter for the flow table, unconditionally add a
rule which accepts packets with VLAN id 0. The packets are already handled correctly
by the network stack.
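For reference, the TCI layout described above can be sketched as
follows. The structure and function names are illustrative only; the
bit positions follow the standard 16-bit TCI layout (3-bit PCP, 1-bit
DEI, 12-bit VID).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of the 16-bit VLAN Tag Control Information (TCI). */
struct vlan_tci {
	uint8_t pcp;	/* priority code point, bits 15..13 */
	uint8_t dei;	/* drop eligible indicator, bit 12 */
	uint16_t vid;	/* VLAN identifier, bits 11..0 */
};

/* Split a host-order TCI into its three fields. */
struct vlan_tci
vlan_tci_parse(uint16_t tci)
{
	struct vlan_tci t;

	t.pcp = (tci >> 13) & 0x7;
	t.dei = (tci >> 12) & 0x1;
	t.vid = tci & 0x0fff;
	return (t);
}

/* A VID of 0 means "priority-tagged": only PCP and DEI are meaningful. */
bool
vlan_tci_is_priority_tagged(uint16_t tci)
{
	return ((tci & 0x0fff) == 0);
}
```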
Reviewed by: hselasky, slavash
Sponsored by: Mellanox Technologies
MFC after: 1 week
Make sure the command completion handler is not called when the device is
in the internal error state. This can easily trigger use-after-free situations.
MFC after: 3 days
Sponsored by: Mellanox Technologies
During health care, IRQ resources will be reallocated.
Newbus requires that Giant is locked before accessing
these resources.
MFC after: 3 days
Sponsored by: Mellanox Technologies
Firmware dump collection should be triggered when a firmware syndrome
is reported with the request-for-reset bit set.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
- Move the semaphore locking and unlocking to the same function.
- Flags are no longer needed when the reset and crdump are done in the
same function.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
- Move firmware dump prep and cleanup to init_one() and remove_one() so that
the init and cleanup will happen only upon driver reload.
- Add some prints to indicate firmware dump.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
The old code checked for MLX5_CR_SPACE_DOMAIN which is irrelevant here.
However, if dev->vsec_addr were 0, an access to the wrong offset would
happen.
MFC after: 3 days
Submitted by: slavash@
Sponsored by: Mellanox Technologies
This fixes 32-bit compat (no ioctl command definitions are required
as struct ifreq is the same size). This is believed to be sufficient to
fully support ifconfig on 32-bit systems.
Reviewed by: kib
Obtained from: CheriBSD
MFC after: 1 week
Relnotes: yes
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14900
error state.
If the device is in internal error state the hardware will not
generate completions. Just move on to destroy the resources.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Change page cleanup flow when in internal error to properly decrement
the page counts when reclaiming pages. That prevents timing out
waiting for extra pages that were actually cleaned up previously.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
When a PCI error is detected the PCI state could be corrupt, so don't
save it in that flow. Save the state after initialization. After
restoring the PCI state during slot reset, save it again, because
restoring the state destroys the previously saved state info.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Since the FW can be shared between PCI functions, it is common that
more than one health poll will detect a failure. This can lead to
multiple resets.
The solution is to use a FW locking mechanism using semaphore space to
provide a way to synchronize between functions. The FW semaphore is
acquired via config cycle access. First the VSEC gateway must be
acquired, then the semaphore can be locked by writing a value to it
and confirming the lock by reading the same value back. The process
is the same to free the semaphore, except the value written should be
zero.
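The write-then-read-back handshake can be sketched as below. This is a
simulation only: fake_sem stands in for the semaphore word, the VSEC
gateway acquisition is omitted, and the rule that a non-zero write only
sticks while the word is free is an assumption about the hardware. All
names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simulated semaphore word. On real hardware this would live in the
 * semaphore space and be reached through the VSEC gateway.
 */
static uint32_t fake_sem;

/* Assumed semantics: a non-zero write only sticks while the word is free. */
static void
fake_sem_write(uint32_t val)
{
	if (val == 0 || fake_sem == 0)
		fake_sem = val;
}

static uint32_t
fake_sem_read(void)
{
	return (fake_sem);
}

/* Lock: write our value, then read it back to confirm ownership. */
bool
fw_sem_lock(uint32_t owner_id)
{
	if (owner_id == 0)
		return (false);	/* zero is reserved for "free" */
	fake_sem_write(owner_id);
	return (fake_sem_read() == owner_id);
}

/* Unlock: same sequence, but the value written is zero. */
bool
fw_sem_unlock(void)
{
	fake_sem_write(0);
	return (fake_sem_read() == 0);
}
```

A function that fails to take the lock can conclude that another
function is already handling the reset and skip issuing its own.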
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
If a FW assert is considered fatal, indicated by a new bit in the
health buffer, reset the FW. After the reset, follow the normal
recovery flow.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Some mlx5 adapter firmware allows the driver to reset the firmware in
the event of an error. When a software reset is issued on any physical
function all PFs enter reset state. This is a recoverable condition.
The existing recovery flow was designed to allow the recovery of a
VF after a PF driver reload. This patch expands the scope of that
flow to recover PFs or VFs after a SW reset has been issued.
When a software reset is issued the following occurs:
1. The NIC interface mode is set to SW_RESET (7) while the reset is in
progress.
2. Once the reset completes the NIC interface mode is set to NIC
disabled (1).
After the reset has been issued (added in a subsequent patch) the
health poll for other functions will detect that the NIC interface
state has been set to disabled. This will cause it to enter the
existing recovery flow. If PCI is still working (meaning it
doesn't return 0xff on all reads), recovery can proceed
immediately instead of waiting 60 seconds.
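The decision described above can be sketched as a small state check.
The two mode values come from the text; the enum names, the 32-bit
all-ones test for a dead PCI link, and the function names are
assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* NIC interface mode values named in the text above. */
enum nic_iface_mode {
	NIC_IFACE_DISABLED = 1,
	NIC_IFACE_SW_RESET = 7
};

enum recovery_step {
	RECOVERY_NONE,		/* nothing to do */
	RECOVERY_WAIT,		/* reset in progress, or PCI not readable */
	RECOVERY_NOW		/* proceed with recovery immediately */
};

/* Assumption: a dead PCI link returns all-ones on every read. */
bool
pci_read_looks_alive(uint32_t value)
{
	return (value != 0xffffffffU);
}

/* Pick the next step from the observed mode and a probe read. */
enum recovery_step
next_recovery_step(int mode, uint32_t pci_probe_value)
{
	if (mode == NIC_IFACE_SW_RESET)
		return (RECOVERY_WAIT);
	if (mode == NIC_IFACE_DISABLED)
		return (pci_read_looks_alive(pci_probe_value) ?
		    RECOVERY_NOW : RECOVERY_WAIT);
	return (RECOVERY_NONE);
}
```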
The error detection has also been refactored to avoid incorrect or
misleading log messages.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
When mlx5_enter_error_state() operation is forced by shutdown, the
messages surrounding setting the error state are not informative
and confuse users.
Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
This patch accumulates the following Linux commits:
- 8812c24d28f4972c4f2b9998bf30b1f2a1b62adf
net/mlx5: Add fast unload support in shutdown flow
- 59211bd3b6329c3e5f4a90ac3d7f87ffa7867073
net/mlx5: Split the load/unload flow into hardware and software flows
- 4525abeaae54560254a1bb8970b3d4c225d32ef4
net/mlx5: Expose command polling interface
Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
This patch accumulates the following Linux commits:
- 04c0c1ab38e95105d950db5b84e727637e149ce7
net/mlx5: PCI error recovery health care simulation
- 0179720d6be2096b8d0a4d143254ff9e77747daa
net/mlx5: Introduce trigger_health_work function
- 3fece5d676939f42f434c63dfe1bd42d7d94e6f0
net/mlx5: Continue health polling until it is explicitly stopped
Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
The mlx5e_destroy_ifp() function may be called from the system workqueue and
in this case trying to flush all work items will cause a deadlock.
Instead of using the system workqueue, create a designated workqueue
for each mlx5en(4) device instance.
Submitted by: slavash@
MFC after: 1 week
Sponsored by: Mellanox Technologies
There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.
While at it, this patch also fills the vlan_id field in the completion, if
present.
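A consumer-side sketch of the rule above follows. The structure, the
flag value and the function names are stand-ins invented for this
example and are not the real verbs definitions. The point is only that
SL is read when, and only when, the VLAN flag is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag and structure, loosely modelled on a work completion. */
#define	EX_WC_WITH_VLAN	0x20

struct ex_wc {
	uint32_t wc_flags;
	uint8_t sl;
	uint16_t vlan_id;
};

/* Fill SL and VLAN id from a VLAN TCI, if the packet carried a tag. */
void
ex_wc_fill_vlan(struct ex_wc *wc, bool has_vlan, uint16_t tci)
{
	if (has_vlan) {
		wc->wc_flags |= EX_WC_WITH_VLAN;
		wc->sl = (tci >> 13) & 0x7;	/* 3 user priority bits */
		wc->vlan_id = tci & 0x0fff;
	} else {
		wc->wc_flags &= ~(uint32_t)EX_WC_WITH_VLAN;
	}
}

/* Consumer side: only trust SL when the VLAN flag is set. */
bool
ex_wc_get_sl(const struct ex_wc *wc, uint8_t *sl)
{
	if ((wc->wc_flags & EX_WC_WITH_VLAN) == 0)
		return (false);
	*sl = wc->sl;
	return (true);
}
```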
linux commit 12f8fedef2ec94c783f929126b20440a01512c14
MFC after: 1 week
Sponsored by: Mellanox Technologies
mlx5core.
Do not consider the inability to create a firmware dump fatal, but
inform about the situation and allow the driver to attach. The device
might not implement the needed VSC, or we might not know the layout of
the register map. In either case, only the firmware dump functionality is
limited; network operations should be fine.
Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
When the mlx5en(4) driver was converted to using BUSDMA(9) the call to
m_defrag() was moved after the part of the TX routine that strips the
header from the mbuf chain. Before it called m_defrag it first trimmed
off the now-empty mbufs from the start of the chain. This has the side
effect of also removing the head of the chain that has M_PKTHDR set.
m_defrag() will not defrag a chain that does not have M_PKTHDR set,
thus it was effectively never defragging the mbuf chains.
As it turns out, trimming the mbufs in this fashion is unnecessary since
the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so
remove it.
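The failure mode can be reproduced with a toy model. The types below
are not the real mbuf(9) structures: toy_mbuf, TOY_PKTHDR and both
functions are simplified stand-ins that only mimic the header-flag check
described above.

```c
#include <stdbool.h>
#include <stddef.h>

#define	TOY_PKTHDR	0x1	/* stands in for M_PKTHDR */

struct toy_mbuf {
	struct toy_mbuf *next;
	int flags;
	int len;
};

/* Old behaviour: skip leading empty buffers, losing the packet header. */
struct toy_mbuf *
toy_trim_empty_head(struct toy_mbuf *m)
{
	while (m != NULL && m->len == 0)
		m = m->next;
	return (m);
}

/* Like m_defrag(), refuse chains whose head lacks the packet header flag. */
bool
toy_can_defrag(const struct toy_mbuf *m)
{
	return (m != NULL && (m->flags & TOY_PKTHDR) != 0);
}
```

With a chain whose head is empty but carries TOY_PKTHDR,
toy_can_defrag() succeeds on the original chain and fails on the trimmed
one, which matches the behaviour this change removes.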
Differential Revision: https://reviews.freebsd.org/D12050
Submitted by: mjoras@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Set and report the vport MTU rather than the physical MTU.
The driver will set both the vport and physical port MTU
and will rely on the query of the vport MTU.
SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.
Also, in some cases where the PF is not a port owner, the PF can
work with an MTU less than the physical port MTU if setting the
physical port MTU didn't take effect.
Based on Linux upstream commit:
cd255efff9baadd654d6160e52d17ae7c568c9d3
Submitted by: Meny Yossefi <menyy@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Currently the ifnet interface is named mceX, where X is a monotonically
incremented value. If the device is reset due to a fatal error, then the
interface name will change. Using the device unit number will keep the
naming consistent across the reset logic.
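The difference between the two naming schemes can be sketched in a few
lines. Both functions are hypothetical and use snprintf(3) in place of
the kernel's interface naming routines.

```c
#include <stdio.h>

/* Old scheme: a counter that only ever increases. */
static int next_index;

int
name_from_counter(char *buf, size_t len)
{
	return (snprintf(buf, len, "mce%d", next_index++));
}

/* New scheme: the name follows the device unit number. */
int
name_from_unit(char *buf, size_t len, int unit)
{
	return (snprintf(buf, len, "mce%d", unit));
}
```

Calling name_from_counter() again after a simulated reset yields a new
name, while name_from_unit() keeps returning the same one for the same
unit.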
Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
ConnectX-4/5 devices in mlx5core.
The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.
The utility allows storing the dump in the format
<address> <value>
into a file, resetting the dump content, and manually initiating the
dump.
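A minimal sketch of producing such lines is shown below. The
zero-padded hexadecimal layout is an assumption, since the text only
fixes the order of the two fields. The function and structure names are
hypothetical and unrelated to mlx5tool(8) internals.

```c
#include <stdint.h>
#include <stdio.h>

struct dump_entry {
	uint32_t addr;
	uint32_t value;
};

/* Format one "<address> <value>" line; returns the snprintf(3) result. */
int
dump_format_line(char *buf, size_t len, uint32_t addr, uint32_t value)
{
	return (snprintf(buf, len, "0x%08x 0x%08x\n",
	    (unsigned)addr, (unsigned)value));
}

/*
 * Write one line per entry to fp. Returns the number of entries
 * written, or -1 on output error.
 */
int
dump_write(FILE *fp, const struct dump_entry *e, size_t n)
{
	char line[32];
	size_t i;

	for (i = 0; i < n; i++) {
		dump_format_line(line, sizeof(line), e[i].addr, e[i].value);
		if (fputs(line, fp) == EOF)
			return (-1);
	}
	return ((int)n);
}
```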
A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.
Submitted by: kib@
MFC after: 1 week
Sponsored by: Mellanox Technologies
Add the ability to access the vendor specific space gateway in order
to support reading and writing data into the different configuration
domains.
Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Add support for PFC and implement reading the per priority statistics
using the sysctl(8) interface. PFC is used together with VLAN priority
and can be enabled and disabled on a per priority basis.
Global pause frames and PFC are incompatible features and surrounding
logic has been added to warn the user about misconfiguration.
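The incompatibility check can be sketched as follows. The structure and
function names are invented for this example. The only facts taken from
the text are that PFC is configured per priority and must not be
combined with global pause frames. The eight priorities follow from the
3-bit VLAN priority field.

```c
#include <stdbool.h>
#include <stdint.h>

struct fc_config {
	bool global_rx_pause;
	bool global_tx_pause;
	uint8_t pfc_mask;	/* bit N set: PFC enabled for priority N */
};

/* PFC and global pause frames must not be enabled together. */
bool
fc_config_is_valid(const struct fc_config *c)
{
	bool global = c->global_rx_pause || c->global_tx_pause;

	return (!(global && c->pfc_mask != 0));
}

/* Enable or disable PFC for one priority (0..7). */
bool
fc_config_set_pfc(struct fc_config *c, unsigned prio, bool enable)
{
	if (prio > 7)
		return (false);
	if (enable)
		c->pfc_mask |= (uint8_t)(1u << prio);
	else
		c->pfc_mask &= (uint8_t)~(1u << prio);
	return (true);
}
```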
Update relevant mlx5core APIs for PFC configuration.
MFC after: 1 week
Sponsored by: Mellanox Technologies
ECN configuration and statistics are available through a set of sysctl(8)
nodes under sys.class.infiniband.mlx5_X.cong. The ECN configuration
nodes can also be used as loader tunables.
MFC after: 1 week
Sponsored by: Mellanox Technologies
This patch accumulates the following Linux commits:
mlx5_health.c
- 78ccb25861d76a8fc5c678d762180e6918834200
mlx5_core: Fix wrong name in struct
- 171bb2c560f45c0427ca3776a4c8f4e26e559400
mlx5_core: Update health syndromes
- 0144a95e2ad53a40c62148f44fb0c1f9d2a0d1e9
mlx5_core: Use accessor functions to read from device memory
- ac6ea6e81a80172612e0c9ef93720f371b198918
mlx5_core: Use private health thread for each device
- fd76ee4da55abb21babfc69310d321b9cb9a32e0
mlx5_core: Fix internal error detection conditions
- 2241007b3d783cbdbaa78c30bdb1994278b6f9b9
mlx5: Clear health sick bit when starting health poll
- 712bfef60912d91033cb25739f7444d5b8d8c59f
mlx5: Fix version printout in case of health issue
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
mlx5_core: Add pci error handlers to mlx5_core driver
mlx5_cmd.c
- be87544de8df2b1eb34bcb5e32691287d96f9ec4
mlx5_core: Fix async commands return code
- a31208b1e11df334d443ec8cace7636150bb8ce2
mlx5_core: New init and exit flow for mlx5_core
- 020446e01eebc9dbe7eda038e570ab9c7ab13586
mlx5_core: Prepare cmd interface to system errors handling
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
mlx5_core: Add pci error handlers to mlx5_core driver
- 0d834442cc247c7b3f3bd6019512ae03e96dd99a
mlx5: Fix teardown errors that happen in pci error handler
mlx5_main.c
- 5fc7197d3a256d9c5de3134870304b24892a4908
mlx5: Add pci shutdown callback
Submitted by: Matthew Finlay <matt@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
Add support for mapping priority to traffic class via sysctl
Submitted by: Slava Shwartsman <slavash@mellanox.com>
MFC after: 1 week
Sponsored by: Mellanox Technologies
- Factor out port speed definitions into a new port.h header file,
similar to what is done in Linux upstream.
- Correct two existing port speed definitions in mlx5en according to
Linux upstream.
MFC after: 1 week
Sponsored by: Mellanox Technologies
Adding an interface might be done outside the device_attach() routine
and will then cause a panic, due to the VNET not being set.
MFC after: 1 week
Sponsored by: Mellanox Technologies