IFCAP2_XXX constants are integers, they do not need shift for the
definition. But their usage as bitmask for if_capenable2 does require
shift. Add convenience macro IFCAP2_BIT() for consumers.
Fix the only existing consumer, mlx5(4) RXTLS enable bits.
Reported by: jhb
Reviewed by: jhb, jhibbits, hselasky
Coverity CID: 1501659
Sponsored by: NVIDIA networking
Differential revision: https://reviews.freebsd.org/D37862
The way that the clock is synchronized between the system and the current mlx5 for the purposes of the M_TSTMP
being carried we loose a lot of precision. Instead lets change the math that calculates this to separate out
the seconds/nanoseconds and operate on the two values so we don't get overflow instead of just
shifting the value down and loosing precision.
Reviewed by: kib, hselasky
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36327
With clang 15, the following -Werror warning is produced:
sys/dev/mlx5/mlx5_ib/mlx5_ib_devx.c:1926:6: error: variable 'num_alloc_xa_entries' set but not used [-Werror,-Wunused-but-set-variable]
int num_alloc_xa_entries = 0;
^
The 'num_alloc_xa_entries' variable appears to have been a debugging aid
that has never been used for anything, so remove it.
MFC after: 3 days
With clang 15, the following -Werror warning is produced:
sys/dev/mlx5/mlx5_core/mlx5_fs_tree.c:1408:15: error: variable 'candidate_group_num' set but not used [-Werror,-Wunused-but-set-variable]
unsigned int candidate_group_num = 0;
^
The 'candidate_group_num' variable appears to have been a debugging aid
that has never been used for anything, so remove it.
MFC after: 3 days
In the clean_mr() error path the 'mr' should be freed.
Linux commit:
5942d8ae411775b76e5e1ab0cce57b0666516f2d
PR: 264653
MFC after: 1 week
Sponsored by: NVIDIA Networking
While at it, fix double initialization of the "drv_ioctl_data_d" structure
and the "mask" variable.
Reviewed by: kib@
MFC after: 1 week
Sponsored by: NVIDIA Networking
The page size specified for RQ, SQ and CQ is always in units of 4KBytes.
Make sure we subtract MLX5_ADAPTER_PAGE_SHIFT, 12, instead of PAGE_SHIFT
which may vary. This fixes support for using the mlx5en driver on systems
having non-4K page size.
Linux commit:
68cdf5d6e91068c98d6091b193dc7a5ab7dcf5eb
MFC after: 1 week
Sponsored by: NVIDIA Networking
The IBTA specification has new speed - NDR. That speed supports signaling
rate of 100Gb. mlx5 IB driver translates link modes reported by ConnectX
device to IB speed and width. Added translation of new 100Gb, 200Gb and
400Gb link modes to NDR IB type and width of x1, x2 or x4 respectively.
Linux commits:
f946e45f59ef01ff54ffb3b1eba3a8e7915e7326
MFC after: 1 week
Sponsored by: NVIDIA Networking
Create a dedicated free state, in case the taskqueue worker is still pending,
to avoid re-activation of a freed send tag.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Use the send tag refcounting mechanism to refcount the RX- and TX- TLS
send tags. Then it is no longer needed to wait for refcounts to reach
zero when destroying RX- and TX- TLS send tags as a result of pending
data or WQE commands.
This also ensures that when TX-TLS and rate limiting is used at the same
time, the underlying SQ is not prematurely destroyed.
MFC after: 1 week
Sponsored by: NVIDIA Networking
These zones are cache zones used to allocate TLS offload contexts from
firmware. Releasing items from the cache is a sleepable operation due
to the need to await a response from the firmware command freeing the
tag, so items cannot be reclaimed from the zone in non-sleepable
contexts. Since the cache size is limited by firmware limits, avoid
this by setting UMA_ZONE_UNMANAGED to avoid reclamation by uma_timeout()
and the low memory handler.
Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)
This fixes creating flow rules from user-space after the
kernel space update based on Linux 5.7-rc1 .
Sponsored by: NVIDIA Networking
TLS RX support is modeled after TLS TX support. The basic structures and layouts
are almost identical, except that the send tag created filters RX traffic and
not TX traffic.
The TLS RX tag keeps track of past TLS records up to a certain limit,
approximately 1 Gbyte of TCP data. TLS records of same length are joined
into a single database record.
Regularly the HW is queried for TLS RX progress information. The TCP sequence
number gotten from the HW is then matches against the database of TLS TCP
sequence number records and lengths. If a match is found a static params WQE
is queued on the IQ and the hardware should immediately resume decrypting TLS
data until the next non-sequential TCP packet arrives.
Offloading TLS RX data is supported for untagged, prio-tagged, and
regular VLAN traffic.
MFC after: 1 week
Sponsored by: NVIDIA Networking
If the driver_version capability bit is enabled, send the driver
version to firmware after the init HCA command, for display purposes.
Example of driver version: "FreeBSD,mlx5_core,14.0.0,3.x-xxx"
Linux commits:
012e50e109fd27ff989492ad74c50ca7ab21e6a1
MFC after: 1 week
Sponsored by: NVIDIA Networking
Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets
are sent back to the vport. A unicast loopback packet is the packet
with destination MAC address the same as the source MAC address. For
multicast, the destination MAC address is in the vport's multicast
filter list.
Moreover, the local loopback is not needed if there is one or none
user space context.
After this patch, the raw ethernet unicast and multicast local
loopback are disabled by default. When there is more than one user
space context, the local loopback is enabled.
Note that when local loopback is disabled, raw ethernet packets are
not looped back to the vport and are forwarded to the next routing
level (eswitch, or multihost switch, or out to the wire depending on
the configuration).
Linux commits:
c85023e153e3824661d07307138fdeff41f6d86a
8978cc921fc7fad3f4d6f91f1da01352aeeeff25
MFC after: 1 week
Sponsored by: NVIDIA Networking
This change adds convenience functions to setup a flow steering rule based on
a TCP socket. The helper function gets all the address information from the
socket and returns a steering rule, to be used with HW TLS RX offload.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Previously flow steering tables and rules were only created and destroyed
at link up and down events, respectivly. Due to new requirements for adding
TLS RX flow tables and rules, the main flow steering table must always be
available as there are permanent redirections from the TLS RX flow table
to the vlan flow table.
MFC after: 1 week
Sponsored by: NVIDIA Networking
All packets must go through the indirection table, RQT,
because it is not possible to modify the RQN of the TIR
for direct dispatchment after it is created, typically
when the link goes up and down.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Add support to map an SQ to a specific schedule queue using a
special WQE as performance enhancement.
SQ remap operation is handled by a privileged internal queue, IQ,
and the mapping is enabled from one rate to another.
The transition from paced to non-paced should however always go
through FW.
MFC after: 1 week
Sponsored by: NVIDIA Networking
Internal send queues are regular sendqueues which are reserved for WQE commands
towards the hardware and firmware. These queues typically carry resync
information for ongoing TLS RX connections and when changing schedule queues
for rate limited connections.
The internal queue, IQ, code is more or less a stripped down copy
of the existing SQ managing code with exception of:
1) An optional single segment memory buffer which can be read or
written as a whole by the hardware, may be provided.
2) An optional completion callback for all transmit operations, may
be provided.
3) Does not support mbufs.
MFC after: 1 week
Sponsored by: NVIDIA Networking