57 Commits

Author SHA1 Message Date
hselasky
c49cd7b58d Make draining a sendqueue more robust.
Add own state variable to track if a sendqueue is stopped or not.
This will prevent traffic from entering the sendqueue while it is
being destroyed.

Update drain function to wait for traffic to be transmitted before
returning when the link state is active.

Add extra checks in transmit path for stopped SQ's.

While at it:
- Use likely() for a mbuf pointer check.
- Remove redundant IFF_DRV_RUNNING check.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 12:02:40 +00:00
hselasky
ebf4085d99 Add runtime support for modifying the SQ and RQ completion event
moderation mode. The presence of this feature is indicated through the
firmware capabilities.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 11:11:49 +00:00
hselasky
08255e2b1b Update firmware interface structures and definitions adding support
for new features and commands.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 10:47:32 +00:00
hselasky
93748b89b8 Make a read only pointer constant.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:12:19 +00:00
hselasky
9ba03afd59 Add more comments regarding collection of statistics counters.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:11:03 +00:00
hselasky
c9d0547e87 Remove useless NULL checks.
NULL is not returned when allocating memory passing the M_WAITOK flag.

Submitted by:		trasz@
Differential Revision:  https://reviews.freebsd.org/D5772
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-02 09:41:54 +00:00
hselasky
c3180648b2 Add timer to watch the RQ when we are out of mbufs.
The firmware/hardware does not generate additional completion
events unless we post new buffers. Use a timer to try to post
more buffers in case we are temporarily out of mbufs. Else
the receive schedule completely stops.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:39:45 +00:00
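
The stall that c3180648b2 addresses can be modelled in a few lines. This is a minimal Python sketch of the idea in the commit message, not the driver's code; `ReceiveQueueModel`, `refill`, `on_timer` and the `alloc` callback are hypothetical names introduced only for illustration.

```python
class ReceiveQueueModel:
    """Toy receive queue: completions only happen for posted buffers."""

    def __init__(self, size, alloc):
        self.size = size              # number of buffer slots in the ring
        self.posted = 0               # buffers currently posted to "hardware"
        self.alloc = alloc            # callable: True if a buffer was obtained
        self.watchdog_armed = False   # stands in for a pending timer

    def refill(self):
        # Post buffers until the ring is full or allocation fails.
        while self.posted < self.size and self.alloc():
            self.posted += 1
        # With missing buffers no completion event may ever arrive to
        # trigger another refill, so fall back to a timer.
        self.watchdog_armed = self.posted < self.size

    def on_timer(self):
        # Timer callback: retry only if an earlier refill fell short.
        if self.watchdog_armed:
            self.refill()
```

Without the timer path, a refill that fails while the ring is empty would never be retried in this model, which matches the stall described in the commit message.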
hselasky
4f9623fe2d Add more firmware related structures and update existing ones in the
MLX5 core module. Update the set and query diagnostics counter API.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:28:50 +00:00
hselasky
3d96785c4f Query flow table capabilities according to the correct capability bit
for infiniband.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:26:25 +00:00
hselasky
0f38a0f9c7 Correct checksum fields in the "mlx5_mini_cqe8" structure. The fields
in question are currently not used.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:22:50 +00:00
hselasky
bef3529df5 Ensure the firmware is notified of any host memory allocation
failures. Else firmware commands may time out waiting for host
memory.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:20:13 +00:00
hselasky
c36023f206 When a firmware command times out do not free the command structure to
avoid use after free.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:15:40 +00:00
hselasky
0370535171 Set hardware stats flag to avoid double counting the number of incoming bytes.
Found by:	Ben RUBSON <ben.rubson@gmail.com>
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-29 16:35:52 +00:00
hselasky
94f8b79097 mlx5en: Fix duplicate mbuf free-by-code.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:57:48 +00:00
hselasky
ca67d9f237 mlx5en: Remove unused pdev pointer.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:55:38 +00:00
hselasky
8071bc2f2b mlx5en: Verify port type is ethernet before creating network device
Else the mlx5en driver might attach to infiniband ports.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:53:53 +00:00
hselasky
753de42ca9 mlx5en: Allow setting the software MTU size below 1500 bytes
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:51:31 +00:00
hselasky
76f28d3b21 mlx5en: Factor out common sendqueue code for use with rate limiting SQs.
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:47:16 +00:00
hselasky
0dc18f450b mlx5en: Properly declare doorbell lock for 32-bit CPUs.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:45:35 +00:00
hselasky
bf7a52f004 mlx5en: Optimise away duplicate UAR pointers.
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:40:45 +00:00
hselasky
be081c172c mlx5en: Make the mlx5e_open_cq() and mlx5e_close_cq() functions global.
Make some functions and structures global to allow for code reuse
when creating rate limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:39:15 +00:00
hselasky
4f9acf8620 mlx5en: Minor completion queue control path code refactor.
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:37:35 +00:00
hselasky
310b29a8ad mlx5en: Separate the sendqueue from using the mlx5e_channel structure.
This change allows for reusing the transmit path for so called
rate limited sendqueues. While at it optimise some pointer lookups
in the fast path.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:35:45 +00:00
hselasky
6ef474775d Update the MLX5 core module:
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:28:16 +00:00
hselasky
1fbad2b111 Increase the maximum RX/TX queue size. This allows for a RX/TX queue
size of 16384 mbufs. Previously the limit was 8192.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-22 13:43:25 +00:00
hselasky
d6358bfd37 Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. The bsddev
is only destroyed when the parent mlx5 module is unloaded. This fixes
a panic on listing sysctls which refer to strings in the bsddev after
the mlx5en module has been unloaded.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-09 07:43:15 +00:00
hselasky
7c8996f145 Switch to the new block based LRO input function for the mlx5en
driver. This change significantly increases the overall RX aggregation
ratio for heavily loaded networks handling 10-80 thousand simultaneous
connections.

Remove the turbo LRO code and all references to it which has now been
superseded by the tcp_lro_queue_mbuf() function.

Tested by:	Netflix
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-08 16:22:16 +00:00
hselasky
16acbc9102 Use correct Q-counter output array.
Sponsored by:	Mellanox Technologies
Approved by:	re (kib)
MFC after:	3 days
2016-06-23 09:23:37 +00:00
hselasky
52fa2498da Add SR-IOV guest support to the mlx5en driver.
This patch adds the missing pieces needed for device setup using the
mlx5en driver inside a virtual machine which is providing hardware
access through SR-IOV.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-06-07 13:58:52 +00:00
sephe
7acd138965 net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties
Reviewed by:	hps, erj, tuexen
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6688
2016-06-07 04:51:50 +00:00
hselasky
0d8f1c25a2 Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
hselasky
fef1306667 Verify one sysctl parameter at a time. When a mlx5en sysctl parameter
is updated only verify the changed one instead of all.

No functional change.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 07:07:27 +00:00
hselasky
4c05011d13 Optimise use of doorbell and remove redundant NOPs
Store the last doorbell write in the mlx5e_sq structure and write the
doorbell to the hardware when the transmit routine finishes
transmitting all queued mbufs.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:59:38 +00:00
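
The doorbell change in 4c05011d13 amounts to deferring a hardware write until a batch is done. The following Python sketch illustrates that pattern under stated assumptions: `SendQueueModel` and its members are hypothetical, and the `hw_writes` list stands in for real doorbell register writes.

```python
class SendQueueModel:
    """Toy send queue that batches doorbell writes."""

    def __init__(self):
        self.producer = 0             # producer counter
        self.pending_doorbell = None  # last doorbell value not yet written
        self.hw_writes = []           # stands in for writes to the device

    def post(self, packet):
        # Queue one packet and remember the doorbell value, but do not
        # touch the hardware yet.
        self.producer += 1
        self.pending_doorbell = self.producer

    def flush_doorbell(self):
        # Write the remembered value once, if there is one.
        if self.pending_doorbell is not None:
            self.hw_writes.append(self.pending_doorbell)
            self.pending_doorbell = None

    def transmit(self, packets):
        for packet in packets:
            self.post(packet)
        self.flush_doorbell()         # single write for the whole batch


sq = SendQueueModel()
sq.transmit(["a", "b", "c"])
print(sq.hw_writes)
```

In this model three queued packets cost one doorbell write instead of three, and an empty batch costs none.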
hselasky
64d010d533 Implement TX completion event interleaving.
This patch implements a sysctl which allows setting a factor, N, for
how many work queue elements can be generated before requiring a
completion event. When a completion event happens the code simulates N
completion events instead of only one. When draining a transmit queue,
N-1 NOPs are transmitted at most, to force generation of the final
completion event. Further, a timer runs every HZ ticks to flush
any remaining data off the transmit queue when tx_completion_fact
is greater than 1.

The goal of this feature is to reduce the PCI bandwidth needed when
transmitting data.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:54:58 +00:00
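
The "N-1 NOPs at most" bound in 64d010d533 follows from simple modular arithmetic. The sketch below is one reading of the commit message, not the driver's accounting: it assumes every N-th work queue element requests a completion event, and `pending` (a hypothetical name) counts the elements posted since the last such request.

```python
def nops_to_drain(pending: int, factor: int) -> int:
    """NOPs needed so the next completion event covers all pending elements."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # Pad up to the next multiple of factor; zero if already on a boundary.
    return (-pending) % factor


def completion_events(total: int, factor: int) -> int:
    """Completion events generated for `total` posted elements."""
    return total // factor


for pending in range(0, 9):
    print(pending, nops_to_drain(pending, 4))
```

With a factor of 4, five pending elements need three NOPs and eight need none, so the padding never exceeds N-1. A factor of 1 degenerates to one completion event per element and no padding.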
hselasky
bc38a8e23c Correct some error codes to native FreeBSD ones.
Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:01:06 +00:00
hselasky
4383f64240 Add function to detect the presence of a port module and use this
function to error out early when EEPROM access is attempted and no port
module is present. This also prevents error codes from filling up
dmesg.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:00:12 +00:00
sephe
d0428dd51c tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
hselasky
4ed23bd887 Fix an issue where the network adapter could be left in down state
after changing the HW LRO sysctl when previously in up state.

Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4941
2016-01-19 10:24:47 +00:00
hselasky
a3ba15952b Add clarifying comment about CQE zipping.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4940
2016-01-19 10:19:33 +00:00
hselasky
0b8f89db10 Declare local variables at top of function.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4939
2016-01-19 10:17:24 +00:00
hselasky
de7ff988c8 Allow RX and TX pause frames to be set through ifconfig.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4817
2016-01-19 10:10:02 +00:00
hselasky
a34806ae5e Add support for modifying coalescing parameters at runtime.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-12-30 15:01:47 +00:00
hselasky
3e461a3c0e Allow I2C to read address 0x51 as well as address 0x50.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:58:55 +00:00
hselasky
8646e68235 10G ER/LR should present itself as LR.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:54:08 +00:00
hselasky
e439783711 Add support for CQE zipping. CQE zipping reduces PCI overhead by
coalescing and zipping multiple CQEs into a single merged CQE. The
feature is enabled by default and can be disabled by a sysctl.

To implement this feature, mlx5_cqwq_pop() has been separated from
mlx5e_get_cqe().

MFC after:	1 week
Submitted by:	Mark Bloch <markb@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4598
Sponsored by:	Mellanox Technologies
2015-12-28 18:50:18 +00:00
hselasky
078653f474 Add support for sysctl tunables to 10-stable and older. Pushed through
head first to simplify driver maintenance.

MFC after:	1 week
Submitted by:	Drew Gallatin <gallatin@freebsd.org>
Differential Revision:	https://reviews.freebsd.org/D4552
Sponsored by:	Mellanox Technologies
2015-12-28 18:36:00 +00:00
hselasky
967ec3ab7f Make the eeprom dump function more readable and rename variables for
better clarity.

MFC after:	1 week
Submitted by:	Daria Genzel <dariaz@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4551
Sponsored by:	Mellanox Technologies
2015-12-28 18:28:18 +00:00
hselasky
3f5871ee5c Update the mlx5 shared driver code to the latest version, which
includes the following list of changes:

- Added eswitch ACL table management
  Introduce API for managing ACL table.
  This API includes the following features:
  1) vlan filter - for VST/VGT+ support.
  2) spoofcheck.
  3) robust functionality to allow/drop general untagged/tagged traffic.
  4) support for both ingress and egress ACL types.

- Added loopback filter to the vacl table.

- Added multicast list set in the vPort context

- Added promiscuous mode set in the vPort context

- Set the vlan list in vPort context
  1) Check caps to ensure the VLAN list is not longer than FW supports
  2) Set MODIFY_NIC_VPORT_CONTEXT command

- Changed MLX5_EEPROM_MAX_BYTES from 48 to 32 so that a single EEPROM
  reading cannot cross the 128-byte boundary. Previously reading the
  MCIA register was done in batches of 48 bytes. The third reading
  would then run past byte 127, which means that part of the low
  page and part of the high page would be read at the same time, which
  created a bug:
    1st: 0-47 bytes
    2nd: 48-95 bytes
    3rd: 96-143 bytes

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4411
2015-12-07 13:16:48 +00:00
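
The EEPROM read-size change in 3f5871ee5c can be checked with a little arithmetic. The sketch below is illustrative Python, not driver code; `chunk_ranges` and `crosses_page` are hypothetical helpers, the 128-byte page size comes from the commit message, and the 256-byte total is an assumed example length.

```python
PAGE_SIZE = 128  # low page covers bytes 0-127, per the commit message


def chunk_ranges(total, chunk):
    """Inclusive (first, last) byte ranges for back-to-back reads."""
    return [(start, min(start + chunk, total) - 1)
            for start in range(0, total, chunk)]


def crosses_page(first, last, page=PAGE_SIZE):
    """True when a single read touches two different pages."""
    return first // page != last // page


for size in (48, 32):
    straddling = [r for r in chunk_ranges(256, size) if crosses_page(*r)]
    print(size, straddling)
```

With 48-byte reads the third range, 96-143, straddles the boundary. Because 32 divides 128 evenly, no 32-byte read in this model touches both pages.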
hselasky
5de8189d2a Add full support for Receive Side Scaling, RSS, to the mlx5en
driver. This includes binding all interrupt and worker threads
according to the RSS configuration, setting up correct Toeplitz
hashing keys as given by RSS and setting the correct mbuf
hashtype for all received traffic.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4410
2015-12-07 12:38:51 +00:00
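
For readers unfamiliar with the Toeplitz hashing mentioned in 5de8189d2a, here is a minimal Python sketch of the generic algorithm. It is not taken from the driver, which per the commit message uses keys supplied by the RSS configuration; `EXAMPLE_KEY` is a made-up placeholder, and the flow layout (source address, destination address, source port, destination port) is an assumed, commonly used input order for TCP over IPv4.

```python
def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        byte_index, bit_index = divmod(i, 8)
        # Walk the input MSB first; for every set bit, XOR in the 32-bit
        # window of the key that starts at the same bit offset.
        if data[byte_index] & (0x80 >> bit_index):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


# Placeholder 40-byte key for illustration only, NOT a real RSS key.
EXAMPLE_KEY = bytes(range(1, 41))

# Assumed input layout: src IPv4, dst IPv4, src port, dst port.
flow = (bytes([192, 0, 2, 1]) + bytes([198, 51, 100, 7])
        + (1234).to_bytes(2, "big") + (80).to_bytes(2, "big"))
print(hex(toeplitz_hash(EXAMPLE_KEY, flow)))
```

The hash is linear over XOR, so the hash of `a ^ b` equals the hash of `a` XORed with the hash of `b`. A hardware implementation and a software one agree only when they use the same key and the same input layout.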
hselasky
7d81938ea5 Add support for setting the TX moderation mode via a sysctl entry. TX
completion events can be moderated in the same way as RX completion
events. Expose this functionality by a sysctl variable.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4409
2015-12-07 11:04:50 +00:00