Commit Graph

163 Commits

Author SHA1 Message Date
hselasky
bcaf4a6169 Add timeout handle to commands with callback in mlx5core.
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get firmware response. Add delayed callback timeout work
before posting the command to firmware. In case of real firmware
command completion we will cancel the delayed work. In case of
firmware command timeout the callback timeout handler will be called
and it will simulate firmware completion with timeout error.

linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:41:29 +00:00
hselasky
deeafae0ca Fix potential deadlock in command mode change in mlx5core.
Call command completion handler in case of timeout when working in
interrupts mode. Avoid flushing the commands workqueue after acquiring
the semaphores to prevent a potential deadlock.

linux commit commit 9cba4ebcf374c3772f6eb61f2d065294b2451b49

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:35:28 +00:00
hselasky
7525588e0c Use a macro in mlx5_command_str() instead of copying OP name.
linux commit 42ca502e179d0654ef441333a9d0f35c948734f3

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:29:30 +00:00
hselasky
13f5c3c4e7 Disable unsupported disassociate ucontext functionality in mlx5ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:03:31 +00:00
hselasky
5f8973a373 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
hselasky
e8b153d235 Move the mlx5 core device pointer first in the mlx5en priv. This help simplify
checks to recognize own network devices when using mlx5ib. This patch fixes
an issues where mlx5ib fails to recognize mceX network devices for use with
RoCE.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-30 12:38:06 +00:00
eadler
7ec20ea487 sys/dev/mlx[45]: fix uses of 1 << 31
Reviewed by:		kib (D13858)
2018-01-12 06:36:44 +00:00
kib
9538d0f44b mlx5en: Avoid SFENCe on x86
The IA32 memory model guarantees that all writes are seen in the program
order.  Also, any access to the uncacheable memory flushes the store
buffers.  As the consequence, SFENCE instruction is (almost) never needed,
in particular, it is not needed to ensure the correct order of updates as
seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() to only emit compiler barriers
on x86 there.  Other architectures get the right barrier instruction as
well.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-12-19 14:11:41 +00:00
kib
fcb3a5a8a1 Implement hardware mlx5(4) rx timestamps.
Driver support is only provided for ConnectX4/5.

System-time timestamp is calculated based on the free-running counter
timestamp provided by hardware.  Driver periodically samples the
counter to calibrate it against the system clock and uses linear
interpolation to convert.  Stability of the crystal which drives the
clock is +-50 ppm at the operational temperature, which makes the
algorithm good enough.

The calculation is somewhat delicate because all values are 64bit and
overflow the naive formula for linear interpolation.  The calculation
drops the least significant bits in advance, see the PREC shift in
mlx5_mbuf_tstmp().

Hardware stamps can be turned off by 'ifconfig mceN -hwrxtsmp'.  Buggy
firmware might result in small but visible errors in the reported
timestamps, detectable e.g. by nonsensical (negative) RTT values for
LAN pings.

Reviewed by:	gallatin, hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D12638
2017-11-29 10:04:11 +00:00
hselasky
5b03215ba8 Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
hselasky
c6f05b2594 Merge ^/head r325842 through r325998. 2017-11-19 12:36:03 +00:00
hselasky
9e2bb863a3 Remove duplicate static function prototype to fix compilation of
mlx5_fs_tree.c after r325638 when using GCC.

Found by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-18 20:32:09 +00:00
hselasky
7f64b39d2d Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
hselasky
e949bc6053 Make sure the ib_wr_opcode enum is signed by adding a negative dummy element.
Different compilers may optimise the enum type in different ways. This ensures
coherency when range checking the value of enums in ibcore.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:51:37 +00:00
hselasky
be189b8556 The new mlx5ib(4) module requires some existing values to be redefined.
Sponsored by:	Mellanox Technologies
2017-11-10 15:28:17 +00:00
hselasky
e5cba24a64 Update mlx5ib(4) to match Linux 4.9 and the new ibcore APIs.
Sponsored by:	Mellanox Technologies
2017-11-10 15:02:17 +00:00
hselasky
75528d7f53 Add more and update existing mlx5 core firmware structure definitions and bits.
This change is part of coming ibcore and mlx5ib updates.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:39:03 +00:00
hselasky
16ff114873 Expose the current hardware MTU in mlx5en(4) as a separate entry
in the sysctl tree.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:19:22 +00:00
hselasky
3d7404245e Add support for configuring local multicast and unicast data traffic loopback
in mlx5en(4) driver via the sysctl interface.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:14:54 +00:00
hselasky
f50f6a1c2b Add support for disabling and enabling RX and TX DMA rings in mlx5en(4).
This is useful for supporting setups similar to Netmap.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:10:41 +00:00
hselasky
8fb8cd757b Make physical address of init segment available in the priv of mlx5 core.
This change is needed by mlx5ib(4).

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:02:12 +00:00
hselasky
f804cac481 Add API function to query port performance counters for infiniband and RoCE
traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:58:49 +00:00
hselasky
e6f4ef02bb Add API functions to query and modify local loopback of multicast and
unicast traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:56:11 +00:00
hselasky
813f3e5b94 Add API function to query virtual port counters in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:53:53 +00:00
hselasky
b77e810931 Add API functions to modify the transport interface send object, TIS,
in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:50:08 +00:00
hselasky
c3fb19b1ce Add API functions to set and query dropless port mode in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:44:12 +00:00
hselasky
46269204ab Prevent mlx5 core from accessing host memory after shutdown by disabling
PCI busmaster.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:40:27 +00:00
hselasky
a75aa3bb37 Set ATOMIC endian mode in mlx5 core.
The hardware is capable of 2 requestor endianness modes for standard 8
byte atomics: BE (0x0) and host endianness (0x1). Read the supported
modes from hca atomic capabilities and configure HW to host endianness
mode if supported.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:38:43 +00:00
hselasky
09f43f2bf3 Add const keyword to input-only argument in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:30:14 +00:00
hselasky
271fa74c14 Make local variable 64-bits to avoid masking away bits in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:28:23 +00:00
hselasky
a511b8c8a5 Implement support for decoding general port notification event in
the mlx5 core module.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:25:29 +00:00
hselasky
ca41ce2185 Refactor the flowsteering APIs used by mlx5en(4). This change is needed by
the coming ibcore and mlx5ib updates in order to support traffic redirection
to so-called raw ethernet QPs.

Remove unused E-switch related routines and files while at it.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 09:49:08 +00:00
hselasky
9f6549f2c3 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device present. This resolves
interopability issues between iWarp and RoCE in ibcore

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
hselasky
5819ce8c32 Use common rdma_ip2gid() function instead of custom mlx5_ip2gid() one.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-10-10 12:24:52 +00:00
hselasky
f74bf3f764 Make sure the doorbell lock is valid for the i386 version
of the mlx5en(4) driver.

Tested by:		gallatin @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-10-02 12:20:55 +00:00
hselasky
d2fe78f981 Compile fixes for LINT on 32-bit platforms.
MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-24 08:09:42 +00:00
hselasky
4be93617fa Add new mlx5ib(4) driver to the kernel source tree which supports
Remote DMA over Converged Ethernet, RoCE, for the ConnectX-4 series of
PCI express network cards.

There is currently no user-space support and this driver only supports
kernel side non-routable RoCE V1. The krping kernel module can be used
to test this driver. Full user-space support including RoCE V2 will be
added as part of the ongoing upgrade to ibcore from Linux 4.9. Otherwise
this driver is feature equivalent to mlx4ib(4). The mlx5ib(4) kernel
module will only be built when WITH_OFED=YES is specified.

MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-23 12:09:37 +00:00
hselasky
32b0d28145 Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:49:36 +00:00
hselasky
5215ede46a Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by:		gallatin@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:36:57 +00:00
hselasky
f54e71e0aa Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the belonging dma tag and does not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
hselasky
687ade08e2 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
markj
881fc6be99 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
hselasky
18ee93d3bc Improve sysadmin visibility of physical port error counters in the
mlx5en driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-04-28 19:38:57 +00:00
hselasky
04a2576fbe Make "desc" pointer non-constant inside the mlx5_core_diagnostics_entry
structure. This fixes compilation with amd64-xtoolchain-gcc.

PR:			216588
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-30 08:35:15 +00:00
hselasky
85824cd8d0 Use the busdma API to allocate all DMA-able memory.
The MLX5 driver has four different types of DMA allocations which are
now allocated using busdma:

1) The 4K firmware DMA-able blocks. One busdma object per 4K allocation.
2) Data for firmware commands use the 4K firmware blocks split into four 1K blocks.
3) The 4K firmware blocks are also used for doorbell pages.
4) The RQ-, SQ- and CQ- DMA rings. One busdma object per allocation.

After this patch the mlx5en driver can be used with DMAR enabled in
the FreeBSD kernel.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:46:55 +00:00
hselasky
d01788d7e5 Add support for device surprise removal and other PCI errors.
- When device disappears from PCI indicate error device state and:
  1) Trigger command completion for all pending commands
  2) Prevent new commands from executing and return:
     - success for modify and remove/cleanup commands
     - failure for create/query commands
  3) When reclaiming pages for a device in error state don't ask FW to
     return all given pages, just release the allocated memory

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:29:33 +00:00
hselasky
120993bc0f Wait for all VFs pages to be reclaimed before closing EQ pages.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:19:06 +00:00
hselasky
4afcc8e75a Rename struct fw_page into struct mlx5_fw_page as a preparation step
for adding busdma support.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:03:58 +00:00
hselasky
c2690f22bd Fix command completion with callback scenario.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:56:03 +00:00
hselasky
f062e92852 Minor code refactor as a preparation step for suprise removal of CX-4
PCI device(s), changes:
- alloc_entry() now clears bit for page slot entry aswell
- update of cmd->ent_arr[] is now under cmd->alloc_lock
- complete command if alloc_entry() fails

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:47:53 +00:00
hselasky
c334e47ed3 Use ffs() to scan for first bit instead of using a for() loop.
Minor code refactor while at it.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:36:49 +00:00
hselasky
cf42c7427c Make fw_pages statistics counter 64-bit to avoid overflow.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:20:38 +00:00
hselasky
51619b4ff2 Add support for reading advanced diagnostic counters.
By default reading the diagnostic counters is disabled. The firmware
decides which counters are supported and only those supported show up
in the dev.mce.X.diagnostics sysctl tree.

To enable reading of diagnostic counters set one or more of the
following sysctls to one:

dev.mce.X.conf.diag_general_enable=1
dev.mce.X.conf.diag_pci_enable=1

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:03:50 +00:00
hselasky
73d02b3c53 Enforce reading the consumer and producer counters once to ensure
consistent return values from the mlx5e_sq_has_room_for()
function. The two counters are incremented by different threads under
different locks.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 08:32:50 +00:00
hselasky
195b63cfea Remove superfluous return statement.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 15:47:29 +00:00
hselasky
f0361cf78b Allow transmit packet bufring in software to be disabled.
- Add new sysctl node to control the transmit packet bufring.

- Add optimised version of the transmit routine which output packets
directly to the DMA ring instead of using bufring in case the transmit
lock is congested. This can reduce the number of taskswitches which in
turn influence the overall system CPU usage, depending on the
workload.

- Add " TX" suffix to debug name for transmit mutexes to silence some
witness warnings about aquiring duplicate locks having same name.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
Suggested by:		gallatin @
2017-01-20 15:45:21 +00:00
hselasky
c49cd7b58d Make draining a sendqueue more robust.
Add own state variable to track if a sendqueue is stopped or not.
This will prevent traffic from entering the sendqueue while it is
being destroyed.

Update drain function to wait for traffic to be transmitted before
returning when the link state is active.

Add extra checks in transmit path for stopped SQ's.

While at it:
- Use likely() for a mbuf pointer check.
- Remove redundant IFF_DRV_RUNNING check.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 12:02:40 +00:00
hselasky
ebf4085d99 Add runtime support for modifying the SQ and RQ completion event
moderation mode. The presence of this feature is indicated through the
firmware capabilities.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 11:11:49 +00:00
hselasky
08255e2b1b Update firmware interface structures and definitions adding support
for new features and commands.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 10:47:32 +00:00
hselasky
93748b89b8 Make a read only pointer constant.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:12:19 +00:00
hselasky
9ba03afd59 Add more comments regarding collection of statistics counters.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:11:03 +00:00
hselasky
c9d0547e87 Remove useless NULL checks.
NULL is not returned when allocating memory passing the M_WAITOK flag.

Submitted by:		trasz @
Differential Revision:  https://reviews.freebsd.org/D5772
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-02 09:41:54 +00:00
hselasky
c3180648b2 Add timer to watch the RQ when we are out of mbufs.
The firmware/hardware does not generate additional completion
events unless we post new buffers. Use a timer to try to post
more buffers in case we are temporarily out of mbufs. Else
the receive schedule completely stops.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:39:45 +00:00
hselasky
4f9623fe2d Add more firmware related structures and update existing ones in the
MLX5 core module. Update the set and query diagnostics counter API.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:28:50 +00:00
hselasky
3d96785c4f Query flow table capabilities according to the correct capability bit
for infiniband.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:26:25 +00:00
hselasky
0f38a0f9c7 Correct checksum fields in the "mlx5_mini_cqe8" structure. The fields
in question are currently not used.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:22:50 +00:00
hselasky
bef3529df5 Ensure the firmware is notified of any host memory allocation
failures. Else firmware commands may time out waiting for host
memory.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:20:13 +00:00
hselasky
c36023f206 When a firmware command times out do not free the command structure to
avoid use after free.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:15:40 +00:00
hselasky
0370535171 Set hardware stats flag to avoid double counting the number of incoming bytes.
Found by:	Ben RUBSON <ben.rubson@gmail.com>
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-29 16:35:52 +00:00
hselasky
94f8b79097 mlx5en: Fix duplicate mbuf free-by-code.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:57:48 +00:00
hselasky
ca67d9f237 mlx5en: Remove unused pdev pointer.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:55:38 +00:00
hselasky
8071bc2f2b mlx5en: Verify port type is ethernet before creating network device
Else the mlx5en driver might attach to infiniband ports.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:53:53 +00:00
hselasky
753de42ca9 mlx5en: Allow setting the software MTU size below 1500 bytes
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:51:31 +00:00
hselasky
76f28d3b21 mlx5en: Factor out common sendqueue code for use with rate limiting SQs.
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:47:16 +00:00
hselasky
0dc18f450b mlx5en: Properly declare doorbell lock for 32-bit CPUs.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:45:35 +00:00
hselasky
bf7a52f004 mlx5en: Optimise away duplicate UAR pointers.
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:40:45 +00:00
hselasky
be081c172c mlx5en: Make the mlx5e_open_cq() and mlx5e_close_cq() functions global.
Make some functions and structures global to allow for code reuse
when creating rate limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:39:15 +00:00
hselasky
4f9acf8620 mlx5en: Minor completion queue control path code refactor.
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:37:35 +00:00
hselasky
310b29a8ad mlx5en: Separate the sendqueue from using the mlx5e_channel structure.
This change allows for reusing the transmit path for so called
rate limited senqueues. While at it optimise some pointer lookups
in the fast path.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:35:45 +00:00
hselasky
6ef474775d Update the MLX5 core module:
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:28:16 +00:00
hselasky
1fbad2b111 Increase the maximum RX/TX queue size. This allows for a RX/TX queue
size of 16384 mbufs. Previously the limit was 8192.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-22 13:43:25 +00:00
hselasky
d6358bfd37 Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. Only when
the parent mlx5 module is unloaded the bsddev is destroyed. This fixes
a panic on listing sysctls which refer strings in the bsddev after the
mlx5en module has been unloaded.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-09 07:43:15 +00:00
hselasky
7c8996f145 Switch to the new block based LRO input function for the mlx5en
driver. This change significantly increases the overall RX aggregation
ratio for heavily loaded networks handling 10-80 thousand simultaneous
connections.

Remove the turbo LRO code and all references to it which has now been
superceeded by the tcp_lro_queue_mbuf() function.

Tested by:	Netflix
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-08 16:22:16 +00:00
hselasky
16acbc9102 Use correct Q-counter output array.
Sponsored by:	Mellanox Technologies
Approved by:	re (kib)
MFC after:	3 days
2016-06-23 09:23:37 +00:00
hselasky
52fa2498da Add SR-IOV guest support to the mlx5en driver.
This patch adds the missing pieces needed for device setup using the
mlx5en driver inside a virtual machine which is providing hardware
access through SR-IOV.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-06-07 13:58:52 +00:00
sephe
7acd138965 net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties
Reviewed by:	hps, erj, tuexen
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6688
2016-06-07 04:51:50 +00:00
hselasky
0d8f1c25a2 Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
hselasky
fef1306667 Verify one sysctl parameter at a time. When a mlx5en sysctl parameter
is updated only verify the changed one instead of all.

No functional change.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 07:07:27 +00:00
hselasky
4c05011d13 Optimise use of doorbell and remove redundant NOPs
Store the last doorbell write in the mlx5e_sq structure and write the
doorbell to the hardware when the transmit routine finishes
transmitting all queued mbufs.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:59:38 +00:00
hselasky
64d010d533 Implement TX completion event interleaving.
This patch implements a sysctl which allows setting a factor, N, for
how many work queue elements can be generated before requiring a
completion event. When a completion event happens the code simulates N
completion events instead of only one. When draining a transmit queue,
N-1 NOPs are transmitted at most, to force generation of the final
completion event.  Further a timer is running every HZ ticks to flush
any remaining data off the transmit queue when the tx_completion_fact
> 1.

The goal of this feature is to reduce the PCI bandwidth needed when
transmitting data.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:54:58 +00:00
hselasky
bc38a8e23c Correct some error codes to native FreeBSD ones.
Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:01:06 +00:00
hselasky
4383f64240 Add function to detect the presence of a port module and use this
function to error out early when no port module is present and doing
eeprom access. This also prevents error codes from filling up in
dmesg.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:00:12 +00:00
sephe
d0428dd51c tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
hselasky
4ed23bd887 Fix an issue where the network adapter could be left in down state
after changing the HW LRO sysctl when previously in up state.

Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4941
2016-01-19 10:24:47 +00:00
hselasky
a3ba15952b Add clarifying comment about CQE zipping.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4940
2016-01-19 10:19:33 +00:00
hselasky
0b8f89db10 Declare local variables at top of function.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4939
2016-01-19 10:17:24 +00:00
hselasky
de7ff988c8 Allow RX and TX pause frames to be set through ifconfig.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4817
2016-01-19 10:10:02 +00:00
hselasky
a34806ae5e Add support for modifying coalescing parameters runtime.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-12-30 15:01:47 +00:00
hselasky
3e461a3c0e Allow I2C to read address 0x51 as well as address 0x50.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:58:55 +00:00
hselasky
8646e68235 10G ER/LR should present itself as LR.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:54:08 +00:00