Commit Graph

116 Commits

Author SHA1 Message Date
Hans Petter Selasky
c5e5a55f8f Fix i386 build WITH_OFED=YES. Remove some redundant KASSERTs.
Suggested by:	kib, ian
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2015-12-04 18:20:55 +00:00
Enji Cooper
4722f6ef28 Fix scope of bridge_header and bridge_pcix_cap in mthca_reset(..)
They're only used in the __linux__ case

Differential Revision: https://reviews.freebsd.org/D4332
MFC after: 1 week
Reported by: cppcheck
Reviewed by: hselasky
Sponsored by: EMC / Isilon Storage Division
2015-12-04 09:01:58 +00:00
Hans Petter Selasky
3d23c0a436 Convert the mlxen driver to use the BUSDMA(9) APIs instead of
vtophys() when loading mbufs for transmission and reception. While at
it all pointer arithmetic and cast qualifier issues were fixed, mostly
related to transmission and reception.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4284
2015-12-03 14:56:17 +00:00
Hans Petter Selasky
6111807106 Updated the mlx4 and mlxen drivers to the latest version, v2.1.6:
- Added support for dumping the SFP EEPROM content to dmesg.
- Fixed handling of network interface capability IOCTLs.
- Fixed race when loading and unloading the mlxen driver by applying
  appropriate locking.
- Removed two unused C-files.

MFC after:	1 week
Submitted by:	Mark Bloch <markb@mellanox.com>
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4283
2015-12-03 13:29:20 +00:00
Enji Cooper
ae9356f143 Don't leak work if __mlx4_register_vlan(..) fails in
mlx4_master_immediate_activate_vlan_qos(..)

MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D4203
Submitted by: Miles Olrich <miles.olrich@isilon.com>
Sponsored by: EMC / Isilon Storage Division
2015-11-19 01:08:16 +00:00
Hans Petter Selasky
0f5150a757 Fix integer to pointer of different size conversion warnings when
using GCC for 32-bit platforms. The integer size in this case is
hardcoded 64-bit while the pointer size is 32-bit.

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 10:12:20 +00:00
Hans Petter Selasky
3143f07779 Fix print formatting compile warnings for Sparc64 and PowerPC platforms.
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 09:56:25 +00:00
Hans Petter Selasky
2da3897d01 Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a
kernel programming interface module, KPI, to avoid confusion with the
existing Linux userspace binary compatibility shims. Bump the
FreeBSD_version number.

Reviewed by:	np @
Suggested by:	dumbbell @
Sponsored by:	Mellanox Technologies
2015-10-22 09:50:45 +00:00
Hans Petter Selasky
f556cede8a Merge LinuxKPI changes from DragonflyBSD:
- Define the kref structure identical to the one found in Linux.
- Update clients referring inside the kref structure.
- Implement kref_sub() for FreeBSD.

Reviewed by:	np @
Sponsored by:	Mellanox Technologies
2015-10-19 12:26:38 +00:00
Hans Petter Selasky
2404bdddf1 Merge LinuxKPI changes from DragonflyBSD:
- Add more list related functions and macros.
- Update the hlist_for_each_entry() macro to take one less argument.

Sponsored by:	Mellanox Technologies
2015-10-19 11:57:33 +00:00
Alexander V. Chernikov
fb373bc2b1 Fix build broken by r287861.
Spotted by:	zb
2015-09-16 15:40:08 +00:00
Alexander V. Chernikov
1fe201c322 Simplify the way of attaching IPv6 link-layer header.
Problem description:
How do we currently perform layer 2 resolution and header imposition:

For IPv4 we have the following chain:
  ip_output() -> (ether|atm|whatever)_output() -> arpresolve()

Lookup is done in proper place (link-layer output routine) and it is possible
  to provide cached lle data.

For IPv6 situation is more complex:
  ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() ->
    nd6_storelladdr()

We have ip6_ouput() which calls nd6_output() instead of link output routine.
nd6_output() does the following:
  * checks if lle exists, creates it if needed (similar to arpresolve())
  * performes lle state transitions (similar to arpresolve())
  * calls nd6_output_ifp() which pushes packets to link output routine along
    with running SeND/MAC hooks regardless of lle state
    (e.g. works as run-hooks placeholder).

After that, iface output routine like ether_output() calls nd6_storelladdr()
  which performs lle lookup once again.

As a result, we perform lookup twice for each outgoing packet for most types
  of interfaces. We also need to maintain runtime-checked table of 'nd6-free'
  interfaces (see nd6_need_cache()).

Fix this behavior by eliminating first ND lookup. To be more specific:
  * make all nd6_output() consumers use nd6_output_ifp() instead
  * rename nd6_output[_slow]() to nd6_resolve_[slow]()
  * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics,
    e.g. copy L2 address to buffer instead of pushing packet towards lower
    layers
  * Make all nd6_storelladdr() users use nd6_resolve()
  * eliminate nd6_storelladdr()

The resulting callchain is the following:
  ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve()

Error handling:
Currently sending packet to non-existing la results in ip6_<output|forward>
  -> nd6_output() -> nd6_output _lle() which returns 0.
In new scenario packet is propagated to <ether|whatever>_output() ->
  nd6_resolve() which will return EWOULDBLOCK, and that result
  will be converted to 0.

(And EWOULDBLOCK is actually used by IB/TOE code).

Sponsored by:		Yandex LLC
Differential Revision:	https://reviews.freebsd.org/D1469
2015-09-16 14:26:28 +00:00
Mark Johnston
4af587d062 Ensure that the MAD agent's delayed taskqueue is completely stopped
before proceeding. Otherwise, nothing prevents it from running after the
MAD agent struct has been been freed, and this results in a use-after-free
when the task's ta_pending count is incremented in the callout handler.

MFC after:	2 weeks
Sponsored by:	EMC / Isilon Storage Division
2015-09-15 23:56:31 +00:00
Navdeep Parhar
6b5c8394f1 Reinstate unify_tcp_port_space and associated code that was lost during
the last OFED update (r278886).

iWARP on FreeBSD is properly integrated with the network stack and the
iWARP drivers _never_ operate out of any private TCP port-space that is
invisible to the kernel.  Instead, an iWARP connection shows up as a TCP
socket (which is what it is) fully visible to the kernel and standard
tools like netstat, sockstat, etc.
2015-08-12 22:09:58 +00:00
Hans Petter Selasky
5884383f19 Avoid calling into the random subsystem before it is initialized.
Sponsored by:	Mellanox Technologies
2015-08-04 09:45:10 +00:00
Mark Johnston
e2e45da0e8 ib mad: fix an incorrect use of list_for_each_entry
In tf_dequeue(), if we reach the end of the list without finding a
non-cancelled element, "tmp" will be a pointer into the list head, so the
tmp->canceled check is bogus. Use a flag instead.

Submitted by:	Tao Liu <Tao.Liu@isilon.com>
Reviewed by:	hselasky
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D3244
2015-07-30 18:28:37 +00:00
Mateusz Guzik
f6f6d24062 Implement lockless resource limits.
Use the same scheme implemented to manage credentials.

Code needing to look at process's credentials (as opposed to thred's) is
provided with *_proc variants of relevant functions.

Places which possibly had to take the proc lock anyway still use the proc
pointer to access limits.
2015-06-10 10:48:12 +00:00
Gleb Smirnoff
dfd828e931 Add SIOCGI2C ioctl support to the driver. Would work only on ConnectX-3
with fresh firmware. The low level code is based on code provided by
Mellanox.

Thanks to Mellanox and their distributor Must (http://mustcompany.ru)
for providing hardware.

In collaboration with:	Andre Melkoumian <andre mellanox.com>
Reviewed by:		hselasky
Sponsored by:		Netflix
Sponsored by:		Nginx, Inc.
2015-05-27 13:42:28 +00:00
Hans Petter Selasky
d27c74649b Apply proper locking when iterating the multicast addresses and add a
missing check for NULL from a non-blocking "kzalloc()" function call.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Found by:	glebius @
2015-05-12 11:52:34 +00:00
Hans Petter Selasky
b7ba031ff7 Factor out mbuf hashing code from LAGG driver so that other network
drivers can use it. This avoids some code duplication. Add missing
default case to all switch statements while at it. Also move the
hashing of the IPv6 flow field to layer 4 because the IPv6 flow field
is constant on a per L4 connection basis and not on a per L3 network.

Differential Revision:	https://reviews.freebsd.org/D1987
Sponsored by:		Mellanox Technologies
MFC after:		1 month
2015-03-11 16:02:24 +00:00
Hans Petter Selasky
5a61df011e Ensure setting promiscious mode when a network interface is up, is
always non-blocking by not locking a SX type of mutex.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-03-10 21:17:10 +00:00
Hans Petter Selasky
2b8859521d Updates for the Mellanox ethernet driver
> List of fixes:
  * use correct format for GID printouts
  * double array indexing
  * spelling in printouts
  * void pointer arithmetic
  * allow more receive rings
  * correct maximum number of transmit rings
  * use "const" instead of "static" for constants
  * check for invalid VLAN tags
  * check for lack of IRQ resources
> Added more hardware specific defines
> Added more verbose printouts of firmware status codes

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-03-04 09:30:03 +00:00
Bjoern A. Zeeb
daa99f441e Try to unbreak NOIP and NOINET6 LINT builds after r278886
by placing appropriate #ifdefs around otherwise unused variables
or sections with functions called which are not available without
IPv6 support in the kernel.
2015-02-19 11:48:00 +00:00
Gleb Smirnoff
cc4a90c445 Globally enable -fms-extensions when building kernel with gcc, and remove
this option from all modules that enable it theirselves.
  In C mode -fms-extensions option enables anonymous structs and unions,
allowing us to use this C11 feature in kernel. Of course, clang supports
it without any extra options.

Reviewed by:	dim
2015-02-17 19:27:14 +00:00
Hans Petter Selasky
8a8f7d5bad Fix compilation of the SDP driver and a compile warning after r278886.
Also fix the kernel build rule for mlx4_exp.c.
This fixes the LINT kernel target for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 10:00:15 +00:00
Hans Petter Selasky
8a3fed4e54 Fix compilation when DEBUG is defined.
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 08:57:36 +00:00
Hans Petter Selasky
b5c1e0cb8d Update the infiniband stack to Mellanox's OFED version 2.1.
Highlights:
 - Multiple verbs API updates
 - Support for RoCE, RDMA over ethernet

All hardware drivers depending on the common infiniband stack has been
updated aswell.

Discussed with:	np @
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 08:40:27 +00:00
Hans Petter Selasky
271aa1089b The "frag_info" pointer is already pointing to an array index.
Don't index twice.

Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-02-16 17:05:59 +00:00
Hans Petter Selasky
d39d7c8636 Add missing linuxapi module dependencies and always use the FreeBSD
"MODULE_VERSION" macro definition. Remove the redefinition of the
"MODULE_VERSION" macro from the Linux kernel compatibility API.

MFC after:	1 month
Reported by:	np@
Sponsored by:	Mellanox Technologies
2015-01-19 21:53:00 +00:00
Hans Petter Selasky
e982e5c561 Start importing the basic OFED linux compatibility layer changes made
by dumbbell@ to be able to compile this layer as a dependency module.
Clean up some Makefiles and remove the no longer used OFED define.
Currently only i386 and amd64 targets are supported.

MFC after:		1 month
Sponsored by:		Mellanox Technologies
2015-01-17 16:36:39 +00:00
Hans Petter Selasky
5e3cd9e19c Use the M_SIZE() macro when possible.
MFC after:	3 days
Suggested by:	rwatson@
2015-01-08 14:58:54 +00:00
Hans Petter Selasky
fe657d68f5 Fixes and updates for the Linux compatibility layer:
- Remove unsupported "bus" field from "struct pci_dev".
- Fix logic inside "pci_enable_msix()" when the number of allocated
  interrupts are less than the number of available interrupts.
- Update header files included from "list.h".
- Ensure that "idr_destroy()" removes all entries before destroying
  the IDR root node(s).
- Set the "device->release" function so that we don't leak memory at
  device destruction.
- Use FreeBSD's "log()" function for certain debug printouts.
- Put parenthesis around arguments inside the min, max, min_t and max_t macros.
- Make sure we don't leak file descriptors by dropping the extra file
  reference counts done by the FreeBSD kernel when calling falloc()
  and fget_unlocked().

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-01-06 10:02:14 +00:00
Hans Petter Selasky
a94c3b7d98 Make sure callbacks being freed are not pending when the
"mlx4_en_deactivate_cq()" function returns.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-12-11 10:47:50 +00:00
Hans Petter Selasky
c25290420e Start process of removing the use of the deprecated "M_FLOWID" flag
from the FreeBSD network code. The flag is still kept around in the
"sys/mbuf.h" header file, but does no longer have any users. Instead
the "m_pkthdr.rsstype" field in the mbuf structure is now used to
decide the meaning of the "m_pkthdr.flowid" field. To modify the
"m_pkthdr.rsstype" field please use the existing "M_HASHTYPE_XXX"
macros as defined in the "sys/mbuf.h" header file.

This patch introduces new behaviour in the transmit direction.
Previously network drivers checked if "M_FLOWID" was set in "m_flags"
before using the "m_pkthdr.flowid" field. This check has now now been
replaced by checking if "M_HASHTYPE_GET(m)" is different from
"M_HASHTYPE_NONE". In the future more hashtypes will be added, for
example hashtypes for hardware dedicated flows.

"M_HASHTYPE_OPAQUE" indicates that the "m_pkthdr.flowid" value is
valid and has no particular type. This change removes the need for an
"if" statement in TCP transmit code checking for the presence of a
valid flowid value. The "if" statement mentioned above is now a direct
variable assignment which is then later checked by the respective
network drivers like before.

Additional notes:
- The SCTP code changes will be committed as a separate patch.
- Removal of the "M_FLOWID" flag will also be done separately.
- The FreeBSD version has been bumped.

MFC after:	1 month
Sponsored by:	Mellanox Technologies
2014-12-01 11:45:24 +00:00
Gleb Smirnoff
651e4e6a30 Merge from projects/sendfile: extend protocols API to support
sending not ready data:
o Add new flag to pru_send() flags - PRUS_NOTREADY.
o Add new protocol method pru_ready().

Sponsored by:	Nginx, Inc.
Sponsored by:	Netflix
2014-11-30 13:24:21 +00:00
Alexander V. Chernikov
74860d4f7c Do not return unlocked/unreferenced lle in arpresolve/nd6_storelladdr -
return lle flags IFF needed.
Do not pass rte to arpresolve - pass is_gateway flag instead.
2014-11-27 23:06:25 +00:00
Gleb Smirnoff
cfa6009e36 In preparation of merging projects/sendfile, transform bare access to
sb_cc member of struct sockbuf to a couple of inline functions:

sbavail() and sbused()

Right now they are equal, but once notion of "not ready socket buffer data",
will be checked in, they are going to be different.

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2014-11-12 09:57:15 +00:00
Hans Petter Selasky
4f8c17ab35 Fix compile warning by removing unused variable.
MFC:		3 days
Sponsored by:	Mellanox Technologies
2014-10-30 16:57:56 +00:00
Hans Petter Selasky
7cabdf161d Update the network interface baudrate integer according to the actual
line rate.

Submitted by:	jhb @
MFC after:	1 week
2014-10-24 16:39:01 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, hence
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
Hans Petter Selasky
2c6eb461a7 Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
  cancelling and starting a timeout.
- Implement more Linux scatterlist
  functions.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-15 13:40:29 +00:00
Hans Petter Selasky
15983afb04 Fix compile warning when compiling with GCC.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-07 10:04:25 +00:00
Gleb Smirnoff
a917ed7723 Mechanically convert to if_inc_counter(). 2014-09-27 20:39:24 +00:00
Hans Petter Selasky
7b8455ae34 Update code to use new network counter API.
Fix some minor compile warnings while at it.

Sponsored by:	Mellanox Technologies
Suggested by:	glebius@
MFC after:	1 week
2014-09-24 08:28:34 +00:00
Hans Petter Selasky
f02f742280 Hardware driver update from Mellanox Technologies, including:
- improved performance
 - better stability
 - new features
 - bugfixes

Supported HCAs:
 - ConnectX-2
 - ConnectX-3
 - ConnectX-3 Pro

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-23 12:37:01 +00:00
Hans Petter Selasky
9fd573c39d Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

Reviewed by:	adrian, rmacklem
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2014-09-22 08:27:27 +00:00
Hans Petter Selasky
72f3100047 Revert r271504. A new patch to solve this issue will be made.
Suggested by:	adrian @
2014-09-13 20:52:01 +00:00
Hans Petter Selasky
eb93b77ae4 Improve transmit sending offload, TSO, algorithm in general.
The current TSO limitation feature only takes the total number of
bytes in an mbuf chain into account and does not limit by the number
of mbufs in a chain. Some kinds of hardware is limited by two
factors. One is the fragment length and the second is the fragment
count. Both of these limits need to be taken into account when doing
TSO. Else some kinds of hardware might have to drop completely valid
mbuf chains because they cannot loaded into the given hardware's DMA
engine. The new way of doing TSO limitation has been made backwards
compatible as input from other FreeBSD developers and will use
defaults for values not set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-09-13 08:26:09 +00:00
Bjoern A. Zeeb
a10816090b Forward declare struct kiocb, which is only used for an unsued function
argument but not actually defined anywhere.

This fixes the compile complaining about
"declaration of 'struct kiocb' will not be visible outside of this function".

MFC after:	2 weeks
X-MFC with:	whatever changed caused the breakage ;-)
2014-08-29 14:47:05 +00:00
Hans Petter Selasky
c7818b48b6 - Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-08-27 13:21:53 +00:00