Commit Graph

181 Commits

Author SHA1 Message Date
Justin Hibbits
12e99b63d2 ofed: Fix a logic inversion from IfAPI conversion
Reported by:	bartosz.sobczak_intel.com
Fixes:		3e142e0767 ("ofed: Mechanically convert to IfAPI")
Sponsored by:	Juniper Networks, Inc.
2023-04-19 11:56:25 -04:00
Zhenlei Huang
fc6c93b6a5 infiniband: Opt-in for net epoch
This is counterpart to e87c494015, which did the same for ethernet.

Suggested by:	hselasky
Reviewed by:	hselasky, kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D39405
2023-04-06 00:08:23 +08:00
Justin Hibbits
3e142e0767 ofed: Mechanically convert to IfAPI
Summary:
Because of the intricacies of this code it wasn't purely scripted, but
instead hand-mechanical.

Reviewed by:	hselasky
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38560
2023-03-24 10:04:33 -04:00
Gordon Bergling
d048e8c619 ofed: Fix a typo in a source code comment
- s/it it/it to/

MFC after:	3 days
2022-04-09 14:39:36 +02:00
Hans Petter Selasky
1aa593b90c ibcore: Add support for NDR link speed.
Add new IBTA speed NDR, supporting signaling rate of 100Gb.

Linux commit:
c7adf7717301558e8852949d8e3dc3748d1a4a97

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-21 09:35:19 +01:00
Hans Petter Selasky
a30f71704e mlx5ib: Add support for parsing udata in mlx5_ib_create_flow().
Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)

This fixes creating flow rules from user-space after the
kernel space update based on Linux 5.7-rc1 .

Sponsored by:	NVIDIA Networking
2022-02-10 11:17:42 +01:00
Hans Petter Selasky
b633e08c70 ibcore: Kernel space update based on Linux 5.7-rc1.
Overview:

This is the first stage of a RDMA stack upgrade introducing kernel
changes only based on Linux 5.7-rc1.

This patch is based on about four main areas of work:
- Update of the IB uobjects system:
  - The memory holding so-called AH, CQ, PD, SRQ and UCONTEXT objects
    is now managed by ibcore. This also require some changes in the
    kernel verbs API. The updated verbs changes are typically about
    initialize and deinitialize objects, and remove allocation and
    free of memory.

- Update of the uverbs IOCTL framework:
  - The parsing and handling of user-space commands has been
    completely refactored to integrate with the updated IB uobjects
    system.

- Various changes and updates to the generic uverbs interfaces in
  device drivers including the new uAPI surface.

- The mlx5_ib_devx.c in mlx5ib and related mlx5 core changes.

Dependencies:

- The mlx4ib driver code has been updated with the minimum changes
needed.

- The mlx5ib driver code has been updated with the minimum changes
needed including DV support.

Compatibility:

- All user-space facing APIs are backwards compatible after this
  change.

- All kernel-space facing RDMA APIs are backwards compatible after
  this change, with exception of ib_create_ah() and ib_destroy_ah()
  which takes a new flag.

- The "ib_device_ops" structure exist, but only contains the driver ID
  and some structure sizes.

Differences from Linux:

- Infiniband drivers must use the INIT_IB_DEVICE_OPS() macro to set
  the sizes needed for allocating various IB objects, when adding
  IB device instances.

Security:

- PRIV_NET_RAW is needed to use raw ethernet transmit features.
- PRIV_DRIVER is needed to use other privileged operations.

Based on upstream Linux, Torvalds (5.7-rc1):
8632e9b5645bbc2331d21d892b0d6961c1a08429

MFC after:	1 week
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31149
Sponsored by:	NVIDIA Networking
2021-07-28 13:28:29 +02:00
Hans Petter Selasky
f60da09dbb ibcore: Add some functions and definitions for selecting and querying retryable ucontext cleanup.
Linux commit:
1c77483e4c50339b0306572167ccbff6b55d051b

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:34 +02:00
Hans Petter Selasky
c3987b8ea7 ibcore: Declare ib_post_send() and ib_post_recv() arguments const
Since neither ib_post_send() nor ib_post_recv() modify the data structure
their second argument points at, declare that argument const. This change
makes it necessary to declare the 'bad_wr' argument const too and also to
modify all ULPs that call ib_post_send(), ib_post_recv() or
ib_post_srq_recv(). This patch does not change any functionality but makes
it possible for the compiler to verify whether the
ib_post_(send|recv|srq_recv) really do not modify the posted work request.

Linux commit:
f696bf6d64b195b83ca1bdb7cd33c999c9dcf514
7bb1fafc2f163ad03a2007295bb2f57cfdbfb630
d34ac5cd3a73aacd11009c4fc3ba15d7ea62c411

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:33 +02:00
Hans Petter Selasky
79b817084c ibcore: Implement ib_uverbs_get_ucontext_file().
Expose ib_ucontext from a given ib_uverbs_file. Drivers that use the ioctl(9)
API may have the ib_uverbs_file and need a way to get the related ib_ucontext
from it, this is enabled by this patch.

Downstream patches from this series will use it.

Linux commit:
7dc08dcfc8c86cb4457e383734ff6844ddaff876

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:33 +02:00
Hans Petter Selasky
d92a9e5604 ibcore: Simplify ib_modify_qp_is_ok().
All callers to ib_modify_qp_is_ok() provides enum ib_qp_state makes the
checks of out-of-scope redundant. Let's remove them together with updating
function signature to return boolean result.

While at it remove unused "ll" parameter from ib_modify_qp_is_ok().

Linux commit:
19b1f54099b6ee334acbfbcfbdffd1d1f057216d
d31131bba5a1630304c55ea775c48cc84912ab59

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:32 +02:00
Hans Petter Selasky
0c13880ccc ibcore: Support rate limit for packet pacing
Add new member rate_limit to ib_qp_attr which holds the packet pacing
rate in kbps, 0 means unlimited.

IB_QP_RATE_LIMIT is added to ib_attr_mask and could be used by RAW
QPs when changing QP state from RTR to RTS, RTS to RTS.

Linux commit:
528e5a1bd3f0e9b760cb3a1062fce7513712a15d

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:32 +02:00
Hans Petter Selasky
20fea7ac64 ibcore: Define option to set ack timeout.
Define new option in 'rdma_set_option' to override calculated QP timeout
when requested to provide QP attributes to modify a QP.

At the same time, pack tos_set to be bitfield.

Linux commit:
2c1619edef61a03cb516efaa81750784c3071d10

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:32 +02:00
Hans Petter Selasky
912e98cede ibcore: Protect against concurrent access to hardware stats.
Currently access to hardware stats buffer isn't protected, this can
result in multiple writes and reads at the same time to the same
memory location. This can lead to providing an incorrect value to
the user. Add a mutex to protect against it.

Linux commit:
e945130b52bea65d15f9bdf54949d4cb7a88db7f

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:31 +02:00
Hans Petter Selasky
4238b4a7a2 ibcore: Introduce ib_port_phys_state enum.
In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.

Linux commit:
72a7720fca37fec0daf295923f17ac5d88a613e1

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:31 +02:00
Hans Petter Selasky
e25bcf8d77 ibcore: Add rdma_reject_msg() helper function.
rdma_reject_msg() returns a pointer to a string message associated with
the transport reject reason codes.

Linux commit:
77a5db13153906a7e00740b10b2730e53385c5a8

MFC after:	1 week
Reviewed by:	kib
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-07-12 14:22:30 +02:00
Bjoern A. Zeeb
1411f52fac mlx4/OFED: replace the struct net_device with struct ifnet
Given all the code does operate on struct ifnet, the last step in this
longer series of changes now is to rename struct net_device to
struct ifnet (that is what it was defined to in the LinuxKPi code).
While mlx4 and OFED are "shared" code the decision was made years ago
to not write it based on the netdevice KPI but the native ifnet KPI
for most of it.  This commit simply spells this out and with that
frees "struct netdevice" to be re-done on LinuxKPI to become a more
native/mixed implementation over time as needed by, e.g., wireless
drivers.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30515
2021-06-18 21:20:08 +00:00
Bjoern A. Zeeb
825b7d4c9d OFED: migrate LinuxKPI net_device/ifnet macros into ofed
The LinuxKPI net_device actually is an ifnet; in order to further
clean that up so we can extend "net_device" migrate the few macros
left into ofed and make sure the header is included in all files
which need access to the macros.

Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D30477
2021-05-27 12:26:01 +00:00
Bjoern A. Zeeb
c35034b338 LinuxKPI/OFED/mlx4: cleanup netdevice.h some more
This removes all unused bits from linux/netdevice.h and migrates two
inline functions into the mlx4 and ofed code respectively.

This gets the mlx4/ofed (struct ifnet) specific bits down to 7 lines
in netdevice.h.

Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
Reviewed by:	hselasky, kib
Differential Revision: https://reviews.freebsd.org/D30461
2021-05-26 12:30:02 +00:00
Bjoern A. Zeeb
7069b4c6a4 LinuxKPI/OFED: (re)move inetdevice.h implementation
The two functions in linux/inetdevice.h are highly FreeBSD/ifnet
specific.  This is a result of struct net_device being mapped to
struct ifnet.

The only known consumer of these functions are two files in the
ofed/infiniband code.

As a first step of cleaning up copy linux/inetdevice.h to
rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve
copyright and license of the original file; otherwise it could be
merged into ib_addr.h where more EPOCH/vnet/.. are already used).

Slightly rename the function to not conflict with LinuxKPI
in the future.

Remove the three last, now unneeded includes of inetdevice.h and
zap linux/inetdevice.h to an empty header file with only the forward
include to netdevice.h remaining.

Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky, kib
X-D-R:		D29366 (extracted as further cleanup)
Differential Revision:	https://reviews.freebsd.org/D29434
2021-03-30 14:40:46 +00:00
Hans Petter Selasky
f8f5b459d2 Update user access region, UAR, APIs in the core in mlx5core.
This change include several changes as listed below all related to UAR.
UAR is a special PCI memory area where the so-called doorbell register and
blue flame register live. Blue flame is a feature for sending small packets
more efficiently via a PCI memory page, instead of using PCI DMA.

- All structures and functions named xxx_uuars were renamed into xxx_bfreg.
- Remove partially implemented Blueflame support from mlx5en(4) and mlx5ib.
- Implement blue flame register allocator.
- Use blue flame register allocator in mlx5ib.
- A common UAR page is now allocated by the core to support doorbell register
  writes for all of mlx5en and mlx5ib, instead of allocating one UAR per
  sendqueue.
- Add support for DEVX query UAR.
- Add support for 4K UAR for libmlx5.

Linux commits:
7c043e908a74ae0a935037cdd984d0cb89b2b970
2f5ff26478adaff5ed9b7ad4079d6a710b5f27e7
0b80c14f009758cefeed0edff4f9141957964211
30aa60b3bd12bd79b5324b7b595bd3446ab24b52
5fe9dec0d045437e48f112b8fa705197bd7bc3c0
0118717583cda6f4f36092853ad0345e8150b286
a6d51b68611e98f05042ada662aed5dbe3279c1e

MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
2021-01-08 13:33:46 +01:00
Eric van Gyzen
536457e181 infiniband: Appease Coverty
Coverity claims the call to rdma_gid2ip in cma_igmp_send overwrites addr.
Use a consistent definition of sockaddr to prevent detections and code
changes in the future.

Submitted by:	bret_ketchum@dell.com
Reported by:	Coverity
Reviewed by:	hselasky, kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D26229
2020-08-31 16:17:28 +00:00
Hans Petter Selasky
9220357857 Prevent potential underflow in ibcore.
Linux commit:
a9018adfde809d44e71189b984fa61cc89682b5e

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-11-15 11:46:53 +00:00
Hans Petter Selasky
ae9a8ec99f Correct MR length field to be 64-bit in ibcore.
Linux commit:
edd31551148c09608feee6b8756ad148d550ee3b

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-11-15 11:45:14 +00:00
Hans Petter Selasky
758a35d0bc VLAN_TRUNKDEV() requires epochification in ibcore after r353292.
Sponsored by:	Mellanox Technologies
2019-10-16 08:56:07 +00:00
Conrad Meyer
f49e79b56b OFED: Fix accidental double-copy of rdma_sdp.h in r351176
The mistake came about like this: the first attempt to commit was blocked by
a pre-commit hook due to missing SVN tags.  svn revert doesn't delete new
files, I guess.  While reapplying the fixed diff, the non-empty target file
was just concatenated with the new contents?  Ugh. :-(
2019-08-18 04:19:41 +00:00
Conrad Meyer
6b2f017186 OFED: Unbreak SDP support in ibcore
This regression was introduced in the r326169 Linux v4.9 Infiniband upgrade.
Restore the functionality.

Reviewed by:	hselasky
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D21298
2019-08-17 18:54:07 +00:00
Conrad Meyer
14f19c6b73 OFED: Fix ib_mad.h ib_user_mad.h include to match new uapi path
Sponsored by:	Dell EMC Isilon
2019-08-17 03:09:03 +00:00
Hans Petter Selasky
013f1e1435 Add new rates to ibcore.
Add the new rates that were added to the Infiniband specification as part of
HDR and 2x support.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:55:47 +00:00
Hans Petter Selasky
7877f59352 Introduce and use sgid_index in CM requests in ibcore.
For RoCE, when CM requests are received for RC and UD connections,
netdevice of the incoming request is unavailable. Because of that CM
requests are always forwarded to init_net namespace.

Now that we have the GID index available, introduce SGID index in
incoming CM requests and refer to the netdevice of it.

While at it fix some incorrect uses of init_net and make sure
the rdma_create_id() function stores the VNET it is passed.

Based on linux commit:
cee104334c98dd04e9dd4d9a4fa4784f7f6aada9

MFC after:		3 days
Approved by:		re (gjb)
Sponsored by:		Mellanox Technologies
2018-09-09 07:20:15 +00:00
Slava Shwartsman
b4df6efb4e ibcore: Fix endless loop in searching for matching VLAN device
In r337943 ifnet's if_pcp was set to the PCP value in use
instead of IFNET_PCP_NONE.
Current ibcore code assumes that if_pcp is IFNET_PCP_NONE with
VLAN interfaces so it can identify prio-tagged traffic.
Fix that by explicitly verifying that that the if_type is IFT_ETHER
and not IFT_L2VLAN.

MFC after:      3 days
Approved by:    re (Marius), hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-09-06 12:26:57 +00:00
Hans Petter Selasky
fb11674d30 Only NULL check the VNET pointer when VIMAGE is enabled in ibcore.
Else a NULL VNET pointer should be ignored. This fixes address resolving
when VIMAGE is disabled.

MFC after:		3 days
Sponsored by:		Mellanox Technologies
2018-07-31 11:23:44 +00:00
Hans Petter Selasky
8ac2952539 Remove blank line.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:44:16 +00:00
Hans Petter Selasky
f736cb92cf Check port number supplied by user verbs cmds in ibcore.
The ib_uverbs_create_ah() ind ib_uverbs_modify_qp() calls receive
the port number from user input as part of its attributes and assumes
it is valid. Down on the stack, that parameter is used to access kernel
data structures.  If the value is invalid, the kernel accesses memory
it should not.  To prevent this, verify the port number before using it.

Linux commit:
5ecce4c9b17bed4dc9cb58bfb10447307569b77b
a62ab66b13a0f9bcb17b7b761f6670941ed5cd62
5a7a88f1b488e4ee49eb3d5b82612d4d9ffdf2c3

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:29:14 +00:00
Hans Petter Selasky
855ad7cf39 Check AF family prior resolving address and introduce safer rdma_addr_size() variants in ibcore.
Garbage supplied by user will cause to UCMA module provide zero
memory size for memcpy(), because it wasn't checked, it will
produce unpredictable results in rdma_resolve_addr().

There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB.  When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.

Fix this by introducing new variants
    int rdma_addr_size_in6(struct sockaddr_in6 *addr);
    int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);

that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in.  We can use
these new variants to check what size userspace has passed in before
copying any addresses.

Linux commit:
2975d5de6428ff6d9317e9948f0968f7d42e5d74
09abfe7b5b2f442a85f4c4d59ecf582ad76088d7
84652aefb347297aa08e91e283adf7b18f77c2d5

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:24:39 +00:00
Hans Petter Selasky
f4546fa376 Add support for prio-tagged traffic for RDMA in ibcore.
When receiving a PCP change all GID entries are reloaded.
This ensures the relevant GID entries use prio tagging,
by setting VLAN present and VLAN ID to zero.

The priority for prio tagged traffic is set using the regular
rdma_set_service_type() function.

Fake the real network device to have a VLAN ID of zero
when prio tagging is enabled. This is logic is hidden inside
the rdma_vlan_dev_vlan_id() function which must always be used
to retrieve the VLAN ID throughout all of ibcore and the
infiniband network drivers.

The VLAN presence information then propagates through all
of ibcore and so incoming connections will have the VLAN
bit set. The incoming VLAN ID is then checked against the
return value of rdma_vlan_dev_vlan_id().

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:11:53 +00:00
Hans Petter Selasky
b70db3278f Set RoCEv2 MGID according to spec in ibcore.
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding
IPv4 address is encoded into the GID according to the following rule:
GID= :ffff:<IPv4 address>

Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it
zeroed and change rdma_is_multicast_addr() to consider the new logic.

Linux commit:
be1d325a335840a86c133a56c6a911c368bac0fd
1c3aea2bc8f0b2e5b57375ead40457ff75a3a2ec

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:07:36 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Hans Petter Selasky
d0a82c2414 Add IB_SPEED_HDR definition in ibcore.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 13:01:00 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Hans Petter Selasky
bf8641fedb Get correct network device when accepting incoming RDMA connections in ibcore.
This patch ensures the GID index is always used as a basis of resolving
incoming RDMA connections, as compared to the GID value itself.

Background:
On a per infiniband port basis, the GID identifier is not a unique identifier!
This assumption falls apart when VLAN ID, IPv6 scope ID and RoCE type,
as supported by RoCE v2, is taken into account. This additional
information is stored in the so-called GID attributes and is needed to
correctly identify the destination network interface for an incoming
connection.

Different VLANs are allowed to define the same IPv4 addresses and especially
for the default IPv6 link-local addresses or when using so-called containers
or jails, this is true.

The VNET information for the destination network interface is needed in
order to perform the L2 address lookup in the right Virtual Network Stack
context.

Consequently old functions previously used by RoCE v1, like
rdma_addr_find_smac_by_sgid() are impossible to support, because
there can be multiple identical GIDs associated with the same
infiniband port, and the answer to such a request becomes undefined.
This function has been removed.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:24:30 +00:00
Hans Petter Selasky
09938b2185 Add missing FreeBSD tags and SVN properties to ibcore.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 11:49:45 +00:00
Hans Petter Selasky
33ec1ccbae Import the mthca kernel side infiniband driver from Linux 4.9 and fix
compilation under FreeBSD. The mthca driver was temporarily removed as
part of the Linux 4.9 RoCE/infinband upgrade.

Top commit in Linux source tree:
69973b830859bc6529a7a0468ba0d80ee5117826

Sponsored by:	Mellanox Technologies
2018-02-13 17:04:34 +00:00
Pedro F. Giffuni
fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Hans Petter Selasky
c2c014f24c Merge ^/head r323559 through r325504. 2017-11-07 08:39:14 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device present. This resolves
interopability issues between iWarp and RoCE in ibcore

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Hans Petter Selasky
aacb037742 Make sure the IPv6 scope ID gets zeroed inside the GID. Else searching for a
valid GID entry based on IPv6 addresses can fail.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-10-10 12:36:41 +00:00
Hans Petter Selasky
9f715dc162 ibcore: Delete old files and add new ones missed in the initial commit
for this projects branch.

Sponsored by:	Mellanox Technologies
2017-07-03 08:32:52 +00:00
Hans Petter Selasky
c19d65049e ibcore: Fix for compiling header files from user-space.
Sponsored by:	Mellanox Technologies
2017-07-03 08:30:43 +00:00
Hans Petter Selasky
478d300572 Initial RoCE/infiniband kernel update to Linux v4.9.
This patch currently supports:
- ibcore as a kernel module only
- krping as a kernel module only
- ipoib as a kernel module only

Sponsored by:	Mellanox Technologies
2017-06-15 12:47:48 +00:00