42 Commits

Author SHA1 Message Date
hselasky
b6bb68cbd9 Add support for setting type of service, TOS, for outgoing RDMA connections
in the krping kernel test utility.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-05-15 07:46:24 +00:00
hselasky
397a563cfd Exit krping on device removal to avoid endless hang situation.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 17:03:42 +00:00
hselasky
5b03215ba8 Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
hselasky
e2db5c6b2e Add full support for specifying IPv6 addresses to krping.
Sponsored by:	Mellanox Technologies
2017-11-15 11:35:02 +00:00
hselasky
ef734bedd3 Set length of socket address in krping(). Else sobind() will fail with EINVAL.
Submitted by:	Chelsio
Sponsored by:	Mellanox Technologies
2017-07-26 16:44:24 +00:00
hselasky
8463d13f4b Merge ^/head r319973 through 321382. 2017-07-23 15:22:06 +00:00
markj
c8d732dce1 Avoid including list.h in LinuxKPI headers.
list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11249
2017-06-18 16:43:57 +00:00
hselasky
79ee7eff57 Initial RoCE/infiniband kernel update to Linux v4.9.
This patch currently supports:
- ibcore as a kernel module only
- krping as a kernel module only
- ipoib as a kernel module only

Sponsored by:	Mellanox Technologies
2017-06-15 12:47:48 +00:00
np
d442c439a1 krping: Allow the underlying ib_device to handle DMA mappings.
Submitted by:	Vijay Singh @ Netapp
2016-10-24 20:53:44 +00:00
hselasky
d3df4bfca0 Fix for printf() compile warning when fast_reg.length is 64-bit.
Changing fast_reg.length to 64 bits is planned in the future. Krping
uses 32-bit lengths internally.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-04-22 07:29:38 +00:00
np
27e4615bdf Send krping output to the log instead of the tty, as is done upstream.
Reviewed by:	hselasky@
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D5931
2016-04-14 00:25:11 +00:00
np
7db25c0459 Add fastreg support to krping (ported from upstream).
Submitted by:	Krishnamraju Eraparaju @ Chelsio
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D5777
2016-04-12 21:34:04 +00:00
np
0e3e40ed07 krping wasn't designed to take more than one client. Fail any connect
requests if cb->state is not IDLE.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
Reviewed by:	Steve Wise @ Open Grid Computing
Sponsored by:	Chelsio Communications
2016-03-29 01:41:07 +00:00
hselasky
5b52683cd8 Fix crash in krping when run as a client due to NULL pointer access.
Initialize pointer in question which is used only when fast registers
mode is selected.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-03-16 08:49:38 +00:00
np
5f81d0ce4c Have krping use IB_ACCESS_LOCAL_WRITE because it's required for remote
write or remote atomic operations.

Submitted by:	Krishnamraju Eraparaju @ Chelsio
2016-01-05 01:58:30 +00:00
hselasky
1025a58857 Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a
kernel programming interface module, KPI, to avoid confusion with the
existing Linux userspace binary compatibility shims. Bump the
FreeBSD_version number.

Reviewed by:	np @
Suggested by:	dumbbell @
Sponsored by:	Mellanox Technologies
2015-10-22 09:50:45 +00:00
hselasky
5dbc43f9a5 Update the infiniband stack to Mellanox's OFED version 2.1.
Highlights:
 - Multiple verbs API updates
 - Support for RoCE, RDMA over ethernet

All hardware drivers depending on the common infiniband stack has been
updated aswell.

Discussed with:	np @
Sponsored by:	Mellanox Technologies
MFC after:	1 month
2015-02-17 08:40:27 +00:00
hselasky
551300d112 Add missing linuxapi module dependencies and always use the FreeBSD
"MODULE_VERSION" macro definition. Remove the redefinition of the
"MODULE_VERSION" macro from the Linux kernel compatibility API.

MFC after:	1 month
Reported by:	np@
Sponsored by:	Mellanox Technologies
2015-01-19 21:53:00 +00:00
np
1a7f83f8a9 krping: In verbose mode print only first 128 bytes of krping data.
Submitted by:	Hariprasad at Chelsio dot com.
Sponsored by:	Chelsio Communications
2014-10-27 22:41:55 +00:00
hselasky
11046e8048 Update the OFED Linux compatibility layer and
Mellanox hardware driver(s):

- Properly name an inclusion guard
- Fix compile warnings regarding unsigned enums
- Add two new sysctl nodes
- Remove all empty linux header files
- Make an error printout more verbose
- Use "mod_delayed_work()" instead of
  cancelling and starting a timeout.
- Implement more Linux scatterlist
  functions.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-15 13:40:29 +00:00
hselasky
387f5b6c1a - Update the OFED Linux Emulation layer as a preparation for a
hardware driver update from Mellanox Technologies.
- Remove empty files from the OFED Linux Emulation layer.
- Fix compile warnings related to printf() and the "%lld" and "%llx"
format specifiers.
- Add some missing 2-clause BSD copyrights.
- Add "Mellanox Technologies, Ltd." to list of copyright holders.
- Add some new compatibility files.
- Fix order of uninit in the mlx4ib module to avoid crash at unload
using the new module_exit_order() function.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2014-08-27 13:21:53 +00:00
np
17dd43ec29 Update krping to the latest upstream code. Move all the FreeBSD
specific parts to krping_dev.c, which leaves the other files as
close to their upstream versions as possible.
2013-10-14 23:02:05 +00:00
np
d4625b6fda Delete all of the old RDMA code (except krping, which was switched to
use sys/ofed some time back).  This has been sitting around as dead code
in the tree for a very long time.
2013-10-14 22:39:08 +00:00
alfred
91eb2b78a7 Update OFED to Linux 3.7 and update Mellanox drivers.
Update the OFED Infiniband core to the version supplied in Linux
version 3.7.

The update to OFED is nearly all additional defines and functions
with the exception of the addition of additional parameters to
ib_register_device() and the reg_user_mr callback.

In addition the ibcore (Infiniband core) and ipoib (IP over Infiniband)
have both been made into completely loadable modules to facilitate
testing of the OFED stack in FreeBSD.

Finally the Mellanox Infiniband drivers are now updated to the
latest version shipping with Linux 3.7.

Submitted by: Mellanox FreeBSD driver team:
                Oded Shanoon (odeds mellanox.com),
                Meny Yossefi (menyy mellanox.com),
                Orit Moskovich (oritm mellanox.com)

Approved by: re
2013-09-29 00:35:03 +00:00
np
debdf2d1b8 Assorted fixes to krping. Disconnect the rest of sys/contrib/rdma from
the build while here.  sys/ofed has more recent RDMA code and should be
used instead.  We should probably move krping out of sys/contrib/rdma
and get rid of the rest of it.

Obtained from:	Chelsio
2013-08-23 19:12:29 +00:00
kevlo
ceb08698f2 Revert previous commit...
Pointyhat to:	kevlo (myself)
2012-10-10 08:36:38 +00:00
kevlo
8747a46991 Prefer NULL over 0 for pointers 2012-10-09 08:27:40 +00:00
pjd
49513a3583 Fix an obvious typo. 2012-09-22 17:41:56 +00:00
np
67d5f1a727 - Updated TOE support in the kernel.
- Stateful TCP offload drivers for Terminator 3 and 4 (T3 and T4) ASICs.
  These are available as t3_tom and t4_tom modules that augment cxgb(4)
  and cxgbe(4) respectively.  The cxgb/cxgbe drivers continue to work as
  usual with or without these extra features.

- iWARP driver for Terminator 3 ASIC (kernel verbs).  T4 iWARP in the
  works and will follow soon.

Build-tested with make universe.

30s overview
============
What interfaces support TCP offload?  Look for TOE4 and/or TOE6 in the
capabilities of an interface:
# ifconfig -m | grep TOE

Enable/disable TCP offload on an interface (just like any other ifnet
capability):
# ifconfig cxgbe0 toe
# ifconfig cxgbe0 -toe

Which connections are offloaded?  Look for toe4 and/or toe6 in the
output of netstat and sockstat:
# netstat -np tcp | grep toe
# sockstat -46c | grep toe

Reviewed by:	bz, gnn
Sponsored by:	Chelsio communications.
MFC after:	~3 months (after 9.1, and after ensuring MFC is feasible)
2012-06-19 07:34:13 +00:00
dim
c5c6fab45c Fix the following compilation warnings in sys/contrib/rdma/rdma_cma.c:
sys/contrib/rdma/rdma_cma.c:1259:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
	      case ECONNRESET:
		   ^
  @/sys/errno.h:118:20: note: expanded from macro 'ECONNRESET'
  #define ECONNRESET      54              /* Connection reset by peer */
		      ^
  sys/contrib/rdma/rdma_cma.c:1263:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
	      case ETIMEDOUT:
		   ^
  @/sys/errno.h:124:19: note: expanded from macro 'ETIMEDOUT'
  #define ETIMEDOUT       60              /* Operation timed out */
		      ^
  sys/contrib/rdma/rdma_cma.c:1260:8: error: case value not in enumerated type 'enum iw_cm_event_status' [-Werror,-Wswitch]
	      case ECONNREFUSED:
		   ^
  @/sys/errno.h:125:22: note: expanded from macro 'ECONNREFUSED'
  #define ECONNREFUSED    61              /* Connection refused */
		      ^

This is because the switch uses iw_cm_event::status, which is an enum
iw_cm_event_status, while ECONNRESET, ETIMEDOUT and ECONNREFUSED are
just plain defines from errno.h.

It looks like there is only one use of any of the enumeration values of
iw_cm_event_status, in:

  sys/contrib/rdma/rdma_iwcm.c: 	if (iw_event->status == IW_CM_EVENT_STATUS_ACCEPTED) {

So messing around with the enum definitions to fix the warning seems too
disruptive; the simplest fix is to cast the argument of the switch to
int.

Reviewed by:	kmacy
MFC after:	1 week
2012-04-20 21:52:57 +00:00
dim
0c739e2b56 In sys/contrib/rdma/ib_addr.h, bump MAX_ADDR_LEN to 20 bytes (the same
value used in sys/ofed/include/linux/netdevice.h), so there will be no
buffer overruns in the rest of the inline functions in this file.

Reviewed by:	kmacy
MFC after:	1 week
2012-01-07 00:47:27 +00:00
attilio
d3a4a74154 Remove the explicit definition of inet_aton() as it was introduced as a
general function in r199208.

Reported by:	np
Sponsored by:	Sandvine Incorporated
MFC:		1 week
2009-11-12 14:22:12 +00:00
rwatson
fb9ffed650 Merge the remainder of kern_vimage.c and vimage.h into vnet.c and
vnet.h, we now use jails (rather than vimages) as the abstraction
for virtualization management, and what remained was specific to
virtual network stacks.  Minor cleanups are done in the process,
and comments updated to reflect these changes.

Reviewed by:	bz
Approved by:	re (vimage blanket)
2009-08-01 19:26:27 +00:00
rwatson
57ca4583e7 Build on Jeff Roberson's linker-set based dynamic per-CPU allocator
(DPCPU), as suggested by Peter Wemm, and implement a new per-virtual
network stack memory allocator.  Modify vnet to use the allocator
instead of monolithic global container structures (vinet, ...).  This
change solves many binary compatibility problems associated with
VIMAGE, and restores ELF symbols for virtualized global variables.

Each virtualized global variable exists as a "reference copy", and also
once per virtual network stack.  Virtualized global variables are
tagged at compile-time, placing the in a special linker set, which is
loaded into a contiguous region of kernel memory.  Virtualized global
variables in the base kernel are linked as normal, but those in modules
are copied and relocated to a reserved portion of the kernel's vnet
region with the help of a the kernel linker.

Virtualized global variables exist in per-vnet memory set up when the
network stack instance is created, and are initialized statically from
the reference copy.  Run-time access occurs via an accessor macro, which
converts from the current vnet and requested symbol to a per-vnet
address.  When "options VIMAGE" is not compiled into the kernel, normal
global ELF symbols will be used instead and indirection is avoided.

This change restores static initialization for network stack global
variables, restores support for non-global symbols and types, eliminates
the need for many subsystem constructors, eliminates large per-subsystem
structures that caused many binary compatibility issues both for
monitoring applications (netstat) and kernel modules, removes the
per-function INIT_VNET_*() macros throughout the stack, eliminates the
need for vnet_symmap ksym(2) munging, and eliminates duplicate
definitions of virtualized globals under VIMAGE_GLOBALS.

Bump __FreeBSD_version and update UPDATING.

Portions submitted by:  bz
Reviewed by:            bz, zec
Discussed with:         gnn, jamie, jeff, jhb, julian, sam
Suggested by:           peter
Approved by:            re (kensmith)
2009-07-14 22:48:30 +00:00
rwatson
c9ef486fe1 Modify most routines returning 'struct ifaddr *' to return references
rather than pointers, requiring callers to properly dispose of those
references.  The following routines now return references:

  ifaddr_byindex
  ifa_ifwithaddr
  ifa_ifwithbroadaddr
  ifa_ifwithdstaddr
  ifa_ifwithnet
  ifaof_ifpforaddr
  ifa_ifwithroute
  ifa_ifwithroute_fib
  rt_getifa
  rt_getifa_fib
  IFP_TO_IA
  ip_rtaddr
  in6_ifawithifp
  in6ifa_ifpforlinklocal
  in6ifa_ifpwithaddr
  in6_ifadd
  carp_iamatch6
  ip6_getdstifaddr

Remove unused macro which didn't have required referencing:

  IFP_TO_IA6

This closes many small races in which changes to interface
or address lists while an ifaddr was in use could lead to use of freed
memory (etc).  In a few cases, add missing if_addr_list locking
required to safely acquire references.

Because of a lack of deep copying support, we accept a race in which
an in6_ifaddr pointed to by mbuf tags and extracted with
ip6_getdstifaddr() doesn't hold a reference while in transmit.  Once
we have mbuf tag deep copy support, this can be fixed.

Reviewed by:	bz
Obtained from:	Apple, Inc. (portions)
MFC after:	6 weeks (portions)
2009-06-23 20:19:09 +00:00
qingli
ec826ad5c7 This main goals of this project are:
1. separating L2 tables (ARP, NDP) from the L3 routing tables
2. removing as much locking dependencies among these layers as
   possible to allow for some parallelism in the search operations
3. simplify the logic in the routing code,

The most notable end result is the obsolescent of the route
cloning (RTF_CLONING) concept, which translated into code reduction
in both IPv4 ARP and IPv6 NDP related modules, and size reduction in
struct rtentry{}. The change in design obsoletes the semantics of
RTF_CLONING, RTF_WASCLONE and RTF_LLINFO routing flags. The userland
applications such as "arp" and "ndp" have been modified to reflect
those changes. The output from "netstat -r" shows only the routing
entries.

Quite a few developers have contributed to this project in the
past: Glebius Smirnoff, Luigi Rizzo, Alessandro Cerri, and
Andre Oppermann. And most recently:

- Kip Macy revised the locking code completely, thus completing
  the last piece of the puzzle, Kip has also been conducting
  active functional testing
- Sam Leffler has helped me improving/refactoring the code, and
  provided valuable reviews
- Julian Elischer setup the perforce tree for me and has helped
  me maintaining that branch before the svn conversion
2008-12-15 06:10:57 +00:00
bz
604d89458a Rather than using hidden includes (with cicular dependencies),
directly include only the header files needed. This reduces the
unneeded spamming of various headers into lots of files.

For now, this leaves us with very few modules including vnet.h
and thus needing to depend on opt_route.h.

Reviewed by:	brooks, gnn, des, zec, imp
Sponsored by:	The FreeBSD Foundation
2008-12-02 21:37:28 +00:00
zec
8797d4caec Step 1.5 of importing the network stack virtualization infrastructure
from the vimage project, as per plan established at devsummit 08/08:
http://wiki.freebsd.org/Image/Notes200808DevSummit

Introduce INIT_VNET_*() initializer macros, VNET_FOREACH() iterator
macros, and CURVNET_SET() context setting macros, all currently
resolving to NOPs.

Prepare for virtualization of selected SYSCTL objects by introducing a
family of SYSCTL_V_*() macros, currently resolving to their global
counterparts, i.e. SYSCTL_V_INT() == SYSCTL_INT().

Move selected #defines from sys/sys/vimage.h to newly introduced header
files specific to virtualized subsystems (sys/net/vnet.h,
sys/netinet/vinet.h etc.).

All the changes are verified to have zero functional impact at this
point in time by doing MD5 comparision between pre- and post-change
object files(*).

(*) netipsec/keysock.c did not validate depending on compile time options.

Implemented by:	julian, bz, brooks, zec
Reviewed by:	julian, bz, brooks, kris, rwatson, ...
Approved by:	julian (mentor)
Obtained from:	//depot/projects/vimage-commit2/...
X-MFC after:	never
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
2008-10-02 15:37:58 +00:00
bz
1021d43b56 Commit step 1 of the vimage project, (network stack)
virtualization work done by Marko Zec (zec@).

This is the first in a series of commits over the course
of the next few weeks.

Mark all uses of global variables to be virtualized
with a V_ prefix.
Use macros to map them back to their global names for
now, so this is a NOP change only.

We hope to have caught at least 85-90% of what is needed
so we do not invalidate a lot of outstanding patches again.

Obtained from:	//depot/projects/vimage-commit2/...
Reviewed by:	brooks, des, ed, mav, julian,
		jamie, kris, rwatson, zec, ...
		(various people I forgot, different versions)
		md5 (with a bit of help)
Sponsored by:	NLnet Foundation, The FreeBSD Foundation
X-MFC after:	never
V_Commit_Message_Reviewed_By:	more people than the patch
2008-08-17 23:27:27 +00:00
kmacy
1f11c1549b fix build 2008-05-06 17:45:54 +00:00
kmacy
9dc66be796 conditionally define PANIC_IF 2008-05-05 19:39:20 +00:00
kmacy
3655ca8c3d Import basic common and iwarp kernel RDMA infrastructure.
Supported by: Chelsio Inc.
2008-05-05 18:35:55 +00:00