freebsd-skq

Author	SHA1	Message	Date
Hans Petter Selasky	97549c34ec	Move the ConnectX-3 and ConnectX-2 driver from sys/ofed into sys/dev/mlx4 like other PCI network drivers. The sys/ofed directory is now mainly reserved for generic infiniband code, with exception of the mthca driver. - Add new manual page, mlx4en(4), describing how to configure and load mlx4en. - All relevant driver C-files are now prefixed mlx4, mlx4_en and mlx4_ib respectively to avoid object filename collisions when compiling the kernel. This also fixes an issue with proper dependency file generation for the C-files in question. - Device mlxen is now device mlx4en and depends on device mlx4, see mlx4en(4). Only the network device name remains unchanged. - The mlx4 and mlx4en modules are now built by default on i386 and amd64 targets. Only building the mlx4ib module depends on WITH_OFED=YES. Sponsored by: Mellanox Technologies	2016-09-30 08:23:06 +00:00
Navdeep Parhar	a5234e8ccb	Do not free an uninitialized pointer on soaccept failure in the iWARP connection manager. Sponsored by: Chelsio Communications	2016-08-26 08:25:28 +00:00
Hans Petter Selasky	7dc445f8d3	Add support for setting blocking and non-blocking mode on /dev/rdma_cm by returning success on FIONBIO and FIOASYNC IOCTLs. The actual flags handling is done by the kern_ioctl() function. Reported by: Alex Bowden <alex.bowden@outlook.com> Sponsored by: Mellanox Technologies MFC after: 1 week	2016-08-18 08:49:02 +00:00
Mark Johnston	f1c1f188f7	mthca: Add a wrapper for the firmware's DIAG_RPRT command. MFC after: 1 week Sponsored by: EMC / Isilon Storage Division	2016-08-05 21:34:09 +00:00
Mark Johnston	49fe7b5029	ipoib: Bound the number of egress mbufs buffered during pathrec lookups. In pathological situations where the master subnet manager becomes unresponsive for an extended period, we may otherwise end up queuing all of the system's mbufs while waiting for a response to a path record lookup. This addresses the same issue as commit 1e85b806f9 in Linux. Reviewed by: cem, ngie MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-08-01 22:22:11 +00:00
Mark Johnston	4e071758a7	MFV be9130cc9: "IB/cma: Check for GID on listening devices first" This is an optimization that improves IB connection setup times. Discussed with: hselasky Obtained from: Linux MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-08-01 20:29:09 +00:00
Mark Johnston	82f1d3ea2f	MFV 29f27e847: "IB/cma: Use cached gids" This addresses a regression from an earlier upstream change which caused cma_acquire_dev() to bypass the port GID cache and instead query the HCA for each entry in its GID table. These queries can become extremely slow on multiport devices, which has a negative impact on connection setup times. Discussed with: hselasky Obtained from: Linux MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-08-01 20:27:11 +00:00
Mark Johnston	adaf3c4906	sdp: Destroy the RDMA ID after destroying the connection's queue pair. This is the ordering documented by rdma_destroy_qp(). Also add a useful KASSERT to sdp_pcbfree(). Sponsored by: EMC / Isilon Storage Division	2016-07-29 21:03:02 +00:00
Mark Johnston	d3461164e0	sdp: Use malloc(9) instead of the Linux compat layer. SDP transmit and receive rings are always created in a sleepable context, so we can use M_WAITOK and remove error checks. Sponsored by: EMC / Isilon Storage Division	2016-07-29 21:01:04 +00:00
Mark Johnston	f66ec15275	sdp: Use the correct socket buffer in sdp_post_recvs_needed(). Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:54:43 +00:00
Mark Johnston	7e968dab6d	sdp: Always free received control packets after they're handled. Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:51:52 +00:00
Mark Johnston	3b3e6d882f	Fix the KASSERT format string arguments after r303507.	2016-07-29 20:48:42 +00:00
Mark Johnston	a3c0b052b7	sdp: Use the PCB as the rx completion handler argument. The generic socket may be detached from the PCB before the completion queue is drained and destroyed, so this change closes a race condition in connection teardown. Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:39:32 +00:00
Mark Johnston	fa46ade837	sdp: Destroy the PCB lock before freeing to the zone. Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:36:01 +00:00
Mark Johnston	2cefa87b0b	sdp: Use an mbufq for received control packets. This is simpler than the hand-rolled queue, and fixes a use-after-free. Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:35:04 +00:00
Mark Johnston	30a71b3c30	sdp: Remove Linux build files. They aren't useful here, and Linux seems to have largely abandoned SDP anyway. Sponsored by: EMC / Isilon Storage Division	2016-07-29 20:33:43 +00:00
Navdeep Parhar	9e2d05841e	Fix bug in iwcm that caused a panic in iw_cm_wq when krping is run repeatedly in a tight loop. Approved by: re (gjb@) Obtained from: hselasky@ (part of larger changes in D5791)	2016-06-14 20:58:05 +00:00
George V. Neville-Neil	fd9e88e0f9	Fix up the Infiniband code to handle the new arpresolve.	2016-06-02 20:53:43 +00:00
Hans Petter Selasky	fa201e28fc	Prepare for activation of LinuxKPI module parameters as read-only tunable SYSCTL's. Linux module parameters are associated with the module they belong to. FreeBSD does not share this concept of a parent module. Instead add macros which define the prefix to use for the module parameters in the LinuxKPI consumers. While at it convert all "bool" LinuxKPI module parameters to "byte" type, because we don't have a "bool" type of SYSCTL in FreeBSD. Sponsored by: Mellanox Technologies MFC after: 1 week	2016-05-25 12:03:21 +00:00
Eitan Adler	cef367e6a1	Don't repeat the the word 'the' (one manual change to fix grammar) Confirmed With: db Approved by: secteam (not really, but this is a comment typo fix)	2016-05-17 12:52:31 +00:00
Pedro F. Giffuni	bf5cba36db	ofed/drivers: minor spelling fixes. No functional change. Reviewed by: hselasky	2016-05-06 15:16:13 +00:00
Bjoern A. Zeeb	77eab110a9	Fix NOIP kernels to compile.	2016-04-24 15:56:05 +00:00
Hans Petter Selasky	0bab509b94	More fixes for using IPv6 addresses with RDMA: - Added check that the SCOPE ID is only restored for IPv6 linklocal addresses. - Changes made by r237263 in the "cma_bind_addr()" function did not check if the socket address was of type IPv6 and used the IPv4 socket address for IPv6 addresses. This caused the function to fail. Fixed this. - In the "rdma_gid2ip()" function and some other places the "sin6_len" and "sin6_scope_id" fields were not set for IPv6 socket addresses. Fixed this. - The scope ID is not stored as part of the GID entries and must be passed as an argument to "rdma_gid2ip()". - Added new method to "struct ib_device" which returns a pointer to the network interface which belongs to the given infiniband device. This is needed to be able to get the scope ID for IPv6 addresses via the associated ethernet interface. - Added convenience function, "rdma_get_ipv6_scope_id()", to get the scope ID for IPv6 addresses. - Implemented new "get_netdev" method for mlx4ib. Other IB controller drivers which want to support IPv6 addresses need to implement this as well. - Bumped the FreeBSD version due to changing "struct ib_device". Sponsored by: Mellanox Technologies MFC after: 1 week	2016-04-22 18:16:12 +00:00
Hans Petter Selasky	c3a74bf6d7	Add KASSERT() and set error code in dead code case to help static code analysis tools. Suggested by: ngie@ Sponsored by: Mellanox Technologies MFC after: 1 week	2016-04-22 06:39:07 +00:00
Hans Petter Selasky	10a7a4bf44	Add missing set of the current VNET when inputting IP packets in IPoIB. This fixes a kernel panic when using IPoIB with VIMAGE and infiniband. PR: 208957 Sponsored by: Mellanox Technologies Tested by: Justin Clift <justin@postgresql.org> MFC after: 1 week	2016-04-22 06:33:06 +00:00
Hans Petter Selasky	150c88d471	Fix for using IPv6 addresses with RDMA: IPv6 addresses have a scope ID which sometimes is stored in the "sin6_scope_id" field of "struct sockaddr_in6" and sometimes as part of the IPv6 address itself depending on the context. If the scope ID is not in the expected location, the IPv6 address lookups in the so-called GID table will fail. Some code factoring has been made to achieve a clean exit of the "addr_resolve" function via a common "done" label. Sponsored by: Mellanox Technologies Submitted by: Shani Michaeli <shanim@mellanox.com> MFC after: 1 week	2016-04-21 16:33:42 +00:00
Hans Petter Selasky	a296502fe0	Fix for resolving mac address when the destination address is a gateway. Remove some dead code while at it. Sponsored by: Mellanox Technologies MFC after: 1 week	2016-04-21 16:04:58 +00:00
Hans Petter Selasky	53219aa88a	Properly setup arguments for if_resolvemulti() callback. Sponsored by: Mellanox Technologies MFC after: 1 week	2016-04-21 11:32:22 +00:00
Hans Petter Selasky	03815ec1db	Fix inverted priv check calls. Priv check returns zero on success and an error code on failure. Refer to man 9 priv_check . Sponsored by: Mellanox Technologies MFC after: 1 week	2016-04-20 07:44:50 +00:00
Pedro F. Giffuni	d4a2eaef39	ofed: for pointers replace 0 with NULL. These are mostly cosmetic, no functional change. Found with devel/coccinelle. Reviewed by: hselasky	2016-04-15 12:16:15 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
Navdeep Parhar	0ff41bb737	Remove unnecessary dequeue_mutex (added in r294610) from the iWARP connection manager. Examining so_comp without synchronization with iw_so_event_handler is a harmless race. Submitted by: Krishnamraju Eraparaju @ Chelsio Reviewed by: Steve Wise @ Open Grid Computing Sponsored by: Chelsio Communications	2016-03-30 01:08:08 +00:00
Hans Petter Selasky	9319562b86	Fix witness panic in the ipoib_ioctl() function when unloading the ipoib module. The bpfdetach() function is trying to turn off promiscuous mode on the network interface it is attached to while holding a mutex. The fix consists of ignoring any further calls to the ipoib_ioctl() function when the network interface is going to be detached. The ipoib_ioctl() function might sleep. Sponsored by: Mellanox Technologies MFC after: 1 week	2016-03-15 15:47:26 +00:00
John Baldwin	47cedcbd72	Use SI_SUB_LAST instead of SI_SUB_SMP as the "catch-all" subsystem. Reviewed by: kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D5515	2016-03-11 23:18:06 +00:00
Hans Petter Selasky	96608f1ff4	Whitespace fixes. MFC after: 1 week Sponsored by: Mellanox Technologies	2016-03-04 09:07:30 +00:00
Navdeep Parhar	097f289f25	Fix for iWARP servers that listen on INADDR_ANY. The iWARP Connection Manager (CM) on FreeBSD creates a TCP socket to represent an iWARP endpoint when the connection is over TCP. For servers the current approach is to invoke create_listen callback for each iWARP RNIC registered with the CM. This doesn't work too well for INADDR_ANY because a listen on any TCP socket already notifies all hardware TOEs/RNICs of the new listener. This patch fixes the server side of things for FreeBSD. We've tried to keep all these modifications in the iWARP/TCP specific parts of the OFED infrastructure as much as possible. Submitted by: Krishnamraju Eraparaju @ Chelsio (with design inputs from Steve Wise) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D4801	2016-01-22 23:33:34 +00:00
Alexander V. Chernikov	36402a681f	Finish r275196: do not dereference rtentry in if_output() routines. The only piece of information that is required is rt_flags subset. In particular, if_loop() requires RTF_REJECT and RTF_BLACKHOLE flags to check if this particular mbuf needs to be dropped (and what error should be returned). Note that if_loop() will always return EHOSTUNREACH for "reject" routes regardless of RTF_HOST flag existence. This is due to upcoming routing changes where RTF_HOST value won't be available as lookup result. All other functions require RTF_GATEWAY flag to check if they need to return EHOSTUNREACH instead of EHOSTDOWN error. There are 11 places where non-zero 'struct route' is passed to if_output(). Most of the callers (forwarding, bpf, arp) do not care about the exact error value. In fact, the only place where this result is propagated is ip_output(). (ip6_output() passes NULL route to nd6_output_ifp()). Given that, add 3 new 'struct route' flags (RT_REJECT, RT_BLACKHOLE and RT_IS_GW) and inline function (rt_update_ro_flags()) to copy necessary rte flags to ro_flags. Call this function in ip_output() after looking up/ verifying rte. Reviewed by: ae	2016-01-09 16:34:37 +00:00
Gleb Smirnoff	829fae9063	Make it possible for sbappend() to preserve M_NOTREADY on mbufs, just like sbappendstream() does. Although, M_NOTREADY may appear only on SOCK_STREAM sockets, due to sendfile(2) supporting only the latter, there is a corner case of AF_UNIX/SOCK_STREAM socket, that still uses records for the sake of control data, albeit being stream socket. Provide private version of m_clrprotoflags(), which understands PRUS_NOTREADY, similar to m_demote().	2016-01-08 19:03:20 +00:00
Alexander V. Chernikov	4fb3a8208c	Implement interface link header precomputation API. Add if_requestencap() interface method which is capable of calculating various link headers for given interface. Right now there is support for INET/INET6/ARP llheader calculation (IFENCAP_LL type request). Other types are planned to support more complex calculation (L2 multipath lagg nexthops, tunnel encap nexthops, etc..). Reshape 'struct route' to be able to pass additional data (with its length) to prepend to mbuf. These two changes permit routing code to pass pre-calculated nexthop data (like L2 header for route w/gateway) down to the stack eliminating the need for other lookups. It also brings us closer to more complex scenarios like transparently handling MPLS nexthops and tunnel interfaces. Last, but not least, it removes layering violation introduced by flowtable code (ro_lle) and simplifies handling of existing if_output consumers. ARP/ND changes: Make arp/ndp stack pre-calculate link header upon installing/updating lle record. Interface link address changes are handled by re-calculating headers for all lles based on if_lladdr event. After these changes, arpresolve()/nd6_resolve() returns full pre-calculated header for supported interfaces thus simplifying if_output(). Move these lookups to separate ether_resolve_addr() function which either returns an error or fully-prepared link header. Add <arp\|nd6_>resolve_addr() compat versions to return link addresses instead of pre-calculated data. BPF changes: Raw bpf writes occupied _two_ cases: AF_UNSPEC and pseudo_AF_HDRCMPLT. Despite the naming, both of them have their header "complete". The only difference is that interface source mac has to be filled by OS for AF_UNSPEC (controlled via BIOCGHDRCMPLT). This logic has to stay inside BPF and not pollute if_output() routines. Convert BPF to pass prepend data via new 'struct route' mechanism. 
Note that it does not change non-optimized if_output(): ro_prepend handling is purely optional. Side note: hackish pseudo_AF_HDRCMPLT is supported for ethernet and FDDI. It is not needed for ethernet anymore. The only remaining FDDI user is dev/pdq mostly untouched since 2007. FDDI support was eliminated from OpenBSD in 2013 (sys/net/if_fddisubr.c rev 1.65). Flowtable changes: Flowtable violates layering by saving (and not correctly managing) rtes/lles. Instead of passing lle pointer, pass pointer to pre-calculated header data from that lle. Differential Revision: https://reviews.freebsd.org/D4102	2015-12-31 05:03:27 +00:00
Enji Cooper	4722f6ef28	Fix scope of bridge_header and bridge_pcix_cap in mthca_reset(..) They're only used in the __linux__ case Differential Revision: https://reviews.freebsd.org/D4332 MFC after: 1 week Reported by: cppcheck Reviewed by: hselasky Sponsored by: EMC / Isilon Storage Division	2015-12-04 09:01:58 +00:00
Hans Petter Selasky	0f5150a757	Fix integer to pointer of different size conversion warnings when using GCC for 32-bit platforms. The integer size in this case is hardcoded 64-bit while the pointer size is 32-bit. Sponsored by: Mellanox Technologies MFC after: 2 weeks	2015-11-12 10:12:20 +00:00
Hans Petter Selasky	2da3897d01	Rename linuxapi[.ko] into linuxkpi[.ko], to reflect that it is a kernel programming interface module, KPI, to avoid confusion with the existing Linux userspace binary compatibility shims. Bump the FreeBSD_version number. Reviewed by: np @ Suggested by: dumbbell @ Sponsored by: Mellanox Technologies	2015-10-22 09:50:45 +00:00
Hans Petter Selasky	f556cede8a	Merge LinuxKPI changes from DragonflyBSD: - Define the kref structure identical to the one found in Linux. - Update clients referring inside the kref structure. - Implement kref_sub() for FreeBSD. Reviewed by: np @ Sponsored by: Mellanox Technologies	2015-10-19 12:26:38 +00:00
Hans Petter Selasky	2404bdddf1	Merge LinuxKPI changes from DragonflyBSD: - Add more list related functions and macros. - Update the hlist_for_each_entry() macro to take one less argument. Sponsored by: Mellanox Technologies	2015-10-19 11:57:33 +00:00
Alexander V. Chernikov	fb373bc2b1	Fix build broken by r287861. Spotted by: zb	2015-09-16 15:40:08 +00:00
Alexander V. Chernikov	1fe201c322	Simplify the way of attaching IPv6 link-layer header. Problem description: How do we currently perform layer 2 resolution and header imposition: For IPv4 we have the following chain: ip_output() -> (ether\|atm\|whatever)_output() -> arpresolve() Lookup is done in proper place (link-layer output routine) and it is possible to provide cached lle data. For IPv6 situation is more complex: ip6_output() -> nd6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_storelladdr() We have ip6_output() which calls nd6_output() instead of link output routine. nd6_output() does the following: * checks if lle exists, creates it if needed (similar to arpresolve()) * performs lle state transitions (similar to arpresolve()) * calls nd6_output_ifp() which pushes packets to link output routine along with running SeND/MAC hooks regardless of lle state (e.g. works as run-hooks placeholder). After that, iface output routine like ether_output() calls nd6_storelladdr() which performs lle lookup once again. As a result, we perform lookup twice for each outgoing packet for most types of interfaces. We also need to maintain runtime-checked table of 'nd6-free' interfaces (see nd6_need_cache()). Fix this behavior by eliminating first ND lookup. To be more specific: * make all nd6_output() consumers use nd6_output_ifp() instead * rename nd6_output[_slow]() to nd6_resolve_[slow]() * convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics, e.g. copy L2 address to buffer instead of pushing packet towards lower layers * Make all nd6_storelladdr() users use nd6_resolve() * eliminate nd6_storelladdr() The resulting callchain is the following: ip6_output() -> nd6_output_ifp() -> (whatever)_output() -> nd6_resolve() Error handling: Currently sending packet to non-existing la results in ip6_<output\|forward> -> nd6_output() -> nd6_output_lle() which returns 0. 
In new scenario packet is propagated to <ether\|whatever>_output() -> nd6_resolve() which will return EWOULDBLOCK, and that result will be converted to 0. (And EWOULDBLOCK is actually used by IB/TOE code). Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D1469	2015-09-16 14:26:28 +00:00
Mark Johnston	4af587d062	Ensure that the MAD agent's delayed taskqueue is completely stopped before proceeding. Otherwise, nothing prevents it from running after the MAD agent struct has been freed, and this results in a use-after-free when the task's ta_pending count is incremented in the callout handler. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2015-09-15 23:56:31 +00:00
Navdeep Parhar	6b5c8394f1	Reinstate unify_tcp_port_space and associated code that was lost during the last OFED update (r278886). iWARP on FreeBSD is properly integrated with the network stack and the iWARP drivers _never_ operate out of any private TCP port-space that is invisible to the kernel. Instead, an iWARP connection shows up as a TCP socket (which is what it is) fully visible to the kernel and standard tools like netstat, sockstat, etc.	2015-08-12 22:09:58 +00:00
Mark Johnston	e2e45da0e8	ib mad: fix an incorrect use of list_for_each_entry In tf_dequeue(), if we reach the end of the list without finding a non-cancelled element, "tmp" will be a pointer into the list head, so the tmp->canceled check is bogus. Use a flag instead. Submitted by: Tao Liu <Tao.Liu@isilon.com> Reviewed by: hselasky MFC after: 1 week Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D3244	2015-07-30 18:28:37 +00:00
Mateusz Guzik	f6f6d24062	Implement lockless resource limits. Use the same scheme implemented to manage credentials. Code needing to look at process's credentials (as opposed to thread's) is provided with *_proc variants of relevant functions. Places which possibly had to take the proc lock anyway still use the proc pointer to access limits.	2015-06-10 10:48:12 +00:00


110 Commits