freebsd-dev

Author	SHA1	Message	Date
Mark Johnston	f161d294b9	Add missing sockaddr length and family validation to various protocols Several protocol methods take a sockaddr as input. In some cases the sockaddr lengths were not being validated, or were validated after some out-of-bounds accesses could occur. Add requisite checking to various protocol entry points, and convert some existing checks to assertions where appropriate. Reported by: syzkaller+KASAN Reviewed by: tuexen, melifaro MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29519	2021-05-03 13:35:19 -04:00
Bjoern A. Zeeb	7069b4c6a4	LinuxKPI/OFED: (re)move inetdevice.h implementation The two functions in linux/inetdevice.h are highly FreeBSD/ifnet specific. This is a result of struct net_device being mapped to struct ifnet. The only known consumer of these functions are two files in the ofed/infiniband code. As a first step of cleaning up copy linux/inetdevice.h to rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve copyright and license of the original file; otherwise it could be merged into ib_addr.h where more EPOCH/vnet/.. are already used). Slightly rename the function to not conflict with LinuxKPI in the future. Remove the three last, now unneeded includes of inetdevice.h and zap linux/inetdevice.h to an empty header file with only the forward include to netdevice.h remaining. Sponsored-by: The FreeBSD Foundation MFC-after: 2 weeks Reviewed-by: hselasky, kib X-D-R: D29366 (extracted as further cleanup) Differential Revision: https://reviews.freebsd.org/D29434	2021-03-30 14:40:46 +00:00
Hans Petter Selasky	4e38478c59	ipoib: Fix incorrectly computed IPOIB_CM_RX_SG value. The computed IPOIB_CM_RX_SG is too small. It doesn't account for fallback to mbuf clusters when jumbo frames are not available and it also doesn't account for the packet header and trailer mbuf. This causes a memory overwrite situation when IPOIB_CM is configured. While at it add a kernel assert to ensure the mapping array is not overwritten. PR: 254474 MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2021-03-25 16:55:37 +01:00
Bjoern A. Zeeb	3b1ecc9fa1	LinuxKPI: remove < 5.0 version support We are not aware of any out-of-tree consumers anymore which would need KPI support for before Linux version 5. Update the two in-tree consumers to use the new KPI. This allows us to remove the extra version check and will also give access to {lower,upper}_32_bits() unconditionally. Sponsored-by: The FreeBSD Foundation Reviewed-by: hselasky, rlibby, rstone MFC-after: 2 weeks X-MFC: to 13 only Differential Revision: https://reviews.freebsd.org/D29391	2021-03-24 23:00:03 +00:00
Bjoern A. Zeeb	a29bbfe6c6	ofed/linuxkpi: use proper accessor function In the notifier event callback function rather than casting directly to the expected type use the proper accessor function as the mlx drivers already do. This is preparational work to allow us to improve the struct net_device is struct ifnet compat code shortcut in the future. Obtained-from: bz_iwlwifi Sponsored-by: The FreeBSD Foundation MFC-after: 2 weeks Reviewed-by: hselasky Differential Revision: https://reviews.freebsd.org/D29364	2021-03-24 22:23:34 +00:00
Ryan Libby	bf667f282a	ofed: quiet gcc -Wint-in-bool-context The int in the argument to the ternary triggered -Wint-in-bool-context from gcc. Upstream linux has a larger and more entangled patch, 12f727721eee61b3d19dedb95cb893b2baa9fe41, which doesn't apply cleanly. When we eventually sync that, we can just drop this change. Reviewed by: hselasky, imp, kib MFC after: 3 days Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D28762	2021-02-24 15:56:16 -08:00
Kyle Evans	4c0bef07be	kern: net: remove TCP_LINGERTIME TCP_LINGERTIME can be traced back to BSD 4.4 Lite and perhaps beyond, in exactly the same form that it appears here modulo slightly different context. It used to be the case that there was a single pr_usrreq method with requests dispatched to it; these exact two lines appeared in tcp_usrreq's PRU_ATTACH handling. The only purpose of this that I can find is to cause surprising behavior on accepted connections. Newly-created sockets will never hit these paths as one cannot set SO_LINGER prior to socket(2). If SO_LINGER is set on a listening socket and inherited, one would expect the timeout to be inherited rather than changed arbitrarily like this -- noting that SO_LINGER is nonsense on a listening socket beyond inheritance, since they cannot be 'connected' by definition. Neither Illumos nor Linux reset the timer like this based on testing and inspection of Illumos, and testing of Linux. Reviewed by: rscheff, tuexen Differential Revision: https://reviews.freebsd.org/D28265	2021-02-18 22:36:01 -06:00
Ryan Stone	8a06ca2f73	Fix mismerge in OFED update When OFED was upgraded to Linux v4.9, a bunch of Linux-specific netlink changes were dropped. Unfortunately, there was a mismerge in this process and as a result ib_sa_cancel_query() would fail to cancel an outstanding MAD. This was causing rdma_destroy_id() to hang indefinitely waiting for the MAD to complete and release the final reference. Sponsored by: Dell Inc. Differential Revision: https://reviews.freebsd.org/D28421 Reviewed by: hselasky, kib MFC after: 2 months	2021-02-04 13:58:24 -05:00
Hans Petter Selasky	f8f5b459d2	Update user access region, UAR, APIs in the core in mlx5core. This change includes several changes as listed below all related to UAR. UAR is a special PCI memory area where the so-called doorbell register and blue flame register live. Blue flame is a feature for sending small packets more efficiently via a PCI memory page, instead of using PCI DMA. - All structures and functions named xxx_uuars were renamed into xxx_bfreg. - Remove partially implemented Blueflame support from mlx5en(4) and mlx5ib. - Implement blue flame register allocator. - Use blue flame register allocator in mlx5ib. - A common UAR page is now allocated by the core to support doorbell register writes for all of mlx5en and mlx5ib, instead of allocating one UAR per sendqueue. - Add support for DEVX query UAR. - Add support for 4K UAR for libmlx5. Linux commits: 7c043e908a74ae0a935037cdd984d0cb89b2b970 2f5ff26478adaff5ed9b7ad4079d6a710b5f27e7 0b80c14f009758cefeed0edff4f9141957964211 30aa60b3bd12bd79b5324b7b595bd3446ab24b52 5fe9dec0d045437e48f112b8fa705197bd7bc3c0 0118717583cda6f4f36092853ad0345e8150b286 a6d51b68611e98f05042ada662aed5dbe3279c1e MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2021-01-08 13:33:46 +01:00
Hans Petter Selasky	44a1b2f07b	Fix for referencing file via its vnode in ibcore. Use the native vnode lookup functions, instead of going via the LinuxKPI, because the file referenced is typically created outside the LinuxKPI, and the LinuxKPI's fdget() can only resolve file descriptor numbers which were created by itself. The vnode pointer is used as an identifier to identify XRCD handles which are sharing resources. This patch fixes the so-called XRCD support in ibcore for FreeBSD. Refer to ibv_open_xrcd(3) for more information on how the file descriptor argument is used. Reviewed by: kib@ MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2020-11-02 10:44:29 +00:00
Hans Petter Selasky	9d40cf60d6	Factor out generic IP over infiniband, IPoIB, definitions and code into net/if_infiniband.c and net/infiniband.h . No functional change intended. Differential Revision: https://reviews.freebsd.org/D26254 Reviewed by: melifaro@ MFC after: 1 week Sponsored by: Mellanox Technologies // NVIDIA Networking	2020-10-22 09:09:53 +00:00
Ravi Pokala	9f6f4168b4	Allow IP over IB to work with multiple FIBs. Call M_SETFIB() to make sure the IPoIB packet is directed to the correct interface-specific FIB. This was sufficient to allow general-purpose routing using the default FIB, and a separate FIB for routing between IPoIB on ib0 and IPoEthernet on mce0. Reviewed by: hselasky Obtained from: Anmol Kumar <anmolk at panasas dot com> MFC after: 1 week Sponsored by: Panasas Differential Revision: https://reviews.freebsd.org/D25239	2020-10-13 20:41:51 +00:00
Eric van Gyzen	536457e181	infiniband: Appease Coverity Coverity claims the call to rdma_gid2ip in cma_igmp_send overwrites addr. Use a consistent definition of sockaddr to prevent detections and code changes in the future. Submitted by: bret_ketchum@dell.com Reported by: Coverity Reviewed by: hselasky, kib MFC after: 2 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D26229	2020-08-31 16:17:28 +00:00
Hans Petter Selasky	1866c98e64	Infiniband clients must be attached and detached in a specific order in ibcore. Currently the linking order of the infiniband, IB, modules decide in which order the clients are attached and detached. For example one IB client may use resources from another IB client. This can lead to a potential deadlock at shutdown. For example if the ipoib is unregistered after the ib_multicast client is detached, then if ipoib is using multicast addresses a deadlock may happen, because ib_multicast will wait for all its resources to be freed before returning from the remove method. Fix this by using module_xxx_order() instead of module_xxx(). Differential Revision: https://reviews.freebsd.org/D23973 MFC after: 1 week Sponsored by: Mellanox Technologies	2020-07-06 08:50:11 +00:00
Alexander V. Chernikov	0f3bf68212	Convert OFED rtable interactions to the new routing KPI. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D24387	2020-04-15 13:06:55 +00:00
Hans Petter Selasky	1c6a456125	Fix for double unlock in ipoib. The ipoib_unicast_send() function is not supposed to unlock the priv lock. MFC after: 3 days Sponsored by: Mellanox Technologies	2020-03-16 12:33:57 +00:00
Hans Petter Selasky	5d4562cb32	Fix some whitespace issues in ipoib. MFC after: 1 week Sponsored by: Mellanox Technologies	2020-03-06 09:59:07 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Hans Petter Selasky	ae5b45c86e	Make sure the VNET is properly set when reaping mbufs in ipoib. Else the following panic may happen: panic() icmp_error() ipoib_cm_mb_reap() linux_work_fn() taskqueue_run_locked() taskqueue_thread_loop() fork_exit() fork_trampoline() Submitted by: Andreas Kempe <kempe@lysator.liu.se> MFC after: 1 week Sponsored by: Mellanox Technologies	2020-01-11 12:02:16 +00:00
Hans Petter Selasky	9220357857	Prevent potential underflow in ibcore. Linux commit: a9018adfde809d44e71189b984fa61cc89682b5e MFC after: 1 week Sponsored by: Mellanox Technologies	2019-11-15 11:46:53 +00:00
Hans Petter Selasky	ae9a8ec99f	Correct MR length field to be 64-bit in ibcore. Linux commit: edd31551148c09608feee6b8756ad148d550ee3b MFC after: 1 week Sponsored by: Mellanox Technologies	2019-11-15 11:45:14 +00:00
Hans Petter Selasky	758a35d0bc	VLAN_TRUNKDEV() requires epochification in ibcore after r353292. Sponsored by: Mellanox Technologies	2019-10-16 08:56:07 +00:00
Hans Petter Selasky	052bc06c25	Replace rdma_is_upper_dev_rcu() with rdma_vlan_dev_real_dev() in ibcore. This reduces the number of references to VLAN_TRUNKDEV() in ibcore. Currently only VLAN is supported as a child interface in FreeBSD. Remove superfluous RCU locking. Sponsored by: Mellanox Technologies	2019-10-16 08:55:29 +00:00
Hans Petter Selasky	8232fd4dcd	VLAN_DEVAT() requires epochification in ipoib after r353292. Sponsored by: Mellanox Technologies	2019-10-16 08:40:58 +00:00
Hans Petter Selasky	06656d75b1	Fix missing epochification of the ibcore code after r353292. Sponsored by: Mellanox Technologies	2019-10-15 11:12:31 +00:00
Hans Petter Selasky	f570a1bd09	Fix missing epochification of the ipoib code after r353292. Sponsored by: Mellanox Technologies	2019-10-15 11:11:21 +00:00
Gleb Smirnoff	f95dbf26d3	Convert to if_foreach_llmaddr() KPI. Reviewed by: hselasky	2019-10-14 20:22:25 +00:00
Gleb Smirnoff	b8a6e03fac	Widen NET_EPOCH coverage. When epoch(9) was introduced to network stack, it was basically dropped in place of existing locking, which was mutexes and rwlocks. For the sake of performance mutex covered areas were as small as possible, so became epoch covered areas. However, epoch doesn't introduce any contention, it just delays memory reclaim. So, there is no point to minimise epoch covered areas in sense of performance. Meanwhile entering/exiting epoch also has non-zero CPU usage, so doing this less often is a win. Not the least is also code maintainability. In the new paradigm we can assume that at any stage of processing a packet, we are inside network epoch. This makes coding both input and output path way easier. On output path we already enter epoch quite early - in the ip_output(), in the ip6_output(). This patch does the same for the input path. All ISR processing, network related callouts, other ways of packet injection to the network stack shall be performed in net_epoch. Any leaf function that walks network configuration now asserts epoch. Tricky part is configuration code paths - ioctls, sysctls. They also call into leaf functions, so some need to be changed. This patch would introduce more epoch recursions (see EPOCH_TRACE) than we had before. They will be cleaned up separately, as several of them aren't trivial. Note, that unlike a lock recursion the epoch recursion is safe and just wastes a bit of resources. Reviewed by: gallatin, hselasky, cy, adrian, kristof Differential Revision: https://reviews.freebsd.org/D19111	2019-10-07 22:40:05 +00:00
Hans Petter Selasky	6fe20cefa3	Make sure the transmit loop doesn't get starved in ipoib. When the software send queue gets filled up, callbacks to if_transmit will stop. Make sure the transmit callback routine checks the send queue and outputs any remaining mbufs. Else the remaining mbufs may simply sit in the output queue blocking the transmit path. MFC after: 3 days Sponsored by: Mellanox Technologies	2019-10-02 09:06:13 +00:00
Conrad Meyer	f49e79b56b	OFED: Fix accidental double-copy of rdma_sdp.h in r351176 The mistake came about like this: the first attempt to commit was blocked by a pre-commit hook due to missing SVN tags. svn revert doesn't delete new files, I guess. While reapplying the fixed diff, the non-empty target file was just concatenated with the new contents? Ugh. :-(	2019-08-18 04:19:41 +00:00
Conrad Meyer	6b2f017186	OFED: Unbreak SDP support in ibcore This regression was introduced in the r326169 Linux v4.9 Infiniband upgrade. Restore the functionality. Reviewed by: hselasky Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21298	2019-08-17 18:54:07 +00:00
Conrad Meyer	1c334042f9	SDP: Fix brain-o from r351162 Lost in translation between different SDP stacks. Reported by: hselasky	2019-08-17 10:11:34 +00:00
Conrad Meyer	14f19c6b73	OFED: Fix ib_mad.h ib_user_mad.h include to match new uapi path Sponsored by: Dell EMC Isilon	2019-08-17 03:09:03 +00:00
Conrad Meyer	bab54619e9	SDP: Add a dbg() on QP events Sponsored by: Dell EMC Isilon	2019-08-17 03:07:41 +00:00
Conrad Meyer	1ac512e80b	SDP: Also log a nice status string in RX WC error dbg() Sponsored by: Dell EMC Isilon	2019-08-17 03:06:46 +00:00
Conrad Meyer	92d90f6142	SDP: Include nice string names for raw event numbers in a dbg() Sponsored by: Dell EMC Isilon	2019-08-17 03:05:09 +00:00
Conrad Meyer	6669d5459b	SDP: SYSCTL_DECL SDP-wide sysctl node in header This allows use of the shared _net_inet_sdp in more than one compilation unit. (Nothing in-tree uses this today, but some of Isilon's out-of-tree SDP enhancements add sysctls below the node.) Sponsored by: Dell EMC Isilon	2019-08-17 03:03:26 +00:00
Slava Shwartsman	bb43866c38	Fix prio vs. nonprio tagged traffic in RDMACM In the current RDMACM implementation, the RDMACM server will not find a GID index when the request was prio-tagged and the server is non prio-tagged, and vice versa. According to 802.1Q-2014, VLAN tagged packets with VLAN id 0 should be considered as untagged. Treat RDMACM requests the same way. Reviewed by: hselasky, kib MFC after: 3 Days Sponsored by: Mellanox Technologies	2019-06-04 06:21:31 +00:00
Conrad Meyer	e12be3218a	Include eventhandler.h in more compilation units This was enumerated with exhaustive search for sys/eventhandler.h includes, cross-referenced against EVENTHANDLER_* usage with the comm(1) utility. Manual checking was performed to avoid redundant includes in some drivers where a common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is possible some of these are redundant with driver-specific headers in ways I didn't notice. (These CUs did not show up as missing eventhandler.h in tinderbox.) X-MFC-With: r347984	2019-05-21 01:18:43 +00:00
Hans Petter Selasky	013f1e1435	Add new rates to ibcore. Add the new rates that were added to the Infiniband specification as part of HDR and 2x support. Submitted by: slavash@ MFC after: 3 days Sponsored by: Mellanox Technologies	2019-05-08 10:55:47 +00:00
Hans Petter Selasky	afbe61616f	Handle IB_EVENT_DEVICE_FATAL event in ipoib. Perform flush if IB_EVENT_DEVICE_FATAL was received. Submitted by: slavash@ MFC after: 3 days Sponsored by: Mellanox Technologies	2019-05-08 10:51:49 +00:00
Hans Petter Selasky	cb678cb911	Fix endless loop in ipoib_poll(). ib_req_notify_cq may return negative value which will indicate a failure. In the case of uncorrectable error, we will end up in an endless loop. Fix that, by going to another loop with poll_more only if there is anything left to poll. Submitted by: slavash@ MFC after: 3 days Sponsored by: Mellanox Technologies	2019-05-08 10:42:05 +00:00
Hans Petter Selasky	38f38e9fda	Make sure to error out when arming the CQ fails in ibcore. MFC after: 3 days Sponsored by: Mellanox Technologies	2019-05-08 10:32:45 +00:00
Gleb Smirnoff	a68cc38879	Mechanical cleanup of epoch(9) usage in network stack. - Remove macros that covertly create epoch_tracker on thread stack. Such macros are quite unsafe, e.g. they will produce buggy code if the same macro is used in embedded scopes. Explicitly declare epoch_tracker always. - Unmask interface list IFNET_RLOCK_NOSLEEP(), interface address list IF_ADDR_RLOCK() and interface AF specific data IF_AFDATA_RLOCK() read locking macros to what they actually are - the net_epoch. Keeping them as is is very misleading. They all are named FOO_RLOCK(), while they no longer have lock semantics. Now they allow recursion and, more importantly, they no longer guarantee protection against their companion WLOCK macros. Note: INP_HASH_RLOCK() has the same problems, but is not touched by this commit. This is a non-functional mechanical change. The only functionally changed functions are ni6_addrs() and ni6_store_addrs(), where we no longer enter epoch recursively. Discussed with: jtl, gallatin	2019-01-09 01:11:19 +00:00
Mark Johnston	2f2ddd68a5	Support MSG_DONTWAIT in send(2). As it does for recv(2), MSG_DONTWAIT indicates that the call should not block, returning EAGAIN instead. Linux and OpenBSD both implement this, so the change makes porting easier, especially since we do not return EINVAL or so when unrecognized flags are specified. Submitted by: Greg V <greg@unrelenting.technology> Reviewed by: tuexen MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D18728	2019-01-04 17:31:50 +00:00
Slava Shwartsman	186058782a	ipoib: Notify on modify QP failure only when relevant Modify QP can fail and it can be acceptable, like when moving from RST to ERR state, all the rest are not acceptable and a message to the log should be printed. The current code prints on all failures and many messages like: "Failed to modify QP to ERROR state" appear, even when supported by the state machine of the QP object. Linux commit: 5dc78ad1904db597bdb4427f3ead437aae86f54c Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:27:17 +00:00
Slava Shwartsman	a061f0eb65	ipoib: increase the non-cm queue length When a packet needs fragmentation, it might generate more than 3 fragments. With the queue length 3, all fragments are generated faster than the queue is drained, which effectively drops fourth and later fragments on the floor. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:26:47 +00:00
Slava Shwartsman	d705eff259	ipoib: Don't do a light flush when MTU is unchanged. When changing the MTU of ibX network interfaces, check that the MTU was really changed before requesting an update of the multicast rules. Else we might go into an infinite loop joining and leaving ibX multicast groups towards the opensm master interface. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:26:17 +00:00
Slava Shwartsman	099ad46e81	ipoib: correct setting MTU from inside ipoib(4). It is not enough to set ifnet->if_mtu to change the interface MTU. The system saves the MTU for the route in the radix tree, and the route cache keeps the interface MTU as well. Since addition of the multicast group causes recalculation of MTU, even bringing the interface up changes MTU from 4042 to 1500, which makes the system configuration inconsistent. Worse, ip_output() prefers route MTU over interface MTU, so large packets are not fragmented and are dropped on the floor. Fix it for ipoib(4) using the same approach (or hack) as was applied for if_tun/if_tap in r339012. Thanks to bz@ for giving the hint. Submitted by: kib@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:25:47 +00:00
Slava Shwartsman	e13619b68b	ibcore: Fix clearing of bound device interface. Binding to a loopback device is not allowed. Make sure the destination device address is global by clearing the bound device interface. Only do this conditionally, else link local addresses won't work. Submitted by: hselasky@ Approved by: hselasky (mentor) MFC after: 1 week Sponsored by: Mellanox Technologies	2018-12-05 13:25:13 +00:00
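
The sockaddr validation described in commit f161d294b9 above can be illustrated with a small userland sketch. This is not the code from that commit: the helper name check_sockaddr_in() is hypothetical, and it takes an explicit length argument (rather than reading sa_len) so that it also compiles on systems without that field. The point it shows is the ordering: the length is checked before any field of the sockaddr is read, so a short buffer cannot cause an out-of-bounds access.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper (illustration only): validate a caller-supplied
 * IPv4 sockaddr before it is cast to struct sockaddr_in.
 * Returns 0 on success or an errno value on failure.
 */
static int
check_sockaddr_in(const struct sockaddr *sa, socklen_t len)
{
	if (sa == NULL)
		return (EINVAL);

	/* Length first: sa_family must lie within the supplied bytes. */
	if (len < (socklen_t)(offsetof(struct sockaddr, sa_family) +
	    sizeof(sa->sa_family)))
		return (EINVAL);

	/* Only now is it safe to look at the address family. */
	if (sa->sa_family != AF_INET)
		return (EAFNOSUPPORT);

	/* The full structure must be present before its fields are used. */
	if (len != (socklen_t)sizeof(struct sockaddr_in))
		return (EINVAL);

	return (0);
}
```

A protocol entry point would call such a check at the top and return the error immediately; deeper helpers that are only reachable after the check can then assert the same conditions instead of re-validating them.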

431 Commits