freebsd-skq

Author	SHA1	Message	Date
Andrey V. Elsukov	cc958ed28c	Follow the fix in r339532 (by glebius): Fix exiting an epoch(9) we never entered. May happen only with MAC. MFC after: 1 month	2018-10-21 18:30:27 +00:00
Andrey V. Elsukov	2c87fdf059	Rework if_ipsec(4) to use epoch(9) instead of rmlock. * use CK_LIST and FNV hash to keep chains of softc; * read access to softc is protected by epoch(); * write access is protected by ipsec_ioctl_sx. Changing of softc fields is allowed only when softc is unlinked from CK_LIST chains. * linking/unlinking of softc is allowed only when ipsec_ioctl_sx is exclusive locked. * the plain LIST of all softc is replaced by hash table that uses ingress address of tunnels as a key. * added support for appearing/disappearing of ingress address handling. Now it is allowed configure non-local ingress IP address, and thus the problem with if_ipsec(4) configuration that happens on boot, when ingress address is not yet configured, is solved. MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17190	2018-10-21 18:24:20 +00:00
Andrey V. Elsukov	df49ca9fd3	Add handling for appearing/disappearing of ingress addresses to if_me(4). * register handler for ingress address appearing/disappearing; * add new srcaddr hash table for fast softc lookup by srcaddr; * when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface, and set it otherwise; MFC after: 1 month Sponsored by: Yandex LLC	2018-10-21 18:18:37 +00:00
Andrey V. Elsukov	19873f4780	Add handling for appearing/disappearing of ingress addresses to if_gre(4). * register handler for ingress address appearing/disappearing; * add new srcaddr hash table for fast softc lookup by srcaddr; * when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface, and set it otherwise; MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17214	2018-10-21 18:13:45 +00:00
Andrey V. Elsukov	009d82ee0f	Add handling for appearing/disappearing of ingress addresses to if_gif(4). * register handler for ingress address appearing/disappearing; * add new srcaddr hash table for fast softc lookup by srcaddr; * when srcaddr disappears, clear IFF_DRV_RUNNING flag from interface, and set it otherwise; * remove the note about ingress address from BUGS section. MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17134	2018-10-21 18:06:15 +00:00
Andrey V. Elsukov	8251c68d5c	Add KPI that can be used by tunneling interfaces to handle IP addresses appearing and disappearing on the host system. Such handling is need, because tunneling interfaces must use addresses, that are configured on the host as ingress addresses for tunnels. Otherwise the system can send spoofed packets with source address, that belongs to foreign host. The KPI uses ifaddr_event_ext event to implement addresses tracking. Tunneling interfaces register event handlers and then they are notified by the kernel, when an address disappears or appears. ifaddr_event_compat() handler from if.c replaced by srcaddr_change_event() in the ip_encap.c MFC after: 1 month Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17134	2018-10-21 17:55:26 +00:00
Kristof Provost	5191a3aea6	vlan: Fix panic with lagg and vlan vlan_lladdr_fn() is called from taskqueue, which means there's no vnet context set. We can end up trying to send ARP messages (through the iflladdr_event event), which requires a vnet context. PR: 227654 MFC after: 3 days	2018-10-21 16:51:35 +00:00
Andrey V. Elsukov	64d63b1e03	Add ifaddr_event_ext event. It is similar to ifaddr_event, but the handler receives the type of event IFADDR_EVENT_ADD/IFADDR_EVENT_DEL, and the pointer to ifaddr. Also ifaddr_event now is implemented using ifaddr_event_ext handler. MFC after: 3 weeks Sponsored by: Yandex LLC Differential Revision: https://reviews.freebsd.org/D17100	2018-10-21 15:02:06 +00:00
Gleb Smirnoff	0a27163f8d	Fix exiting an epoch(9) we never entered. May happen only with MAC.	2018-10-21 12:39:00 +00:00
Hans Petter Selasky	c32a9d66fb	Fix deadlock when destroying VLANs. Synchronizing the epoch before freeing the multicast addresses while holding the VLAN_XLOCK() might lead to a deadlock. Use deferred freeing of the VLAN multicast addresses to resolve deadlock. Backtrace: Thread1: epoch_block_handler_preempt() ck_epoch_synchronize_wait() epoch_wait_preempt() vlan_setmulti() vlan_ioctl() in6m_release_task() gtaskqueue_run_locked() gtaskqueue_thread_loop() fork_exit() fork_trampoline() Thread2: sleepq_switch() sleepq_wait() _sx_xlock_hard() _sx_xlock() in6_leavegroup() in6_purgeaddr() if_purgeaddrs() if_detach_internal() if_detach() vlan_clone_destroy() if_clone_destroyif() if_clone_destroy() ifioctl() kern_ioctl() sys_ioctl() amd64_syscall() fast_syscall_common() syscall() Differential revision: https://reviews.freebsd.org/D17496 Reviewed by: slavash, mmacy Approved by: re (kib) Sponsored by: Mellanox Technologies	2018-10-15 10:29:29 +00:00
Eric Joyner	77c1fcec91	ixl/iavf(4): Change ixlv to iavf and update it to use iflib(9) Finishes the conversion of the 40Gb Intel Ethernet drivers to iflib(9) for FreeBSD 12.0, and fixes numerous bugs in both ixl(4) and iavf(4). This commit also re-adds the VF driver to GENERIC since it now compiles and functions. The VF driver name was changed from ixlv(4) to iavf(4) because the VF driver is now intended to be used with future products, not just with Fortville/Fort Park VFs. A man page update that documents these drivers is forthcoming in a separate commit. Reviewed by: sbruno@, kbowling@ Tested by: jeffrey.e.pieper@intel.com Approved by: re (gjb@) Relnotes: yes Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D16429	2018-10-12 22:40:54 +00:00
Jonathan T. Looney	13c6ba6d94	There are three places where we return from a function which entered an epoch section without exiting that epoch section. This is bad for two reasons: the epoch section won't exit, and we will leave the epoch tracker from the stack on the epoch list. Fix the epoch leak by making sure we exit epoch sections before returning. Reviewed by: ae, gallatin, mmacy Approved by: re (gjb, kib) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D17450	2018-10-09 13:26:06 +00:00
Michael Tuexen	580e30a33e	Use strlcpy() instead of strncpy(). Approved by: re (kib@) CID: 1395980, 1395981 X-MFC with: r339012 MFC after: 1 week	2018-10-03 07:35:16 +00:00
Michael Tuexen	1687b1ab24	For changing the MTU on tun/tap devices, it should not matter whether it is done via using ifconfig, which uses a SIOCSIFMTU ioctl() command, or doing it using a TUNSIFINFO/TAPSIFINFO ioctl() command. Without this patch, for IPv6 the new MTU is not used when creating routes. Especially, when initiating TCP connections after increasing the MTU, the old MTU is still used to compute the MSS. Thanks to ae@ and bz@ for helping to improve the patch. Reviewed by: ae@, bz@ Approved by: re (kib@) MFC after: 1 week Sponsored by: Netflix, Inc. Differential Revision: https://reviews.freebsd.org/D17180	2018-09-29 13:01:23 +00:00
Warner Losh	329e817fcc	Reapply, with minor tweaks, r338025, from the original commit: Remove unused and easy to misuse PNP macro parameter Inspired by r338025, just remove the element size parameter to the MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to have correct pointer (or array) type. Since all invocations of the macro already had this property and the emitted PNP data continues to include the element size, there is no functional change. Mostly done with the coccinelle 'spatch' tool: $ cat modpnpsize0.cocci @normaltables@ identifier b,c; expression a,d,e; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,d, -sizeof(d[0]), e); @singletons@ identifier b,c,d; expression a; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,&d, -sizeof(d), 1); $ rg -l MODULE_PNP_INFO -- sys \| \ xargs spatch --in-place --sp-file modpnpsize0.cocci (Note that coccinelle invokes diff(1) via a PATH search and expects diff to tolerate the -B flag, which BSD diff does not. So I had to link gdiff into PATH as diff to use spatch.) Tinderbox'd (-DMAKE_JUST_KERNELS). Approved by: re (glen)	2018-09-26 17:12:14 +00:00
Matt Macy	b08d611de8	fix vlan locking to permit sx acquisition in ioctl calls - update vlan(9) to handle changes earlier this year in multicast locking Tested by: np@, darkfiberu at gmail.com PR: 230510 Reviewed by: mjoras@, shurd@, sbruno@ Approved by: re (gjb@) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16808	2018-09-21 01:37:08 +00:00
Stephen Hurd	0c919c2370	Fix capabilities handling for iflib drivers Various capabilities were not being handled correctly in the SIOCSIFCAP handler. Specifically: IFCAP_RXCSUM and IFCAP_RXCSUM_IPV6 could be set even if not supported It was impossible to disable IFCAP_RXCSUM and/or IFCAP_RXCSUM_IPV6 via ifconfig since it does ioctl() per command-line flag rather than combine them into a single call. IFCAP_VLAN_HWCSUM could not be modified via the ioctl() Setting any combination of the three IFCAP_WOL flags would set only IFCAP_WOL_MCAST \| IFCAP_WOL_MAGIC. For example, setting only IFCAP_WOL_UCAST would result in both IFCAP_WOL_MCAST and IFCAP_WOL_MAGIC being enabled, but IFCAP_WOL_UCAST would not be enabled. Because if_vlancap() was called before if_togglecapenable(), vlan flags were sometimes not applied correctly. Interfaces were being unnecessarily stopped and restarted for WoL PR: 231151 Submitted by: Kaho Toshikazu <kaho@elam.kais.kyoto-u.ac.jp> Reported by: Shirkdog <mshirk@daemon-security.com> Reviewed by: galladin Approved by: re (gjb) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D17158	2018-09-20 19:35:35 +00:00
Andrey V. Elsukov	c6851ad04c	Restore outbound packets capturing for if_gre(4). It was missed in r335048. Also clear M_MCAST and M_BCAST flags for encapsulated datagram, since it will have new IP header. Approved by: re (kib)	2018-09-17 10:10:14 +00:00
Ruslan Bukin	86c5937532	Don't mark module data as static on RISC-V. Similar to arm64, riscv compiler uses PC-relative loads/stores, and with static data compiler does not emit relocations. In result, kernel module linker has nothing to fix and data accessed from the wrong location. Approved by: re (gjb) Sponsored by: DARPA, AFRL	2018-09-12 08:05:33 +00:00
Stephen Hurd	64e6fc1379	Clean up iflib sysctls Remove sysctls: txq_drain_encapfail - now a duplicate of encap_txd_encap_fail intr_link - was never incremented intr_msix - was never incremented rx_zero_len - was never incremented The following were not incremented in all code-paths that apply: m_pullups, mbuf_defrag, rxd_flush, tx_encap, rx_intr_enables, tx_frees, encap_txd_encap_fail. Fixes: Replace the broken collapse_pkthdr() implementation with an MPASS(). fl_refills and fl_refills_large were not incremented when using netmap. Reviewed by: gallatin Approved by: re (marius) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16733	2018-09-06 18:51:52 +00:00
Bjoern A. Zeeb	d4672c61d3	Rather than duplicating the functionality of a macro after r322866 use the already existing one. No functional changes. Reviewed by: karels, ae Approved by: re (rgrimes) Differential Revision: https://reviews.freebsd.org/D17004	2018-09-03 22:10:49 +00:00
Stephen Hurd	bc0e855bd9	Fix compile error due to missing parenthesis in r338372 Approved by: re (gjb)	2018-08-29 16:21:34 +00:00
Stephen Hurd	a520f8b6fe	Fix potential data corruption in iflib The MP ring may have txq pointers enqueued. Previously, these were passed to m_free() when IFC_QFLUSH was set. This patch checks for the value and doesn't call m_free(). Reviewed by: gallatin Approved by: re (gjb) Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16882	2018-08-29 15:55:25 +00:00
Mark Murray	19fa89e938	Remove the Yarrow PRNG algorithm option in accordance with due notice given in random(4). This includes updating of the relevant man pages, and no-longer-used harvesting parameters. Ensure that the pseudo-unit-test still does something useful, now also with the "other" algorithm instead of Yarrow. PR: 230870 Reviewed by: cem Approved by: so(delphij,gtetlow) Approved by: re(marius) Differential Revision: https://reviews.freebsd.org/D16898	2018-08-26 12:51:46 +00:00
Navdeep Parhar	47c39432b1	Unbreak VLANs after r337943. ether_set_pcp should not be called from ether_output_frame for VLAN interfaces -- the vid + pcp will be inserted during vlan_transmit in that case. r337943 sets the VLAN's ifnet's if_pcp to a proper PCP value and this led to double encapsulation (once with vid 0 and second time with vid+pcp). PR: 230794 Reviewed by: kib@ Approved by: re@ (gjb@) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D16887	2018-08-24 21:48:13 +00:00
Patrick Kelsey	249cc75fd1	Extended pf(4) ioctl interface and pfctl(8) to allow bandwidths of 2^32 bps or greater to be used. Prior to this, bandwidth parameters would simply wrap at the 2^32 boundary. The computations in the HFSC scheduler and token bucket regulator have been modified to operate correctly up to at least 100 Gbps. No other algorithms have been examined or modified for correct operation above 2^32 bps (some may have existing computation resolution or overflow issues at rates below that threshold). pfctl(8) will now limit non-HFSC bandwidth parameters to 2^32 - 1 before passing them to the kernel. The extensions to the pf(4) ioctl interface have been made in a backwards-compatible way by versioning affected data structures, supporting all versions in the kernel, and implementing macros that will cause existing code that consumes that interface to use version 0 without source modifications. If version 0 consumers of the interface are used against a new kernel that has had bandwidth parameters of 2^32 or greater configured by updated tools, such bandwidth parameters will be reported as 2^32 - 1 bps by those old consumers. All in-tree consumers of the pf(4) interface have been updated. To update out-of-tree consumers to the latest version of the interface, define PFIOC_USE_LATEST ahead of any includes and use the code of pfctl(8) as a guide for the ioctls of interest. PR: 211730 Reviewed by: jmallett, kp, loos MFC after: 2 weeks Relnotes: yes Sponsored by: RG Nets Differential Revision: https://reviews.freebsd.org/D16782	2018-08-22 19:38:48 +00:00
Eric Joyner	fe2bf351fe	if_media: Add new 2.5G/5G/25G/40G/50G/100G/200G/400G media types Upcoming Ethernet hardware will support new media types that aren't in the kernel yet, so they are added here. These mostly include new 25G/50G/100G media types; and this commit introduces new 200G/400G speeds and media. Reviewed by: hselasky@, jhb@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D16731	2018-08-22 18:19:56 +00:00
Matt Macy	77ad07b6a3	fix copy/paste error when clearing ifma flag CID: 1395119 Reported by: vangyzen	2018-08-21 22:59:22 +00:00
Conrad Meyer	b8e771e97a	Back out r338035 until Warner is finished churning GSoC PNP patches I was not aware Warner was making or planning to make forward progress in this area and have since been informed of that. It's easy to apply/reapply when churn dies down.	2018-08-19 00:46:22 +00:00
Conrad Meyer	faa319436f	Remove unused and easy to misuse PNP macro parameter Inspired by r338025, just remove the element size parameter to the MODULE_PNP_INFO macro entirely. The 'table' parameter is now required to have correct pointer (or array) type. Since all invocations of the macro already had this property and the emitted PNP data continues to include the element size, there is no functional change. Mostly done with the coccinelle 'spatch' tool: $ cat modpnpsize0.cocci @normaltables@ identifier b,c; expression a,d,e; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,d, -sizeof(d[0]), e); @singletons@ identifier b,c,d; expression a; declarer MODULE_PNP_INFO; @@ MODULE_PNP_INFO(a,b,c,&d, -sizeof(d), 1); $ rg -l MODULE_PNP_INFO -- sys \| \ xargs spatch --in-place --sp-file modpnpsize0.cocci (Note that coccinelle invokes diff(1) via a PATH search and expects diff to tolerate the -B flag, which BSD diff does not. So I had to link gdiff into PATH as diff to use spatch.) Tinderbox'd (-DMAKE_JUST_KERNELS).	2018-08-19 00:22:21 +00:00
Navdeep Parhar	32a52e9e39	if_vlan(4): A VLAN always has a PCP and its ifnet's if_pcp should be set to the PCP value in use instead of IFNET_PCP_NONE. MFC after: 1 week Sponsored by: Chelsio Communications	2018-08-17 01:03:23 +00:00
Navdeep Parhar	32d2623ae2	Add the ability to look up the 3b PCP of a VLAN interface. Use it in toe_l2_resolve to fill up the complete vtag and not just the vid. Reviewed by: kib@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D16752	2018-08-16 23:46:38 +00:00
Matt Macy	f9be038601	Fix in6_multi double free This is actually several different bugs: - The code is not designed to handle inpcb deletion after interface deletion - add reference for inpcb membership - The multicast address has to be removed from interface lists when the refcount goes to zero OR when the interface goes away - decouple list disconnect from refcount (v6 only for now) - ifmultiaddr can exist past being on interface lists - add flag for tracking whether or not it's enqueued - deferring freeing moptions makes the incpb cleanup code simpler but opens the door wider still to races - call inp_gcmoptions synchronously after dropping the the inpcb lock Fundamentally multicast needs a rewrite - but keep applying band-aids for now. Tested by: kp Reported by: novel, kp, lwhsu	2018-08-15 20:23:08 +00:00
Andrew Gallatin	5ccac9f972	lagg: allow lacp to manage the link state Lacp needs to manage the link state itself. Unlike other lagg protocols, the ability of lacp to pass traffic depends not only on the lagg members having link, but also on the lacp protocol converging to a distributing state with the link partner. If we prematurely mark the link as up, then we will send a gratuitous arp (via arp_handle_ifllchange()) before the lacp interface is capable of passing traffic. When this happens, the gratuitous arp is lost, and our link partner may cache a stale mac address (eg, when the base mac address for the lagg bundle changes, due to a BIOS change re-ordering NIC unit numbers) Reviewed by: jtl, hselasky Sponsored by: Netflix	2018-08-13 14:13:25 +00:00
Kristof Provost	91e0f2d200	pf: Increase default hash table size Now that we (by default) limit the number of states to 100.000 it makse sense to also adjust the default size of the hash table. Based on the benchmarking results in https://github.com/ocochard/netbenches/blob/master/Atom_C2758_8Cores-Chelsio_T540-CR/pf-states_hashsize/results/fbsd12-head.r332390/README.md 128K entries offers a good compromise between performance and memory use. Users may still overrule this setting with the net.pf.states_hashsize and net.pf.source_nodes_hashsize loader(8) tunables.	2018-08-05 13:54:37 +00:00
Patrick Kelsey	8f410865b8	Mark the send queue ready so ALTQ is available.	2018-08-04 01:45:17 +00:00
Andrew Turner	b6ea4c5a2a	As with DPCPU_DEFINE_STATIC make VNET_DEFINE_STATIC non-static on arm64 in modules. It also fails in the same way, we are unable to relocate static variables as the compiler uses PC-relative loads with nothing for the kernel linker to relocate. Sponsored by: DARPA, AFRL	2018-07-30 15:05:07 +00:00
Andrew Turner	cd2106eaea	Ensure the DPCPU and VNET module spaces are aligned to hold a pointer. Previously they may have been aligned to a char, leading to misaligned DPCPU and VNET variables. Sponsored by: DARPA, AFRL	2018-07-30 14:25:17 +00:00
Andrew Turner	bc61d94997	As with DPCPU_DEFINE make it a compile error to use static with VNET_DEFINE. There is the VNET_DEFINE_STATIC macro for that.	2018-07-30 12:44:44 +00:00
Patrick Kelsey	b8ca475604	ALTQ support for iflib. Reviewed by: jmallett, mmacy Differential Revision: https://reviews.freebsd.org/D16433	2018-07-25 22:46:36 +00:00
Marius Strobl	c9a49a4fd8	Since r336611, n is only used for INET in iflib_parse_header(). Reported by: rpokala	2018-07-24 23:40:27 +00:00
Andrew Turner	5f901c92a8	Use the new VNET_DEFINE_STATIC macro when we are defining static VNET variables. Reviewed by: bz Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16147	2018-07-24 16:35:52 +00:00
Andrew Turner	fceba23f93	As with DPCPU create VNET_DEFINE_STATIC for when a variable needs to be declaired static. This will allow us to change the definition on arm64 as it has the same issues described in r336349. Reviewed by: bz Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D16147	2018-07-24 16:31:16 +00:00
Eugene Grosbein	804771f553	epair(4): make sure we do not duplicate MAC addresses in case of reused if_index. PR: 229957 Tested by: O. Hartmann <ohartmann@walstatt.org> Approved by: avg (mentor)	2018-07-23 07:11:58 +00:00
Marius Strobl	7474544bac	Use the maximum of isc_tx_{nsegments,tso_segments_max} for MAX_TX_DESC. Since r336313, TSO support for LEM-class devices is removed again as it was before the conversion of {l,}em(4) to iflib(4) in r311849 and as a result, isc_tx_tso_segments_max is 0 for LEM-class devices now. Thus, inappropriate watermarks were used for this class. This is really only a band-aid, though, because so far iflib(9) doesn't fully take into account that DMA engines can support different maxima of segments for transfers of TSO and non-TSO packets. For example, the DESC_RECLAIMABLE macro is based on isc_tx_nsegments while MAX_TX_DESC used isc_tx_tso_segments_max only. For most in-tree consumers that doesn't make a difference as the maxima are the same for both kinds of transfers (that is, apart from the fact that TSO may require up to 2 sentinel descriptors but also not with every MAC supported). However, isc_tx_nsegments is 8 but isc_tx_tso_segments_max is 85 by default with ixl(4).	2018-07-22 17:51:11 +00:00
Marius Strobl	8b8d90931d	- Given that the controlling expression of the receive loop in iflib_rxeof() tests for avail > 0, avail can never be 0 within that loop. Thus, move decrementing avail and budget_left into the loop and before the code which checks for additional descriptors having become available in case all the previous ones have been processed but there still is budget left so the latter code works as expected. [1] - In iflib_{busdma_load_mbuf_sg,parse_header}(), remove dead stores to m and n respectively. [2, 3] - In collapse_pkthdr(), ensure that m_next isn't NULL before dereferencing it. [4] - Remove a duplicate assignment of segs in iflib_encap(). Reported by: Coverity CID: 1356027 [1], 1356047 [2], 1368205 [3], 1356028 [4]	2018-07-22 17:45:44 +00:00
Stephen Hurd	fe51d4cdfe	Add knob to control tx ring abdication. r323954 changed the mp ring behaviour when 64-bit atomics were available to abdicate the TX ring rather than having one become a consumer thereby running to completion on TX. The consumer of the mp ring was then triggered in the tx task rather than blocking the TX call. While this significantly lowered the number of RX drops in small-packet forwarding, it also negatively impacts TX performance. With this change, the default behaviour is reverted, causing one TX ring to become a consumer during the enqueue call. A new sysctl, dev.X.Y.iflib.tx_abdicate is added to control this behaviour. Reviewed by: gallatin Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16302	2018-07-20 17:45:26 +00:00
Stephen Hurd	dd7fbcf193	Improve netmap TX handling when TX IRQs are not used/supported Use the timer to poll for TX completions when there are outstanding TX slots. Track when the last driver timer was called to prevent overcalling it. Also clean up some kring vs NIC ring usage. Reviewed by: marius, Johannes Lundberg <johalun0@gmail.com> Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D16300	2018-07-20 17:24:45 +00:00
Andrey V. Elsukov	acf673edf0	Move invoking of callout_stop(&lle->lle_timer) into llentry_free(). This deduplicates the code a bit, and also implicitly adds missing callout_stop() to in[6]_lltable_delete_entry() functions. PR: 209682, 225927 Submitted by: hselasky (previous version) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D4605	2018-07-17 11:33:23 +00:00
Marius Strobl	7f87c0406d	Assorted TSO fixes for em(4)/iflib(9) and dead code removal: - Ever since the workaround for the silicon bug of TSO4 causing MAC hangs was committed in r295133, CSUM_TSO always got disabled unconditionally by em(4) on the first invocation of em_init_locked(). However, even with that problem fixed, it turned out that for at least e. g. 82579 not all necessary TSO workarounds are in place, still causing MAC hangs even at Gigabit speed. Thus, for stable/11, TSO usage was deliberately disabled in r323292 (r323293 for stable/10) for the EM-class by default, allowing users to turn it on if it happens to work with their particular EM MAC in a Gigabit-only environment. In head, the TSO workaround for speeds other than Gigabit was lost with the conversion to iflib(9) in r311849 (possibly along with another one or two TSO workarounds). Yet at the same time, for EM-class MACs TSO4 got enabled by default again, causing device hangs. Therefore, change the default for this hardware class back to have TSO4 off, allowing users to turn it on manually if it happens to work in their environment as we do in stable/{10,11}. An alternative would be to add a whitelist of EM-class devices where TSO4 actually is reliable with the workarounds in place, but given that the advantage of TSO at Gigabit speed is rather limited - especially with the overhead of these workarounds -, that's really not worth it. [1] This change includes the addition of an isc_capabilities to struct if_softc_ctx so iflib(9) can also handle interface capabilities that shouldn't be enabled by default which is used to handle the default-off capabilities of e1000 as suggested by shurd@ and moving their handling from em_setup_interface() to em_if_attach_pre() accordingly. - Although 82543 support TSO4 in theory, the former lem(4) didn't have support for TSO4, presumably because TSO4 is even more broken in the LEM-class of MACs than the later EM ones. Still, TSO4 for LEM-class devices was enabled as part of the conversion to iflib(9) in r311849, causing device hangs. So revert back to the pre-r311849 behavior of not supporting TSO4 for LEM-class at all, which includes not creating a TSO DMA tag in iflib(9) for devices not having IFCAP_TSO4 set. [2] - In fact, the FreeBSD TCP stack can handle a TSO size of IP_MAXPACKET (65535) rather than FREEBSD_TSO_SIZE_MAX (65518). However, the TSO DMA must have a maxsize of the maximum TSO size plus the size of a VLAN header for software VLAN tagging. The iflib(9) converted em(4), thus, first correctly sets scctx->isc_tx_tso_size_max to EM_TSO_SIZE in em_if_attach_pre(), but later on overrides it with IP_MAXPACKET in em_setup_interface() (apparently, left-over from pre-iflib(9) times). So remove the later and correct iflib(9) to correctly cap the maximum TSO size reported to the stack at IP_MAXPACKET. While at it, let iflib(9) use if_sethwtsomax(). This change includes the addition of isc_tso_max{seg,}size DMA engine constraints for the TSO DMA tag to struct if_shared_ctx and letting iflib_txsd_alloc() automatically adjust the maxsize of that tag in case IFCAP_VLAN_MTU is supported as requested by shurd@. - Move the if_setifheaderlen(9) call for adjusting the maximum Ethernet header length from {ixgbe,ixl,ixlv,ixv,em}_setup_interface() to iflib(9) so adjustment is automatically done in case IFCAP_VLAN_MTU is supported. As a consequence, this adjustment now is also done in case of bnxt(4) which missed it previously. - Move the reduction of the maximum TSO segment count reported to the stack by the number of m_pullup(9) calls (which in the worst case, can add another mbuf and, thus, the requirement for another DMA segment each) in the transmit path for performance reasons from em_setup_interface() to iflib_txsd_alloc() as these pull-ups are now done in iflib_parse_header() rather than in the no longer existing em_xmit(). Moreover, this optimization applies to all drivers using iflib(9) and not just em(4); all in-tree iflib(9) consumers still have enough room to handle full size TSO packets. Also, reduce the adjustment to the maximum number of m_pullup(9)'s now performed in iflib_parse_header(). - Prior to the conversion of em(4)/igb(4)/lem(4) and ixl(4) to iflib(9) in r311849 and r335338 respectively, these drivers didn't enable IFCAP_VLAN_HWFILTER by default due to VLAN events not being passed through by lagg(4). With iflib(9), IFCAP_VLAN_HWFILTER was turned on by default but also lagg(4) was fixed in that regard in r203548. So just remove the now redundant and defunct IFCAP_VLAN_HWFILTER handling in {em,ixl,ixlv}_setup_interface(). - Nuke other redundant IFCAP_ setting in {em,ixl,ixlv}_setup_interface() which is (more completely) already done in {em,ixl,ixlv}_if_attach_pre() now. - Remove some redundant/dead setting of scctx->isc_tx_csum_flags in em_if_attach_pre(). - Remove some IFCAP_* duplicated either directly or indirectly (e. g. via IFCAP_HWCSUM) in {EM,IGB,IXL}_CAPS. - Don't bother to fiddle with IFCAP_HWSTATS in ixgbe(4)/ixgbev(4) as iflib(9) adds that capability unconditionally. - Remove some unused macros from em(4). - Bump __FreeBSD_version as some of the above changes require the modules of drivers using iflib(9) to be recompiled. Okayed by: sbruno@ at 201806 DevSummit Transport Working Group [1] Reviewed by: sbruno (earlier version), erj PR: 219428 (part of; comment #10) [1], 220997 (part of; comment #3) [2] Differential Revision: https://reviews.freebsd.org/D15720	2018-07-15 19:04:23 +00:00

1 2 3 4 5 ...

4017 Commits