freebsd-skq

Author	SHA1	Message	Date
Vincenzo Maffione	1d238b07d5	netmap: iflib: stop krings during interface reset When different processes open separate subsets of the available rings of the same netmap interface, a device reset may be performed while one of the processes is actively using some rings (e.g., caused by another process executing nmport_open()). With this patch, such a situation will cause the active process to get a POLLERR, so that it has a chance to detect it. We also guarantee that no process is running a txsync or rxsync (ioctl or poll) while an iflib device reset is in progress. PR: 252453 MFC after: 1 week	2021-01-09 21:01:46 +00:00
Matt Macy	81be655266	iflib: ensure that tx interrupts enabled and cleanups Doing a 'dd' over iscsi will reliably cause stalls. Tx cleaning _should_ reliably happen as data is sent. However, currently if the transmit queue fills it will wait until the iflib timer (hz/2) runs. This change causes the tx taskq thread to be run if there are completed descriptors. While here: - make timer interrupt delay a sysctl - simplify txd_db_check handling - comment on INTR types Background on the change: Initially doorbell updates were minimized by only writing to the register on every fourth packet. If txq_drain would return without writing to the doorbell it scheduled a callout on the next tick to do the doorbell write to ensure that the write otherwise happened "soon". At that time a sysctl was added for users to avoid the potential added latency by simply writing to the doorbell register on every packet. This worked perfectly well for e1000 and ixgbe ... and appeared to work well on ixl. However, as it turned out there was a race in this approach that would lock up the ixl MAC. It was possible for a lower producer index to be written after a higher one. On e1000 and ixgbe this was harmless - on ixl it was fatal. My initial response was to add a lock around doorbell writes - fixing the problem but adding an unacceptable amount of lock contention. The next iteration was to use transmit interrupts to drive delayed doorbell writes. If there were no packets in the queue, all doorbell writes would be immediate; as the queue started to fill up, we could delay doorbell writes further and further. At the start of drain, if we've cleaned any packets we know we've moved the state machine along and we write the doorbell (an obvious missing optimization was to skip that doorbell write if db_pending is zero). This change required that tx interrupts be scheduled periodically as opposed to just when the hardware txq was full. However, that just leads to our next problem.
Initially dedicated msix vectors were used for both tx and rx. However, it was often possible to use up all available vectors before we set up all the queues we wanted. By having rx and tx share a vector for a given queue we could halve the number of vectors used by a given configuration. The problem here is that with this change only e1000 passed the necessary value to have the fast interrupt drive tx when appropriate. Reported by: mav@ Tested by: mav@ Reviewed by: gallatin@ MFC after: 1 month Sponsored by: iXsystems Differential Revision: https://reviews.freebsd.org/D27683	2021-01-07 14:07:35 -08:00
Mark Johnston	c065d4e5e9	iflib: Avoid leaking the freelist bitmaps upon driver detach Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27342	2020-12-07 14:53:14 +00:00
Mark Johnston	102540192c	iflib: Detach tasks upon device registration failure In some error paths we would fail to detach from the iflib taskqueue groups. Also move the detach code into its own subroutine instead of duplicating it. Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> MFC after: 2 weeks Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27342	2020-12-07 14:52:57 +00:00
Mark Johnston	54bf96fb4f	iflib: Free full mbuf chains when draining transmit queues Submitted by: Sai Rajesh Tallamraju <stallamr@netapp.com> Reviewed by: gallatin, hselasky MFC after: 1 week Sponsored by: NetApp, Inc. Differential Revision: https://reviews.freebsd.org/D27179	2020-11-11 18:00:06 +00:00
Vincenzo Maffione	be7a6b3d84	iflib: fix typo bug introduced by r367093 Code was supposed to call callout_reset_sbt_on() rather than callout_reset_sbt(). This resulted in passing a "cpu" value to a "flag" argument. A recipe for subtle errors. PR: 248652 Reported by: sg@efficientip.com MFC with: r367093	2020-10-28 21:06:17 +00:00
Vincenzo Maffione	17cec474c0	iflib: add per-tx-queue netmap timer The way netmap TX is handled in iflib when TX interrupts are not used (IFC_NETMAP_TX_IRQ not set) has some issues: - The netmap_tx_irq() function gets called by iflib_timer(), which gets scheduled with tick granularity (hz). This is not frequent enough for 10Gbps NICs and beyond (e.g., ixgbe or ixl). The end result is that the transmitting netmap application is not woken up fast enough to saturate the link with small packets. - The iflib_timer() function also calls isc_txd_credits_update() to ask for more TX completion updates. However, this violates the netmap requirement that only txsync can access the TX queue for datapath operations. Only netmap_tx_irq() may be called out of the txsync context. This change introduces per-tx-queue netmap timers, using microsecond granularity to ensure that netmap_tx_irq() can be called often enough to allow for maximum packet rate. The timer routine simply calls netmap_tx_irq() to wake up the netmap application. The latter will wake up and call txsync to collect TX completion updates. This change brings back line rate speed with small packets for ixgbe. For the time being, timer expiration is hardcoded to 90 microseconds, in order to avoid introducing a new sysctl. We may eventually implement an adaptive expiration period or use another deferred work mechanism in place of timers. Also, fix the timers usage to make sure that each queue is serviced by a different CPU. PR: 248652 Reported by: sg@efficientip.com MFC after: 2 weeks	2020-10-27 21:53:33 +00:00
Mateusz Guzik	662c13053f	net: clean up empty lines in .c and .h files	2020-09-01 21:19:14 +00:00
Vincenzo Maffione	35d8a463e8	iflib: leave only 1 receive descriptor unused The pidx argument of isc_rxd_flush() indicates which is the last valid receive descriptor to be used by the NIC. However, current code has multiple issues: - Intel drivers write pidx to their RDT register, which means that NICs will only use the descriptors up to pidx-1 (modulo ring size N), and won't actually use the one pointed by pidx. This does not break reception, but it is anyway confusing and suboptimal (the NIC will actually see only N-2 descriptors as available, rather than N-1). Other drivers (if_vmx, if_bnxt, if_mgb) adhere to this semantic. - The semantic used by Intel (RDT is one descriptor past the last valid one) is used by most (if not all) NICs, and it is also used on the TX side (also in iflib). Since iflib is not currently using this semantic for RX, it must decrement fl->ifl_pidx (modulo N) before calling isc_rxd_flush(), and then the per-driver callback implementation must increment the index again (to match the real semantic). This is confusing and suboptimal. - The iflib refill function is also called at initialization. However, in case the ring size is smaller than 128 (e.g. if_mgb), the refill function will actually prepare all the receive descriptors (N), without leaving one unused, as most NICs assume (e.g. to avoid RDT overrunning RDH). I can speculate that the code looks like this right now because this issue showed up during testing (e.g. with if_mgb), and it was easy to work around by decrementing pidx before isc_rxd_flush(). The goal of this change is to simplify the code (removing a bunch of instructions from the RX fast path), and to make the semantic of isc_rxd_flush() consistent across drivers. To achieve this, we: - change the semantics of the pidx argument to the usual one (that is the index one past the last valid one), so that both iflib and drivers avoid the decrement/increment dance. - fix the initialization code to prepare at most N-1 descriptors.
Reviewed by: markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D26191	2020-09-01 20:41:47 +00:00
Vincenzo Maffione	ae750d5cdf	iflib: netmap: publish all the receive buffer At initialization time, the netmap RX refill function used to prepare the NIC RX ring with N-1 buffers rather than N (with N equal to the number of descriptors in the NIC RX ring). This is not how netmap is supposed to work, as it would keep kring->nr_hwcur not in sync with the NIC "next index to refill" (i.e., fl->ifl_pidx). Instead we prepare N buffers, although we still publish (with isc_rxd_flush()) only the first N-1 buffers, to avoid the NIC producer pointer to overrun the NIC consumer pointer (for NICs where this is a real issue, e.g. Intel ones). MFC after: 2 weeks	2020-08-25 15:19:45 +00:00
Vincenzo Maffione	de5b46107c	iflib: fix isc_rxd_flush call in netmap_fl_refill() The semantic of the pidx argument of isc_rxd_flush() is the last valid index in the free list, rather than the next index to be published. However, netmap was still using the old convention. While there, also refactor netmap_fl_refill() to simplify it a little bit and add an assertion. MFC after: 2 weeks	2020-08-24 11:44:20 +00:00
Vincenzo Maffione	6d84e76a25	iflib: netmap: improve rxsync to support IFLIB_HAS_RXCQ For drivers with IFLIB_HAS_RXCQ set, there is a separate completion queue. In this case, the netmap rxsync routine needs to update rxq->ifr_cq_cidx in the same way it is updated by iflib_rxeof(). This improves the situation for vmx(4) and bnxt(4) drivers, which use iflib and have the IFLIB_HAS_RXCQ bit set. PR: 248494 MFC after: 3 weeks	2020-08-12 14:45:31 +00:00
Vincenzo Maffione	530960be8d	iflib: refactor netmap_fl_refill and fix off-by-one issue First, fix the initialization of the fl->ifl_rxd_idxs array, which was affected by an off-by-one bug. Once there, refactor the function to use better names for local variables, optimize the variable assignments, and merge the bus_dmamap_sync() inner loop with the outer one. PR: 248494 MFC after: 3 weeks	2020-08-12 14:17:38 +00:00
Vincenzo Maffione	c9d886cd7f	iflib: netmap: drop redundant check The validity of head is already checked by nm_rxsync_prologue(). MFC after: 2 weeks	2020-08-06 21:37:38 +00:00
Vincenzo Maffione	ee07345d20	iflib: netmap: don't increment ifl_cidx on the wrong free list Netmap only uses free list 0 to keep it consistent with its one-to-one mapping between each netmap ring and a device RX (or TX) queue. However, the current iflib_netmap_rxsync() routine was mistakenly updating the ifl_cidx field of both free lists. PR: 248494 MFC after: 2 weeks	2020-08-06 21:32:25 +00:00
Matt Macy	0ae0e8d2bd	iflib: fix LOR with bpf detach Reported by: grehan@ Approved by: grehan@ MFC after: 1 week Sponsored by: Netgate Differential Revision: https://reviews.freebsd.org/D25530	2020-07-27 01:17:59 +00:00
Vincenzo Maffione	ac11d85740	iflib: initialize netmap with the correct number of descriptors In case the network device has a RX or TX control queue, the correct number of TX/RX descriptors is contained in the second entry of the isc_ntxd (or isc_nrxd) array, rather than in the first entry. This case is correctly handled by iflib_device_register() and iflib_pseudo_register(), but not by iflib_netmap_attach(). If the first entry is larger than the second, this can result in a panic. This change fixes the bug by introducing two helper functions that also lead to some code simplification. PR: 247647 MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D25541	2020-07-20 21:08:56 +00:00
Mark Johnston	b256d25c50	iflib: Fix some nits in the rx refill code. - Get rid of the ifl_vm_addrs array. It is not used by any existing consumer, so we are just dirtying a couple of cache lines for no reason. - Use uma_zalloc(fl->ifl_zone) instead of m_cljget(). Otherwise m_cljget() is doing unnecessary work to look up the correct zone, when iflib already knows what that zone is. - ifl_gen is only used when INVARIANTS is on, so make that more clear. - Fix some style nits and inconsistencies. Reviewed by: gallatin Tested by: pho MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25490	2020-07-06 14:52:21 +00:00
Mark Johnston	a363e1d4d0	iflib: Fix handling of mbuf cluster allocation failures. When refilling an rx freelist, make sure we only update the hardware producer index if at least one cluster was allocated. Otherwise the NIC is programmed to write a previously used cluster, typically resulting in a use-after-free when packet data is written by the hardware. Also make sure that we don't update the fragment index cursor if the last allocation attempt didn't succeed. For at least Intel drivers, iflib assumes that the consumer index and fragment index cursor stay in lockstep, but this assumption was violated in the face of cluster allocation failures. Reported and tested by: pho Reviewed by: gallatin, hselasky MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D25489	2020-07-06 14:52:09 +00:00
Vincenzo Maffione	9503233f87	iflib: fix compilation issue introduced in r362621 The ifp local variable is useful even without netmap and altq, as it is used to check for IFF_DRV_RUNNING. MFC after: 2 weeks	2020-06-25 20:43:21 +00:00
Vincenzo Maffione	d8b2d26b15	iflib: netmap: add support for partial ring openings Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25254	2020-06-25 19:44:24 +00:00
Vincenzo Maffione	88a688663a	iflib: netmap: add per-tx-queue netmap support Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25253	2020-06-25 19:35:43 +00:00
Vincenzo Maffione	0ff2126795	iflib: netmap: fix rxsync index overrun In the current iflib_netmap_rxsync, there is nothing that prevents kring->nr_hwtail from overrunning kring->nr_hwcur during the descriptor import phase. This may cause errors in netmap applications, such as: em1 RX0: fail 'head < kring->nr_hwcur || head > kring->nr_hwtail' h 795 c 795 t 282 rh 795 rc 795 rt 282 hc 282 ht 282 Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25252	2020-06-23 20:23:56 +00:00
Matt Macy	9aeca21324	iflib: fix cloneattach fail and generalize pseudo device handling - a cloneattach failure will not currently be handled correctly, jump to the right target - pseudo devices are all treated as if they're ethernet devices - this often doesn't make sense MFC after: 1 week Sponsored by: Netgate, Inc. Differential Revision: https://reviews.freebsd.org/D25083	2020-06-21 22:02:49 +00:00
Vincenzo Maffione	0a182b4c63	iflib: netmap: enter/exit netmap mode after device stops Avoid possible race conditions by calling nm_set_native_flags() and nm_clear_native_flags() only after the device has been stopped. MFC after: 1 week	2020-06-14 21:07:12 +00:00
Vincenzo Maffione	e136e9c88f	iflib: netmap: honor netmap_rx_irq return values In the receive interrupt routine, always call netmap_rx_irq(). The latter function will return NM_IRQ_PASS if netmap is not active on that specific receive queue, so that the driver can go on with iflib_rxeof(). Note that netmap supports partial opening, where only a subset of the RX or TX rings can be open in netmap mode. Checking the IFCAP_NETMAP flag is not enough to make sure that the queue is indeed in netmap mode. Moreover, in case netmap_rx_irq() returns NM_IRQ_RESCHED, it means that netmap expects the driver to call netmap_rx_irq() again as soon as possible. Currently, this may happen when the device is attached to a VALE switch. Reviewed by: gallatin MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25167	2020-06-09 19:15:43 +00:00
Matt Macy	1f93e931d9	Fix panics when using iflib pseudo device support Reviewed by: gallatin@, hselasky@ MFC after: 1 week Sponsored by: Netgate, Inc. Differential Revision: https://reviews.freebsd.org/D23710	2020-05-31 18:42:00 +00:00
Mark Johnston	814fa34dfb	Increase the iflib txq callout mutex name length to 32 bytes. With a length of 16, the name ("<if name>:TX(<qid>):callout") typically gets truncated. PR: 245712 Reported by: ghuckriede@blackberry.com MFC after: 1 week	2020-04-30 15:39:04 +00:00
Eric Joyner	45818bf1a0	iflib: Stop interface before (un)registering VLAN This patch is intended to solve a specific problem that iavf(4) encounters, but what it does can be extended to solve other issues. To summarize the iavf(4) issue, if the PF driver configures VLAN anti-spoof, then the VF driver needs to make sure no untagged traffic is sent if a VLAN is configured, and vice-versa. This can be an issue when a VLAN is being registered or unregistered, e.g. when a packet may be on the ring with a VLAN in it, but the VLANs are being unregistered. This can cause that tagged packet to go out and cause an MDD event. To fix this, include a new interface-dependent function that drivers can implement named IFDI_NEEDS_RESTART(). Right now, this function is called in iflib_vlan_unregister/register() to determine whether the interface needs to be stopped and started when a VLAN is registered or unregistered. The default return value of IFDI_NEEDS_RESTART() is true, so this fixes the MDD problem that iavf(4) encounters, since the interface rings are flushed during a stop/init. A future change to iavf(4) will implement that function just in case the default value changes, and to make it explicit that this interface reset is required when a VLAN is added or removed. Reviewed by: gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22086	2020-04-27 22:02:44 +00:00
Mark Johnston	59d50fe5ef	Simplify taskqgroup initialization. taskqgroup initialization was broken into two steps: 1. allocate the taskqgroup structure, at SI_SUB_TASKQ; 2. initialize taskqueues, start taskqueue threads, enqueue "binder" tasks to bind threads to specific CPUs, at SI_SUB_SMP. Step 2 tries to handle the case where tasks have already been attached to a queue, by migrating them to their intended queue. In particular, tasks can't be enqueued before step 2 has completed. This breaks NFS mountroot on systems using an iflib-based driver when EARLY_AP_STARTUP is not defined, since mountroot happens before SI_SUB_SMP in this case. Simplify initialization: do all initialization except for CPU binding at SI_SUB_TASKQ. This means that until CPU binding is completed, group tasks may be executed on a CPU other than that to which they were bound, but this should not be a problem for existing users of the taskqgroup KPIs. Reported by: sbruno Tested by: bdragon, sbruno MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D24188	2020-03-30 14:22:52 +00:00
Ed Maste	ed6611cc8c	iflib: simplify MPASS assertion Submitted by: andrew	2020-03-24 17:54:34 +00:00
Ed Maste	68af0153a7	iflib: split compound assertion ThunderX cluster systems are panicking on boot with a failed assertion MPASS(gtask != NULL && gtask->gt_taskqueue != NULL). Split the assertion so that it's clear which part is failing.	2020-03-24 17:25:56 +00:00
Patrick Kelsey	876996910a	Remove extraneous code from iflib ifsd_cidx is never used, and the line removed from rxd_frag_to_sd() is just dead code. Reviewed by: erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23951	2020-03-14 20:13:42 +00:00
Patrick Kelsey	3caff1885f	Remove refill budget from iflib Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23948	2020-03-14 19:58:50 +00:00
Patrick Kelsey	b38136097a	Allow iflib drivers to specify the buffer size used for each receive queue Reviewed by: erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23947	2020-03-14 19:56:46 +00:00
Patrick Kelsey	e503049067	Remove freelist contiguous-indexes assertion from rxd_frag_to_sd() The vmx driver is an example of an iflib driver that might report packets using non-contiguous descriptors (with unused descriptors either between received packets or between the fragments of a received packet), so this assertion needs to be removed. For such drivers, the freelist producer and consumer indexes don't relate directly to driver ring slots (the driver deals directly with freelist buffer indexes supplied by iflib during refill, and reports them with each fragment during packet reception), but do continue to be used by iflib for accounting, such as determining the number of ring slots that are refillable. PR: 243126, 243392, 240628 Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer Reviewed by: gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23946	2020-03-14 19:55:05 +00:00
Patrick Kelsey	4f2beb721b	Fix iflib zero-length fragment handling The dmamap for zero-length fragments should not be unloaded, as doing so breaks the cluster-reuse logic in _iflib_fl_refill(). All zero-length fragments are now handled by the assemble_segments() path so that the cluster-reuse logic there does not have to be replicated in the small-single-fragment-packet path of iflib_rxd_pkt_get(). Packets consisting entirely of zero-length fragments (which result in a NULL mbuf pointer) are now properly tolerated. This allows drivers (such as the vmx driver) to pass such packets to iflib when a descriptor error occurs during packet reception, the advantage being that the refill of descriptors associated with the error packet is handled via the existing iflib machinery without having to duplicate parts of that machinery in the driver to handle that error case. Reviewed by: avg, erj, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23945	2020-03-14 19:51:55 +00:00
Patrick Kelsey	9e9b738ac5	Fix iflib freelist state corruption This fixes a bug in iflib freelist management that breaks the required correspondence between freelist indexes and driver ring slots. PR: 243126, 243392, 240628 Reported by: avg, alexandr.oleynikov@gmail.com, Harald Schmalzbauer Reviewed by: avg, gallatin MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D23943	2020-03-14 19:43:44 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is a non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT. Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Gleb Smirnoff	e87c494015	Although most of the NIC drivers are epoch ready, due to peer pressure switch over to opt-in instead of opt-out for epoch. Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch when processing its packets. Now this will create recursive entrance in epoch in >90% network drivers, but will guarantee safeness of the transition. Mark several tested drivers as IFF_KNOWSEPOCH. Reviewed by: hselasky, jeff, bz, gallatin Differential Revision: https://reviews.freebsd.org/D23674	2020-02-24 21:07:30 +00:00
Hans Petter Selasky	f98977b521	Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. This patch extends r357772. Tested by: yp@mm.st Sponsored by: Mellanox Technologies	2020-02-12 09:19:47 +00:00
Hans Petter Selasky	fb1a29b45e	Make sure the so-called end of receive interrupts don't starve in iflib. When the receive ring cannot be filled with mbufs, due to lack of memory, no more interrupts may be generated to fill the receive ring later on. Make sure to have a watchdog, to try refilling the receive ring from time to time, hopefully when more mbufs are available. Differential Revision: https://reviews.freebsd.org/D23315 MFC after: 1 week Reviewed by: gallatin@ Sponsored by: Mellanox Technologies	2020-02-12 08:30:07 +00:00
Gleb Smirnoff	6c3e93cb5a	Use NET_TASK_INIT() and NET_GROUPTASK_INIT() for drivers that process incoming packets in taskqueue context. Reviewed by: hselasky Differential Revision: https://reviews.freebsd.org/D23518	2020-02-11 18:57:07 +00:00
Gleb Smirnoff	0b8df657a4	Enter network epoch in iflib rxeof task. In upcoming changes ether_input() is going to be changed not to enter the network epoch. It is going to be responsibility of network interrupt. In case of iflib - its taskqueue.	2020-01-23 01:27:58 +00:00
Eric Joyner	f6afed726b	iflib: Prevent watchdog from resetting idle queues While changing link state in iflib_link_state_change(), queues are marked as IFLIB_QUEUE_IDLE to disable watchdog. Currently, iflib_timer() watchdog does not check for previous queue status before marking it as IFLIB_QUEUE_HUNG. This patch adds check of queue status before marking it as hung. Signed-off-by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> PR: 239240 Submitted by: Piotr Pietruszewski <piotr.pietruszewski@intel.com> Reported by: ultima@ Reviewed by: gallatin@, erj@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D21712	2020-01-02 23:35:06 +00:00
Eric Joyner	db8e8f1ede	iflib: properly release memory allocated for DMA DMA memory allocations using the bus_dma.h interface are not properly released in all cases for both Tx and Rx. This causes ~448 bytes of M_DEVBUF allocations to be leaked. First, the DMA maps for Rx are not properly destroyed. A slight attempt is made in iflib_fl_bufs_free to destroy the maps if we're detaching. However, this function may not be reliably called during detach. Indeed, there is a comment "asking" if this should be moved out. Fix this by moving the bus_dmamap_destroy call into iflib_rx_sds_free, where we already sync and unload the DMA. Second, the DMA tag associated with the ifr_ifdi descriptor DMA is not released properly anywhere. Add a call to iflib_dma_free in iflib_rx_structures_free. Third, use of NULL as a canary value on the map pointer returned by bus_dmamap_create is not valid. On some platforms, notably x86, this value may be NULL. In this case, we fail to properly release the related resources. Remove the NULL checks on map values in both iflib_fl_bufs_free and iflib_txsd_destroy. With all of these fixes applied, the leaks to M_DEVBUF are squelched, and iflib drivers now seem to properly cleanup when detaching. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22203	2019-11-04 23:06:57 +00:00
Eric Joyner	244e7cffa5	iflib: cleanup memory leaks on driver detach From Jake: The iflib stack failed to release all of the memory allocated under M_IFLIB during device detach. Specifically, the ifmp_ring, the ift_ifdi Tx DMA info, and the ifr_ifdi Rx DMA info were not being released. Release this memory so that iflib won't leak memory when a device detaches. Since we're freeing the ift_ifdi pointer during iflib_txq_destroy we need to call this only after iflib_dma_free in iflib_tx_structures_free. Additionally, also ensure that we destroy the callout mutex associated with each Tx queue when we free it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@ MFC after: 1 week Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22157	2019-10-30 20:45:12 +00:00
Eric Joyner	1558015e3e	iflib: call ether_ifdetach and netmap_detach before stop From Jake: Calling ether_ifdetach after iflib_stop leads to a potential race where a stale ifp pointer can remain in the route entry list for IPv6 traffic. This will potentially cause a page fault or other system instability if the ifp pointer is accessed. Move both iflib_netmap_detach and ether_ifdetach to be called prior to iflib_stop. This avoids the race above, and helps ensure that other ifp references are removed before stopping the interface. Submitted by: Jacob Keller <jacob.e.keller@intel.com> Reviewed by: erj@, gallatin@, jhb@ MFC after: 3 days Sponsored by: Intel Corporation Differential Revision: https://reviews.freebsd.org/D22071	2019-10-23 23:20:49 +00:00
Conrad Meyer	7790c8c199	Split out a more generic debugnet(4) from netdump(4) Debugnet is a simplistic and specialized panic- or debug-time reliable datagram transport. It can drive a single connection at a time and is currently unidirectional (debug/panic machine transmit to remote server only). It is mostly a verbatim code lift from netdump(4). Netdump(4) remains the only consumer (until the rest of this patch series lands). The INET-specific logic has been extracted somewhat more thoroughly than previously in netdump(4), into debugnet_inet.c. UDP-layer logic and up, as much as possible as is protocol-independent, remains in debugnet.c. The separation is not perfect and future improvement is welcome. Supporting INET6 is a long-term goal. Much of the diff is "gratuitous" renaming from 'netdump_' or 'nd_' to 'debugnet_' or 'dn_' -- sorry. I thought keeping the netdump name on the generic module would be more confusing than the refactoring. The only functional change here is the mbuf allocation / tracking. Instead of initiating solely on netdump-configured interface(s) at dumpon(8) configuration time, we watch for any debugnet-enabled NIC for link activation and query it for mbuf parameters at that time. If they exceed the existing high-water mark allocation, we re-allocate and track the new high-water mark. Otherwise, we leave the pre-panic mbuf allocation alone. In a future patch in this series, this will allow initiating netdump from panic ddb(4) without pre-panic configuration. No other functional change intended. Reviewed by: markj (earlier version) Some discussion with: emaste, jhb Objection from: marius Differential Revision: https://reviews.freebsd.org/D21421	2019-10-17 16:23:03 +00:00
Mark Johnston	4166913371	Add IFLIB_SINGLE_IRQ_RX_ONLY. As of r347221 the iflib legacy interrupt mode setup assumes that drivers perform both receive and transmit processing from the interrupt handler. This assumption is invalid in the vmxnet3 driver, so introduce the IFLIB_SINGLE_IRQ_RX_ONLY flag to make iflib avoid tx processing in the interrupt handler. PR: 239118 Reported and tested by: Juraj Lutter <otis@sk.freebsd.org> Obtained from: marius Reviewed by: gallatin MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D21831	2019-09-30 15:59:07 +00:00
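
Note on commit 35d8a463e8 above: the convention it adopts (the producer index points one past the last valid descriptor, and at most N-1 of the N descriptors are ever published) can be illustrated with a small stand-alone sketch. This is not iflib code. The struct and function names below (toy_ring, toy_ring_avail, toy_ring_space, toy_ring_publish) are hypothetical and chosen only for illustration; a real driver would write the producer index to a hardware register such as RDT instead of a struct field.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy ring; the names are illustrative and are not iflib's. */
struct toy_ring {
	uint32_t size;	/* N: total number of descriptors in the ring */
	uint32_t head;	/* consumer index: next descriptor the NIC will use */
	uint32_t tail;	/* producer index: one past the last valid descriptor */
};

/* Number of descriptors currently published to the (imaginary) NIC. */
static uint32_t
toy_ring_avail(const struct toy_ring *r)
{
	return ((r->tail + r->size - r->head) % r->size);
}

/*
 * Number of descriptors that can still be published while keeping one
 * descriptor unused, so that tail == head always means "empty".
 */
static uint32_t
toy_ring_space(const struct toy_ring *r)
{
	return (r->size - 1 - toy_ring_avail(r));
}

/*
 * Publish up to 'count' descriptors and return how many were published.
 * The request is clamped so that the tail never catches up with the head.
 */
static uint32_t
toy_ring_publish(struct toy_ring *r, uint32_t count)
{
	uint32_t space;

	assert(r->size >= 2);
	space = toy_ring_space(r);
	if (count > space)
		count = space;
	r->tail = (r->tail + count) % r->size;
	return (count);
}
```

With an 8-entry ring that starts empty, a request to publish 8 descriptors is clamped to 7, which mirrors the "prepare at most N-1 descriptors" fix to the initialization path described in that commit. Keeping one slot unused is what lets tail == head be read unambiguously as an empty ring.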


232 Commits