Investigation of iSCSI target data corruption reports brought me to
discovery that cxgb(4) expects mbufs to be physically contiguous, that
is not true after I've started using m_extaddref() in software iSCSI
for large zero-copy transmissions. In case of fragmented memory the
driver transmitted garbage from pages following the first one due to
simple use of pmap_kextract() for the first pointer instead of proper
bus_dmamap_load_mbuf_sg(). Seems like it was done as some optimization
many years ago, and at very least it is wrong in a world of IOMMUs.
This patch just removes that optimization, plus limits packet coalescing
for mbufs crossing page boundary, also depending on assumption of one
segment per packet.
MFC after: 3 days
Sponsored by: iXsystems, Inc.
Reviewed by: mmacy, np
Differential revision: https://reviews.freebsd.org/D28428
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
- remove mbuf iovec - useful, but adds too much complexity when isolated to
the driver
- remove driver private caching - insufficient benefit over UMA to justify
the added complexity and maintenance overhead
- remove separate logic for managing multiple transmit queues, with the
new drbr routines the control flow can be made to much more closely resemble
legacy drivers
- remove dedicated service threads, with per-cpu callouts one can get the same
benefit much more simply by registering a callout 1 tick in the future if there
are still buffered packets
- remove embedded mbuf usage - Jeffr's changes will (I hope) soon be integrated
greatly reducing the overhead of using kernel APIs for reference counting
clusters
- add hysteresis to descriptor coalescing logic
- add coalesce threshold sysctls to allow users to decide at run-time
between optimizing for forwarding / UDP or optimizing for TCP
- add once per second watchdog to effectively close the very rare races
occurring from coalescing
- incorporate Navdeep's changes to the initialization path required to
convert port and adapter locks back to ordinary mutexes (silencing BPF
LOR complaints)
- enable prefetches in get_packet and tx cleaning
Reviewed by: navdeep@
MFC after: 2 weeks
- add support for T3C
- add DDP support (zero-copy receive)
- fix TOE transmit of large requests
- fix shutdown so that sockets don't remain in CLOSING state indefinitely
- register listeners when an interface is brought up after tom is loaded
- fix setting of multicast filter
- enable link at device attach
- exit tick handler if shutdown is in progress
- add helper for logging TCB
- add sysctls for dumping transmit queues
- note that TOE wxill not be MFC'd until after 7.0 has been finalized
MFC after: 3 days
- Track packet zone mbufs separately from other mbufs
- free packet zone buffers via m_free rather than trying to manage the refcount
as with clusters - its refcount and management seems to be "special"
- increase asserts for mbuf accounting
- track outstanding mbufs (maps very closely to leaked)
- actually only create one thread per port if !multiq
Oddly enough this fixes the use after free
- move txq_segs to stack in t3_encap
- add checks that pidx doesn't move pass cidx
- simplify mbuf free logic in collapse mbufs routine
- move cxgb_tx_common in to cxgb_multiq.c and rename to cxgb_tx
- move cxgb_tx_common dependencies
- further simplify cxgb_dequeue_packet for the non-multiqueue case
- only launch one service thread per port in the non-multiq case
- remove dead cleaning code from cxgb_sge.c
- simplify PIO case substantially in by returning directly from mbuf collapse
and just using m_copydata
- remove gratuitous m_gethdr in the rx path
- clarify freeing of mbufs in collapse
implement robust version of m_collapse
add support for sf_buf
add fix for m_iovappend
add calls to m_sanity under INVARIANTS
fix m_freem_vec to correctly travese the mbuf iovec chain