e1000: Enable TSO for lem(4) and em(4)

Most em(4) devices now enjoy TSO and TSO6, matching NetBSD and Linux
defaults.

A prior commit automatically masks TSO on 10/100 Ethernet links because
of errata, and other IPv6 bugs were fixed recently, which makes this
change possible.

Mike Karels identified a performance anomaly with TSO on Intel 82574L
devices.  These have been multiqueue-enabled on FreeBSD since the
conversion to iflib.  I am investigating whether this can be fixed; in
the meantime, TSO stays off by default there, and MSI-X with checksum
offloads remains the default.

i219 SPT devices have an erratum that downclocks the DMA engine, which
prevents TSO from achieving line rate.  Therefore, TSO is disabled by
default on:
* Intel(R) I219-LM and I219-V SPT
* Intel(R) I219-LM and I219-V SPT-H (2)
* Intel(R) I219-LM and I219-V LBG (3)
* Intel(R) I219-LM and I219-V SPT (4)
* Intel(R) I219-LM and I219-V SPT (5)
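
A minimal sketch of the default-off pattern used for these devices and
for the 82574L.  The SK_* names, bit values, and struct are hypothetical
stand-ins for the IFCAP_* flags, e1000 MAC types, and iflib softc
context; only the masking idea mirrors the driver change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the IFCAP_* bits and e1000 MAC types. */
#define SK_CAP_TSO4	(1u << 0)
#define SK_CAP_TSO6	(1u << 1)
#define SK_CAP_TSO	(SK_CAP_TSO4 | SK_CAP_TSO6)
#define SK_CAP_TXCSUM	(1u << 2)

enum sk_mac { SK_MAC_82574, SK_MAC_PCH_SPT, SK_MAC_OTHER };

struct sk_ctx {
	uint32_t capabilities;	/* everything the device may be asked to do */
	uint32_t capenable;	/* subset switched on by default */
};

/* Start with everything on, then mask TSO for the two affected MAC types. */
static void
sk_default_caps(struct sk_ctx *ctx, enum sk_mac mac)
{
	assert(ctx != NULL);
	ctx->capabilities = ctx->capenable = SK_CAP_TSO | SK_CAP_TXCSUM;
	if (mac == SK_MAC_82574 || mac == SK_MAC_PCH_SPT)
		ctx->capenable &= ~SK_CAP_TSO;
}
```

Since only the default-enabled set is masked in this sketch, the
capability itself stays advertised, so it should remain possible to
switch TSO back on by hand on such a device.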

Many lem(4) devices also gain TSO; the exceptions are 82542, 82543, and
82547.  TSO6 may be possible on some chipsets, but I am still working
through my testing matrix, so it remains hidden behind
hw.em.unsupported_tso.
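
A rough sketch of the lem(4) gating described above.  The enum is a
reduced, hypothetical copy of the MAC type ordering (oldest first), the
SK_* bits are made-up values, and the unsupported_tso argument stands in
for hw.em.unsupported_tso; none of this is the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_TSO4	(1u << 0)
#define SK_TSO6	(1u << 1)

/* Hypothetical, reduced MAC list; order matters for the "<" comparison. */
enum sk_lem_mac {
	SK_LEM_82542,
	SK_LEM_82543,
	SK_LEM_82544,
	SK_LEM_82540,
	SK_LEM_82545,
	SK_LEM_82547
};

/* Return the TSO capability bits a given lem-class MAC would advertise. */
static uint32_t
sk_lem_tso_caps(enum sk_lem_mac mac, bool unsupported_tso)
{
	uint32_t caps = SK_TSO4;

	assert(mac >= SK_LEM_82542 && mac <= SK_LEM_82547);
	/* TSO6 is only offered when the opt-in knob is set. */
	if (unsupported_tso)
		caps |= SK_TSO6;
	/* Anything older than 82544, plus 82547, gets no TSO at all. */
	if (mac < SK_LEM_82544 || mac == SK_LEM_82547)
		caps &= ~(SK_TSO4 | SK_TSO6);
	return (caps);
}
```

The ordered comparison only works because the enum lists the chips
oldest first, which is an assumption carried over into this sketch.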

If you encounter issues, you may disable TSO with, for example:
ifconfig em0 -tso -tso6
Please let me know about any deviation from normal operation that
requires this.

Thanks to cc@ for access to emulab.net.

On a sample I219 system, TSO lowers total CPU utilization by about 16
percentage points on IPv4 and 19 percentage points on IPv6.

iperf3 -Vc reported numbers:
total%	user%	system%

IPv4 TSO
21.3	7	14.4
21.4	6	15.4
21.5	6	15.5

IPv4 no TSO
36.8	5.4	31.4
38.5	5.1	33.5
38.2	5.7	32.6

IPv4 no TSO no TXCSUM
45.1	5.8	39.3
46	6.3	39.7
46.2	5.9	40.4

IPv6 TSO6
21.7	5.4	16.3
21.6	5.1	16.5
21.9	5.6	16.3

IPv6 no TSO6
41.2	5.2	36
41	5.1	36
40.8	5.2	35.7

IPv6 no TSO6 no TXCSUM6
49	5.9	43.1
48.8	4.9	43.9
49	5.6	43.4
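
The savings quoted above can be checked from the total% column.  This is
a small sketch that averages the three runs per configuration and takes
the difference; the function and array names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* total% samples copied from the tables above */
static const double sk_ipv4_tso[]    = { 21.3, 21.4, 21.5 };
static const double sk_ipv4_no_tso[] = { 36.8, 38.5, 38.2 };
static const double sk_ipv6_tso[]    = { 21.7, 21.6, 21.9 };
static const double sk_ipv6_no_tso[] = { 41.2, 41.0, 40.8 };

/* Arithmetic mean of n samples. */
static double
sk_mean(const double *v, size_t n)
{
	double sum = 0.0;

	assert(v != NULL && n > 0);
	for (size_t i = 0; i < n; i++)
		sum += v[i];
	return (sum / (double)n);
}

/* CPU saved by the offload, in percentage points of total CPU. */
static double
sk_saved_points(const double *without, const double *with, size_t n)
{
	return (sk_mean(without, n) - sk_mean(with, n));
}
```

Calling sk_saved_points(sk_ipv4_no_tso, sk_ipv4_tso, 3) gives roughly
16.4 points, and the IPv6 pair gives roughly 19.3 points.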

Tested by:	cc (lem(4)), karels (82574L)
MFC after:	3 months
Relnotes:	yes
Sponsored by:	BBOX.io
Differential Revision:	https://reviews.freebsd.org/D41170
Kevin Bowling 2023-08-03 13:49:15 -07:00
parent 9051987e40
commit f1b5488f7b

@@ -903,25 +903,19 @@ em_if_attach_pre(if_ctx_t ctx)
 		scctx->isc_tx_tso_size_max = EM_TSO_SIZE;
 		scctx->isc_tx_tso_segsize_max = EM_TSO_SEG_SIZE;
 		scctx->isc_capabilities = scctx->isc_capenable = EM_CAPS;
-		/*
-		 * For EM-class devices, don't enable IFCAP_{TSO4,VLAN_HWTSO,TSO6}
-		 * by default as we don't have workarounds for all associated
-		 * silicon errata. E. g., with several MACs such as 82573E,
-		 * TSO only works at Gigabit speed and otherwise can cause the
-		 * hardware to hang (which also would be next to impossible to
-		 * work around given that already queued TSO-using descriptors
-		 * would need to be flushed and vlan(4) reconfigured at runtime
-		 * in case of a link speed change). Moreover, MACs like 82579
-		 * still can hang at Gigabit even with all publicly documented
-		 * TSO workarounds implemented. Generally, the penality of
-		 * these workarounds is rather high and may involve copying
-		 * mbuf data around so advantages of TSO lapse. Still, TSO may
-		 * work for a few MACs of this class - at least when sticking
-		 * with Gigabit - in which case users may enable TSO manually.
-		 */
-		scctx->isc_capenable &= ~(IFCAP_TSO4 | IFCAP_VLAN_HWTSO | IFCAP_TSO6);
 		scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO |
 		    CSUM_IP6_TCP | CSUM_IP6_UDP;
+		/* Disable TSO on 82574L due to performance loss being investigated */
+		if (hw->mac.type == e1000_82574)
+			scctx->isc_capenable &= ~IFCAP_TSO;
+		/*
+		 * Disable TSO on SPT due to errata that downclocks DMA performance
+		 * i218-i219 Specification Update 1.5.4.5
+		 */
+		if (hw->mac.type == e1000_pch_spt)
+			scctx->isc_capenable &= ~IFCAP_TSO;
 		/*
 		 * We support MSI-X with 82574 only, but indicate to iflib(4)
 		 * that it shall give MSI at least a try with other devices.
@@ -944,24 +938,38 @@ em_if_attach_pre(if_ctx_t ctx)
 		scctx->isc_capabilities = scctx->isc_capenable = LEM_CAPS;
 		if (em_unsupported_tso)
 			scctx->isc_capabilities |= IFCAP_TSO6;
-		/*
-		 * For LEM-class devices, don't enable IFCAP_{TSO4,VLAN_HWTSO}
-		 * by default as we don't have workarounds for all associated
-		 * silicon errata. TSO4 may work on > 82544 but its status
-		 * is unknown by the authors. Please report any success or failures.
-		 */
-		scctx->isc_capenable &= ~(IFCAP_TSO4 | IFCAP_VLAN_HWTSO);
 		scctx->isc_tx_csum_flags = CSUM_TCP | CSUM_UDP | CSUM_IP_TSO |
 		    CSUM_IP6_TCP | CSUM_IP6_UDP;
+		/* 82541ER doesn't do HW tagging */
+		if (hw->device_id == E1000_DEV_ID_82541ER ||
+		    hw->device_id == E1000_DEV_ID_82541ER_LOM) {
+			scctx->isc_capabilities &= ~IFCAP_VLAN_HWTAGGING;
+			scctx->isc_capenable = scctx->isc_capabilities;
+		}
+		/* This is the first e1000 chip and it does not do offloads */
+		if (hw->mac.type == e1000_82542) {
+			scctx->isc_capabilities &= ~(IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM |
+			    IFCAP_HWCSUM_IPV6 | IFCAP_VLAN_HWTAGGING |
+			    IFCAP_VLAN_HWFILTER | IFCAP_TSO | IFCAP_VLAN_HWTSO);
+			scctx->isc_capenable = scctx->isc_capabilities;
+		}
+		/* These can't do TSO for various reasons */
+		if (hw->mac.type < e1000_82544 || hw->mac.type == e1000_82547 ||
+		    hw->mac.type == e1000_82547_rev_2) {
+			scctx->isc_capabilities &= ~(IFCAP_TSO | IFCAP_VLAN_HWTSO);
+			scctx->isc_capenable = scctx->isc_capabilities;
+		}
+		/* XXXKB: No IPv6 before this? */
+		if (hw->mac.type < e1000_82545){
+			scctx->isc_capabilities &= ~IFCAP_HWCSUM_IPV6;
+			scctx->isc_capenable = scctx->isc_capabilities;
+		}
 		/* "PCI/PCI-X SDM 4.0" page 33 (b) - FDX requirement on these chips */
-		if (hw->mac.type == e1000_82542 || hw->mac.type == e1000_82547 ||
-		    hw->mac.type == e1000_82547_rev_2)
+		if (hw->mac.type == e1000_82547 || hw->mac.type == e1000_82547_rev_2)
 			scctx->isc_capenable &= ~(IFCAP_HWCSUM | IFCAP_VLAN_HWCSUM |
 			    IFCAP_HWCSUM_IPV6);
-		/* 82541ER doesn't do HW tagging */
-		if (hw->device_id == E1000_DEV_ID_82541ER || hw->device_id == E1000_DEV_ID_82541ER_LOM)
-			scctx->isc_capenable &= ~IFCAP_VLAN_HWTAGGING;
 		/* INTx only */
 		scctx->isc_msix_bar = 0;
 	}