freebsd-skq/sys/dev/mlx5
Andrew Gallatin 23feb56348 KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself.
While the original implementation of unmapped mbufs was a large
step forward in terms of reducing cache misses by enabling mbufs
to carry more than a single page for sendfile, they are rather
cache unfriendly when accessing the ext_pgs metadata and
data. This is because the ext_pgs part of the mbuf is allocated
separately, and almost guaranteed to be cold in cache.

This change takes advantage of the fact that unmapped mbufs
are never used at the same time as pkthdr mbufs. Given this
fact, we can overlap the ext_pgs metadata with the mbuf
pkthdr, and carry the ext_pgs meta directly in the mbuf itself.
Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array
of pages) directly after the existing m_ext.
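
As a rough illustration of the overlap described above, here is a minimal sketch in C. All type, field, and flag names (sketch_mbuf, sketch_pgs_meta, SKETCH_F_NOMAP, and so on) are hypothetical and are not the kernel's actual struct mbuf layout. It only shows the general idea that two pieces of state which are never live at the same time can share storage through a union, with a flag saying which one is valid.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for packet-header state. */
struct sketch_pkthdr {
	void		*rcvif;
	int32_t		 len;
	uint32_t	 csum_flags;
};

/* Hypothetical stand-in for unmapped-page metadata. */
struct sketch_pgs_meta {
	uint16_t	 npgs;		/* number of attached pages */
	uint16_t	 first_pg_off;	/* offset into the first page */
	uint16_t	 last_pg_len;	/* bytes used in the last page */
	uint8_t		 hdr_len;	/* TLS header length */
	uint8_t		 trail_len;	/* TLS trailer length */
};

#define	SKETCH_F_PKTHDR	0x1u	/* union holds pkthdr */
#define	SKETCH_F_NOMAP	0x2u	/* union holds page metadata */

struct sketch_mbuf {
	struct sketch_mbuf	*next;
	uint32_t		 flags;
	uint32_t		 len;
	union {
		struct sketch_pkthdr	pkthdr;
		struct sketch_pgs_meta	pgs;
	} u;
};

/* Both views start at the same offset, so no separate allocation is needed. */
_Static_assert(offsetof(struct sketch_mbuf, u.pkthdr) ==
    offsetof(struct sketch_mbuf, u.pgs), "union members must overlap");

/* The overlap is only safe if the two flags are never set together. */
static inline int
sketch_flags_valid(uint32_t flags)
{
	const uint32_t both = SKETCH_F_PKTHDR | SKETCH_F_NOMAP;

	return ((flags & both) != both);
}
```

Because the metadata lives inside the same object as the rest of the header, reading it touches memory that is likely already in cache, which is the effect the commit is after.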

In order to be able to carry 5 pages (which is the minimum
required for a 16K TLS record which is not perfectly aligned) on
LP64, I've had to steal ext_arg2. The only user of this in the
xmit path is sendfile, and I've adjusted it to use arg1 when
using unmapped mbufs.
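
The five-page figure can be checked with a little arithmetic. The sketch below assumes a 4 KiB page size, and the helper name sketch_pages_spanned is hypothetical. It counts how many pages a buffer touches given its starting offset within the first page.

```c
#include <stddef.h>

/* Assumed page size for this illustration only. */
#define	SKETCH_PAGE_SIZE	4096u

/*
 * Number of pages touched by a buffer of `len` bytes that starts
 * `off` bytes into its first page (hypothetical helper).
 */
static inline size_t
sketch_pages_spanned(size_t off, size_t len)
{
	if (len == 0)
		return (0);
	/* Round the end of the buffer up to the next page boundary. */
	return ((off % SKETCH_PAGE_SIZE + len + SKETCH_PAGE_SIZE - 1) /
	    SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a 16384-byte record that starts on a page boundary spans 4 pages, while the same record starting at any non-zero offset within a page spans 5.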

This change is almost entirely mechanical, except that we
change mb_alloc_ext_pgs() to no longer allow allocating
pkthdrs, the change to avoid ext_arg2 as mentioned above,
and the removal of the ext_pgs zone.

This change saves roughly 2% "raw" CPU (~59% -> 57%), or over
3% "scaled" CPU on a Netflix 100% software kTLS workload at
90+ Gb/s on Broadwell Xeons.

In a follow-on commit, I plan to remove some hacks to avoid
accessing ext_pgs fields of mbufs, since they will now be in
cache.

Many thanks to glebius for helping to make this better in
the Netflix tree.

Reviewed by:	hselasky, jhb, rrs, glebius (early version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24213
2020-04-14 14:46:06 +00:00
mlx5_accel mlx5fpga: Initial code import. 2018-12-05 14:11:20 +00:00
mlx5_core mlx5_core: lower the severity of message noting that no SR-IOV cap is present. 2020-03-18 22:47:14 +00:00
mlx5_en KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself. 2020-04-14 14:46:06 +00:00
mlx5_fpga Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga. 2019-05-08 10:25:14 +00:00
mlx5_fpga_tools Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga. 2019-05-08 10:25:14 +00:00
mlx5_ib mlx5en: Support 50GBase-KR4 media type in mlx5en driver. 2020-03-04 17:13:35 +00:00
mlx5_lib mlx5fpga: Initial code import. 2018-12-05 14:11:20 +00:00
cmd.h
cq.h Widen EPOCH(9) usage in mlx5en(4). 2020-01-30 12:35:13 +00:00
device.h mlx5: Add 'follow' vport state, relevant for VFs. 2020-03-18 22:38:57 +00:00
diagnostics.h Move EEPROM information query from a sysctl in mlx5en(4) to an ioctl 2019-10-02 10:14:55 +00:00
doorbell.h Update version information for the mlx5 and mlx5en(4) modules. 2018-07-18 10:12:53 +00:00
driver.h mlx5: Restore eswitch management code from attic. 2020-03-18 22:30:56 +00:00
fs.h
mlx5_ifc.h Add basic support for TCP/IP based hardware TLS offload to mlx5core. 2019-12-05 15:16:19 +00:00
mlx5_rdma_if.h Update version information for the mlx5 and mlx5en(4) modules. 2018-07-18 10:12:53 +00:00
mlx5io.h Move EEPROM information query from a sysctl in mlx5en(4) to an ioctl 2019-10-02 10:14:55 +00:00
mpfs.h mlx5: Integrate eswitch and mpfs management code. 2020-03-18 22:33:39 +00:00
port.h mlx5en: Support 50GBase-KR4 media type in mlx5en driver. 2020-03-04 17:13:35 +00:00
qp.h Fix compilation issue with mlx5core and sparc64 (gcc48): 2019-12-06 16:20:22 +00:00
srq.h Update version information for the mlx5 and mlx5en(4) modules. 2018-07-18 10:12:53 +00:00
tls.h Add basic support for TCP/IP based hardware TLS offload to mlx5core. 2019-12-05 15:16:19 +00:00
vport.h mlx5en: Fix for inlining issues in transmit path 2018-12-05 14:21:28 +00:00