941 Commits

Author SHA1 Message Date
Navdeep Parhar
87d228f935 cxgbe/t4_tom: The MSS in a FLOWC work request must not be 0.
Submitted by:	jhb@
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-03-10 21:49:56 +00:00
Navdeep Parhar
2b9010f070 cxgbe(4): Do not try to use 0 as an rx buffer address when the driver is
already allocating from the safe zone and the allocation fails.

This bug was introduced in r357481.

MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-03-10 21:44:20 +00:00
Navdeep Parhar
7ba6f5493d cxgbe/t4_tom: Do not uninitialize a toepcb that has not been initialized.
This fixes the following panic:
--- trap 0xc, rip = 0xffffffff80c00411, rsp = 0xfffffe0025192840, rbp = 0xfffffe0025192860 ---
vmem_xfree() at vmem_xfree+0xd1/frame 0xfffffe0025192860
tls_uninit_toep() at tls_uninit_toep+0x78/frame 0xfffffe0025192880
free_toepcb() at free_toepcb+0x32/frame 0xfffffe00251928a0
t4_connect() at t4_connect+0x3be/frame 0xfffffe0025192950
tcp_offload_connect() at tcp_offload_connect+0xa4/frame 0xfffffe0025192990
tcp_usr_connect() at tcp_usr_connect+0xec/frame 0xfffffe00251929f0
soconnect() at soconnect+0xae/frame 0xfffffe0025192a30
kern_connectat() at kern_connectat+0xe2/frame 0xfffffe0025192a90
sys_connect() at sys_connect+0x75/frame 0xfffffe0025192ad0
amd64_syscall() at amd64_syscall+0x137/frame 0xfffffe0025192bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe0025192bf0
--- syscall (98, FreeBSD ELF64, sys_connect), rip = 0x8008e9d8a, rsp = 0x7fffffffc0f8, rbp = 0x7fffffffc130 ---

Reviewed by:	jhb@
MFC after:	3 days
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23989
2020-03-06 19:56:12 +00:00
John Baldwin
6d44e8e6b5 Rename TOE TLS stats from [rt]x_tls_* to [rt]x_toe_tls_*.
This more clearly differentiates TLS records encrypted and decrypted
in TOE connections from those encrypted via NIC TLS.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-28 00:42:27 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Navdeep Parhar
02cd773916 cxgbe(4): Congestion drops are maintained per E-channel and not per
buffer group.

This fixes a bug where congestion drops on port 1 of a T6 card would
incorrectly be counted as drops on port 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-19 00:48:58 +00:00
Navdeep Parhar
9a4a1be02c cxgbe/iw_cxgbe: correctly enforce the max reg_mr depth.
Reported by:	Andrew Zhu @ Netapp
Obtained from:	Chelsio Communications
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-18 20:43:10 +00:00
John Baldwin
ca3b3c573e Remove the per-TXQ tls_wrs stat.
It duplicated the kern_tls_records stat and was not conditional on NIC
TLS being enabled.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23670
2020-02-13 22:55:45 +00:00
Navdeep Parhar
77ad00bf36 cxgbe(4): Update T4/5/6 firmwares to 1.24.12.0.
Obtained from:	Chelsio Communications
MFC after:	1 month
Sponsored by:	Chelsio Communications
2020-02-12 02:55:06 +00:00
Navdeep Parhar
21935a41fd cxgbe(4): Add native netmap support to the main interface.
This means that extra virtual interfaces (VIs) created with
hw.cxgbe.num_vis are no longer required to use netmap.  Use this
tunable to enable native netmap support on the main interface:

hw.cxgbe.native_netmap="3"

There is no change in default behavior.

Suggested by:	jch@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 22:29:01 +00:00
Navdeep Parhar
f4220a703d cxgbe(4): Add a knob to allow netmap tx traffic to be checksummed by
the hardware.

hw.cxgbe.nm_txcsum=1

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 00:13:15 +00:00
Navdeep Parhar
ba8b75ae01 cxgbe(4): Allow nm_black_hole and nm_cong_drop to be set at any time.
The cong_drop setting will apply to queues created after the setting is
changed and not to existing queues.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-05 00:08:58 +00:00
Navdeep Parhar
3479fe20e2 cxgbe(4): Report accurate rx_buf_maxsize to netmap.
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2020-02-04 23:55:21 +00:00
Navdeep Parhar
87bbb3338e cxgbe(4): Add pfil(9) hooks to the driver's rx.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:09:02 +00:00
Navdeep Parhar
1486d2de9e cxgbe(4): Treat NIC rx as special and run its handler directly and not
via the t4_cpl_handler dispatch table.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 01:01:35 +00:00
Navdeep Parhar
46e1e307ed cxgbe(4): Retire the allow_mbufs_in_cluster optimization.
This simplifies the driver's rx fast path as well as the bookkeeping
code that tracks various rx buffer sizes and layouts.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-04 00:51:10 +00:00
Navdeep Parhar
d6f79b2710 cxgbe(4): Avoid ext_arg2 in rxb_free.
ext_arg2 is the only item in the third cacheline in an mbuf and could be
cold by the time rxb_free runs.  Put the information needed by rxb_free
in the same line as the refcount, which is very likely to be hot given
that rxb_free runs when the refcount is decremented and reaches 0.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:50:29 +00:00
Navdeep Parhar
44c6fea82b cxgbe(4): Do not use pack boundary > 512B unless it is explicitly
requested.

This is a tradeoff between PCIe efficiency during large packet rx and
packing efficiency during small packet rx.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:30:39 +00:00
Navdeep Parhar
a9c4062a9a cxgbe(4): Initialize the rx buffer's metadata on first-use and not on
allocation.

refill_fl doesn't touch any part of a freshly allocated cluster after
this change.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:25:12 +00:00
Navdeep Parhar
9087a3df60 cxgbe(4): Only checksummed TCP should be considered for LRO.
This avoids the per-packet nanouptime in tcp_lro_rx for traffic that's
not even TCP.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2020-02-03 23:06:42 +00:00
Navdeep Parhar
46d29cab25 cxgbe/iw_cxgbe: Do not allow memory registrations with page size greater
than 128MB, which is the maximum supported by the hardware in RDMA mode.

Obtained from:	Chelsio Communications
MFC after:	3 days
Sponsored by:	Chelsio Communications
2020-01-14 01:43:04 +00:00
Bjoern A. Zeeb
334fc5822b vnet: virtualise more network stack sysctls.
Virtualise tcp_always_keepalive, TCP and UDP log_in_vain.  All three are
set in the netoptions startup script, which we would love to run for VNETs
as well [1].

While virtualising the log_in_vain sysctls seems pointles at first for as
long as the kernel message buffer is not virtualised, it at least allows
an administrator to debug the base system or an individual jail if needed
without turning the logging on for all jails running on a system.

PR:		243193 [1]
MFC after:	2 weeks
2020-01-08 23:30:26 +00:00
Gleb Smirnoff
e9edde4110 Fix a typo - passing wrong mbuf pointer to needs_udp_csum(). Will
trigger panic only on a kernel with RATELIMIT.

Submitted by:	rrs
2020-01-07 21:29:42 +00:00
Navdeep Parhar
93065a5afd cxgbe(4): check if the firmware supports FW_RI_FR_NSMR_TPTE_WR work
request.

This is used by iw_cxgbe to figure out how best to register memory.

MFC after:	1 month
Sponsored by:	Chelsio Communications
2019-12-18 19:10:30 +00:00
John Baldwin
93dafad57a Expand net epoch in the cxgbe TOE driver to satisfy assertions.
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22483
2019-12-13 23:33:54 +00:00
Navdeep Parhar
c0236bd93d cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic.
CPL_TX_PKT_XT disables the internal parser on the chip and instead
relies on the driver to provide the exact length of the L2 and L3
headers.  This allows hw checksumming and TSO to be used with L2 and
L3 encapsulations that the chip doesn't understand directly.

Note that netmap tx still uses the old CPL as it never uses the hw
to generate the checksum on tx.

Reviewed by:	jhb@
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22788
2019-12-13 20:38:58 +00:00
Navdeep Parhar
82694ec0c0 cxgbe(4): Never use hardware checksumming in netmap tx.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-12 21:33:00 +00:00
Navdeep Parhar
c08c2d42cf cxgbe(4): Simplify the firmware version checks a bit.
No functional change.

MFC after:	1 week
2019-12-10 20:12:21 +00:00
Navdeep Parhar
aa7bdbc00c cxgbe(4): Use TX_PKTS2 work requests in netmap Tx if it's available.
TX_PKTS2 is more efficient within the firmware and this improves netmap
Tx by a few Mpps in some common scenarios.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-10 08:16:19 +00:00
Navdeep Parhar
6f012c14bc cxgbe(4): Update T4/5/6 firmwares to 1.24.11.0.
These were obtained from the Chelsio Unified Wire v3.12.0.1 beta
release.

Note that the firmwares are not uuencoded any more.

MFH:		1 month
Sponsored by:	Chelsio Communications
2019-12-10 07:45:10 +00:00
Navdeep Parhar
168bde45c2 cxgbe/iw_cxgbe: Support 64b length in the memory registration routines.
Submitted by:	bharat @ chelsio
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-12-09 19:10:42 +00:00
Michael Tuexen
fa49a96419 In order for the TCP Handshake to support ECN++, and further ECN-related
improvements, the ECN bits need to be exposed to the TCP SYNcache.
This change is a minimal modification to the function headers, without any
functional change intended.

Submitted by:		Richard Scheffenegger
Reviewed by:		rgrimes@, rrs@, tuexen@
Differential Revision:	https://reviews.freebsd.org/D22436
2019-12-01 18:05:02 +00:00
Navdeep Parhar
e3338dee08 cxgbe(4): Allow the driver to specify multiple FECs that the firmware
should try in order to link up with the peer.

Various FEC variables within the driver can now have multiple bits set
instead of being powers of 2.  0 and -1 in the user knobs still mean no
FEC and auto (driver decides) respectively for backward compatibility,
but no-FEC and auto now have their own bits in the internal
representation.  There is a new bit that can be set to request the FEC
recommended by the cable/transceiver module.

Add sysctls to display link related capabilities of the local side as
well as the link partner.

Note that all this needs a new firmware and the documentation for the
driver FEC knobs will be updated after that firmware is added to the
driver.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-26 05:54:25 +00:00
Navdeep Parhar
515a40d5d9 cxgbe(4): sysctl to reset the temperature/voltage sensor.
# sysctl dev.<nexus>.<inst>.reset_sensor=1
# sysctl dev.t6nex.0.reset_sensor=1

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-24 16:40:54 +00:00
Navdeep Parhar
e56d731b7d cxgbe(4): Update the firmware interface header.
This allows the driver to be updated for the next firmware without
waiting for it to be released.

MFC after:	2 weeks
Sponsored by:	Chelsio Communications
2019-11-24 05:37:28 +00:00
John Baldwin
bddf73433e NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.

NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine.  Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS.  This permits using the
TOE to segment the encrypted TLS records.  However, this approach does
have some limitations:

1) Regular TOE sockets cannot be used when the TOE is in this special
   mode.  One can use either TOE and TOE-based KTLS or NIC KTLS, but
   not both at the same time.

2) In NIC KTLS mode, the TOE is only able to accept a per-connection
   timestamp offset that varies in the upper 4 bits.  Put another way,
   only connections whose timestamp offset has the 28 lower bits
   cleared can use NIC KTLS and generate correct timestamps.  The
   driver will refuse to enable NIC KTLS on connections with a
   timestamp offset with any of the lower 28 bits set.  To use NIC
   KTLS, users can either disable TCP timestamps by setting the
   net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
   tcp_new_ts_offset() function to clear the lower 28 bits of the
   generated offset.

3) Because the TCP segmentation relies on fields mirrored in a TCB in
   the TOE, not all fields in a TCP packet can be sent in the TCP
   segments generated from a TLS record.  Specifically, for packets
   containing TCP options other than timestamps, the driver will
   inject an "empty" TCP packet holding the requested options (e.g. a
   SACK scoreboard) along with the segments from the TLS record.
   These empty TCP packets are counted by the
   dev.cc.N.txq.M.kern_tls_options sysctls.

Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer.  The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records.  In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.

TCP packets for TLS requests are broken down into the following
classes (with associated counters):

- Mbufs that send an entire TLS record in full do not have any waste
  bytes (dev.cc.N.txq.M.kern_tls_full).

- Mbufs that send a short TLS record that ends before the end of the
  trailer (dev.cc.N.txq.M.kern_tls_short).  For sockets using AES-CBC,
  the encryption must always start at the beginning, so if the mbuf
  starts at an offset into the TLS record, the offset bytes will be
  "waste" bytes.  For sockets using AES-GCM, the encryption can start
  at the 16 byte block before the starting offset capping the waste at
  15 bytes.

- Mbufs that send a partial TLS record that has a non-zero starting
  offset but ends at the end of the trailer
  (dev.cc.N.txq.M.kern_tls_partial).  In order to compute the
  authentication hash stored in the trailer, the entire TLS record
  must be sent as input to the crypto engine, so the bytes before the
  offset are always "waste" bytes.

In addition, other per-txq sysctls are provided:

- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
  using AES-CBC.

- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
  using AES-GCM.

- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
  compensate for the TOE engine not being able to set FIN on the last
  segment of a TLS record if the TLS record mbuf had FIN set.

- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
  txq including full, short, and partial records.

- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
  and payload) sent for TLS record requests.

- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
  record requests.

To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:

hw.cxgbe.config_file=kern_tls
hw.cxgbe.kern_tls=1

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21962
2019-11-21 19:30:31 +00:00
Navdeep Parhar
5877e649f0 cxgbev(4): Catch up with the pciids in the PF driver.
MFC after:	3 days
Sponsored by:	Chelsio Communications
2019-11-15 18:48:14 +00:00
Gleb Smirnoff
782b97cb80 Fix regression from r353841: ctx.rc needs to be initialized,
otherwise driver might silently fail to initialize.

Pointy hat to:	glebius
2019-11-15 18:02:37 +00:00
John Baldwin
a1b2b6e184 Create a file to hold shared routines for dealing with T6 key contexts.
ccr(4) and TLS support in cxgbe(4) construct key contexts used by the
crypto engine in the T6.  This consolidates some duplicated code for
helper functions used to build key contexts.

Reviewed by:	np
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22156
2019-11-13 00:53:45 +00:00
Navdeep Parhar
43b5712444 cxgbe(4): Query Vdd from the firmware if its last known value is 0.
TVSENSE may not be ready by the time t4_fw_initialize returns and the
firmware returns 0 if the driver asks for the Vdd before the sensor is
ready.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-11-08 01:13:12 +00:00
Gleb Smirnoff
1a49612526 Mechanically convert INP_INFO_RLOCK() to NET_EPOCH_ENTER().
Remove few outdated comments and extraneous assertions.  No
functional change here.
2019-11-07 00:08:34 +00:00
Navdeep Parhar
2c4c3f83e7 cxgbe(4): Use correct size while converting lpacaps32 to native
endianness.
2019-10-31 00:35:26 +00:00
Navdeep Parhar
adb0cd8408 cxgbe(4): Use correct FetchBurstMin values for T6.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-10-25 21:53:05 +00:00
John Baldwin
e38a50e8b6 Split Chelsio send tags into a generic base tag and a ratelimit tag.
NIC KTLS will add a new TLS send tag type in cxgbe(4) that is a
distinct tag from a ratelimit tag.  To support this, refactor
cxgbe_snd_tag to be a simple send tag with a type and convert the
existing ratelimit tag to a new cxgbe_rate_tag structure.

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22072
2019-10-22 20:41:54 +00:00
John Baldwin
866a7f286f Always allocate the atid table during attach.
Previously the table was allocated on first use by TOE and the
ratelimit code.  The forthcoming NIC KTLS code also uses this table.
Allocate it unconditionally during attach to simplify consumers.

Reviewed by:	np
Differential Revision:	https://reviews.freebsd.org/D22028
2019-10-22 20:01:47 +00:00
Gleb Smirnoff
02cc07d105 Convert to if_foreach_llmaddr() KPI. 2019-10-21 18:11:11 +00:00
Navdeep Parhar
693a9dfce2 cxgbe(4): An EQ update can be requested in a TX_PKTS2 work request.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2019-10-15 17:35:39 +00:00
John Baldwin
aeb63511bd Remove an unused parameter from get_new_keyid(). 2019-10-14 18:02:56 +00:00
John Baldwin
b60229e2f1 Remove adapters from t4_list earlier during detach.
This ensures the clip task won't race with t4_destroy_clip_table.

While here, make some mutex destroys unconditional since attach always
initializes them.

Reviewed by:	np
MFC after:	1 week
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21952
2019-10-09 21:08:51 +00:00
John Baldwin
4f13842f75 Add support for KTLS in the Chelsio TOE module.
This adds a TOE hook to allocate a KTLS session.  It also recognizes
TLS mbufs in the socket buffer and sends those to the NIC using a TLS
work request to encrypt the record before segmenting it.

TOE TLS support must be enabled via the dev.t6nex.<N>.tls sysctl in
addition to enabling KTLS.

Reviewed by:	np, gallatin
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21891
2019-10-08 21:40:42 +00:00