freebsd-dev

Author	SHA1	Message	Date
John Baldwin	671fd0ec8d	cxgbei: Remove unused sysctls. These were seemingly copied over from icl_soft. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D30268	2021-05-19 15:56:45 -07:00
John Baldwin	a9f0cf4838	cxgbe: Fix some merge-o's for the per-rxq iSCSI counters. I botched a few of the changes when rebasing the changes in `4b6ed0758d` across the changes in `43bbae1948`. - Move the counter allocations into alloc_ofld_rxq(). - Free the counters when freeing an ofld rxq. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D30267	2021-05-19 15:56:31 -07:00
Navdeep Parhar	3965469eaa	cxgbe(4): Remove some dead code. MFC after: 3 days	2021-05-18 23:16:03 -07:00
John Baldwin	8d2b4b2e7c	cxgbe: Cast pointer arguments to trunc_page() to vm_offset_t. Reported by: mjg, jenkins, rmacklem Fixes: `46bee8043e` Sponsored by: Chelsio Communications	2021-05-17 17:04:22 -07:00
John Baldwin	e73e2ee0ac	cxgbei: Handle target transfers with excess unsolicited data. The CTL frontend might have provided a buffer that is smaller than the FirstBurstLength and thus smaller than the amount of unsolicited data included in the request PDU. Treat these transfers as an empty transfer. Reported by: Jithesh Arakkan @ Chelsio Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29940	2021-05-14 12:21:34 -07:00
John Baldwin	e894e3adb2	cxgbei: Explicitly clear the page pod reservation pointer after freeing it. A single union ctl_io can be reused across multiple transfers (in particular by the ramdisk backend). On a reuse, the reservation pointer would retain its value from the previous transfer tripping an assertion. Reported by: Jithesh Arakkan @ Chelsio Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29939	2021-05-14 12:21:34 -07:00
John Baldwin	1ad32ad0be	cxgbei: Don't clamp iSCSI PDUs to 8K. The firmware no longer requires this workaround. Discussed with: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29912	2021-05-14 12:21:24 -07:00
John Baldwin	4add8e4c89	cxgbei: Don't leak resources for an aborted target transfer. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29911	2021-05-14 12:17:26 -07:00
John Baldwin	a1c687347a	cxgbei: Add support for zero-copy iSCSI target transmission/read. - Switch to allocating the cxgbei version of icl_pdu explicitly as a separate refcounted object allocated via malloc/free instead of storing it in the bhs mbuf prior to the bhs. - Support the icl_conn_pdu_queue_cb() method to set a callback on a PDU to be invoked when the PDU is freed. - For ICL_NOCOPY buffers, use an external mbuf to manage the storage for the buffer via m_extaddref(). Each external mbuf holds a reference on the associated PDU, so the callback is invoked once all of the external mbufs have been freed. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29910	2021-05-14 12:17:20 -07:00
John Baldwin	31df8ff73e	cxgbei: Rework the pdu_append_data hook to support M_WAITOK. - Only allocate 16K jumbo mbufs if the region of data to be appended is sufficiently large, and use a loop. - Use m_getm2() to allocate a chain for data less than 16K, or if m_getjcl() fails. - Use ENOMEM as the return value instead of '1' if the hook fails due to a memory allocation error. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29909	2021-05-14 12:17:14 -07:00
John Baldwin	46bee8043e	cxgbei: Support DDP for target I/O S/G lists with more than one entry. A CAM target layer I/O CCB can use a S/G list of virtual address ranges to describe its data buffer. This change adds zero-copy receive support for such requests. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29908	2021-05-14 12:17:06 -07:00
John Baldwin	23b209ee88	cxgbe tom: Account for pre-iSCSI mode data on suspended connections. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29907	2021-05-14 12:17:02 -07:00
John Baldwin	91ca7b0954	cxgbei: Whitespace fixes, comment typo, and rewrap a comment. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29906	2021-05-14 12:16:57 -07:00
John Baldwin	87bb5ed606	cxgbei: Use hardware RX flow control for offloaded iSCSI connections. Forthcoming T6 iSCSI DDP support requires hardware RX flow control. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29905	2021-05-14 12:16:51 -07:00
John Baldwin	4427ac3675	cxgbe tom: Set the tid in the work requests to program page pods for iSCSI. As a result, CPL_FW4_ACK now returns credits for these work requests. To support this, page pod work requests are now constructed in special mbufs similar to "raw" mbufs used for NIC TLS in plain TX queues. These special mbufs are stored in the ulp_pduq and dispatched in order with PDU work requests. Sponsored by: Chelsio Communications Discussed with: np Differential Revision: https://reviews.freebsd.org/D29904	2021-05-14 12:16:40 -07:00
John Baldwin	4b6ed0758d	cxgbe: Make the TOE ISCSI RX stats per-queue instead of per adapter. Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29903	2021-05-14 12:16:33 -07:00
Navdeep Parhar	f4ba035bca	cxgbe(4): Use ifaddr_event_ext instead of ifaddr_event for CLIP management. The _ext event notification includes the address being added/removed and that gives the driver an easy way to ignore non-IPv6 addresses. Remove 'tom' from the handler's name while here, it was moved out of t4_tom a long time ago. MFC after: 1 week Sponsored by: Chelsio Communications	2021-05-04 20:16:25 -07:00
Navdeep Parhar	b9820bca18	cxgbe(4): Do not panic when tx is called with invalid checksum requests. There is no need to panic in if_transmit if the checksums requested are inconsistent with the frame being transmitted. This typically indicates that the kernel and driver were built with different INET/INET6 options, or there is some other kernel bug. The driver should just throw away the requests that it doesn't understand and move on. MFC after: 1 week Sponsored by: Chelsio Communications	2021-04-28 14:04:53 -07:00
Navdeep Parhar	83b5cda106	cxgbe(4): Add support for NIC suspend/resume and live reset. Add suspend/resume callbacks to the driver and a live reset built around them. This commit covers the basic NIC and future commits will expand this functionality to other stateful parts of the chip. Suspend and resume operate on the chip (the t?nex nexus device) and affect all its ports. It is not possible to suspend/resume or reset individual ports. All these operations can be performed on a running NIC. A reset will look like a link bounce to the networking stack. Here are some ways to exercise this functionality: /* Manual suspend and resume. */ # devctl suspend t6nex0 # devctl resume t6nex0 /* Manual reset. */ # devctl reset t6nex0 /* Manual reset with driver sysctl. */ # sysctl dev.t6nex.0.reset=1 /* Automatic adapter reset on any fatal error. */ # hw.cxgbe.reset_on_fatal_err=1 Suspend disables the adapter (DMA, interrupts, and the port PHYs) and marks the hardware as unavailable to the driver. All ifnets associated with the adapter are still visible to the kernel but operations that require hardware interaction will fail with ENXIO. All ifnets report link-down while the adapter is suspended. Resume will reattach to the card, reconfigure it as before, and recreate the queues servicing the existing ifnets. The ifnets are able to send and receive traffic as soon as the link comes back up. Reset is roughly the same as a suspend and a resume with at least one of these events in between: D0->D3Hot->D0, FLR, PCIe link retrain. MFC after: 1 month Relnotes: yes Sponsored by: Chelsio Communications	2021-04-27 22:48:51 -07:00
Navdeep Parhar	43bbae1948	cxgbe(4): Separate the sw- and hw-specific parts of resource allocations. The driver uses both software resources (locks, callouts, memory for descriptors and for bookkeeping, sysctls, etc.) and hardware resources (VIs, DMA queues, TCAM entries, etc.) to operate the NIC. This commit splits the single _ALLOCATED flag used to track all these resources into separate _SW_ALLOCATED and _HW_ALLOCATED flags. This is the simplified pseudocode that now applies to most queues (foo can be ctrlq/txq/rxq/ofld_txq/ofld_rxq): /* Idempotent */ alloc_foo { if (!SW_ALLOCATED) init_iq/init_eq/init_fl no-fail sw init alloc_iq_fl/alloc_eq/alloc_wrq may-fail sw alloc add_foo_sysctls, etc. no-fail post-alloc items if (!HW_ALLOCATED) alloc_iq_fl_hwq/alloc_eq_hwq hw resource allocation } /* Idempotent */ free_foo { if (HW_ALLOCATED) free_iq_fl_hwq/free_eq_hwq release hw resources if (SW_ALLOCATED) free_iq_fl/free_eq/free_wrq release sw resources } The routines that take the driver to FULL_INIT_DONE and VI_INIT_DONE and back are now all idempotent. The quiesce routines pay attention to the HW_ALLOCATED flag and will not wait on the hardware for pidx/cidx updates and other completions if this flag is not set. MFC after: 1 month Sponsored by: Chelsio Communications	2021-04-26 14:09:59 -07:00
Navdeep Parhar	50f5d13eeb	cxgbe(4): hw.cxgbe.panic_on_fatal_err can be changed any time. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-04-23 12:17:54 -07:00
Navdeep Parhar	5f00292fe3	cxgbe(4): Move the hw-specific parts of VXLAN setup to a separate function. It can be called to (re)apply the settings in the driver softc to the hardware. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-04-23 00:26:47 -07:00
Navdeep Parhar	b47b28e5b2	cxgbe(4): Add flag to reliably stop the driver from accessing hw stats. There are two kinds of routines in the driver that read statistics from the hardware: the cxgbe_* variants read the per-port MPS/MAC registers and the vi_* variants read the per-VI registers. They can be called from the 1Hz callout or if_get_counter. All stats collection now takes place under the callout lock and there is a new flag to indicate that these routines should not access any hardware register. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-04-22 17:45:52 -07:00
Navdeep Parhar	dc77e79296	cxgbe(4): Fix minor nit in the display of MPS TCAM entries. MFC after: 3 days	2021-04-22 15:36:51 -07:00
Navdeep Parhar	8f1bc78ef7	cxgbe(4): make the logging helpers a little more robust. MFC after: 3 days Sponsored by: Chelsio Communications	2021-04-22 15:28:43 -07:00
Navdeep Parhar	557c4521bb	cxgbe/t4_tom: Implement tod_pmtu_update. tod_pmtu_update was added to the kernel in `01d74fe1ff`. Sponsored by: Chelsio Communications	2021-04-22 14:48:57 -07:00
Navdeep Parhar	d107ee06f3	cxgbe(4): RSS hash for VXLAN traffic is computed from the inner frame. Sponsored by: Chelsio Communications	2021-04-13 16:50:12 -07:00
John Baldwin	774c4c82ff	TOE: Use a read lock on the PCB for syncache_add(). Reviewed by: np, glebius Fixes: `08d9c92027` Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29739	2021-04-13 16:31:04 -07:00
John Baldwin	45d5c28439	cxgbe: Ignore doomed virtual interfaces when updating the clip table. A doomed VI does not have a valid ifnet. Reported by: Jithesh Arakkan @ Chelsio Reviewed by: np MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29662	2021-04-12 14:36:40 -07:00
John Baldwin	568e69e4eb	cxgbe: Add counters for iSCSI PDUs transmitted via TOE. Reviewed by: np MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29297	2021-04-12 13:57:45 -07:00
Navdeep Parhar	bf5057691b	cxgbe/tom: Fix potential leak in t4_aiotx_process_job. The mbuf allocated could be a chain and must be freed with m_freem. Reviewed by: jhb@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29579	2021-04-11 19:14:18 -07:00
Navdeep Parhar	516fe911a6	cxgbe(4): Always use the per-VI callout to read interface stats. There is no change in the source of the stats (t4_get_port_stats or t4_get_vi_stats) but the per-port callout is gone. Sponsored by: Chelsio Communications Reviewed by: jhb@ Differential Revision: https://reviews.freebsd.org/D29527	2021-04-01 14:24:29 -07:00
Navdeep Parhar	5394893269	cxgbe/t4_tom: restore socket's protosw before entering TIME_WAIT. This fixes a panic due to stale so->so_proto if t4_tom is unloaded and one or more connections that were previously offloaded are still around in TIME_WAIT state. Reviewed by: jhb@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29503	2021-03-31 10:54:32 -07:00
John Baldwin	fe496dc02a	cxgbe: Make the TOE TLS stats per-queue instead of per-port. This avoids some atomics by using counter_u64 for TX and relying on existing single-threading (single ithread per rxq) for RX. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29383	2021-03-26 15:19:58 -07:00
John Baldwin	077ba6a845	cxgbe: Add a struct sge_ofld_txq type. This type mirrors struct sge_ofld_rxq and holds state for TCP offload transmit queues. Currently it only holds a work queue but will include additional state in future changes. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29382	2021-03-26 15:19:58 -07:00
Bjoern A. Zeeb	0a7b99553f	cxgbe: remove unused linux headers. Remove unused #includes of LinuxKPI headers noticed while trying to solve LinuxKPI struct net_device and related functions. Neither netdevice.h nor inetdevice.h nor notifier.h seem to be needed. This takes cxgbe(4) out of the picture of D29366. Sponsored-by: The FreeBSD Foundation MFC-after: 2 weeks Reviewed-by: np X-D-R: D29366 (extracted as further cleanup) Differential Revision: https://reviews.freebsd.org/D29432	2021-03-26 17:44:38 +00:00
Navdeep Parhar	15f3355567	cxgbe(4): Allow a T6 adapter to switch between TOE and NIC TLS mode. The hw.cxgbe.kern_tls tunable was used for this in the past and if it was set then all T6 adapters would be configured for NIC TLS operation and could not be reconfigured for TOE without a reload. With this change ifconfig can be used to manipulate toe and txtls caps like any other caps. hw.cxgbe.kern_tls continues to work as usual but its effects are not permanent any more. * Enable nic_ktls_ofld in the default configuration file and use the firmware instead of direct register manipulation to apply/rollback NIC TLS configuration. This allows the driver to switch the hardware between TOE and NIC TLS mode in a safe manner. Note that the configuration is adapter-wide and not per-port. * Remove the kern_tls config file as it works with 100G T6 cards only and leads to firmware crashes with 25G cards. The configurations included with the driver (with the exception of the FPGA configs) are supposed to work with all adapters. Reported by: Veeresh U.K. at Chelsio MFC after: 2 weeks Sponsored by: Chelsio Communications Reviewed by: jhb@ Differential Revision: https://reviews.freebsd.org/D29291	2021-03-25 12:39:41 -07:00
John Baldwin	90c74b2b60	cxgbei: Enter network epoch and set vnet around t4_push_pdus(). Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29302	2021-03-22 10:05:02 -07:00
John Baldwin	017902fc5f	cxgbe ddp: Use CPL_COOKIE_DDP* instead of DDP_BUF*_INVALIDATED. This avoids mixing the use of two different enums which modern C compilers warn about. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29301	2021-03-22 10:05:02 -07:00
John Baldwin	8855ed61b5	cxgbei: Pass ULP submode directly to set_ulp_mode_iscsi(). Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29300	2021-03-22 10:05:02 -07:00
John Baldwin	45eed2331e	cxgbei: Move some function prototypes to cxgbei.h. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29299	2021-03-22 10:05:02 -07:00
John Baldwin	52c11c3f74	cxgbei: Set vnet around tcp_drop() in do_rx_iscsi_ddp(). Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29298	2021-03-22 10:05:02 -07:00
Navdeep Parhar	3cc6f777be	cxgbe(4): create a separate helper routine to write the global RSS key. While here, make sure only the PF driver attempts to program the global RSS key (with options RSS). The VF driver doesn't have access to those device registers. MFC after: 1 week Sponsored by: Chelsio Communications	2021-03-19 13:35:30 -07:00
Navdeep Parhar	a1d803c162	cxgbe(4): make it safe to call setup_memwin repeatedly. A repeat call will recreate the memory windows in the hardware and move them to their last-known positions without repeating any of the software initialization. MFC after: 1 week Sponsored by: Chelsio Communications	2021-03-19 12:37:44 -07:00
Navdeep Parhar	473f6163e3	cxgbe(4): use standard sysctl routines to deal with 16b values. These routines to handle 8b and 16b types were added in r289773 5+ years ago. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-03-19 10:56:24 -07:00
Navdeep Parhar	0b373f26be	cxgbe(4): catch up with the latest cryptocaps. There are two crypto capabilities that the driver didn't know about. MFC after: 1 week Sponsored by: Chelsio Communications	2021-03-16 10:53:52 -07:00
John Baldwin	5fe0cd6503	ccr: Disable requests on port 1 when needed to work around a firmware bug. Completions for crypto requests on port 1 can sometimes return a stale cookie value due to a firmware bug. Disable requests on port 1 by default on affected firmware. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26581	2021-03-12 10:59:35 -08:00
John Baldwin	9c5137beb5	ccr: Add per-port stats of queued and completed requests. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29176	2021-03-12 10:59:35 -08:00
John Baldwin	8f885fd1f3	ccr: Set the RX channel ID correctly in work requests. These fixes are only relevant for requests on the second port. In some cases, the crypto completion data, completion message, and receive descriptor could be written in the wrong order. - Add a separate rx_channel_id that is a copy of the port's rx_c_chan and use it when an RX channel ID is required in crypto requests instead of using the tx_channel_id. - Set the correct rx_channel_id in the CPL_RX_PHYS_ADDR used to write the crypto result. - Set the FID to the first rx queue ID on the adapter rather than the queue ID of the first rx queue for the port. - While here, use tx_chan to set the tx_channel_id though this is identical to the previous value. Reviewed by: np Reported by: Chelsio QA Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D29175	2021-03-12 10:59:35 -08:00
Navdeep Parhar	765d623d60	cxgbe(4): Remove extra blank line. No functional change.	2021-03-05 12:48:39 -08:00
Navdeep Parhar	4a4e9c516c	cxgbe(4): Fix an assertion that is not valid during attach. Firmware access from t4_attach takes place without any synchronization. The driver should not panic (debug kernels) if something goes wrong in early communication with the firmware. It should still load so that it's possible to poke around with cxgbetool. MFC after: 1 week Sponsored by: Chelsio Communications	2021-03-05 11:28:18 -08:00
Navdeep Parhar	dfff1de729	cxgbe(4): Read the rx 'c' channel for a port and make it available. MFC after: 1 week Sponsored by: Chelsio Communications	2021-02-25 23:46:14 -08:00
Navdeep Parhar	0460a45062	cxgbe(4): Use the correct filter width for T5+. T5 and above have extra bits for the optional filter fields. This is a correctness issue and not just a waste because a filter mode valid on a T4 (36b) may not be valid on a T5+ (40b). MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Navdeep Parhar	c91dda5ad9	cxgbe(4): Add a driver ioctl to set the filter mask. Allow the filter mask (aka the hashfilter mode when hashfilters are in use) to be set any time it is safe to do so. The requested mask must be a subset of the filter mode already. The driver will not change the mode or ingress config just to support a new mask. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Navdeep Parhar	7ac8040a99	cxgbe(4): Use firmware commands to get/set filter configuration. 1. Query the firmware for filter mode, mask, and related ingress config instead of trying to figure them out from hardware registers. Read configuration from the registers only when the firmware does not support this query. 2. Use the firmware to set the filter mode. This is the correct way to do it and is more flexible as well. The filter mode (and associated ingress config) can now be changed any time it is safe to do so. The user can specify a subset of a valid mode and the driver will enable enough bits to make sure that the mode is maxed out -- that is, it is not possible to set another bit without exceeding the total width for optional filter fields. This is a hardware requirement that was not enforced by the driver previously. MFC after: 2 weeks Sponsored by: Chelsio Communications	2021-02-19 14:23:58 -08:00
Navdeep Parhar	fae028dd97	cxgbe(4): Break up t4_read_chip_settings. Read the PF-only hardware settings directly in get_params__post_init. Split the rest into two routines used by both the PF and VF drivers: one that reads the SGE rx buffer configuration and another that verifies miscellaneous hardware configuration. MFC after: 1 week Sponsored by: Chelsio Communications	2021-02-18 01:22:42 -08:00
John Baldwin	1deaad9364	Handle negative return values from syncache_expand(). These errors do not clear so to NULL, so the existing check was treating these failures as success. The rest of do_pass_establish() then tried to use the listen socket as if it was a connection socket newly created by syncache_expand(). In addition, for negative return values, do not send a RST to the peer. Reported by: Sony Arpita Das @ Chelsio Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D28243	2021-02-17 13:28:04 -08:00
Alexander Motin	294e62bebf	cxgbe(4): Save proper zone index on low memory in refill_fl(). When refill_fl() fails to allocate a large (9/16KB) mbuf cluster, it falls back to safe (4KB) ones. But it still saved the original fl->zidx into sd->zidx instead of fl->safe_zidx. This caused problems with the later use of that cluster, including memory and/or data corruption. While there, make refill_fl() use the safe zone for all following clusters in the same call, since a large allocation is unlikely to succeed. MFC after: 3 days Sponsored by: iXsystems, Inc. Reviewed by: np, jhb Differential Revision: https://reviews.freebsd.org/D28716	2021-02-16 21:15:28 -05:00
Navdeep Parhar	3447df8bc5	cxgbe(4): Fixes to tx coalescing. - The behavior implemented in r362905 resulted in delayed transmission of packets in some cases, causing performance issues. Use a different heuristic to predict tx requests. - Add a tunable/sysctl (hw.cxgbe.tx_coalesce) to disable tx coalescing entirely. It can be changed at any time. There is no change in default behavior.	2021-02-01 03:00:09 -08:00
Gleb Smirnoff	3f43ada98c	Catch up with `6edfd179c8`: mechanically rename IFCAP_NOMAP to IFCAP_MEXTPG. Originally IFCAP_NOMAP meant that the mbuf has external storage pointer that points to unmapped address. Then, this was extended to array of such pointers. Then, such mbufs were augmented with header/trailer. Basically, extended mbufs are extended, and set of features is subject to change. The new name should be generic enough to avoid further renaming.	2021-01-29 11:46:24 -08:00
Mateusz Guzik	6b3a9a0f3d	Convert remaining cap_rights_init users to cap_rights_init_one semantic patch: @@ expression rights, r; @@ - cap_rights_init(&rights, r) + cap_rights_init_one(&rights, r)	2021-01-12 13:16:10 +00:00
John Baldwin	6727847500	Don't try to adjust a TLS TOE socket that has been closed. The handshake timer can race with another thread sending a FIN or RST to close a TOE TLS socket. Just bail from the timer without rescheduling if the connection is closed when the timer fires. Reported by: Sony Arpita Das @ Chelsio QA Reviewed by: np Differential Revision: https://reviews.freebsd.org/D27583	2020-12-30 09:56:24 -08:00
Toomas Soome	40c4557bee	cxgbe: replace zero sized array by flexible array. The issue was found while building cxgbe with gcc 10 (in illumos); the array subscript check warns about out-of-bounds access. See also: https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html	2020-12-29 23:09:15 +02:00
John Baldwin	0082e479ef	Clear TLS offload mode if a TLS socket hangs without receiving data. By default, if a TOE TLS socket stops receiving data for more than 5 seconds, revert the connection back to plain TOE mode. This provides a fallback if the userland SSL library does not support KTLS. In addition, for client TLS 1.3 sockets using connect(), the TOE socket blocks before the handshake has completed since the socket option is only invoked for the final handshake. The timeout defaults to 5 seconds, but can be changed at boot via the hw.cxgbe.toe.tls_rx_timeout tunable or for an individual interface via the dev.<nexus>.toe.tls_rx_timeout sysctl. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27470	2020-12-03 22:06:08 +00:00
Navdeep Parhar	180c2dca4e	cxgbe(4): Fix vertical alignment in sysctl_cpl_stats. MFC after: 3 days Sponsored by: Chelsio Communications	2020-12-03 22:04:23 +00:00
John Baldwin	99963f5343	Don't transmit mbufs that aren't yet ready on TOE sockets. This includes mbufs waiting for data from sendfile() I/O requests, or mbufs awaiting encryption for KTLS. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27469	2020-12-03 22:01:13 +00:00
Navdeep Parhar	dbc5c85c66	cxgbe(4): two new debug sysctls. dev.<nexus>.<instance>.misc.tid_stats dev.<nexus>.<instance>.misc.tnl_stats MFC after: 3 days Sponsored by: Chelsio Communications	2020-12-03 22:00:41 +00:00
John Baldwin	a42f096821	Clear TLS offload mode for unsupported cipher suites and versions. If TOE TLS is requested for an unsupported cipher suite or TLS version, disable TLS processing and fall back to plain TOE. In addition, if an error occurs when saving the decryption keys in the card's memory, disable TLS processing and fall back to plain TOE. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27468	2020-12-03 21:59:47 +00:00
John Baldwin	05d5675520	Fix downgrading of TOE TLS sockets to plain TOE. If a TOE TLS socket ends up using an unsupported TLS version or ciphersuite, it must be downgraded to a "plain" TOE socket with TLS encryption/decryption performed on the host. The previous implementation of this fallback was incomplete and resulted in hung connections. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27467	2020-12-03 21:49:20 +00:00
Navdeep Parhar	8eba75ed68	cxgbe(4): Stop but don't free netmap queues when netmap is switched off. It is common for freelists to be starving when a netmap application stops. Mailbox commands to free queues can hang in such a situation. Avoid that by not freeing the queues when netmap is switched off. Instead, use an alternate method to stop the queues without releasing the context ids. If netmap is enabled again later then the same queue is reinitialized for use. Move alloc_nm_rxq and txq to t4_netmap.c while here. MFC after: 1 week Sponsored by: Chelsio Communications	2020-12-03 08:30:29 +00:00
Navdeep Parhar	f42f3b2955	cxgbe(4): Revert r367917. r367917 fixed the backpressure on the netmap rxq being stopped but that doesn't help if some other netmap rxq is starved (because it is stopping too although the driver doesn't know this yet) and blocks the pipeline. An alternate fix that works in all cases will be checked in instead. Sponsored by: Chelsio Communications	2020-12-02 20:54:03 +00:00
Navdeep Parhar	b3718e2d7e	cxgbe(4): Catch up with in-flight netmap rx before destroying queues. The netmap application using the driver is responsible for replenishing the receive freelists and they may be totally depleted when the application exits. Packets in flight, if any, might block the pipeline in case there aren't enough buffers left in the freelist. Avoid this by filling up the freelists with a driver allocated buffer. MFC after: 1 week Sponsored by: Chelsio Communications	2020-11-21 03:27:32 +00:00
Navdeep Parhar	bdabd00d65	cxgbe/t4_tom: Handle VXLAN-encapsulated SYNs correctly. TCP SYNs in inner traffic will hit hardware listeners when VXLAN/NVGRE rx parsing is enabled in the chip. t4_tom should pass on these SYNs to the kernel and let it deal with them as if they arrived on the non-TOE path. Reported by: Sony at Chelsio MFC after: 1 week Sponsored by: Chelsio Communications	2020-11-12 20:02:48 +00:00
Navdeep Parhar	f14d7c9516	cxgbev(4): Make sure that the iq/eq map sizes are correct for VFs. This should have been part of r366929. MFC after: 3 days Sponsored by: Chelsio Communications	2020-11-12 01:18:05 +00:00
John Baldwin	b3ceca0c80	Clear tp->tod in t4_pcb_detach(). Otherwise, a socket can have a non-NULL tp->tod while TF_TOE is clear. In particular, if a newly accepted socket falls back to non-TOE due to an active open failure, the non-TOE socket will still have tp->tod set even though TF_TOE is clear. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D27028	2020-11-10 19:54:39 +00:00
Navdeep Parhar	de0a3472d8	cxgbe(4): Allow the PF driver to set a VF's MAC address. The MAC address can be set with the optional mac-addr property in the VF section of the iovctl.conf(5) used to instantiate the VFs. MFC after: 2 weeks Sponsored by: Chelsio Communications	2020-11-09 00:08:35 +00:00
Navdeep Parhar	dc0800a9ad	cxgbev(4): Use the MAC address set by the PF if there is one. Query the firmware for the MAC address set by the PF for the VF and use it instead of the firmware generated MAC if it's available. MFC after: 2 weeks Sponsored by: Chelsio Communications	2020-11-09 00:01:13 +00:00
Navdeep Parhar	76b976ad98	cxgbe(4): Add the firmware binaries missing in r367428. Obtained from: Chelsio Communications MFC after: 5 days Sponsored by: Chelsio Communications	2020-11-08 22:30:13 +00:00
Navdeep Parhar	890efa1ab9	cxgbe(4): Update firmwares to 1.25.0.40. This fixes a potential crash in firmware 1.25.0.0 on the passive open side during TOE operation. Obtained from: Chelsio Communications MFC after: 1 week Sponsored by: Chelsio Communications	2020-11-06 19:04:20 +00:00
Mark Johnston	f7db0c9532	vmspace: Convert to refcount(9) This is mostly mechanical except for vmspace_exit(). There, use the new refcount_release_if_last() to avoid switching to vmspace0 unless other processes are sharing the vmspace. In that case, upon switching to vmspace0 we can unconditionally release the reference. Remove the volatile qualifier from vm_refcnt now that accesses are protected using refcount(9) KPIs. Reviewed by: alc, kib, mmel MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D27057	2020-11-04 16:30:56 +00:00
Navdeep Parhar	e2e43aafd7	cxgbe(4): Fix min/max typo in r366958.	2020-10-23 02:24:43 +00:00
Navdeep Parhar	b8b01d9be8	cxgbe(4): refine the values reported in if_ratelimit_query. - Get the number of classes from chip_params. - Get the number of ethofld tids from the firmware. - Do not let tcp_ratelimit allocate all traffic classes. Sponsored by: Chelsio Communications	2020-10-23 01:36:54 +00:00
John Baldwin	8a82be5044	Handle CPL_RX_DATA on active TLS sockets. In certain edge cases, the NIC might have only received a partial TLS record which it needs to return to the driver. For example, if the local socket was closed while data was still in flight, a partial TLS record might be pending when the connection is closed. Receiving a RST in the middle of a TLS record is another example. When this happens, the firmware returns the partial TLS record as plain TCP data via CPL_RX_DATA. Handle these requests by returning an error to OpenSSL (via so_error for KTLS or via an error TLS record header for the older Chelsio OpenSSL interface). Reported by: Sony Arpita Das @ Chelsio Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26800	2020-10-23 00:23:54 +00:00
Navdeep Parhar	b20b25e744	cxgbe(4): fix the size of the iq/eq maps. The firmware can allocate ingress and egress context ids anywhere from its configured range. Size the iq/eq maps to match the entire range instead of assuming that the firmware always allocates the first available context id. Reported by: Baptiste Wicht @ Verisign MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-22 08:40:25 +00:00
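The sizing change described in the commit above can be pictured with a small userland sketch: the map gets one slot for every context id in the configured range and is indexed by the id's offset from the start of that range. This is an illustration only, not the driver's code; `struct ctx_map` and the `ctx_map_*()` helpers are hypothetical names, and the range bounds are assumed to be supplied by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical lookup table covering an entire context id range. */
struct ctx_map {
	unsigned int start;	/* first context id in the configured range */
	unsigned int size;	/* number of context ids in the range */
	void **slots;		/* one slot per id, indexed by (id - start) */
};

/* Allocate one slot for every id in [start, start + size). */
static int
ctx_map_init(struct ctx_map *m, unsigned int start, unsigned int size)
{
	assert(size > 0);
	m->slots = calloc(size, sizeof(*m->slots));
	if (m->slots == NULL)
		return (-1);
	m->start = start;
	m->size = size;
	return (0);
}

/* Record a queue for a context id; reject ids outside the range. */
static int
ctx_map_set(struct ctx_map *m, unsigned int id, void *queue)
{
	if (id < m->start || id - m->start >= m->size)
		return (-1);
	m->slots[id - m->start] = queue;
	return (0);
}

/* Look up the queue for a context id, or NULL if none or out of range. */
static void *
ctx_map_get(const struct ctx_map *m, unsigned int id)
{
	if (id < m->start || id - m->start >= m->size)
		return (NULL);
	return (m->slots[id - m->start]);
}
```

Because every id in the range has a slot, the lookup does not depend on which ids the allocator happens to hand out first.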
Navdeep Parhar	37d411338e	cxgbe(4): display correct tid range for T6 based -SO cards. Reported by: Chelsio QA MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-21 20:42:29 +00:00
Navdeep Parhar	ae5da4e14d	cxgbe(4): Updates to the drop features from r366532. MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-19 21:11:49 +00:00
John Baldwin	6b7ecdcd9d	Re-enable receive flow control for TOE TLS sockets. Flow control was disabled during initial TOE TLS development to work around a hang (and to match the Linux TOE TLS support for T6). The rest of the TOE TLS code maintained credits as if flow control was enabled which was inherited from before the workaround was added with the exception that the receive window was allowed to go negative. This negative receive window handling (rcv_over) was because I hadn't realized the full implications of disabling flow control. To clean this up, re-enable flow control on TOE TLS sockets. The existing TPF_FORCE_CREDITS workaround is sufficient for the original hang. Now that flow control is enabled, remove the rcv_over workaround and instead assert that the receive window never goes negative matching plain TCP TOE sockets. Reviewed by: np MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26799	2020-10-19 20:08:50 +00:00
Navdeep Parhar	3f3e04a062	cxgbe(4): Fix page fault in t4_get_lb_stats with 2 port T5 cards. PR: 250449 Reported by: freqlabs@ MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-19 20:08:47 +00:00
Navdeep Parhar	472d183268	cxgbe(4): Do not request FEC when requesting speeds that don't have FEC. MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-14 10:12:39 +00:00
Navdeep Parhar	6cc4520b0a	cxgbe(4): unimplemented cudbg routines should return the correct internal error code and not an errno. Submitted by: Krishnamraju Eraparaju @ Chelsio MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-14 08:04:39 +00:00
Navdeep Parhar	31deb3cc76	cxgbe(4): More fixes for the T6 FCS error counter. r365732 was the first attempt to get an accurate count but it was writing to some read-only registers to clear them and that obviously didn't work. Instead, note the counter's value when it is supposed to be cleared and subtract it from future readings. dev.<port>.stats.rx_fcs_error should not be serviced from the MPS register for T6. The stats.* sysctls should all use T5_PORT_REG for T5 and above. This must have been missed in the initial T5 support years ago. Fix it while here. MFC after: 3 days Sponsored by: Chelsio Communications	2020-10-09 22:23:39 +00:00
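The approach this commit describes for a counter that cannot be cleared in hardware (note its value at clear time, subtract that from later readings) can be sketched in plain C. This is a minimal illustration, not the driver's code: `struct fcs_counter`, `fcs_counter_clear()`, and `fcs_counter_read()` are hypothetical names, and the raw hardware reading is passed in as an argument instead of being read from a register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of a read-only hardware counter. */
struct fcs_counter {
	uint64_t baseline;	/* raw value noted at the last "clear" */
};

/* "Clear" the counter by remembering the current raw reading. */
static void
fcs_counter_clear(struct fcs_counter *c, uint64_t raw)
{
	assert(c != NULL);
	c->baseline = raw;
}

/* Report the count accumulated since the last clear. */
static uint64_t
fcs_counter_read(const struct fcs_counter *c, uint64_t raw)
{
	assert(c != NULL);
	return (raw - c->baseline);
}
```

The hardware register is never written. Unsigned subtraction also keeps the result correct if the raw value wraps around once after the baseline was taken.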
Navdeep Parhar	77af2b2c85	cxgbe(4): knobs to drop various kinds of undesirable frames on ingress. These kinds of drops come for free in the sense that they do not use the filter TCAM or any other resource that wouldn't normally be used during rx. Frames dropped by the hardware get counted in the MAC's rx stats but are not delivered to the driver. hw.cxgbe.attack_filter Set to 1 to enable the "attack filter". Default is 0. The attack filter will drop an incoming frame if any of these conditions is true: src ip/ip6 == dst ip/ip6; tcp and src/dst ip is not unicast; src/dst ip is loopback (127.x.y.z); src ip6 is not unicast; src/dst ip6 is loopback (::1/128) or unspecified (::/128); tcp and src/dst ip6 is mcast (ff00::/8). hw.cxgbe.drop_ip_fragments Set to 1 to drop all incoming IP fragments. Default is 0. Note that this drops valid frames. hw.cxgbe.drop_pkts_with_l2_errors Set to 1 to drop incoming frames with Layer 2 length or checksum errors. Default is 1. hw.cxgbe.drop_pkts_with_l3_errors Set to 1 to drop incoming frames with IP version, length, or checksum errors. Default is 0. hw.cxgbe.drop_pkts_with_l4_errors Set to 1 to drop incoming frames with Layer 4 length, checksum, or other errors. Default is 0. MFC after: 2 weeks Sponsored by: Chelsio Communications	2020-10-08 10:00:13 +00:00
John Baldwin	56fb710f1b	Store the send tag type in the common send tag header. Both cxgbe(4) and mlx5(4) wrapped the existing send tag header with their own identical headers that stored the type that the type-specific tag structures inherited from, so in practice it seems drivers need this in the tag anyway. This permits removing these extra header indirections (struct cxgbe_snd_tag and struct mlx5e_snd_tag). In addition, this permits driver-independent code to query the type of a tag, e.g. to know what type of tag is being queried via if_snd_query. Reviewed by: gallatin, hselasky, np, kib Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D26689	2020-10-06 17:58:56 +00:00
Navdeep Parhar	8741306b3b	cxgbe(4) sysctls do not need Giant. Sponsored by: Chelsio Communications	2020-10-05 22:18:04 +00:00
Navdeep Parhar	73f6606b47	cxgbe(4): set up the firmware flowc for the tid before send_abort_rpl. MFC after: 3 days Sponsored by: Chelsio Communications	2020-10-02 23:48:57 +00:00
Navdeep Parhar	7676c62aa3	cxgbe(4): validate largest_rx_cluster and safest_rx_cluster. These tunables can only be set to a valid cluster size (2K, 4K, 9K, or 16K) as documented in the man page. Anything else could lead to a panic on interface up. Reported by: mav@ MFC after: 1 week Sponsored by: Chelsio Communications	2020-10-02 05:59:55 +00:00
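The kind of check this commit describes can be sketched as follows. This is a hedged userland illustration, not the driver's code: the byte values assume 2K = 2048, 4K = 4096, 9K = 9 * 1024, and 16K = 16 * 1024, the helper names are hypothetical, and falling back to a caller-supplied default on an invalid setting is an assumption about how a rejected value might be handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cluster sizes named in the commit message: 2K, 4K, 9K, and 16K. */
static const int valid_cluster_sizes[] = { 2048, 4096, 9 * 1024, 16 * 1024 };

/* Return true only if size is one of the listed cluster sizes. */
static bool
is_valid_rx_cluster(int size)
{
	size_t i;

	for (i = 0; i < sizeof(valid_cluster_sizes) /
	    sizeof(valid_cluster_sizes[0]); i++) {
		if (size == valid_cluster_sizes[i])
			return (true);
	}
	return (false);
}

/* Keep a valid setting; otherwise use the (valid) fallback. */
static int
sanitize_rx_cluster(int requested, int fallback)
{
	assert(is_valid_rx_cluster(fallback));
	return (is_valid_rx_cluster(requested) ? requested : fallback);
}
```

Validating the value once, before it is used, means a mistyped setting such as 8192 is rejected up front and never reaches the receive path.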
John Baldwin	0e99339684	Fallback to software for more GCM and CCM requests. ccr(4) uses software to handle GCM and CCM requests not supported by the crypto engine (e.g. with only AAD and no payload). This change adds a fallback for a few more requests such as those with more SGL entries than can fit in a work request (this can happen for GCM when decrypting a TLS record split across 15 or more packets). Reported by: Chelsio QA Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26582	2020-09-29 21:51:32 +00:00
Navdeep Parhar	822967e7e5	cxgbe(4): Avoid unnecessary work in the firmware during netmap tx. Bind the netmap tx queues to a special '0xff' scheduling class which makes the firmware skip some processing related to rate limiting on the outgoing traffic. Future firmwares will do this automatically. MFC after: 1 week Sponsored by: Chelsio Communications	2020-09-29 09:25:52 +00:00
Navdeep Parhar	7efe256233	Remove duplicate line.	2020-09-29 09:11:51 +00:00
Navdeep Parhar	15ca0766ed	cxgbe(4): adjust the doorbell threshold for netmap freelists to match the maximum burst size used when fetching descriptors from the list. MFC after: 1 week Sponsored by: Chelsio Communications	2020-09-29 07:51:06 +00:00
Navdeep Parhar	f7b8615af5	cxgbe(4): display an error message when netmap cannot be enabled because the interface is down. MFC after: 1 week	2020-09-29 07:36:21 +00:00
Navdeep Parhar	a9f476580e	cxgbe(4): fixes for netmap operation with only some queues active. - Only active netmap receive queues should be in the RSS lookup table. - The RSS table should be restored for NIC operation when the last active netmap queue is switched off, not the first one. - Support repeated netmap ON/OFF on a subset of the queues. This works whether the queues being enabled and disabled are the only ones active or not. Some kring indexes have to be reset in the driver for the second case. MFC after: 1 week Sponsored by: Chelsio Communications	2020-09-29 05:08:45 +00:00
Navdeep Parhar	30e3f2b4ea	cxgbe(4): let the PF driver use VM work requests for transmit. This allows the PF interfaces to communicate with the VF interfaces over the internal switch in the ASIC. Fix the GL limits for VM work requests while here. MFC after: 3 days Sponsored by: Chelsio Communications	2020-09-22 04:16:40 +00:00
Navdeep Parhar	7054f6ec97	cxgbe(4): add counters for mbuf pullups and defrags. MFC after: 3 days Sponsored by: Chelsio Communications	2020-09-22 03:06:36 +00:00
Navdeep Parhar	3b8506ae30	cxgbe(4): add the firmware binaries instead of the empty files that were added in r365861. Obtained from: Chelsio Communications MFC after: 3 days Sponsored by: Chelsio Communications	2020-09-18 03:11:47 +00:00
Navdeep Parhar	a4a4ad2dd9	cxgbe(4): add support for stateless offloads for VXLAN traffic. Hardware assistance includes checksumming (tx and rx), TSO, and RSS on the inner traffic in a VXLAN tunnel. Relnotes: Yes Sponsored by: Chelsio Communications	2020-09-18 03:01:47 +00:00
Navdeep Parhar	88c9c3f4dd	cxgbe(4): Update T4/5/6 firmwares to 1.25.0.0. Obtained from: Chelsio Communications MFC after: 3 days Sponsored by: Chelsio Communications	2020-09-17 22:14:11 +00:00
Navdeep Parhar	bb60ba7e22	cxgbe(4): Get the count of FCS errors from the MAC and not MPS for T6 ports. The MPS register on the T6 counts something other than FCS errors despite its name. MFC after: 3 days Sponsored by: Chelsio Communications	2020-09-14 22:15:54 +00:00
Navdeep Parhar	565b8fce23	cxgbe(4): Check for descriptors before writing a TLS or raw work request. This fixes a regression in r362905. Submitted by: jhb@ Sponsored by: Chelsio Communications	2020-08-31 22:44:59 +00:00
Alan Somers	e6f6d0c9bc	crypto(9): add CRYPTO_BUF_VMPAGE. crypto(9) functions can now be used on buffers composed of an array of vm_page_t structures, such as those stored in an unmapped struct bio. It requires the running kernel to support the direct memory map, so not all architectures can use it. Reviewed by: markj, kib, jhb, mjg, mat, bcr (manpages) MFC after: 1 week Sponsored by: Axcient Differential Revision: https://reviews.freebsd.org/D25671	2020-08-26 02:37:42 +00:00
Navdeep Parhar	6a59b9940e	cxgbe(4): Use large clusters for TOE rx queues when TOE+TLS is enabled. Rx is more efficient within the chip when the receive buffer size matches the TLS PDU size. MFC after: 3 days Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D26127	2020-08-23 04:16:20 +00:00
Navdeep Parhar	11a82cd688	cxgbei: destroy the worker threads' CV and mutex in stop_worker_threads. Reported by: bz@ MFC after: 3 days	2020-08-21 00:34:33 +00:00
Mark Johnston	5822a14c43	cxgbe(4): Stop checking for failures from malloc(M_WAITOK). PR: 240545 Submitted by: Andrew Reiter <arr@watson.org> Reviewed by: np MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D25767	2020-07-27 19:05:53 +00:00
Navdeep Parhar	a2e160c5af	cxgbe(4): Some updates to the common code. Obtained from: Chelsio Communications MFC after: 1 week Sponsored by: Chelsio Communications	2020-07-24 23:15:42 +00:00
Navdeep Parhar	800535c2ca	cxgbev(4): Compare at most 16 bytes of the Ethernet header when trying to coalesce tx work requests. Note that Coverity will still treat this as an out-of-bounds access. We do want to compare 16B starting from ethmacdst but cmp_l2hdr was going beyond that by 2B. cmp_l2hdr was introduced in r362905. Reported by: Coverity (CID 1430284) Sponsored by: Chelsio Communications	2020-07-13 19:15:29 +00:00
Navdeep Parhar	3bbb68f0e3	cxgbe(4): Fix a bug (introduced in r362905) where some tx traffic wasn't being reported to BPF.	2020-07-05 05:14:33 +00:00
Navdeep Parhar	d735920d33	cxgbe(4): changes in the Tx path to help increase tx coalescing. - Ask the firmware for the number of frames that can be stuffed in one work request. - Modify mp_ring to increase the likelihood of tx coalescing when there are just one or two threads that are doing most of the tx. Add teeth to the abdication mechanism by pushing the consumer lock into mp_ring. This reduces the likelihood that a consumer will get stuck with all the work even though it is above its budget. - Add support for coalesced tx WR to the VF driver. This, with the changes above, results in a 7x improvement in the tx pps of the VF driver for some common cases. The firmware vets the L2 headers submitted by the VF driver and it's a big win if the checks are performed for a batch of packets and not each one individually. Reviewed by: jhb@ MFC after: 2 weeks Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25454	2020-07-03 04:44:23 +00:00
John Baldwin	94578db218	Reduce contention on per-adapter lock. - Move temporary sglists into the session structure and protect them with a per-session lock instead of a per-adapter lock. - Retire an unused session field, and move a debugging field under INVARIANTS to avoid using the session lock for completion handling when INVARIANTS isn't enabled. - Use counter_u64 for per-adapter statistics. Note that this helps for cases where multiple sessions are used (e.g. multiple IPsec SAs or multiple KTLS connections). It does not help for workloads that use a single session (e.g. a single GELI volume). Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25457	2020-06-26 00:01:31 +00:00
John Baldwin	4a711b8d04	Use zfree() instead of explicit_bzero() and free(). In addition to reducing lines of code, this also ensures that the full allocation is always zeroed avoiding possible bugs with incorrect lengths passed to explicit_bzero(). Suggested by: cem Reviewed by: cem, delphij Approved by: csprng (cem) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25435	2020-06-25 20:17:34 +00:00
Navdeep Parhar	7c228be30b	cxgbe(4): Add a pointer to the adapter softc in vi_info. There were quite a few places where port_info was being accessed only to get to the adapter. Reviewed by: jhb@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25432	2020-06-25 17:04:22 +00:00
Navdeep Parhar	0cadedfc46	cxgbe(4): Add a tx_len16_to_desc helper. No functional change. MFC after: 1 week Sponsored by: Chelsio Communications	2020-06-23 07:33:29 +00:00
John Baldwin	6deb4131b8	Add support for requests with separate AAD to ccr(4). Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25290	2020-06-22 23:41:33 +00:00
Alexander V. Chernikov	b158cfb3fc	Switch cxgbe interface lookup to use fibX_lookup() from older fibX_lookup_nh_ext(). fibX_lookup_nh_ represents pre-epoch generation of fib kpi, providing less guarantees over pointer validness and requiring on-stack data copying. Reviewed by: np Differential Revision: https://reviews.freebsd.org/D24975	2020-06-22 07:35:23 +00:00
Ryan Moeller	cbb9ccf735	Avoid trying to toggle TSO twice. Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in various NIC drivers. Reviewed by: hselasky, np, gallatin, jpaetzel Approved by: mav (mentor) MFC after: 1 week Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D25120	2020-06-15 16:35:27 +00:00
John Baldwin	1a4a7e98eb	Explicitly zero IVs on the stack. Reviewed by: delphij Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:19:52 +00:00
John Baldwin	0065d9a47f	Explicitly zero AES key schedules on the stack. Reviewed by: delphij MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:18:21 +00:00
John Baldwin	20c128da91	Add explicit bzero's of sensitive data in software crypto consumers. Explicitly zero IVs, block buffers, and hashes/digests. Reviewed by: delphij Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:11:05 +00:00
John Baldwin	2adc3c9417	Support separate output buffers in ccr(4). Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545	2020-05-25 22:23:13 +00:00
John Baldwin	9c0e3d3a53	Add support for optional separate output buffers to in-kernel crypto. Some crypto consumers such as GELI and KTLS for file-backed sendfile need to store their output in a separate buffer from the input. Currently these consumers copy the contents of the input buffer into the output buffer and queue an in-place crypto operation on the output buffer. Using a separate output buffer avoids this copy. - Create a new 'struct crypto_buffer' describing a crypto buffer containing a type and type-specific fields. crp_ilen is gone, instead buffers that use a flat kernel buffer have a cb_buf_len field for their length. The length of other buffer types is inferred from the backing store (e.g. uio_resid for a uio). Requests now have two such structures: crp_buf for the input buffer, and crp_obuf for the output buffer. - Consumers now use helper functions (crypto_use_, e.g. crypto_use_mbuf()) to configure the input buffer. If an output buffer is not configured, the request still modifies the input buffer in-place. A consumer uses a second set of helper functions (crypto_use_output_) to configure an output buffer. - Consumers must request support for separate output buffers when creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are only permitted to queue a request with a separate output buffer on sessions with this flag set. Existing drivers already reject sessions with unknown flags, so this permits drivers to be modified to support this extension without requiring all drivers to change. - Several data-related functions now have matching versions that operate on an explicit buffer (e.g. crypto_apply_buf, crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf). - Most of the existing data-related functions operate on the input buffer. However crypto_copyback always writes to the output buffer if a request uses a separate output buffer. 
- For the regions in input/output buffers, the following conventions are followed: - AAD and IV are always present in input only and their fields are offsets into the input buffer. - payload is always present in both buffers. If a request uses a separate output buffer, it must set a new crp_payload_start_output field to the offset of the payload in the output buffer. - digest is in the input buffer for verify operations, and in the output buffer for compute operations. crp_digest_start is relative to the appropriate buffer. - Add a crypto buffer cursor abstraction. This is a more general form of some bits in the cryptosoft driver that tried to always use uio's. However, compared to the original code, this avoids rewalking the uio iovec array for requests with multiple vectors. It also avoids allocating an iovec array for mbufs and populating it by instead walking the mbuf chain directly. - Update the cryptosoft(4) driver to support separate output buffers making use of the cursor abstraction. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545	2020-05-25 22:12:04 +00:00
John Baldwin	3e9470482a	Various cleanups to the software encryption transform interface. - Consistently use 'void *' for key schedules / key contexts instead of a mix of 'caddr_t', 'uint8_t *', and 'void *'. - Add a ctxsize member to enc_xform similar to what auth transforms use and require callers to malloc/zfree the context. The setkey callback now supplies the caller-allocated context pointer and the zerokey callback is removed. Callers now always use zfree() to ensure key contexts are zeroed. - Consistently use C99 initializers for all statically-initialized instances of 'struct enc_xform'. - Change the encrypt and decrypt functions to accept separate in and out buffer pointers. Almost all of the backend crypto functions already supported separate input and output buffers and this makes it simpler to support separate buffers in OCF. - Remove xform_userland.h shim to permit transforms to be compiled in userland. Transforms no longer call malloc/free directly. Reviewed by: cem (earlier version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24855	2020-05-20 21:21:01 +00:00
Navdeep Parhar	b0dede77b1	cxgbe/iw_cxgbe: Add an async callback to notify iw_cxgbe in case of a fatal error. Submitted by: Krishnamraju Eraparaju @ Chelsio MFC after: 2 weeks Sponsored by: Chelsio Communications	2020-05-19 16:28:20 +00:00
Gleb Smirnoff	365e8da44a	Mechanically rename MBUF_EXT_PGS_ASSERT() to M_ASSERTEXTPG() to match classical M_ASSERTPKTHDR. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:27:41 +00:00
Gleb Smirnoff	6edfd179c8	Step 4.1: mechanically rename M_NOMAP to M_EXTPG Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:21:11 +00:00
Gleb Smirnoff	7b6c99d08d	Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf within m_epg namespace. All edits except the 'struct mbuf' declaration and mb_dupcl() were done mechanically with sed: s/->m_ext_pgs.nrdy/->m_epg_nrdy/g s/->m_ext_pgs.hdr_len/->m_epg_hdrlen/g s/->m_ext_pgs.trail_len/->m_epg_trllen/g s/->m_ext_pgs.first_pg_off/->m_epg_1st_off/g s/->m_ext_pgs.last_pg_len/->m_epg_last_len/g s/->m_ext_pgs.flags/->m_epg_flags/g s/->m_ext_pgs.record_type/->m_epg_record_type/g s/->m_ext_pgs.enc_cnt/->m_epg_enc_cnt/g s/->m_ext_pgs.tls/->m_epg_tls/g s/->m_ext_pgs.so/->m_epg_so/g s/->m_ext_pgs.seqno/->m_epg_seqno/g s/->m_ext_pgs.stailq/->m_epg_stailq/g Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:12:56 +00:00
Gleb Smirnoff	6fbcdeb6f1	Step 2.4: Stop using 'struct mbuf_ext_pgs' in drivers. Reviewed by: gallatin, hselasky Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 23:58:20 +00:00
Gleb Smirnoff	c4ee38f8e8	Step 2.3: Rename mbuf_ext_pg_len() to m_epg_pagelen() that uses mbuf argument. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 23:52:35 +00:00
Gleb Smirnoff	49b6b60e22	Step 2.2: o Shrink sglist(9) functions to work with multipage mbufs down from four functions to two. o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf. o Rename to something matching _epg. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 23:46:29 +00:00
Gleb Smirnoff	0c1032665c	Continuation of multi page mbuf redesign from r359919. The following series of patches addresses three things: Now that array of pages is embedded into mbuf, we no longer need separate structure to pass around, so struct mbuf_ext_pgs is an artifact of the first implementation. And struct mbuf_ext_pgs_data is a crutch to accommodate the main idea of r359919 with minimal churn. Also, M_EXT of type EXT_PGS are just a synonym of M_NOMAP. The namespace for the new feature is somewhat inconsistent and sometimes has lengthy prefixes. In these patches we will gradually bring the namespace to "m_epg" prefix for all mbuf fields and most functions. Step 1 of 4: o Anonymize mbuf_ext_pgs_data, embed in m_ext o Embed mbuf_ext_pgs o Start documenting all this entanglement Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 22:39:26 +00:00
John Baldwin	8cce4145fa	Add support for KTLS RX over TOE to T6. This largely reuses the TLS TOE support added in r330884. However, this uses the KTLS framework in upstream OpenSSL rather than requiring Chelsio-specific patches to OpenSSL. As with the existing TLS TOE support, use of RX offload requires setting the tls_rx_ports sysctl. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24453	2020-04-27 23:59:42 +00:00
John Baldwin	f1f9347546	Initial support for kernel offload of TLS receive. - Add a new TCP_RXTLS_ENABLE socket option to set the encryption and authentication algorithms and keys as well as the initial sequence number. - When reading from a socket using KTLS receive, applications must use recvmsg(). Each successful call to recvmsg() will return a single TLS record. A new TCP control message, TLS_GET_RECORD, will contain the TLS record header of the decrypted record. The regular message buffer passed to recvmsg() will receive the decrypted payload. This is similar to the interface used by Linux's KTLS RX except that Linux does not return the full TLS header in the control message. - Add plumbing to the TOE KTLS interface to request either transmit or receive KTLS sessions. - When a socket is using receive KTLS, redirect reads from soreceive_stream() into soreceive_generic(). - Note that this interface is currently only defined for TLS 1.1 and 1.2, though I believe we will be able to reuse the same interface and structures for 1.3.	2020-04-27 23:17:19 +00:00
Navdeep Parhar	55eae197fc	cxgbe/crypto: Fix the key size in a couple of places to catch up with the recent OCF refactor. Sponsored by: Chelsio Communications	2020-04-23 23:54:23 +00:00
Navdeep Parhar	a3372bd833	cxgbe/iw_cxgbe: Create a LinuxKPI pci device for an adapter and use it as the dma_device during RDMA registration. cxgbe's struct device cannot be used as-is because it's a native FreeBSD driver and ibcore is LinuxKPI based. MFC after: 1 week MFC after: r360196	2020-04-22 21:54:21 +00:00
Alexander V. Chernikov	8d6708ba80	Convert TOE routing lookups to the new routing KPI. Reviewed by: np Differential Revision: https://reviews.freebsd.org/D24388	2020-04-22 07:53:43 +00:00
John Baldwin	29fe41ddd7	Retire the CRYPTO_F_IV_GENERATE flag. The sole in-tree user of this flag has been retired, so remove this complexity from all drivers. While here, add a helper routine drivers can use to read the current request's IV into a local buffer. Use this routine to replace duplicated code in nearly all drivers. Reviewed by: cem Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24450	2020-04-20 22:24:49 +00:00
John Baldwin	708652acc4	Set inp_flowid's for TOE connections. KTLS uses the flowid to distribute software encryption tasks among its pool of worker threads. Without this change, all software KTLS requests for TOE sockets ended up on the first worker thread. Note that the flowid for TOE sockets created via connect() is not a hash of the 4-tuple, but is instead the id of the TOE pcb (tid). The flowid of TOE sockets created from TOE listen sockets do use the 4-tuple RSS hash as the flowid since the firmware provides the hash in the message containing the original SYN. Reviewed by: np (earlier version) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24348	2020-04-15 19:28:51 +00:00
John Baldwin	f3b6d8ad2e	Clear CPL_GET_TCB_RPL handler on module unload. This fixes a panic when unloading and reloading t4_tom.ko since the old pointer is still stored when t4_tom_load tries to set it. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24358	2020-04-15 19:23:53 +00:00
Navdeep Parhar	ddde90ac81	cxgbe/iw_cxgbe: Do not start the EP timer if soaccept fails. This fixes a panic that would occur when the timer tried to close a stale socket. Submitted by: Krishnamraju Eraparaju @ Chelsio MFC after: 1 week Sponsored by: Chelsio Communications	2020-04-15 03:40:33 +00:00
Andrew Gallatin	23feb56348	KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself. While the original implementation of unmapped mbufs was a large step forward in terms of reducing cache misses by enabling mbufs to carry more than a single page for sendfile, they are rather cache unfriendly when accessing the ext_pgs metadata and data. This is because the ext_pgs part of the mbuf is allocated separately, and almost guaranteed to be cold in cache. This change takes advantage of the fact that unmapped mbufs are never used at the same time as pkthdr mbufs. Given this fact, we can overlap the ext_pgs metadata with the mbuf pkthdr, and carry the ext_pgs meta directly in the mbuf itself. Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array of pages) directly after the existing m_ext. In order to be able to carry 5 pages (which is the minimum required for a 16K TLS record which is not perfectly aligned) on LP64, I've had to steal ext_arg2. The only user of this in the xmit path is sendfile, and I've adjusted it to use arg1 when using unmapped mbufs. This change is almost entirely mechanical, except that we change mb_alloc_ext_pgs() to no longer allow allocating pkthdrs, the change to avoid ext_arg2 as mentioned above, and the removal of the ext_pgs zone. This change saves roughly 2% "raw" CPU (~59% -> 57%), or over 3% "scaled" CPU on a Netflix 100% software kTLS workload at 90+ Gb/s on Broadwell Xeons. In a follow-on commit, I plan to remove some hacks to avoid accessing ext_pgs fields of mbufs, since they will now be in cache. Many thanks to glebius for helping to make this better in the Netflix tree. Reviewed by: hselasky, jhb, rrs, glebius (early version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24213	2020-04-14 14:46:06 +00:00
Navdeep Parhar	843b264a85	cxgbe(4): Make sure 'flags' is at the same offset in structs toepcb and synq_entry. TAILQ_ENTRY isn't always the same size as two pointers. Reported by: rmacklem@ MFC after: 3 days Sponsored by: Chelsio Communications	2020-04-13 20:12:47 +00:00
John Baldwin	94fad5ffc6	Use both crypto engines on a T6. A T6 adapter contains two crypto engines on separate channels. This commit distributes sessions between the two engines. Previously, only the first engine was used. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24347	2020-04-10 22:27:45 +00:00


1194 Commits