freebsd-nq/sys/dev/cxgbe/t4_l2t.h
John Baldwin bddf73433e NIC KTLS for Chelsio T6 adapters.
This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters.
Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE
connections.

NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment
the encrypted TLS frames output by the crypto engine.  Instead, the
TOE is placed into a special setup to permit "dummy" connections to be
associated with regular sockets using KTLS.  This permits using the
TOE to segment the encrypted TLS records.  However, this approach does
have some limitations:

1) Regular TOE sockets cannot be used when the TOE is in this special
   mode.  One can use either TOE and TOE-based KTLS or NIC KTLS, but
   not both at the same time.

2) In NIC KTLS mode, the TOE is only able to accept a per-connection
   timestamp offset that varies in the upper 4 bits.  Put another way,
   only connections whose timestamp offset has the 28 lower bits
   cleared can use NIC KTLS and generate correct timestamps.  The
   driver will refuse to enable NIC KTLS on connections with a
   timestamp offset with any of the lower 28 bits set.  To use NIC
   KTLS, users can either disable TCP timestamps by setting the
   net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the
   tcp_new_ts_offset() function to clear the lower 28 bits of the
   generated offset.

3) Because the TCP segmentation relies on fields mirrored in a TCB in
   the TOE, not all fields in a TCP packet can be sent in the TCP
   segments generated from a TLS record.  Specifically, for packets
   containing TCP options other than timestamps, the driver will
   inject an "empty" TCP packet holding the requested options (e.g. a
   SACK scoreboard) along with the segments from the TLS record.
   These empty TCP packets are counted by the
   dev.cc.N.txq.M.kern_tls_options sysctls.

Unlike TOE TLS which is able to buffer encrypted TLS records in
on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS
records for retransmit requests as well as non-retransmit requests
that do not include the start of a TLS record but do include the
trailer.  The T6 NIC KTLS code tries to optimize some of the cases for
requests to transmit partial TLS records.  In particular it attempts
to minimize sending "waste" bytes that have to be given as input to
the crypto engine but are not needed on the wire to satisfy mbufs sent
from the TCP stack down to the driver.

TCP packets for TLS requests are broken down into the following
classes (with associated counters):

- Mbufs that send an entire TLS record in full do not have any waste
  bytes (dev.cc.N.txq.M.kern_tls_full).

- Mbufs that send a short TLS record that ends before the end of the
  trailer (dev.cc.N.txq.M.kern_tls_short).  For sockets using AES-CBC,
  the encryption must always start at the beginning, so if the mbuf
  starts at an offset into the TLS record, the offset bytes will be
  "waste" bytes.  For sockets using AES-GCM, the encryption can start
  at the 16 byte block before the starting offset capping the waste at
  15 bytes.

- Mbufs that send a partial TLS record that has a non-zero starting
  offset but ends at the end of the trailer
  (dev.cc.N.txq.M.kern_tls_partial).  In order to compute the
  authentication hash stored in the trailer, the entire TLS record
  must be sent as input to the crypto engine, so the bytes before the
  offset are always "waste" bytes.

In addition, other per-txq sysctls are provided:

- dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq
  using AES-CBC.

- dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq
  using AES-GCM.

- dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to
  compensate for the TOE engine not being able to set FIN on the last
  segment of a TLS record if the TLS record mbuf had FIN set.

- dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this
  txq including full, short, and partial records.

- dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header
  and payload) sent for TLS record requests.

- dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS
  record requests.

To enable NIC KTLS with T6, set the following tunables prior to
loading the cxgbe(4) driver:

hw.cxgbe.config_file=kern_tls
hw.cxgbe.kern_tls=1

Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D21962
2019-11-21 19:30:31 +00:00

116 lines
4.4 KiB
C

/*-
* SPDX-License-Identifier: BSD-2-Clause-FreeBSD
*
* Copyright (c) 2011 Chelsio Communications, Inc.
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
* OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
* HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
* SUCH DAMAGE.
*
* $FreeBSD$
*
*/
#ifndef __T4_L2T_H
#define __T4_L2T_H
/* identifies sync vs async L2T_WRITE_REQs */
#define S_SYNC_WR 12
#define V_SYNC_WR(x) ((x) << S_SYNC_WR)
#define F_SYNC_WR V_SYNC_WR(1)
enum { L2T_SIZE = 4096 }; /* # of L2T entries */
enum {
L2T_STATE_VALID, /* entry is up to date */
L2T_STATE_STALE, /* entry may be used but needs revalidation */
L2T_STATE_RESOLVING, /* entry needs address resolution */
L2T_STATE_FAILED, /* failed to resolve */
L2T_STATE_SYNC_WRITE, /* synchronous write of entry underway */
/* when state is one of the below the entry is not hashed */
L2T_STATE_SWITCHING, /* entry is being used by a switching filter */
L2T_STATE_TLS, /* entry is being used by TLS sessions */
L2T_STATE_UNUSED /* entry not in use */
};
/*
* Each L2T entry plays multiple roles. First of all, it keeps state for the
* corresponding entry of the HW L2 table and maintains a queue of offload
* packets awaiting address resolution. Second, it is a node of a hash table
* chain, where the nodes of the chain are linked together through their next
* pointer. Finally, each node is a bucket of a hash table, pointing to the
* first element in its chain through its first pointer.
*/
struct l2t_entry {
uint16_t state; /* entry state */
uint16_t idx; /* entry index */
uint32_t addr[4]; /* next hop IP or IPv6 address */
uint32_t iqid; /* iqid for reply to write_l2e */
struct sge_wrq *wrq; /* queue to use for write_l2e */
struct ifnet *ifp; /* outgoing interface */
uint16_t smt_idx; /* SMT index */
uint16_t vlan; /* VLAN TCI (id: 0-11, prio: 13-15) */
struct l2t_entry *first; /* start of hash chain */
struct l2t_entry *next; /* next l2t_entry on chain */
STAILQ_HEAD(, wrqe) wr_list; /* list of WRs awaiting resolution */
struct mtx lock;
volatile int refcnt; /* entry reference count */
uint16_t hash; /* hash bucket the entry is on */
uint8_t ipv6; /* entry is for an IPv6 address */
uint8_t lport; /* associated offload logical port */
uint8_t dmac[ETHER_ADDR_LEN]; /* next hop's MAC address */
};
struct l2t_data {
struct rwlock lock;
u_int l2t_size;
volatile int nfree; /* number of free entries */
struct l2t_entry *rover;/* starting point for next allocation */
struct l2t_entry l2tab[];
};
int t4_init_l2t(struct adapter *, int);
int t4_free_l2t(struct l2t_data *);
struct l2t_entry *t4_alloc_l2e(struct l2t_data *);
struct l2t_entry *t4_l2t_alloc_switching(struct adapter *, uint16_t, uint8_t,
uint8_t *);
struct l2t_entry *t4_l2t_alloc_tls(struct adapter *, struct sge_txq *,
void *, int *, uint16_t, uint8_t, uint8_t *);
int t4_l2t_set_switching(struct adapter *, struct l2t_entry *, uint16_t,
uint8_t, uint8_t *);
int t4_write_l2e(struct l2t_entry *, int);
int do_l2t_write_rpl(struct sge_iq *, const struct rss_header *, struct mbuf *);
static inline void
t4_l2t_release(struct l2t_entry *e)
{
struct l2t_data *d = __containerof(e, struct l2t_data, l2tab[e->idx]);
if (atomic_fetchadd_int(&e->refcnt, -1) == 1)
atomic_add_int(&d->nfree, 1);
}
int sysctl_l2t(SYSCTL_HANDLER_ARGS);
#endif /* __T4_L2T_H */