freebsd-dev

Author	SHA1	Message	Date
John Baldwin	94578db218	Reduce contention on per-adapter lock. - Move temporary sglists into the session structure and protect them with a per-session lock instead of a per-adapter lock. - Retire an unused session field, and move a debugging field under INVARIANTS to avoid using the session lock for completion handling when INVARIANTS isn't enabled. - Use counter_u64 for per-adapter statistics. Note that this helps for cases where multiple sessions are used (e.g. multiple IPsec SAs or multiple KTLS connections). It does not help for workloads that use a single session (e.g. a single GELI volume). Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25457	2020-06-26 00:01:31 +00:00
John Baldwin	4a711b8d04	Use zfree() instead of explicit_bzero() and free(). In addition to reducing lines of code, this also ensures that the full allocation is always zeroed avoiding possible bugs with incorrect lengths passed to explicit_bzero(). Suggested by: cem Reviewed by: cem, delphij Approved by: csprng (cem) Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25435	2020-06-25 20:17:34 +00:00
Navdeep Parhar	7c228be30b	cxgbe(4): Add a pointer to the adapter softc in vi_info. There were quite a few places where port_info was being accessed only to get to the adapter. Reviewed by: jhb@ MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25432	2020-06-25 17:04:22 +00:00
Navdeep Parhar	0cadedfc46	cxgbe(4): Add a tx_len16_to_desc helper. No functional change. MFC after: 1 week Sponsored by: Chelsio Communications	2020-06-23 07:33:29 +00:00
John Baldwin	6deb4131b8	Add support for requests with separate AAD to ccr(4). Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D25290	2020-06-22 23:41:33 +00:00
John Baldwin	1a4a7e98eb	Explicitly zero IVs on the stack. Reviewed by: delphij Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:19:52 +00:00
John Baldwin	0065d9a47f	Explicitly zero AES key schedules on the stack. Reviewed by: delphij MFC after: 1 week Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:18:21 +00:00
John Baldwin	20c128da91	Add explicit bzero's of sensitive data in software crypto consumers. Explicitly zero IVs, block buffers, and hashes/digests. Reviewed by: delphij Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D25057	2020-06-03 22:11:05 +00:00
John Baldwin	2adc3c9417	Support separate output buffers in ccr(4). Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545	2020-05-25 22:23:13 +00:00
John Baldwin	9c0e3d3a53	Add support for optional separate output buffers to in-kernel crypto. Some crypto consumers such as GELI and KTLS for file-backed sendfile need to store their output in a separate buffer from the input. Currently these consumers copy the contents of the input buffer into the output buffer and queue an in-place crypto operation on the output buffer. Using a separate output buffer avoids this copy. - Create a new 'struct crypto_buffer' describing a crypto buffer containing a type and type-specific fields. crp_ilen is gone, instead buffers that use a flat kernel buffer have a cb_buf_len field for their length. The length of other buffer types is inferred from the backing store (e.g. uio_resid for a uio). Requests now have two such structures: crp_buf for the input buffer, and crp_obuf for the output buffer. - Consumers now use helper functions (crypto_use_, e.g. crypto_use_mbuf()) to configure the input buffer. If an output buffer is not configured, the request still modifies the input buffer in-place. A consumer uses a second set of helper functions (crypto_use_output_) to configure an output buffer. - Consumers must request support for separate output buffers when creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are only permitted to queue a request with a separate output buffer on sessions with this flag set. Existing drivers already reject sessions with unknown flags, so this permits drivers to be modified to support this extension without requiring all drivers to change. - Several data-related functions now have matching versions that operate on an explicit buffer (e.g. crypto_apply_buf, crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf). - Most of the existing data-related functions operate on the input buffer. However crypto_copyback always writes to the output buffer if a request uses a separate output buffer. - For the regions in input/output buffers, the following conventions are followed: - AAD and IV are always present in input only and their fields are offsets into the input buffer. - payload is always present in both buffers. If a request uses a separate output buffer, it must set a new crp_payload_start_output field to the offset of the payload in the output buffer. - digest is in the input buffer for verify operations, and in the output buffer for compute operations. crp_digest_start is relative to the appropriate buffer. - Add a crypto buffer cursor abstraction. This is a more general form of some bits in the cryptosoft driver that tried to always use uio's. However, compared to the original code, this avoids rewalking the uio iovec array for requests with multiple vectors. It also avoids allocate an iovec array for mbufs and populating it by instead walking the mbuf chain directly. - Update the cryptosoft(4) driver to support separate output buffers making use of the cursor abstraction. Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24545	2020-05-25 22:12:04 +00:00
John Baldwin	3e9470482a	Various cleanups to the software encryption transform interface. - Consistently use 'void ' for key schedules / key contexts instead of a mix of 'caddr_t', 'uint8_t ', and 'void *'. - Add a ctxsize member to enc_xform similar to what auth transforms use and require callers to malloc/zfree the context. The setkey callback now supplies the caller-allocated context pointer and the zerokey callback is removed. Callers now always use zfree() to ensure key contexts are zeroed. - Consistently use C99 initializers for all statically-initialized instances of 'struct enc_xform'. - Change the encrypt and decrypt functions to accept separate in and out buffer pointers. Almost all of the backend crypto functions already supported separate input and output buffers and this makes it simpler to support separate buffers in OCF. - Remove xform_userland.h shim to permit transforms to be compiled in userland. Transforms no longer call malloc/free directly. Reviewed by: cem (earlier version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24855	2020-05-20 21:21:01 +00:00
Gleb Smirnoff	365e8da44a	Mechanically rename MBUF_EXT_PGS_ASSERT() to M_ASSERTEXTPG() to match classical M_ASSERTPKTHDR. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:27:41 +00:00
Gleb Smirnoff	6edfd179c8	Step 4.1: mechanically rename M_NOMAP to M_EXTPG Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:21:11 +00:00
Gleb Smirnoff	7b6c99d08d	Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf within m_epg namespace. All edits except the 'struct mbuf' declaration and mb_dupcl() were done mechanically with sed: s/->m_ext_pgs.nrdy/->m_epg_nrdy/g s/->m_ext_pgs.hdr_len/->m_epg_hdrlen/g s/->m_ext_pgs.trail_len/->m_epg_trllen/g s/->m_ext_pgs.first_pg_off/->m_epg_1st_off/g s/->m_ext_pgs.last_pg_len/->m_epg_last_len/g s/->m_ext_pgs.flags/->m_epg_flags/g s/->m_ext_pgs.record_type/->m_epg_record_type/g s/->m_ext_pgs.enc_cnt/->m_epg_enc_cnt/g s/->m_ext_pgs.tls/->m_epg_tls/g s/->m_ext_pgs.so/->m_epg_so/g s/->m_ext_pgs.seqno/->m_epg_seqno/g s/->m_ext_pgs.stailq/->m_epg_stailq/g Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-03 00:12:56 +00:00
Gleb Smirnoff	6fbcdeb6f1	Step 2.4: Stop using 'struct mbuf_ext_pgs' in drivers. Reviewed by: gallatin, hselasky Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 23:58:20 +00:00
Gleb Smirnoff	49b6b60e22	Step 2.2: o Shrink sglist(9) functions to work with multipage mbufs down from four functions to two. o Don't use 'struct mbuf_ext_pgs *' as argument, use struct mbuf. o Rename to something matching _epg. Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 23:46:29 +00:00
Gleb Smirnoff	0c1032665c	Continuation of multi page mbuf redesign from r359919. The following series of patches addresses three things: Now that array of pages is embedded into mbuf, we no longer need separate structure to pass around, so struct mbuf_ext_pgs is an artifact of the first implementation. And struct mbuf_ext_pgs_data is a crutch to accomodate the main idea r359919 with minimal churn. Also, M_EXT of type EXT_PGS are just a synonym of M_NOMAP. The namespace for the newfeature is somewhat inconsistent and sometimes has a lengthy prefixes. In these patches we will gradually bring the namespace to "m_epg" prefix for all mbuf fields and most functions. Step 1 of 4: o Anonymize mbuf_ext_pgs_data, embed in m_ext o Embed mbuf_ext_pgs o Start documenting all this entanglement Reviewed by: gallatin Differential Revision: https://reviews.freebsd.org/D24598	2020-05-02 22:39:26 +00:00
Navdeep Parhar	55eae197fc	cxgbe/crypto: Fix the key size in a couple of places to catch up with the recent OCF refactor. Sponsored by: Chelsio Communications	2020-04-23 23:54:23 +00:00
John Baldwin	29fe41ddd7	Retire the CRYPTO_F_IV_GENERATE flag. The sole in-tree user of this flag has been retired, so remove this complexity from all drivers. While here, add a helper routine drivers can use to read the current request's IV into a local buffer. Use this routine to replace duplicated code in nearly all drivers. Reviewed by: cem Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24450	2020-04-20 22:24:49 +00:00
Andrew Gallatin	23feb56348	KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself. While the original implementation of unmapped mbufs was a large step forward in terms of reducing cache misses by enabling mbufs to carry more than a single page for sendfile, they are rather cache unfriendly when accessing the ext_pgs metadata and data. This is because the ext_pgs part of the mbuf is allocated separately, and almost guaranteed to be cold in cache. This change takes advantage of the fact that unmapped mbufs are never used at the same time as pkthdr mbufs. Given this fact, we can overlap the ext_pgs metadata with the mbuf pkthdr, and carry the ext_pgs meta directly in the mbuf itself. Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array of pages) directly after the existing m_ext. In order to be able to carry 5 pages (which is the minimum required for a 16K TLS record which is not perfectly aligned) on LP64, I've had to steal ext_arg2. The only user of this in the xmit path is sendfile, and I've adjusted it to use arg1 when using unmapped mbufs. This change is almost entirely mechanical, except that we change mb_alloc_ext_pgs() to no longer allow allocating pkthdrs, the change to avoid ext_arg2 as mentioned above, and the removal of the ext_pgs zone, This change saves roughly 2% "raw" CPU (~59% -> 57%), or over 3% "scaled" CPU on a Netflix 100% software kTLS workload at 90+ Gb/s on Broadwell Xeons. In a follow-on commit, I plan to remove some hacks to avoid access ext_pgs fields of mbufs, since they will now be in cache. Many thanks to glebius for helping to make this better in the Netflix tree. Reviewed by: hselasky, jhb, rrs, glebius (early version) Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D24213	2020-04-14 14:46:06 +00:00
John Baldwin	94fad5ffc6	Use both crypto engines on a T6. A T6 adapter contains two crypto engines on separate channels. This commit distributes sessions between the two engines. Previously, only the first engine was used. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D24347	2020-04-10 22:27:45 +00:00
John Baldwin	c034143269	Refactor driver and consumer interfaces for OCF (in-kernel crypto). - The linked list of cryptoini structures used in session initialization is replaced with a new flat structure: struct crypto_session_params. This session includes a new mode to define how the other fields should be interpreted. Available modes include: - COMPRESS (for compression/decompression) - CIPHER (for simply encryption/decryption) - DIGEST (computing and verifying digests) - AEAD (combined auth and encryption such as AES-GCM and AES-CCM) - ETA (combined auth and encryption using encrypt-then-authenticate) Additional modes could be added in the future (e.g. if we wanted to support TLS MtE for AES-CBC in the kernel we could add a new mode for that. TLS modes might also affect how AAD is interpreted, etc.) The flat structure also includes the key lengths and algorithms as before. However, code doesn't have to walk the linked list and switch on the algorithm to determine which key is the auth key vs encryption key. The 'csp_auth_' fields are always used for auth keys and settings and 'csp_cipher_' for cipher. (Compression algorithms are stored in csp_cipher_alg.) - Drivers no longer register a list of supported algorithms. This doesn't quite work when you factor in modes (e.g. a driver might support both AES-CBC and SHA2-256-HMAC separately but not combined for ETA). Instead, a new 'crypto_probesession' method has been added to the kobj interface for symmteric crypto drivers. This method returns a negative value on success (similar to how device_probe works) and the crypto framework uses this value to pick the "best" driver. There are three constants for hardware (e.g. ccr), accelerated software (e.g. aesni), and plain software (cryptosoft) that give preference in that order. One effect of this is that if you request only hardware when creating a new session, you will no longer get a session using accelerated software. Another effect is that the default setting to disallow software crypto via /dev/crypto now disables accelerated software. Once a driver is chosen, 'crypto_newsession' is invoked as before. - Crypto operations are now solely described by the flat 'cryptop' structure. The linked list of descriptors has been removed. A separate enum has been added to describe the type of data buffer in use instead of using CRYPTO_F_* flags to make it easier to add more types in the future if needed (e.g. wired userspace buffers for zero-copy). It will also make it easier to re-introduce separate input and output buffers (in-kernel TLS would benefit from this). Try to make the flags related to IV handling less insane: - CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv' member of the operation structure. If this flag is not set, the IV is stored in the data buffer at the 'crp_iv_start' offset. - CRYPTO_F_IV_GENERATE means that a random IV should be generated and stored into the data buffer. This cannot be used with CRYPTO_F_IV_SEPARATE. If a consumer wants to deal with explicit vs implicit IVs, etc. it can always generate the IV however it needs and store partial IVs in the buffer and the full IV/nonce in crp_iv and set CRYPTO_F_IV_SEPARATE. The layout of the buffer is now described via fields in cryptop. crp_aad_start and crp_aad_length define the boundaries of any AAD. Previously with GCM and CCM you defined an auth crd with this range, but for ETA your auth crd had to span both the AAD and plaintext (and they had to be adjacent). crp_payload_start and crp_payload_length define the boundaries of the plaintext/ciphertext. Modes that only do a single operation (COMPRESS, CIPHER, DIGEST) should only use this region and leave the AAD region empty. If a digest is present (or should be generated), it's starting location is marked by crp_digest_start. Instead of using the CRD_F_ENCRYPT flag to determine the direction of the operation, cryptop now includes an 'op' field defining the operation to perform. For digests I've added a new VERIFY digest mode which assumes a digest is present in the input and fails the request with EBADMSG if it doesn't match the internally-computed digest. GCM and CCM already assumed this, and the new AEAD mode requires this for decryption. The new ETA mode now also requires this for decryption, so IPsec and GELI no longer do their own authentication verification. Simple DIGEST operations can also do this, though there are no in-tree consumers. To eventually support some refcounting to close races, the session cookie is now passed to crypto_getop() and clients should no longer set crp_sesssion directly. - Assymteric crypto operation structures should be allocated via crypto_getkreq() and freed via crypto_freekreq(). This permits the crypto layer to track open asym requests and close races with a driver trying to unregister while asym requests are in flight. - crypto_copyback, crypto_copydata, crypto_apply, and crypto_contiguous_subsegment now accept the 'crp' object as the first parameter instead of individual members. This makes it easier to deal with different buffer types in the future as well as separate input and output buffers. It's also simpler for driver writers to use. - bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer. This understands the various types of buffers so that drivers that use DMA do not have to be aware of different buffer types. - Helper routines now exist to build an auth context for HMAC IPAD and OPAD. This reduces some duplicated work among drivers. - Key buffers are now treated as const throughout the framework and in device drivers. However, session key buffers provided when a session is created are expected to remain alive for the duration of the session. - GCM and CCM sessions now only specify a cipher algorithm and a cipher key. The redundant auth information is not needed or used. - For cryptosoft, split up the code a bit such that the 'process' callback now invokes a function pointer in the session. This function pointer is set based on the mode (in effect) though it simplifies a few edge cases that would otherwise be in the switch in 'process'. It does split up GCM vs CCM which I think is more readable even if there is some duplication. - I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC as an auth algorithm and updated cryptocheck to work with it. - Combined cipher and auth sessions via /dev/crypto now always use ETA mode. The COP_F_CIPHER_FIRST flag is now a no-op that is ignored. This was actually documented as being true in crypto(4) before, but the code had not implemented this before I added the CIPHER_FIRST flag. - I have not yet updated /dev/crypto to be aware of explicit modes for sessions. I will probably do that at some point in the future as well as teach it about IV/nonce and tag lengths for AEAD so we can support all of the NIST KAT tests for GCM and CCM. - I've split up the exising crypto.9 manpage into several pages of which many are written from scratch. - I have converted all drivers and consumers in the tree and verified that they compile, but I have not tested all of them. I have tested the following drivers: - cryptosoft - aesni (AES only) - blake2 - ccr and the following consumers: - cryptodev - IPsec - ktls_ocf - GELI (lightly) I have not tested the following: - ccp - aesni with sha - hifn - kgssapi_krb5 - ubsec - padlock - safe - armv8_crypto (aarch64) - glxsb (i386) - sec (ppc) - cesa (armv7) - cryptocteon (mips64) - nlmsec (mips64) Discussed with: cem Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D23677	2020-03-27 18:25:23 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
John Baldwin	ca3b3c573e	Remove the per-TXQ tls_wrs stat. It duplicated the kern_tls_records stat and was not conditional on NIC TLS being enabled. Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D23670	2020-02-13 22:55:45 +00:00
Navdeep Parhar	c0236bd93d	cxgbe(4): Use the _XT variant of the CPL used to transmit NIC traffic. CPL_TX_PKT_XT disables the internal parser on the chip and instead relies on the driver to provide the exact length of the L2 and L3 headers. This allows hw checksumming and TSO to be used with L2 and L3 encapsulations that the chip doesn't understand directly. Note that netmap tx still uses the old CPL as it never uses the hw to generate the checksum on tx. Reviewed by: jhb@ MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D22788	2019-12-13 20:38:58 +00:00
John Baldwin	bddf73433e	NIC KTLS for Chelsio T6 adapters. This adds support for ifnet (NIC) KTLS using Chelsio T6 adapters. Unlike the TOE-based KTLS in r353328, NIC TLS works with non-TOE connections. NIC KTLS on T6 is not able to use the normal TSO (LSO) path to segment the encrypted TLS frames output by the crypto engine. Instead, the TOE is placed into a special setup to permit "dummy" connections to be associated with regular sockets using KTLS. This permits using the TOE to segment the encrypted TLS records. However, this approach does have some limitations: 1) Regular TOE sockets cannot be used when the TOE is in this special mode. One can use either TOE and TOE-based KTLS or NIC KTLS, but not both at the same time. 2) In NIC KTLS mode, the TOE is only able to accept a per-connection timestamp offset that varies in the upper 4 bits. Put another way, only connections whose timestamp offset has the 28 lower bits cleared can use NIC KTLS and generate correct timestamps. The driver will refuse to enable NIC KTLS on connections with a timestamp offset with any of the lower 28 bits set. To use NIC KTLS, users can either disable TCP timestamps by setting the net.inet.tcp.rfc1323 sysctl to 0, or apply a local patch to the tcp_new_ts_offset() function to clear the lower 28 bits of the generated offset. 3) Because the TCP segmentation relies on fields mirrored in a TCB in the TOE, not all fields in a TCP packet can be sent in the TCP segments generated from a TLS record. Specifically, for packets containing TCP options other than timestamps, the driver will inject an "empty" TCP packet holding the requested options (e.g. a SACK scoreboard) along with the segments from the TLS record. These empty TCP packets are counted by the dev.cc.N.txq.M.kern_tls_options sysctls. Unlike TOE TLS which is able to buffer encrypted TLS records in on-card memory to handle retransmits, NIC KTLS must re-encrypt TLS records for retransmit requests as well as non-retransmit requests that do not include the start of a TLS record but do include the trailer. The T6 NIC KTLS code tries to optimize some of the cases for requests to transmit partial TLS records. In particular it attempts to minimize sending "waste" bytes that have to be given as input to the crypto engine but are not needed on the wire to satisfy mbufs sent from the TCP stack down to the driver. TCP packets for TLS requests are broken down into the following classes (with associated counters): - Mbufs that send an entire TLS record in full do not have any waste bytes (dev.cc.N.txq.M.kern_tls_full). - Mbufs that send a short TLS record that ends before the end of the trailer (dev.cc.N.txq.M.kern_tls_short). For sockets using AES-CBC, the encryption must always start at the beginning, so if the mbuf starts at an offset into the TLS record, the offset bytes will be "waste" bytes. For sockets using AES-GCM, the encryption can start at the 16 byte block before the starting offset capping the waste at 15 bytes. - Mbufs that send a partial TLS record that has a non-zero starting offset but ends at the end of the trailer (dev.cc.N.txq.M.kern_tls_partial). In order to compute the authentication hash stored in the trailer, the entire TLS record must be sent as input to the crypto engine, so the bytes before the offset are always "waste" bytes. In addition, other per-txq sysctls are provided: - dev.cc.N.txq.M.kern_tls_cbc: Count of sockets sent via this txq using AES-CBC. - dev.cc.N.txq.M.kern_tls_gcm: Count of sockets sent via this txq using AES-GCM. - dev.cc.N.txq.M.kern_tls_fin: Count of empty FIN-only packets sent to compensate for the TOE engine not being able to set FIN on the last segment of a TLS record if the TLS record mbuf had FIN set. - dev.cc.N.txq.M.kern_tls_records: Count of TLS records sent via this txq including full, short, and partial records. - dev.cc.N.txq.M.kern_tls_octets: Count of non-waste bytes (TLS header and payload) sent for TLS record requests. - dev.cc.N.txq.M.kern_tls_waste: Count of waste bytes sent for TLS record requests. To enable NIC KTLS with T6, set the following tunables prior to loading the cxgbe(4) driver: hw.cxgbe.config_file=kern_tls hw.cxgbe.kern_tls=1 Reviewed by: np Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D21962	2019-11-21 19:30:31 +00:00
John Baldwin	a1b2b6e184	Create a file to hold shared routines for dealing with T6 key contexts. ccr(4) and TLS support in cxgbe(4) construct key contexts used by the crypto engine in the T6. This consolidates some duplicated code for helper functions used to build key contexts. Reviewed by: np MFC after: 1 month Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D22156	2019-11-13 00:53:45 +00:00
John Baldwin	c59050aab5	Set the FID field in lookaside crypto requests to the rx queue ID. The PCI block in the adapter requires this field to be set to a valid queue ID. It is not clear why it did not fail on all machines, but the effect was that crypto operations reading input data via DMA failed with an internal PCI read error on machines with 128G or more of RAM. Reported by: gallatin Reviewed by: np MFC after: 3 days Sponsored by: Chelsio Communications	2019-10-08 20:22:05 +00:00
John Baldwin	6b0451d603	Add support for AES-CCM to ccr(4). This is fairly similar to the AES-GCM support in ccr(4) in that it will fall back to software for certain cases (requests with only AAD and requests that are too large). Tested by: cryptocheck, cryptotest.py MFC after: 1 month Sponsored by: Chelsio Communications	2019-04-24 23:31:46 +00:00
John Baldwin	a2ad169e61	Fix requests for "plain" SHA digests of an empty buffer. To workaround limitations in the crypto engine, empty buffers are handled by manually constructing the final length block as the payload passed to the crypto engine and disabling the normal "final" handling. For HMAC this length block should hold the length of a single block since the hash is actually the hash of the IPAD digest, but for "plain" SHA the length should be zero instead. Reported by: NIST SHA1 test failure MFC after: 2 weeks Sponsored by: Chelsio Communications	2019-04-24 23:18:10 +00:00
John Baldwin	475d54fac3	Reject new sessions if the necessary queues aren't initialized. ccr reuses the control queue and first rx queue from the first port on each adapter. The driver cannot send requests until those queues are initialized. Refuse to create sessions for now if the queues aren't ready. This is a workaround until cxgbe allocates one or more dedicated queues for ccr. PR: 233851 MFC after: 1 week Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D18478	2019-01-15 18:53:45 +00:00
John Baldwin	d09389fd05	Consolidate on a single set of constants for SCMD fields. Both ccr(4) and the TOE TLS code had separate sets of constants for fields in SCMD messages. Sponsored by: Chelsio Communications	2018-11-16 19:08:52 +00:00
John Baldwin	567a3784c2	Add support for "plain" (non-HMAC) SHA digests. MFC after: 2 months Sponsored by: Chelsio Communications	2018-10-29 22:24:31 +00:00
John Baldwin	1146377b4b	Support the SHA224 HMAC algorithm in ccr(4). MFC after: 2 months Sponsored by: Chelsio Communications	2018-10-23 18:31:39 +00:00
Conrad Meyer	1b0909d51a	OpenCrypto: Convert sessions to opaque handles instead of integers Track session objects in the framework, and pass handles between the framework (OCF), consumers, and drivers. Avoid redundancy and complexity in individual drivers by allocating session memory in the framework and providing it to drivers in ::newsession(). Session handles are no longer integers with information encoded in various high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the appropriate crypto_ses2foo() function on the opaque session handle. Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to the opaque handle interface. Discard existing session tracking as much as possible (quick pass). There may be additional code ripe for deletion. Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style interface. The conversion is largely mechnical. The change is documented in crypto.9. Inspired by https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html . No objection from: ae (ipsec portion) Reported by: jhb	2018-07-18 00:56:25 +00:00
John Baldwin	db631975fe	Don't overflow the ipad[] array when clearing the remainder. After the auth key is copied into the ipad[] array, any remaining bytes are cleared to zero (in case the key is shorter than one block size). The full block size was used as the length of the zero rather than the size of the remaining ipad[]. In practice this overflow was harmless as it could only clear bytes in the following opad[] array which is initialized with a copy of ipad[] in the next statement. Sponsored by: Chelsio Communications	2018-02-26 22:17:27 +00:00
John Baldwin	52f8c52677	Move ccr_aes_getdeckey() from ccr(4) to the cxgbe(4) driver. This routine will also be used by the TOE module to manage TLS keys. Sponsored by: Chelsio Communications	2018-02-26 22:12:31 +00:00
John Baldwin	c0154062c7	Store IV in output buffer in GCM software fallback when requested. Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM software fallback case for encryption requests. Submitted by: Harsh Jain @ Chelsio Sponsored by: Chelsio Communications	2018-01-24 20:16:48 +00:00
John Baldwin	2bc40b6ca9	Don't read or generate an IV until all error checking is complete. In particular, this avoids edge cases where a generated IV might be written into the output buffer even though the request is failed with an error. Sponsored by: Chelsio Communications	2018-01-24 20:15:49 +00:00
John Baldwin	04043b3dcd	Expand the software fallback for GCM to cover more cases. - Extend ccr_gcm_soft() to handle requests with a non-empty payload. While here, switch to allocating the GMAC context instead of placing it on the stack since it is over 1KB in size. - Allow ccr_gcm() to return a special error value (EMSGSIZE) which triggers a fallback to ccr_gcm_soft(). Move the existing empty payload check into ccr_gcm() and change a few other cases (e.g. large AAD) to fallback to software via EMSGSIZE as well. - Add a new 'sw_fallback' stat to count the number of requests processed via the software fallback. Submitted by: Harsh Jain @ Chelsio (original version) Sponsored by: Chelsio Communications	2018-01-24 20:14:57 +00:00
John Baldwin	bf5b662033	Clamp DSGL entries to a length of 2KB. This works around an issue in the T6 that can result in DMA engine stalls if an error occurs while processing a DSGL entry with a length larger than 2KB. Submitted by: Harsh Jain @ Chelsio Sponsored by: Chelsio Communications	2018-01-24 20:13:07 +00:00
John Baldwin	f7b61e2fcc	Fail crypto requests when the resulting work request is too large. Most crypto requests will not trigger this condition, but a request with a highly-fragmented data buffer (and a resulting "large" S/G list) could trigger it. Sponsored by: Chelsio Communications	2018-01-24 20:12:00 +00:00
John Baldwin	5929c9fb13	Don't discard AAD and IV output data for AEAD requests. The T6 can hang when processing certain AEAD requests if the request sets a flag asking the crypto engine to discard the input IV and AAD rather than copying them into the output buffer. The existing driver always discards the IV and AAD as we do not need it. As a workaround, allocate a single "dummy" buffer when the ccr driver attaches and change all AEAD requests to write the IV and AAD to this scratch buffer. The contents of the scratch buffer are never used (similar to "bogus_page"), and it is ok for multiple in-flight requests to share this dummy buffer. Submitted by: Harsh Jain @ Chelsio (original version) Sponsored by: Chelsio Communications	2018-01-24 20:11:00 +00:00
John Baldwin	acaabdbbee	Reject requests with AAD and IV larger than 511 bytes. The T6 crypto engine's control messages only support a total AAD length (including the prefixed IV) of 511 bytes. Reject requests with large AAD rather than returning incorrect results. Sponsored by: Chelsio Communications	2018-01-24 20:08:10 +00:00
John Baldwin	020ce53af3	Always set the IV location to IV_NOP. The firmware ignores this field in the FW_CRYPTO_LOOKASIDE_WR work request. Submitted by: Harsh Jain @ Chelsio Sponsored by: Chelsio Communications	2018-01-24 20:06:02 +00:00
John Baldwin	d3f25aa152	Always store the IV in the immediate portion of a work request. Combined authentication-encryption and GCM requests already stored the IV in the immediate explicitly. This extends this behavior to block cipher requests to work around a firmware bug. While here, simplify the AEAD and GCM handlers to not include always-true conditions. Submitted by: Harsh Jain @ Chelsio Sponsored by: Chelsio Communications	2018-01-24 20:04:08 +00:00
Pedro F. Giffuni	ac2fffa4b7	Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197	2018-01-21 15:42:36 +00:00
Pedro F. Giffuni	26c1d774b5	dev: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these is likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the sorrounding code could handle NULL values.	2018-01-13 22:30:30 +00:00
John Baldwin	2bd1e600e3	Fix some incorrect sysctl pointers for some error stats. The bad_session, sglist_error, and process_error sysctl nodes were returning the value of the pad_error node instead of the appropriate error counters. Sponsored by: Chelsio Communications	2017-09-14 21:06:08 +00:00
John Baldwin	1496376fee	Fix the software fallback for GCM to validate the existing tag for decrypts. Sponsored by: Chelsio Communications	2017-06-08 21:33:10 +00:00

1 2

53 Commits