The assertion checking for valid IV lengths added in 1833d6042c9a
was not properly updated to permit an IV length of 8 in commit
42dcd39528c6.
Reported by: syzbot+f0c0559b8be1d6eb28c7@syzkaller.appspotmail.com
Reviewed by: markj
Fixes: 42dcd39528c6 crypto: Support Chacha20-Poly1305 with a nonce size of 8 bytes.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32860
Don't pass the same name to multiple mutexes while using unique types
for WITNESS. Just use the unique types as the mutex names.
Reviewed by: markj
MFC after: 1 week
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32740
The starting sequence number used to verify that TLS 1.0 CBC records
are encrypted in-order in the OCF layer was always set to 0 and not to
the initial sequence number from the struct tls_enable.
In practice, OpenSSL always starts TLS transmit offload with a
sequence number of zero, so this only matters for tests that use a
random starting sequence number.
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D32676
As a followup to SW KTLS assuming an OCF backend, rename
struct ocf_session to struct ktls_ocf_session and forward
declare it in <sys/ktls.h> to use as the type of
struct ktls_session.cipher.
Reviewed by: gallatin, hselasky
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D32565
Pass the ivlen along through, and just drop this KASSERT() if we're
building _STANDALONE for the time being.
Fixes: 1833d6042c9a ("crypto: Permit variable-sized IVs ...")
This is useful for WireGuard which uses a nonce of 8 bytes rather
than the 12 bytes used for IPsec and TLS.
Note that this also fixes a (should be) harmless bug in ossl(4) where
the counter was incorrectly treated as a 64-bit counter instead of a
32-bit counter in terms of wrapping when using a 12 byte nonce.
However, this required a single message (TLS record) longer than 64 *
(2^32 - 1) bytes (about 256 GB) to trigger.
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32122
The tag length is included as one of the values in the flags byte of
block 0 passed to CBC_MAC, so merely copying the first N bytes is
insufficient.
To avoid adding more sideband data to the CBC MAC software context,
pull the generation of block 0, the AAD length, and AAD padding out of
cbc_mac.c and into cryptosoft.c. This matches how GCM/GMAC are
handled where the length block is constructed in cryptosoft.c and
passed as an input to the Update callback. As a result, the CBC MAC
Update() routine is now much simpler and simply performs the
XOR-and-encrypt step on each input block.
While here, avoid a copy to the staging block in the Update routine
when one or more full blocks are passed as input to the Update
callback.
Reviewed by: sef
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32120
Permit nonces of lengths 7 through 13 in the OCF framework and the
cryptosoft driver. A helper function (ccm_max_payload_length) can be
used in OCF drivers to reject CCM requests which are too large for the
specified nonce length.
Reviewed by: sef
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32111
If an operation would generate a MAC output (e.g. for digest operation
or for an AEAD or EtA operation), then an empty payload buffer is
valid. Only reject requests with an empty buffer for "plain" cipher
sessions.
Some of the AES-CCM NIST KAT vectors use an empty payload.
While here, don't advance crp_payload_start for requests that use an
empty payload with an inline IV. (*)
Reported by: syzbot+d4b94fbd9a44b032f428@syzkaller.appspotmail.com (*)
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32109
A request without AAD for an AEAD cipher can be submitted via
CIOCCRYPT rather than CIOCCRYPTAEAD.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32108
Add 'ivlen' and 'maclen' fields to the structure used for CIOGSESSION2
to specify the explicit IV/nonce and MAC/tag lengths for crypto
sessions. If these fields are zero, the default lengths are used.
This permits selecting an alternate nonce length for AEAD ciphers such
as AES-CCM which support multiple nonce leengths. It also supports
truncated MACs as input to AEAD or ETA requests.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32107
Rather than copying crp_iv to a local array on the stack that is then
passed to xform reinit routines, pass crp_iv directly and remove the
local copy.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32106
Add a 'len' argument to the reinit hook in 'struct enc_xform' to
permit support for AEAD ciphers such as AES-CCM and Chacha20-Poly1305
which support different nonce lengths.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32105
- Retire cse->mode and use csp->csp_mode instead.
- Use csp->csp_cipher_algorithm instead of the ivsize when checking
for the fixup for the IV length for AES-XTS.
Reviewed by: markj
Sponsored by: Chelsio Communications, The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32103
Otherwise we can end up comparing the computed digest with an
uninitialized kernel buffer.
In cryptoaead_op() we already unconditionally fail the request if a
pointer to a digest buffer is not specified.
Based on a patch by Simran Kathpalia.
Reported by: syzkaller
Reviewed by: jhb
MFC after: 1 week
Pull Request: https://github.com/freebsd/freebsd-src/pull/529
Differential Revision: https://reviews.freebsd.org/D32124
KTLS OCF support was originally targeted at software backends that
used host CPU cycles to encrypt TLS records. As a result, each KTLS
worker thread queued a single TLS record at a time and waited for it
to be encrypted before processing another TLS record. This works well
for software backends but limits throughput on OCF drivers for
coprocessors that support asynchronous operation such as qat(4) or
ccr(4). This change uses an alternate function (ktls_encrypt_async)
when encrypt TLS records via a coprocessor. This function queues TLS
records for encryption and returns. It defers the work done after a
TLS record has been encrypted (such as marking the mbufs ready) to a
callback invoked asynchronously by the coprocessor driver when a
record has been encrypted.
- Add a struct ktls_ocf_state that holds the per-request state stored
on the stack for synchronous requests. Asynchronous requests malloc
this structure while synchronous requests continue to allocate this
structure on the stack.
- Add a ktls_encrypt_async() variant of ktls_encrypt() which does not
perform request completion after dispatching a request to OCF.
Instead, the ktls_ocf backends invoke ktls_encrypt_cb() when a TLS
record request completes for an asynchronous request.
- Flag AEAD software TLS sessions as async if the backend driver
selected by OCF is an async driver.
- Pull code to create and dispatch an OCF request out of
ktls_encrypt() into a new ktls_encrypt_one() function used by both
ktls_encrypt() and ktls_encrypt_async().
- Pull code to "finish" the VM page shuffling for a file-backed TLS
record into a helper function ktls_finish_noanon() used by both
ktls_encrypt() and ktls_encrypt_cb().
Reviewed by: markj
Tested on: ccr(4) (jhb), qat(4) (markj)
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D31665
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).
Sponsored by: The FreeBSD Foundation
This function combines crypto_cursor_segbase() and
crypto_cursor_seglen() into a single function. This is mostly
beneficial in the unmapped mbuf case where back to back calls of these
two functions have to iterate over the sub-components of unmapped
mbufs twice.
Bump __FreeBSD_version for crypto drivers in ports.
Suggested by: markj
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30445
This avoids creating a duplicate copy on the stack just to
append the trailer.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30139
This removes support for loadable software backends. The KTLS OCF
support is now always included in kernels with KERN_TLS and the
ktls_ocf.ko module has been removed. The software encryption routines
now take an mbuf directly and use the TLS mbuf as the crypto buffer
when possible.
Bump __FreeBSD_version for software backends in ports.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30138
This is not a functional change as the Poly1305 hash is the same
length as the GMAC hash length.
Reviewed by: gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30137
This is intended for use in KTLS transmit where each TLS record is
described by a single mbuf that is itself queued in the socket buffer.
Using the existing CRYPTO_BUF_MBUF would result in
bus_dmamap_load_crp() walking additional mbufs in the socket buffer
that are not relevant, but generating a S/G list that potentially
exceeds the limit of the tag (while also wasting CPU cycles).
Reviewed by: markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30136
There haven't been any non-obscure drivers that supported this
functionality and it has been impossible to test to ensure that it
still works. The only known consumer of this interface was the engine
in OpenSSL < 1.1. Modern OpenSSL versions do not include support for
this interface as it was not well-documented.
Reviewed by: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D29736
Copy the iovec for the trailer from the proper place. This is the same
fix for CBC encryption from ff6a7e4ba6bf.
Reported by: gallatin
Reviewed by: gallatin, markj
Fixes: 49f6925ca
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29177
cryptosoft is always present and doesn't print any useful information
when it attaches.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D29098
There currently isn't a need to provide a public interface to a
software Poly1305 implementation beyond what is already available via
libsodium's APIs and these symbols conflict with symbols shared within
the ossl.ko module between ossl_poly1305.c and ossl_chacha20.c.
Reported by: se, kp
Fixes: 78991a93eb9d
Sponsored by: Netflix
Maintain a cache of physically contiguous runs of pages for use as
output buffers when software encryption is configured and in-place
encryption is not possible. This makes allocation and free cheaper
since in the common case we avoid touching the vm_page structures for
the buffer, and fewer calls into UMA are needed. gallatin@ reports a
~10% absolute decrease in CPU usage with sendfile/KTLS on a Xeon after
this change.
It is possible that we will not be able to allocate these buffers if
physical memory is fragmented. To avoid frequently calling into the
physical memory allocator in this scenario, rate-limit allocation
attempts after a failure. In the failure case we fall back to the old
behaviour of allocating a page at a time.
N.B.: this scheme could be simplified, either by simply using malloc()
and looking up the PAs of the pages backing the buffer, or by falling
back to page by page allocation and creating a mapping in the cache
zone. This requires some way to save a mapping of an M_EXTPG page array
in the mbuf, though. m_data is not really appropriate. The second
approach may be possible by saving the mapping in the plinks union of
the first vm_page structure of the array, but this would force a vm_page
access when freeing an mbuf.
Reviewed by: gallatin, jhb
Tested by: gallatin
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28556
This supports Chacha20-Poly1305 for both send and receive for TLS 1.2
and for send in TLS 1.3.
Reviewed by: gallatin
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27841
This uses the chacha20 IETF and poly1305 implementations from
libsodium. A seperate auth_hash is created for the auth side whose
Setkey method derives the poly1305 key from the AEAD key and nonce as
described in RFC 8439.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27837
Note that this algorithm implements the mode defined in RFC 8439.
Reviewed by: cem
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D27836
When performing encryption in software, the KTLS crypto callback always
locks the session to deliver a wakeup. But, if we're handling the
operation synchronously this is wasted effort and can result in
sleepqueue lock contention on large systems.
Use CRYPTO_SESS_SYNC() to determine whether the operation will be
completed asynchronously or not, and select a callback appropriately.
Avoid locking the session to check for completion if the session handles
requests synchronously.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28195
Currently, OpenCrypto consumers can request asynchronous dispatch by
setting a flag in the cryptop. (Currently only IPSec may do this.) I
think this is a bit confusing: we (conditionally) set cryptop flags to
request async dispatch, and then crypto_dispatch() immediately examines
those flags to see if the consumer wants async dispatch. The flag names
are also confusing since they don't specify what "async" applies to:
dispatch or completion.
Add a new KPI, crypto_dispatch_async(), rather than encoding the
requested dispatch type in each cryptop. crypto_dispatch_async() falls
back to crypto_dispatch() if the session's driver provides asynchronous
dispatch. Get rid of CRYPTOP_ASYNC() and CRYPTOP_ASYNC_KEEPORDER().
Similarly, add crypto_dispatch_batch() to request processing of a tailq
of cryptops, rather than encoding the scheduling policy using cryptop
flags. Convert GELI, the only user of this interface (disabled by
default) to use the new interface.
Add CRYPTO_SESS_SYNC(), which can be used by consumers to determine
whether crypto requests will be dispatched synchronously. This is just
a helper macro. Use it instead of looking at cap flags directly.
Fix style in crypto_done(). Also get rid of CRYPTO_RETW_EMPTY() and
just check the relevant queues directly. This could result in some
unnecessary wakeups but I think it's very uncommon to be using more than
one queue per worker in a given workload, so checking all three queues
is a waste of cycles.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28194
This makes it a bit more straightforward to add new counters when
debugging. No functional change intended.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28498
Since r336439 we simply take the session pointer value mod the number of
worker threads (ncpu by default). On small systems this ends up
funneling all completion work through a single thread, which becomes a
bottleneck when processing IPSec traffic using hardware crypto drivers.
(Software drivers such as aesni(4) are unaffected since they invoke
completion handlers synchonously.)
Instead, maintain an incrementing counter with a unique value per
session, and use that to distribute work to completion threads.
Reviewed by: cem, jhb
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D28159
Store the driver softc below the fields owned by opencrypto. This is
a bit simpler and saves a pointer dereference when fetching the driver
softc when processing a request.
Get rid of the crypto session UMA zone. Session allocations are
frequent or performance-critical enough to warrant a dedicated zone.
No functional change intended.
Reviewed by: cem, jhb
Sponsored by: Rubicon Communications, LLC ("Netgate")
Differential Revision: https://reviews.freebsd.org/D28158
Crypto file descriptors were added in the original OCF import as a way
to provide per-open data (specifically the list of symmetric
sessions). However, this gives a bit of a confusing API where one has
to open /dev/crypto and then invoke an ioctl to obtain a second file
descriptor. This also does not match the API used with /dev/crypto on
other BSDs or with Linux's /dev/crypto driver.
Character devices have gained support for per-open data via cdevpriv
since OCF was imported, so use cdevpriv to simplify the userland API
by permitting ioctls directly on /dev/crypto descriptors.
To provide backwards compatibility, CRIOGET now opens another
/dev/crypto descriptor via kern_openat() rather than dup'ing the
existing file descriptor. This preserves prior semantics in case
CRIOGET is invoked multiple times on a single file descriptor.
Reviewed by: markj
Relnotes: yes
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D27302