This commit essentially has three parts:
* Add the AES-CCM encryption hooks. This is in and of itself fairly small,
as there is only a small difference between CCM and the other ICM-based
algorithms.
* Hook the code into the OpenCrypto framework. This is the bulk of the
changes, as the algorithm type has to be checked for, and the differences
between it and GCM dealt with.
* Update the cryptocheck tool to be aware of it. This is invaluable for
confirming that the code works.
This is a software-only implementation, meaning that the performance is very
low.
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D19090
This adds the CBC-MAC code to the kernel, but does not hook it up to
anything (that comes in the next commit).
https://tools.ietf.org/html/rfc3610 describes the algorithm.
Note that this is a software-only implementation, which means it is
fairly slow.
Sponsored by: iXsystems Inc
Differential Revision: https://reviews.freebsd.org/D18592
Right now, aesni_cipher_alloc does a bit of special-casing
for CRYPTO_F_IOV, to not do any allocation if the first uio
is large enough for the requested size. While working on ZFS
crypto port, I ran into horrible performance because the code
uses scatter-gather, and many of the times the data to encrypt
was in the second entry. This code looks through the list, and
tries to see if there is a single uio that can contain the
requested data, and, if so, uses that.
This has a slight impact on the current consumers, in that the
check is a little more complicated for the ones that use
CRYPTO_F_IOV -- but none of them meet the criteria for testing
more than one.
Submitted by: sef at ixsystems.com
Reviewed by: cem@
MFC after: 3 days
Sponsored by: iX Systems
Differential Revision: https://reviews.freebsd.org/D18522
The wrapper is a thin shim around libsodium's Poly-1305 implementation. For
now, we just use the C algorithm and do not attempt to build the
SSE-optimized variant for x86 processors.
The algorithm support has not yet been plumbed through cryptodev, or added
to cryptosoft.
Track session objects in the framework, and pass handles between the
framework (OCF), consumers, and drivers. Avoid redundancy and complexity in
individual drivers by allocating session memory in the framework and
providing it to drivers in ::newsession().
Session handles are no longer integers with information encoded in various
high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the
appropriate crypto_ses2foo() function on the opaque session handle.
Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to
the opaque handle interface. Discard existing session tracking as much as
possible (quick pass). There may be additional code ripe for deletion.
Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style
interface. The conversion is largely mechnical.
The change is documented in crypto.9.
Inspired by
https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html .
No objection from: ae (ipsec portion)
Reported by: jhb
In part, to support OpenSSL's use of cryptodev, which puts the HMAC pieces
in software and only offloads the raw hash primitive.
The following cryptodev identifiers are added:
* CRYPTO_RIPEMD160 (not hooked up)
* CRYPTO_SHA2_224
* CRYPTO_SHA2_256
* CRYPTO_SHA2_384
* CRYPTO_SHA2_512
The plain SHA1 and 2 hashes are plumbed through cryptodev (feels like there
is a lot of redundancy here...) and cryptosoft.
This adds new auth_hash implementations for the plain hashes, as well as
SHA1 (which had a cryptodev.h identifier, but no implementation).
Add plain SHA 1 and 2 hash tests to the cryptocheck tool.
Motivation stems from John Baldwin's earlier OCF email,
https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html .
Mostly this is a thin shim around existing code to integrate with enc_xform
and cryptosoft (+ cryptodev).
Expand the cryptodev buffer used to match that of Chacha20's native block
size as a performance enhancement for chacha20_xform_crypt_multi.
The upstream repository is on github BLAKE2/libb2. Files landed in
sys/contrib/libb2 are the unmodified upstream files, except for one
difference: secure_zero_memory's contents have been replaced with
explicit_bzero() only because the previous implementation broke powerpc
link. Preferential use of explicit_bzero() is in progress upstream, so
it is anticipated we will be able to drop this diff in the future.
sys/crypto/blake2 contains the source files needed to port libb2 to our
build system, a wrapped (limited) variant of the algorithm to match the API
of our auth_transform softcrypto abstraction, incorporation into the Open
Crypto Framework (OCF) cryptosoft(4) driver, as well as an x86 SSE/AVX
accelerated OCF driver, blake2(4).
Optimized variants of blake2 are compiled for a number of x86 machines
(anything from SSE2 to AVX + XOP). On those machines, FPU context will need
to be explicitly saved before using blake2(4)-provided algorithms directly.
Use via cryptodev / OCF saves FPU state automatically, and use via the
auth_transform softcrypto abstraction does not use FPU.
The intent of the OCF driver is mostly to enable testing in userspace via
/dev/crypto. ATF tests are added with published KAT test vectors to
validate correctness.
Reviewed by: jhb, markj
Obtained from: github BLAKE2/libb2
Differential Revision: https://reviews.freebsd.org/D14662
This adds explicit crp_mbuf and crp_uio pointers of the right type to
replace casts of crp_buf. This does not sweep through changing existing
code, but new code should use the correct fields instead of casts.
Reviewed by: kib
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D13927
Opaque pointers should be void *. Note that this does not go through
the tree removing all of the now-unnecessary casts.
Reviewed by: kib
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D13848
fine when a lot of different flows to be ciphered/deciphered are involved.
However, when a software crypto driver is used, there are
situations where we could benefit from making crypto(9) multi threaded:
- a single flow is to be ciphered: only one thread is used to cipher it,
- a single ESP flow is to be deciphered: only one thread is used to
decipher it.
The idea here is to call crypto(9) using a new mode (CRYPTO_F_ASYNC) to
dispatch the crypto jobs on multiple threads, if the underlying crypto
driver is working in synchronous mode.
Another flag is added (CRYPTO_F_ASYNC_KEEPORDER) to make crypto(9)
dispatch the crypto jobs in the order they are received (an additional
queue/thread is used), so that the packets are reinjected in the network
using the same order they were posted.
A new sysctl net.inet.ipsec.async_crypto can be used to activate
this new behavior (disabled by default).
Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu>
Reviewed by: ae, jmg, jhb
Differential Revision: https://reviews.freebsd.org/D10680
Sponsored by: Stormshield
Theoretically, HMACs do not actually have any limit on key sizes.
Transforms should compact input keys larger than the HMAC block size by
using the transform (hash) on the input key.
(Short input keys are padded out with zeros to the HMAC block size.)
Still, not all FreeBSD crypto drivers that provide HMAC functionality
handle longer-than-blocksize keys appropriately, so enforce a "maximum" key
length in the crypto API for auth_hashes that previously expressed a
requirement. (The "maximum" is the size of a single HMAC block for the
given transform.) Unconstrained auth_hashes are left as-is.
I believe the previous hardcoded sizes were committed in the original
import of opencrypto from OpenBSD and are due to specific protocol
details of IPSec. Note that none of the previous sizes actually matched
the appropriate HMAC block size.
The previous hardcoded sizes made the SHA tests in cryptotest.py
useless for testing FreeBSD crypto drivers; none of the NIST-KAT example
inputs had keys sized to the previous expectations.
The following drivers were audited to check that they handled keys up to
the block size of the HMAC safely:
Software HMAC:
* padlock(4)
* cesa
* glxsb
* safe(4)
* ubsec(4)
Hardware accelerated HMAC:
* ccr(4)
* hifn(4)
* sec(4) (Only supports up to 64 byte keys despite claiming to
support SHA2 HMACs, but validates input key sizes)
* cryptocteon (MIPS)
* nlmsec (MIPS)
* rmisec (MIPS) (Amusingly, does not appear to use key material at
all -- presumed broken)
Reviewed by: jhb (previous version), rlibby (previous version)
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12437
This requests that the cipher be performed before rather than after
the HMAC when both are specified for a single operation.
Reviewed by: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D11757
- Mark the source buffer for a copyback operation as const in the kernel
API.
- Use const with input-only buffers in crypto ioctl structures used with
/dev/crypto.
MFC after: 1 month
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D10517
defines the keys differently than NIST does, so we have to muck with
key lengths and nonce/IVs to be standard compliant...
Remove the iv from secasvar as it was unused...
Add a counter protected by a mutex to ensure that the counter for GCM
and ICM will never be repeated.. This is a requirement for security..
I would use atomics, but we don't have a 64bit one on all platforms..
Fix a bug where IPsec was depending upon the OCF to ensure that the
blocksize was always at least 4 bytes to maintain alignment... Move
this logic into IPsec so changes to OCF won't break IPsec...
In one place, espx was always non-NULL, so don't test that it's
non-NULL before doing work..
minor style cleanups...
drop setting key and klen as they were not used...
Enforce that OCF won't pass invalid key lengths to AES that would
panic the machine...
This was has been tested by others too... I tested this against
NetBSD 6.1.5 using mini-test suite in
https://github.com/jmgurney/ipseccfgs and the only things that don't
pass are keyed md5 and sha1, and 3des-deriv (setkey syntax error),
all other modes listed in setkey's man page... The nice thing is
that NetBSD uses setkey, so same config files were used on both...
Reviewed by: gnn
Though confusing, GCM using ICM_BLOCK_LEN, but ICM does not is
correct... GCM is built on ICM, but uses a function other than
swcr_encdec... swcr_encdec cannot handle partial blocks which is
why it must still use AES_BLOCK_LEN and is why XTS was broken by the
commit...
Thanks to the tests for helping sure I didn't break GCM w/ an earlier
patch...
I did run the tests w/o this patch, and need to figure out why they
did not fail, clearly more tests are needed...
Prodded by: peter
mode and with hardware support on systems that have AESNI instructions.
Differential Revision: D2936
Reviewed by: jmg, eri, cognet
Sponsored by: Rubicon Communications (Netgate)
for counter mode), and AES-GCM. Both of these modes have been added to
the aesni module.
Included is a set of tests to validate that the software and aesni
module calculate the correct values. These use the NIST KAT test
vectors. To run the test, you will need to install a soon to be
committed port, nist-kat that will install the vectors. Using a port
is necessary as the test vectors are around 25MB.
All the man pages were updated. I have added a new man page, crypto.7,
which includes a description of how to use each mode. All the new modes
and some other AES modes are present. It would be good for someone
else to go through and document the other modes.
A new ioctl was added to support AEAD modes which AES-GCM is one of them.
Without this ioctl, it is not possible to test AEAD modes from userland.
Add a timing safe bcmp for use to compare MACs. Previously we were using
bcmp which could leak timing info and result in the ability to forge
messages.
Add a minor optimization to the aesni module so that single segment
mbufs don't get copied and instead are updated in place. The aesni
module needs to be updated to support blocked IO so segmented mbufs
don't have to be copied.
We require that the IV be specified for all calls for both GCM and ICM.
This is to ensure proper use of these functions.
Obtained from: p4: //depot/projects/opencrypto
Relnotes: yes
Sponsored by: FreeBSD Foundation
Sponsored by: NetGate
o make all crypto drivers have a device_t; pseudo drivers like the s/w
crypto driver synthesize one
o change the api between the crypto subsystem and drivers to use kobj;
cryptodev_if.m defines this api
o use the fact that all crypto drivers now have a device_t to add support
for specifying which of several potential devices to use when doing
crypto operations
o add new ioctls that allow user apps to select a specific crypto device
to use (previous ioctls maintained for compatibility)
o overhaul crypto subsystem code to eliminate lots of cruft and hide
implementation details from drivers
o bring in numerous fixes from Michale Richardson/hifn; mostly for
795x parts
o add an optional mechanism for mmap'ing the hifn 795x public key h/w
to user space for use by openssl (not enabled by default)
o update crypto test tools to use new ioctl's and add cmd line options
to specify a device to use for tests
These changes will also enable much future work on improving the core
crypto subsystem; including proper load balancing and interposing code
between the core and drivers to dispatch small operations to the s/w
driver as appropriate.
These changes were instigated by the work of Michael Richardson.
Reviewed by: pjd
Approved by: re
- Add defines with block length for each HMAC algorithm.
- Add AES_BLOCK_LEN define which is an alias for RIJNDAEL128_BLOCK_LEN.
- Add NULL_BLOCK_LEN define.
or SHA512, the blocksize is 128 bytes, not 64 bytes as anywhere else.
The bug also exists in NetBSD, OpenBSD and various other independed
implementations I look at.
- We cannot decide which hash function to use for HMAC based on the key
length, because any HMAC function can use any key length.
To fix it split CRYPTO_SHA2_HMAC into three algorithm:
CRYPTO_SHA2_256_HMAC, CRYPTO_SHA2_384_HMAC and CRYPTO_SHA2_512_HMAC.
Those names are consistent with OpenBSD's naming.
- Remove authsize field from auth_hash structure.
- Allow consumer to define size of hash he wants to receive.
This allows to use HMAC not only for IPsec, where 96 bits MAC is requested.
The size of requested MAC is defined at newsession time in the cri_mlen
field - when 0, entire MAC will be returned.
- Add swcr_authprepare() function which prepares authentication key.
- Allow to provide key for every authentication operation, not only at
newsession time by honoring CRD_F_KEY_EXPLICIT flag.
- Make giving key at newsession time optional - don't try to operate on it
if its NULL.
- Extend COPYBACK()/COPYDATA() macros to handle CRYPTO_BUF_CONTIG buffer
type as well.
- Accept CRYPTO_BUF_IOV buffer type in swcr_authcompute() as we have
cuio_apply() now.
- 16 bits for key length (SW_klen) is more than enough.
Reviewed by: sam
crypto_invoke(). This allows to serve multiple crypto requests in
parallel and not bached requests are served lock-less.
Drivers should not depend on the queue lock beeing held around
crypto_invoke() and if they do, that's an error in the driver - it
should do its own synchronization.
- Don't forget to wakeup the crypto thread when new requests is
queued and only if both symmetric and asymmetric queues are empty.
- Symmetric requests use sessions and there is no way driver can
disappear when there is an active session, so we don't need to check
this, but assert this. This is also safe to not use the driver lock
in this case.
- Assymetric requests don't use sessions, so don't check the driver
in crypto_kinvoke().
- Protect assymetric operation with the driver lock, because if there
is no symmetric session, driver can disappear.
- Don't send assymetric request to the driver if it is marked as
blocked.
- Add an XXX comment, because I don't think migration to another driver
is safe when there are pending requests using freed session.
- Remove 'hint' argument from crypto_kinvoke(), as it serves no purpose.
- Don't hold the driver lock around kprocess method call, instead use
cc_koperations to track number of in-progress requests.
- Cleanup register/unregister code a bit.
- Other small simplifications and cleanups.
Reviewed by: sam
software crypto device:
o record crypto device capabilities in each session id
o add a capability that indicates if the crypto driver operates synchronously
o tag the software crypto driver as operating synchronously
This commit also introduces crypto session id macros that cleanup their
construction and querying.
o add a ``done'' flag for crypto operations; this is set when the operation
completes and is intended for callers to check operations that may complete
``prematurely'' because of direct callbacks
o close a race for operations where the crypto driver returns ERESTART: we
need to hold the q lock to insure the blocked state for the driver and any
driver-private state is consistent; otherwise drivers may take an interrupt
and notify the crypto subsystem that it can unblock the driver but operations
will be left queued and never be processed
o close a race in /dev/crypto where operations can complete before the caller
can sleep waiting for the callback: use a per-session mutex and the new done
flag to handle this
o correct crypto_dispatch's handling of operations where the driver returns
ERESTART: the return value must be zero and not ERESTART, otherwise the
caller may free the crypto request despite it being queued for later handling
(this typically results in a later panic)
o change crypto mutex ``names'' so witness printouts and the like are more
meaningful
should be done in crypto_done rather than in the callback thread
o use this flag to mark operations from /dev/crypto since the callback
routine just does a wakeup; this eliminates the last unneeded ctx switch
o change CRYPTO_F_NODELAY to CRYPTO_F_BATCH with an inverted meaning
so "0" becomes the default/desired setting (needed for user-mode
compatibility with openbsd)
o change crypto_dispatch to honor CRYPTO_F_BATCH instead of always
dispatching immediately
o remove uses of CRYPTO_F_NODELAY
o define COP_F_BATCH for ops submitted through /dev/crypto and pass
this on to the op that is submitted
Similar changes and more eventually coming for asymmetric ops.
MFC if re gives approval.
cryptodev or kldunload cryptodev module); crypto statistcs; remove
unused alloctype field from crypto op to offset addition of the
performance time stamp
Supported by: Vernier Networks
a consistent interface to h/w and s/w crypto algorithms for use by the
kernel and (for h/w at least) by user-mode apps. Access for user-level
code is through a /dev/crypto device that'll eventually be used by openssl
to (potentially) accelerate many applications. Coming soon is an IPsec
that makes use of this service to accelerate ESP, AH, and IPCOMP protocols.
Included here is the "core" crypto support, /dev/crypto driver, various
crypto algorithms that are not already present in the KAME crypto area,
and support routines used by crypto device drivers.
Obtained from: openbsd