after the array to its proper location. Otherwise, the linker.hints
file has things out of order and we associated it with whatever was
the previous module.
Apply r328361 to duplicate copy of ccr_gcm_soft in ccp(4).
Properly honor the lack of the CRD_F_IV_PRESENT flag in the GCM software
fallback case for encryption requests.
* Registers TRNG source for random(4)
* Finds available queues, LSBs; allocates static objects
* Allocates a shared MSI-X for all queues. The hardware does not have
separate interrupts per queue. Working interrupt mode driver.
* Computes SHA hashes, HMAC. Passes cryptotest.py, cryptocheck tests.
* Does AES-CBC, CTR mode, and XTS. cryptotest.py and cryptocheck pass.
* Support for "authenc" (AES + HMAC). (SHA1 seems to result in
"unaligned" cleartext inputs from cryptocheck -- which the engine
cannot handle. SHA2 seems to work fine.)
* GCM passes for block-multiple AAD, input lengths
Largely based on ccr(4), part of cxgbe(4).
Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni: SHA1: ~8300 Mb/s SHA256: ~8000 Mb/s
ccp: ~630 Mb/s SHA256: ~660 Mb/s SHA512: ~700 Mb/s
cryptosoft: ~1800 Mb/s SHA256: ~1800 Mb/s SHA512: ~2700 Mb/s
As you can see, performance is poor in comparison to aesni(4) and even
cryptosoft (due to high setup cost). At a larger buffer size (128kB),
throughput is a little better (but still worse than aesni(4)):
aesni: SHA1:~10400 Mb/s SHA256: ~9950 Mb/s
ccp: ~2200 Mb/s SHA256: ~2600 Mb/s SHA512: ~3800 Mb/s
cryptosoft: ~1750 Mb/s SHA256: ~1800 Mb/s SHA512: ~2700 Mb/s
AES performance has a similar story:
aesni: 4kB: ~11250 Mb/s 128kB: ~11250 Mb/s
ccp: ~350 Mb/s 128kB: ~4600 Mb/s
cryptosoft: ~1750 Mb/s 128kB: ~1700 Mb/s
This driver is EXPERIMENTAL. You should verify cryptographic results on
typical and corner case inputs from your application against a known- good
implementation.
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D12723