Conrad Meyer fe182ba1d0 aesni(4): Add support for x86 SHA intrinsics
Some x86 class CPUs have accelerated intrinsics for SHA1 and SHA256.
Provide this functionality on CPUs that support it.

This implements CRYPTO_SHA1, CRYPTO_SHA1_HMAC, and CRYPTO_SHA2_256_HMAC.

Correctness: The cryptotest.py suite in tests/sys/opencrypto has been
enhanced to verify SHA1 and SHA256 HMAC using standard NIST test vectors.
The test passes on this driver.  Additionally, jhb's cryptocheck tool has
been used to compare various random inputs against OpenSSL.  This test also
passes.

Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
cryptosoft:       ~1800 Mb/s    SHA256: ~1800 Mb/s

So ~4.4-4.6x speedup depending on algorithm choice.  This is consistent with
the results the Linux folks saw for 4kB buffers.

The driver borrows SHA update code from sys/crypto sha1 and sha256.  The
intrinsic step function comes from Intel under a 3-clause BSDL.[0]  The
intel_sha_extensions_sha<foo>_intrinsic.c files were renamed and lightly
modified (added const, resolved a warning or two; included the sha_sse
header to declare the functions).

[0]: https://software.intel.com/en-us/articles/intel-sha-extensions-implementations

Reviewed by:	jhb
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12452
2017-09-26 23:12:32 +00:00

112 lines
3.7 KiB
Groff

.\" Copyright (c) 2010 Konstantin Belousov <kib@FreeBSD.org>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd September 26, 2017
.Dt AESNI 4
.Os
.Sh NAME
.Nm aesni
.Nd "driver for the AES and SHA accelerator on x86 CPUs"
.Sh SYNOPSIS
To compile this driver into the kernel,
place the following lines in your
kernel configuration file:
.Bd -ragged -offset indent
.Cd "device crypto"
.Cd "device cryptodev"
.Cd "device aesni"
.Ed
.Pp
Alternatively, to load the driver as a
module at boot time, place the following line in
.Xr loader.conf 5 :
.Bd -literal -offset indent
aesni_load="YES"
.Ed
.Sh DESCRIPTION
Starting with Intel Westmere and AMD Bulldozer, some x86 processors implement a
new set of instructions called AESNI.
The set of six instructions accelerates the calculation of the key
schedule for key lengths of 128, 192, and 256 of the Advanced
Encryption Standard (AES) symmetric cipher, and provides a hardware
implementation of the regular and the last encryption and decryption
rounds.
.Pp
The processor capability is reported as AESNI in the Features2 line at boot.
.Pp
Starting with the Intel Goldmont and AMD Ryzen microarchitectures, some x86
processors implement a new set of SHA instructions.
The set of seven instructions accelerates the calculation of SHA1 and SHA256
hashes.
.Pp
The processor capability is reported as SHA in the Structured Extended Features
line at boot.
.Pp
The
.Nm
driver does not attach on systems that lack both CPU capabilities.
On systems that support only one of AESNI or SHA extensions, the driver will
attach and support that one function.
.Pp
The
.Nm
driver registers itself to accelerate AES and SHA operations for
.Xr crypto 4 .
Besides speed, the advantage of using the
.Nm
driver is that the AESNI operation
is data-independent, thus eliminating some attack vectors based on
measuring cache use and timings typically present in table-driven
implementations.
.Sh SEE ALSO
.Xr crypt 3 ,
.Xr crypto 4 ,
.Xr intro 4 ,
.Xr ipsec 4 ,
.Xr padlock 4 ,
.Xr random 4 ,
.Xr crypto 9
.Sh HISTORY
The
.Nm
driver first appeared in
.Fx 9.0 .
SHA support was added in
.Fx 12.0 .
.Sh AUTHORS
.An -nosplit
The
.Nm
driver was written by
.An Konstantin Belousov Aq Mt kib@FreeBSD.org
and
.An Conrad Meyer Aq Mt cem@FreeBSD.org .
The key schedule calculation code was adopted from the sample provided
by Intel and used in the analogous
.Ox
driver.
The hash step intrinsics implementations were supplied by Intel.