freebsd-nq/sys/crypto/ccp/ccp_lsb.c
Conrad Meyer 844d9543dc Add ccp(4): experimental driver for AMD Crypto Co-Processor
* Registers TRNG source for random(4)
* Finds available queues, LSBs; allocates static objects
* Allocates a shared MSI-X for all queues.  The hardware does not have
  separate interrupts per queue.  Working interrupt mode driver.
* Computes SHA hashes, HMAC.  Passes cryptotest.py, cryptocheck tests.
* Does AES-CBC, CTR mode, and XTS.  cryptotest.py and cryptocheck pass.
* Support for "authenc" (AES + HMAC).  (SHA1 seems to result in
  "unaligned" cleartext inputs from cryptocheck -- which the engine
  cannot handle.  SHA2 seems to work fine.)
* GCM passes for block-multiple AAD, input lengths

Largely based on ccr(4), part of cxgbe(4).

Rough performance averages on AMD Ryzen 1950X (4kB buffer):
aesni:      SHA1: ~8300 Mb/s    SHA256: ~8000 Mb/s
ccp:               ~630 Mb/s    SHA256:  ~660 Mb/s  SHA512:  ~700 Mb/s
cryptosoft:       ~1800 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

As you can see, performance is poor in comparison to aesni(4) and even
cryptosoft (due to high setup cost).  At a larger buffer size (128kB),
throughput is a little better (but still worse than aesni(4)):

aesni:      SHA1:~10400 Mb/s    SHA256: ~9950 Mb/s
ccp:              ~2200 Mb/s    SHA256: ~2600 Mb/s  SHA512: ~3800 Mb/s
cryptosoft:       ~1750 Mb/s    SHA256: ~1800 Mb/s  SHA512: ~2700 Mb/s

AES performance has a similar story:

aesni:      4kB: ~11250 Mb/s    128kB: ~11250 Mb/s
ccp:               ~350 Mb/s    128kB:  ~4600 Mb/s
cryptosoft:       ~1750 Mb/s    128kB:  ~1700 Mb/s

This driver is EXPERIMENTAL.  You should verify cryptographic results on
typical and corner case inputs from your application against a known-good
implementation.

Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12723
2018-01-18 22:01:30 +00:00

/*-
 * SPDX-License-Identifier: BSD-2-Clause-FreeBSD
 *
 * Copyright (c) 2017 Conrad Meyer <cem@FreeBSD.org>
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <sys/types.h>
#include <sys/bus.h>
#include <sys/malloc.h>
#include <sys/sysctl.h>

#include <opencrypto/xform.h>

#include "ccp.h"
#include "ccp_lsb.h"
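
/*
 * Decode the LSB regions that 'queue' may access.  Bit
 * (queue + MAX_HW_QUEUES * region) of 'lsbmask' is set when the queue has
 * access to that region; the result is stored in qp->lsb_mask with one bit
 * per region.
 */
void
ccp_queue_decode_lsb_regions(struct ccp_softc *sc, uint64_t lsbmask,
    unsigned queue)
{
	struct ccp_queue *qp;
	unsigned i;

	qp = &sc->queues[queue];

	qp->lsb_mask = 0;

	for (i = 0; i < MAX_LSB_REGIONS; i++) {
		if (((1 << queue) & lsbmask) != 0)
			qp->lsb_mask |= (1 << i);
		lsbmask >>= MAX_HW_QUEUES;
	}

	/*
	 * Ignore region 0, which has special entries that cannot be used
	 * generally.
	 */
	qp->lsb_mask &= ~(1 << 0);
}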
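
/*
 * Illustration only, not part of the driver: a minimal userland sketch of
 * the decode loop above, so the packed mask layout can be checked in
 * isolation.  EX_MAX_HW_QUEUES (5) and EX_MAX_LSB_REGIONS (8) are assumed
 * values inferred from the "7 general purpose LSBs total and 5 queues"
 * comment plus the special region 0; the real constants live in the driver
 * headers.  ex_decode_lsb_regions() is a hypothetical name.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; see the note above. */
#define	EX_MAX_HW_QUEUES	5
#define	EX_MAX_LSB_REGIONS	8

/*
 * Return a per-region bitmask for 'queue'.  Bit
 * (queue + EX_MAX_HW_QUEUES * region) of 'lsbmask' means the queue may
 * access that region.  Region 0 is always masked out of the result.
 */
static uint32_t
ex_decode_lsb_regions(uint64_t lsbmask, unsigned queue)
{
	uint32_t mask;
	unsigned i;

	assert(queue < EX_MAX_HW_QUEUES);

	mask = 0;
	for (i = 0; i < EX_MAX_LSB_REGIONS; i++) {
		/* The low EX_MAX_HW_QUEUES bits now describe region i. */
		if ((lsbmask & (UINT64_C(1) << queue)) != 0)
			mask |= UINT32_C(1) << i;
		lsbmask >>= EX_MAX_HW_QUEUES;
	}

	/* Drop the special region 0, as the driver does. */
	return (mask & ~UINT32_C(1));
}
```

/*
 * For example, a mask with bits 2, 7 and 17 set grants queue 2 access to
 * regions 0, 1 and 3; the sketch reports only regions 1 and 3 because
 * region 0 is discarded.
 */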

/*
 * Look for a private LSB for each queue.  There are 7 general purpose LSBs
 * total and 5 queues.  PSP will reserve some of both.  Firmware limits some
 * queues' access to some LSBs; we hope it is fairly sane and just use a dumb
 * greedy algorithm to assign LSBs to queues.
 */
void
ccp_assign_lsb_regions(struct ccp_softc *sc, uint64_t lsbmask)
{
	unsigned q, i;

	for (q = 0; q < nitems(sc->queues); q++) {
		if (((1 << q) & sc->valid_queues) == 0)
			continue;

		sc->queues[q].private_lsb = -1;

		/* Intentionally skip specialized 0th LSB */
		for (i = 1; i < MAX_LSB_REGIONS; i++) {
			if ((lsbmask &
			    (1ull << (q + (MAX_HW_QUEUES * i)))) != 0) {
				sc->queues[q].private_lsb = i;
				lsbmask &= ~(0x1Full << (MAX_HW_QUEUES * i));
				break;
			}
		}

		if (i == MAX_LSB_REGIONS) {
			device_printf(sc->dev,
			    "Ignoring queue %u with no private LSB\n", q);
			sc->valid_queues &= ~(1 << q);
		}
	}
}
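
/*
 * Illustration only, not part of the driver: a minimal userland sketch of
 * the greedy assignment above, written without the softc so it can be run
 * standalone.  EX_MAX_HW_QUEUES (5) and EX_MAX_LSB_REGIONS (8) are assumed
 * values, and ex_assign_lsb_regions() and its array parameter are
 * hypothetical.  Unlike the driver, the sketch also sets the entry for
 * invalid queues to -1 so the output array is always fully initialized.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; see the note above. */
#define	EX_MAX_HW_QUEUES	5
#define	EX_MAX_LSB_REGIONS	8

/*
 * Greedily give each valid queue the lowest-numbered region (skipping
 * region 0) that it may access and that no earlier queue has claimed.
 * Returns the updated valid-queue bitmask; queues left without a private
 * region are cleared from it and keep private_lsb[q] == -1.
 */
static unsigned
ex_assign_lsb_regions(uint64_t lsbmask, unsigned valid_queues,
    int private_lsb[EX_MAX_HW_QUEUES])
{
	unsigned q, i;

	assert(valid_queues < (1u << EX_MAX_HW_QUEUES));

	for (q = 0; q < EX_MAX_HW_QUEUES; q++) {
		private_lsb[q] = -1;
		if ((valid_queues & (1u << q)) == 0)
			continue;

		for (i = 1; i < EX_MAX_LSB_REGIONS; i++) {
			if ((lsbmask & (UINT64_C(1) <<
			    (q + EX_MAX_HW_QUEUES * i))) != 0) {
				private_lsb[q] = (int)i;
				/* Claim region i: clear every queue's bit. */
				lsbmask &= ~(UINT64_C(0x1F) <<
				    (EX_MAX_HW_QUEUES * i));
				break;
			}
		}

		if (i == EX_MAX_LSB_REGIONS)
			valid_queues &= ~(1u << q);
	}
	return (valid_queues);
}
```

/*
 * The sketch also shows the cost of the greedy choice.  If queue 0 may use
 * regions 1 and 2 but queue 1 may use only region 1, queue 0 claims
 * region 1 first and queue 1 is dropped, even though assigning region 2 to
 * queue 0 would have served both.
 */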