The loops for Chacha20 and Chacha20+Poly1305 which encrypted/decrypted
full blocks of data used the minimum of the input and output segment
lengths to determine the size of the next chunk ('todo') to pass to
Chacha20_ctr32(). However, the input and output segments could extend
past the end of the ciphertext region into the tag (e.g. if a "plain"
single mbuf contained an entire TLS record). If the length of the tag
plus the length of the last partial block together were at least as
large as a full Chacha20 block (64 bytes), then an extra block was
encrypted/decrypted overlapping with the tag. Fix this by also
capping the amount of data to encrypt/decrypt by the amount of
remaining data in the ciphertext region ('resid').
Reported by: gallatin
Reviewed by: cem, gallatin, markj
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D29517
This file inherits some boilerplate and structure from the analogous
file in aesni(4), aesni_wrap.c. Note the derivation and the copyright
holders of that file.
For example, the AES-XTS bits added in 4979620ece were ported from
aesni(4).
Requested by: jmg
Reviewed by: imp, gnn
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29268
Initialization of the XTS key schedule was accidentally dropped
when adding AES-GCM support so all-zero schedule was used instead.
This rendered previously created GELI partitions unusable.
This change restores proper XTS key schedule initialization.
Reported by: Peter Jeremy <peter@rulingia.com>
MFC after: immediately
The missing newline mildly garbles boot-time messages and this can be
troublesome if you need those.
Fixes: a520f5ca58 ("armv8crypto: print a message on probe failure")
Reported by: Mike Karels (mike@karels.net)
Reviewed By: gonzo
Differential Revision: https://reviews.freebsd.org/D28988
This makes it easier to refactor the GCM code to operate on
crypto_buffer_cursors rather than plain contiguous buffers, with the aim
of minimizing the amount of copying and zeroing done today.
No functional change intended.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D28500
- We were only hashing up to the first 16 bytes of the AAD.
- When computing the digest during decryption, handle the case where
len == trailer, i.e., len < AES_BLOCK_LEN, properly.
While here:
- trailer is always smaller than AES_BLOCK_LEN, so remove a pair of
unnecessary modulus operations.
- Replace some byte-by-byte loops with memcpy() and memset() calls.
In particular, zero the full block before copying a partial block into
it since we do that elsewhere and it means that the memset() length is
known at compile time.
Reviewed by: jhb
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28501
Rather than depending on malloc() returning 16-byte aligned chunks,
allocate some extra pad bytes and ensure that key schedules are
appropriately aligned.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Rubicon Communications, LLC (Netgate)
Differential Revision: https://reviews.freebsd.org/D28157
Similar to the message printed by aesni(4), let the user know if the
driver is unsupported by their CPU.
PR: 252543
Reported by: gbe
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
A straightforward(ish) port from aesni(4). This implementation does not
perform loop unrolling on the input blocks, so this is left as a future
performance improvement.
Submitted by: Greg V <greg AT unrelenting.technology>
Looks good: jhb, jmg
Tested by: mhorne
Differential Revision: https://reviews.freebsd.org/D21017
Follow-up to r353959 and r368070: do the same for other architectures.
arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing. Likewise, MIPS
uses .frame directives.
Reviewed by: arichardson
Differential Revision: https://reviews.freebsd.org/D27387
Enable in-kernel acceleration of SHA1 and SHA2 operations on arm64 by adding
support for the ossl(4) crypto driver. This uses OpenSSL's assembly routines
under the hood, which will detect and use SHA intrinsics if they are
supported by the CPU.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27390
Make room for adding arm64 support to this driver by moving the
x86-specific feature parsing to a separate file.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27388
OCF drivers in general should perform as many session parameter checks
as possible during probesession rather than when creating a new
session. I got this wrong for aesni(4) in r359374. In addition,
aesni(4) was performing the check for digest-only requests and failing
to create digest-only sessions as a result.
Reported by: jkim
Tested by: jkim
Sponsored by: Chelsio Communications
Add missing break to prevent falling through to the default case statement
and returning EINVAL for all session configs.
Sponsored by: Ampere Computing
Submitted by: Klara, Inc.
Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386. It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.
Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.
Reviewed by: gallatin, jkim, delphij, gnn
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26821
This patch adds support for IPsec ESN (Extended Sequence Numbers) in
encrypt and authenticate mode (eg. AES-CBC and SHA256) and combined mode
(eg. AES-GCM).
For the encrypt and authenticate mode the ESN is stored in separate
crp_esn buffer because the high-order 32 bits of the sequence number are
appended after the Next Header (RFC 4303).
For the combined modes the high-order 32 bits of the sequence number
[e.g. RFC 4106, Chapter 5 AAD Construction] are part of crp_aad
(prepared by netipsec layer in case of ESN support enabled), therefore
non visible diff around combined modes.
Submitted by: Grzegorz Jaszczyk <jaz@semihalf.com>
Patryk Duda <pdk@semihalf.com>
Reviewed by: jhb
Differential revision: https://reviews.freebsd.org/D22365
Obtained from: Semihalf
Sponsored by: Stormshield
Weirdly, I needed to sprinkle more parens here to get gcc-as in 6.4
to correctly generate things.
Without them, I'd get an unknown variable reference to SKEIN_ASM_UNROLL1024.
This at least links now, but I haven't run any test cases against it.
It may be worthwhile doing it in case gcc-as demands we liberally sprinkle
more brackets around variables in .if statements.
Thanks to ed for the suggestion of just sprinkling more brackets to
see if that helped.
Reviewed by: emaste
For some reason I don't want to really understand, the following
happens with gnu as.
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S: Assembler messages:
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:466: Error: found '(', expected: ')'
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:466: Error: junk at end of line, first unrecognized character is `('
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:795: Error: found '(', expected: ')'
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:795: Error: junk at end of line, first unrecognized character is `('
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
/home/adrian/git/freebsd/src/sys/crypto/skein/amd64/skein_block_asm.S:885: Error: non-constant expression in ".if" statement
After an exhaustive search and experimentation at 11pm, I discovered that
putting them in parentheses fixes the compilation.
Ed pointed out that I could likely fix this in a bunch of other
locations but I'd rather leave these alone until other options
are enabled.
Tested:
* gcc-6, amd64
Reviewed by: emaste
arm64 has a similar wrapper. This permits defining <machine/fpu.h> as
the standard header for fpu_kern_*.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D26753
The assembly implementation incorrectly used logical AND instead of
bitwise AND. Fix, and re-enable in libmd.
Submitted by: Yang Zhong <yzhong@freebsdfoundation.org>
Reviewed by: cem (earlier)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26614
The cryptodev_process() method should either return 0 if it has
completed a request, or ERESTART to defer the request until later. If
a request encounters an error, the error should be reported via
crp_etype before completing the request via crypto_done().
Fix a few more drivers noticed by asomers@ similar to the fix in
r365389. This is an old bug, but went unnoticed since crypto requests
did not start failing as a normal part of operation until digest
verification was introduced which can fail requests with EBADMSG.
PR: 247986
Reported by: asomers
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D26361
crypto(9) functions can now be used on buffers composed of an array of
vm_page_t structures, such as those stored in an unmapped struct bio. It
requires the running to kernel to support the direct memory map, so not all
architectures can use it.
Reviewed by: markj, kib, jhb, mjg, mat, bcr (manpages)
MFC after: 1 week
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D25671
Like other types of allocation, fpu_kern_ctx are frequently allocated per-cpu.
Provide the API and sketch some example consumers.
fpu_kern_alloc_ctx_domain() preferentially allocates memory from the
provided domain, and falls back to other domains if that one is empty
(DOMAINSET_PREF(domain) policy).
Maybe it makes more sense to just shove one of these in the DPCPU area
sooner or later -- left for future work.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D22053
These bzero's should have been explicit_bzero's.
Reviewed by: cem, delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25437
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().
Suggested by: cem
Reviewed by: cem, delphij
Approved by: csprng (cem)
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25435
The amount to copy for the first block is the minimum of the size of
the AAD region or the remaining space in the first block.
Reported by: cryptocheck -z
MFC after: 2 weeks
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D25140
- crypto_apply() is only used for reading a buffer to compute a
digest, so change the data pointer to a const pointer.
- To better match m_apply(), change the data pointer type to void *
and the length from uint16_t to u_int. The length field in
particular matters as none of the apply logic was splitting requests
larger than UINT16_MAX.
- Adjust the auth_xform Update callback to match the function
prototype passed to crypto_apply() and crypto_apply_buf(). This
removes the needs for casts when using the Update callback.
- Change the Reinit and Setkey callbacks to also use a u_int length
instead of uint16_t.
- Update auth transforms for the changes. While here, use C99
initializers for auth_hash structures and avoid casts on callbacks.
Reviewed by: cem
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25171
Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.
While here, simplify the logic in GELI a bit for determing which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.
Reviewed by: delphij
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D25126
Comparing the object files produced by GNU as 2.17.50 and Clang IAS
shows many immaterial changes in strtab etc., and one material change
in .text:
1bac: 4c 8b 4f 18 mov 0x18(%rdi),%r9
1bb0: eb 0e jmp 1bc0 <Skein1024_block_loop>
- 1bb2: 66 66 2e 0f 1f 84 00 data16 nopw %cs:0x0(%rax,%rax,1)
- 1bb9: 00 00 00 00
- 1bbd: 0f 1f 00 nopl (%rax)
+ 1bb2: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
+ 1bb9: 00 00 00
+ 1bbc: 0f 1f 40 00 nopl 0x0(%rax)
0000000000001bc0 <Skein1024_block_loop>:
Skein1024_block_loop():
1bc0: 4c 8b 47 10 mov 0x10(%rdi),%r8
1bc4: 4c 03 85 c0 00 00 00 add 0xc0(%rbp),%r8
That is, GNU as and Clang's integrated assembler use different multi-
byte NOPs for alignment (GNU as emits an 11 byte NOP + a 3 byte NOP,
while Clang IAS emits a 10 byte NOP + a 4 byte NOP).
Dependency cleanup hacks are not required, because we do not create
.depend files from GNU as.
Reviewed by: allanjude, arichardson, cem, tsoome
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8434
Clang IAS does not support the --defsym argument, and
.ifndef SKEIN_USE_ASM
gets turned into
.ifndef 1792
by the preprocessor, which results in
error: expected identifier after '.ifdef'
.ifndef 1792
^
Use #ifdef instead, which still works with GNU as.
Reviewed by: cem
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25154
GNU as seems to allow macro arguments without the '\' but clang is more
strict in that regard.
This change makes the source code compatible with LLVM's but does not yet
change the build system or rename it to .S.
The new code assembles identically with GNU as 2.17.50.
Reviewed By: emaste
Differential Revision: https://reviews.freebsd.org/D25143
r359374 introduced crypto_apply function which takes as argument a function pointer
that is expected to return an int, however aesni hash update functions
return void.
Because of that the function pointer passed was simply cast with
its return value changed.
This resulted in undefined behavior, in particular when mbuf is used, (ipsec)
m_apply checks return value of function pointer passed to it
and in our case bogusly fails after calculating hash of the first mbuf
in chain.
Fix it by changing signatures of sha update routines in aesni and
dropping the casts.
Submitted by: Kornel Duleba
Reviewed by: jhb, cem
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D25030
The backend routines aesni(4) call for specific encryption modes all
expect virtually contiguous input/output buffers. If the existing
output buffer is virtually contiguous, always write to the output
buffer directly from the mode-specific routines. If the output buffer
is not contiguous, then a temporary buffer is allocated whose output
is then copied to the output buffer. If the input buffer is not
contiguous, then the existing buffer used to hold the input is also
used to hold temporary output.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545
Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer. Using a separate output buffer avoids this copy.
- Create a new 'struct crypto_buffer' describing a crypto buffer
containing a type and type-specific fields. crp_ilen is gone,
instead buffers that use a flat kernel buffer have a cb_buf_len
field for their length. The length of other buffer types is
inferred from the backing store (e.g. uio_resid for a uio).
Requests now have two such structures: crp_buf for the input buffer,
and crp_obuf for the output buffer.
- Consumers now use helper functions (crypto_use_*,
e.g. crypto_use_mbuf()) to configure the input buffer. If an output
buffer is not configured, the request still modifies the input
buffer in-place. A consumer uses a second set of helper functions
(crypto_use_output_*) to configure an output buffer.
- Consumers must request support for separate output buffers when
creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
only permitted to queue a request with a separate output buffer on
sessions with this flag set. Existing drivers already reject
sessions with unknown flags, so this permits drivers to be modified
to support this extension without requiring all drivers to change.
- Several data-related functions now have matching versions that
operate on an explicit buffer (e.g. crypto_apply_buf,
crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).
- Most of the existing data-related functions operate on the input
buffer. However crypto_copyback always writes to the output buffer
if a request uses a separate output buffer.
- For the regions in input/output buffers, the following conventions
are followed:
- AAD and IV are always present in input only and their
fields are offsets into the input buffer.
- payload is always present in both buffers. If a request uses a
separate output buffer, it must set a new crp_payload_start_output
field to the offset of the payload in the output buffer.
- digest is in the input buffer for verify operations, and in the
output buffer for compute operations. crp_digest_start is relative
to the appropriate buffer.
- Add a crypto buffer cursor abstraction. This is a more general form
of some bits in the cryptosoft driver that tried to always use uio's.
However, compared to the original code, this avoids rewalking the uio
iovec array for requests with multiple vectors. It also avoids
allocate an iovec array for mbufs and populating it by instead walking
the mbuf chain directly.
- Update the cryptosoft(4) driver to support separate output buffers
making use of the cursor abstraction.
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D24545