Commit Graph

5 Commits

Author SHA1 Message Date
Warner Losh
52467047aa Regularize the Netflix copyright
Use recent best practices for Copyright form at the top of
the license:
1. Remove all the All Rights Reserved clauses on our stuff. Where we
   piggybacked others, use a separate line to make things clear.
2. Use "Netflix, Inc." everywhere.
3. Use a single line for the copyright for grep friendliness.
4. Use date ranges in all places for our stuff.

Approved by: Netflix Legal (who gave me the form), adrian@ (pmc files)
2019-02-04 21:28:25 +00:00
John-Mark Gurney
a13589bc47 unroll the loop slightly... This improves performance enough to
justify, especially for CBC performance where we can't pipeline..  I
don't happen to have my measurements handy though...

Sponsored by:	Netflix, Inc.
2015-07-07 20:31:09 +00:00
Craig Rodrigues
800be1b6f9 In the version of gcc in the FreeBSD tree, this modification was made to
the compiler in svn r242182:

#if STDC_HOSTED
#include <mm_malloc.h>
#endif

A similar change was done to clang in the FreeBSD tree in svn r218893:

However, for external gcc toolchains, this patch is not in the compiler's header
file.

This patch to FreeBSD's aesni code allows compilation with an external
gcc toolchain.

Differential Revision: https://reviews.freebsd.org/D2285
Reviewed by:  jmg, dim
Approved by:  dim
2015-04-16 17:42:52 +00:00
John-Mark Gurney
038ffd3e43 make it so that from/to can be missaligned as it can happen (the geli
regression manages to do it)...  We use a packed struct to coerce
gcc/clang into producing unaligned loads (there is not packed pointer
attribute, otherwise this would be easier)...

use _storeu_ and _loadu_ when using the structure is overkill...

be better at using types properly...  Since we allocate our own key
schedule and make sure it's aligned, use the __m128i type in various
arguments to functions...

clang ignores __aligned on prototypes and gcc errors on them, leave them
in comments to document that these function arguments are require to be
aligned...

about all that changes is movdqa -> movdqu from reading the diff of the
disassembly output...

Noticed by:	symbolics at gmx.com
MFC after:	3 days
2013-11-06 19:14:49 +00:00
John-Mark Gurney
ff6c7bf5ca Use the fact that the AES-NI instructions can be pipelined to improve
performance... Use SSE2 instructions for calculating the XTS tweek
factor...  Let the compiler do more work and handle register allocation
by using intrinsics, now only the key schedule is in assembly...

Replace .byte hard coded instructions w/ the proper instructions now
that both clang and gcc support them...

On my machine, pulling the code to userland I saw performance go from
~150MB/sec to 2GB/sec in XTS mode.  GELI on GNOP saw a more modest
increase of about 3x due to other system overhead (geom and
opencrypto)...

These changes allow almost full disk io rate w/ geli...

Reviewed by:	-current, -security
Thanks to:	Mike Hamburg for the XTS tweek algorithm
2013-09-03 18:31:23 +00:00