my tests, it is ~20% faster, and even on an old 533MHz IXP425 it is ~45%
faster. This is partly due to loop unrolling, so the code size does
increase significantly. I do plan on committing a version that
rolls the loops back up for smaller code size on embedded systems
where size is more important than absolute performance (it'll save ~6k
of code).
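
To illustrate the unrolling trade-off mentioned above, here is a minimal
sketch using only the SHA-256 message schedule from FIPS 180-4. It is not
the code from this commit: the names rotr32, MSCH, schedule_rolled,
schedule_unrolled and schedules_match are hypothetical, and the unroll
factor of 8 is an assumption chosen for the example. The rolled form
emits one copy of the step, the unrolled form emits eight copies per
iteration, which is where the extra code size comes from.

```c
#include <assert.h>	/* for callers that assert on schedules_match() */
#include <stdint.h>
#include <string.h>

/* Rotate right; n must be in 1..31. */
static inline uint32_t
rotr32(uint32_t x, unsigned n)
{
	return (x >> n) | (x << (32 - n));
}

/* SHA-256 small sigma functions (FIPS 180-4). */
#define s0(x)	(rotr32((x), 7) ^ rotr32((x), 18) ^ ((x) >> 3))
#define s1(x)	(rotr32((x), 17) ^ rotr32((x), 19) ^ ((x) >> 10))

/* One message-schedule step: derive W[i] from four earlier words. */
#define MSCH(W, i)						\
	((W)[(i)] = s1((W)[(i) - 2]) + (W)[(i) - 7] +		\
	    s0((W)[(i) - 15]) + (W)[(i) - 16])

/* Rolled: one copy of the step, 48 loop iterations (smaller code). */
static void
schedule_rolled(uint32_t W[64])
{
	for (int i = 16; i < 64; i++)
		MSCH(W, i);
}

/* Unrolled by 8 (hypothetical factor): 6 iterations, 8 copies of the step. */
static void
schedule_unrolled(uint32_t W[64])
{
	for (int i = 16; i < 64; i += 8) {
		MSCH(W, i + 0);
		MSCH(W, i + 1);
		MSCH(W, i + 2);
		MSCH(W, i + 3);
		MSCH(W, i + 4);
		MSCH(W, i + 5);
		MSCH(W, i + 6);
		MSCH(W, i + 7);
	}
}

/* Returns 1 if both variants expand the same 16 input words identically. */
static int
schedules_match(void)
{
	uint32_t a[64], b[64];

	/* Arbitrary, non-secret filler for the first 16 words. */
	for (int i = 0; i < 16; i++)
		a[i] = b[i] = 0x9e3779b9u * (uint32_t)(i + 1);
	schedule_rolled(a);
	schedule_unrolled(b);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Both functions compute the same result, so a size-constrained build could
select the rolled variant with a compile-time switch and pay only in
speed.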
The kernel implementation is now shared with userland's libcrypt and
libmd.
We drop support for sha256 from sha2.c, so sha2.c now contains only
sha384 and sha512.
Reviewed by: secteam@