Import OpenSSL 1.0.2d.

This commit is contained in:
Jung-uk Kim 2015-10-23 19:46:02 +00:00
parent c07d7b3a38
commit e9fcefce9b
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/vendor-crypto/openssl/dist/; revision=289848
svn path=/vendor-crypto/openssl/1.0.2d/; revision=289849; tag=vendor/openssl/1.0.2d
490 changed files with 94556 additions and 13367 deletions

437
CHANGES
View File

@ -2,7 +2,7 @@
OpenSSL CHANGES
_______________
Changes between 1.0.1o and 1.0.1p [9 Jul 2015]
Changes between 1.0.2c and 1.0.2d [9 Jul 2015]
*) Alternate chains certificate forgery
@ -17,13 +17,13 @@
(Google/BoringSSL).
[Matt Caswell]
Changes between 1.0.1n and 1.0.1o [12 Jun 2015]
Changes between 1.0.2b and 1.0.2c [12 Jun 2015]
*) Fix HMAC ABI incompatibility. The previous version introduced an ABI
incompatibility in the handling of HMAC. The previous ABI has now been
restored.
Changes between 1.0.1m and 1.0.1n [11 Jun 2015]
Changes between 1.0.2a and 1.0.2b [11 Jun 2015]
*) Malformed ECParameters causes infinite loop
@ -91,10 +91,65 @@
(CVE-2015-1791)
[Matt Caswell]
*) Removed support for the two export grade static DH ciphersuites
EXP-DH-RSA-DES-CBC-SHA and EXP-DH-DSS-DES-CBC-SHA. These two ciphersuites
were newly added (along with a number of other static DH ciphersuites) to
1.0.2. However the two export ones have *never* worked since they were
introduced. It seems strange in any case to be adding new export
ciphersuites, and given "logjam" it also does not seem correct to fix them.
[Matt Caswell]
*) Only support 256-bit or stronger elliptic curves with the
'ecdh_auto' setting (server) or by default (client). Of supported
curves, prefer P-256 (both).
[Emilia Kasper]
*) Reject DH handshakes with parameters shorter than 768 bits.
[Kurt Roeckx and Emilia Kasper]
Changes between 1.0.1l and 1.0.1m [19 Mar 2015]
Changes between 1.0.2 and 1.0.2a [19 Mar 2015]
*) ClientHello sigalgs DoS fix
If a client connects to an OpenSSL 1.0.2 server and renegotiates with an
invalid signature algorithms extension a NULL pointer dereference will
occur. This can be exploited in a DoS attack against the server.
This issue was was reported to OpenSSL by David Ramos of Stanford
University.
(CVE-2015-0291)
[Stephen Henson and Matt Caswell]
*) Multiblock corrupted pointer fix
OpenSSL 1.0.2 introduced the "multiblock" performance improvement. This
feature only applies on 64 bit x86 architecture platforms that support AES
NI instructions. A defect in the implementation of "multiblock" can cause
OpenSSL's internal write buffer to become incorrectly set to NULL when
using non-blocking IO. Typically, when the user application is using a
socket BIO for writing, this will only result in a failed connection.
However if some other BIO is used then it is likely that a segmentation
fault will be triggered, thus enabling a potential DoS attack.
This issue was reported to OpenSSL by Daniel Danner and Rainer Mueller.
(CVE-2015-0290)
[Matt Caswell]
*) Segmentation fault in DTLSv1_listen fix
The DTLSv1_listen function is intended to be stateless and processes the
initial ClientHello from many peers. It is common for user code to loop
over the call to DTLSv1_listen until a valid ClientHello is received with
an associated cookie. A defect in the implementation of DTLSv1_listen means
that state is preserved in the SSL object from one invocation to the next
that can lead to a segmentation fault. Errors processing the initial
ClientHello can trigger this scenario. An example of such an error could be
that a DTLS1.0 only client is attempting to connect to a DTLS1.2 only
server.
This issue was reported to OpenSSL by Per Allansson.
(CVE-2015-0207)
[Matt Caswell]
*) Segmentation fault in ASN1_TYPE_cmp fix
@ -107,6 +162,20 @@
(CVE-2015-0286)
[Stephen Henson]
*) Segmentation fault for invalid PSS parameters fix
The signature verification routines will crash with a NULL pointer
dereference if presented with an ASN.1 signature using the RSA PSS
algorithm and invalid parameters. Since these routines are used to verify
certificate signature algorithms this can be used to crash any
certificate verification operation and exploited in a DoS attack. Any
application which performs certificate verification is vulnerable including
OpenSSL clients and servers which enable client authentication.
This issue was was reported to OpenSSL by Brian Carpenter.
(CVE-2015-0208)
[Stephen Henson]
*) ASN.1 structure reuse memory corruption fix
Reusing a structure in ASN.1 parsing may allow an attacker to cause
@ -145,6 +214,36 @@
(CVE-2015-0293)
[Emilia Käsper]
*) Empty CKE with client auth and DHE fix
If client auth is used then a server can seg fault in the event of a DHE
ciphersuite being selected and a zero length ClientKeyExchange message
being sent by the client. This could be exploited in a DoS attack.
(CVE-2015-1787)
[Matt Caswell]
*) Handshake with unseeded PRNG fix
Under certain conditions an OpenSSL 1.0.2 client can complete a handshake
with an unseeded PRNG. The conditions are:
- The client is on a platform where the PRNG has not been seeded
automatically, and the user has not seeded manually
- A protocol specific client method version has been used (i.e. not
SSL_client_methodv23)
- A ciphersuite is used that does not require additional random data from
the PRNG beyond the initial ClientHello client random (e.g. PSK-RC4-SHA).
If the handshake succeeds then the client random that has been used will
have been generated from a PRNG with insufficient entropy and therefore the
output may be predictable.
For example using the following command with an unseeded openssl will
succeed on an unpatched platform:
openssl s_client -psk 1a2b3c4d -tls1_2 -cipher PSK-RC4-SHA
(CVE-2015-0285)
[Matt Caswell]
*) Use After Free following d2i_ECPrivatekey error fix
A malformed EC private key file consumed via the d2i_ECPrivateKey function
@ -171,6 +270,336 @@
*) Removed the export ciphers from the DEFAULT ciphers
[Kurt Roeckx]
Changes between 1.0.1l and 1.0.2 [22 Jan 2015]
*) Facilitate "universal" ARM builds targeting range of ARM ISAs, e.g.
ARMv5 through ARMv8, as opposite to "locking" it to single one.
So far those who have to target multiple plaforms would compromise
and argue that binary targeting say ARMv5 would still execute on
ARMv8. "Universal" build resolves this compromise by providing
near-optimal performance even on newer platforms.
[Andy Polyakov]
*) Accelerated NIST P-256 elliptic curve implementation for x86_64
(other platforms pending).
[Shay Gueron & Vlad Krasnov (Intel Corp), Andy Polyakov]
*) Add support for the SignedCertificateTimestampList certificate and
OCSP response extensions from RFC6962.
[Rob Stradling]
*) Fix ec_GFp_simple_points_make_affine (thus, EC_POINTs_mul etc.)
for corner cases. (Certain input points at infinity could lead to
bogus results, with non-infinity inputs mapped to infinity too.)
[Bodo Moeller]
*) Initial support for PowerISA 2.0.7, first implemented in POWER8.
This covers AES, SHA256/512 and GHASH. "Initial" means that most
common cases are optimized and there still is room for further
improvements. Vector Permutation AES for Altivec is also added.
[Andy Polyakov]
*) Add support for little-endian ppc64 Linux target.
[Marcelo Cerri (IBM)]
*) Initial support for AMRv8 ISA crypto extensions. This covers AES,
SHA1, SHA256 and GHASH. "Initial" means that most common cases
are optimized and there still is room for further improvements.
Both 32- and 64-bit modes are supported.
[Andy Polyakov, Ard Biesheuvel (Linaro)]
*) Improved ARMv7 NEON support.
[Andy Polyakov]
*) Support for SPARC Architecture 2011 crypto extensions, first
implemented in SPARC T4. This covers AES, DES, Camellia, SHA1,
SHA256/512, MD5, GHASH and modular exponentiation.
[Andy Polyakov, David Miller]
*) Accelerated modular exponentiation for Intel processors, a.k.a.
RSAZ.
[Shay Gueron & Vlad Krasnov (Intel Corp)]
*) Support for new and upcoming Intel processors, including AVX2,
BMI and SHA ISA extensions. This includes additional "stitched"
implementations, AESNI-SHA256 and GCM, and multi-buffer support
for TLS encrypt.
This work was sponsored by Intel Corp.
[Andy Polyakov]
*) Support for DTLS 1.2. This adds two sets of DTLS methods: DTLS_*_method()
supports both DTLS 1.2 and 1.0 and should use whatever version the peer
supports and DTLSv1_2_*_method() which supports DTLS 1.2 only.
[Steve Henson]
*) Use algorithm specific chains in SSL_CTX_use_certificate_chain_file():
this fixes a limiation in previous versions of OpenSSL.
[Steve Henson]
*) Extended RSA OAEP support via EVP_PKEY API. Options to specify digest,
MGF1 digest and OAEP label.
[Steve Henson]
*) Add EVP support for key wrapping algorithms, to avoid problems with
existing code the flag EVP_CIPHER_CTX_WRAP_ALLOW has to be set in
the EVP_CIPHER_CTX or an error is returned. Add AES and DES3 wrap
algorithms and include tests cases.
[Steve Henson]
*) Add functions to allocate and set the fields of an ECDSA_METHOD
structure.
[Douglas E. Engert, Steve Henson]
*) New functions OPENSSL_gmtime_diff and ASN1_TIME_diff to find the
difference in days and seconds between two tm or ASN1_TIME structures.
[Steve Henson]
*) Add -rev test option to s_server to just reverse order of characters
received by client and send back to server. Also prints an abbreviated
summary of the connection parameters.
[Steve Henson]
*) New option -brief for s_client and s_server to print out a brief summary
of connection parameters.
[Steve Henson]
*) Add callbacks for arbitrary TLS extensions.
[Trevor Perrin <trevp@trevp.net> and Ben Laurie]
*) New option -crl_download in several openssl utilities to download CRLs
from CRLDP extension in certificates.
[Steve Henson]
*) New options -CRL and -CRLform for s_client and s_server for CRLs.
[Steve Henson]
*) New function X509_CRL_diff to generate a delta CRL from the difference
of two full CRLs. Add support to "crl" utility.
[Steve Henson]
*) New functions to set lookup_crls function and to retrieve
X509_STORE from X509_STORE_CTX.
[Steve Henson]
*) Print out deprecated issuer and subject unique ID fields in
certificates.
[Steve Henson]
*) Extend OCSP I/O functions so they can be used for simple general purpose
HTTP as well as OCSP. New wrapper function which can be used to download
CRLs using the OCSP API.
[Steve Henson]
*) Delegate command line handling in s_client/s_server to SSL_CONF APIs.
[Steve Henson]
*) SSL_CONF* functions. These provide a common framework for application
configuration using configuration files or command lines.
[Steve Henson]
*) SSL/TLS tracing code. This parses out SSL/TLS records using the
message callback and prints the results. Needs compile time option
"enable-ssl-trace". New options to s_client and s_server to enable
tracing.
[Steve Henson]
*) New ctrl and macro to retrieve supported points extensions.
Print out extension in s_server and s_client.
[Steve Henson]
*) New functions to retrieve certificate signature and signature
OID NID.
[Steve Henson]
*) Add functions to retrieve and manipulate the raw cipherlist sent by a
client to OpenSSL.
[Steve Henson]
*) New Suite B modes for TLS code. These use and enforce the requirements
of RFC6460: restrict ciphersuites, only permit Suite B algorithms and
only use Suite B curves. The Suite B modes can be set by using the
strings "SUITEB128", "SUITEB192" or "SUITEB128ONLY" for the cipherstring.
[Steve Henson]
*) New chain verification flags for Suite B levels of security. Check
algorithms are acceptable when flags are set in X509_verify_cert.
[Steve Henson]
*) Make tls1_check_chain return a set of flags indicating checks passed
by a certificate chain. Add additional tests to handle client
certificates: checks for matching certificate type and issuer name
comparison.
[Steve Henson]
*) If an attempt is made to use a signature algorithm not in the peer
preference list abort the handshake. If client has no suitable
signature algorithms in response to a certificate request do not
use the certificate.
[Steve Henson]
*) If server EC tmp key is not in client preference list abort handshake.
[Steve Henson]
*) Add support for certificate stores in CERT structure. This makes it
possible to have different stores per SSL structure or one store in
the parent SSL_CTX. Include distint stores for certificate chain
verification and chain building. New ctrl SSL_CTRL_BUILD_CERT_CHAIN
to build and store a certificate chain in CERT structure: returing
an error if the chain cannot be built: this will allow applications
to test if a chain is correctly configured.
Note: if the CERT based stores are not set then the parent SSL_CTX
store is used to retain compatibility with existing behaviour.
[Steve Henson]
*) New function ssl_set_client_disabled to set a ciphersuite disabled
mask based on the current session, check mask when sending client
hello and checking the requested ciphersuite.
[Steve Henson]
*) New ctrls to retrieve and set certificate types in a certificate
request message. Print out received values in s_client. If certificate
types is not set with custom values set sensible values based on
supported signature algorithms.
[Steve Henson]
*) Support for distinct client and server supported signature algorithms.
[Steve Henson]
*) Add certificate callback. If set this is called whenever a certificate
is required by client or server. An application can decide which
certificate chain to present based on arbitrary criteria: for example
supported signature algorithms. Add very simple example to s_server.
This fixes many of the problems and restrictions of the existing client
certificate callback: for example you can now clear an existing
certificate and specify the whole chain.
[Steve Henson]
*) Add new "valid_flags" field to CERT_PKEY structure which determines what
the certificate can be used for (if anything). Set valid_flags field
in new tls1_check_chain function. Simplify ssl_set_cert_masks which used
to have similar checks in it.
Add new "cert_flags" field to CERT structure and include a "strict mode".
This enforces some TLS certificate requirements (such as only permitting
certificate signature algorithms contained in the supported algorithms
extension) which some implementations ignore: this option should be used
with caution as it could cause interoperability issues.
[Steve Henson]
*) Update and tidy signature algorithm extension processing. Work out
shared signature algorithms based on preferences and peer algorithms
and print them out in s_client and s_server. Abort handshake if no
shared signature algorithms.
[Steve Henson]
*) Add new functions to allow customised supported signature algorithms
for SSL and SSL_CTX structures. Add options to s_client and s_server
to support them.
[Steve Henson]
*) New function SSL_certs_clear() to delete all references to certificates
from an SSL structure. Before this once a certificate had been added
it couldn't be removed.
[Steve Henson]
*) Integrate hostname, email address and IP address checking with certificate
verification. New verify options supporting checking in opensl utility.
[Steve Henson]
*) Fixes and wildcard matching support to hostname and email checking
functions. Add manual page.
[Florian Weimer (Red Hat Product Security Team)]
*) New functions to check a hostname email or IP address against a
certificate. Add options x509 utility to print results of checks against
a certificate.
[Steve Henson]
*) Fix OCSP checking.
[Rob Stradling <rob.stradling@comodo.com> and Ben Laurie]
*) Initial experimental support for explicitly trusted non-root CAs.
OpenSSL still tries to build a complete chain to a root but if an
intermediate CA has a trust setting included that is used. The first
setting is used: whether to trust (e.g., -addtrust option to the x509
utility) or reject.
[Steve Henson]
*) Add -trusted_first option which attempts to find certificates in the
trusted store even if an untrusted chain is also supplied.
[Steve Henson]
*) MIPS assembly pack updates: support for MIPS32r2 and SmartMIPS ASE,
platform support for Linux and Android.
[Andy Polyakov]
*) Support for linux-x32, ILP32 environment in x86_64 framework.
[Andy Polyakov]
*) Experimental multi-implementation support for FIPS capable OpenSSL.
When in FIPS mode the approved implementations are used as normal,
when not in FIPS mode the internal unapproved versions are used instead.
This means that the FIPS capable OpenSSL isn't forced to use the
(often lower perfomance) FIPS implementations outside FIPS mode.
[Steve Henson]
*) Transparently support X9.42 DH parameters when calling
PEM_read_bio_DHparameters. This means existing applications can handle
the new parameter format automatically.
[Steve Henson]
*) Initial experimental support for X9.42 DH parameter format: mainly
to support use of 'q' parameter for RFC5114 parameters.
[Steve Henson]
*) Add DH parameters from RFC5114 including test data to dhtest.
[Steve Henson]
*) Support for automatic EC temporary key parameter selection. If enabled
the most preferred EC parameters are automatically used instead of
hardcoded fixed parameters. Now a server just has to call:
SSL_CTX_set_ecdh_auto(ctx, 1) and the server will automatically
support ECDH and use the most appropriate parameters.
[Steve Henson]
*) Enhance and tidy EC curve and point format TLS extension code. Use
static structures instead of allocation if default values are used.
New ctrls to set curves we wish to support and to retrieve shared curves.
Print out shared curves in s_server. New options to s_server and s_client
to set list of supported curves.
[Steve Henson]
*) New ctrls to retrieve supported signature algorithms and
supported curve values as an array of NIDs. Extend openssl utility
to print out received values.
[Steve Henson]
*) Add new APIs EC_curve_nist2nid and EC_curve_nid2nist which convert
between NIDs and the more common NIST names such as "P-256". Enhance
ecparam utility and ECC method to recognise the NIST names for curves.
[Steve Henson]
*) Enhance SSL/TLS certificate chain handling to support different
chains for each certificate instead of one chain in the parent SSL_CTX.
[Steve Henson]
*) Support for fixed DH ciphersuite client authentication: where both
server and client use DH certificates with common parameters.
[Steve Henson]
*) Support for fixed DH ciphersuites: those requiring DH server
certificates.
[Steve Henson]
*) New function i2d_re_X509_tbs for re-encoding the TBS portion of
the certificate.
Note: Related 1.0.2-beta specific macros X509_get_cert_info,
X509_CINF_set_modified, X509_CINF_get_issuer, X509_CINF_get_extensions and
X509_CINF_get_signature were reverted post internal team review.
Changes between 1.0.1k and 1.0.1l [15 Jan 2015]
*) Build fixes for the Windows and OpenVMS platforms

224
Configure
View File

@ -105,6 +105,25 @@ my $usage="Usage: Configure [no-<cipher> ...] [enable-<cipher> ...] [experimenta
my $gcc_devteam_warn = "-Wall -pedantic -DPEDANTIC -Wno-long-long -Wsign-compare -Wmissing-prototypes -Wshadow -Wformat -Werror -DCRYPTO_MDEBUG_ALL -DCRYPTO_MDEBUG_ABORT -DREF_CHECK -DOPENSSL_NO_DEPRECATED";
# TODO(openssl-team): fix problems and investigate if (at least) the following
# warnings can also be enabled:
# -Wconditional-uninitialized, -Wswitch-enum, -Wunused-macros,
# -Wmissing-field-initializers, -Wmissing-variable-declarations,
# -Wincompatible-pointer-types-discards-qualifiers, -Wcast-align,
# -Wunreachable-code -Wunused-parameter -Wlanguage-extension-token
# -Wextended-offsetof
my $clang_disabled_warnings = "-Wno-unused-parameter -Wno-missing-field-initializers -Wno-language-extension-token -Wno-extended-offsetof";
# These are used in addition to $gcc_devteam_warn when the compiler is clang.
# TODO(openssl-team): fix problems and investigate if (at least) the
# following warnings can also be enabled: -Wconditional-uninitialized,
# -Wswitch-enum, -Wunused-macros, -Wmissing-field-initializers,
# -Wmissing-variable-declarations,
# -Wincompatible-pointer-types-discards-qualifiers, -Wcast-align,
# -Wunreachable-code -Wunused-parameter -Wlanguage-extension-token
# -Wextended-offsetof
my $clang_devteam_warn = "-Wno-unused-parameter -Wno-missing-field-initializers -Wno-language-extension-token -Wno-extended-offsetof -Qunused-arguments";
my $strict_warnings = 0;
my $x86_gcc_des="DES_PTR DES_RISC1 DES_UNROLL";
@ -124,24 +143,25 @@ my $tlib="-lnsl -lsocket";
my $bits1="THIRTY_TWO_BIT ";
my $bits2="SIXTY_FOUR_BIT ";
my $x86_asm="x86cpuid.o:bn-586.o co-586.o x86-mont.o x86-gf2m.o:des-586.o crypt586.o:aes-586.o vpaes-x86.o aesni-x86.o:bf-586.o:md5-586.o:sha1-586.o sha256-586.o sha512-586.o:cast-586.o:rc4-586.o:rmd-586.o:rc5-586.o:wp_block.o wp-mmx.o:cmll-x86.o:ghash-x86.o:";
my $x86_asm="x86cpuid.o:bn-586.o co-586.o x86-mont.o x86-gf2m.o::des-586.o crypt586.o:aes-586.o vpaes-x86.o aesni-x86.o:bf-586.o:md5-586.o:sha1-586.o sha256-586.o sha512-586.o:cast-586.o:rc4-586.o:rmd-586.o:rc5-586.o:wp_block.o wp-mmx.o:cmll-x86.o:ghash-x86.o:";
my $x86_elf_asm="$x86_asm:elf";
my $x86_64_asm="x86_64cpuid.o:x86_64-gcc.o x86_64-mont.o x86_64-mont5.o x86_64-gf2m.o modexp512-x86_64.o::aes-x86_64.o vpaes-x86_64.o bsaes-x86_64.o aesni-x86_64.o aesni-sha1-x86_64.o::md5-x86_64.o:sha1-x86_64.o sha256-x86_64.o sha512-x86_64.o::rc4-x86_64.o rc4-md5-x86_64.o:::wp-x86_64.o:cmll-x86_64.o cmll_misc.o:ghash-x86_64.o:";
my $ia64_asm="ia64cpuid.o:bn-ia64.o ia64-mont.o::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o::rc4-ia64.o rc4_skey.o:::::ghash-ia64.o::void";
my $sparcv9_asm="sparcv9cap.o sparccpuid.o:bn-sparcv9.o sparcv9-mont.o sparcv9a-mont.o:des_enc-sparc.o fcrypt_b.o:aes_core.o aes_cbc.o aes-sparcv9.o:::sha1-sparcv9.o sha256-sparcv9.o sha512-sparcv9.o:::::::ghash-sparcv9.o::void";
my $sparcv8_asm=":sparcv8.o:des_enc-sparc.o fcrypt_b.o:::::::::::::void";
my $alpha_asm="alphacpuid.o:bn_asm.o alpha-mont.o:::::sha1-alpha.o:::::::ghash-alpha.o::void";
my $mips32_asm=":bn-mips.o::aes_cbc.o aes-mips.o:::sha1-mips.o sha256-mips.o::::::::";
my $mips64_asm=":bn-mips.o mips-mont.o::aes_cbc.o aes-mips.o:::sha1-mips.o sha256-mips.o sha512-mips.o::::::::";
my $s390x_asm="s390xcap.o s390xcpuid.o:bn-s390x.o s390x-mont.o s390x-gf2m.o::aes-s390x.o aes-ctr.o aes-xts.o:::sha1-s390x.o sha256-s390x.o sha512-s390x.o::rc4-s390x.o:::::ghash-s390x.o:";
my $armv4_asm="armcap.o armv4cpuid.o:bn_asm.o armv4-mont.o armv4-gf2m.o::aes_cbc.o aes-armv4.o:::sha1-armv4-large.o sha256-armv4.o sha512-armv4.o:::::::ghash-armv4.o::void";
my $parisc11_asm="pariscid.o:bn_asm.o parisc-mont.o::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::32";
my $parisc20_asm="pariscid.o:pa-risc2W.o parisc-mont.o::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::64";
my $ppc32_asm="ppccpuid.o ppccap.o:bn-ppc.o ppc-mont.o ppc64-mont.o::aes_core.o aes_cbc.o aes-ppc.o:::sha1-ppc.o sha256-ppc.o::::::::";
my $ppc64_asm="ppccpuid.o ppccap.o:bn-ppc.o ppc-mont.o ppc64-mont.o::aes_core.o aes_cbc.o aes-ppc.o:::sha1-ppc.o sha256-ppc.o sha512-ppc.o::::::::";
my $no_asm=":::::::::::::::void";
my $x86_64_asm="x86_64cpuid.o:x86_64-gcc.o x86_64-mont.o x86_64-mont5.o x86_64-gf2m.o rsaz_exp.o rsaz-x86_64.o rsaz-avx2.o:ecp_nistz256.o ecp_nistz256-x86_64.o::aes-x86_64.o vpaes-x86_64.o bsaes-x86_64.o aesni-x86_64.o aesni-sha1-x86_64.o aesni-sha256-x86_64.o aesni-mb-x86_64.o::md5-x86_64.o:sha1-x86_64.o sha256-x86_64.o sha512-x86_64.o sha1-mb-x86_64.o sha256-mb-x86_64.o::rc4-x86_64.o rc4-md5-x86_64.o:::wp-x86_64.o:cmll-x86_64.o cmll_misc.o:ghash-x86_64.o aesni-gcm-x86_64.o:";
my $ia64_asm="ia64cpuid.o:bn-ia64.o ia64-mont.o:::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o::rc4-ia64.o rc4_skey.o:::::ghash-ia64.o::void";
my $sparcv9_asm="sparcv9cap.o sparccpuid.o:bn-sparcv9.o sparcv9-mont.o sparcv9a-mont.o vis3-mont.o sparct4-mont.o sparcv9-gf2m.o::des_enc-sparc.o fcrypt_b.o dest4-sparcv9.o:aes_core.o aes_cbc.o aes-sparcv9.o aest4-sparcv9.o::md5-sparcv9.o:sha1-sparcv9.o sha256-sparcv9.o sha512-sparcv9.o::::::camellia.o cmll_misc.o cmll_cbc.o cmllt4-sparcv9.o:ghash-sparcv9.o::void";
my $sparcv8_asm=":sparcv8.o::des_enc-sparc.o fcrypt_b.o:::::::::::::void";
my $alpha_asm="alphacpuid.o:bn_asm.o alpha-mont.o::::::sha1-alpha.o:::::::ghash-alpha.o::void";
my $mips64_asm=":bn-mips.o mips-mont.o:::aes_cbc.o aes-mips.o:::sha1-mips.o sha256-mips.o sha512-mips.o::::::::";
my $mips32_asm=$mips64_asm; $mips32_asm =~ s/\s*sha512\-mips\.o//;
my $s390x_asm="s390xcap.o s390xcpuid.o:bn-s390x.o s390x-mont.o s390x-gf2m.o:::aes-s390x.o aes-ctr.o aes-xts.o:::sha1-s390x.o sha256-s390x.o sha512-s390x.o::rc4-s390x.o:::::ghash-s390x.o:";
my $armv4_asm="armcap.o armv4cpuid.o:bn_asm.o armv4-mont.o armv4-gf2m.o:::aes_cbc.o aes-armv4.o bsaes-armv7.o aesv8-armx.o:::sha1-armv4-large.o sha256-armv4.o sha512-armv4.o:::::::ghash-armv4.o ghashv8-armx.o::void";
my $aarch64_asm="armcap.o arm64cpuid.o mem_clr.o::::aes_core.o aes_cbc.o aesv8-armx.o:::sha1-armv8.o sha256-armv8.o sha512-armv8.o:::::::ghashv8-armx.o:";
my $parisc11_asm="pariscid.o:bn_asm.o parisc-mont.o:::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::32";
my $parisc20_asm="pariscid.o:pa-risc2W.o parisc-mont.o:::aes_core.o aes_cbc.o aes-parisc.o:::sha1-parisc.o sha256-parisc.o sha512-parisc.o::rc4-parisc.o:::::ghash-parisc.o::64";
my $ppc64_asm="ppccpuid.o ppccap.o:bn-ppc.o ppc-mont.o ppc64-mont.o:::aes_core.o aes_cbc.o aes-ppc.o vpaes-ppc.o aesp8-ppc.o:::sha1-ppc.o sha256-ppc.o sha512-ppc.o sha256p8-ppc.o sha512p8-ppc.o:::::::ghashp8-ppc.o:";
my $ppc32_asm=$ppc64_asm;
my $no_asm="::::::::::::::::void";
# As for $BSDthreads. Idea is to maintain "collective" set of flags,
# which would cover all BSD flavors. -pthread applies to them all,
@ -152,7 +172,7 @@ my $no_asm=":::::::::::::::void";
# seems to be sufficient?
my $BSDthreads="-pthread -D_THREAD_SAFE -D_REENTRANT";
#config-string $cc : $cflags : $unistd : $thread_cflag : $sys_id : $lflags : $bn_ops : $cpuid_obj : $bn_obj : $des_obj : $aes_obj : $bf_obj : $md5_obj : $sha1_obj : $cast_obj : $rc4_obj : $rmd160_obj : $rc5_obj : $wp_obj : $cmll_obj : $modes_obj : $engines_obj : $dso_scheme : $shared_target : $shared_cflag : $shared_ldflag : $shared_extension : $ranlib : $arflags : $multilib
#config-string $cc : $cflags : $unistd : $thread_cflag : $sys_id : $lflags : $bn_ops : $cpuid_obj : $bn_obj : $ec_obj : $des_obj : $aes_obj : $bf_obj : $md5_obj : $sha1_obj : $cast_obj : $rc4_obj : $rmd160_obj : $rc5_obj : $wp_obj : $cmll_obj : $modes_obj : $engines_obj : $dso_scheme : $shared_target : $shared_cflag : $shared_ldflag : $shared_extension : $ranlib : $arflags : $multilib
my %table=(
# File 'TABLE' (created by 'make TABLE') contains the data from this list,
@ -174,14 +194,14 @@ my %table=(
"debug-ben-debug-64", "gcc:$gcc_devteam_warn -Wno-error=overlength-strings -DBN_DEBUG -DCONF_DEBUG -DDEBUG_SAFESTACK -DDEBUG_UNUSED -g3 -O3 -pipe::${BSDthreads}:::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-ben-macos", "cc:$gcc_devteam_warn -arch i386 -DBN_DEBUG -DCONF_DEBUG -DDEBUG_SAFESTACK -DDEBUG_UNUSED -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -O3 -DL_ENDIAN -g3 -pipe::(unknown)::-Wl,-search_paths_first::::",
"debug-ben-macos-gcc46", "gcc-mp-4.6:$gcc_devteam_warn -Wconversion -DBN_DEBUG -DCONF_DEBUG -DDEBUG_SAFESTACK -DDEBUG_UNUSED -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -O3 -DL_ENDIAN -g3 -pipe::(unknown)::::::",
"debug-ben-darwin64","cc:$gcc_devteam_warn -Wno-language-extension-token -Wno-extended-offsetof -arch x86_64 -O3 -DL_ENDIAN -Wall::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL:".eval{my $asm=$x86_64_asm;$asm=~s/rc4\-[^:]+//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch x86_64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"debug-ben-darwin64","cc:$gcc_devteam_warn -g -Wno-language-extension-token -Wno-extended-offsetof -arch x86_64 -O3 -DL_ENDIAN -Wall::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL:".eval{my $asm=$x86_64_asm;$asm=~s/rc4\-[^:]+//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch x86_64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"debug-ben-debug-64-clang", "clang:$gcc_devteam_warn -Wno-error=overlength-strings -Wno-error=extended-offsetof -Qunused-arguments -DBN_DEBUG -DCONF_DEBUG -DDEBUG_SAFESTACK -DDEBUG_UNUSED -g3 -O3 -pipe::${BSDthreads}:::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-ben-no-opt", "gcc: -Wall -Wmissing-prototypes -Wstrict-prototypes -Wmissing-declarations -DDEBUG_SAFESTACK -DCRYPTO_MDEBUG -Werror -DL_ENDIAN -DTERMIOS -Wall -g3::(unknown)::::::",
"debug-ben-strict", "gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DCONST_STRICT -O2 -Wall -Wshadow -Werror -Wpointer-arith -Wcast-qual -Wwrite-strings -pipe::(unknown)::::::",
"debug-rse","cc:-DTERMIOS -DL_ENDIAN -pipe -O -g -ggdb3 -Wall::(unknown):::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}",
"debug-bodo", "gcc:$gcc_devteam_warn -Wno-error=overlength-strings -DBN_DEBUG -DBN_DEBUG_RAND -DCONF_DEBUG -DBIO_PAIR_DEBUG -m64 -DL_ENDIAN -DTERMIO -g -DMD32_REG_T=int::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"debug-ulf", "gcc:-DTERMIOS -DL_ENDIAN -march=i486 -Wall -DBN_DEBUG -DBN_DEBUG_RAND -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DOPENSSL_NO_ASM -g -Wformat -Wshadow -Wmissing-prototypes -Wmissing-declarations:::CYGWIN32:::${no_asm}:win32:cygwin-shared:::.dll",
"debug-steve64", "gcc:$gcc_devteam_warn -m64 -DL_ENDIAN -DTERMIO -DCONF_DEBUG -DDEBUG_SAFESTACK -Wno-overlength-strings -g::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-steve32", "gcc:$gcc_devteam_warn -m32 -DL_ENDIAN -DCONF_DEBUG -DDEBUG_SAFESTACK -g -pipe::-D_REENTRANT::-rdynamic -ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC:-m32:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-steve32", "gcc:$gcc_devteam_warn -m32 -DL_ENDIAN -DCONF_DEBUG -DDEBUG_SAFESTACK -Wno-overlength-strings -g -pipe::-D_REENTRANT::-rdynamic -ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC:-m32:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-steve-opt", "gcc:$gcc_devteam_warn -m64 -O3 -DL_ENDIAN -DTERMIO -DCONF_DEBUG -DDEBUG_SAFESTACK -g::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-levitte-linux-elf","gcc:-DLEVITTE_DEBUG -DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -DL_ENDIAN -ggdb -g3 -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-levitte-linux-noasm","gcc:-DLEVITTE_DEBUG -DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -DOPENSSL_NO_ASM -DL_ENDIAN -ggdb -g3 -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
@ -193,9 +213,9 @@ my %table=(
"debug-linux-ppro","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DL_ENDIAN -g -mcpu=pentiumpro -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn",
"debug-linux-elf","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DL_ENDIAN -g -march=i486 -Wall::-D_REENTRANT::-lefence -ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-elf-noefence","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DL_ENDIAN -g -march=i486 -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-ia32-aes", "gcc:-DAES_EXPERIMENTAL -DL_ENDIAN -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:x86cpuid.o:bn-586.o co-586.o x86-mont.o:des-586.o crypt586.o:aes_x86core.o aes_cbc.o aesni-x86.o:bf-586.o:md5-586.o:sha1-586.o sha256-586.o sha512-586.o:cast-586.o:rc4-586.o:rmd-586.o:rc5-586.o:wp_block.o wp-mmx.o::ghash-x86.o::elf:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-ia32-aes", "gcc:-DAES_EXPERIMENTAL -DL_ENDIAN -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:x86cpuid.o:bn-586.o co-586.o x86-mont.o::des-586.o crypt586.o:aes_x86core.o aes_cbc.o aesni-x86.o:bf-586.o:md5-586.o:sha1-586.o sha256-586.o sha512-586.o:cast-586.o:rc4-586.o:rmd-586.o:rc5-586.o:wp_block.o wp-mmx.o::ghash-x86.o::elf:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-generic32","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -g -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-generic64","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -g -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-generic64","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -DTERMIO -g -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"debug-linux-x86_64","gcc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -m64 -DL_ENDIAN -g -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"dist", "cc:-O::(unknown)::::::",
@ -225,7 +245,7 @@ my %table=(
"solaris64-x86_64-gcc","gcc:-m64 -O3 -Wall -DL_ENDIAN::-D_REENTRANT::-lsocket -lnsl -ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:solaris-shared:-fPIC:-m64 -shared -static-libgcc:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/64",
#### Solaris x86 with Sun C setups
"solaris-x86-cc","cc:-fast -O -Xa::-D_REENTRANT::-lsocket -lnsl -ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_PTR DES_UNROLL BF_PTR:${no_asm}:dlfcn:solaris-shared:-KPIC:-G -dy -z text:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"solaris-x86-cc","cc:-fast -xarch=generic -O -Xa::-D_REENTRANT::-lsocket -lnsl -ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_PTR DES_UNROLL BF_PTR:${no_asm}:dlfcn:solaris-shared:-KPIC:-G -dy -z text:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"solaris64-x86_64-cc","cc:-fast -xarch=amd64 -xstrconst -Xa -DL_ENDIAN::-D_REENTRANT::-lsocket -lnsl -ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:solaris-shared:-KPIC:-xarch=amd64 -G -dy -z text:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/64",
#### SPARC Solaris with GNU C setups
@ -300,7 +320,7 @@ my %table=(
"hpux-parisc-gcc","gcc:-O3 -DB_ENDIAN -DBN_DIV2W::-D_REENTRANT::-Wl,+s -ldld:BN_LLONG DES_PTR DES_UNROLL DES_RISC1:${no_asm}:dl:hpux-shared:-fPIC:-shared:.sl.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"hpux-parisc1_1-gcc","gcc:-O3 -DB_ENDIAN -DBN_DIV2W::-D_REENTRANT::-Wl,+s -ldld:BN_LLONG DES_PTR DES_UNROLL DES_RISC1:${parisc11_asm}:dl:hpux-shared:-fPIC:-shared:.sl.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/pa1.1",
"hpux-parisc2-gcc","gcc:-march=2.0 -O3 -DB_ENDIAN -D_REENTRANT::::-Wl,+s -ldld:SIXTY_FOUR_BIT RC4_CHAR RC4_CHUNK DES_PTR DES_UNROLL DES_RISC1:".eval{my $asm=$parisc20_asm;$asm=~s/2W\./2\./;$asm=~s/:64/:32/;$asm}.":dl:hpux-shared:-fPIC:-shared:.sl.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/pa20_32",
"hpux64-parisc2-gcc","gcc:-O3 -DB_ENDIAN -D_REENTRANT::::-ldl:SIXTY_FOUR_BIT_LONG MD2_CHAR RC4_INDEX RC4_CHAR DES_UNROLL DES_RISC1 DES_INT::pa-risc2W.o::::::::::::::void:dlfcn:hpux-shared:-fpic:-shared:.sl.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/pa20_64",
"hpux64-parisc2-gcc","gcc:-O3 -DB_ENDIAN -D_REENTRANT::::-ldl:SIXTY_FOUR_BIT_LONG MD2_CHAR RC4_INDEX RC4_CHAR DES_UNROLL DES_RISC1 DES_INT::pa-risc2W.o:::::::::::::::void:dlfcn:hpux-shared:-fpic:-shared:.sl.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::/pa20_64",
# More attempts at unified 10.X and 11.X targets for HP C compiler.
#
@ -347,20 +367,57 @@ my %table=(
# throw in -D[BL]_ENDIAN, whichever appropriate...
"linux-generic32","gcc:-O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ppc", "gcc:-DB_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_RISC1 DES_UNROLL:${ppc32_asm}:linux32:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
# It's believed that majority of ARM toolchains predefine appropriate -march.
# If you compiler does not, do complement config command line with one!
"linux-armv4", "gcc:-O3 -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
#######################################################################
# Note that -march is not among compiler options in below linux-armv4
# target line. Not specifying one is intentional to give you choice to:
#
# a) rely on your compiler default by not specifying one;
# b) specify your target platform explicitly for optimal performance,
# e.g. -march=armv6 or -march=armv7-a;
# c) build "universal" binary that targets *range* of platforms by
# specifying minimum and maximum supported architecture;
#
# As for c) option. It actually makes no sense to specify maximum to be
# less than ARMv7, because it's the least requirement for run-time
# switch between platform-specific code paths. And without run-time
# switch performance would be equivalent to one for minimum. Secondly,
# there are some natural limitations that you'd have to accept and
# respect. Most notably you can *not* build "universal" binary for
# big-endian platform. This is because ARMv7 processor always picks
# instructions in little-endian order. Another similar limitation is
# that -mthumb can't "cross" -march=armv6t2 boundary, because that's
# where it became Thumb-2. Well, this limitation is a bit artificial,
# because it's not really impossible, but it's deemed too tricky to
# support. And of course you have to be sure that your binutils are
# actually up to the task of handling maximum target platform. With all
# this in mind here is an example of how to configure "universal" build:
#
# ./Configure linux-armv4 -march=armv6 -D__ARM_MAX_ARCH__=8
#
"linux-armv4", "gcc: -O3 -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-aarch64","gcc: -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${aarch64_asm}:linux64:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
# Configure script adds minimally required -march for assembly support,
# if no -march was specified at command line. mips32 and mips64 below
# refer to contemporary MIPS Architecture specifications, MIPS32 and
# MIPS64, rather than to kernel bitness.
"linux-mips32", "gcc:-mabi=32 -O3 -Wall -DBN_DIV3W::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${mips32_asm}:o32:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-mips64", "gcc:-mabi=n32 -O3 -Wall -DBN_DIV3W::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${mips64_asm}:n32:dlfcn:linux-shared:-fPIC:-mabi=n32:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::32",
"linux64-mips64", "gcc:-mabi=64 -O3 -Wall -DBN_DIV3W::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${mips64_asm}:64:dlfcn:linux-shared:-fPIC:-mabi=64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
#### IA-32 targets...
"linux-ia32-icc", "icc:-DL_ENDIAN -O2 -no_cpprt::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-KPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ia32-icc", "icc:-DL_ENDIAN -O2::-D_REENTRANT::-ldl -no_cpprt:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-KPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-elf", "gcc:-DL_ENDIAN -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-aout", "gcc:-DL_ENDIAN -O3 -fomit-frame-pointer -march=i486 -Wall::(unknown):::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_asm}:a.out",
####
"linux-generic64","gcc:-O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ppc64", "gcc:-m64 -DB_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_RISC1 DES_UNROLL:${ppc64_asm}:linux64:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"linux-ia64", "gcc:-DL_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_UNROLL DES_INT:${ia64_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ia64-ecc","ecc:-DL_ENDIAN -O2 -Wall -no_cpprt::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT:${ia64_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ia64-icc","icc:-DL_ENDIAN -O2 -Wall -no_cpprt::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_RISC1 DES_INT:${ia64_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ppc64le","gcc:-m64 -DL_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_RISC1 DES_UNROLL:$ppc64_asm:linux64le:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::",
"linux-ia64", "gcc:-DL_ENDIAN -DTERMIO -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_UNROLL DES_INT:${ia64_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-ia64-icc","icc:-DL_ENDIAN -O2 -Wall::-D_REENTRANT::-ldl -no_cpprt:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_RISC1 DES_INT:${ia64_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"linux-x86_64", "gcc:-m64 -DL_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"linux-x86_64-clang", "clang: -m64 -DL_ENDIAN -O3 -Wall -Wextra $clang_disabled_warnings -Qunused-arguments::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"linux-x86_64-icc", "icc:-DL_ENDIAN -O2::-D_REENTRANT::-ldl -no_cpprt:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
"linux-x32", "gcc:-mx32 -DL_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:linux-shared:-fPIC:-mx32:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::x32",
"linux64-s390x", "gcc:-m64 -DB_ENDIAN -O3 -Wall::-D_REENTRANT::-ldl:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL:${s390x_asm}:64:dlfcn:linux-shared:-fPIC:-m64:.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR):::64",
#### So called "highgprs" target for z/Architecture CPUs
# "Highgprs" is kernel feature first implemented in Linux 2.6.32, see
@ -407,6 +464,7 @@ my %table=(
"android","gcc:-mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${no_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"android-x86","gcc:-mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:".eval{my $asm=${x86_elf_asm};$asm=~s/:elf/:android/;$asm}.":dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"android-armv7","gcc:-march=armv7-a -mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -fomit-frame-pointer -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${armv4_asm}:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"android-mips","gcc:-mandroid -I\$(ANDROID_DEV)/include -B\$(ANDROID_DEV)/lib -O3 -Wall::-D_REENTRANT::-ldl:BN_LLONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL BF_PTR:${mips32_asm}:o32:dlfcn:linux-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
#### *BSD [do see comment about ${BSDthreads} above!]
"BSD-generic32","gcc:-O3 -fomit-frame-pointer -Wall::${BSDthreads}:::BN_LLONG RC2_CHAR RC4_INDEX DES_INT DES_UNROLL:${no_asm}:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
@ -421,7 +479,7 @@ my %table=(
# triggered by RIPEMD160 code.
"BSD-sparc64", "gcc:-DB_ENDIAN -O3 -DMD32_REG_T=int -Wall::${BSDthreads}:::BN_LLONG RC2_CHAR RC4_CHUNK DES_INT DES_PTR DES_RISC2 BF_PTR:${sparcv9_asm}:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"BSD-ia64", "gcc:-DL_ENDIAN -O3 -Wall::${BSDthreads}:::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_UNROLL DES_INT:${ia64_asm}:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"BSD-x86_64", "gcc:-DL_ENDIAN -O3 -Wall::${BSDthreads}:::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"BSD-x86_64", "cc:-DL_ENDIAN -O3 -Wall::${BSDthreads}:::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:elf:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"bsdi-elf-gcc", "gcc:-DPERL5 -DL_ENDIAN -fomit-frame-pointer -O3 -march=i486 -Wall::(unknown)::-ldl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:bsd-gcc-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
@ -454,11 +512,11 @@ my %table=(
# UnixWare 2.0x fails destest with -O.
"unixware-2.0","cc:-DFILIO_H -DNO_STRINGS_H::-Kthread::-lsocket -lnsl -lresolv -lx:${x86_gcc_des} ${x86_gcc_opts}:::",
"unixware-2.1","cc:-O -DFILIO_H::-Kthread::-lsocket -lnsl -lresolv -lx:${x86_gcc_des} ${x86_gcc_opts}:::",
"unixware-7","cc:-O -DFILIO_H -Kalloca::-Kthread::-lsocket -lnsl:BN_LLONG MD2_CHAR RC4_INDEX ${x86_gcc_des}:${x86_elf_asm}:dlfcn:svr5-shared:-Kpic::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"unixware-7-gcc","gcc:-DL_ENDIAN -DFILIO_H -O3 -fomit-frame-pointer -march=pentium -Wall::-D_REENTRANT::-lsocket -lnsl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:gnu-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"unixware-7","cc:-O -DFILIO_H -Kalloca::-Kthread::-lsocket -lnsl:BN_LLONG MD2_CHAR RC4_INDEX ${x86_gcc_des}:${x86_elf_asm}-1:dlfcn:svr5-shared:-Kpic::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"unixware-7-gcc","gcc:-DL_ENDIAN -DFILIO_H -O3 -fomit-frame-pointer -march=pentium -Wall::-D_REENTRANT::-lsocket -lnsl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}-1:dlfcn:gnu-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
# SCO 5 - Ben Laurie <ben@algroup.co.uk> says the -O breaks the SCO cc.
"sco5-cc", "cc:-belf::(unknown)::-lsocket -lnsl:${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:svr3-shared:-Kpic::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"sco5-gcc", "gcc:-O3 -fomit-frame-pointer::(unknown)::-lsocket -lnsl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}:dlfcn:svr3-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"sco5-cc", "cc:-belf::(unknown)::-lsocket -lnsl:${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}-1:dlfcn:svr3-shared:-Kpic::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
"sco5-gcc", "gcc:-O3 -fomit-frame-pointer::(unknown)::-lsocket -lnsl:BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_elf_asm}-1:dlfcn:svr3-shared:-fPIC::.so.\$(SHLIB_MAJOR).\$(SHLIB_MINOR)",
#### IBM's AIX.
"aix3-cc", "cc:-O -DB_ENDIAN -qmaxmem=16384::(unknown):AIX::BN_LLONG RC4_CHAR:::",
@ -518,9 +576,9 @@ my %table=(
# Visual C targets
#
# Win64 targets, WIN64I denotes IA-64 and WIN64A - AMD64
"VC-WIN64I","cl:-W3 -Gs0 -Gy -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64I::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:ia64cpuid.o:ia64.o ia64-mont.o::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o:::::::ghash-ia64.o::ias:win32",
"VC-WIN64I","cl:-W3 -Gs0 -Gy -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64I::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:ia64cpuid.o:ia64.o ia64-mont.o:::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o:::::::ghash-ia64.o::ias:win32",
"VC-WIN64A","cl:-W3 -Gs0 -Gy -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64A::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:".eval{my $asm=$x86_64_asm;$asm=~s/x86_64-gcc\.o/bn_asm.o/;$asm}.":auto:win32",
"debug-VC-WIN64I","cl:-W3 -Gs0 -Gy -Zi -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64I::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:ia64cpuid.o:ia64.o::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o:::::::ghash-ia64.o::ias:win32",
"debug-VC-WIN64I","cl:-W3 -Gs0 -Gy -Zi -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64I::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:ia64cpuid.o:ia64.o:::aes_core.o aes_cbc.o aes-ia64.o::md5-ia64.o:sha1-ia64.o sha256-ia64.o sha512-ia64.o:::::::ghash-ia64.o::ias:win32",
"debug-VC-WIN64A","cl:-W3 -Gs0 -Gy -Zi -nologo -DOPENSSL_SYSNAME_WIN32 -DWIN32_LEAN_AND_MEAN -DL_ENDIAN -DUNICODE -D_UNICODE -D_CRT_SECURE_NO_DEPRECATE:::WIN64A::SIXTY_FOUR_BIT RC4_CHUNK_LL DES_INT EXPORT_VAR_AS_FN:".eval{my $asm=$x86_64_asm;$asm=~s/x86_64-gcc\.o/bn_asm.o/;$asm}.":auto:win32",
# x86 Win32 target defaults to ANSI API, if you want UNICODE, complement
# 'perl Configure VC-WIN32' with '-DUNICODE -D_UNICODE'
@ -547,9 +605,8 @@ my %table=(
"UWIN", "cc:-DTERMIOS -DL_ENDIAN -O -Wall:::UWIN::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${no_asm}:win32",
# Cygwin
"Cygwin-pre1.3", "gcc:-DTERMIOS -DL_ENDIAN -fomit-frame-pointer -O3 -m486 -Wall::(unknown):CYGWIN32::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${no_asm}:win32",
"Cygwin", "gcc:-DTERMIOS -DL_ENDIAN -fomit-frame-pointer -O3 -march=i486 -Wall:::CYGWIN32::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_asm}:coff:dlfcn:cygwin-shared:-D_WINDLL:-shared:.dll.a",
"debug-Cygwin", "gcc:-DTERMIOS -DL_ENDIAN -march=i486 -Wall -DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DBN_CTX_DEBUG -DCRYPTO_MDEBUG -DOPENSSL_NO_ASM -g -Wformat -Wshadow -Wmissing-prototypes -Wmissing-declarations -Werror:::CYGWIN32:::${no_asm}:dlfcn:cygwin-shared:-D_WINDLL:-shared:.dll.a",
"Cygwin", "gcc:-DTERMIOS -DL_ENDIAN -fomit-frame-pointer -O3 -march=i486 -Wall:::CYGWIN::BN_LLONG ${x86_gcc_des} ${x86_gcc_opts}:${x86_asm}:coff:dlfcn:cygwin-shared:-D_WINDLL:-shared:.dll.a",
"Cygwin-x86_64", "gcc:-DTERMIOS -DL_ENDIAN -O3 -Wall:::CYGWIN::SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:${x86_64_asm}:mingw64:dlfcn:cygwin-shared:-D_WINDLL:-shared:.dll.a",
# NetWare from David Ward (dsward@novell.com)
# requires either MetroWerks NLM development tools, or gcc / nlmconv
@ -581,7 +638,8 @@ my %table=(
"darwin64-ppc-cc","cc:-arch ppc64 -O3 -DB_ENDIAN::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:${ppc64_asm}:osx64:dlfcn:darwin-shared:-fPIC -fno-common:-arch ppc64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"darwin-i386-cc","cc:-arch i386 -O3 -fomit-frame-pointer -DL_ENDIAN::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:BN_LLONG RC4_INT RC4_CHUNK DES_UNROLL BF_PTR:".eval{my $asm=$x86_asm;$asm=~s/cast\-586\.o//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch i386 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"debug-darwin-i386-cc","cc:-arch i386 -g3 -DL_ENDIAN::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:BN_LLONG RC4_INT RC4_CHUNK DES_UNROLL BF_PTR:${x86_asm}:macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch i386 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"darwin64-x86_64-cc","cc:-arch x86_64 -O3 -DL_ENDIAN -Wall::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHAR RC4_CHUNK DES_INT DES_UNROLL:".eval{my $asm=$x86_64_asm;$asm=~s/rc4\-[^:]+//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch x86_64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"darwin64-x86_64-cc","cc:-arch x86_64 -O3 -DL_ENDIAN -Wall::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:".eval{my $asm=$x86_64_asm;$asm=~s/rc4\-[^:]+//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch x86_64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"debug-darwin64-x86_64-cc","cc:-arch x86_64 -ggdb -g2 -O0 -DL_ENDIAN -Wall::-D_REENTRANT:MACOSX:-Wl,-search_paths_first%:SIXTY_FOUR_BIT_LONG RC4_CHUNK DES_INT DES_UNROLL:".eval{my $asm=$x86_64_asm;$asm=~s/rc4\-[^:]+//;$asm}.":macosx:dlfcn:darwin-shared:-fPIC -fno-common:-arch x86_64 -dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
"debug-darwin-ppc-cc","cc:-DBN_DEBUG -DREF_CHECK -DCONF_DEBUG -DCRYPTO_MDEBUG -DB_ENDIAN -g -Wall -O::-D_REENTRANT:MACOSX::BN_LLONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:${ppc32_asm}:osx32:dlfcn:darwin-shared:-fPIC:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
# iPhoneOS/iOS
"iphoneos-cross","llvm-gcc:-O3 -isysroot \$(CROSS_TOP)/SDKs/\$(CROSS_SDK) -fomit-frame-pointer -fno-common::-D_REENTRANT:iOS:-Wl,-search_paths_first%:BN_LLONG RC4_CHAR RC4_CHUNK DES_UNROLL BF_PTR:${no_asm}:dlfcn:darwin-shared:-fPIC -fno-common:-dynamiclib:.\$(SHLIB_MAJOR).\$(SHLIB_MINOR).dylib",
@ -634,6 +692,7 @@ my $idx_lflags = $idx++;
my $idx_bn_ops = $idx++;
my $idx_cpuid_obj = $idx++;
my $idx_bn_obj = $idx++;
my $idx_ec_obj = $idx++;
my $idx_des_obj = $idx++;
my $idx_aes_obj = $idx++;
my $idx_bf_obj = $idx++;
@ -714,11 +773,13 @@ my %disabled = ( # "what" => "comment" [or special keyword "experimental
"ec_nistp_64_gcc_128" => "default",
"gmp" => "default",
"jpake" => "experimental",
"libunbound" => "experimental",
"md2" => "default",
"rc5" => "default",
"rfc3779" => "default",
"sctp" => "default",
"shared" => "default",
"ssl-trace" => "default",
"store" => "experimental",
"unit-test" => "default",
"zlib" => "default",
@ -728,7 +789,7 @@ my @experimental = ();
# This is what $depflags will look like with the above defaults
# (we need this to see if we should advise the user to run "make depend"):
my $default_depflags = " -DOPENSSL_NO_EC_NISTP_64_GCC_128 -DOPENSSL_NO_GMP -DOPENSSL_NO_JPAKE -DOPENSSL_NO_MD2 -DOPENSSL_NO_RC5 -DOPENSSL_NO_RFC3779 -DOPENSSL_NO_SCTP -DOPENSSL_NO_STORE -DOPENSSL_NO_UNIT_TEST";
my $default_depflags = " -DOPENSSL_NO_EC_NISTP_64_GCC_128 -DOPENSSL_NO_GMP -DOPENSSL_NO_JPAKE -DOPENSSL_NO_LIBUNBOUND -DOPENSSL_NO_MD2 -DOPENSSL_NO_RC5 -DOPENSSL_NO_RFC3779 -DOPENSSL_NO_SCTP -DOPENSSL_NO_SSL_TRACE -DOPENSSL_NO_STORE -DOPENSSL_NO_UNIT_TEST";
# Explicit "no-..." options will be collected in %disabled along with the defaults.
# To remove something from %disabled, use "enable-foo" (unless it's experimental).
@ -873,16 +934,7 @@ PROCESS_ARGS:
}
elsif (/^[-+]/)
{
if (/^-[lL](.*)$/ or /^-Wl,/)
{
$libs.=$_." ";
}
elsif (/^-[^-]/ or /^\+/)
{
$_ =~ s/%([0-9a-f]{1,2})/chr(hex($1))/gei;
$flags.=$_." ";
}
elsif (/^--prefix=(.*)$/)
if (/^--prefix=(.*)$/)
{
$prefix=$1;
}
@ -926,10 +978,14 @@ PROCESS_ARGS:
{
$cross_compile_prefix=$1;
}
else
elsif (/^-[lL](.*)$/ or /^-Wl,/)
{
print STDERR $usage;
exit(1);
$libs.=$_." ";
}
else # common if (/^[-+]/), just pass down...
{
$_ =~ s/%([0-9a-f]{1,2})/chr(hex($1))/gei;
$flags.=$_." ";
}
}
elsif ($_ =~ /^([^:]+):(.+)$/)
@ -1156,6 +1212,7 @@ my $cc = $fields[$idx_cc];
if($ENV{CC}) {
$cc = $ENV{CC};
}
my $cflags = $fields[$idx_cflags];
my $unistd = $fields[$idx_unistd];
my $thread_cflag = $fields[$idx_thread_cflag];
@ -1164,6 +1221,7 @@ my $lflags = $fields[$idx_lflags];
my $bn_ops = $fields[$idx_bn_ops];
my $cpuid_obj = $fields[$idx_cpuid_obj];
my $bn_obj = $fields[$idx_bn_obj];
my $ec_obj = $fields[$idx_ec_obj];
my $des_obj = $fields[$idx_des_obj];
my $aes_obj = $fields[$idx_aes_obj];
my $bf_obj = $fields[$idx_bf_obj];
@ -1209,6 +1267,12 @@ if ($target =~ /^mingw/ && `$cc --target-help 2>&1` !~ m/\-mno\-cygwin/m)
$shared_ldflag =~ s/\-mno\-cygwin\s*//;
}
if ($target =~ /linux.*\-mips/ && !$no_asm && $flags !~ /\-m(ips|arch=)/) {
# minimally required architecture flags for assembly modules
$cflags="-mips2 $cflags" if ($target =~ /mips32/);
$cflags="-mips3 $cflags" if ($target =~ /mips64/);
}
my $no_shared_warn=0;
my $no_user_cflags=0;
@ -1335,7 +1399,7 @@ $lflags="$libs$lflags" if ($libs ne "");
if ($no_asm)
{
$cpuid_obj=$bn_obj=
$cpuid_obj=$bn_obj=$ec_obj=
$des_obj=$aes_obj=$bf_obj=$cast_obj=$rc4_obj=$rc5_obj=$cmll_obj=
$modes_obj=$sha1_obj=$md5_obj=$rmd160_obj=$wp_obj=$engines_obj="";
}
@ -1416,6 +1480,7 @@ if ($target =~ /\-icc$/) # Intel C compiler
}
if ($iccver>=8)
{
$cflags=~s/\-KPIC/-fPIC/;
# Eliminate unnecessary dependency from libirc.a. This is
# essential for shared library support, as otherwise
# apps/openssl can end up in endless loop upon startup...
@ -1423,12 +1488,17 @@ if ($target =~ /\-icc$/) # Intel C compiler
}
if ($iccver>=9)
{
$cflags.=" -i-static";
$cflags=~s/\-no_cpprt/-no-cpprt/;
$lflags.=" -i-static";
$lflags=~s/\-no_cpprt/-no-cpprt/;
}
if ($iccver>=10)
{
$cflags=~s/\-i\-static/-static-intel/;
$lflags=~s/\-i\-static/-static-intel/;
}
if ($iccver>=11)
{
$cflags.=" -no-intel-extensions"; # disable Cilk
$lflags=~s/\-no\-cpprt/-no-cxxlib/;
}
}
@ -1509,7 +1579,7 @@ if ($rmd160_obj =~ /\.o$/)
}
if ($aes_obj =~ /\.o$/)
{
$cflags.=" -DAES_ASM";
$cflags.=" -DAES_ASM" if ($aes_obj =~ m/\baes\-/);;
# aes-ctr.o is not a real file, only indication that assembler
# module implements AES_ctr32_encrypt...
$cflags.=" -DAES_CTR_ASM" if ($aes_obj =~ s/\s*aes\-ctr\.o//);
@ -1531,10 +1601,14 @@ else {
$wp_obj="wp_block.o";
}
$cmll_obj=$cmll_enc unless ($cmll_obj =~ /.o$/);
if ($modes_obj =~ /ghash/)
if ($modes_obj =~ /ghash\-/)
{
$cflags.=" -DGHASH_ASM";
}
if ($ec_obj =~ /ecp_nistz256/)
{
$cflags.=" -DECP_NISTZ256_ASM";
}
# "Stringify" the C flags string. This permits it to be made part of a string
# and works as well on command lines.
@ -1574,12 +1648,21 @@ if ($shlib_version_number =~ /(^[0-9]*)\.([0-9\.]*)/)
if ($strict_warnings)
{
my $ecc = $cc;
$ecc = "clang" if `$cc --version 2>&1` =~ /clang/;
my $wopt;
die "ERROR --strict-warnings requires gcc" unless ($cc =~ /gcc$/);
die "ERROR --strict-warnings requires gcc or clang" unless ($ecc =~ /gcc$/ or $ecc =~ /clang$/);
foreach $wopt (split /\s+/, $gcc_devteam_warn)
{
$cflags .= " $wopt" unless ($cflags =~ /$wopt/)
}
if ($ecc eq "clang")
{
foreach $wopt (split /\s+/, $clang_devteam_warn)
{
$cflags .= " $wopt" unless ($cflags =~ /$wopt/)
}
}
}
open(IN,'<Makefile.org') || die "unable to read Makefile.org:$!\n";
@ -1638,6 +1721,7 @@ while (<IN>)
s/^EXE_EXT=.*$/EXE_EXT= $exe_ext/;
s/^CPUID_OBJ=.*$/CPUID_OBJ= $cpuid_obj/;
s/^BN_ASM=.*$/BN_ASM= $bn_obj/;
s/^EC_ASM=.*$/EC_ASM= $ec_obj/;
s/^DES_ENC=.*$/DES_ENC= $des_obj/;
s/^AES_ENC=.*$/AES_ENC= $aes_obj/;
s/^BF_ENC=.*$/BF_ENC= $bf_obj/;
@ -1699,6 +1783,7 @@ print "CFLAG =$cflags\n";
print "EX_LIBS =$lflags\n";
print "CPUID_OBJ =$cpuid_obj\n";
print "BN_ASM =$bn_obj\n";
print "EC_ASM =$ec_obj\n";
print "DES_ENC =$des_obj\n";
print "AES_ENC =$aes_obj\n";
print "BF_ENC =$bf_obj\n";
@ -1997,7 +2082,7 @@ BEGIN
VALUE "ProductVersion", "$version\\0"
// Optional:
//VALUE "Comments", "\\0"
VALUE "LegalCopyright", "Copyright © 1998-2005 The OpenSSL Project. Copyright © 1995-1998 Eric A. Young, Tim J. Hudson. All rights reserved.\\0"
VALUE "LegalCopyright", "Copyright © 1998-2005 The OpenSSL Project. Copyright © 1995-1998 Eric A. Young, Tim J. Hudson. All rights reserved.\\0"
//VALUE "LegalTrademarks", "\\0"
//VALUE "PrivateBuild", "\\0"
//VALUE "SpecialBuild", "\\0"
@ -2106,12 +2191,12 @@ sub print_table_entry
{
my $target = shift;
(my $cc,my $cflags,my $unistd,my $thread_cflag,my $sys_id,my $lflags,
my $bn_ops,my $cpuid_obj,my $bn_obj,my $des_obj,my $aes_obj, my $bf_obj,
my $md5_obj,my $sha1_obj,my $cast_obj,my $rc4_obj,my $rmd160_obj,
my $rc5_obj,my $wp_obj,my $cmll_obj,my $modes_obj, my $engines_obj,
my $perlasm_scheme,my $dso_scheme,my $shared_target,my $shared_cflag,
my $shared_ldflag,my $shared_extension,my $ranlib,my $arflags,my $multilib)=
my ($cc, $cflags, $unistd, $thread_cflag, $sys_id, $lflags,
$bn_ops, $cpuid_obj, $bn_obj, $ec_obj, $des_obj, $aes_obj, $bf_obj,
$md5_obj, $sha1_obj, $cast_obj, $rc4_obj, $rmd160_obj,
$rc5_obj, $wp_obj, $cmll_obj, $modes_obj, $engines_obj,
$perlasm_scheme, $dso_scheme, $shared_target, $shared_cflag,
$shared_ldflag, $shared_extension, $ranlib, $arflags, $multilib)=
split(/\s*:\s*/,$table{$target} . ":" x 30 , -1);
print <<EOF
@ -2126,6 +2211,7 @@ sub print_table_entry
\$bn_ops = $bn_ops
\$cpuid_obj = $cpuid_obj
\$bn_obj = $bn_obj
\$ec_obj = $ec_obj
\$des_obj = $des_obj
\$aes_obj = $aes_obj
\$bf_obj = $bf_obj

40
FAQ
View File

@ -83,7 +83,7 @@ OpenSSL - Frequently Asked Questions
* Which is the current version of OpenSSL?
The current version is available from <URL: http://www.openssl.org>.
OpenSSL 1.0.1e was released on Feb 11th, 2013.
OpenSSL 1.0.1a was released on Apr 19th, 2012.
In addition to the current stable release, you can also access daily
snapshots of the OpenSSL development version at <URL:
@ -184,14 +184,18 @@ Therefore the answer to the common question "when will feature X be
backported to OpenSSL 1.0.0/0.9.8?" is "never" but it could appear
in the next minor release.
* What happens when the letter release reaches z?
It was decided after the release of OpenSSL 0.9.8y the next version should
be 0.9.8za then 0.9.8zb and so on.
[LEGAL] =======================================================================
* Do I need patent licenses to use OpenSSL?
The patents section of the README file lists patents that may apply to
you if you want to use OpenSSL. For information on intellectual
property rights, please consult a lawyer. The OpenSSL team does not
offer legal advice.
For information on intellectual property rights, please consult a lawyer.
The OpenSSL team does not offer legal advice.
You can configure OpenSSL so as not to use IDEA, MDC2 and RC5 by using
./config no-idea no-mdc2 no-rc5
@ -608,8 +612,8 @@ valid for the current DOS session.
* What is special about OpenSSL on Redhat?
Red Hat Linux (release 7.0 and later) include a preinstalled limited
version of OpenSSL. For patent reasons, support for IDEA, RC5 and MDC2
is disabled in this version. The same may apply to other Linux distributions.
version of OpenSSL. Red Hat has chosen to disable support for IDEA, RC5 and
MDC2 in this version. The same may apply to other Linux distributions.
Users may therefore wish to install more or all of the features left out.
To do this you MUST ensure that you do not overwrite the openssl that is in
@ -632,11 +636,6 @@ relevant updates in packages up to and including 0.9.6b.
A possible way around this is to persuade Red Hat to produce a non-US
version of Red Hat Linux.
FYI: Patent numbers and expiry dates of US patents:
MDC-2: 4,908,861 13/03/2007
IDEA: 5,214,703 25/05/2010
RC5: 5,724,428 03/03/2015
* Why does the OpenSSL compilation fail on MacOS X?
@ -862,7 +861,7 @@ The opposite assumes we already have len bytes in buf:
p = buf;
p7 = d2i_PKCS7(NULL, &p, len);
At this point p7 contains a valid PKCS7 structure of NULL if an error
At this point p7 contains a valid PKCS7 structure or NULL if an error
occurred. If an error occurred ERR_print_errors(bio) should give more
information.
@ -874,6 +873,21 @@ that has been read or written. This may well be uninitialized data
and attempts to free the buffer will have unpredictable results
because it no longer points to the same address.
Memory allocation and encoding can also be combined in a single
operation by the ASN1 routines:
unsigned char *buf = NULL; /* mandatory */
int len;
len = i2d_PKCS7(p7, &buf);
if (len < 0)
/* Error */
/* Do some things with 'buf' */
/* Finished with buf: free it */
OPENSSL_free(buf);
In this special case the "buf" parameter is *not* incremented, it points
to the start of the encoding.
* OpenSSL uses DER but I need BER format: does OpenSSL support BER?

View File

@ -10,6 +10,7 @@ openssl-*/*/*.save
openssl-*/*/*/*.bat
openssl-*/*/*/*.com
openssl-*/*/*/*.save
openssl-*/Git*
openssl-*/INSTALL.DJGPP
openssl-*/INSTALL.MacOS
openssl-*/INSTALL.NW

View File

@ -10,9 +10,9 @@ First, read http://wiki.freebsd.org/SubversionPrimer/VendorImports
# Xlist
setenv XLIST /FreeBSD/work/openssl/svn-FREEBSD-files/FREEBSD-Xlist
setenv FSVN "svn+ssh://svn.freebsd.org/base"
setenv OSSLVER 1.0.1p
# OSSLTAG format: v1_0_1p
setenv FSVN "svn+ssh://repo.freebsd.org/base"
setenv OSSLVER 1.0.2d
# OSSLTAG format: v1_0_2d
###setenv OSSLTAG v`echo ${OSSLVER} | tr . _`

View File

@ -4,16 +4,16 @@
## Makefile for OpenSSL
##
VERSION=1.0.1p
VERSION=1.0.2d
MAJOR=1
MINOR=0.1
MINOR=0.2
SHLIB_VERSION_NUMBER=1.0.0
SHLIB_VERSION_HISTORY=
SHLIB_MAJOR=1
SHLIB_MINOR=0.0
SHLIB_EXT=
PLATFORM=dist
OPTIONS= no-ec_nistp_64_gcc_128 no-gmp no-jpake no-krb5 no-md2 no-rc5 no-rfc3779 no-sctp no-shared no-store no-unit-test no-zlib no-zlib-dynamic static-engine
OPTIONS= no-ec_nistp_64_gcc_128 no-gmp no-jpake no-krb5 no-libunbound no-md2 no-rc5 no-rfc3779 no-sctp no-shared no-ssl-trace no-store no-unit-test no-zlib no-zlib-dynamic static-engine
CONFIGURE_ARGS=dist
SHLIB_TARGET=
@ -61,7 +61,7 @@ OPENSSLDIR=/usr/local/ssl
CC= cc
CFLAG= -O
DEPFLAG= -DOPENSSL_NO_EC_NISTP_64_GCC_128 -DOPENSSL_NO_GMP -DOPENSSL_NO_JPAKE -DOPENSSL_NO_MD2 -DOPENSSL_NO_RC5 -DOPENSSL_NO_RFC3779 -DOPENSSL_NO_SCTP -DOPENSSL_NO_STORE -DOPENSSL_NO_UNIT_TEST
DEPFLAG= -DOPENSSL_NO_EC_NISTP_64_GCC_128 -DOPENSSL_NO_GMP -DOPENSSL_NO_JPAKE -DOPENSSL_NO_LIBUNBOUND -DOPENSSL_NO_MD2 -DOPENSSL_NO_RC5 -DOPENSSL_NO_RFC3779 -DOPENSSL_NO_SCTP -DOPENSSL_NO_SSL_TRACE -DOPENSSL_NO_STORE -DOPENSSL_NO_UNIT_TEST
PEX_LIBS=
EX_LIBS=
EXE_EXT=
@ -71,7 +71,7 @@ RANLIB= /usr/bin/ranlib
NM= nm
PERL= /usr/bin/perl
TAR= tar
TARFLAGS= --no-recursion --record-size=10240
TARFLAGS= --no-recursion
MAKEDEPPROG=makedepend
LIBDIR=lib
@ -90,6 +90,7 @@ PROCESSOR=
# CPUID module collects small commonly used assembler snippets
CPUID_OBJ= mem_clr.o
BN_ASM= bn_asm.o
EC_ASM=
DES_ENC= des_enc.o fcrypt_b.o
AES_ENC= aes_core.o aes_cbc.o
BF_ENC= bf_enc.o
@ -223,8 +224,8 @@ BUILDENV= PLATFORM='$(PLATFORM)' PROCESSOR='$(PROCESSOR)' \
EXE_EXT='$(EXE_EXT)' SHARED_LIBS='$(SHARED_LIBS)' \
SHLIB_EXT='$(SHLIB_EXT)' SHLIB_TARGET='$(SHLIB_TARGET)' \
PEX_LIBS='$(PEX_LIBS)' EX_LIBS='$(EX_LIBS)' \
CPUID_OBJ='$(CPUID_OBJ)' \
BN_ASM='$(BN_ASM)' DES_ENC='$(DES_ENC)' \
CPUID_OBJ='$(CPUID_OBJ)' BN_ASM='$(BN_ASM)' \
EC_ASM='$(EC_ASM)' DES_ENC='$(DES_ENC)' \
AES_ENC='$(AES_ENC)' CMLL_ENC='$(CMLL_ENC)' \
BF_ENC='$(BF_ENC)' CAST_ENC='$(CAST_ENC)' \
RC4_ENC='$(RC4_ENC)' RC5_ENC='$(RC5_ENC)' \
@ -332,7 +333,7 @@ clean-shared:
done; \
fi; \
( set -x; rm -f lib$$i$(SHLIB_EXT) ); \
if [ "$(PLATFORM)" = "Cygwin" ]; then \
if expr "$(PLATFORM)" : "Cygwin" >/dev/null; then \
( set -x; rm -f cyg$$i$(SHLIB_EXT) lib$$i$(SHLIB_EXT).a ); \
fi; \
done
@ -381,11 +382,11 @@ libssl.pc: Makefile
echo 'libdir=$${exec_prefix}/$(LIBDIR)'; \
echo 'includedir=$${prefix}/include'; \
echo ''; \
echo 'Name: OpenSSL'; \
echo 'Name: OpenSSL-libssl'; \
echo 'Description: Secure Sockets Layer and cryptography libraries'; \
echo 'Version: '$(VERSION); \
echo 'Requires: '; \
echo 'Libs: -L$${libdir} -lssl -lcrypto'; \
echo 'Requires.private: libcrypto'; \
echo 'Libs: -L$${libdir} -lssl'; \
echo 'Libs.private: $(EX_LIBS)'; \
echo 'Cflags: -I$${includedir} $(KRB5_INCLUDES)' ) > libssl.pc
@ -398,10 +399,7 @@ openssl.pc: Makefile
echo 'Name: OpenSSL'; \
echo 'Description: Secure Sockets Layer and cryptography libraries and tools'; \
echo 'Version: '$(VERSION); \
echo 'Requires: '; \
echo 'Libs: -L$${libdir} -lssl -lcrypto'; \
echo 'Libs.private: $(EX_LIBS)'; \
echo 'Cflags: -I$${includedir} $(KRB5_INCLUDES)' ) > openssl.pc
echo 'Requires: libssl libcrypto' ) > openssl.pc
Makefile: Makefile.org Configure config
@echo "Makefile is older than Makefile.org, Configure or config."
@ -564,11 +562,7 @@ install_sw:
do \
if [ -f "$$i" -o -f "$$i.a" ]; then \
( echo installing $$i; \
if [ "$(PLATFORM)" != "Cygwin" ]; then \
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 555 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
else \
if expr "$(PLATFORM)" : "Cygwin" >/dev/null; then \
c=`echo $$i | sed 's/^lib\(.*\)\.dll\.a/cyg\1-$(SHLIB_VERSION_NUMBER).dll/'`; \
cp $$c $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$c.new; \
chmod 755 $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$c.new; \
@ -576,6 +570,10 @@ install_sw:
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 644 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
else \
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 555 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
fi ); \
if expr $(PLATFORM) : 'mingw' > /dev/null; then \
( case $$i in \
@ -608,6 +606,10 @@ install_sw:
install_html_docs:
here="`pwd`"; \
filecase=; \
case "$(PLATFORM)" in DJGPP|Cygwin*|mingw*|darwin*-*-cc) \
filecase=-i; \
esac; \
for subdir in apps crypto ssl; do \
mkdir -p $(INSTALL_PREFIX)$(HTMLDIR)/$$subdir; \
for i in doc/$$subdir/*.pod; do \
@ -636,9 +638,9 @@ install_docs:
@pod2man="`cd ./util; ./pod2mantest $(PERL)`"; \
here="`pwd`"; \
filecase=; \
if [ "$(PLATFORM)" = "DJGPP" -o "$(PLATFORM)" = "Cygwin" -o "$(PLATFORM)" = "mingw" ]; then \
case "$(PLATFORM)" in DJGPP|Cygwin*|mingw*|darwin*-*-cc) \
filecase=-i; \
fi; \
esac; \
set -e; for i in doc/apps/*.pod; do \
fn=`basename $$i .pod`; \
sec=`$(PERL) util/extract-section.pl 1 < $$i`; \

View File

@ -69,7 +69,7 @@ RANLIB= ranlib
NM= nm
PERL= perl
TAR= tar
TARFLAGS= --no-recursion --record-size=10240
TARFLAGS= --no-recursion
MAKEDEPPROG=makedepend
LIBDIR=lib
@ -88,6 +88,7 @@ PROCESSOR=
# CPUID module collects small commonly used assembler snippets
CPUID_OBJ=
BN_ASM= bn_asm.o
EC_ASM=
DES_ENC= des_enc.o fcrypt_b.o
AES_ENC= aes_core.o aes_cbc.o
BF_ENC= bf_enc.o
@ -221,8 +222,8 @@ BUILDENV= PLATFORM='$(PLATFORM)' PROCESSOR='$(PROCESSOR)' \
EXE_EXT='$(EXE_EXT)' SHARED_LIBS='$(SHARED_LIBS)' \
SHLIB_EXT='$(SHLIB_EXT)' SHLIB_TARGET='$(SHLIB_TARGET)' \
PEX_LIBS='$(PEX_LIBS)' EX_LIBS='$(EX_LIBS)' \
CPUID_OBJ='$(CPUID_OBJ)' \
BN_ASM='$(BN_ASM)' DES_ENC='$(DES_ENC)' \
CPUID_OBJ='$(CPUID_OBJ)' BN_ASM='$(BN_ASM)' \
EC_ASM='$(EC_ASM)' DES_ENC='$(DES_ENC)' \
AES_ENC='$(AES_ENC)' CMLL_ENC='$(CMLL_ENC)' \
BF_ENC='$(BF_ENC)' CAST_ENC='$(CAST_ENC)' \
RC4_ENC='$(RC4_ENC)' RC5_ENC='$(RC5_ENC)' \
@ -330,7 +331,7 @@ clean-shared:
done; \
fi; \
( set -x; rm -f lib$$i$(SHLIB_EXT) ); \
if [ "$(PLATFORM)" = "Cygwin" ]; then \
if expr "$(PLATFORM)" : "Cygwin" >/dev/null; then \
( set -x; rm -f cyg$$i$(SHLIB_EXT) lib$$i$(SHLIB_EXT).a ); \
fi; \
done
@ -379,11 +380,11 @@ libssl.pc: Makefile
echo 'libdir=$${exec_prefix}/$(LIBDIR)'; \
echo 'includedir=$${prefix}/include'; \
echo ''; \
echo 'Name: OpenSSL'; \
echo 'Name: OpenSSL-libssl'; \
echo 'Description: Secure Sockets Layer and cryptography libraries'; \
echo 'Version: '$(VERSION); \
echo 'Requires: '; \
echo 'Libs: -L$${libdir} -lssl -lcrypto'; \
echo 'Requires.private: libcrypto'; \
echo 'Libs: -L$${libdir} -lssl'; \
echo 'Libs.private: $(EX_LIBS)'; \
echo 'Cflags: -I$${includedir} $(KRB5_INCLUDES)' ) > libssl.pc
@ -396,10 +397,7 @@ openssl.pc: Makefile
echo 'Name: OpenSSL'; \
echo 'Description: Secure Sockets Layer and cryptography libraries and tools'; \
echo 'Version: '$(VERSION); \
echo 'Requires: '; \
echo 'Libs: -L$${libdir} -lssl -lcrypto'; \
echo 'Libs.private: $(EX_LIBS)'; \
echo 'Cflags: -I$${includedir} $(KRB5_INCLUDES)' ) > openssl.pc
echo 'Requires: libssl libcrypto' ) > openssl.pc
Makefile: Makefile.org Configure config
@echo "Makefile is older than Makefile.org, Configure or config."
@ -562,11 +560,7 @@ install_sw:
do \
if [ -f "$$i" -o -f "$$i.a" ]; then \
( echo installing $$i; \
if [ "$(PLATFORM)" != "Cygwin" ]; then \
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 555 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
else \
if expr "$(PLATFORM)" : "Cygwin" >/dev/null; then \
c=`echo $$i | sed 's/^lib\(.*\)\.dll\.a/cyg\1-$(SHLIB_VERSION_NUMBER).dll/'`; \
cp $$c $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$c.new; \
chmod 755 $(INSTALL_PREFIX)$(INSTALLTOP)/bin/$$c.new; \
@ -574,6 +568,10 @@ install_sw:
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 644 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
else \
cp $$i $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
chmod 555 $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new; \
mv -f $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i.new $(INSTALL_PREFIX)$(INSTALLTOP)/$(LIBDIR)/$$i; \
fi ); \
if expr $(PLATFORM) : 'mingw' > /dev/null; then \
( case $$i in \
@ -606,6 +604,10 @@ install_sw:
install_html_docs:
here="`pwd`"; \
filecase=; \
case "$(PLATFORM)" in DJGPP|Cygwin*|mingw*|darwin*-*-cc) \
filecase=-i; \
esac; \
for subdir in apps crypto ssl; do \
mkdir -p $(INSTALL_PREFIX)$(HTMLDIR)/$$subdir; \
for i in doc/$$subdir/*.pod; do \
@ -634,9 +636,9 @@ install_docs:
@pod2man="`cd ./util; ./pod2mantest $(PERL)`"; \
here="`pwd`"; \
filecase=; \
if [ "$(PLATFORM)" = "DJGPP" -o "$(PLATFORM)" = "Cygwin" -o "$(PLATFORM)" = "mingw" ]; then \
case "$(PLATFORM)" in DJGPP|Cygwin*|mingw*|darwin*-*-cc) \
filecase=-i; \
fi; \
esac; \
set -e; for i in doc/apps/*.pod; do \
fn=`basename $$i .pod`; \
sec=`$(PERL) util/extract-section.pl 1 < $$i`; \

25
NEWS
View File

@ -5,15 +5,15 @@
This file gives a brief overview of the major changes between each OpenSSL
release. For more details please read the CHANGES file.
Major changes between OpenSSL 1.0.1o and OpenSSL 1.0.1p [9 Jul 2015]
Major changes between OpenSSL 1.0.2c and OpenSSL 1.0.2d [9 Jul 2015]
o Alternate chains certificate forgery (CVE-2015-1793)
Major changes between OpenSSL 1.0.1n and OpenSSL 1.0.1o [12 Jun 2015]
Major changes between OpenSSL 1.0.2b and OpenSSL 1.0.2c [12 Jun 2015]
o Fix HMAC ABI incompatibility
Major changes between OpenSSL 1.0.1m and OpenSSL 1.0.1n [11 Jun 2015]
Major changes between OpenSSL 1.0.2a and OpenSSL 1.0.2b [11 Jun 2015]
o Malformed ECParameters causes infinite loop (CVE-2015-1788)
o Exploitable out-of-bounds read in X509_cmp_time (CVE-2015-1789)
@ -21,16 +21,33 @@
o CMS verify infinite loop with unknown hash function (CVE-2015-1792)
o Race condition handling NewSessionTicket (CVE-2015-1791)
Major changes between OpenSSL 1.0.1l and OpenSSL 1.0.1m [19 Mar 2015]
Major changes between OpenSSL 1.0.2 and OpenSSL 1.0.2a [19 Mar 2015]
o OpenSSL 1.0.2 ClientHello sigalgs DoS fix (CVE-2015-0291)
o Multiblock corrupted pointer fix (CVE-2015-0290)
o Segmentation fault in DTLSv1_listen fix (CVE-2015-0207)
o Segmentation fault in ASN1_TYPE_cmp fix (CVE-2015-0286)
o Segmentation fault for invalid PSS parameters fix (CVE-2015-0208)
o ASN.1 structure reuse memory corruption fix (CVE-2015-0287)
o PKCS7 NULL pointer dereferences fix (CVE-2015-0289)
o DoS via reachable assert in SSLv2 servers fix (CVE-2015-0293)
o Empty CKE with client auth and DHE fix (CVE-2015-1787)
o Handshake with unseeded PRNG fix (CVE-2015-0285)
o Use After Free following d2i_ECPrivatekey error fix (CVE-2015-0209)
o X509_to_X509_REQ NULL pointer deref fix (CVE-2015-0288)
o Removed the export ciphers from the DEFAULT ciphers
Major changes between OpenSSL 1.0.1l and OpenSSL 1.0.2 [22 Jan 2015]:
o Suite B support for TLS 1.2 and DTLS 1.2
o Support for DTLS 1.2
o TLS automatic EC curve selection.
o API to set TLS supported signature algorithms and curves
o SSL_CONF configuration API.
o TLS Brainpool support.
o ALPN support.
o CMS support for RSA-PSS, RSA-OAEP, ECDH and X9.42 DH.
Major changes between OpenSSL 1.0.1k and OpenSSL 1.0.1l [15 Jan 2015]
o Build fixes for the Windows and OpenVMS platforms

40
README
View File

@ -1,5 +1,5 @@
OpenSSL 1.0.1p 9 Jul 2015
OpenSSL 1.0.2d 9 Jul 2015
Copyright (c) 1998-2011 The OpenSSL Project
Copyright (c) 1995-1998 Eric A. Young, Tim J. Hudson
@ -90,32 +90,6 @@
SSL/TLS Client and Server Tests
Handling of S/MIME signed or encrypted mail
PATENTS
-------
Various companies hold various patents for various algorithms in various
locations around the world. _YOU_ are responsible for ensuring that your use
of any algorithms is legal by checking if there are any patents in your
country. The file contains some of the patents that we know about or are
rumored to exist. This is not a definitive list.
RSA Security holds software patents on the RC5 algorithm. If you
intend to use this cipher, you must contact RSA Security for
licensing conditions. Their web page is http://www.rsasecurity.com/.
RC4 is a trademark of RSA Security, so use of this label should perhaps
only be used with RSA Security's permission.
The IDEA algorithm is patented by Ascom in Austria, France, Germany, Italy,
Japan, the Netherlands, Spain, Sweden, Switzerland, UK and the USA. They
should be contacted if that algorithm is to be used; their web page is
http://www.ascom.ch/.
NTT and Mitsubishi have patents and pending patents on the Camellia
algorithm, but allow use at no charge without requiring an explicit
licensing agreement: http://info.isl.ntt.co.jp/crypt/eng/info/chiteki.html
INSTALLATION
------------
@ -161,8 +135,7 @@
- Problem Description (steps that will reproduce the problem, if known)
- Stack Traceback (if the application dumps core)
Report the bug to the OpenSSL project via the Request Tracker
(http://www.openssl.org/support/rt.html) by mail to:
Email the report to:
openssl-bugs@openssl.org
@ -170,10 +143,11 @@
or support queries. Just because something doesn't work the way you expect
does not mean it is necessarily a bug in OpenSSL.
Note that mail to openssl-bugs@openssl.org is recorded in the publicly
readable request tracker database and is forwarded to a public
mailing list. Confidential mail may be sent to openssl-security@openssl.org
(PGP key available from the key servers).
Note that mail to openssl-bugs@openssl.org is recorded in the public
request tracker database (see https://www.openssl.org/support/rt.html
for details) and also forwarded to a public mailing list. Confidential
mail may be sent to openssl-security@openssl.org (PGP key available from
the key servers).
HOW TO CONTRIBUTE TO OpenSSL
----------------------------

View File

@ -119,7 +119,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#if !defined(OPENSSL_SYSNAME_WIN32) && !defined(NETWARE_CLIB)
#if !defined(OPENSSL_SYSNAME_WIN32) && !defined(OPENSSL_SYSNAME_WINCE) && !defined(NETWARE_CLIB)
# include <strings.h>
#endif
#include <sys/types.h>
@ -285,6 +285,8 @@ int str2fmt(char *s)
return (FORMAT_PKCS12);
else if ((*s == 'E') || (*s == 'e'))
return (FORMAT_ENGINE);
else if ((*s == 'H') || (*s == 'h'))
return FORMAT_HTTP;
else if ((*s == 'P') || (*s == 'p')) {
if (s[1] == 'V' || s[1] == 'v')
return FORMAT_PVK;
@ -787,12 +789,72 @@ static int load_pkcs12(BIO *err, BIO *in, const char *desc,
return ret;
}
int load_cert_crl_http(const char *url, BIO *err,
X509 **pcert, X509_CRL **pcrl)
{
char *host = NULL, *port = NULL, *path = NULL;
BIO *bio = NULL;
OCSP_REQ_CTX *rctx = NULL;
int use_ssl, rv = 0;
if (!OCSP_parse_url(url, &host, &port, &path, &use_ssl))
goto err;
if (use_ssl) {
if (err)
BIO_puts(err, "https not supported\n");
goto err;
}
bio = BIO_new_connect(host);
if (!bio || !BIO_set_conn_port(bio, port))
goto err;
rctx = OCSP_REQ_CTX_new(bio, 1024);
if (!rctx)
goto err;
if (!OCSP_REQ_CTX_http(rctx, "GET", path))
goto err;
if (!OCSP_REQ_CTX_add1_header(rctx, "Host", host))
goto err;
if (pcert) {
do {
rv = X509_http_nbio(rctx, pcert);
}
while (rv == -1);
} else {
do {
rv = X509_CRL_http_nbio(rctx, pcrl);
} while (rv == -1);
}
err:
if (host)
OPENSSL_free(host);
if (path)
OPENSSL_free(path);
if (port)
OPENSSL_free(port);
if (bio)
BIO_free_all(bio);
if (rctx)
OCSP_REQ_CTX_free(rctx);
if (rv != 1) {
if (bio && err)
BIO_printf(bio_err, "Error loading %s from %s\n",
pcert ? "certificate" : "CRL", url);
ERR_print_errors(bio_err);
}
return rv;
}
X509 *load_cert(BIO *err, const char *file, int format,
const char *pass, ENGINE *e, const char *cert_descrip)
{
X509 *x = NULL;
BIO *cert;
if (format == FORMAT_HTTP) {
load_cert_crl_http(file, err, &x, NULL);
return x;
}
if ((cert = BIO_new(BIO_s_file())) == NULL) {
ERR_print_errors(err);
goto end;
@ -850,6 +912,49 @@ X509 *load_cert(BIO *err, const char *file, int format,
return (x);
}
X509_CRL *load_crl(const char *infile, int format)
{
X509_CRL *x = NULL;
BIO *in = NULL;
if (format == FORMAT_HTTP) {
load_cert_crl_http(infile, bio_err, NULL, &x);
return x;
}
in = BIO_new(BIO_s_file());
if (in == NULL) {
ERR_print_errors(bio_err);
goto end;
}
if (infile == NULL)
BIO_set_fp(in, stdin, BIO_NOCLOSE);
else {
if (BIO_read_filename(in, infile) <= 0) {
perror(infile);
goto end;
}
}
if (format == FORMAT_ASN1)
x = d2i_X509_CRL_bio(in, NULL);
else if (format == FORMAT_PEM)
x = PEM_read_bio_X509_CRL(in, NULL, NULL, NULL);
else {
BIO_printf(bio_err, "bad input format specified for input crl\n");
goto end;
}
if (x == NULL) {
BIO_printf(bio_err, "unable to load CRL\n");
ERR_print_errors(bio_err);
goto end;
}
end:
BIO_free(in);
return (x);
}
EVP_PKEY *load_key(BIO *err, const char *file, int format, int maybe_stdin,
const char *pass, ENGINE *e, const char *key_descrip)
{
@ -2159,6 +2264,9 @@ int args_verify(char ***pargs, int *pargc,
char **oldargs = *pargs;
char *arg = **pargs, *argn = (*pargs)[1];
time_t at_time = 0;
char *hostname = NULL;
char *email = NULL;
char *ipasc = NULL;
if (!strcmp(arg, "-policy")) {
if (!argn)
*badarg = 1;
@ -2212,6 +2320,21 @@ int args_verify(char ***pargs, int *pargc,
at_time = (time_t)timestamp;
}
(*pargs)++;
} else if (strcmp(arg, "-verify_hostname") == 0) {
if (!argn)
*badarg = 1;
hostname = argn;
(*pargs)++;
} else if (strcmp(arg, "-verify_email") == 0) {
if (!argn)
*badarg = 1;
email = argn;
(*pargs)++;
} else if (strcmp(arg, "-verify_ip") == 0) {
if (!argn)
*badarg = 1;
ipasc = argn;
(*pargs)++;
} else if (!strcmp(arg, "-ignore_critical"))
flags |= X509_V_FLAG_IGNORE_CRITICAL;
else if (!strcmp(arg, "-issuer_checks"))
@ -2238,6 +2361,16 @@ int args_verify(char ***pargs, int *pargc,
flags |= X509_V_FLAG_NOTIFY_POLICY;
else if (!strcmp(arg, "-check_ss_sig"))
flags |= X509_V_FLAG_CHECK_SS_SIGNATURE;
else if (!strcmp(arg, "-trusted_first"))
flags |= X509_V_FLAG_TRUSTED_FIRST;
else if (!strcmp(arg, "-suiteB_128_only"))
flags |= X509_V_FLAG_SUITEB_128_LOS_ONLY;
else if (!strcmp(arg, "-suiteB_128"))
flags |= X509_V_FLAG_SUITEB_128_LOS;
else if (!strcmp(arg, "-suiteB_192"))
flags |= X509_V_FLAG_SUITEB_192_LOS;
else if (!strcmp(arg, "-partial_chain"))
flags |= X509_V_FLAG_PARTIAL_CHAIN;
else if (!strcmp(arg, "-no_alt_chains"))
flags |= X509_V_FLAG_NO_ALT_CHAINS;
else
@ -2269,6 +2402,15 @@ int args_verify(char ***pargs, int *pargc,
if (at_time)
X509_VERIFY_PARAM_set_time(*pm, at_time);
if (hostname && !X509_VERIFY_PARAM_set1_host(*pm, hostname, 0))
*badarg = 1;
if (email && !X509_VERIFY_PARAM_set1_email(*pm, email, 0))
*badarg = 1;
if (ipasc && !X509_VERIFY_PARAM_set1_ip_asc(*pm, ipasc))
*badarg = 1;
end:
(*pargs)++;
@ -2552,6 +2694,9 @@ void jpake_client_auth(BIO *out, BIO *conn, const char *secret)
BIO_puts(out, "JPAKE authentication succeeded, setting PSK\n");
if (psk_key)
OPENSSL_free(psk_key);
psk_key = BN_bn2hex(JPAKE_get_shared_key(ctx));
BIO_pop(bconn);
@ -2581,6 +2726,9 @@ void jpake_server_auth(BIO *out, BIO *conn, const char *secret)
BIO_puts(out, "JPAKE authentication succeeded, setting PSK\n");
if (psk_key)
OPENSSL_free(psk_key);
psk_key = BN_bn2hex(JPAKE_get_shared_key(ctx));
BIO_pop(bconn);
@ -2591,7 +2739,7 @@ void jpake_server_auth(BIO *out, BIO *conn, const char *secret)
#endif
#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG)
#ifndef OPENSSL_NO_TLSEXT
/*-
* next_protos_parse parses a comma separated list of strings into a string
* in a format suitable for passing to SSL_CTX_set_next_protos_advertised.
@ -2630,8 +2778,106 @@ unsigned char *next_protos_parse(unsigned short *outlen, const char *in)
*outlen = len + 1;
return out;
}
#endif /* !OPENSSL_NO_TLSEXT &&
* !OPENSSL_NO_NEXTPROTONEG */
#endif /* ndef OPENSSL_NO_TLSEXT */
void print_cert_checks(BIO *bio, X509 *x,
const char *checkhost,
const char *checkemail, const char *checkip)
{
if (x == NULL)
return;
if (checkhost) {
BIO_printf(bio, "Hostname %s does%s match certificate\n",
checkhost, X509_check_host(x, checkhost, 0, 0, NULL) == 1
? "" : " NOT");
}
if (checkemail) {
BIO_printf(bio, "Email %s does%s match certificate\n",
checkemail, X509_check_email(x, checkemail, 0,
0) ? "" : " NOT");
}
if (checkip) {
BIO_printf(bio, "IP %s does%s match certificate\n",
checkip, X509_check_ip_asc(x, checkip, 0) ? "" : " NOT");
}
}
/* Get first http URL from a DIST_POINT structure */
static const char *get_dp_url(DIST_POINT *dp)
{
GENERAL_NAMES *gens;
GENERAL_NAME *gen;
int i, gtype;
ASN1_STRING *uri;
if (!dp->distpoint || dp->distpoint->type != 0)
return NULL;
gens = dp->distpoint->name.fullname;
for (i = 0; i < sk_GENERAL_NAME_num(gens); i++) {
gen = sk_GENERAL_NAME_value(gens, i);
uri = GENERAL_NAME_get0_value(gen, &gtype);
if (gtype == GEN_URI && ASN1_STRING_length(uri) > 6) {
char *uptr = (char *)ASN1_STRING_data(uri);
if (!strncmp(uptr, "http://", 7))
return uptr;
}
}
return NULL;
}
/*
* Look through a CRLDP structure and attempt to find an http URL to
* downloads a CRL from.
*/
static X509_CRL *load_crl_crldp(STACK_OF(DIST_POINT) *crldp)
{
int i;
const char *urlptr = NULL;
for (i = 0; i < sk_DIST_POINT_num(crldp); i++) {
DIST_POINT *dp = sk_DIST_POINT_value(crldp, i);
urlptr = get_dp_url(dp);
if (urlptr)
return load_crl(urlptr, FORMAT_HTTP);
}
return NULL;
}
/*
* Example of downloading CRLs from CRLDP: not usable for real world as it
* always downloads, doesn't support non-blocking I/O and doesn't cache
* anything.
*/
static STACK_OF(X509_CRL) *crls_http_cb(X509_STORE_CTX *ctx, X509_NAME *nm)
{
X509 *x;
STACK_OF(X509_CRL) *crls = NULL;
X509_CRL *crl;
STACK_OF(DIST_POINT) *crldp;
x = X509_STORE_CTX_get_current_cert(ctx);
crldp = X509_get_ext_d2i(x, NID_crl_distribution_points, NULL, NULL);
crl = load_crl_crldp(crldp);
sk_DIST_POINT_pop_free(crldp, DIST_POINT_free);
if (!crl)
return NULL;
crls = sk_X509_CRL_new_null();
sk_X509_CRL_push(crls, crl);
/* Try to download delta CRL */
crldp = X509_get_ext_d2i(x, NID_freshest_crl, NULL, NULL);
crl = load_crl_crldp(crldp);
sk_DIST_POINT_pop_free(crldp, DIST_POINT_free);
if (crl)
sk_X509_CRL_push(crls, crl);
return crls;
}
void store_setup_crl_download(X509_STORE *st)
{
X509_STORE_set_lookup_crls_cb(st, crls_http_cb);
}
/*
* Platform-specific sections

View File

@ -205,7 +205,7 @@ extern BIO *bio_err;
# endif
# endif
# ifdef OPENSSL_SYSNAME_WIN32
# if defined(OPENSSL_SYSNAME_WIN32) || defined(OPENSSL_SYSNAME_WINCE)
# define openssl_fdset(a,b) FD_SET((unsigned int)a, b)
# else
# define openssl_fdset(a,b) FD_SET(a, b)
@ -245,6 +245,9 @@ int app_passwd(BIO *err, char *arg1, char *arg2, char **pass1, char **pass2);
int add_oid_section(BIO *err, CONF *conf);
X509 *load_cert(BIO *err, const char *file, int format,
const char *pass, ENGINE *e, const char *cert_descrip);
X509_CRL *load_crl(const char *infile, int format);
int load_cert_crl_http(const char *url, BIO *err,
X509 **pcert, X509_CRL **pcrl);
EVP_PKEY *load_key(BIO *err, const char *file, int format, int maybe_stdin,
const char *pass, ENGINE *e, const char *key_descrip);
EVP_PKEY *load_pubkey(BIO *err, const char *file, int format, int maybe_stdin,
@ -262,8 +265,9 @@ ENGINE *setup_engine(BIO *err, const char *engine, int debug);
# ifndef OPENSSL_NO_OCSP
OCSP_RESPONSE *process_responder(BIO *err, OCSP_REQUEST *req,
char *host, char *path, char *port,
int use_ssl, STACK_OF(CONF_VALUE) *headers,
const char *host, const char *path,
const char *port, int use_ssl,
const STACK_OF(CONF_VALUE) *headers,
int req_timeout);
# endif
@ -334,10 +338,15 @@ void jpake_client_auth(BIO *out, BIO *conn, const char *secret);
void jpake_server_auth(BIO *out, BIO *conn, const char *secret);
# endif
# if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG)
# ifndef OPENSSL_NO_TLSEXT
unsigned char *next_protos_parse(unsigned short *outlen, const char *in);
# endif /* !OPENSSL_NO_TLSEXT &&
* !OPENSSL_NO_NEXTPROTONEG */
# endif /* ndef OPENSSL_NO_TLSEXT */
void print_cert_checks(BIO *bio, X509 *x,
const char *checkhost,
const char *checkemail, const char *checkip);
void store_setup_crl_download(X509_STORE *st);
# define FORMAT_UNDEF 0
# define FORMAT_ASN1 1
@ -353,6 +362,7 @@ unsigned char *next_protos_parse(unsigned short *outlen, const char *in);
# define FORMAT_ASN1RSA 10 /* DER RSAPubicKey format */
# define FORMAT_MSBLOB 11 /* MS Key blob format */
# define FORMAT_PVK 12 /* MS PVK file format */
# define FORMAT_HTTP 13 /* Download using HTTP */
# define EXT_COPY_NONE 0
# define EXT_COPY_ADD 1

View File

@ -479,6 +479,11 @@ int MAIN(int argc, char **argv)
goto bad;
infile = *(++argv);
dorevoke = 1;
} else if (strcmp(*argv, "-valid") == 0) {
if (--argc < 1)
goto bad;
infile = *(++argv);
dorevoke = 2;
} else if (strcmp(*argv, "-extensions") == 0) {
if (--argc < 1)
goto bad;
@ -1441,6 +1446,8 @@ int MAIN(int argc, char **argv)
revcert = load_cert(bio_err, infile, FORMAT_PEM, NULL, e, infile);
if (revcert == NULL)
goto err;
if (dorevoke == 2)
rev_type = -1;
j = do_revoke(revcert, db, rev_type, rev_arg);
if (j <= 0)
goto err;
@ -1968,8 +1975,12 @@ static int do_body(X509 **xret, EVP_PKEY *pkey, X509 *x509,
if (enddate == NULL)
X509_time_adj_ex(X509_get_notAfter(ret), days, 0, NULL);
else
else {
int tdays;
ASN1_TIME_set_string(X509_get_notAfter(ret), enddate);
ASN1_TIME_diff(&tdays, NULL, NULL, X509_get_notAfter(ret));
days = tdays;
}
if (!X509_set_subject_name(ret, subject))
goto err;
@ -2409,13 +2420,20 @@ static int do_revoke(X509 *x509, CA_DB *db, int type, char *value)
}
/* Revoke Certificate */
ok = do_revoke(x509, db, type, value);
if (type == -1)
ok = 1;
else
ok = do_revoke(x509, db, type, value);
goto err;
} else if (index_name_cmp_noconst(row, rrow)) {
BIO_printf(bio_err, "ERROR:name does not match %s\n", row[DB_name]);
goto err;
} else if (type == -1) {
BIO_printf(bio_err, "ERROR:Already present, serial number %s\n",
row[DB_serial]);
goto err;
} else if (rrow[DB_type][0] == 'R') {
BIO_printf(bio_err, "ERROR:Already revoked, serial number %s\n",
row[DB_serial]);

View File

@ -85,6 +85,9 @@ int MAIN(int argc, char **argv)
{
int ret = 1, i;
int verbose = 0, Verbose = 0;
#ifndef OPENSSL_NO_SSL_TRACE
int stdname = 0;
#endif
const char **pp;
const char *p;
int badops = 0;
@ -119,6 +122,10 @@ int MAIN(int argc, char **argv)
verbose = 1;
else if (strcmp(*argv, "-V") == 0)
verbose = Verbose = 1;
#ifndef OPENSSL_NO_SSL_TRACE
else if (strcmp(*argv, "-stdname") == 0)
stdname = verbose = 1;
#endif
#ifndef OPENSSL_NO_SSL2
else if (strcmp(*argv, "-ssl2") == 0)
meth = SSLv2_client_method();
@ -202,7 +209,14 @@ int MAIN(int argc, char **argv)
id1, id2, id3);
}
}
#ifndef OPENSSL_NO_SSL_TRACE
if (stdname) {
const char *nm = SSL_CIPHER_standard_name(c);
if (nm == NULL)
nm = "UNKNOWN";
BIO_printf(STDout, "%s - ", nm);
}
#endif
BIO_puts(STDout, SSL_CIPHER_description(c, buf, sizeof buf));
}
}

View File

@ -75,6 +75,8 @@ static void receipt_request_print(BIO *out, CMS_ContentInfo *cms);
static CMS_ReceiptRequest *make_receipt_request(STACK_OF(OPENSSL_STRING)
*rr_to, int rr_allorfirst, STACK_OF(OPENSSL_STRING)
*rr_from);
static int cms_set_pkey_param(EVP_PKEY_CTX *pctx,
STACK_OF(OPENSSL_STRING) *param);
# define SMIME_OP 0x10
# define SMIME_IP 0x20
@ -98,6 +100,14 @@ static CMS_ReceiptRequest *make_receipt_request(STACK_OF(OPENSSL_STRING)
int verify_err = 0;
typedef struct cms_key_param_st cms_key_param;
struct cms_key_param_st {
int idx;
STACK_OF(OPENSSL_STRING) *param;
cms_key_param *next;
};
int MAIN(int, char **);
int MAIN(int argc, char **argv)
@ -112,7 +122,7 @@ int MAIN(int argc, char **argv)
STACK_OF(OPENSSL_STRING) *sksigners = NULL, *skkeys = NULL;
char *certfile = NULL, *keyfile = NULL, *contfile = NULL;
char *certsoutfile = NULL;
const EVP_CIPHER *cipher = NULL;
const EVP_CIPHER *cipher = NULL, *wrap_cipher = NULL;
CMS_ContentInfo *cms = NULL, *rcms = NULL;
X509_STORE *store = NULL;
X509 *cert = NULL, *recip = NULL, *signer = NULL;
@ -140,6 +150,8 @@ int MAIN(int argc, char **argv)
unsigned char *pwri_pass = NULL, *pwri_tmp = NULL;
size_t secret_keylen = 0, secret_keyidlen = 0;
cms_key_param *key_first = NULL, *key_param = NULL;
ASN1_OBJECT *econtent_type = NULL;
X509_VERIFY_PARAM *vpm = NULL;
@ -201,6 +213,8 @@ int MAIN(int argc, char **argv)
cipher = EVP_des_ede3_cbc();
else if (!strcmp(*args, "-des"))
cipher = EVP_des_cbc();
else if (!strcmp(*args, "-des3-wrap"))
wrap_cipher = EVP_des_ede3_wrap();
# endif
# ifndef OPENSSL_NO_SEED
else if (!strcmp(*args, "-seed"))
@ -221,6 +235,12 @@ int MAIN(int argc, char **argv)
cipher = EVP_aes_192_cbc();
else if (!strcmp(*args, "-aes256"))
cipher = EVP_aes_256_cbc();
else if (!strcmp(*args, "-aes128-wrap"))
wrap_cipher = EVP_aes_128_wrap();
else if (!strcmp(*args, "-aes192-wrap"))
wrap_cipher = EVP_aes_192_wrap();
else if (!strcmp(*args, "-aes256-wrap"))
wrap_cipher = EVP_aes_256_wrap();
# endif
# ifndef OPENSSL_NO_CAMELLIA
else if (!strcmp(*args, "-camellia128"))
@ -378,7 +398,17 @@ int MAIN(int argc, char **argv)
} else if (!strcmp(*args, "-recip")) {
if (!args[1])
goto argerr;
recipfile = *++args;
if (operation == SMIME_ENCRYPT) {
if (!encerts)
encerts = sk_X509_new_null();
cert = load_cert(bio_err, *++args, FORMAT_PEM,
NULL, e, "recipient certificate file");
if (!cert)
goto end;
sk_X509_push(encerts, cert);
cert = NULL;
} else
recipfile = *++args;
} else if (!strcmp(*args, "-certsout")) {
if (!args[1])
goto argerr;
@ -413,6 +443,40 @@ int MAIN(int argc, char **argv)
if (!args[1])
goto argerr;
keyform = str2fmt(*++args);
} else if (!strcmp(*args, "-keyopt")) {
int keyidx = -1;
if (!args[1])
goto argerr;
if (operation == SMIME_ENCRYPT) {
if (encerts)
keyidx += sk_X509_num(encerts);
} else {
if (keyfile || signerfile)
keyidx++;
if (skkeys)
keyidx += sk_OPENSSL_STRING_num(skkeys);
}
if (keyidx < 0) {
BIO_printf(bio_err, "No key specified\n");
goto argerr;
}
if (key_param == NULL || key_param->idx != keyidx) {
cms_key_param *nparam;
nparam = OPENSSL_malloc(sizeof(cms_key_param));
if (!nparam) {
BIO_printf(bio_err, "Out of memory\n");
goto argerr;
}
nparam->idx = keyidx;
nparam->param = sk_OPENSSL_STRING_new_null();
nparam->next = NULL;
if (key_first == NULL)
key_first = nparam;
else
key_param->next = nparam;
key_param = nparam;
}
sk_OPENSSL_STRING_push(key_param->param, *++args);
} else if (!strcmp(*args, "-rctform")) {
if (!args[1])
goto argerr;
@ -502,7 +566,7 @@ int MAIN(int argc, char **argv)
badarg = 1;
}
} else if (operation == SMIME_ENCRYPT) {
if (!*args && !secret_key && !pwri_pass) {
if (!*args && !secret_key && !pwri_pass && !encerts) {
BIO_printf(bio_err, "No recipient(s) certificate(s) specified\n");
badarg = 1;
}
@ -567,6 +631,7 @@ int MAIN(int argc, char **argv)
"-inkey file input private key (if not signer or recipient)\n");
BIO_printf(bio_err,
"-keyform arg input private key format (PEM or ENGINE)\n");
BIO_printf(bio_err, "-keyopt nm:v set public key parameters\n");
BIO_printf(bio_err, "-out file output file\n");
BIO_printf(bio_err,
"-outform arg output format SMIME (default), PEM or DER\n");
@ -652,7 +717,7 @@ int MAIN(int argc, char **argv)
goto end;
}
if (*args)
if (*args && !encerts)
encerts = sk_X509_new_null();
while (*args) {
if (!(cert = load_cert(bio_err, *args, FORMAT_PEM,
@ -804,10 +869,39 @@ int MAIN(int argc, char **argv)
} else if (operation == SMIME_COMPRESS) {
cms = CMS_compress(in, -1, flags);
} else if (operation == SMIME_ENCRYPT) {
int i;
flags |= CMS_PARTIAL;
cms = CMS_encrypt(encerts, in, cipher, flags);
cms = CMS_encrypt(NULL, in, cipher, flags);
if (!cms)
goto end;
for (i = 0; i < sk_X509_num(encerts); i++) {
CMS_RecipientInfo *ri;
cms_key_param *kparam;
int tflags = flags;
X509 *x = sk_X509_value(encerts, i);
for (kparam = key_first; kparam; kparam = kparam->next) {
if (kparam->idx == i) {
tflags |= CMS_KEY_PARAM;
break;
}
}
ri = CMS_add1_recipient_cert(cms, x, tflags);
if (!ri)
goto end;
if (kparam) {
EVP_PKEY_CTX *pctx;
pctx = CMS_RecipientInfo_get0_pkey_ctx(ri);
if (!cms_set_pkey_param(pctx, kparam->param))
goto end;
}
if (CMS_RecipientInfo_type(ri) == CMS_RECIPINFO_AGREE
&& wrap_cipher) {
EVP_CIPHER_CTX *wctx;
wctx = CMS_RecipientInfo_kari_get0_ctx(ri);
EVP_EncryptInit_ex(wctx, wrap_cipher, NULL, NULL, NULL);
}
}
if (secret_key) {
if (!CMS_add0_recipient_key(cms, NID_undef,
secret_key, secret_keylen,
@ -880,8 +974,11 @@ int MAIN(int argc, char **argv)
flags |= CMS_REUSE_DIGEST;
for (i = 0; i < sk_OPENSSL_STRING_num(sksigners); i++) {
CMS_SignerInfo *si;
cms_key_param *kparam;
int tflags = flags;
signerfile = sk_OPENSSL_STRING_value(sksigners, i);
keyfile = sk_OPENSSL_STRING_value(skkeys, i);
signer = load_cert(bio_err, signerfile, FORMAT_PEM, NULL,
e, "signer certificate");
if (!signer)
@ -890,9 +987,21 @@ int MAIN(int argc, char **argv)
"signing key file");
if (!key)
goto end;
si = CMS_add1_signer(cms, signer, key, sign_md, flags);
for (kparam = key_first; kparam; kparam = kparam->next) {
if (kparam->idx == i) {
tflags |= CMS_KEY_PARAM;
break;
}
}
si = CMS_add1_signer(cms, signer, key, sign_md, tflags);
if (!si)
goto end;
if (kparam) {
EVP_PKEY_CTX *pctx;
pctx = CMS_SignerInfo_get0_pkey_ctx(si);
if (!cms_set_pkey_param(pctx, kparam->param))
goto end;
}
if (rr && !CMS_add1_ReceiptRequest(si, rr))
goto end;
X509_free(signer);
@ -1047,6 +1156,13 @@ int MAIN(int argc, char **argv)
sk_OPENSSL_STRING_free(rr_to);
if (rr_from)
sk_OPENSSL_STRING_free(rr_from);
for (key_param = key_first; key_param;) {
cms_key_param *tparam;
sk_OPENSSL_STRING_free(key_param->param);
tparam = key_param->next;
OPENSSL_free(key_param);
key_param = tparam;
}
X509_STORE_free(store);
X509_free(cert);
X509_free(recip);
@ -1220,4 +1336,22 @@ static CMS_ReceiptRequest *make_receipt_request(STACK_OF(OPENSSL_STRING)
return NULL;
}
static int cms_set_pkey_param(EVP_PKEY_CTX *pctx,
STACK_OF(OPENSSL_STRING) *param)
{
char *keyopt;
int i;
if (sk_OPENSSL_STRING_num(param) <= 0)
return 1;
for (i = 0; i < sk_OPENSSL_STRING_num(param); i++) {
keyopt = sk_OPENSSL_STRING_value(param, i);
if (pkey_ctrl_string(pctx, keyopt) <= 0) {
BIO_printf(bio_err, "parameter error \"%s\"\n", keyopt);
ERR_print_errors(bio_err);
return 0;
}
}
return 1;
}
#endif

View File

@ -96,7 +96,6 @@ static const char *crl_usage[] = {
NULL
};
static X509_CRL *load_crl(char *file, int format);
static BIO *bio_out = NULL;
int MAIN(int, char **);
@ -106,10 +105,10 @@ int MAIN(int argc, char **argv)
unsigned long nmflag = 0;
X509_CRL *x = NULL;
char *CAfile = NULL, *CApath = NULL;
int ret = 1, i, num, badops = 0;
int ret = 1, i, num, badops = 0, badsig = 0;
BIO *out = NULL;
int informat, outformat;
char *infile = NULL, *outfile = NULL;
int informat, outformat, keyformat;
char *infile = NULL, *outfile = NULL, *crldiff = NULL, *keyfile = NULL;
int hash = 0, issuer = 0, lastupdate = 0, nextupdate = 0, noout =
0, text = 0;
#ifndef OPENSSL_NO_MD5
@ -147,6 +146,7 @@ int MAIN(int argc, char **argv)
informat = FORMAT_PEM;
outformat = FORMAT_PEM;
keyformat = FORMAT_PEM;
argc--;
argv++;
@ -173,6 +173,18 @@ int MAIN(int argc, char **argv)
if (--argc < 1)
goto bad;
infile = *(++argv);
} else if (strcmp(*argv, "-gendelta") == 0) {
if (--argc < 1)
goto bad;
crldiff = *(++argv);
} else if (strcmp(*argv, "-key") == 0) {
if (--argc < 1)
goto bad;
keyfile = *(++argv);
} else if (strcmp(*argv, "-keyform") == 0) {
if (--argc < 1)
goto bad;
keyformat = str2fmt(*(++argv));
} else if (strcmp(*argv, "-out") == 0) {
if (--argc < 1)
goto bad;
@ -214,6 +226,8 @@ int MAIN(int argc, char **argv)
fingerprint = ++num;
else if (strcmp(*argv, "-crlnumber") == 0)
crlnumber = ++num;
else if (strcmp(*argv, "-badsig") == 0)
badsig = 1;
else if ((md_alg = EVP_get_digestbyname(*argv + 1))) {
/* ok */
digest = md_alg;
@ -281,6 +295,33 @@ int MAIN(int argc, char **argv)
BIO_printf(bio_err, "verify OK\n");
}
if (crldiff) {
X509_CRL *newcrl, *delta;
if (!keyfile) {
BIO_puts(bio_err, "Missing CRL signing key\n");
goto end;
}
newcrl = load_crl(crldiff, informat);
if (!newcrl)
goto end;
pkey = load_key(bio_err, keyfile, keyformat, 0, NULL, NULL,
"CRL signing key");
if (!pkey) {
X509_CRL_free(newcrl);
goto end;
}
delta = X509_CRL_diff(x, newcrl, pkey, digest, 0);
X509_CRL_free(newcrl);
EVP_PKEY_free(pkey);
if (delta) {
X509_CRL_free(x);
x = delta;
} else {
BIO_puts(bio_err, "Error creating delta CRL\n");
goto end;
}
}
if (num) {
for (i = 1; i <= num; i++) {
if (issuer == i) {
@ -369,6 +410,9 @@ int MAIN(int argc, char **argv)
goto end;
}
if (badsig)
x->signature->data[x->signature->length - 1] ^= 0x1;
if (outformat == FORMAT_ASN1)
i = (int)i2d_X509_CRL_bio(out, x);
else if (outformat == FORMAT_PEM)
@ -383,6 +427,8 @@ int MAIN(int argc, char **argv)
}
ret = 0;
end:
if (ret != 0)
ERR_print_errors(bio_err);
BIO_free_all(out);
BIO_free_all(bio_out);
bio_out = NULL;
@ -394,41 +440,3 @@ int MAIN(int argc, char **argv)
apps_shutdown();
OPENSSL_EXIT(ret);
}
static X509_CRL *load_crl(char *infile, int format)
{
X509_CRL *x = NULL;
BIO *in = NULL;
in = BIO_new(BIO_s_file());
if (in == NULL) {
ERR_print_errors(bio_err);
goto end;
}
if (infile == NULL)
BIO_set_fp(in, stdin, BIO_NOCLOSE);
else {
if (BIO_read_filename(in, infile) <= 0) {
perror(infile);
goto end;
}
}
if (format == FORMAT_ASN1)
x = d2i_X509_CRL_bio(in, NULL);
else if (format == FORMAT_PEM)
x = PEM_read_bio_X509_CRL(in, NULL, NULL, NULL);
else {
BIO_printf(bio_err, "bad input format specified for input crl\n");
goto end;
}
if (x == NULL) {
BIO_printf(bio_err, "unable to load CRL\n");
ERR_print_errors(bio_err);
goto end;
}
end:
BIO_free(in);
return (x);
}

View File

@ -103,7 +103,7 @@ int MAIN(int, char **);
int MAIN(int argc, char **argv)
{
ENGINE *e = NULL;
ENGINE *e = NULL, *impl = NULL;
unsigned char *buf = NULL;
int i, err = 1;
const EVP_MD *md = NULL, *m;
@ -124,6 +124,7 @@ int MAIN(int argc, char **argv)
char *passargin = NULL, *passin = NULL;
#ifndef OPENSSL_NO_ENGINE
char *engine = NULL;
int engine_impl = 0;
#endif
char *hmac_key = NULL;
char *mac_name = NULL;
@ -199,7 +200,8 @@ int MAIN(int argc, char **argv)
break;
engine = *(++argv);
e = setup_engine(bio_err, engine, 0);
}
} else if (strcmp(*argv, "-engine_impl") == 0)
engine_impl = 1;
#endif
else if (strcmp(*argv, "-hex") == 0)
out_bin = 0;
@ -284,6 +286,10 @@ int MAIN(int argc, char **argv)
EVP_MD_do_all_sorted(list_md_fn, bio_err);
goto end;
}
#ifndef OPENSSL_NO_ENGINE
if (engine_impl)
impl = e;
#endif
in = BIO_new(BIO_s_file());
bmd = BIO_new(BIO_f_md());
@ -357,7 +363,7 @@ int MAIN(int argc, char **argv)
if (mac_name) {
EVP_PKEY_CTX *mac_ctx = NULL;
int r = 0;
if (!init_gen_str(bio_err, &mac_ctx, mac_name, e, 0))
if (!init_gen_str(bio_err, &mac_ctx, mac_name, impl, 0))
goto mac_end;
if (macopts) {
char *macopt;
@ -391,7 +397,7 @@ int MAIN(int argc, char **argv)
}
if (hmac_key) {
sigkey = EVP_PKEY_new_mac_key(EVP_PKEY_HMAC, e,
sigkey = EVP_PKEY_new_mac_key(EVP_PKEY_HMAC, impl,
(unsigned char *)hmac_key, -1);
if (!sigkey)
goto end;
@ -407,9 +413,9 @@ int MAIN(int argc, char **argv)
goto end;
}
if (do_verify)
r = EVP_DigestVerifyInit(mctx, &pctx, md, NULL, sigkey);
r = EVP_DigestVerifyInit(mctx, &pctx, md, impl, sigkey);
else
r = EVP_DigestSignInit(mctx, &pctx, md, NULL, sigkey);
r = EVP_DigestSignInit(mctx, &pctx, md, impl, sigkey);
if (!r) {
BIO_printf(bio_err, "Error setting context\n");
ERR_print_errors(bio_err);
@ -429,9 +435,15 @@ int MAIN(int argc, char **argv)
}
/* we use md as a filter, reading from 'in' */
else {
EVP_MD_CTX *mctx = NULL;
if (!BIO_get_md_ctx(bmd, &mctx)) {
BIO_printf(bio_err, "Error getting context\n");
ERR_print_errors(bio_err);
goto end;
}
if (md == NULL)
md = EVP_md5();
if (!BIO_set_md(bmd, md)) {
if (!EVP_DigestInit_ex(mctx, md, impl)) {
BIO_printf(bio_err, "Error setting digest %s\n", pname);
ERR_print_errors(bio_err);
goto end;
@ -483,7 +495,8 @@ int MAIN(int argc, char **argv)
EVP_PKEY_asn1_get0_info(NULL, NULL,
NULL, NULL, &sig_name, ameth);
}
md_name = EVP_MD_name(md);
if (md)
md_name = EVP_MD_name(md);
}
err = 0;
for (i = 0; i < argc; i++) {
@ -581,9 +594,12 @@ int do_fp(BIO *out, unsigned char *buf, BIO *bp, int sep, int binout,
BIO_printf(out, "%02x", buf[i]);
BIO_printf(out, " *%s\n", file);
} else {
if (sig_name)
BIO_printf(out, "%s-%s(%s)= ", sig_name, md_name, file);
else if (md_name)
if (sig_name) {
BIO_puts(out, sig_name);
if (md_name)
BIO_printf(out, "-%s", md_name);
BIO_printf(out, "(%s)= ", file);
} else if (md_name)
BIO_printf(out, "%s(%s)= ", md_name, file);
else
BIO_printf(out, "(%s)= ", file);

View File

@ -489,9 +489,12 @@ int MAIN(int argc, char **argv)
if (!noout) {
if (outformat == FORMAT_ASN1)
i = i2d_DHparams_bio(out, dh);
else if (outformat == FORMAT_PEM)
i = PEM_write_bio_DHparams(out, dh);
else {
else if (outformat == FORMAT_PEM) {
if (dh->q)
i = PEM_write_bio_DHxparams(out, dh);
else
i = PEM_write_bio_DHparams(out, dh);
} else {
BIO_printf(bio_err, "bad output format specified for outfile\n");
goto end;
}

View File

@ -370,6 +370,9 @@ int MAIN(int argc, char **argv)
} else
nid = OBJ_sn2nid(curve_name);
if (nid == 0)
nid = EC_curve_nist2nid(curve_name);
if (nid == 0) {
BIO_printf(bio_err, "unknown curve name (%s)\n", curve_name);
goto end;

View File

@ -80,7 +80,7 @@
# include <openssl/pem.h>
# include <openssl/rand.h>
# define DEFBITS 1024
# define DEFBITS 2048
# undef PROG
# define PROG genrsa_main

View File

@ -110,16 +110,17 @@ static int print_ocsp_summary(BIO *out, OCSP_BASICRESP *bs, OCSP_REQUEST *req,
static int make_ocsp_response(OCSP_RESPONSE **resp, OCSP_REQUEST *req,
CA_DB *db, X509 *ca, X509 *rcert,
EVP_PKEY *rkey, STACK_OF(X509) *rother,
unsigned long flags, int nmin, int ndays);
EVP_PKEY *rkey, const EVP_MD *md,
STACK_OF(X509) *rother, unsigned long flags,
int nmin, int ndays, int badsig);
static char **lookup_serial(CA_DB *db, ASN1_INTEGER *ser);
static BIO *init_responder(char *port);
static BIO *init_responder(const char *port);
static int do_responder(OCSP_REQUEST **preq, BIO **pcbio, BIO *acbio,
char *port);
const char *port);
static int send_ocsp_response(BIO *cbio, OCSP_RESPONSE *resp);
static OCSP_RESPONSE *query_responder(BIO *err, BIO *cbio, char *path,
STACK_OF(CONF_VALUE) *headers,
static OCSP_RESPONSE *query_responder(BIO *err, BIO *cbio, const char *path,
const STACK_OF(CONF_VALUE) *headers,
OCSP_REQUEST *req, int req_timeout);
# undef PROG
@ -154,12 +155,14 @@ int MAIN(int argc, char **argv)
long nsec = MAX_VALIDITY_PERIOD, maxage = -1;
char *CAfile = NULL, *CApath = NULL;
X509_STORE *store = NULL;
X509_VERIFY_PARAM *vpm = NULL;
STACK_OF(X509) *sign_other = NULL, *verify_other = NULL, *rother = NULL;
char *sign_certfile = NULL, *verify_certfile = NULL, *rcertfile = NULL;
unsigned long sign_flags = 0, verify_flags = 0, rflags = 0;
int ret = 1;
int accept_count = -1;
int badarg = 0;
int badsig = 0;
int i;
int ignore_err = 0;
STACK_OF(OPENSSL_STRING) *reqnames = NULL;
@ -170,7 +173,7 @@ int MAIN(int argc, char **argv)
char *rca_filename = NULL;
CA_DB *rdb = NULL;
int nmin = 0, ndays = -1;
const EVP_MD *cert_id_md = NULL;
const EVP_MD *cert_id_md = NULL, *rsign_md = NULL;
if (bio_err == NULL)
bio_err = BIO_new_fp(stderr, BIO_NOCLOSE);
@ -206,6 +209,7 @@ int MAIN(int argc, char **argv)
OPENSSL_free(tport);
if (tpath)
OPENSSL_free(tpath);
thost = tport = tpath = NULL;
if (args[1]) {
args++;
if (!OCSP_parse_url(*args, &host, &port, &path, &use_ssl)) {
@ -264,6 +268,8 @@ int MAIN(int argc, char **argv)
verify_flags |= OCSP_TRUSTOTHER;
else if (!strcmp(*args, "-no_intern"))
verify_flags |= OCSP_NOINTERN;
else if (!strcmp(*args, "-badsig"))
badsig = 1;
else if (!strcmp(*args, "-text")) {
req_text = 1;
resp_text = 1;
@ -320,6 +326,10 @@ int MAIN(int argc, char **argv)
CApath = *args;
} else
badarg = 1;
} else if (args_verify(&args, NULL, &badarg, bio_err, &vpm)) {
if (badarg)
goto end;
continue;
} else if (!strcmp(*args, "-validity_period")) {
if (args[1]) {
args++;
@ -465,6 +475,14 @@ int MAIN(int argc, char **argv)
rcertfile = *args;
} else
badarg = 1;
} else if (!strcmp(*args, "-rmd")) {
if (args[1]) {
args++;
rsign_md = EVP_get_digestbyname(*args);
if (!rsign_md)
badarg = 1;
} else
badarg = 1;
} else if ((cert_id_md = EVP_get_digestbyname((*args) + 1)) == NULL) {
badarg = 1;
}
@ -584,7 +602,10 @@ int MAIN(int argc, char **argv)
add_nonce = 0;
if (!req && reqin) {
derbio = BIO_new_file(reqin, "rb");
if (!strcmp(reqin, "-"))
derbio = BIO_new_fp(stdin, BIO_NOCLOSE);
else
derbio = BIO_new_file(reqin, "rb");
if (!derbio) {
BIO_printf(bio_err, "Error Opening OCSP request file\n");
goto end;
@ -681,7 +702,10 @@ int MAIN(int argc, char **argv)
OCSP_REQUEST_print(out, req, 0);
if (reqout) {
derbio = BIO_new_file(reqout, "wb");
if (!strcmp(reqout, "-"))
derbio = BIO_new_fp(stdout, BIO_NOCLOSE);
else
derbio = BIO_new_file(reqout, "wb");
if (!derbio) {
BIO_printf(bio_err, "Error opening file %s\n", reqout);
goto end;
@ -706,7 +730,7 @@ int MAIN(int argc, char **argv)
if (rdb) {
i = make_ocsp_response(&resp, req, rdb, rca_cert, rsigner, rkey,
rother, rflags, nmin, ndays);
rsign_md, rother, rflags, nmin, ndays, badsig);
if (cbio)
send_ocsp_response(cbio, resp);
} else if (host) {
@ -721,7 +745,10 @@ int MAIN(int argc, char **argv)
goto end;
# endif
} else if (respin) {
derbio = BIO_new_file(respin, "rb");
if (!strcmp(respin, "-"))
derbio = BIO_new_fp(stdin, BIO_NOCLOSE);
else
derbio = BIO_new_file(respin, "rb");
if (!derbio) {
BIO_printf(bio_err, "Error Opening OCSP response file\n");
goto end;
@ -741,7 +768,10 @@ int MAIN(int argc, char **argv)
done_resp:
if (respout) {
derbio = BIO_new_file(respout, "wb");
if (!strcmp(respout, "-"))
derbio = BIO_new_fp(stdout, BIO_NOCLOSE);
else
derbio = BIO_new_file(respout, "wb");
if (!derbio) {
BIO_printf(bio_err, "Error opening file %s\n", respout);
goto end;
@ -778,6 +808,10 @@ int MAIN(int argc, char **argv)
resp = NULL;
goto redo_accept;
}
ret = 0;
goto end;
} else if (ridx_filename) {
ret = 0;
goto end;
}
@ -785,6 +819,8 @@ int MAIN(int argc, char **argv)
store = setup_verify(bio_err, CAfile, CApath);
if (!store)
goto end;
if (vpm)
X509_STORE_set1_param(store, vpm);
if (verify_certfile) {
verify_other = load_certs(bio_err, verify_certfile, FORMAT_PEM,
NULL, e, "validator certificate");
@ -799,37 +835,38 @@ int MAIN(int argc, char **argv)
goto end;
}
ret = 0;
if (!noverify) {
if (req && ((i = OCSP_check_nonce(req, bs)) <= 0)) {
if (i == -1)
BIO_printf(bio_err, "WARNING: no nonce in response\n");
else {
BIO_printf(bio_err, "Nonce Verify error\n");
ret = 1;
goto end;
}
}
i = OCSP_basic_verify(bs, verify_other, store, verify_flags);
if (i < 0)
i = OCSP_basic_verify(bs, NULL, store, 0);
if (i <= 0) {
BIO_printf(bio_err, "Response Verify Failure\n");
ERR_print_errors(bio_err);
ret = 1;
} else
BIO_printf(bio_err, "Response verify OK\n");
}
if (!print_ocsp_summary(out, bs, req, reqnames, ids, nsec, maxage))
goto end;
ret = 0;
ret = 1;
end:
ERR_print_errors(bio_err);
X509_free(signer);
X509_STORE_free(store);
if (vpm)
X509_VERIFY_PARAM_free(vpm);
EVP_PKEY_free(key);
EVP_PKEY_free(rkey);
X509_free(issuer);
@ -984,8 +1021,9 @@ static int print_ocsp_summary(BIO *out, OCSP_BASICRESP *bs, OCSP_REQUEST *req,
static int make_ocsp_response(OCSP_RESPONSE **resp, OCSP_REQUEST *req,
CA_DB *db, X509 *ca, X509 *rcert,
EVP_PKEY *rkey, STACK_OF(X509) *rother,
unsigned long flags, int nmin, int ndays)
EVP_PKEY *rkey, const EVP_MD *rmd,
STACK_OF(X509) *rother, unsigned long flags,
int nmin, int ndays, int badsig)
{
ASN1_TIME *thisupd = NULL, *nextupd = NULL;
OCSP_CERTID *cid, *ca_id = NULL;
@ -1069,7 +1107,10 @@ static int make_ocsp_response(OCSP_RESPONSE **resp, OCSP_REQUEST *req,
OCSP_copy_nonce(bs, req);
OCSP_basic_sign(bs, rcert, rkey, NULL, rother, flags);
OCSP_basic_sign(bs, rcert, rkey, rmd, rother, flags);
if (badsig)
bs->signature->data[bs->signature->length - 1] ^= 0x1;
*resp = OCSP_response_create(OCSP_RESPONSE_STATUS_SUCCESSFUL, bs);
@ -1105,7 +1146,7 @@ static char **lookup_serial(CA_DB *db, ASN1_INTEGER *ser)
/* Quick and dirty OCSP server: read in and parse input request */
static BIO *init_responder(char *port)
static BIO *init_responder(const char *port)
{
BIO *acbio = NULL, *bufbio = NULL;
bufbio = BIO_new(BIO_f_buffer());
@ -1137,7 +1178,7 @@ static BIO *init_responder(char *port)
}
static int do_responder(OCSP_REQUEST **preq, BIO **pcbio, BIO *acbio,
char *port)
const char *port)
{
int have_post = 0, len;
OCSP_REQUEST *req = NULL;
@ -1198,8 +1239,8 @@ static int send_ocsp_response(BIO *cbio, OCSP_RESPONSE *resp)
return 1;
}
static OCSP_RESPONSE *query_responder(BIO *err, BIO *cbio, char *path,
STACK_OF(CONF_VALUE) *headers,
static OCSP_RESPONSE *query_responder(BIO *err, BIO *cbio, const char *path,
const STACK_OF(CONF_VALUE) *headers,
OCSP_REQUEST *req, int req_timeout)
{
int fd;
@ -1286,8 +1327,9 @@ static OCSP_RESPONSE *query_responder(BIO *err, BIO *cbio, char *path,
}
OCSP_RESPONSE *process_responder(BIO *err, OCSP_REQUEST *req,
char *host, char *path, char *port,
int use_ssl, STACK_OF(CONF_VALUE) *headers,
const char *host, const char *path,
const char *port, int use_ssl,
const STACK_OF(CONF_VALUE) *headers,
int req_timeout)
{
BIO *cbio = NULL;

View File

@ -103,7 +103,7 @@ emailAddress = optional
####################################################################
[ req ]
default_bits = 1024
default_bits = 2048
default_keyfile = privkey.pem
distinguished_name = req_distinguished_name
attributes = req_attributes

View File

@ -124,6 +124,16 @@ int MAIN(int argc, char **argv)
}
} else
badarg = 1;
} else if (!strcmp(*args, "-v2prf")) {
if (args[1]) {
args++;
pbe_nid = OBJ_txt2nid(*args);
if (!EVP_PBE_find(EVP_PBE_TYPE_PRF, pbe_nid, NULL, NULL, 0)) {
BIO_printf(bio_err, "Unknown PRF algorithm %s\n", *args);
badarg = 1;
}
} else
badarg = 1;
} else if (!strcmp(*args, "-inform")) {
if (args[1]) {
args++;

View File

@ -152,15 +152,21 @@ typedef fd_mask fd_set;
#define PROTOCOL "tcp"
int do_server(int port, int type, int *ret,
int (*cb) (char *hostname, int s, unsigned char *context),
unsigned char *context);
int (*cb) (char *hostname, int s, int stype,
unsigned char *context), unsigned char *context,
int naccept);
#ifdef HEADER_X509_H
int MS_CALLBACK verify_callback(int ok, X509_STORE_CTX *ctx);
#endif
#ifdef HEADER_SSL_H
int set_cert_stuff(SSL_CTX *ctx, char *cert_file, char *key_file);
int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key);
int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key,
STACK_OF(X509) *chain, int build_chain);
int ssl_print_sigalgs(BIO *out, SSL *s);
int ssl_print_point_formats(BIO *out, SSL *s);
int ssl_print_curves(BIO *out, SSL *s, int noshared);
#endif
int ssl_print_tmp_key(BIO *out, SSL *s);
int init_client(int *sock, char *server, int port, int type);
int should_retry(int i);
int extract_port(char *str, short *port_ptr);
@ -182,3 +188,24 @@ int MS_CALLBACK generate_cookie_callback(SSL *ssl, unsigned char *cookie,
unsigned int *cookie_len);
int MS_CALLBACK verify_cookie_callback(SSL *ssl, unsigned char *cookie,
unsigned int cookie_len);
typedef struct ssl_excert_st SSL_EXCERT;
void ssl_ctx_set_excert(SSL_CTX *ctx, SSL_EXCERT *exc);
void ssl_excert_free(SSL_EXCERT *exc);
int args_excert(char ***pargs, int *pargc,
int *badarg, BIO *err, SSL_EXCERT **pexc);
int load_excert(SSL_EXCERT **pexc, BIO *err);
void print_ssl_summary(BIO *bio, SSL *s);
#ifdef HEADER_SSL_H
int args_ssl(char ***pargs, int *pargc, SSL_CONF_CTX *cctx,
int *badarg, BIO *err, STACK_OF(OPENSSL_STRING) **pstr);
int args_ssl_call(SSL_CTX *ctx, BIO *err, SSL_CONF_CTX *cctx,
STACK_OF(OPENSSL_STRING) *str, int no_ecdhe, int no_jpake);
int ssl_ctx_add_crls(SSL_CTX *ctx, STACK_OF(X509_CRL) *crls,
int crl_download);
int ssl_load_stores(SSL_CTX *ctx, const char *vfyCApath,
const char *vfyCAfile, const char *chCApath,
const char *chCAfile, STACK_OF(X509_CRL) *crls,
int crl_download);
#endif

View File

@ -111,7 +111,7 @@
#include <stdio.h>
#include <stdlib.h>
#include <string.h> /* for memcpy() */
#include <string.h> /* for memcpy() and strcmp() */
#define USE_SOCKETS
#define NON_MAIN
#include "apps.h"
@ -126,6 +126,7 @@
#define COOKIE_SECRET_LENGTH 16
int verify_depth = 0;
int verify_quiet = 0;
int verify_error = X509_V_OK;
int verify_return_error = 0;
unsigned char cookie_secret[COOKIE_SECRET_LENGTH];
@ -140,13 +141,16 @@ int MS_CALLBACK verify_callback(int ok, X509_STORE_CTX *ctx)
err = X509_STORE_CTX_get_error(ctx);
depth = X509_STORE_CTX_get_error_depth(ctx);
BIO_printf(bio_err, "depth=%d ", depth);
if (err_cert) {
X509_NAME_print_ex(bio_err, X509_get_subject_name(err_cert),
0, XN_FLAG_ONELINE);
BIO_puts(bio_err, "\n");
} else
BIO_puts(bio_err, "<no cert>\n");
if (!verify_quiet || !ok) {
BIO_printf(bio_err, "depth=%d ", depth);
if (err_cert) {
X509_NAME_print_ex(bio_err,
X509_get_subject_name(err_cert),
0, XN_FLAG_ONELINE);
BIO_puts(bio_err, "\n");
} else
BIO_puts(bio_err, "<no cert>\n");
}
if (!ok) {
BIO_printf(bio_err, "verify error:num=%d:%s\n", err,
X509_verify_cert_error_string(err));
@ -179,13 +183,14 @@ int MS_CALLBACK verify_callback(int ok, X509_STORE_CTX *ctx)
BIO_printf(bio_err, "\n");
break;
case X509_V_ERR_NO_EXPLICIT_POLICY:
policies_print(bio_err, ctx);
if (!verify_quiet)
policies_print(bio_err, ctx);
break;
}
if (err == X509_V_OK && ok == 2)
if (err == X509_V_OK && ok == 2 && !verify_quiet)
policies_print(bio_err, ctx);
BIO_printf(bio_err, "verify return:%d\n", ok);
if (ok && !verify_quiet)
BIO_printf(bio_err, "verify return:%d\n", ok);
return (ok);
}
@ -246,8 +251,10 @@ int set_cert_stuff(SSL_CTX *ctx, char *cert_file, char *key_file)
return (1);
}
int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key)
int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key,
STACK_OF(X509) *chain, int build_chain)
{
int chflags = chain ? SSL_BUILD_CHAIN_FLAG_CHECK : 0;
if (cert == NULL)
return 1;
if (SSL_CTX_use_certificate(ctx, cert) <= 0) {
@ -255,6 +262,7 @@ int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key)
ERR_print_errors(bio_err);
return 0;
}
if (SSL_CTX_use_PrivateKey(ctx, key) <= 0) {
BIO_printf(bio_err, "error setting private key\n");
ERR_print_errors(bio_err);
@ -269,6 +277,263 @@ int set_cert_key_stuff(SSL_CTX *ctx, X509 *cert, EVP_PKEY *key)
"Private key does not match the certificate public key\n");
return 0;
}
if (chain && !SSL_CTX_set1_chain(ctx, chain)) {
BIO_printf(bio_err, "error setting certificate chain\n");
ERR_print_errors(bio_err);
return 0;
}
if (build_chain && !SSL_CTX_build_cert_chain(ctx, chflags)) {
BIO_printf(bio_err, "error building certificate chain\n");
ERR_print_errors(bio_err);
return 0;
}
return 1;
}
static void ssl_print_client_cert_types(BIO *bio, SSL *s)
{
const unsigned char *p;
int i;
int cert_type_num = SSL_get0_certificate_types(s, &p);
if (!cert_type_num)
return;
BIO_puts(bio, "Client Certificate Types: ");
for (i = 0; i < cert_type_num; i++) {
unsigned char cert_type = p[i];
char *cname;
switch (cert_type) {
case TLS_CT_RSA_SIGN:
cname = "RSA sign";
break;
case TLS_CT_DSS_SIGN:
cname = "DSA sign";
break;
case TLS_CT_RSA_FIXED_DH:
cname = "RSA fixed DH";
break;
case TLS_CT_DSS_FIXED_DH:
cname = "DSS fixed DH";
break;
case TLS_CT_ECDSA_SIGN:
cname = "ECDSA sign";
break;
case TLS_CT_RSA_FIXED_ECDH:
cname = "RSA fixed ECDH";
break;
case TLS_CT_ECDSA_FIXED_ECDH:
cname = "ECDSA fixed ECDH";
break;
case TLS_CT_GOST94_SIGN:
cname = "GOST94 Sign";
break;
case TLS_CT_GOST01_SIGN:
cname = "GOST01 Sign";
break;
default:
cname = NULL;
}
if (i)
BIO_puts(bio, ", ");
if (cname)
BIO_puts(bio, cname);
else
BIO_printf(bio, "UNKNOWN (%d),", cert_type);
}
BIO_puts(bio, "\n");
}
static int do_print_sigalgs(BIO *out, SSL *s, int shared)
{
int i, nsig, client;
client = SSL_is_server(s) ? 0 : 1;
if (shared)
nsig = SSL_get_shared_sigalgs(s, -1, NULL, NULL, NULL, NULL, NULL);
else
nsig = SSL_get_sigalgs(s, -1, NULL, NULL, NULL, NULL, NULL);
if (nsig == 0)
return 1;
if (shared)
BIO_puts(out, "Shared ");
if (client)
BIO_puts(out, "Requested ");
BIO_puts(out, "Signature Algorithms: ");
for (i = 0; i < nsig; i++) {
int hash_nid, sign_nid;
unsigned char rhash, rsign;
const char *sstr = NULL;
if (shared)
SSL_get_shared_sigalgs(s, i, &sign_nid, &hash_nid, NULL,
&rsign, &rhash);
else
SSL_get_sigalgs(s, i, &sign_nid, &hash_nid, NULL, &rsign, &rhash);
if (i)
BIO_puts(out, ":");
if (sign_nid == EVP_PKEY_RSA)
sstr = "RSA";
else if (sign_nid == EVP_PKEY_DSA)
sstr = "DSA";
else if (sign_nid == EVP_PKEY_EC)
sstr = "ECDSA";
if (sstr)
BIO_printf(out, "%s+", sstr);
else
BIO_printf(out, "0x%02X+", (int)rsign);
if (hash_nid != NID_undef)
BIO_printf(out, "%s", OBJ_nid2sn(hash_nid));
else
BIO_printf(out, "0x%02X", (int)rhash);
}
BIO_puts(out, "\n");
return 1;
}
int ssl_print_sigalgs(BIO *out, SSL *s)
{
int mdnid;
if (!SSL_is_server(s))
ssl_print_client_cert_types(out, s);
do_print_sigalgs(out, s, 0);
do_print_sigalgs(out, s, 1);
if (SSL_get_peer_signature_nid(s, &mdnid))
BIO_printf(out, "Peer signing digest: %s\n", OBJ_nid2sn(mdnid));
return 1;
}
#ifndef OPENSSL_NO_EC
int ssl_print_point_formats(BIO *out, SSL *s)
{
int i, nformats;
const char *pformats;
nformats = SSL_get0_ec_point_formats(s, &pformats);
if (nformats <= 0)
return 1;
BIO_puts(out, "Supported Elliptic Curve Point Formats: ");
for (i = 0; i < nformats; i++, pformats++) {
if (i)
BIO_puts(out, ":");
switch (*pformats) {
case TLSEXT_ECPOINTFORMAT_uncompressed:
BIO_puts(out, "uncompressed");
break;
case TLSEXT_ECPOINTFORMAT_ansiX962_compressed_prime:
BIO_puts(out, "ansiX962_compressed_prime");
break;
case TLSEXT_ECPOINTFORMAT_ansiX962_compressed_char2:
BIO_puts(out, "ansiX962_compressed_char2");
break;
default:
BIO_printf(out, "unknown(%d)", (int)*pformats);
break;
}
}
if (nformats <= 0)
BIO_puts(out, "NONE");
BIO_puts(out, "\n");
return 1;
}
int ssl_print_curves(BIO *out, SSL *s, int noshared)
{
int i, ncurves, *curves, nid;
const char *cname;
ncurves = SSL_get1_curves(s, NULL);
if (ncurves <= 0)
return 1;
curves = OPENSSL_malloc(ncurves * sizeof(int));
if (!curves) {
BIO_puts(out, "Malloc error getting supported curves\n");
return 0;
}
SSL_get1_curves(s, curves);
BIO_puts(out, "Supported Elliptic Curves: ");
for (i = 0; i < ncurves; i++) {
if (i)
BIO_puts(out, ":");
nid = curves[i];
/* If unrecognised print out hex version */
if (nid & TLSEXT_nid_unknown)
BIO_printf(out, "0x%04X", nid & 0xFFFF);
else {
/* Use NIST name for curve if it exists */
cname = EC_curve_nid2nist(nid);
if (!cname)
cname = OBJ_nid2sn(nid);
BIO_printf(out, "%s", cname);
}
}
if (ncurves == 0)
BIO_puts(out, "NONE");
OPENSSL_free(curves);
if (noshared) {
BIO_puts(out, "\n");
return 1;
}
BIO_puts(out, "\nShared Elliptic curves: ");
ncurves = SSL_get_shared_curve(s, -1);
for (i = 0; i < ncurves; i++) {
if (i)
BIO_puts(out, ":");
nid = SSL_get_shared_curve(s, i);
cname = EC_curve_nid2nist(nid);
if (!cname)
cname = OBJ_nid2sn(nid);
BIO_printf(out, "%s", cname);
}
if (ncurves == 0)
BIO_puts(out, "NONE");
BIO_puts(out, "\n");
return 1;
}
#endif
int ssl_print_tmp_key(BIO *out, SSL *s)
{
EVP_PKEY *key;
if (!SSL_get_server_tmp_key(s, &key))
return 1;
BIO_puts(out, "Server Temp Key: ");
switch (EVP_PKEY_id(key)) {
case EVP_PKEY_RSA:
BIO_printf(out, "RSA, %d bits\n", EVP_PKEY_bits(key));
break;
case EVP_PKEY_DH:
BIO_printf(out, "DH, %d bits\n", EVP_PKEY_bits(key));
break;
#ifndef OPENSSL_NO_ECDH
case EVP_PKEY_EC:
{
EC_KEY *ec = EVP_PKEY_get1_EC_KEY(key);
int nid;
const char *cname;
nid = EC_GROUP_get_curve_name(EC_KEY_get0_group(ec));
EC_KEY_free(ec);
cname = EC_curve_nid2nist(nid);
if (!cname)
cname = OBJ_nid2sn(nid);
BIO_printf(out, "ECDH, %s, %d bits\n", cname, EVP_PKEY_bits(key));
}
#endif
}
EVP_PKEY_free(key);
return 1;
}
@ -884,3 +1149,504 @@ int MS_CALLBACK verify_cookie_callback(SSL *ssl, unsigned char *cookie,
return 0;
}
/*
* Example of extended certificate handling. Where the standard support of
* one certificate per algorithm is not sufficient an application can decide
* which certificate(s) to use at runtime based on whatever criteria it deems
* appropriate.
*/
/* Linked list of certificates, keys and chains */
struct ssl_excert_st {
int certform;
const char *certfile;
int keyform;
const char *keyfile;
const char *chainfile;
X509 *cert;
EVP_PKEY *key;
STACK_OF(X509) *chain;
int build_chain;
struct ssl_excert_st *next, *prev;
};
struct chain_flags {
int flag;
const char *name;
};
struct chain_flags chain_flags_list[] = {
{CERT_PKEY_VALID, "Overall Validity"},
{CERT_PKEY_SIGN, "Sign with EE key"},
{CERT_PKEY_EE_SIGNATURE, "EE signature"},
{CERT_PKEY_CA_SIGNATURE, "CA signature"},
{CERT_PKEY_EE_PARAM, "EE key parameters"},
{CERT_PKEY_CA_PARAM, "CA key parameters"},
{CERT_PKEY_EXPLICIT_SIGN, "Explicity sign with EE key"},
{CERT_PKEY_ISSUER_NAME, "Issuer Name"},
{CERT_PKEY_CERT_TYPE, "Certificate Type"},
{0, NULL}
};
static void print_chain_flags(BIO *out, SSL *s, int flags)
{
struct chain_flags *ctmp = chain_flags_list;
while (ctmp->name) {
BIO_printf(out, "\t%s: %s\n", ctmp->name,
flags & ctmp->flag ? "OK" : "NOT OK");
ctmp++;
}
BIO_printf(out, "\tSuite B: ");
if (SSL_set_cert_flags(s, 0) & SSL_CERT_FLAG_SUITEB_128_LOS)
BIO_puts(out, flags & CERT_PKEY_SUITEB ? "OK\n" : "NOT OK\n");
else
BIO_printf(out, "not tested\n");
}
/*
* Very basic selection callback: just use any certificate chain reported as
* valid. More sophisticated could prioritise according to local policy.
*/
static int set_cert_cb(SSL *ssl, void *arg)
{
int i, rv;
SSL_EXCERT *exc = arg;
#ifdef CERT_CB_TEST_RETRY
static int retry_cnt;
if (retry_cnt < 5) {
retry_cnt++;
fprintf(stderr, "Certificate callback retry test: count %d\n",
retry_cnt);
return -1;
}
#endif
SSL_certs_clear(ssl);
if (!exc)
return 1;
/*
* Go to end of list and traverse backwards since we prepend newer
* entries this retains the original order.
*/
while (exc->next)
exc = exc->next;
i = 0;
while (exc) {
i++;
rv = SSL_check_chain(ssl, exc->cert, exc->key, exc->chain);
BIO_printf(bio_err, "Checking cert chain %d:\nSubject: ", i);
X509_NAME_print_ex(bio_err, X509_get_subject_name(exc->cert), 0,
XN_FLAG_ONELINE);
BIO_puts(bio_err, "\n");
print_chain_flags(bio_err, ssl, rv);
if (rv & CERT_PKEY_VALID) {
SSL_use_certificate(ssl, exc->cert);
SSL_use_PrivateKey(ssl, exc->key);
/*
* NB: we wouldn't normally do this as it is not efficient
* building chains on each connection better to cache the chain
* in advance.
*/
if (exc->build_chain) {
if (!SSL_build_cert_chain(ssl, 0))
return 0;
} else if (exc->chain)
SSL_set1_chain(ssl, exc->chain);
}
exc = exc->prev;
}
return 1;
}
void ssl_ctx_set_excert(SSL_CTX *ctx, SSL_EXCERT *exc)
{
SSL_CTX_set_cert_cb(ctx, set_cert_cb, exc);
}
static int ssl_excert_prepend(SSL_EXCERT **pexc)
{
SSL_EXCERT *exc;
exc = OPENSSL_malloc(sizeof(SSL_EXCERT));
if (!exc)
return 0;
exc->certfile = NULL;
exc->keyfile = NULL;
exc->chainfile = NULL;
exc->cert = NULL;
exc->key = NULL;
exc->chain = NULL;
exc->prev = NULL;
exc->build_chain = 0;
exc->next = *pexc;
*pexc = exc;
if (exc->next) {
exc->certform = exc->next->certform;
exc->keyform = exc->next->keyform;
exc->next->prev = exc;
} else {
exc->certform = FORMAT_PEM;
exc->keyform = FORMAT_PEM;
}
return 1;
}
void ssl_excert_free(SSL_EXCERT *exc)
{
SSL_EXCERT *curr;
while (exc) {
if (exc->cert)
X509_free(exc->cert);
if (exc->key)
EVP_PKEY_free(exc->key);
if (exc->chain)
sk_X509_pop_free(exc->chain, X509_free);
curr = exc;
exc = exc->next;
OPENSSL_free(curr);
}
}
int load_excert(SSL_EXCERT **pexc, BIO *err)
{
SSL_EXCERT *exc = *pexc;
if (!exc)
return 1;
/* If nothing in list, free and set to NULL */
if (!exc->certfile && !exc->next) {
ssl_excert_free(exc);
*pexc = NULL;
return 1;
}
for (; exc; exc = exc->next) {
if (!exc->certfile) {
BIO_printf(err, "Missing filename\n");
return 0;
}
exc->cert = load_cert(err, exc->certfile, exc->certform,
NULL, NULL, "Server Certificate");
if (!exc->cert)
return 0;
if (exc->keyfile) {
exc->key = load_key(err, exc->keyfile, exc->keyform,
0, NULL, NULL, "Server Key");
} else {
exc->key = load_key(err, exc->certfile, exc->certform,
0, NULL, NULL, "Server Key");
}
if (!exc->key)
return 0;
if (exc->chainfile) {
exc->chain = load_certs(err,
exc->chainfile, FORMAT_PEM,
NULL, NULL, "Server Chain");
if (!exc->chain)
return 0;
}
}
return 1;
}
int args_excert(char ***pargs, int *pargc,
int *badarg, BIO *err, SSL_EXCERT **pexc)
{
char *arg = **pargs, *argn = (*pargs)[1];
SSL_EXCERT *exc = *pexc;
int narg = 2;
if (!exc) {
if (ssl_excert_prepend(&exc))
*pexc = exc;
else {
BIO_printf(err, "Error initialising xcert\n");
*badarg = 1;
goto err;
}
}
if (strcmp(arg, "-xcert") == 0) {
if (!argn) {
*badarg = 1;
return 1;
}
if (exc->certfile && !ssl_excert_prepend(&exc)) {
BIO_printf(err, "Error adding xcert\n");
*badarg = 1;
goto err;
}
exc->certfile = argn;
} else if (strcmp(arg, "-xkey") == 0) {
if (!argn) {
*badarg = 1;
return 1;
}
if (exc->keyfile) {
BIO_printf(err, "Key already specified\n");
*badarg = 1;
return 1;
}
exc->keyfile = argn;
} else if (strcmp(arg, "-xchain") == 0) {
if (!argn) {
*badarg = 1;
return 1;
}
if (exc->chainfile) {
BIO_printf(err, "Chain already specified\n");
*badarg = 1;
return 1;
}
exc->chainfile = argn;
} else if (strcmp(arg, "-xchain_build") == 0) {
narg = 1;
exc->build_chain = 1;
} else if (strcmp(arg, "-xcertform") == 0) {
if (!argn) {
*badarg = 1;
goto err;
}
exc->certform = str2fmt(argn);
} else if (strcmp(arg, "-xkeyform") == 0) {
if (!argn) {
*badarg = 1;
goto err;
}
exc->keyform = str2fmt(argn);
} else
return 0;
(*pargs) += narg;
if (pargc)
*pargc -= narg;
*pexc = exc;
return 1;
err:
ERR_print_errors(err);
ssl_excert_free(exc);
*pexc = NULL;
return 1;
}
static void print_raw_cipherlist(BIO *bio, SSL *s)
{
const unsigned char *rlist;
static const unsigned char scsv_id[] = { 0, 0, 0xFF };
size_t i, rlistlen, num;
if (!SSL_is_server(s))
return;
num = SSL_get0_raw_cipherlist(s, NULL);
rlistlen = SSL_get0_raw_cipherlist(s, &rlist);
BIO_puts(bio, "Client cipher list: ");
for (i = 0; i < rlistlen; i += num, rlist += num) {
const SSL_CIPHER *c = SSL_CIPHER_find(s, rlist);
if (i)
BIO_puts(bio, ":");
if (c)
BIO_puts(bio, SSL_CIPHER_get_name(c));
else if (!memcmp(rlist, scsv_id - num + 3, num))
BIO_puts(bio, "SCSV");
else {
size_t j;
BIO_puts(bio, "0x");
for (j = 0; j < num; j++)
BIO_printf(bio, "%02X", rlist[j]);
}
}
BIO_puts(bio, "\n");
}
void print_ssl_summary(BIO *bio, SSL *s)
{
const SSL_CIPHER *c;
X509 *peer;
/*
* const char *pnam = SSL_is_server(s) ? "client" : "server";
*/
BIO_printf(bio, "Protocol version: %s\n", SSL_get_version(s));
print_raw_cipherlist(bio, s);
c = SSL_get_current_cipher(s);
BIO_printf(bio, "Ciphersuite: %s\n", SSL_CIPHER_get_name(c));
do_print_sigalgs(bio, s, 0);
peer = SSL_get_peer_certificate(s);
if (peer) {
int nid;
BIO_puts(bio, "Peer certificate: ");
X509_NAME_print_ex(bio, X509_get_subject_name(peer),
0, XN_FLAG_ONELINE);
BIO_puts(bio, "\n");
if (SSL_get_peer_signature_nid(s, &nid))
BIO_printf(bio, "Hash used: %s\n", OBJ_nid2sn(nid));
} else
BIO_puts(bio, "No peer certificate\n");
if (peer)
X509_free(peer);
#ifndef OPENSSL_NO_EC
ssl_print_point_formats(bio, s);
if (SSL_is_server(s))
ssl_print_curves(bio, s, 1);
else
ssl_print_tmp_key(bio, s);
#else
if (!SSL_is_server(s))
ssl_print_tmp_key(bio, s);
#endif
}
int args_ssl(char ***pargs, int *pargc, SSL_CONF_CTX *cctx,
int *badarg, BIO *err, STACK_OF(OPENSSL_STRING) **pstr)
{
char *arg = **pargs, *argn = (*pargs)[1];
int rv;
/* Attempt to run SSL configuration command */
rv = SSL_CONF_cmd_argv(cctx, pargc, pargs);
/* If parameter not recognised just return */
if (rv == 0)
return 0;
/* see if missing argument error */
if (rv == -3) {
BIO_printf(err, "%s needs an argument\n", arg);
*badarg = 1;
goto end;
}
/* Check for some other error */
if (rv < 0) {
BIO_printf(err, "Error with command: \"%s %s\"\n",
arg, argn ? argn : "");
*badarg = 1;
goto end;
}
/* Store command and argument */
/* If only one argument processed store value as NULL */
if (rv == 1)
argn = NULL;
if (!*pstr)
*pstr = sk_OPENSSL_STRING_new_null();
if (!*pstr || !sk_OPENSSL_STRING_push(*pstr, arg) ||
!sk_OPENSSL_STRING_push(*pstr, argn)) {
BIO_puts(err, "Memory allocation failure\n");
goto end;
}
end:
if (*badarg)
ERR_print_errors(err);
return 1;
}
int args_ssl_call(SSL_CTX *ctx, BIO *err, SSL_CONF_CTX *cctx,
STACK_OF(OPENSSL_STRING) *str, int no_ecdhe, int no_jpake)
{
int i;
SSL_CONF_CTX_set_ssl_ctx(cctx, ctx);
for (i = 0; i < sk_OPENSSL_STRING_num(str); i += 2) {
const char *param = sk_OPENSSL_STRING_value(str, i);
const char *value = sk_OPENSSL_STRING_value(str, i + 1);
/*
* If no_ecdhe or named curve already specified don't need a default.
*/
if (!no_ecdhe && !strcmp(param, "-named_curve"))
no_ecdhe = 1;
#ifndef OPENSSL_NO_JPAKE
if (!no_jpake && !strcmp(param, "-cipher")) {
BIO_puts(err, "JPAKE sets cipher to PSK\n");
return 0;
}
#endif
if (SSL_CONF_cmd(cctx, param, value) <= 0) {
BIO_printf(err, "Error with command: \"%s %s\"\n",
param, value ? value : "");
ERR_print_errors(err);
return 0;
}
}
/*
* This is a special case to keep existing s_server functionality: if we
* don't have any curve specified *and* we haven't disabled ECDHE then
* use P-256.
*/
if (!no_ecdhe) {
if (SSL_CONF_cmd(cctx, "-named_curve", "P-256") <= 0) {
BIO_puts(err, "Error setting EC curve\n");
ERR_print_errors(err);
return 0;
}
}
#ifndef OPENSSL_NO_JPAKE
if (!no_jpake) {
if (SSL_CONF_cmd(cctx, "-cipher", "PSK") <= 0) {
BIO_puts(err, "Error setting cipher to PSK\n");
ERR_print_errors(err);
return 0;
}
}
#endif
if (!SSL_CONF_CTX_finish(cctx)) {
BIO_puts(err, "Error finishing context\n");
ERR_print_errors(err);
return 0;
}
return 1;
}
static int add_crls_store(X509_STORE *st, STACK_OF(X509_CRL) *crls)
{
X509_CRL *crl;
int i;
for (i = 0; i < sk_X509_CRL_num(crls); i++) {
crl = sk_X509_CRL_value(crls, i);
X509_STORE_add_crl(st, crl);
}
return 1;
}
int ssl_ctx_add_crls(SSL_CTX *ctx, STACK_OF(X509_CRL) *crls, int crl_download)
{
X509_STORE *st;
st = SSL_CTX_get_cert_store(ctx);
add_crls_store(st, crls);
if (crl_download)
store_setup_crl_download(st);
return 1;
}
int ssl_load_stores(SSL_CTX *ctx,
const char *vfyCApath, const char *vfyCAfile,
const char *chCApath, const char *chCAfile,
STACK_OF(X509_CRL) *crls, int crl_download)
{
X509_STORE *vfy = NULL, *ch = NULL;
int rv = 0;
if (vfyCApath || vfyCAfile) {
vfy = X509_STORE_new();
if (!X509_STORE_load_locations(vfy, vfyCAfile, vfyCApath))
goto err;
add_crls_store(vfy, crls);
SSL_CTX_set1_verify_cert_store(ctx, vfy);
if (crl_download)
store_setup_crl_download(vfy);
}
if (chCApath || chCAfile) {
ch = X509_STORE_new();
if (!X509_STORE_load_locations(ch, chCAfile, chCApath))
goto err;
SSL_CTX_set1_chain_cert_store(ctx, ch);
}
rv = 1;
err:
if (vfy)
X509_STORE_free(vfy);
if (ch)
X509_STORE_free(ch);
return rv;
}

View File

@ -202,6 +202,7 @@ typedef unsigned int u_int;
extern int verify_depth;
extern int verify_error;
extern int verify_return_error;
extern int verify_quiet;
#ifdef FIONBIO
static int c_nbio = 0;
@ -224,8 +225,10 @@ static void print_stuff(BIO *berr, SSL *con, int full);
static int ocsp_resp_cb(SSL *s, void *arg);
#endif
static BIO *bio_c_out = NULL;
static BIO *bio_c_msg = NULL;
static int c_quiet = 0;
static int c_ign_eof = 0;
static int c_brief = 0;
#ifndef OPENSSL_NO_PSK
/* Default PSK identity and key */
@ -304,6 +307,12 @@ static void sc_usage(void)
BIO_printf(bio_err,
" -connect host:port - who to connect to (default is %s:%s)\n",
SSL_HOST_NAME, PORT_STR);
BIO_printf(bio_err,
" -verify_host host - check peer certificate matches \"host\"\n");
BIO_printf(bio_err,
" -verify_email email - check peer certificate matches \"email\"\n");
BIO_printf(bio_err,
" -verify_ip ipaddr - check peer certificate matches \"ipaddr\"\n");
BIO_printf(bio_err,
" -verify arg - turn on peer certificate verification\n");
@ -413,11 +422,15 @@ static void sc_usage(void)
" -status - request certificate status from server\n");
BIO_printf(bio_err,
" -no_ticket - disable use of RFC4507bis session tickets\n");
# ifndef OPENSSL_NO_NEXTPROTONEG
BIO_printf(bio_err,
" -serverinfo types - send empty ClientHello extensions (comma-separated numbers)\n");
#endif
#ifndef OPENSSL_NO_NEXTPROTONEG
BIO_printf(bio_err,
" -nextprotoneg arg - enable NPN extension, considering named protocols supported (comma-separated list)\n");
# endif
#endif
BIO_printf(bio_err,
" -alpn arg - enable ALPN extension, considering named protocols supported (comma-separated list)\n");
BIO_printf(bio_err,
" -legacy_renegotiation - enable use of legacy renegotiation (dangerous)\n");
#ifndef OPENSSL_NO_SRTP
@ -605,6 +618,27 @@ static int next_proto_cb(SSL *s, unsigned char **out, unsigned char *outlen,
return SSL_TLSEXT_ERR_OK;
}
# endif /* ndef OPENSSL_NO_NEXTPROTONEG */
static int serverinfo_cli_parse_cb(SSL *s, unsigned int ext_type,
const unsigned char *in, size_t inlen,
int *al, void *arg)
{
char pem_name[100];
unsigned char ext_buf[4 + 65536];
/* Reconstruct the type/len fields prior to extension data */
ext_buf[0] = ext_type >> 8;
ext_buf[1] = ext_type & 0xFF;
ext_buf[2] = inlen >> 8;
ext_buf[3] = inlen & 0xFF;
memcpy(ext_buf + 4, in, inlen);
BIO_snprintf(pem_name, sizeof(pem_name), "SERVERINFO FOR EXTENSION %d",
ext_type);
PEM_write_bio(bio_c_out, pem_name, "", ext_buf, 4 + inlen);
return 1;
}
#endif
enum {
@ -620,7 +654,7 @@ int MAIN(int, char **);
int MAIN(int argc, char **argv)
{
unsigned int off = 0, clr = 0;
int build_chain = 0;
SSL *con = NULL;
#ifndef OPENSSL_NO_KRB5
KSSL_CTX *kctx;
@ -633,13 +667,16 @@ int MAIN(int argc, char **argv)
short port = PORT;
int full_log = 1;
char *host = SSL_HOST_NAME;
char *cert_file = NULL, *key_file = NULL;
char *cert_file = NULL, *key_file = NULL, *chain_file = NULL;
int cert_format = FORMAT_PEM, key_format = FORMAT_PEM;
char *passarg = NULL, *pass = NULL;
X509 *cert = NULL;
EVP_PKEY *key = NULL;
char *CApath = NULL, *CAfile = NULL, *cipher = NULL;
int reconnect = 0, badop = 0, verify = SSL_VERIFY_NONE, bugs = 0;
STACK_OF(X509) *chain = NULL;
char *CApath = NULL, *CAfile = NULL;
char *chCApath = NULL, *chCAfile = NULL;
char *vfyCApath = NULL, *vfyCAfile = NULL;
int reconnect = 0, badop = 0, verify = SSL_VERIFY_NONE;
int crlf = 0;
int write_tty, read_tty, write_ssl, read_ssl, tty_on, ssl_pending;
SSL_CTX *ctx = NULL;
@ -672,6 +709,10 @@ int MAIN(int argc, char **argv)
# ifndef OPENSSL_NO_NEXTPROTONEG
const char *next_proto_neg_in = NULL;
# endif
const char *alpn_in = NULL;
# define MAX_SI_TYPES 100
unsigned short serverinfo_types[MAX_SI_TYPES];
int serverinfo_types_count = 0;
#endif
char *sess_in = NULL;
char *sess_out = NULL;
@ -681,13 +722,25 @@ int MAIN(int argc, char **argv)
int enable_timeouts = 0;
long socket_mtu = 0;
#ifndef OPENSSL_NO_JPAKE
char *jpake_secret = NULL;
static char *jpake_secret = NULL;
# define no_jpake !jpake_secret
#else
# define no_jpake 1
#endif
#ifndef OPENSSL_NO_SRP
char *srppass = NULL;
int srp_lateuser = 0;
SRP_ARG srp_arg = { NULL, NULL, 0, 0, 0, 1024 };
#endif
SSL_EXCERT *exc = NULL;
SSL_CONF_CTX *cctx = NULL;
STACK_OF(OPENSSL_STRING) *ssl_args = NULL;
char *crl_file = NULL;
int crl_format = FORMAT_PEM;
int crl_download = 0;
STACK_OF(X509_CRL) *crls = NULL;
meth = SSLv23_client_method();
@ -705,6 +758,12 @@ int MAIN(int argc, char **argv)
if (!load_config(bio_err, NULL))
goto end;
cctx = SSL_CONF_CTX_new();
if (!cctx)
goto end;
SSL_CONF_CTX_set_flags(cctx, SSL_CONF_FLAG_CLIENT);
SSL_CONF_CTX_set_flags(cctx, SSL_CONF_FLAG_CMDLINE);
if (((cbuf = OPENSSL_malloc(BUFSIZZ)) == NULL) ||
((sbuf = OPENSSL_malloc(BUFSIZZ)) == NULL) ||
((mbuf = OPENSSL_malloc(BUFSIZZ)) == NULL)) {
@ -741,12 +800,19 @@ int MAIN(int argc, char **argv)
if (--argc < 1)
goto bad;
verify_depth = atoi(*(++argv));
BIO_printf(bio_err, "verify depth is %d\n", verify_depth);
if (!c_quiet)
BIO_printf(bio_err, "verify depth is %d\n", verify_depth);
} else if (strcmp(*argv, "-cert") == 0) {
if (--argc < 1)
goto bad;
cert_file = *(++argv);
} else if (strcmp(*argv, "-sess_out") == 0) {
} else if (strcmp(*argv, "-CRL") == 0) {
if (--argc < 1)
goto bad;
crl_file = *(++argv);
} else if (strcmp(*argv, "-crl_download") == 0)
crl_download = 1;
else if (strcmp(*argv, "-sess_out") == 0) {
if (--argc < 1)
goto bad;
sess_out = *(++argv);
@ -758,13 +824,31 @@ int MAIN(int argc, char **argv)
if (--argc < 1)
goto bad;
cert_format = str2fmt(*(++argv));
} else if (strcmp(*argv, "-CRLform") == 0) {
if (--argc < 1)
goto bad;
crl_format = str2fmt(*(++argv));
} else if (args_verify(&argv, &argc, &badarg, bio_err, &vpm)) {
if (badarg)
goto bad;
continue;
} else if (strcmp(*argv, "-verify_return_error") == 0)
verify_return_error = 1;
else if (strcmp(*argv, "-prexit") == 0)
else if (strcmp(*argv, "-verify_quiet") == 0)
verify_quiet = 1;
else if (strcmp(*argv, "-brief") == 0) {
c_brief = 1;
verify_quiet = 1;
c_quiet = 1;
} else if (args_excert(&argv, &argc, &badarg, bio_err, &exc)) {
if (badarg)
goto bad;
continue;
} else if (args_ssl(&argv, &argc, cctx, &badarg, bio_err, &ssl_args)) {
if (badarg)
goto bad;
continue;
} else if (strcmp(*argv, "-prexit") == 0)
prexit = 1;
else if (strcmp(*argv, "-crlf") == 0)
crlf = 1;
@ -791,6 +875,15 @@ int MAIN(int argc, char **argv)
#endif
else if (strcmp(*argv, "-msg") == 0)
c_msg = 1;
else if (strcmp(*argv, "-msgfile") == 0) {
if (--argc < 1)
goto bad;
bio_c_msg = BIO_new_file(*(++argv), "w");
}
#ifndef OPENSSL_NO_SSL_TRACE
else if (strcmp(*argv, "-trace") == 0)
c_msg = 2;
#endif
else if (strcmp(*argv, "-showcerts") == 0)
c_showcerts = 1;
else if (strcmp(*argv, "-nbio_test") == 0)
@ -859,11 +952,15 @@ int MAIN(int argc, char **argv)
meth = TLSv1_client_method();
#endif
#ifndef OPENSSL_NO_DTLS1
else if (strcmp(*argv, "-dtls1") == 0) {
else if (strcmp(*argv, "-dtls") == 0) {
meth = DTLS_client_method();
socket_type = SOCK_DGRAM;
} else if (strcmp(*argv, "-dtls1") == 0) {
meth = DTLSv1_client_method();
socket_type = SOCK_DGRAM;
} else if (strcmp(*argv, "-fallback_scsv") == 0) {
fallback_scsv = 1;
} else if (strcmp(*argv, "-dtls1_2") == 0) {
meth = DTLSv1_2_client_method();
socket_type = SOCK_DGRAM;
} else if (strcmp(*argv, "-timeout") == 0)
enable_timeouts = 1;
else if (strcmp(*argv, "-mtu") == 0) {
@ -872,9 +969,9 @@ int MAIN(int argc, char **argv)
socket_mtu = atol(*(++argv));
}
#endif
else if (strcmp(*argv, "-bugs") == 0)
bugs = 1;
else if (strcmp(*argv, "-keyform") == 0) {
else if (strcmp(*argv, "-fallback_scsv") == 0) {
fallback_scsv = 1;
} else if (strcmp(*argv, "-keyform") == 0) {
if (--argc < 1)
goto bad;
key_format = str2fmt(*(++argv));
@ -882,6 +979,10 @@ int MAIN(int argc, char **argv)
if (--argc < 1)
goto bad;
passarg = *(++argv);
} else if (strcmp(*argv, "-cert_chain") == 0) {
if (--argc < 1)
goto bad;
chain_file = *(++argv);
} else if (strcmp(*argv, "-key") == 0) {
if (--argc < 1)
goto bad;
@ -892,27 +993,30 @@ int MAIN(int argc, char **argv)
if (--argc < 1)
goto bad;
CApath = *(++argv);
} else if (strcmp(*argv, "-CAfile") == 0) {
} else if (strcmp(*argv, "-chainCApath") == 0) {
if (--argc < 1)
goto bad;
chCApath = *(++argv);
} else if (strcmp(*argv, "-verifyCApath") == 0) {
if (--argc < 1)
goto bad;
vfyCApath = *(++argv);
} else if (strcmp(*argv, "-build_chain") == 0)
build_chain = 1;
else if (strcmp(*argv, "-CAfile") == 0) {
if (--argc < 1)
goto bad;
CAfile = *(++argv);
} else if (strcmp(*argv, "-no_tls1_2") == 0)
off |= SSL_OP_NO_TLSv1_2;
else if (strcmp(*argv, "-no_tls1_1") == 0)
off |= SSL_OP_NO_TLSv1_1;
else if (strcmp(*argv, "-no_tls1") == 0)
off |= SSL_OP_NO_TLSv1;
else if (strcmp(*argv, "-no_ssl3") == 0)
off |= SSL_OP_NO_SSLv3;
else if (strcmp(*argv, "-no_ssl2") == 0)
off |= SSL_OP_NO_SSLv2;
else if (strcmp(*argv, "-no_comp") == 0) {
off |= SSL_OP_NO_COMPRESSION;
} else if (strcmp(*argv, "-chainCAfile") == 0) {
if (--argc < 1)
goto bad;
chCAfile = *(++argv);
} else if (strcmp(*argv, "-verifyCAfile") == 0) {
if (--argc < 1)
goto bad;
vfyCAfile = *(++argv);
}
#ifndef OPENSSL_NO_TLSEXT
else if (strcmp(*argv, "-no_ticket") == 0) {
off |= SSL_OP_NO_TICKET;
}
# ifndef OPENSSL_NO_NEXTPROTONEG
else if (strcmp(*argv, "-nextprotoneg") == 0) {
if (--argc < 1)
@ -920,20 +1024,32 @@ int MAIN(int argc, char **argv)
next_proto_neg_in = *(++argv);
}
# endif
#endif
else if (strcmp(*argv, "-serverpref") == 0)
off |= SSL_OP_CIPHER_SERVER_PREFERENCE;
else if (strcmp(*argv, "-legacy_renegotiation") == 0)
off |= SSL_OP_ALLOW_UNSAFE_LEGACY_RENEGOTIATION;
else if (strcmp(*argv, "-legacy_server_connect") == 0) {
off |= SSL_OP_LEGACY_SERVER_CONNECT;
} else if (strcmp(*argv, "-no_legacy_server_connect") == 0) {
clr |= SSL_OP_LEGACY_SERVER_CONNECT;
} else if (strcmp(*argv, "-cipher") == 0) {
else if (strcmp(*argv, "-alpn") == 0) {
if (--argc < 1)
goto bad;
cipher = *(++argv);
alpn_in = *(++argv);
} else if (strcmp(*argv, "-serverinfo") == 0) {
char *c;
int start = 0;
int len;
if (--argc < 1)
goto bad;
c = *(++argv);
serverinfo_types_count = 0;
len = strlen(c);
for (i = 0; i <= len; ++i) {
if (i == len || c[i] == ',') {
serverinfo_types[serverinfo_types_count]
= atoi(c + start);
serverinfo_types_count++;
start = i + 1;
}
if (serverinfo_types_count == MAX_SI_TYPES)
break;
}
}
#endif
#ifdef FIONBIO
else if (strcmp(*argv, "-nbio") == 0) {
c_nbio = 1;
@ -1024,11 +1140,6 @@ int MAIN(int argc, char **argv)
goto end;
}
psk_identity = "JPAKE";
if (cipher) {
BIO_printf(bio_err, "JPAKE sets cipher to PSK\n");
goto end;
}
cipher = "PSK";
}
#endif
@ -1087,6 +1198,33 @@ int MAIN(int argc, char **argv)
}
}
if (chain_file) {
chain = load_certs(bio_err, chain_file, FORMAT_PEM,
NULL, e, "client certificate chain");
if (!chain)
goto end;
}
if (crl_file) {
X509_CRL *crl;
crl = load_crl(crl_file, crl_format);
if (!crl) {
BIO_puts(bio_err, "Error loading CRL\n");
ERR_print_errors(bio_err);
goto end;
}
crls = sk_X509_CRL_new_null();
if (!crls || !sk_X509_CRL_push(crls, crl)) {
BIO_puts(bio_err, "Error adding CRL\n");
ERR_print_errors(bio_err);
X509_CRL_free(crl);
goto end;
}
}
if (!load_excert(&exc, bio_err))
goto end;
if (!app_RAND_load_file(NULL, bio_err, 1) && inrand == NULL
&& !RAND_status()) {
BIO_printf(bio_err,
@ -1097,8 +1235,10 @@ int MAIN(int argc, char **argv)
app_RAND_load_files(inrand));
if (bio_c_out == NULL) {
if (c_quiet && !c_debug && !c_msg) {
if (c_quiet && !c_debug) {
bio_c_out = BIO_new(BIO_s_null());
if (c_msg && !bio_c_msg)
bio_c_msg = BIO_new_fp(stdout, BIO_NOCLOSE);
} else {
if (bio_c_out == NULL)
bio_c_out = BIO_new_fp(stdout, BIO_NOCLOSE);
@ -1120,6 +1260,17 @@ int MAIN(int argc, char **argv)
if (vpm)
SSL_CTX_set1_param(ctx, vpm);
if (!args_ssl_call(ctx, bio_err, cctx, ssl_args, 1, no_jpake)) {
ERR_print_errors(bio_err);
goto end;
}
if (!ssl_load_stores(ctx, vfyCApath, vfyCAfile, chCApath, chCAfile,
crls, crl_download)) {
BIO_printf(bio_err, "Error loading store locations\n");
ERR_print_errors(bio_err);
goto end;
}
#ifndef OPENSSL_NO_ENGINE
if (ssl_client_engine) {
if (!SSL_CTX_set_client_cert_engine(ctx, ssl_client_engine)) {
@ -1149,35 +1300,43 @@ int MAIN(int argc, char **argv)
if (srtp_profiles != NULL)
SSL_CTX_set_tlsext_use_srtp(ctx, srtp_profiles);
#endif
if (bugs)
SSL_CTX_set_options(ctx, SSL_OP_ALL | off);
else
SSL_CTX_set_options(ctx, off);
if (exc)
ssl_ctx_set_excert(ctx, exc);
if (clr)
SSL_CTX_clear_options(ctx, clr);
#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG)
#if !defined(OPENSSL_NO_TLSEXT)
# if !defined(OPENSSL_NO_NEXTPROTONEG)
if (next_proto.data)
SSL_CTX_set_next_proto_select_cb(ctx, next_proto_cb, &next_proto);
# endif
if (alpn_in) {
unsigned short alpn_len;
unsigned char *alpn = next_protos_parse(&alpn_len, alpn_in);
if (alpn == NULL) {
BIO_printf(bio_err, "Error parsing -alpn argument\n");
goto end;
}
SSL_CTX_set_alpn_protos(ctx, alpn, alpn_len);
OPENSSL_free(alpn);
}
#endif
#ifndef OPENSSL_NO_TLSEXT
for (i = 0; i < serverinfo_types_count; i++) {
SSL_CTX_add_client_custom_ext(ctx,
serverinfo_types[i],
NULL, NULL, NULL,
serverinfo_cli_parse_cb, NULL);
}
#endif
if (state)
SSL_CTX_set_info_callback(ctx, apps_ssl_info_callback);
if (cipher != NULL)
if (!SSL_CTX_set_cipher_list(ctx, cipher)) {
BIO_printf(bio_err, "error setting cipher list\n");
ERR_print_errors(bio_err);
goto end;
}
#if 0
else
SSL_CTX_set_cipher_list(ctx, getenv("SSL_CIPHER"));
else
SSL_CTX_set_cipher_list(ctx, getenv("SSL_CIPHER"));
#endif
SSL_CTX_set_verify(ctx, verify, verify_callback);
if (!set_cert_key_stuff(ctx, cert, key))
goto end;
if ((CAfile || CApath)
&& !SSL_CTX_load_verify_locations(ctx, CAfile, CApath)) {
@ -1186,6 +1345,11 @@ int MAIN(int argc, char **argv)
if (!SSL_CTX_set_default_verify_paths(ctx)) {
ERR_print_errors(bio_err);
}
ssl_ctx_add_crls(ctx, crls, crl_download);
if (!set_cert_key_stuff(ctx, cert, key, chain, build_chain))
goto end;
#ifndef OPENSSL_NO_TLSEXT
if (servername != NULL) {
tlsextcbp.biodebug = bio_err;
@ -1277,7 +1441,7 @@ int MAIN(int argc, char **argv)
if (c_Pause & 0x01)
SSL_set_debug(con, 1);
if (SSL_version(con) == DTLS1_VERSION) {
if (socket_type == SOCK_DGRAM) {
sbio = BIO_new_dgram(s, BIO_NOCLOSE);
if (getsockname(s, &peer, (void *)&peerlen) < 0) {
@ -1331,8 +1495,13 @@ int MAIN(int argc, char **argv)
BIO_set_callback_arg(sbio, (char *)bio_c_out);
}
if (c_msg) {
SSL_set_msg_callback(con, msg_cb);
SSL_set_msg_callback_arg(con, bio_c_out);
#ifndef OPENSSL_NO_SSL_TRACE
if (c_msg == 2)
SSL_set_msg_callback(con, SSL_trace);
else
#endif
SSL_set_msg_callback(con, msg_cb);
SSL_set_msg_callback_arg(con, bio_c_msg ? bio_c_msg : bio_c_out);
}
#ifndef OPENSSL_NO_TLSEXT
if (c_tlsextdebug) {
@ -1515,6 +1684,11 @@ int MAIN(int argc, char **argv)
BIO_printf(bio_err, "Error writing session file %s\n",
sess_out);
}
if (c_brief) {
BIO_puts(bio_err, "CONNECTION ESTABLISHED\n");
print_ssl_summary(bio_err, con);
}
print_stuff(bio_c_out, con, full_log);
if (full_log > 0)
full_log--;
@ -1780,7 +1954,10 @@ int MAIN(int argc, char **argv)
break;
case SSL_ERROR_SYSCALL:
ret = get_last_socket_error();
BIO_printf(bio_err, "read:errno=%d\n", ret);
if (c_brief)
BIO_puts(bio_err, "CONNECTION CLOSED BY SERVER\n");
else
BIO_printf(bio_err, "read:errno=%d\n", ret);
goto shut;
case SSL_ERROR_ZERO_RETURN:
BIO_printf(bio_c_out, "closed\n");
@ -1880,12 +2057,25 @@ int MAIN(int argc, char **argv)
SSL_CTX_free(ctx);
if (cert)
X509_free(cert);
if (crls)
sk_X509_CRL_pop_free(crls, X509_CRL_free);
if (key)
EVP_PKEY_free(key);
if (chain)
sk_X509_pop_free(chain, X509_free);
if (pass)
OPENSSL_free(pass);
if (vpm)
X509_VERIFY_PARAM_free(vpm);
ssl_excert_free(exc);
if (ssl_args)
sk_OPENSSL_STRING_free(ssl_args);
if (cctx)
SSL_CONF_CTX_free(cctx);
#ifndef OPENSSL_NO_JPAKE
if (jpake_secret && psk_key)
OPENSSL_free(psk_key);
#endif
if (cbuf != NULL) {
OPENSSL_cleanse(cbuf, BUFSIZZ);
OPENSSL_free(cbuf);
@ -1902,6 +2092,10 @@ int MAIN(int argc, char **argv)
BIO_free(bio_c_out);
bio_c_out = NULL;
}
if (bio_c_msg != NULL) {
BIO_free(bio_c_msg);
bio_c_msg = NULL;
}
apps_shutdown();
OPENSSL_EXIT(ret);
}
@ -1995,6 +2189,9 @@ static void print_stuff(BIO *bio, SSL *s, int full)
BIO_write(bio, "\n", 1);
}
ssl_print_sigalgs(bio, s);
ssl_print_tmp_key(bio, s);
BIO_printf(bio,
"---\nSSL handshake has read %ld bytes and written %ld bytes\n",
BIO_number_read(SSL_get_rbio(s)),
@ -2034,7 +2231,8 @@ static void print_stuff(BIO *bio, SSL *s, int full)
}
#endif
#if !defined(OPENSSL_NO_TLSEXT) && !defined(OPENSSL_NO_NEXTPROTONEG)
#if !defined(OPENSSL_NO_TLSEXT)
# if !defined(OPENSSL_NO_NEXTPROTONEG)
if (next_proto.status != -1) {
const unsigned char *proto;
unsigned int proto_len;
@ -2043,6 +2241,18 @@ static void print_stuff(BIO *bio, SSL *s, int full)
BIO_write(bio, proto, proto_len);
BIO_write(bio, "\n", 1);
}
# endif
{
const unsigned char *proto;
unsigned int proto_len;
SSL_get0_alpn_selected(s, &proto, &proto_len);
if (proto_len > 0) {
BIO_printf(bio, "ALPN protocol: ");
BIO_write(bio, proto, proto_len);
BIO_write(bio, "\n", 1);
} else
BIO_printf(bio, "No ALPN negotiated\n");
}
#endif
#ifndef OPENSSL_NO_SRTP

File diff suppressed because it is too large Load Diff

View File

@ -290,8 +290,9 @@ static int init_client_ip(int *sock, unsigned char ip[4], int port, int type)
}
int do_server(int port, int type, int *ret,
int (*cb) (char *hostname, int s, unsigned char *context),
unsigned char *context)
int (*cb) (char *hostname, int s, int stype,
unsigned char *context), unsigned char *context,
int naccept)
{
int sock;
char *name = NULL;
@ -313,12 +314,14 @@ int do_server(int port, int type, int *ret,
}
} else
sock = accept_socket;
i = (*cb) (name, sock, context);
i = (*cb) (name, sock, type, context);
if (name != NULL)
OPENSSL_free(name);
if (type == SOCK_STREAM)
SHUTDOWN2(sock);
if (i < 0) {
if (naccept != -1)
naccept--;
if (i < 0 || naccept == 0) {
SHUTDOWN2(accept_socket);
return (i);
}

View File

@ -634,6 +634,12 @@ int MAIN(int argc, char **argv)
p7 = PKCS7_sign(NULL, NULL, other, in, flags);
if (!p7)
goto end;
if (flags & PKCS7_NOCERTS) {
for (i = 0; i < sk_X509_num(other); i++) {
X509 *x = sk_X509_value(other, i);
PKCS7_add_certificate(p7, x);
}
}
} else
flags |= PKCS7_REUSE_DIGEST;
for (i = 0; i < sk_OPENSSL_STRING_num(sksigners); i++) {

View File

@ -366,6 +366,8 @@ static void *KDF1_SHA1(const void *in, size_t inlen, void *out,
}
# endif /* OPENSSL_NO_ECDH */
static void multiblock_speed(const EVP_CIPHER *evp_cipher);
int MAIN(int, char **);
int MAIN(int argc, char **argv)
@ -646,6 +648,7 @@ int MAIN(int argc, char **argv)
# ifndef NO_FORK
int multi = 0;
# endif
int multiblock = 0;
# ifndef TIMES
usertime = -1;
@ -776,6 +779,9 @@ int MAIN(int argc, char **argv)
mr = 1;
j--; /* Otherwise, -mr gets confused with an
* algorithm. */
} else if (argc > 0 && !strcmp(*argv, "-mb")) {
multiblock = 1;
j--;
} else
# ifndef OPENSSL_NO_MD2
if (strcmp(*argv, "md2") == 0)
@ -1941,6 +1947,20 @@ int MAIN(int argc, char **argv)
# endif
if (doit[D_EVP]) {
# ifdef EVP_CIPH_FLAG_TLS1_1_MULTIBLOCK
if (multiblock && evp_cipher) {
if (!
(EVP_CIPHER_flags(evp_cipher) &
EVP_CIPH_FLAG_TLS1_1_MULTIBLOCK)) {
fprintf(stderr, "%s is not multi-block capable\n",
OBJ_nid2ln(evp_cipher->nid));
goto end;
}
multiblock_speed(evp_cipher);
mret = 0;
goto end;
}
# endif
for (j = 0; j < SIZE_NUM; j++) {
if (evp_cipher) {
EVP_CIPHER_CTX ctx;
@ -2742,4 +2762,113 @@ static int do_multi(int multi)
return 1;
}
# endif
static void multiblock_speed(const EVP_CIPHER *evp_cipher)
{
static int mblengths[] =
{ 8 * 1024, 2 * 8 * 1024, 4 * 8 * 1024, 8 * 8 * 1024, 8 * 16 * 1024 };
int j, count, num = sizeof(lengths) / sizeof(lengths[0]);
const char *alg_name;
unsigned char *inp, *out, no_key[32], no_iv[16];
EVP_CIPHER_CTX ctx;
double d = 0.0;
inp = OPENSSL_malloc(mblengths[num - 1]);
out = OPENSSL_malloc(mblengths[num - 1] + 1024);
if (!inp || !out) {
BIO_printf(bio_err,"Out of memory\n");
goto end;
}
EVP_CIPHER_CTX_init(&ctx);
EVP_EncryptInit_ex(&ctx, evp_cipher, NULL, no_key, no_iv);
EVP_CIPHER_CTX_ctrl(&ctx, EVP_CTRL_AEAD_SET_MAC_KEY, sizeof(no_key),
no_key);
alg_name = OBJ_nid2ln(evp_cipher->nid);
for (j = 0; j < num; j++) {
print_message(alg_name, 0, mblengths[j]);
Time_F(START);
for (count = 0, run = 1; run && count < 0x7fffffff; count++) {
unsigned char aad[EVP_AEAD_TLS1_AAD_LEN];
EVP_CTRL_TLS1_1_MULTIBLOCK_PARAM mb_param;
size_t len = mblengths[j];
int packlen;
memset(aad, 0, 8); /* avoid uninitialized values */
aad[8] = 23; /* SSL3_RT_APPLICATION_DATA */
aad[9] = 3; /* version */
aad[10] = 2;
aad[11] = 0; /* length */
aad[12] = 0;
mb_param.out = NULL;
mb_param.inp = aad;
mb_param.len = len;
mb_param.interleave = 8;
packlen = EVP_CIPHER_CTX_ctrl(&ctx,
EVP_CTRL_TLS1_1_MULTIBLOCK_AAD,
sizeof(mb_param), &mb_param);
if (packlen > 0) {
mb_param.out = out;
mb_param.inp = inp;
mb_param.len = len;
EVP_CIPHER_CTX_ctrl(&ctx,
EVP_CTRL_TLS1_1_MULTIBLOCK_ENCRYPT,
sizeof(mb_param), &mb_param);
} else {
int pad;
RAND_bytes(out, 16);
len += 16;
aad[11] = len >> 8;
aad[12] = len;
pad = EVP_CIPHER_CTX_ctrl(&ctx,
EVP_CTRL_AEAD_TLS1_AAD,
EVP_AEAD_TLS1_AAD_LEN, aad);
EVP_Cipher(&ctx, out, inp, len + pad);
}
}
d = Time_F(STOP);
BIO_printf(bio_err,
mr ? "+R:%d:%s:%f\n"
: "%d %s's in %.2fs\n", count, "evp", d);
results[D_EVP][j] = ((double)count) / d * mblengths[j];
}
if (mr) {
fprintf(stdout, "+H");
for (j = 0; j < num; j++)
fprintf(stdout, ":%d", mblengths[j]);
fprintf(stdout, "\n");
fprintf(stdout, "+F:%d:%s", D_EVP, alg_name);
for (j = 0; j < num; j++)
fprintf(stdout, ":%.2f", results[D_EVP][j]);
fprintf(stdout, "\n");
} else {
fprintf(stdout,
"The 'numbers' are in 1000s of bytes per second processed.\n");
fprintf(stdout, "type ");
for (j = 0; j < num; j++)
fprintf(stdout, "%7d bytes", mblengths[j]);
fprintf(stdout, "\n");
fprintf(stdout, "%-24s", alg_name);
for (j = 0; j < num; j++) {
if (results[D_EVP][j] > 10000)
fprintf(stdout, " %11.2fk", results[D_EVP][j] / 1e3);
else
fprintf(stdout, " %11.2f ", results[D_EVP][j]);
}
fprintf(stdout, "\n");
}
end:
if (inp)
OPENSSL_free(inp);
if (out)
OPENSSL_free(out);
}
#endif

View File

@ -88,6 +88,7 @@ int MAIN(int argc, char **argv)
X509_STORE *cert_ctx = NULL;
X509_LOOKUP *lookup = NULL;
X509_VERIFY_PARAM *vpm = NULL;
int crl_download = 0;
#ifndef OPENSSL_NO_ENGINE
char *engine = NULL;
#endif
@ -136,7 +137,8 @@ int MAIN(int argc, char **argv)
if (argc-- < 1)
goto end;
crlfile = *(++argv);
}
} else if (strcmp(*argv, "-crl_download") == 0)
crl_download = 1;
#ifndef OPENSSL_NO_ENGINE
else if (strcmp(*argv, "-engine") == 0) {
if (--argc < 1)
@ -214,6 +216,9 @@ int MAIN(int argc, char **argv)
}
ret = 0;
if (crl_download)
store_setup_crl_download(cert_ctx);
if (argc < 1) {
if (1 != check(cert_ctx, NULL, untrusted, trusted, crls, e))
ret = -1;

View File

@ -150,6 +150,9 @@ static const char *x509_usage[] = {
" -engine e - use engine e, possibly a hardware device.\n",
#endif
" -certopt arg - various certificate text options\n",
" -checkhost host - check certificate matches \"host\"\n",
" -checkemail email - check certificate matches \"email\"\n",
" -checkip ipaddr - check certificate matches \"ipaddr\"\n",
NULL
};
@ -163,6 +166,9 @@ static int x509_certify(X509_STORE *ctx, char *CAfile, const EVP_MD *digest,
char *section, ASN1_INTEGER *sno);
static int purpose_print(BIO *bio, X509 *cert, X509_PURPOSE *pt);
static int reqfile = 0;
#ifdef OPENSSL_SSL_DEBUG_BROKEN_PROTOCOL
static int force_version = 2;
#endif
int MAIN(int, char **);
@ -174,15 +180,16 @@ int MAIN(int argc, char **argv)
X509 *x = NULL, *xca = NULL;
ASN1_OBJECT *objtmp;
STACK_OF(OPENSSL_STRING) *sigopts = NULL;
EVP_PKEY *Upkey = NULL, *CApkey = NULL;
EVP_PKEY *Upkey = NULL, *CApkey = NULL, *fkey = NULL;
ASN1_INTEGER *sno = NULL;
int i, num, badops = 0;
int i, num, badops = 0, badsig = 0;
BIO *out = NULL;
BIO *STDout = NULL;
STACK_OF(ASN1_OBJECT) *trust = NULL, *reject = NULL;
int informat, outformat, keyformat, CAformat, CAkeyformat;
char *infile = NULL, *outfile = NULL, *keyfile = NULL, *CAfile = NULL;
char *CAkeyfile = NULL, *CAserial = NULL;
char *fkeyfile = NULL;
char *alias = NULL;
int text = 0, serial = 0, subject = 0, issuer = 0, startdate =
0, enddate = 0;
@ -208,6 +215,9 @@ int MAIN(int argc, char **argv)
int need_rand = 0;
int checkend = 0, checkoffset = 0;
unsigned long nmflag = 0, certflag = 0;
char *checkhost = NULL;
char *checkemail = NULL;
char *checkip = NULL;
#ifndef OPENSSL_NO_ENGINE
char *engine = NULL;
#endif
@ -274,7 +284,15 @@ int MAIN(int argc, char **argv)
sigopts = sk_OPENSSL_STRING_new_null();
if (!sigopts || !sk_OPENSSL_STRING_push(sigopts, *(++argv)))
goto bad;
} else if (strcmp(*argv, "-days") == 0) {
}
#ifdef OPENSSL_SSL_DEBUG_BROKEN_PROTOCOL
else if (strcmp(*argv, "-force_version") == 0) {
if (--argc < 1)
goto bad;
force_version = atoi(*(++argv)) - 1;
}
#endif
else if (strcmp(*argv, "-days") == 0) {
if (--argc < 1)
goto bad;
days = atoi(*(++argv));
@ -327,6 +345,10 @@ int MAIN(int argc, char **argv)
goto bad;
if (!(sno = s2i_ASN1_INTEGER(NULL, *(++argv))))
goto bad;
} else if (strcmp(*argv, "-force_pubkey") == 0) {
if (--argc < 1)
goto bad;
fkeyfile = *(++argv);
} else if (strcmp(*argv, "-addtrust") == 0) {
if (--argc < 1)
goto bad;
@ -424,6 +446,18 @@ int MAIN(int argc, char **argv)
goto bad;
checkoffset = atoi(*(++argv));
checkend = 1;
} else if (strcmp(*argv, "-checkhost") == 0) {
if (--argc < 1)
goto bad;
checkhost = *(++argv);
} else if (strcmp(*argv, "-checkemail") == 0) {
if (--argc < 1)
goto bad;
checkemail = *(++argv);
} else if (strcmp(*argv, "-checkip") == 0) {
if (--argc < 1)
goto bad;
checkip = *(++argv);
} else if (strcmp(*argv, "-noout") == 0)
noout = ++num;
else if (strcmp(*argv, "-trustout") == 0)
@ -447,6 +481,8 @@ int MAIN(int argc, char **argv)
#endif
else if (strcmp(*argv, "-ocspid") == 0)
ocspid = ++num;
else if (strcmp(*argv, "-badsig") == 0)
badsig = 1;
else if ((md_alg = EVP_get_digestbyname(*argv + 1))) {
/* ok */
digest = md_alg;
@ -484,6 +520,13 @@ int MAIN(int argc, char **argv)
goto end;
}
if (fkeyfile) {
fkey = load_pubkey(bio_err, fkeyfile, keyformat, 0,
NULL, e, "Forced key");
if (fkey == NULL)
goto end;
}
if ((CAkeyfile == NULL) && (CA_flag) && (CAformat == FORMAT_PEM)) {
CAkeyfile = CAfile;
} else if ((CA_flag) && (CAkeyfile == NULL)) {
@ -605,10 +648,13 @@ int MAIN(int argc, char **argv)
X509_gmtime_adj(X509_get_notBefore(x), 0);
X509_time_adj_ex(X509_get_notAfter(x), days, 0, NULL);
pkey = X509_REQ_get_pubkey(req);
X509_set_pubkey(x, pkey);
EVP_PKEY_free(pkey);
if (fkey)
X509_set_pubkey(x, fkey);
else {
pkey = X509_REQ_get_pubkey(req);
X509_set_pubkey(x, pkey);
EVP_PKEY_free(pkey);
}
} else
x = load_cert(bio_err, infile, informat, NULL, e, "Certificate");
@ -937,11 +983,16 @@ int MAIN(int argc, char **argv)
goto end;
}
print_cert_checks(STDout, x, checkhost, checkemail, checkip);
if (noout) {
ret = 0;
goto end;
}
if (badsig)
x->signature->data[x->signature->length - 1] ^= 0x1;
if (outformat == FORMAT_ASN1)
i = i2d_X509_bio(out, x);
else if (outformat == FORMAT_PEM) {
@ -982,6 +1033,7 @@ int MAIN(int argc, char **argv)
X509_free(xca);
EVP_PKEY_free(Upkey);
EVP_PKEY_free(CApkey);
EVP_PKEY_free(fkey);
if (sigopts)
sk_OPENSSL_STRING_free(sigopts);
X509_REQ_free(rq);
@ -1101,7 +1153,11 @@ static int x509_certify(X509_STORE *ctx, char *CAfile, const EVP_MD *digest,
if (conf) {
X509V3_CTX ctx2;
#ifdef OPENSSL_SSL_DEBUG_BROKEN_PROTOCOL
X509_set_version(x, force_version);
#else
X509_set_version(x, 2); /* version 3 certificate */
#endif
X509V3_set_ctx(&ctx2, xca, x, NULL, NULL, 0);
X509V3_set_nconf(&ctx2, conf);
if (!X509V3_EXT_add_nconf(conf, &ctx2, section, x))
@ -1186,7 +1242,11 @@ static int sign(X509 *x, EVP_PKEY *pkey, int days, int clrext,
}
if (conf) {
X509V3_CTX ctx;
#ifdef OPENSSL_SSL_DEBUG_BROKEN_PROTOCOL
X509_set_version(x, force_version);
#else
X509_set_version(x, 2); /* version 3 certificate */
#endif
X509V3_set_ctx(&ctx, x, x, NULL, NULL, 0);
X509V3_set_nconf(&ctx, conf);
if (!X509V3_EXT_add_nconf(conf, &ctx, section, x))

25
config
View File

@ -587,15 +587,33 @@ case "$GUESSOS" in
fi
;;
ppc64-*-linux2)
if [ -z "$KERNEL_BITS" ]; then
echo "WARNING! If you wish to build 64-bit library, then you have to"
echo " invoke './Configure linux-ppc64' *manually*."
if [ "$TEST" = "false" -a -t 1 ]; then
echo " You have about 5 seconds to press Ctrl-C to abort."
(trap "stty `stty -g`" 2 0; stty -icanon min 0 time 50; read waste) <&1
fi
fi
if [ "$KERNEL_BITS" = "64" ]; then
OUT="linux-ppc64"
else
OUT="linux-ppc"
(echo "__LP64__" | gcc -E -x c - 2>/dev/null | grep "^__LP64__" 2>&1 > /dev/null) || options="$options -m32"
fi
;;
ppc64le-*-linux2) OUT="linux-ppc64le" ;;
ppc-*-linux2) OUT="linux-ppc" ;;
mips64*-*-linux2)
echo "WARNING! If you wish to build 64-bit library, then you have to"
echo " invoke './Configure linux-ppc64' *manually*."
echo " invoke './Configure linux64-mips64' *manually*."
if [ "$TEST" = "false" -a -t 1 ]; then
echo " You have about 5 seconds to press Ctrl-C to abort."
(trap "stty `stty -g`" 2 0; stty -icanon min 0 time 50; read waste) <&1
fi
OUT="linux-ppc"
OUT="linux-mips64"
;;
ppc-*-linux2) OUT="linux-ppc" ;;
mips*-*-linux2) OUT="linux-mips32" ;;
ppc60x-*-vxworks*) OUT="vxworks-ppc60x" ;;
ppcgen-*-vxworks*) OUT="vxworks-ppcgen" ;;
pentium-*-vxworks*) OUT="vxworks-pentium" ;;
@ -644,6 +662,7 @@ case "$GUESSOS" in
armv[1-3]*-*-linux2) OUT="linux-generic32" ;;
armv[7-9]*-*-linux2) OUT="linux-armv4"; options="$options -march=armv7-a" ;;
arm*-*-linux2) OUT="linux-armv4" ;;
aarch64-*-linux2) OUT="linux-aarch64" ;;
sh*b-*-linux2) OUT="linux-generic32"; options="$options -DB_ENDIAN" ;;
sh*-*-linux2) OUT="linux-generic32"; options="$options -DL_ENDIAN" ;;
m68k*-*-linux2) OUT="linux-generic32"; options="$options -DB_ENDIAN" ;;

View File

@ -74,9 +74,9 @@ ia64cpuid.s: ia64cpuid.S; $(CC) $(CFLAGS) -E ia64cpuid.S > $@
ppccpuid.s: ppccpuid.pl; $(PERL) ppccpuid.pl $(PERLASM_SCHEME) $@
pariscid.s: pariscid.pl; $(PERL) pariscid.pl $(PERLASM_SCHEME) $@
alphacpuid.s: alphacpuid.pl
(preproc=/tmp/$$$$.$@; trap "rm $$preproc" INT; \
(preproc=$$$$.$@.S; trap "rm $$preproc" INT; \
$(PERL) alphacpuid.pl > $$preproc && \
$(CC) -E $$preproc > $@ && rm $$preproc)
$(CC) -E -P $$preproc > $@ && rm $$preproc)
testapps:
[ -z "$(THIS)" ] || ( if echo $(SDIRS) | fgrep ' des '; \
@ -88,7 +88,7 @@ subdirs:
@target=all; $(RECURSIVE_MAKE)
files:
$(PERL) $(TOP)/util/files.pl Makefile >> $(TOP)/MINFO
$(PERL) $(TOP)/util/files.pl "CPUID_OBJ=$(CPUID_OBJ)" Makefile >> $(TOP)/MINFO
@target=files; $(RECURSIVE_MAKE)
links:
@ -102,7 +102,7 @@ lib: $(LIB)
@touch lib
$(LIB): $(LIBOBJ)
$(AR) $(LIB) $(LIBOBJ)
[ -z "$(FIPSLIBDIR)" ] || $(AR) $(LIB) $(FIPSLIBDIR)fipscanister.o
test -z "$(FIPSLIBDIR)" || $(AR) $(LIB) $(FIPSLIBDIR)fipscanister.o
$(RANLIB) $(LIB) || echo Never mind.
shared: buildinf.h lib subdirs

View File

@ -65,12 +65,22 @@ aesni-x86_64.s: asm/aesni-x86_64.pl
$(PERL) asm/aesni-x86_64.pl $(PERLASM_SCHEME) > $@
aesni-sha1-x86_64.s: asm/aesni-sha1-x86_64.pl
$(PERL) asm/aesni-sha1-x86_64.pl $(PERLASM_SCHEME) > $@
aesni-sha256-x86_64.s: asm/aesni-sha256-x86_64.pl
$(PERL) asm/aesni-sha256-x86_64.pl $(PERLASM_SCHEME) > $@
aesni-mb-x86_64.s: asm/aesni-mb-x86_64.pl
$(PERL) asm/aesni-mb-x86_64.pl $(PERLASM_SCHEME) > $@
aes-sparcv9.s: asm/aes-sparcv9.pl
$(PERL) asm/aes-sparcv9.pl $(CFLAGS) > $@
aest4-sparcv9.s: asm/aest4-sparcv9.pl ../perlasm/sparcv9_modes.pl
$(PERL) asm/aest4-sparcv9.pl $(CFLAGS) > $@
aes-ppc.s: asm/aes-ppc.pl
$(PERL) asm/aes-ppc.pl $(PERLASM_SCHEME) $@
vpaes-ppc.s: asm/vpaes-ppc.pl
$(PERL) asm/vpaes-ppc.pl $(PERLASM_SCHEME) $@
aesp8-ppc.s: asm/aesp8-ppc.pl
$(PERL) asm/aesp8-ppc.pl $(PERLASM_SCHEME) $@
aes-parisc.s: asm/aes-parisc.pl
$(PERL) asm/aes-parisc.pl $(PERLASM_SCHEME) $@
@ -78,12 +88,18 @@ aes-parisc.s: asm/aes-parisc.pl
aes-mips.S: asm/aes-mips.pl
$(PERL) asm/aes-mips.pl $(PERLASM_SCHEME) $@
aesv8-armx.S: asm/aesv8-armx.pl
$(PERL) asm/aesv8-armx.pl $(PERLASM_SCHEME) $@
aesv8-armx.o: aesv8-armx.S
# GNU make "catch all"
aes-%.S: asm/aes-%.pl; $(PERL) $< $(PERLASM_SCHEME) > $@
aes-armv4.o: aes-armv4.S
bsaes-%.S: asm/bsaes-%.pl; $(PERL) $< $(PERLASM_SCHEME) $@
bsaes-armv7.o: bsaes-armv7.S
files:
$(PERL) $(TOP)/util/files.pl Makefile >> $(TOP)/MINFO
$(PERL) $(TOP)/util/files.pl "AES_ENC=$(AES_ENC)" Makefile >> $(TOP)/MINFO
links:
@$(PERL) $(TOP)/util/mklink.pl ../../include/openssl $(EXHEADER)
@ -149,7 +165,7 @@ aes_wrap.o: ../../e_os.h ../../include/openssl/aes.h
aes_wrap.o: ../../include/openssl/bio.h ../../include/openssl/buffer.h
aes_wrap.o: ../../include/openssl/crypto.h ../../include/openssl/e_os2.h
aes_wrap.o: ../../include/openssl/err.h ../../include/openssl/lhash.h
aes_wrap.o: ../../include/openssl/opensslconf.h
aes_wrap.o: ../../include/openssl/modes.h ../../include/openssl/opensslconf.h
aes_wrap.o: ../../include/openssl/opensslv.h ../../include/openssl/ossl_typ.h
aes_wrap.o: ../../include/openssl/safestack.h ../../include/openssl/stack.h
aes_wrap.o: ../../include/openssl/symhacks.h ../cryptlib.h aes_wrap.c

View File

@ -54,197 +54,19 @@
#include "cryptlib.h"
#include <openssl/aes.h>
#include <openssl/bio.h>
static const unsigned char default_iv[] = {
0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6, 0xA6,
};
#include <openssl/modes.h>
int AES_wrap_key(AES_KEY *key, const unsigned char *iv,
unsigned char *out,
const unsigned char *in, unsigned int inlen)
{
unsigned char *A, B[16], *R;
unsigned int i, j, t;
if ((inlen & 0x7) || (inlen < 8))
return -1;
A = B;
t = 1;
memcpy(out + 8, in, inlen);
if (!iv)
iv = default_iv;
memcpy(A, iv, 8);
for (j = 0; j < 6; j++) {
R = out + 8;
for (i = 0; i < inlen; i += 8, t++, R += 8) {
memcpy(B + 8, R, 8);
AES_encrypt(B, B, key);
A[7] ^= (unsigned char)(t & 0xff);
if (t > 0xff) {
A[6] ^= (unsigned char)((t >> 8) & 0xff);
A[5] ^= (unsigned char)((t >> 16) & 0xff);
A[4] ^= (unsigned char)((t >> 24) & 0xff);
}
memcpy(R, B + 8, 8);
}
}
memcpy(out, A, 8);
return inlen + 8;
return CRYPTO_128_wrap(key, iv, out, in, inlen, (block128_f) AES_encrypt);
}
int AES_unwrap_key(AES_KEY *key, const unsigned char *iv,
unsigned char *out,
const unsigned char *in, unsigned int inlen)
{
unsigned char *A, B[16], *R;
unsigned int i, j, t;
inlen -= 8;
if (inlen & 0x7)
return -1;
if (inlen < 8)
return -1;
A = B;
t = 6 * (inlen >> 3);
memcpy(A, in, 8);
memcpy(out, in + 8, inlen);
for (j = 0; j < 6; j++) {
R = out + inlen - 8;
for (i = 0; i < inlen; i += 8, t--, R -= 8) {
A[7] ^= (unsigned char)(t & 0xff);
if (t > 0xff) {
A[6] ^= (unsigned char)((t >> 8) & 0xff);
A[5] ^= (unsigned char)((t >> 16) & 0xff);
A[4] ^= (unsigned char)((t >> 24) & 0xff);
}
memcpy(B + 8, R, 8);
AES_decrypt(B, B, key);
memcpy(R, B + 8, 8);
}
}
if (!iv)
iv = default_iv;
if (memcmp(A, iv, 8)) {
OPENSSL_cleanse(out, inlen);
return 0;
}
return inlen;
return CRYPTO_128_unwrap(key, iv, out, in, inlen,
(block128_f) AES_decrypt);
}
#ifdef AES_WRAP_TEST
int AES_wrap_unwrap_test(const unsigned char *kek, int keybits,
const unsigned char *iv,
const unsigned char *eout,
const unsigned char *key, int keylen)
{
unsigned char *otmp = NULL, *ptmp = NULL;
int r, ret = 0;
AES_KEY wctx;
otmp = OPENSSL_malloc(keylen + 8);
ptmp = OPENSSL_malloc(keylen);
if (!otmp || !ptmp)
return 0;
if (AES_set_encrypt_key(kek, keybits, &wctx))
goto err;
r = AES_wrap_key(&wctx, iv, otmp, key, keylen);
if (r <= 0)
goto err;
if (eout && memcmp(eout, otmp, keylen))
goto err;
if (AES_set_decrypt_key(kek, keybits, &wctx))
goto err;
r = AES_unwrap_key(&wctx, iv, ptmp, otmp, r);
if (memcmp(key, ptmp, keylen))
goto err;
ret = 1;
err:
if (otmp)
OPENSSL_free(otmp);
if (ptmp)
OPENSSL_free(ptmp);
return ret;
}
int main(int argc, char **argv)
{
static const unsigned char kek[] = {
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17,
0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f
};
static const unsigned char key[] = {
0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff,
0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f
};
static const unsigned char e1[] = {
0x1f, 0xa6, 0x8b, 0x0a, 0x81, 0x12, 0xb4, 0x47,
0xae, 0xf3, 0x4b, 0xd8, 0xfb, 0x5a, 0x7b, 0x82,
0x9d, 0x3e, 0x86, 0x23, 0x71, 0xd2, 0xcf, 0xe5
};
static const unsigned char e2[] = {
0x96, 0x77, 0x8b, 0x25, 0xae, 0x6c, 0xa4, 0x35,
0xf9, 0x2b, 0x5b, 0x97, 0xc0, 0x50, 0xae, 0xd2,
0x46, 0x8a, 0xb8, 0xa1, 0x7a, 0xd8, 0x4e, 0x5d
};
static const unsigned char e3[] = {
0x64, 0xe8, 0xc3, 0xf9, 0xce, 0x0f, 0x5b, 0xa2,
0x63, 0xe9, 0x77, 0x79, 0x05, 0x81, 0x8a, 0x2a,
0x93, 0xc8, 0x19, 0x1e, 0x7d, 0x6e, 0x8a, 0xe7
};
static const unsigned char e4[] = {
0x03, 0x1d, 0x33, 0x26, 0x4e, 0x15, 0xd3, 0x32,
0x68, 0xf2, 0x4e, 0xc2, 0x60, 0x74, 0x3e, 0xdc,
0xe1, 0xc6, 0xc7, 0xdd, 0xee, 0x72, 0x5a, 0x93,
0x6b, 0xa8, 0x14, 0x91, 0x5c, 0x67, 0x62, 0xd2
};
static const unsigned char e5[] = {
0xa8, 0xf9, 0xbc, 0x16, 0x12, 0xc6, 0x8b, 0x3f,
0xf6, 0xe6, 0xf4, 0xfb, 0xe3, 0x0e, 0x71, 0xe4,
0x76, 0x9c, 0x8b, 0x80, 0xa3, 0x2c, 0xb8, 0x95,
0x8c, 0xd5, 0xd1, 0x7d, 0x6b, 0x25, 0x4d, 0xa1
};
static const unsigned char e6[] = {
0x28, 0xc9, 0xf4, 0x04, 0xc4, 0xb8, 0x10, 0xf4,
0xcb, 0xcc, 0xb3, 0x5c, 0xfb, 0x87, 0xf8, 0x26,
0x3f, 0x57, 0x86, 0xe2, 0xd8, 0x0e, 0xd3, 0x26,
0xcb, 0xc7, 0xf0, 0xe7, 0x1a, 0x99, 0xf4, 0x3b,
0xfb, 0x98, 0x8b, 0x9b, 0x7a, 0x02, 0xdd, 0x21
};
AES_KEY wctx, xctx;
int ret;
ret = AES_wrap_unwrap_test(kek, 128, NULL, e1, key, 16);
fprintf(stderr, "Key test result %d\n", ret);
ret = AES_wrap_unwrap_test(kek, 192, NULL, e2, key, 16);
fprintf(stderr, "Key test result %d\n", ret);
ret = AES_wrap_unwrap_test(kek, 256, NULL, e3, key, 16);
fprintf(stderr, "Key test result %d\n", ret);
ret = AES_wrap_unwrap_test(kek, 192, NULL, e4, key, 24);
fprintf(stderr, "Key test result %d\n", ret);
ret = AES_wrap_unwrap_test(kek, 256, NULL, e5, key, 24);
fprintf(stderr, "Key test result %d\n", ret);
ret = AES_wrap_unwrap_test(kek, 256, NULL, e6, key, 32);
fprintf(stderr, "Key test result %d\n", ret);
}
#endif

View File

@ -89,8 +89,10 @@ typedef unsigned long long u64;
#endif
#undef ROTATE
#if defined(_MSC_VER) || defined(__ICC)
# define ROTATE(a,n) _lrotl(a,n)
#if defined(_MSC_VER)
# define ROTATE(a,n) _lrotl(a,n)
#elif defined(__ICC)
# define ROTATE(a,n) _rotl(a,n)
#elif defined(__GNUC__) && __GNUC__>=2
# if defined(__i386) || defined(__i386__) || defined(__x86_64) || defined(__x86_64__)
# define ROTATE(a,n) ({ register unsigned int ret; \

View File

@ -39,7 +39,7 @@
# but exhibits up to 10% improvement on other cores.
#
# Second version is "monolithic" replacement for aes_core.c, which in
# addition to AES_[de|en]crypt implements private_AES_set_[de|en]cryption_key.
# addition to AES_[de|en]crypt implements AES_set_[de|en]cryption_key.
# This made it possible to implement little-endian variant of the
# algorithm without modifying the base C code. Motivating factor for
# the undertaken effort was that it appeared that in tight IA-32
@ -103,11 +103,12 @@
# byte for 128-bit key.
#
# ECB encrypt ECB decrypt CBC large chunk
# P4 56[60] 84[100] 23
# AMD K8 48[44] 70[79] 18
# PIII 41[50] 61[91] 24
# Core 2 32[38] 45[70] 18.5
# Pentium 120 160 77
# P4 52[54] 83[95] 23
# AMD K8 46[41] 66[70] 18
# PIII 41[50] 60[77] 24
# Core 2 31[36] 45[64] 18.5
# Atom 76[100] 96[138] 60
# Pentium 115 150 77
#
# Version 4.1 switches to compact S-box even in key schedule setup.
#
@ -242,7 +243,7 @@ $vertical_spin=0; # shift "verticaly" defaults to 0, because of
sub encvert()
{ my ($te,@s) = @_;
my $v0 = $acc, $v1 = $key;
my ($v0,$v1) = ($acc,$key);
&mov ($v0,$s[3]); # copy s3
&mov (&DWP(4,"esp"),$s[2]); # save s2
@ -299,7 +300,7 @@ sub encvert()
# Another experimental routine, which features "horizontal spin," but
# eliminates one reference to stack. Strangely enough runs slower...
sub enchoriz()
{ my $v0 = $key, $v1 = $acc;
{ my ($v0,$v1) = ($key,$acc);
&movz ($v0,&LB($s0)); # 3, 2, 1, 0*
&rotr ($s2,8); # 8,11,10, 9
@ -427,7 +428,7 @@ sub sse_encbody()
######################################################################
sub enccompact()
{ my $Fn = mov;
{ my $Fn = \&mov;
while ($#_>5) { pop(@_); $Fn=sub{}; }
my ($i,$te,@s)=@_;
my $tmp = $key;
@ -476,24 +477,25 @@ sub enctransform()
my $tmp = $tbl;
my $r2 = $key ;
&mov ($acc,$s[$i]);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&shr ($tmp,7);
&and ($tmp,$s[$i]);
&lea ($r2,&DWP(0,$s[$i],$s[$i]));
&sub ($acc,$tmp);
&mov ($acc,$tmp);
&shr ($tmp,7);
&and ($r2,0xfefefefe);
&and ($acc,0x1b1b1b1b);
&sub ($acc,$tmp);
&mov ($tmp,$s[$i]);
&and ($acc,0x1b1b1b1b);
&rotr ($tmp,16);
&xor ($acc,$r2); # r2
&mov ($r2,$s[$i]);
&xor ($s[$i],$acc); # r0 ^ r2
&rotr ($r2,16+8);
&xor ($acc,$tmp);
&rotl ($s[$i],24);
&xor ($s[$i],$acc) # ROTATE(r2^r0,24) ^ r2
&rotr ($tmp,16);
&xor ($s[$i],$tmp);
&rotr ($tmp,8);
&xor ($s[$i],$tmp);
&xor ($acc,$r2);
&mov ($tmp,0x80808080) if ($i!=1);
&xor ($s[$i],$acc); # ROTATE(r2^r0,24) ^ r2
}
&function_begin_B("_x86_AES_encrypt_compact");
@ -526,6 +528,7 @@ sub enctransform()
&enccompact(1,$tbl,$s1,$s2,$s3,$s0,1);
&enccompact(2,$tbl,$s2,$s3,$s0,$s1,1);
&enccompact(3,$tbl,$s3,$s0,$s1,$s2,1);
&mov ($tbl,0x80808080);
&enctransform(2);
&enctransform(3);
&enctransform(0);
@ -607,82 +610,84 @@ sub sse_enccompact()
&pshufw ("mm5","mm4",0x0d); # 15,14,11,10
&movd ("eax","mm1"); # 5, 4, 1, 0
&movd ("ebx","mm5"); # 15,14,11,10
&mov ($__key,$key);
&movz ($acc,&LB("eax")); # 0
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 0
&pshufw ("mm2","mm0",0x0d); # 7, 6, 3, 2
&movz ("edx",&HB("eax")); # 1
&pshufw ("mm2","mm0",0x0d); # 7, 6, 3, 2
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 0
&movz ($key,&LB("ebx")); # 10
&movz ("edx",&BP(-128,$tbl,"edx",1)); # 1
&shl ("edx",8); # 1
&shr ("eax",16); # 5, 4
&shl ("edx",8); # 1
&movz ($acc,&LB("ebx")); # 10
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 10
&movz ($acc,&BP(-128,$tbl,$key,1)); # 10
&movz ($key,&HB("ebx")); # 11
&shl ($acc,16); # 10
&or ("ecx",$acc); # 10
&pshufw ("mm6","mm4",0x08); # 13,12, 9, 8
&movz ($acc,&HB("ebx")); # 11
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 11
&or ("ecx",$acc); # 10
&movz ($acc,&BP(-128,$tbl,$key,1)); # 11
&movz ($key,&HB("eax")); # 5
&shl ($acc,24); # 11
&or ("edx",$acc); # 11
&shr ("ebx",16); # 15,14
&or ("edx",$acc); # 11
&movz ($acc,&HB("eax")); # 5
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 5
&movz ($acc,&BP(-128,$tbl,$key,1)); # 5
&movz ($key,&HB("ebx")); # 15
&shl ($acc,8); # 5
&or ("ecx",$acc); # 5
&movz ($acc,&HB("ebx")); # 15
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 15
&movz ($acc,&BP(-128,$tbl,$key,1)); # 15
&movz ($key,&LB("eax")); # 4
&shl ($acc,24); # 15
&or ("ecx",$acc); # 15
&movd ("mm0","ecx"); # t[0] collected
&movz ($acc,&LB("eax")); # 4
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 4
&movz ($acc,&BP(-128,$tbl,$key,1)); # 4
&movz ($key,&LB("ebx")); # 14
&movd ("eax","mm2"); # 7, 6, 3, 2
&movz ($acc,&LB("ebx")); # 14
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 14
&shl ($acc,16); # 14
&movd ("mm0","ecx"); # t[0] collected
&movz ("ecx",&BP(-128,$tbl,$key,1)); # 14
&movz ($key,&HB("eax")); # 3
&shl ("ecx",16); # 14
&movd ("ebx","mm6"); # 13,12, 9, 8
&or ("ecx",$acc); # 14
&movd ("ebx","mm6"); # 13,12, 9, 8
&movz ($acc,&HB("eax")); # 3
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 3
&movz ($acc,&BP(-128,$tbl,$key,1)); # 3
&movz ($key,&HB("ebx")); # 9
&shl ($acc,24); # 3
&or ("ecx",$acc); # 3
&movz ($acc,&HB("ebx")); # 9
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 9
&movz ($acc,&BP(-128,$tbl,$key,1)); # 9
&movz ($key,&LB("ebx")); # 8
&shl ($acc,8); # 9
&or ("ecx",$acc); # 9
&movd ("mm1","ecx"); # t[1] collected
&movz ($acc,&LB("ebx")); # 8
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 8
&shr ("ebx",16); # 13,12
&movz ($acc,&LB("eax")); # 2
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 2
&shl ($acc,16); # 2
&or ("ecx",$acc); # 2
&or ("ecx",$acc); # 9
&movz ($acc,&BP(-128,$tbl,$key,1)); # 8
&movz ($key,&LB("eax")); # 2
&shr ("eax",16); # 7, 6
&movd ("mm1","ecx"); # t[1] collected
&movz ("ecx",&BP(-128,$tbl,$key,1)); # 2
&movz ($key,&HB("eax")); # 7
&shl ("ecx",16); # 2
&and ("eax",0xff); # 6
&or ("ecx",$acc); # 2
&punpckldq ("mm0","mm1"); # t[0,1] collected
&movz ($acc,&HB("eax")); # 7
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 7
&movz ($acc,&BP(-128,$tbl,$key,1)); # 7
&movz ($key,&HB("ebx")); # 13
&shl ($acc,24); # 7
&or ("ecx",$acc); # 7
&and ("eax",0xff); # 6
&movz ("eax",&BP(-128,$tbl,"eax",1)); # 6
&shl ("eax",16); # 6
&or ("edx","eax"); # 6
&movz ($acc,&HB("ebx")); # 13
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 13
&shl ($acc,8); # 13
&or ("ecx",$acc); # 13
&movd ("mm4","ecx"); # t[2] collected
&and ("ebx",0xff); # 12
&movz ("eax",&BP(-128,$tbl,"eax",1)); # 6
&or ("ecx",$acc); # 7
&shl ("eax",16); # 6
&movz ($acc,&BP(-128,$tbl,$key,1)); # 13
&or ("edx","eax"); # 6
&shl ($acc,8); # 13
&movz ("ebx",&BP(-128,$tbl,"ebx",1)); # 12
&or ("ecx",$acc); # 13
&or ("edx","ebx"); # 12
&mov ($key,$__key);
&movd ("mm4","ecx"); # t[2] collected
&movd ("mm5","edx"); # t[3] collected
&punpckldq ("mm4","mm5"); # t[2,3] collected
@ -1222,7 +1227,7 @@ sub enclast()
######################################################################
sub deccompact()
{ my $Fn = mov;
{ my $Fn = \&mov;
while ($#_>5) { pop(@_); $Fn=sub{}; }
my ($i,$td,@s)=@_;
my $tmp = $key;
@ -1270,30 +1275,30 @@ sub dectransform()
my $tp4 = @s[($i+3)%4]; $tp4 = @s[3] if ($i==1);
my $tp8 = $tbl;
&mov ($acc,$s[$i]);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&mov ($tmp,0x80808080);
&and ($tmp,$s[$i]);
&mov ($acc,$tmp);
&shr ($tmp,7);
&lea ($tp2,&DWP(0,$s[$i],$s[$i]));
&sub ($acc,$tmp);
&and ($tp2,0xfefefefe);
&and ($acc,0x1b1b1b1b);
&xor ($acc,$tp2);
&mov ($tp2,$acc);
&xor ($tp2,$acc);
&mov ($tmp,0x80808080);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&and ($tmp,$tp2);
&mov ($acc,$tmp);
&shr ($tmp,7);
&lea ($tp4,&DWP(0,$tp2,$tp2));
&sub ($acc,$tmp);
&and ($tp4,0xfefefefe);
&and ($acc,0x1b1b1b1b);
&xor ($tp2,$s[$i]); # tp2^tp1
&xor ($acc,$tp4);
&mov ($tp4,$acc);
&xor ($tp4,$acc);
&mov ($tmp,0x80808080);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&and ($tmp,$tp4);
&mov ($acc,$tmp);
&shr ($tmp,7);
&lea ($tp8,&DWP(0,$tp4,$tp4));
&sub ($acc,$tmp);
@ -1305,13 +1310,13 @@ sub dectransform()
&xor ($s[$i],$tp2);
&xor ($tp2,$tp8);
&rotl ($tp2,24);
&xor ($s[$i],$tp4);
&xor ($tp4,$tp8);
&rotl ($tp4,16);
&rotl ($tp2,24);
&xor ($s[$i],$tp8); # ^= tp8^(tp4^tp1)^(tp2^tp1)
&rotl ($tp8,8);
&rotl ($tp4,16);
&xor ($s[$i],$tp2); # ^= ROTATE(tp8^tp2^tp1,24)
&rotl ($tp8,8);
&xor ($s[$i],$tp4); # ^= ROTATE(tp8^tp4^tp1,16)
&mov ($s[0],$__s0) if($i==2); #prefetch $s0
&mov ($s[1],$__s1) if($i==3); #prefetch $s1
@ -1389,85 +1394,87 @@ sub dectransform()
sub sse_deccompact()
{
&pshufw ("mm1","mm0",0x0c); # 7, 6, 1, 0
&movd ("eax","mm1"); # 7, 6, 1, 0
&pshufw ("mm5","mm4",0x09); # 13,12,11,10
&movz ($acc,&LB("eax")); # 0
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 0
&movd ("eax","mm1"); # 7, 6, 1, 0
&movd ("ebx","mm5"); # 13,12,11,10
&mov ($__key,$key);
&movz ($acc,&LB("eax")); # 0
&movz ("edx",&HB("eax")); # 1
&pshufw ("mm2","mm0",0x06); # 3, 2, 5, 4
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 0
&movz ($key,&LB("ebx")); # 10
&movz ("edx",&BP(-128,$tbl,"edx",1)); # 1
&shr ("eax",16); # 7, 6
&shl ("edx",8); # 1
&pshufw ("mm2","mm0",0x06); # 3, 2, 5, 4
&movz ($acc,&LB("ebx")); # 10
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 10
&movz ($acc,&BP(-128,$tbl,$key,1)); # 10
&movz ($key,&HB("ebx")); # 11
&shl ($acc,16); # 10
&or ("ecx",$acc); # 10
&shr ("eax",16); # 7, 6
&movz ($acc,&HB("ebx")); # 11
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 11
&shl ($acc,24); # 11
&or ("edx",$acc); # 11
&shr ("ebx",16); # 13,12
&pshufw ("mm6","mm4",0x03); # 9, 8,15,14
&movz ($acc,&HB("eax")); # 7
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 7
&or ("ecx",$acc); # 10
&movz ($acc,&BP(-128,$tbl,$key,1)); # 11
&movz ($key,&HB("eax")); # 7
&shl ($acc,24); # 11
&shr ("ebx",16); # 13,12
&or ("edx",$acc); # 11
&movz ($acc,&BP(-128,$tbl,$key,1)); # 7
&movz ($key,&HB("ebx")); # 13
&shl ($acc,24); # 7
&or ("ecx",$acc); # 7
&movz ($acc,&HB("ebx")); # 13
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 13
&movz ($acc,&BP(-128,$tbl,$key,1)); # 13
&movz ($key,&LB("eax")); # 6
&shl ($acc,8); # 13
&or ("ecx",$acc); # 13
&movd ("mm0","ecx"); # t[0] collected
&movz ($acc,&LB("eax")); # 6
&movd ("eax","mm2"); # 3, 2, 5, 4
&movz ("ecx",&BP(-128,$tbl,$acc,1)); # 6
&shl ("ecx",16); # 6
&movz ($acc,&LB("ebx")); # 12
&or ("ecx",$acc); # 13
&movz ($acc,&BP(-128,$tbl,$key,1)); # 6
&movz ($key,&LB("ebx")); # 12
&shl ($acc,16); # 6
&movd ("ebx","mm6"); # 9, 8,15,14
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 12
&movd ("mm0","ecx"); # t[0] collected
&movz ("ecx",&BP(-128,$tbl,$key,1)); # 12
&movz ($key,&LB("eax")); # 4
&or ("ecx",$acc); # 12
&movz ($acc,&LB("eax")); # 4
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 4
&movz ($acc,&BP(-128,$tbl,$key,1)); # 4
&movz ($key,&LB("ebx")); # 14
&or ("edx",$acc); # 4
&movz ($acc,&LB("ebx")); # 14
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 14
&movz ($acc,&BP(-128,$tbl,$key,1)); # 14
&movz ($key,&HB("eax")); # 5
&shl ($acc,16); # 14
&or ("edx",$acc); # 14
&movd ("mm1","edx"); # t[1] collected
&movz ($acc,&HB("eax")); # 5
&movz ("edx",&BP(-128,$tbl,$acc,1)); # 5
&shl ("edx",8); # 5
&movz ($acc,&HB("ebx")); # 15
&shr ("eax",16); # 3, 2
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 15
&shl ($acc,24); # 15
&or ("edx",$acc); # 15
&or ("edx",$acc); # 14
&movz ($acc,&BP(-128,$tbl,$key,1)); # 5
&movz ($key,&HB("ebx")); # 15
&shr ("ebx",16); # 9, 8
&shl ($acc,8); # 5
&movd ("mm1","edx"); # t[1] collected
&movz ("edx",&BP(-128,$tbl,$key,1)); # 15
&movz ($key,&HB("ebx")); # 9
&shl ("edx",24); # 15
&and ("ebx",0xff); # 8
&or ("edx",$acc); # 15
&punpckldq ("mm0","mm1"); # t[0,1] collected
&movz ($acc,&HB("ebx")); # 9
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 9
&movz ($acc,&BP(-128,$tbl,$key,1)); # 9
&movz ($key,&LB("eax")); # 2
&shl ($acc,8); # 9
&or ("ecx",$acc); # 9
&and ("ebx",0xff); # 8
&movz ("ebx",&BP(-128,$tbl,"ebx",1)); # 8
&or ("edx","ebx"); # 8
&movz ($acc,&LB("eax")); # 2
&movz ($acc,&BP(-128,$tbl,$acc,1)); # 2
&shl ($acc,16); # 2
&or ("edx",$acc); # 2
&movd ("mm4","edx"); # t[2] collected
&movz ("eax",&HB("eax")); # 3
&movz ("ebx",&BP(-128,$tbl,"ebx",1)); # 8
&or ("ecx",$acc); # 9
&movz ($acc,&BP(-128,$tbl,$key,1)); # 2
&or ("edx","ebx"); # 8
&shl ($acc,16); # 2
&movz ("eax",&BP(-128,$tbl,"eax",1)); # 3
&or ("edx",$acc); # 2
&shl ("eax",24); # 3
&or ("ecx","eax"); # 3
&mov ($key,$__key);
&movd ("mm4","edx"); # t[2] collected
&movd ("mm5","ecx"); # t[3] collected
&punpckldq ("mm4","mm5"); # t[2,3] collected
@ -2181,8 +2188,8 @@ my $mark=&DWP(76+240,"esp"); # copy of aes_key->rounds
&mov ("ecx",240/4);
&xor ("eax","eax");
&align (4);
&data_word(0xABF3F689); # rep stosd
&set_label("skip_ezero")
&data_word(0xABF3F689); # rep stosd
&set_label("skip_ezero");
&mov ("esp",$_esp);
&popf ();
&set_label("drop_out");
@ -2301,8 +2308,8 @@ my $mark=&DWP(76+240,"esp"); # copy of aes_key->rounds
&mov ("ecx",240/4);
&xor ("eax","eax");
&align (4);
&data_word(0xABF3F689); # rep stosd
&set_label("skip_dzero")
&data_word(0xABF3F689); # rep stosd
&set_label("skip_dzero");
&mov ("esp",$_esp);
&popf ();
&function_end_A();
@ -2865,32 +2872,32 @@ sub deckey()
{ my ($i,$key,$tp1,$tp2,$tp4,$tp8) = @_;
my $tmp = $tbl;
&mov ($acc,$tp1);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&shr ($tmp,7);
&mov ($tmp,0x80808080);
&and ($tmp,$tp1);
&lea ($tp2,&DWP(0,$tp1,$tp1));
&mov ($acc,$tmp);
&shr ($tmp,7);
&sub ($acc,$tmp);
&and ($tp2,0xfefefefe);
&and ($acc,0x1b1b1b1b);
&xor ($acc,$tp2);
&mov ($tp2,$acc);
&xor ($tp2,$acc);
&mov ($tmp,0x80808080);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&shr ($tmp,7);
&and ($tmp,$tp2);
&lea ($tp4,&DWP(0,$tp2,$tp2));
&mov ($acc,$tmp);
&shr ($tmp,7);
&sub ($acc,$tmp);
&and ($tp4,0xfefefefe);
&and ($acc,0x1b1b1b1b);
&xor ($tp2,$tp1); # tp2^tp1
&xor ($acc,$tp4);
&mov ($tp4,$acc);
&xor ($tp4,$acc);
&mov ($tmp,0x80808080);
&and ($acc,0x80808080);
&mov ($tmp,$acc);
&shr ($tmp,7);
&and ($tmp,$tp4);
&lea ($tp8,&DWP(0,$tp4,$tp4));
&mov ($acc,$tmp);
&shr ($tmp,7);
&xor ($tp4,$tp1); # tp4^tp1
&sub ($acc,$tmp);
&and ($tp8,0xfefefefe);

View File

@ -1,7 +1,7 @@
#!/usr/bin/env perl
# ====================================================================
# Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
@ -51,9 +51,23 @@ $key="r11";
$rounds="r12";
$code=<<___;
#include "arm_arch.h"
#ifndef __KERNEL__
# include "arm_arch.h"
#else
# define __ARM_ARCH__ __LINUX_ARM_ARCH__
#endif
.text
#if __ARM_ARCH__<7
.code 32
#else
.syntax unified
# ifdef __thumb2__
.thumb
# else
.code 32
# endif
#endif
.type AES_Te,%object
.align 5
@ -167,7 +181,11 @@ AES_Te:
.type AES_encrypt,%function
.align 5
AES_encrypt:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ AES_encrypt
#else
adr r3,AES_encrypt
#endif
stmdb sp!,{r1,r4-r12,lr}
mov $rounds,r0 @ inp
mov $key,r2
@ -409,11 +427,21 @@ _armv4_AES_encrypt:
.align 5
private_AES_set_encrypt_key:
_armv4_AES_set_encrypt_key:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ AES_set_encrypt_key
#else
adr r3,private_AES_set_encrypt_key
#endif
teq r0,#0
#if __ARM_ARCH__>=7
itt eq @ Thumb2 thing, sanity check in ARM
#endif
moveq r0,#-1
beq .Labrt
teq r2,#0
#if __ARM_ARCH__>=7
itt eq @ Thumb2 thing, sanity check in ARM
#endif
moveq r0,#-1
beq .Labrt
@ -422,6 +450,9 @@ _armv4_AES_set_encrypt_key:
teq r1,#192
beq .Lok
teq r1,#256
#if __ARM_ARCH__>=7
itt ne @ Thumb2 thing, sanity check in ARM
#endif
movne r0,#-1
bne .Labrt
@ -576,6 +607,9 @@ _armv4_AES_set_encrypt_key:
str $s2,[$key,#-16]
subs $rounds,$rounds,#1
str $s3,[$key,#-12]
#if __ARM_ARCH__>=7
itt eq @ Thumb2 thing, sanity check in ARM
#endif
subeq r2,$key,#216
beq .Ldone
@ -645,6 +679,9 @@ _armv4_AES_set_encrypt_key:
str $s2,[$key,#-24]
subs $rounds,$rounds,#1
str $s3,[$key,#-20]
#if __ARM_ARCH__>=7
itt eq @ Thumb2 thing, sanity check in ARM
#endif
subeq r2,$key,#256
beq .Ldone
@ -674,11 +711,17 @@ _armv4_AES_set_encrypt_key:
str $i3,[$key,#-4]
b .L256_loop
.align 2
.Ldone: mov r0,#0
ldmia sp!,{r4-r12,lr}
.Labrt: tst lr,#1
.Labrt:
#if __ARM_ARCH__>=5
ret @ bx lr
#else
tst lr,#1
moveq pc,lr @ be binary compatible with V4, yet
bx lr @ interoperable with Thumb ISA:-)
#endif
.size private_AES_set_encrypt_key,.-private_AES_set_encrypt_key
.global private_AES_set_decrypt_key
@ -688,34 +731,57 @@ private_AES_set_decrypt_key:
str lr,[sp,#-4]! @ push lr
bl _armv4_AES_set_encrypt_key
teq r0,#0
ldrne lr,[sp],#4 @ pop lr
ldr lr,[sp],#4 @ pop lr
bne .Labrt
stmdb sp!,{r4-r12}
mov r0,r2 @ AES_set_encrypt_key preserves r2,
mov r1,r2 @ which is AES_KEY *key
b _armv4_AES_set_enc2dec_key
.size private_AES_set_decrypt_key,.-private_AES_set_decrypt_key
ldr $rounds,[r2,#240] @ AES_set_encrypt_key preserves r2,
mov $key,r2 @ which is AES_KEY *key
mov $i1,r2
add $i2,r2,$rounds,lsl#4
@ void AES_set_enc2dec_key(const AES_KEY *inp,AES_KEY *out)
.global AES_set_enc2dec_key
.type AES_set_enc2dec_key,%function
.align 5
AES_set_enc2dec_key:
_armv4_AES_set_enc2dec_key:
stmdb sp!,{r4-r12,lr}
.Linv: ldr $s0,[$i1]
ldr $rounds,[r0,#240]
mov $i1,r0 @ input
add $i2,r0,$rounds,lsl#4
mov $key,r1 @ ouput
add $tbl,r1,$rounds,lsl#4
str $rounds,[r1,#240]
.Linv: ldr $s0,[$i1],#16
ldr $s1,[$i1,#-12]
ldr $s2,[$i1,#-8]
ldr $s3,[$i1,#-4]
ldr $t1,[$i2],#-16
ldr $t2,[$i2,#16+4]
ldr $t3,[$i2,#16+8]
ldr $i3,[$i2,#16+12]
str $s0,[$tbl],#-16
str $s1,[$tbl,#16+4]
str $s2,[$tbl,#16+8]
str $s3,[$tbl,#16+12]
str $t1,[$key],#16
str $t2,[$key,#-12]
str $t3,[$key,#-8]
str $i3,[$key,#-4]
teq $i1,$i2
bne .Linv
ldr $s0,[$i1]
ldr $s1,[$i1,#4]
ldr $s2,[$i1,#8]
ldr $s3,[$i1,#12]
ldr $t1,[$i2]
ldr $t2,[$i2,#4]
ldr $t3,[$i2,#8]
ldr $i3,[$i2,#12]
str $s0,[$i2],#-16
str $s1,[$i2,#16+4]
str $s2,[$i2,#16+8]
str $s3,[$i2,#16+12]
str $t1,[$i1],#16
str $t2,[$i1,#-12]
str $t3,[$i1,#-8]
str $i3,[$i1,#-4]
teq $i1,$i2
bne .Linv
str $s0,[$key]
str $s1,[$key,#4]
str $s2,[$key,#8]
str $s3,[$key,#12]
sub $key,$key,$rounds,lsl#3
___
$mask80=$i1;
$mask1b=$i2;
@ -773,7 +839,7 @@ $code.=<<___;
moveq pc,lr @ be binary compatible with V4, yet
bx lr @ interoperable with Thumb ISA:-)
#endif
.size private_AES_set_decrypt_key,.-private_AES_set_decrypt_key
.size AES_set_enc2dec_key,.-AES_set_enc2dec_key
.type AES_Td,%object
.align 5
@ -883,7 +949,11 @@ AES_Td:
.type AES_decrypt,%function
.align 5
AES_decrypt:
#if __ARM_ARCH__<7
sub r3,pc,#8 @ AES_decrypt
#else
adr r3,AES_decrypt
#endif
stmdb sp!,{r1,r4-r12,lr}
mov $rounds,r0 @ inp
mov $key,r2
@ -1080,8 +1150,9 @@ _armv4_AES_decrypt:
ldrb $t3,[$tbl,$i3] @ Td4[s0>>0]
and $i3,lr,$s1,lsr#8
add $s1,$tbl,$s1,lsr#24
ldrb $i1,[$tbl,$i1] @ Td4[s1>>0]
ldrb $s1,[$tbl,$s1,lsr#24] @ Td4[s1>>24]
ldrb $s1,[$s1] @ Td4[s1>>24]
ldrb $i2,[$tbl,$i2] @ Td4[s1>>16]
eor $s0,$i1,$s0,lsl#24
ldrb $i3,[$tbl,$i3] @ Td4[s1>>8]
@ -1094,7 +1165,8 @@ _armv4_AES_decrypt:
ldrb $i2,[$tbl,$i2] @ Td4[s2>>0]
and $i3,lr,$s2,lsr#16
ldrb $s2,[$tbl,$s2,lsr#24] @ Td4[s2>>24]
add $s2,$tbl,$s2,lsr#24
ldrb $s2,[$s2] @ Td4[s2>>24]
eor $s0,$s0,$i1,lsl#8
ldrb $i3,[$tbl,$i3] @ Td4[s2>>16]
eor $s1,$i2,$s1,lsl#16
@ -1106,8 +1178,9 @@ _armv4_AES_decrypt:
ldrb $i2,[$tbl,$i2] @ Td4[s3>>8]
and $i3,lr,$s3 @ i2
add $s3,$tbl,$s3,lsr#24
ldrb $i3,[$tbl,$i3] @ Td4[s3>>0]
ldrb $s3,[$tbl,$s3,lsr#24] @ Td4[s3>>24]
ldrb $s3,[$s3] @ Td4[s3>>24]
eor $s0,$s0,$i1,lsl#16
ldr $i1,[$key,#0]
eor $s1,$s1,$i2,lsl#8
@ -1130,5 +1203,15 @@ _armv4_AES_decrypt:
___
$code =~ s/\bbx\s+lr\b/.word\t0xe12fff1e/gm; # make it possible to compile with -march=armv4
$code =~ s/\bret\b/bx\tlr/gm;
open SELF,$0;
while(<SELF>) {
next if (/^#!/);
last if (!s/^#/@/ and !/^$/);
print;
}
close SELF;
print $code;
close STDOUT; # enforce flush

File diff suppressed because it is too large Load Diff

View File

@ -45,6 +45,8 @@ if ($flavour =~ /64/) {
$PUSH ="stw";
} else { die "nonsense $flavour"; }
$LITTLE_ENDIAN = ($flavour=~/le$/) ? $SIZE_T : 0;
$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
( $xlate="${dir}ppc-xlate.pl" and -f $xlate ) or
( $xlate="${dir}../../perlasm/ppc-xlate.pl" and -f $xlate) or
@ -68,7 +70,7 @@ $key="r5";
$Tbl0="r3";
$Tbl1="r6";
$Tbl2="r7";
$Tbl3="r2";
$Tbl3=$out; # stay away from "r2"; $out is offloaded to stack
$s0="r8";
$s1="r9";
@ -76,7 +78,7 @@ $s2="r10";
$s3="r11";
$t0="r12";
$t1="r13";
$t1="r0"; # stay away from "r13";
$t2="r14";
$t3="r15";
@ -100,9 +102,6 @@ $acc13="r29";
$acc14="r30";
$acc15="r31";
# stay away from TLS pointer
if ($SIZE_T==8) { die if ($t1 ne "r13"); $t1="r0"; }
else { die if ($Tbl3 ne "r2"); $Tbl3=$t0; $t0="r0"; }
$mask80=$Tbl2;
$mask1b=$Tbl3;
@ -337,8 +336,7 @@ $code.=<<___;
$STU $sp,-$FRAME($sp)
mflr r0
$PUSH $toc,`$FRAME-$SIZE_T*20`($sp)
$PUSH r13,`$FRAME-$SIZE_T*19`($sp)
$PUSH $out,`$FRAME-$SIZE_T*19`($sp)
$PUSH r14,`$FRAME-$SIZE_T*18`($sp)
$PUSH r15,`$FRAME-$SIZE_T*17`($sp)
$PUSH r16,`$FRAME-$SIZE_T*16`($sp)
@ -365,16 +363,61 @@ $code.=<<___;
bne Lenc_unaligned
Lenc_unaligned_ok:
___
$code.=<<___ if (!$LITTLE_ENDIAN);
lwz $s0,0($inp)
lwz $s1,4($inp)
lwz $s2,8($inp)
lwz $s3,12($inp)
___
$code.=<<___ if ($LITTLE_ENDIAN);
lwz $t0,0($inp)
lwz $t1,4($inp)
lwz $t2,8($inp)
lwz $t3,12($inp)
rotlwi $s0,$t0,8
rotlwi $s1,$t1,8
rotlwi $s2,$t2,8
rotlwi $s3,$t3,8
rlwimi $s0,$t0,24,0,7
rlwimi $s1,$t1,24,0,7
rlwimi $s2,$t2,24,0,7
rlwimi $s3,$t3,24,0,7
rlwimi $s0,$t0,24,16,23
rlwimi $s1,$t1,24,16,23
rlwimi $s2,$t2,24,16,23
rlwimi $s3,$t3,24,16,23
___
$code.=<<___;
bl LAES_Te
bl Lppc_AES_encrypt_compact
$POP $out,`$FRAME-$SIZE_T*19`($sp)
___
$code.=<<___ if ($LITTLE_ENDIAN);
rotlwi $t0,$s0,8
rotlwi $t1,$s1,8
rotlwi $t2,$s2,8
rotlwi $t3,$s3,8
rlwimi $t0,$s0,24,0,7
rlwimi $t1,$s1,24,0,7
rlwimi $t2,$s2,24,0,7
rlwimi $t3,$s3,24,0,7
rlwimi $t0,$s0,24,16,23
rlwimi $t1,$s1,24,16,23
rlwimi $t2,$s2,24,16,23
rlwimi $t3,$s3,24,16,23
stw $t0,0($out)
stw $t1,4($out)
stw $t2,8($out)
stw $t3,12($out)
___
$code.=<<___ if (!$LITTLE_ENDIAN);
stw $s0,0($out)
stw $s1,4($out)
stw $s2,8($out)
stw $s3,12($out)
___
$code.=<<___;
b Lenc_done
Lenc_unaligned:
@ -417,6 +460,7 @@ Lenc_xpage:
bl LAES_Te
bl Lppc_AES_encrypt_compact
$POP $out,`$FRAME-$SIZE_T*19`($sp)
extrwi $acc00,$s0,8,0
extrwi $acc01,$s0,8,8
@ -449,8 +493,6 @@ Lenc_xpage:
Lenc_done:
$POP r0,`$FRAME+$LRSAVE`($sp)
$POP $toc,`$FRAME-$SIZE_T*20`($sp)
$POP r13,`$FRAME-$SIZE_T*19`($sp)
$POP r14,`$FRAME-$SIZE_T*18`($sp)
$POP r15,`$FRAME-$SIZE_T*17`($sp)
$POP r16,`$FRAME-$SIZE_T*16`($sp)
@ -764,6 +806,7 @@ Lenc_compact_done:
blr
.long 0
.byte 0,12,0x14,0,0,0,0,0
.size .AES_encrypt,.-.AES_encrypt
.globl .AES_decrypt
.align 7
@ -771,8 +814,7 @@ Lenc_compact_done:
$STU $sp,-$FRAME($sp)
mflr r0
$PUSH $toc,`$FRAME-$SIZE_T*20`($sp)
$PUSH r13,`$FRAME-$SIZE_T*19`($sp)
$PUSH $out,`$FRAME-$SIZE_T*19`($sp)
$PUSH r14,`$FRAME-$SIZE_T*18`($sp)
$PUSH r15,`$FRAME-$SIZE_T*17`($sp)
$PUSH r16,`$FRAME-$SIZE_T*16`($sp)
@ -799,16 +841,61 @@ Lenc_compact_done:
bne Ldec_unaligned
Ldec_unaligned_ok:
___
$code.=<<___ if (!$LITTLE_ENDIAN);
lwz $s0,0($inp)
lwz $s1,4($inp)
lwz $s2,8($inp)
lwz $s3,12($inp)
___
$code.=<<___ if ($LITTLE_ENDIAN);
lwz $t0,0($inp)
lwz $t1,4($inp)
lwz $t2,8($inp)
lwz $t3,12($inp)
rotlwi $s0,$t0,8
rotlwi $s1,$t1,8
rotlwi $s2,$t2,8
rotlwi $s3,$t3,8
rlwimi $s0,$t0,24,0,7
rlwimi $s1,$t1,24,0,7
rlwimi $s2,$t2,24,0,7
rlwimi $s3,$t3,24,0,7
rlwimi $s0,$t0,24,16,23
rlwimi $s1,$t1,24,16,23
rlwimi $s2,$t2,24,16,23
rlwimi $s3,$t3,24,16,23
___
$code.=<<___;
bl LAES_Td
bl Lppc_AES_decrypt_compact
$POP $out,`$FRAME-$SIZE_T*19`($sp)
___
$code.=<<___ if ($LITTLE_ENDIAN);
rotlwi $t0,$s0,8
rotlwi $t1,$s1,8
rotlwi $t2,$s2,8
rotlwi $t3,$s3,8
rlwimi $t0,$s0,24,0,7
rlwimi $t1,$s1,24,0,7
rlwimi $t2,$s2,24,0,7
rlwimi $t3,$s3,24,0,7
rlwimi $t0,$s0,24,16,23
rlwimi $t1,$s1,24,16,23
rlwimi $t2,$s2,24,16,23
rlwimi $t3,$s3,24,16,23
stw $t0,0($out)
stw $t1,4($out)
stw $t2,8($out)
stw $t3,12($out)
___
$code.=<<___ if (!$LITTLE_ENDIAN);
stw $s0,0($out)
stw $s1,4($out)
stw $s2,8($out)
stw $s3,12($out)
___
$code.=<<___;
b Ldec_done
Ldec_unaligned:
@ -851,6 +938,7 @@ Ldec_xpage:
bl LAES_Td
bl Lppc_AES_decrypt_compact
$POP $out,`$FRAME-$SIZE_T*19`($sp)
extrwi $acc00,$s0,8,0
extrwi $acc01,$s0,8,8
@ -883,8 +971,6 @@ Ldec_xpage:
Ldec_done:
$POP r0,`$FRAME+$LRSAVE`($sp)
$POP $toc,`$FRAME-$SIZE_T*20`($sp)
$POP r13,`$FRAME-$SIZE_T*19`($sp)
$POP r14,`$FRAME-$SIZE_T*18`($sp)
$POP r15,`$FRAME-$SIZE_T*17`($sp)
$POP r16,`$FRAME-$SIZE_T*16`($sp)
@ -1355,6 +1441,7 @@ Ldec_compact_done:
blr
.long 0
.byte 0,12,0x14,0,0,0,0,0
.size .AES_decrypt,.-.AES_decrypt
.asciz "AES for PPC, CRYPTOGAMS by <appro\@openssl.org>"
.align 7

View File

@ -19,9 +19,10 @@
# Performance in number of cycles per processed byte for 128-bit key:
#
# ECB encrypt ECB decrypt CBC large chunk
# AMD64 33 41 13.0
# EM64T 38 59 18.6(*)
# Core 2 30 43 14.5(*)
# AMD64 33 43 13.0
# EM64T 38 56 18.6(*)
# Core 2 30 42 14.5(*)
# Atom 65 86 32.1(*)
#
# (*) with hyper-threading off
@ -366,68 +367,66 @@ $code.=<<___;
movzb `&lo("$s0")`,$t0
movzb `&lo("$s1")`,$t1
movzb `&lo("$s2")`,$t2
movzb ($sbox,$t0,1),$t0
movzb ($sbox,$t1,1),$t1
movzb ($sbox,$t2,1),$t2
movzb `&lo("$s3")`,$t3
movzb `&hi("$s1")`,$acc0
movzb `&hi("$s2")`,$acc1
movzb ($sbox,$t3,1),$t3
movzb ($sbox,$acc0,1),$t4 #$t0
movzb ($sbox,$acc1,1),$t5 #$t1
movzb `&hi("$s3")`,$acc2
movzb `&hi("$s0")`,$acc0
shr \$16,$s2
movzb `&hi("$s3")`,$acc2
movzb ($sbox,$t0,1),$t0
movzb ($sbox,$t1,1),$t1
movzb ($sbox,$t2,1),$t2
movzb ($sbox,$t3,1),$t3
movzb ($sbox,$acc0,1),$t4 #$t0
movzb `&hi("$s0")`,$acc0
movzb ($sbox,$acc1,1),$t5 #$t1
movzb `&lo("$s2")`,$acc1
movzb ($sbox,$acc2,1),$acc2 #$t2
movzb ($sbox,$acc0,1),$acc0 #$t3
shr \$16,$s3
movzb `&lo("$s2")`,$acc1
shl \$8,$t4
shr \$16,$s3
shl \$8,$t5
movzb ($sbox,$acc1,1),$acc1 #$t0
xor $t4,$t0
xor $t5,$t1
movzb `&lo("$s3")`,$t4
shr \$16,$s0
movzb `&lo("$s3")`,$t4
shr \$16,$s1
movzb `&lo("$s0")`,$t5
xor $t5,$t1
shl \$8,$acc2
shl \$8,$acc0
movzb ($sbox,$t4,1),$t4 #$t1
movzb ($sbox,$t5,1),$t5 #$t2
movzb `&lo("$s0")`,$t5
movzb ($sbox,$acc1,1),$acc1 #$t0
xor $acc2,$t2
xor $acc0,$t3
shl \$8,$acc0
movzb `&lo("$s1")`,$acc2
movzb `&hi("$s3")`,$acc0
shl \$16,$acc1
movzb ($sbox,$acc2,1),$acc2 #$t3
movzb ($sbox,$acc0,1),$acc0 #$t0
xor $acc0,$t3
movzb ($sbox,$t4,1),$t4 #$t1
movzb `&hi("$s3")`,$acc0
movzb ($sbox,$t5,1),$t5 #$t2
xor $acc1,$t0
movzb `&hi("$s0")`,$acc1
shr \$8,$s2
movzb `&hi("$s0")`,$acc1
shl \$16,$t4
shr \$8,$s1
shl \$16,$t5
xor $t4,$t1
movzb ($sbox,$acc2,1),$acc2 #$t3
movzb ($sbox,$acc0,1),$acc0 #$t0
movzb ($sbox,$acc1,1),$acc1 #$t1
movzb ($sbox,$s2,1),$s3 #$t3
movzb ($sbox,$s1,1),$s2 #$t2
shl \$16,$t4
shl \$16,$t5
shl \$16,$acc2
xor $t4,$t1
xor $t5,$t2
xor $acc2,$t3
shl \$16,$acc2
xor $t5,$t2
shl \$24,$acc0
xor $acc2,$t3
shl \$24,$acc1
shl \$24,$s3
xor $acc0,$t0
shl \$24,$s2
shl \$24,$s3
xor $acc1,$t1
shl \$24,$s2
mov $t0,$s0
mov $t1,$s1
xor $t2,$s2
@ -466,12 +465,12 @@ sub enctransform()
{ my ($t3,$r20,$r21)=($acc2,"%r8d","%r9d");
$code.=<<___;
mov $s0,$acc0
mov $s1,$acc1
and \$0x80808080,$acc0
and \$0x80808080,$acc1
mov $acc0,$t0
mov $acc1,$t1
mov \$0x80808080,$t0
mov \$0x80808080,$t1
and $s0,$t0
and $s1,$t1
mov $t0,$acc0
mov $t1,$acc1
shr \$7,$t0
lea ($s0,$s0),$r20
shr \$7,$t1
@ -489,25 +488,25 @@ $code.=<<___;
xor $r20,$s0
xor $r21,$s1
mov $s2,$acc0
mov $s3,$acc1
mov \$0x80808080,$t2
rol \$24,$s0
mov \$0x80808080,$t3
rol \$24,$s1
and \$0x80808080,$acc0
and \$0x80808080,$acc1
and $s2,$t2
and $s3,$t3
xor $r20,$s0
xor $r21,$s1
mov $acc0,$t2
mov $acc1,$t3
mov $t2,$acc0
ror \$16,$t0
mov $t3,$acc1
ror \$16,$t1
shr \$7,$t2
lea ($s2,$s2),$r20
shr \$7,$t2
xor $t0,$s0
xor $t1,$s1
shr \$7,$t3
lea ($s3,$s3),$r21
xor $t1,$s1
ror \$8,$t0
lea ($s3,$s3),$r21
ror \$8,$t1
sub $t2,$acc0
sub $t3,$acc1
@ -523,23 +522,23 @@ $code.=<<___;
xor $acc0,$r20
xor $acc1,$r21
ror \$16,$t2
xor $r20,$s2
ror \$16,$t3
xor $r21,$s3
rol \$24,$s2
mov 0($sbox),$acc0 # prefetch Te4
rol \$24,$s3
xor $r20,$s2
xor $r21,$s3
mov 0($sbox),$acc0 # prefetch Te4
ror \$16,$t2
ror \$16,$t3
mov 64($sbox),$acc1
xor $t2,$s2
xor $t3,$s3
xor $r21,$s3
mov 128($sbox),$r20
ror \$8,$t2
ror \$8,$t3
mov 192($sbox),$r21
xor $t2,$s2
ror \$8,$t2
xor $t3,$s3
ror \$8,$t3
xor $t2,$s2
mov 192($sbox),$r21
xor $t3,$s3
___
}
@ -936,70 +935,69 @@ $code.=<<___;
movzb `&lo("$s0")`,$t0
movzb `&lo("$s1")`,$t1
movzb `&lo("$s2")`,$t2
movzb ($sbox,$t0,1),$t0
movzb ($sbox,$t1,1),$t1
movzb ($sbox,$t2,1),$t2
movzb `&lo("$s3")`,$t3
movzb `&hi("$s3")`,$acc0
movzb `&hi("$s0")`,$acc1
movzb ($sbox,$t3,1),$t3
movzb ($sbox,$acc0,1),$t4 #$t0
movzb ($sbox,$acc1,1),$t5 #$t1
shr \$16,$s3
movzb `&hi("$s1")`,$acc2
movzb ($sbox,$t0,1),$t0
movzb ($sbox,$t1,1),$t1
movzb ($sbox,$t2,1),$t2
movzb ($sbox,$t3,1),$t3
movzb ($sbox,$acc0,1),$t4 #$t0
movzb `&hi("$s2")`,$acc0
shr \$16,$s2
movzb ($sbox,$acc1,1),$t5 #$t1
movzb ($sbox,$acc2,1),$acc2 #$t2
movzb ($sbox,$acc0,1),$acc0 #$t3
shr \$16,$s3
movzb `&lo("$s2")`,$acc1
shl \$8,$t4
shr \$16,$s2
shl \$8,$t5
movzb ($sbox,$acc1,1),$acc1 #$t0
xor $t4,$t0
xor $t5,$t1
movzb `&lo("$s3")`,$t4
shl \$8,$t4
movzb `&lo("$s2")`,$acc1
shr \$16,$s0
xor $t4,$t0
shr \$16,$s1
movzb `&lo("$s0")`,$t5
movzb `&lo("$s3")`,$t4
shl \$8,$acc2
xor $t5,$t1
shl \$8,$acc0
movzb ($sbox,$t4,1),$t4 #$t1
movzb ($sbox,$t5,1),$t5 #$t2
movzb `&lo("$s0")`,$t5
movzb ($sbox,$acc1,1),$acc1 #$t0
xor $acc2,$t2
xor $acc0,$t3
movzb `&lo("$s1")`,$acc2
movzb `&hi("$s1")`,$acc0
shl \$16,$acc1
movzb ($sbox,$acc2,1),$acc2 #$t3
movzb ($sbox,$acc0,1),$acc0 #$t0
xor $acc1,$t0
shl \$16,$acc1
xor $acc0,$t3
movzb ($sbox,$t4,1),$t4 #$t1
movzb `&hi("$s1")`,$acc0
movzb ($sbox,$acc2,1),$acc2 #$t3
xor $acc1,$t0
movzb ($sbox,$t5,1),$t5 #$t2
movzb `&hi("$s2")`,$acc1
shl \$16,$acc2
shl \$16,$t4
shl \$16,$t5
movzb ($sbox,$acc1,1),$s1 #$t1
xor $acc2,$t3
movzb `&hi("$s3")`,$acc2
xor $t4,$t1
shr \$8,$s0
xor $t5,$t2
movzb `&hi("$s3")`,$acc1
shr \$8,$s0
shl \$16,$acc2
movzb ($sbox,$acc1,1),$s2 #$t2
movzb ($sbox,$acc0,1),$acc0 #$t0
movzb ($sbox,$acc1,1),$s1 #$t1
movzb ($sbox,$acc2,1),$s2 #$t2
movzb ($sbox,$s0,1),$s3 #$t3
xor $acc2,$t3
mov $t0,$s0
shl \$24,$acc0
shl \$24,$s1
shl \$24,$s2
xor $acc0,$t0
xor $acc0,$s0
shl \$24,$s3
xor $t1,$s1
mov $t0,$s0
xor $t2,$s2
xor $t3,$s3
___
@ -1014,12 +1012,12 @@ sub dectransform()
my $prefetch = shift;
$code.=<<___;
mov $tp10,$acc0
mov $tp18,$acc8
and $mask80,$acc0
and $mask80,$acc8
mov $acc0,$tp40
mov $acc8,$tp48
mov $mask80,$tp40
mov $mask80,$tp48
and $tp10,$tp40
and $tp18,$tp48
mov $tp40,$acc0
mov $tp48,$acc8
shr \$7,$tp40
lea ($tp10,$tp10),$tp20
shr \$7,$tp48
@ -1030,15 +1028,15 @@ $code.=<<___;
and $maskfe,$tp28
and $mask1b,$acc0
and $mask1b,$acc8
xor $tp20,$acc0
xor $tp28,$acc8
mov $acc0,$tp20
mov $acc8,$tp28
xor $acc0,$tp20
xor $acc8,$tp28
mov $mask80,$tp80
mov $mask80,$tp88
and $mask80,$acc0
and $mask80,$acc8
mov $acc0,$tp80
mov $acc8,$tp88
and $tp20,$tp80
and $tp28,$tp88
mov $tp80,$acc0
mov $tp88,$acc8
shr \$7,$tp80
lea ($tp20,$tp20),$tp40
shr \$7,$tp88
@ -1049,15 +1047,15 @@ $code.=<<___;
and $maskfe,$tp48
and $mask1b,$acc0
and $mask1b,$acc8
xor $tp40,$acc0
xor $tp48,$acc8
mov $acc0,$tp40
mov $acc8,$tp48
xor $acc0,$tp40
xor $acc8,$tp48
mov $mask80,$tp80
mov $mask80,$tp88
and $mask80,$acc0
and $mask80,$acc8
mov $acc0,$tp80
mov $acc8,$tp88
and $tp40,$tp80
and $tp48,$tp88
mov $tp80,$acc0
mov $tp88,$acc8
shr \$7,$tp80
xor $tp10,$tp20 # tp2^=tp1
shr \$7,$tp88
@ -1082,51 +1080,51 @@ $code.=<<___;
mov $tp10,$acc0
mov $tp18,$acc8
xor $tp80,$tp40 # tp4^tp1^=tp8
xor $tp88,$tp48 # tp4^tp1^=tp8
shr \$32,$acc0
xor $tp88,$tp48 # tp4^tp1^=tp8
shr \$32,$acc8
xor $tp20,$tp80 # tp8^=tp8^tp2^tp1=tp2^tp1
xor $tp28,$tp88 # tp8^=tp8^tp2^tp1=tp2^tp1
rol \$8,`&LO("$tp10")` # ROTATE(tp1^tp8,8)
xor $tp28,$tp88 # tp8^=tp8^tp2^tp1=tp2^tp1
rol \$8,`&LO("$tp18")` # ROTATE(tp1^tp8,8)
xor $tp40,$tp80 # tp2^tp1^=tp8^tp4^tp1=tp8^tp4^tp2
rol \$8,`&LO("$acc0")` # ROTATE(tp1^tp8,8)
xor $tp48,$tp88 # tp2^tp1^=tp8^tp4^tp1=tp8^tp4^tp2
rol \$8,`&LO("$acc0")` # ROTATE(tp1^tp8,8)
rol \$8,`&LO("$acc8")` # ROTATE(tp1^tp8,8)
xor `&LO("$tp80")`,`&LO("$tp10")`
xor `&LO("$tp88")`,`&LO("$tp18")`
shr \$32,$tp80
xor `&LO("$tp88")`,`&LO("$tp18")`
shr \$32,$tp88
xor `&LO("$tp80")`,`&LO("$acc0")`
xor `&LO("$tp88")`,`&LO("$acc8")`
mov $tp20,$tp80
mov $tp28,$tp88
shr \$32,$tp80
shr \$32,$tp88
rol \$24,`&LO("$tp20")` # ROTATE(tp2^tp1^tp8,24)
mov $tp28,$tp88
rol \$24,`&LO("$tp28")` # ROTATE(tp2^tp1^tp8,24)
rol \$24,`&LO("$tp80")` # ROTATE(tp2^tp1^tp8,24)
rol \$24,`&LO("$tp88")` # ROTATE(tp2^tp1^tp8,24)
shr \$32,$tp80
xor `&LO("$tp20")`,`&LO("$tp10")`
shr \$32,$tp88
xor `&LO("$tp28")`,`&LO("$tp18")`
rol \$24,`&LO("$tp80")` # ROTATE(tp2^tp1^tp8,24)
mov $tp40,$tp20
rol \$24,`&LO("$tp88")` # ROTATE(tp2^tp1^tp8,24)
mov $tp48,$tp28
shr \$32,$tp20
xor `&LO("$tp80")`,`&LO("$acc0")`
shr \$32,$tp28
xor `&LO("$tp88")`,`&LO("$acc8")`
`"mov 0($sbox),$mask80" if ($prefetch)`
shr \$32,$tp20
shr \$32,$tp28
`"mov 64($sbox),$maskfe" if ($prefetch)`
rol \$16,`&LO("$tp40")` # ROTATE(tp4^tp1^tp8,16)
`"mov 64($sbox),$maskfe" if ($prefetch)`
rol \$16,`&LO("$tp48")` # ROTATE(tp4^tp1^tp8,16)
`"mov 128($sbox),$mask1b" if ($prefetch)`
rol \$16,`&LO("$tp20")` # ROTATE(tp4^tp1^tp8,16)
rol \$16,`&LO("$tp28")` # ROTATE(tp4^tp1^tp8,16)
`"mov 192($sbox),$tp80" if ($prefetch)`
xor `&LO("$tp40")`,`&LO("$tp10")`
rol \$16,`&LO("$tp28")` # ROTATE(tp4^tp1^tp8,16)
xor `&LO("$tp48")`,`&LO("$tp18")`
`"mov 256($sbox),$tp88" if ($prefetch)`
xor `&LO("$tp20")`,`&LO("$acc0")`
@ -1302,10 +1300,6 @@ private_AES_set_encrypt_key:
call _x86_64_AES_set_encrypt_key
mov 8(%rsp),%r15
mov 16(%rsp),%r14
mov 24(%rsp),%r13
mov 32(%rsp),%r12
mov 40(%rsp),%rbp
mov 48(%rsp),%rbx
add \$56,%rsp

1395
crypto/aes/asm/aesni-mb-x86_64.pl Executable file

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

1942
crypto/aes/asm/aesp8-ppc.pl Executable file

File diff suppressed because it is too large Load Diff

919
crypto/aes/asm/aest4-sparcv9.pl Executable file
View File

@ -0,0 +1,919 @@
#!/usr/bin/env perl
# ====================================================================
# Written by David S. Miller <davem@devemloft.net> and Andy Polyakov
# <appro@openssl.org>. The module is licensed under 2-clause BSD
# license. October 2012. All rights reserved.
# ====================================================================
######################################################################
# AES for SPARC T4.
#
# AES round instructions complete in 3 cycles and can be issued every
# cycle. It means that round calculations should take 4*rounds cycles,
# because any given round instruction depends on result of *both*
# previous instructions:
#
# |0 |1 |2 |3 |4
# |01|01|01|
# |23|23|23|
# |01|01|...
# |23|...
#
# Provided that fxor [with IV] takes 3 cycles to complete, critical
# path length for CBC encrypt would be 3+4*rounds, or in other words
# it should process one byte in at least (3+4*rounds)/16 cycles. This
# estimate doesn't account for "collateral" instructions, such as
# fetching input from memory, xor-ing it with zero-round key and
# storing the result. Yet, *measured* performance [for data aligned
# at 64-bit boundary!] deviates from this equation by less than 0.5%:
#
# 128-bit key 192- 256-
# CBC encrypt 2.70/2.90(*) 3.20/3.40 3.70/3.90
# (*) numbers after slash are for
# misaligned data.
#
# Out-of-order execution logic managed to fully overlap "collateral"
# instructions with those on critical path. Amazing!
#
# As with Intel AES-NI, question is if it's possible to improve
# performance of parallelizeable modes by interleaving round
# instructions. Provided round instruction latency and throughput
# optimal interleave factor is 2. But can we expect 2x performance
# improvement? Well, as round instructions can be issued one per
# cycle, they don't saturate the 2-way issue pipeline and therefore
# there is room for "collateral" calculations... Yet, 2x speed-up
# over CBC encrypt remains unattaintable:
#
# 128-bit key 192- 256-
# CBC decrypt 1.64/2.11 1.89/2.37 2.23/2.61
# CTR 1.64/2.08(*) 1.89/2.33 2.23/2.61
# (*) numbers after slash are for
# misaligned data.
#
# Estimates based on amount of instructions under assumption that
# round instructions are not pairable with any other instruction
# suggest that latter is the actual case and pipeline runs
# underutilized. It should be noted that T4 out-of-order execution
# logic is so capable that performance gain from 2x interleave is
# not even impressive, ~7-13% over non-interleaved code, largest
# for 256-bit keys.
# To anchor to something else, software implementation processes
# one byte in 29 cycles with 128-bit key on same processor. Intel
# Sandy Bridge encrypts byte in 5.07 cycles in CBC mode and decrypts
# in 0.93, naturally with AES-NI.
$0 =~ m/(.*[\/\\])[^\/\\]+$/; $dir=$1;
push(@INC,"${dir}","${dir}../../perlasm");
require "sparcv9_modes.pl";
&asm_init(@ARGV);
$::evp=1; # if $evp is set to 0, script generates module with
# AES_[en|de]crypt, AES_set_[en|de]crypt_key and AES_cbc_encrypt entry
# points. These however are not fully compatible with openssl/aes.h,
# because they expect AES_KEY to be aligned at 64-bit boundary. When
# used through EVP, alignment is arranged at EVP layer. Second thing
# that is arranged by EVP is at least 32-bit alignment of IV.
######################################################################
# single-round subroutines
#
{
my ($inp,$out,$key,$rounds,$tmp,$mask)=map("%o$_",(0..5));
$code.=<<___ if ($::abibits==64);
.register %g2,#scratch
.register %g3,#scratch
___
$code.=<<___;
.text
.globl aes_t4_encrypt
.align 32
aes_t4_encrypt:
andcc $inp, 7, %g1 ! is input aligned?
andn $inp, 7, $inp
ldx [$key + 0], %g4
ldx [$key + 8], %g5
ldx [$inp + 0], %o4
bz,pt %icc, 1f
ldx [$inp + 8], %o5
ldx [$inp + 16], $inp
sll %g1, 3, %g1
sub %g0, %g1, %o3
sllx %o4, %g1, %o4
sllx %o5, %g1, %g1
srlx %o5, %o3, %o5
srlx $inp, %o3, %o3
or %o5, %o4, %o4
or %o3, %g1, %o5
1:
ld [$key + 240], $rounds
ldd [$key + 16], %f12
ldd [$key + 24], %f14
xor %g4, %o4, %o4
xor %g5, %o5, %o5
movxtod %o4, %f0
movxtod %o5, %f2
srl $rounds, 1, $rounds
ldd [$key + 32], %f16
sub $rounds, 1, $rounds
ldd [$key + 40], %f18
add $key, 48, $key
.Lenc:
aes_eround01 %f12, %f0, %f2, %f4
aes_eround23 %f14, %f0, %f2, %f2
ldd [$key + 0], %f12
ldd [$key + 8], %f14
sub $rounds,1,$rounds
aes_eround01 %f16, %f4, %f2, %f0
aes_eround23 %f18, %f4, %f2, %f2
ldd [$key + 16], %f16
ldd [$key + 24], %f18
brnz,pt $rounds, .Lenc
add $key, 32, $key
andcc $out, 7, $tmp ! is output aligned?
aes_eround01 %f12, %f0, %f2, %f4
aes_eround23 %f14, %f0, %f2, %f2
aes_eround01_l %f16, %f4, %f2, %f0
aes_eround23_l %f18, %f4, %f2, %f2
bnz,pn %icc, 2f
nop
std %f0, [$out + 0]
retl
std %f2, [$out + 8]
2: alignaddrl $out, %g0, $out
mov 0xff, $mask
srl $mask, $tmp, $mask
faligndata %f0, %f0, %f4
faligndata %f0, %f2, %f6
faligndata %f2, %f2, %f8
stda %f4, [$out + $mask]0xc0 ! partial store
std %f6, [$out + 8]
add $out, 16, $out
orn %g0, $mask, $mask
retl
stda %f8, [$out + $mask]0xc0 ! partial store
.type aes_t4_encrypt,#function
.size aes_t4_encrypt,.-aes_t4_encrypt
.globl aes_t4_decrypt
.align 32
aes_t4_decrypt:
andcc $inp, 7, %g1 ! is input aligned?
andn $inp, 7, $inp
ldx [$key + 0], %g4
ldx [$key + 8], %g5
ldx [$inp + 0], %o4
bz,pt %icc, 1f
ldx [$inp + 8], %o5
ldx [$inp + 16], $inp
sll %g1, 3, %g1
sub %g0, %g1, %o3
sllx %o4, %g1, %o4
sllx %o5, %g1, %g1
srlx %o5, %o3, %o5
srlx $inp, %o3, %o3
or %o5, %o4, %o4
or %o3, %g1, %o5
1:
ld [$key + 240], $rounds
ldd [$key + 16], %f12
ldd [$key + 24], %f14
xor %g4, %o4, %o4
xor %g5, %o5, %o5
movxtod %o4, %f0
movxtod %o5, %f2
srl $rounds, 1, $rounds
ldd [$key + 32], %f16
sub $rounds, 1, $rounds
ldd [$key + 40], %f18
add $key, 48, $key
.Ldec:
aes_dround01 %f12, %f0, %f2, %f4
aes_dround23 %f14, %f0, %f2, %f2
ldd [$key + 0], %f12
ldd [$key + 8], %f14
sub $rounds,1,$rounds
aes_dround01 %f16, %f4, %f2, %f0
aes_dround23 %f18, %f4, %f2, %f2
ldd [$key + 16], %f16
ldd [$key + 24], %f18
brnz,pt $rounds, .Ldec
add $key, 32, $key
andcc $out, 7, $tmp ! is output aligned?
aes_dround01 %f12, %f0, %f2, %f4
aes_dround23 %f14, %f0, %f2, %f2
aes_dround01_l %f16, %f4, %f2, %f0
aes_dround23_l %f18, %f4, %f2, %f2
bnz,pn %icc, 2f
nop
std %f0, [$out + 0]
retl
std %f2, [$out + 8]
2: alignaddrl $out, %g0, $out
mov 0xff, $mask
srl $mask, $tmp, $mask
faligndata %f0, %f0, %f4
faligndata %f0, %f2, %f6
faligndata %f2, %f2, %f8
stda %f4, [$out + $mask]0xc0 ! partial store
std %f6, [$out + 8]
add $out, 16, $out
orn %g0, $mask, $mask
retl
stda %f8, [$out + $mask]0xc0 ! partial store
.type aes_t4_decrypt,#function
.size aes_t4_decrypt,.-aes_t4_decrypt
___
}
######################################################################
# key setup subroutines
#
{
my ($inp,$bits,$out,$tmp)=map("%o$_",(0..5));
$code.=<<___;
.globl aes_t4_set_encrypt_key
.align 32
aes_t4_set_encrypt_key:
.Lset_encrypt_key:
and $inp, 7, $tmp
alignaddr $inp, %g0, $inp
cmp $bits, 192
ldd [$inp + 0], %f0
bl,pt %icc,.L128
ldd [$inp + 8], %f2
be,pt %icc,.L192
ldd [$inp + 16], %f4
brz,pt $tmp, .L256aligned
ldd [$inp + 24], %f6
ldd [$inp + 32], %f8
faligndata %f0, %f2, %f0
faligndata %f2, %f4, %f2
faligndata %f4, %f6, %f4
faligndata %f6, %f8, %f6
.L256aligned:
___
for ($i=0; $i<6; $i++) {
$code.=<<___;
std %f0, [$out + `32*$i+0`]
aes_kexpand1 %f0, %f6, $i, %f0
std %f2, [$out + `32*$i+8`]
aes_kexpand2 %f2, %f0, %f2
std %f4, [$out + `32*$i+16`]
aes_kexpand0 %f4, %f2, %f4
std %f6, [$out + `32*$i+24`]
aes_kexpand2 %f6, %f4, %f6
___
}
$code.=<<___;
std %f0, [$out + `32*$i+0`]
aes_kexpand1 %f0, %f6, $i, %f0
std %f2, [$out + `32*$i+8`]
aes_kexpand2 %f2, %f0, %f2
std %f4, [$out + `32*$i+16`]
std %f6, [$out + `32*$i+24`]
std %f0, [$out + `32*$i+32`]
std %f2, [$out + `32*$i+40`]
mov 14, $tmp
st $tmp, [$out + 240]
retl
xor %o0, %o0, %o0
.align 16
.L192:
brz,pt $tmp, .L192aligned
nop
ldd [$inp + 24], %f6
faligndata %f0, %f2, %f0
faligndata %f2, %f4, %f2
faligndata %f4, %f6, %f4
.L192aligned:
___
for ($i=0; $i<7; $i++) {
$code.=<<___;
std %f0, [$out + `24*$i+0`]
aes_kexpand1 %f0, %f4, $i, %f0
std %f2, [$out + `24*$i+8`]
aes_kexpand2 %f2, %f0, %f2
std %f4, [$out + `24*$i+16`]
aes_kexpand2 %f4, %f2, %f4
___
}
$code.=<<___;
std %f0, [$out + `24*$i+0`]
aes_kexpand1 %f0, %f4, $i, %f0
std %f2, [$out + `24*$i+8`]
aes_kexpand2 %f2, %f0, %f2
std %f4, [$out + `24*$i+16`]
std %f0, [$out + `24*$i+24`]
std %f2, [$out + `24*$i+32`]
mov 12, $tmp
st $tmp, [$out + 240]
retl
xor %o0, %o0, %o0
.align 16
.L128:
brz,pt $tmp, .L128aligned
nop
ldd [$inp + 16], %f4
faligndata %f0, %f2, %f0
faligndata %f2, %f4, %f2
.L128aligned:
___
for ($i=0; $i<10; $i++) {
$code.=<<___;
std %f0, [$out + `16*$i+0`]
aes_kexpand1 %f0, %f2, $i, %f0
std %f2, [$out + `16*$i+8`]
aes_kexpand2 %f2, %f0, %f2
___
}
$code.=<<___;
std %f0, [$out + `16*$i+0`]
std %f2, [$out + `16*$i+8`]
mov 10, $tmp
st $tmp, [$out + 240]
retl
xor %o0, %o0, %o0
.type aes_t4_set_encrypt_key,#function
.size aes_t4_set_encrypt_key,.-aes_t4_set_encrypt_key
.globl aes_t4_set_decrypt_key
.align 32
aes_t4_set_decrypt_key:
mov %o7, %o5
call .Lset_encrypt_key
nop
mov %o5, %o7
sll $tmp, 4, $inp ! $tmp is number of rounds
add $tmp, 2, $tmp
add $out, $inp, $inp ! $inp=$out+16*rounds
srl $tmp, 2, $tmp ! $tmp=(rounds+2)/4
.Lkey_flip:
ldd [$out + 0], %f0
ldd [$out + 8], %f2
ldd [$out + 16], %f4
ldd [$out + 24], %f6
ldd [$inp + 0], %f8
ldd [$inp + 8], %f10
ldd [$inp - 16], %f12
ldd [$inp - 8], %f14
sub $tmp, 1, $tmp
std %f0, [$inp + 0]
std %f2, [$inp + 8]
std %f4, [$inp - 16]
std %f6, [$inp - 8]
std %f8, [$out + 0]
std %f10, [$out + 8]
std %f12, [$out + 16]
std %f14, [$out + 24]
add $out, 32, $out
brnz $tmp, .Lkey_flip
sub $inp, 32, $inp
retl
xor %o0, %o0, %o0
.type aes_t4_set_decrypt_key,#function
.size aes_t4_set_decrypt_key,.-aes_t4_set_decrypt_key
___
}
{{{
my ($inp,$out,$len,$key,$ivec,$enc)=map("%i$_",(0..5));
my ($ileft,$iright,$ooff,$omask,$ivoff)=map("%l$_",(1..7));
$code.=<<___;
.align 32
_aes128_encrypt_1x:
___
for ($i=0; $i<4; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_eround01 %f48, %f0, %f2, %f4
aes_eround23 %f50, %f0, %f2, %f2
aes_eround01_l %f52, %f4, %f2, %f0
retl
aes_eround23_l %f54, %f4, %f2, %f2
.type _aes128_encrypt_1x,#function
.size _aes128_encrypt_1x,.-_aes128_encrypt_1x
.align 32
_aes128_encrypt_2x:
___
for ($i=0; $i<4; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_eround01 %f48, %f0, %f2, %f8
aes_eround23 %f50, %f0, %f2, %f2
aes_eround01 %f48, %f4, %f6, %f10
aes_eround23 %f50, %f4, %f6, %f6
aes_eround01_l %f52, %f8, %f2, %f0
aes_eround23_l %f54, %f8, %f2, %f2
aes_eround01_l %f52, %f10, %f6, %f4
retl
aes_eround23_l %f54, %f10, %f6, %f6
.type _aes128_encrypt_2x,#function
.size _aes128_encrypt_2x,.-_aes128_encrypt_2x
.align 32
_aes128_loadkey:
ldx [$key + 0], %g4
ldx [$key + 8], %g5
___
for ($i=2; $i<22;$i++) { # load key schedule
$code.=<<___;
ldd [$key + `8*$i`], %f`12+2*$i`
___
}
$code.=<<___;
retl
nop
.type _aes128_loadkey,#function
.size _aes128_loadkey,.-_aes128_loadkey
_aes128_load_enckey=_aes128_loadkey
_aes128_load_deckey=_aes128_loadkey
___
&alg_cbc_encrypt_implement("aes",128);
if ($::evp) {
&alg_ctr32_implement("aes",128);
&alg_xts_implement("aes",128,"en");
&alg_xts_implement("aes",128,"de");
}
&alg_cbc_decrypt_implement("aes",128);
$code.=<<___;
.align 32
_aes128_decrypt_1x:
___
for ($i=0; $i<4; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_dround01 %f48, %f0, %f2, %f4
aes_dround23 %f50, %f0, %f2, %f2
aes_dround01_l %f52, %f4, %f2, %f0
retl
aes_dround23_l %f54, %f4, %f2, %f2
.type _aes128_decrypt_1x,#function
.size _aes128_decrypt_1x,.-_aes128_decrypt_1x
.align 32
_aes128_decrypt_2x:
___
for ($i=0; $i<4; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_dround01 %f48, %f0, %f2, %f8
aes_dround23 %f50, %f0, %f2, %f2
aes_dround01 %f48, %f4, %f6, %f10
aes_dround23 %f50, %f4, %f6, %f6
aes_dround01_l %f52, %f8, %f2, %f0
aes_dround23_l %f54, %f8, %f2, %f2
aes_dround01_l %f52, %f10, %f6, %f4
retl
aes_dround23_l %f54, %f10, %f6, %f6
.type _aes128_decrypt_2x,#function
.size _aes128_decrypt_2x,.-_aes128_decrypt_2x
___
$code.=<<___;
.align 32
_aes192_encrypt_1x:
___
for ($i=0; $i<5; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_eround01 %f56, %f0, %f2, %f4
aes_eround23 %f58, %f0, %f2, %f2
aes_eround01_l %f60, %f4, %f2, %f0
retl
aes_eround23_l %f62, %f4, %f2, %f2
.type _aes192_encrypt_1x,#function
.size _aes192_encrypt_1x,.-_aes192_encrypt_1x
.align 32
_aes192_encrypt_2x:
___
for ($i=0; $i<5; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_eround01 %f56, %f0, %f2, %f8
aes_eround23 %f58, %f0, %f2, %f2
aes_eround01 %f56, %f4, %f6, %f10
aes_eround23 %f58, %f4, %f6, %f6
aes_eround01_l %f60, %f8, %f2, %f0
aes_eround23_l %f62, %f8, %f2, %f2
aes_eround01_l %f60, %f10, %f6, %f4
retl
aes_eround23_l %f62, %f10, %f6, %f6
.type _aes192_encrypt_2x,#function
.size _aes192_encrypt_2x,.-_aes192_encrypt_2x
.align 32
_aes256_encrypt_1x:
aes_eround01 %f16, %f0, %f2, %f4
aes_eround23 %f18, %f0, %f2, %f2
ldd [$key + 208], %f16
ldd [$key + 216], %f18
aes_eround01 %f20, %f4, %f2, %f0
aes_eround23 %f22, %f4, %f2, %f2
ldd [$key + 224], %f20
ldd [$key + 232], %f22
___
for ($i=1; $i<6; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_eround01 %f16, %f0, %f2, %f4
aes_eround23 %f18, %f0, %f2, %f2
ldd [$key + 16], %f16
ldd [$key + 24], %f18
aes_eround01_l %f20, %f4, %f2, %f0
aes_eround23_l %f22, %f4, %f2, %f2
ldd [$key + 32], %f20
retl
ldd [$key + 40], %f22
.type _aes256_encrypt_1x,#function
.size _aes256_encrypt_1x,.-_aes256_encrypt_1x
.align 32
_aes256_encrypt_2x:
aes_eround01 %f16, %f0, %f2, %f8
aes_eround23 %f18, %f0, %f2, %f2
aes_eround01 %f16, %f4, %f6, %f10
aes_eround23 %f18, %f4, %f6, %f6
ldd [$key + 208], %f16
ldd [$key + 216], %f18
aes_eround01 %f20, %f8, %f2, %f0
aes_eround23 %f22, %f8, %f2, %f2
aes_eround01 %f20, %f10, %f6, %f4
aes_eround23 %f22, %f10, %f6, %f6
ldd [$key + 224], %f20
ldd [$key + 232], %f22
___
for ($i=1; $i<6; $i++) {
$code.=<<___;
aes_eround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_eround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_eround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_eround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_eround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_eround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_eround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_eround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_eround01 %f16, %f0, %f2, %f8
aes_eround23 %f18, %f0, %f2, %f2
aes_eround01 %f16, %f4, %f6, %f10
aes_eround23 %f18, %f4, %f6, %f6
ldd [$key + 16], %f16
ldd [$key + 24], %f18
aes_eround01_l %f20, %f8, %f2, %f0
aes_eround23_l %f22, %f8, %f2, %f2
aes_eround01_l %f20, %f10, %f6, %f4
aes_eround23_l %f22, %f10, %f6, %f6
ldd [$key + 32], %f20
retl
ldd [$key + 40], %f22
.type _aes256_encrypt_2x,#function
.size _aes256_encrypt_2x,.-_aes256_encrypt_2x
.align 32
_aes192_loadkey:
ldx [$key + 0], %g4
ldx [$key + 8], %g5
___
for ($i=2; $i<26;$i++) { # load key schedule
$code.=<<___;
ldd [$key + `8*$i`], %f`12+2*$i`
___
}
$code.=<<___;
retl
nop
.type _aes192_loadkey,#function
.size _aes192_loadkey,.-_aes192_loadkey
_aes256_loadkey=_aes192_loadkey
_aes192_load_enckey=_aes192_loadkey
_aes192_load_deckey=_aes192_loadkey
_aes256_load_enckey=_aes192_loadkey
_aes256_load_deckey=_aes192_loadkey
___
&alg_cbc_encrypt_implement("aes",256);
&alg_cbc_encrypt_implement("aes",192);
if ($::evp) {
&alg_ctr32_implement("aes",256);
&alg_xts_implement("aes",256,"en");
&alg_xts_implement("aes",256,"de");
&alg_ctr32_implement("aes",192);
}
&alg_cbc_decrypt_implement("aes",192);
&alg_cbc_decrypt_implement("aes",256);
$code.=<<___;
.align 32
_aes256_decrypt_1x:
aes_dround01 %f16, %f0, %f2, %f4
aes_dround23 %f18, %f0, %f2, %f2
ldd [$key + 208], %f16
ldd [$key + 216], %f18
aes_dround01 %f20, %f4, %f2, %f0
aes_dround23 %f22, %f4, %f2, %f2
ldd [$key + 224], %f20
ldd [$key + 232], %f22
___
for ($i=1; $i<6; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_dround01 %f16, %f0, %f2, %f4
aes_dround23 %f18, %f0, %f2, %f2
ldd [$key + 16], %f16
ldd [$key + 24], %f18
aes_dround01_l %f20, %f4, %f2, %f0
aes_dround23_l %f22, %f4, %f2, %f2
ldd [$key + 32], %f20
retl
ldd [$key + 40], %f22
.type _aes256_decrypt_1x,#function
.size _aes256_decrypt_1x,.-_aes256_decrypt_1x
.align 32
_aes256_decrypt_2x:
aes_dround01 %f16, %f0, %f2, %f8
aes_dround23 %f18, %f0, %f2, %f2
aes_dround01 %f16, %f4, %f6, %f10
aes_dround23 %f18, %f4, %f6, %f6
ldd [$key + 208], %f16
ldd [$key + 216], %f18
aes_dround01 %f20, %f8, %f2, %f0
aes_dround23 %f22, %f8, %f2, %f2
aes_dround01 %f20, %f10, %f6, %f4
aes_dround23 %f22, %f10, %f6, %f6
ldd [$key + 224], %f20
ldd [$key + 232], %f22
___
for ($i=1; $i<6; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_dround01 %f16, %f0, %f2, %f8
aes_dround23 %f18, %f0, %f2, %f2
aes_dround01 %f16, %f4, %f6, %f10
aes_dround23 %f18, %f4, %f6, %f6
ldd [$key + 16], %f16
ldd [$key + 24], %f18
aes_dround01_l %f20, %f8, %f2, %f0
aes_dround23_l %f22, %f8, %f2, %f2
aes_dround01_l %f20, %f10, %f6, %f4
aes_dround23_l %f22, %f10, %f6, %f6
ldd [$key + 32], %f20
retl
ldd [$key + 40], %f22
.type _aes256_decrypt_2x,#function
.size _aes256_decrypt_2x,.-_aes256_decrypt_2x
.align 32
_aes192_decrypt_1x:
___
for ($i=0; $i<5; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f4
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f4, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f4, %f2, %f2
___
}
$code.=<<___;
aes_dround01 %f56, %f0, %f2, %f4
aes_dround23 %f58, %f0, %f2, %f2
aes_dround01_l %f60, %f4, %f2, %f0
retl
aes_dround23_l %f62, %f4, %f2, %f2
.type _aes192_decrypt_1x,#function
.size _aes192_decrypt_1x,.-_aes192_decrypt_1x
.align 32
_aes192_decrypt_2x:
___
for ($i=0; $i<5; $i++) {
$code.=<<___;
aes_dround01 %f`16+8*$i+0`, %f0, %f2, %f8
aes_dround23 %f`16+8*$i+2`, %f0, %f2, %f2
aes_dround01 %f`16+8*$i+0`, %f4, %f6, %f10
aes_dround23 %f`16+8*$i+2`, %f4, %f6, %f6
aes_dround01 %f`16+8*$i+4`, %f8, %f2, %f0
aes_dround23 %f`16+8*$i+6`, %f8, %f2, %f2
aes_dround01 %f`16+8*$i+4`, %f10, %f6, %f4
aes_dround23 %f`16+8*$i+6`, %f10, %f6, %f6
___
}
$code.=<<___;
aes_dround01 %f56, %f0, %f2, %f8
aes_dround23 %f58, %f0, %f2, %f2
aes_dround01 %f56, %f4, %f6, %f10
aes_dround23 %f58, %f4, %f6, %f6
aes_dround01_l %f60, %f8, %f2, %f0
aes_dround23_l %f62, %f8, %f2, %f2
aes_dround01_l %f60, %f10, %f6, %f4
retl
aes_dround23_l %f62, %f10, %f6, %f6
.type _aes192_decrypt_2x,#function
.size _aes192_decrypt_2x,.-_aes192_decrypt_2x
___
}}}
if (!$::evp) {
$code.=<<___;
.global AES_encrypt
AES_encrypt=aes_t4_encrypt
.global AES_decrypt
AES_decrypt=aes_t4_decrypt
.global AES_set_encrypt_key
.align 32
AES_set_encrypt_key:
andcc %o2, 7, %g0 ! check alignment
bnz,a,pn %icc, 1f
mov -1, %o0
brz,a,pn %o0, 1f
mov -1, %o0
brz,a,pn %o2, 1f
mov -1, %o0
andncc %o1, 0x1c0, %g0
bnz,a,pn %icc, 1f
mov -2, %o0
cmp %o1, 128
bl,a,pn %icc, 1f
mov -2, %o0
b aes_t4_set_encrypt_key
nop
1: retl
nop
.type AES_set_encrypt_key,#function
.size AES_set_encrypt_key,.-AES_set_encrypt_key
.global AES_set_decrypt_key
.align 32
AES_set_decrypt_key:
andcc %o2, 7, %g0 ! check alignment
bnz,a,pn %icc, 1f
mov -1, %o0
brz,a,pn %o0, 1f
mov -1, %o0
brz,a,pn %o2, 1f
mov -1, %o0
andncc %o1, 0x1c0, %g0
bnz,a,pn %icc, 1f
mov -2, %o0
cmp %o1, 128
bl,a,pn %icc, 1f
mov -2, %o0
b aes_t4_set_decrypt_key
nop
1: retl
nop
.type AES_set_decrypt_key,#function
.size AES_set_decrypt_key,.-AES_set_decrypt_key
___
my ($inp,$out,$len,$key,$ivec,$enc)=map("%o$_",(0..5));
$code.=<<___;
.globl AES_cbc_encrypt
.align 32
AES_cbc_encrypt:
ld [$key + 240], %g1
nop
brz $enc, .Lcbc_decrypt
cmp %g1, 12
bl,pt %icc, aes128_t4_cbc_encrypt
nop
be,pn %icc, aes192_t4_cbc_encrypt
nop
ba aes256_t4_cbc_encrypt
nop
.Lcbc_decrypt:
bl,pt %icc, aes128_t4_cbc_decrypt
nop
be,pn %icc, aes192_t4_cbc_decrypt
nop
ba aes256_t4_cbc_decrypt
nop
.type AES_cbc_encrypt,#function
.size AES_cbc_encrypt,.-AES_cbc_encrypt
___
}
$code.=<<___;
.asciz "AES for SPARC T4, David S. Miller, Andy Polyakov"
.align 4
___
&emit_assembler();
close STDOUT;

989
crypto/aes/asm/aesv8-armx.pl Executable file
View File

@ -0,0 +1,989 @@
#!/usr/bin/env perl
#
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# This module implements support for ARMv8 AES instructions. The
# module is endian-agnostic in sense that it supports both big- and
# little-endian cases. As does it support both 32- and 64-bit modes
# of operation. Latter is achieved by limiting amount of utilized
# registers to 16, which implies additional NEON load and integer
# instructions. This has no effect on mighty Apple A7, where results
# are literally equal to the theoretical estimates based on AES
# instruction latencies and issue rates. On Cortex-A53, an in-order
# execution core, this costs up to 10-15%, which is partially
# compensated by implementing dedicated code path for 128-bit
# CBC encrypt case. On Cortex-A57 parallelizable mode performance
# seems to be limited by sheer amount of NEON instructions...
#
# Performance in cycles per byte processed with 128-bit key:
#
# CBC enc CBC dec CTR
# Apple A7 2.39 1.20 1.20
# Cortex-A53 1.32 1.29 1.46
# Cortex-A57(*) 1.95 0.85 0.93
# Denver 1.96 0.86 0.80
#
# (*) original 3.64/1.34/1.32 results were for r0p0 revision
# and are still same even for updated module;
$flavour = shift;
open STDOUT,">".shift;
$prefix="aes_v8";
$code=<<___;
#include "arm_arch.h"
#if __ARM_MAX_ARCH__>=7
.text
___
$code.=".arch armv8-a+crypto\n" if ($flavour =~ /64/);
$code.=".arch armv7-a\n.fpu neon\n.code 32\n" if ($flavour !~ /64/);
#^^^^^^ this is done to simplify adoption by not depending
# on latest binutils.
# Assembler mnemonics are an eclectic mix of 32- and 64-bit syntax,
# NEON is mostly 32-bit mnemonics, integer - mostly 64. Goal is to
# maintain both 32- and 64-bit codes within single module and
# transliterate common code to either flavour with regex vodoo.
#
{{{
my ($inp,$bits,$out,$ptr,$rounds)=("x0","w1","x2","x3","w12");
my ($zero,$rcon,$mask,$in0,$in1,$tmp,$key)=
$flavour=~/64/? map("q$_",(0..6)) : map("q$_",(0..3,8..10));
$code.=<<___;
.align 5
rcon:
.long 0x01,0x01,0x01,0x01
.long 0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d,0x0c0f0e0d // rotate-n-splat
.long 0x1b,0x1b,0x1b,0x1b
.globl ${prefix}_set_encrypt_key
.type ${prefix}_set_encrypt_key,%function
.align 5
${prefix}_set_encrypt_key:
.Lenc_key:
___
$code.=<<___ if ($flavour =~ /64/);
stp x29,x30,[sp,#-16]!
add x29,sp,#0
___
$code.=<<___;
mov $ptr,#-1
cmp $inp,#0
b.eq .Lenc_key_abort
cmp $out,#0
b.eq .Lenc_key_abort
mov $ptr,#-2
cmp $bits,#128
b.lt .Lenc_key_abort
cmp $bits,#256
b.gt .Lenc_key_abort
tst $bits,#0x3f
b.ne .Lenc_key_abort
adr $ptr,rcon
cmp $bits,#192
veor $zero,$zero,$zero
vld1.8 {$in0},[$inp],#16
mov $bits,#8 // reuse $bits
vld1.32 {$rcon,$mask},[$ptr],#32
b.lt .Loop128
b.eq .L192
b .L256
.align 4
.Loop128:
vtbl.8 $key,{$in0},$mask
vext.8 $tmp,$zero,$in0,#12
vst1.32 {$in0},[$out],#16
aese $key,$zero
subs $bits,$bits,#1
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $key,$key,$rcon
veor $in0,$in0,$tmp
vshl.u8 $rcon,$rcon,#1
veor $in0,$in0,$key
b.ne .Loop128
vld1.32 {$rcon},[$ptr]
vtbl.8 $key,{$in0},$mask
vext.8 $tmp,$zero,$in0,#12
vst1.32 {$in0},[$out],#16
aese $key,$zero
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $key,$key,$rcon
veor $in0,$in0,$tmp
vshl.u8 $rcon,$rcon,#1
veor $in0,$in0,$key
vtbl.8 $key,{$in0},$mask
vext.8 $tmp,$zero,$in0,#12
vst1.32 {$in0},[$out],#16
aese $key,$zero
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $key,$key,$rcon
veor $in0,$in0,$tmp
veor $in0,$in0,$key
vst1.32 {$in0},[$out]
add $out,$out,#0x50
mov $rounds,#10
b .Ldone
.align 4
.L192:
vld1.8 {$in1},[$inp],#8
vmov.i8 $key,#8 // borrow $key
vst1.32 {$in0},[$out],#16
vsub.i8 $mask,$mask,$key // adjust the mask
.Loop192:
vtbl.8 $key,{$in1},$mask
vext.8 $tmp,$zero,$in0,#12
vst1.32 {$in1},[$out],#8
aese $key,$zero
subs $bits,$bits,#1
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vdup.32 $tmp,${in0}[3]
veor $tmp,$tmp,$in1
veor $key,$key,$rcon
vext.8 $in1,$zero,$in1,#12
vshl.u8 $rcon,$rcon,#1
veor $in1,$in1,$tmp
veor $in0,$in0,$key
veor $in1,$in1,$key
vst1.32 {$in0},[$out],#16
b.ne .Loop192
mov $rounds,#12
add $out,$out,#0x20
b .Ldone
.align 4
.L256:
vld1.8 {$in1},[$inp]
mov $bits,#7
mov $rounds,#14
vst1.32 {$in0},[$out],#16
.Loop256:
vtbl.8 $key,{$in1},$mask
vext.8 $tmp,$zero,$in0,#12
vst1.32 {$in1},[$out],#16
aese $key,$zero
subs $bits,$bits,#1
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in0,$in0,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $key,$key,$rcon
veor $in0,$in0,$tmp
vshl.u8 $rcon,$rcon,#1
veor $in0,$in0,$key
vst1.32 {$in0},[$out],#16
b.eq .Ldone
vdup.32 $key,${in0}[3] // just splat
vext.8 $tmp,$zero,$in1,#12
aese $key,$zero
veor $in1,$in1,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in1,$in1,$tmp
vext.8 $tmp,$zero,$tmp,#12
veor $in1,$in1,$tmp
veor $in1,$in1,$key
b .Loop256
.Ldone:
str $rounds,[$out]
mov $ptr,#0
.Lenc_key_abort:
mov x0,$ptr // return value
`"ldr x29,[sp],#16" if ($flavour =~ /64/)`
ret
.size ${prefix}_set_encrypt_key,.-${prefix}_set_encrypt_key
.globl ${prefix}_set_decrypt_key
.type ${prefix}_set_decrypt_key,%function
.align 5
${prefix}_set_decrypt_key:
___
$code.=<<___ if ($flavour =~ /64/);
stp x29,x30,[sp,#-16]!
add x29,sp,#0
___
$code.=<<___ if ($flavour !~ /64/);
stmdb sp!,{r4,lr}
___
$code.=<<___;
bl .Lenc_key
cmp x0,#0
b.ne .Ldec_key_abort
sub $out,$out,#240 // restore original $out
mov x4,#-16
add $inp,$out,x12,lsl#4 // end of key schedule
vld1.32 {v0.16b},[$out]
vld1.32 {v1.16b},[$inp]
vst1.32 {v0.16b},[$inp],x4
vst1.32 {v1.16b},[$out],#16
.Loop_imc:
vld1.32 {v0.16b},[$out]
vld1.32 {v1.16b},[$inp]
aesimc v0.16b,v0.16b
aesimc v1.16b,v1.16b
vst1.32 {v0.16b},[$inp],x4
vst1.32 {v1.16b},[$out],#16
cmp $inp,$out
b.hi .Loop_imc
vld1.32 {v0.16b},[$out]
aesimc v0.16b,v0.16b
vst1.32 {v0.16b},[$inp]
eor x0,x0,x0 // return value
.Ldec_key_abort:
___
$code.=<<___ if ($flavour !~ /64/);
ldmia sp!,{r4,pc}
___
$code.=<<___ if ($flavour =~ /64/);
ldp x29,x30,[sp],#16
ret
___
$code.=<<___;
.size ${prefix}_set_decrypt_key,.-${prefix}_set_decrypt_key
___
}}}
{{{
sub gen_block () {
my $dir = shift;
my ($e,$mc) = $dir eq "en" ? ("e","mc") : ("d","imc");
my ($inp,$out,$key)=map("x$_",(0..2));
my $rounds="w3";
my ($rndkey0,$rndkey1,$inout)=map("q$_",(0..3));
$code.=<<___;
.globl ${prefix}_${dir}crypt
.type ${prefix}_${dir}crypt,%function
.align 5
${prefix}_${dir}crypt:
ldr $rounds,[$key,#240]
vld1.32 {$rndkey0},[$key],#16
vld1.8 {$inout},[$inp]
sub $rounds,$rounds,#2
vld1.32 {$rndkey1},[$key],#16
.Loop_${dir}c:
aes$e $inout,$rndkey0
aes$mc $inout,$inout
vld1.32 {$rndkey0},[$key],#16
subs $rounds,$rounds,#2
aes$e $inout,$rndkey1
aes$mc $inout,$inout
vld1.32 {$rndkey1},[$key],#16
b.gt .Loop_${dir}c
aes$e $inout,$rndkey0
aes$mc $inout,$inout
vld1.32 {$rndkey0},[$key]
aes$e $inout,$rndkey1
veor $inout,$inout,$rndkey0
vst1.8 {$inout},[$out]
ret
.size ${prefix}_${dir}crypt,.-${prefix}_${dir}crypt
___
}
&gen_block("en");
&gen_block("de");
}}}
{{{
my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4)); my $enc="w5";
my ($rounds,$cnt,$key_,$step,$step1)=($enc,"w6","x7","x8","x12");
my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7));
my ($dat,$tmp,$rndzero_n_last)=($dat0,$tmp0,$tmp1);
my ($key4,$key5,$key6,$key7)=("x6","x12","x14",$key);
### q8-q15 preloaded key schedule
$code.=<<___;
.globl ${prefix}_cbc_encrypt
.type ${prefix}_cbc_encrypt,%function
.align 5
${prefix}_cbc_encrypt:
___
$code.=<<___ if ($flavour =~ /64/);
stp x29,x30,[sp,#-16]!
add x29,sp,#0
___
$code.=<<___ if ($flavour !~ /64/);
mov ip,sp
stmdb sp!,{r4-r8,lr}
vstmdb sp!,{d8-d15} @ ABI specification says so
ldmia ip,{r4-r5} @ load remaining args
___
$code.=<<___;
subs $len,$len,#16
mov $step,#16
b.lo .Lcbc_abort
cclr $step,eq
cmp $enc,#0 // en- or decrypting?
ldr $rounds,[$key,#240]
and $len,$len,#-16
vld1.8 {$ivec},[$ivp]
vld1.8 {$dat},[$inp],$step
vld1.32 {q8-q9},[$key] // load key schedule...
sub $rounds,$rounds,#6
add $key_,$key,x5,lsl#4 // pointer to last 7 round keys
sub $rounds,$rounds,#2
vld1.32 {q10-q11},[$key_],#32
vld1.32 {q12-q13},[$key_],#32
vld1.32 {q14-q15},[$key_],#32
vld1.32 {$rndlast},[$key_]
add $key_,$key,#32
mov $cnt,$rounds
b.eq .Lcbc_dec
cmp $rounds,#2
veor $dat,$dat,$ivec
veor $rndzero_n_last,q8,$rndlast
b.eq .Lcbc_enc128
vld1.32 {$in0-$in1},[$key_]
add $key_,$key,#16
add $key4,$key,#16*4
add $key5,$key,#16*5
aese $dat,q8
aesmc $dat,$dat
add $key6,$key,#16*6
add $key7,$key,#16*7
b .Lenter_cbc_enc
.align 4
.Loop_cbc_enc:
aese $dat,q8
aesmc $dat,$dat
vst1.8 {$ivec},[$out],#16
.Lenter_cbc_enc:
aese $dat,q9
aesmc $dat,$dat
aese $dat,$in0
aesmc $dat,$dat
vld1.32 {q8},[$key4]
cmp $rounds,#4
aese $dat,$in1
aesmc $dat,$dat
vld1.32 {q9},[$key5]
b.eq .Lcbc_enc192
aese $dat,q8
aesmc $dat,$dat
vld1.32 {q8},[$key6]
aese $dat,q9
aesmc $dat,$dat
vld1.32 {q9},[$key7]
nop
.Lcbc_enc192:
aese $dat,q8
aesmc $dat,$dat
subs $len,$len,#16
aese $dat,q9
aesmc $dat,$dat
cclr $step,eq
aese $dat,q10
aesmc $dat,$dat
aese $dat,q11
aesmc $dat,$dat
vld1.8 {q8},[$inp],$step
aese $dat,q12
aesmc $dat,$dat
veor q8,q8,$rndzero_n_last
aese $dat,q13
aesmc $dat,$dat
vld1.32 {q9},[$key_] // re-pre-load rndkey[1]
aese $dat,q14
aesmc $dat,$dat
aese $dat,q15
veor $ivec,$dat,$rndlast
b.hs .Loop_cbc_enc
vst1.8 {$ivec},[$out],#16
b .Lcbc_done
.align 5
.Lcbc_enc128:
vld1.32 {$in0-$in1},[$key_]
aese $dat,q8
aesmc $dat,$dat
b .Lenter_cbc_enc128
.Loop_cbc_enc128:
aese $dat,q8
aesmc $dat,$dat
vst1.8 {$ivec},[$out],#16
.Lenter_cbc_enc128:
aese $dat,q9
aesmc $dat,$dat
subs $len,$len,#16
aese $dat,$in0
aesmc $dat,$dat
cclr $step,eq
aese $dat,$in1
aesmc $dat,$dat
aese $dat,q10
aesmc $dat,$dat
aese $dat,q11
aesmc $dat,$dat
vld1.8 {q8},[$inp],$step
aese $dat,q12
aesmc $dat,$dat
aese $dat,q13
aesmc $dat,$dat
aese $dat,q14
aesmc $dat,$dat
veor q8,q8,$rndzero_n_last
aese $dat,q15
veor $ivec,$dat,$rndlast
b.hs .Loop_cbc_enc128
vst1.8 {$ivec},[$out],#16
b .Lcbc_done
___
{
my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9));
$code.=<<___;
.align 5
.Lcbc_dec:
vld1.8 {$dat2},[$inp],#16
subs $len,$len,#32 // bias
add $cnt,$rounds,#2
vorr $in1,$dat,$dat
vorr $dat1,$dat,$dat
vorr $in2,$dat2,$dat2
b.lo .Lcbc_dec_tail
vorr $dat1,$dat2,$dat2
vld1.8 {$dat2},[$inp],#16
vorr $in0,$dat,$dat
vorr $in1,$dat1,$dat1
vorr $in2,$dat2,$dat2
.Loop3x_cbc_dec:
aesd $dat0,q8
aesimc $dat0,$dat0
aesd $dat1,q8
aesimc $dat1,$dat1
aesd $dat2,q8
aesimc $dat2,$dat2
vld1.32 {q8},[$key_],#16
subs $cnt,$cnt,#2
aesd $dat0,q9
aesimc $dat0,$dat0
aesd $dat1,q9
aesimc $dat1,$dat1
aesd $dat2,q9
aesimc $dat2,$dat2
vld1.32 {q9},[$key_],#16
b.gt .Loop3x_cbc_dec
aesd $dat0,q8
aesimc $dat0,$dat0
aesd $dat1,q8
aesimc $dat1,$dat1
aesd $dat2,q8
aesimc $dat2,$dat2
veor $tmp0,$ivec,$rndlast
subs $len,$len,#0x30
veor $tmp1,$in0,$rndlast
mov.lo x6,$len // x6, $cnt, is zero at this point
aesd $dat0,q9
aesimc $dat0,$dat0
aesd $dat1,q9
aesimc $dat1,$dat1
aesd $dat2,q9
aesimc $dat2,$dat2
veor $tmp2,$in1,$rndlast
add $inp,$inp,x6 // $inp is adjusted in such way that
// at exit from the loop $dat1-$dat2
// are loaded with last "words"
vorr $ivec,$in2,$in2
mov $key_,$key
aesd $dat0,q12
aesimc $dat0,$dat0
aesd $dat1,q12
aesimc $dat1,$dat1
aesd $dat2,q12
aesimc $dat2,$dat2
vld1.8 {$in0},[$inp],#16
aesd $dat0,q13
aesimc $dat0,$dat0
aesd $dat1,q13
aesimc $dat1,$dat1
aesd $dat2,q13
aesimc $dat2,$dat2
vld1.8 {$in1},[$inp],#16
aesd $dat0,q14
aesimc $dat0,$dat0
aesd $dat1,q14
aesimc $dat1,$dat1
aesd $dat2,q14
aesimc $dat2,$dat2
vld1.8 {$in2},[$inp],#16
aesd $dat0,q15
aesd $dat1,q15
aesd $dat2,q15
vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0]
add $cnt,$rounds,#2
veor $tmp0,$tmp0,$dat0
veor $tmp1,$tmp1,$dat1
veor $dat2,$dat2,$tmp2
vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1]
vst1.8 {$tmp0},[$out],#16
vorr $dat0,$in0,$in0
vst1.8 {$tmp1},[$out],#16
vorr $dat1,$in1,$in1
vst1.8 {$dat2},[$out],#16
vorr $dat2,$in2,$in2
b.hs .Loop3x_cbc_dec
cmn $len,#0x30
b.eq .Lcbc_done
nop
.Lcbc_dec_tail:
aesd $dat1,q8
aesimc $dat1,$dat1
aesd $dat2,q8
aesimc $dat2,$dat2
vld1.32 {q8},[$key_],#16
subs $cnt,$cnt,#2
aesd $dat1,q9
aesimc $dat1,$dat1
aesd $dat2,q9
aesimc $dat2,$dat2
vld1.32 {q9},[$key_],#16
b.gt .Lcbc_dec_tail
aesd $dat1,q8
aesimc $dat1,$dat1
aesd $dat2,q8
aesimc $dat2,$dat2
aesd $dat1,q9
aesimc $dat1,$dat1
aesd $dat2,q9
aesimc $dat2,$dat2
aesd $dat1,q12
aesimc $dat1,$dat1
aesd $dat2,q12
aesimc $dat2,$dat2
cmn $len,#0x20
aesd $dat1,q13
aesimc $dat1,$dat1
aesd $dat2,q13
aesimc $dat2,$dat2
veor $tmp1,$ivec,$rndlast
aesd $dat1,q14
aesimc $dat1,$dat1
aesd $dat2,q14
aesimc $dat2,$dat2
veor $tmp2,$in1,$rndlast
aesd $dat1,q15
aesd $dat2,q15
b.eq .Lcbc_dec_one
veor $tmp1,$tmp1,$dat1
veor $tmp2,$tmp2,$dat2
vorr $ivec,$in2,$in2
vst1.8 {$tmp1},[$out],#16
vst1.8 {$tmp2},[$out],#16
b .Lcbc_done
.Lcbc_dec_one:
veor $tmp1,$tmp1,$dat2
vorr $ivec,$in2,$in2
vst1.8 {$tmp1},[$out],#16
.Lcbc_done:
vst1.8 {$ivec},[$ivp]
.Lcbc_abort:
___
}
$code.=<<___ if ($flavour !~ /64/);
vldmia sp!,{d8-d15}
ldmia sp!,{r4-r8,pc}
___
$code.=<<___ if ($flavour =~ /64/);
ldr x29,[sp],#16
ret
___
$code.=<<___;
.size ${prefix}_cbc_encrypt,.-${prefix}_cbc_encrypt
___
}}}
{{{
my ($inp,$out,$len,$key,$ivp)=map("x$_",(0..4));
my ($rounds,$cnt,$key_)=("w5","w6","x7");
my ($ctr,$tctr0,$tctr1,$tctr2)=map("w$_",(8..10,12));
my $step="x12"; # aliases with $tctr2
my ($dat0,$dat1,$in0,$in1,$tmp0,$tmp1,$ivec,$rndlast)=map("q$_",(0..7));
my ($dat2,$in2,$tmp2)=map("q$_",(10,11,9));
my ($dat,$tmp)=($dat0,$tmp0);
### q8-q15 preloaded key schedule
$code.=<<___;
.globl ${prefix}_ctr32_encrypt_blocks
.type ${prefix}_ctr32_encrypt_blocks,%function
.align 5
${prefix}_ctr32_encrypt_blocks:
___
$code.=<<___ if ($flavour =~ /64/);
stp x29,x30,[sp,#-16]!
add x29,sp,#0
___
$code.=<<___ if ($flavour !~ /64/);
mov ip,sp
stmdb sp!,{r4-r10,lr}
vstmdb sp!,{d8-d15} @ ABI specification says so
ldr r4, [ip] @ load remaining arg
___
$code.=<<___;
ldr $rounds,[$key,#240]
ldr $ctr, [$ivp, #12]
vld1.32 {$dat0},[$ivp]
vld1.32 {q8-q9},[$key] // load key schedule...
sub $rounds,$rounds,#4
mov $step,#16
cmp $len,#2
add $key_,$key,x5,lsl#4 // pointer to last 5 round keys
sub $rounds,$rounds,#2
vld1.32 {q12-q13},[$key_],#32
vld1.32 {q14-q15},[$key_],#32
vld1.32 {$rndlast},[$key_]
add $key_,$key,#32
mov $cnt,$rounds
cclr $step,lo
#ifndef __ARMEB__
rev $ctr, $ctr
#endif
vorr $dat1,$dat0,$dat0
add $tctr1, $ctr, #1
vorr $dat2,$dat0,$dat0
add $ctr, $ctr, #2
vorr $ivec,$dat0,$dat0
rev $tctr1, $tctr1
vmov.32 ${dat1}[3],$tctr1
b.ls .Lctr32_tail
rev $tctr2, $ctr
sub $len,$len,#3 // bias
vmov.32 ${dat2}[3],$tctr2
b .Loop3x_ctr32
.align 4
.Loop3x_ctr32:
aese $dat0,q8
aesmc $dat0,$dat0
aese $dat1,q8
aesmc $dat1,$dat1
aese $dat2,q8
aesmc $dat2,$dat2
vld1.32 {q8},[$key_],#16
subs $cnt,$cnt,#2
aese $dat0,q9
aesmc $dat0,$dat0
aese $dat1,q9
aesmc $dat1,$dat1
aese $dat2,q9
aesmc $dat2,$dat2
vld1.32 {q9},[$key_],#16
b.gt .Loop3x_ctr32
aese $dat0,q8
aesmc $tmp0,$dat0
aese $dat1,q8
aesmc $tmp1,$dat1
vld1.8 {$in0},[$inp],#16
vorr $dat0,$ivec,$ivec
aese $dat2,q8
aesmc $dat2,$dat2
vld1.8 {$in1},[$inp],#16
vorr $dat1,$ivec,$ivec
aese $tmp0,q9
aesmc $tmp0,$tmp0
aese $tmp1,q9
aesmc $tmp1,$tmp1
vld1.8 {$in2},[$inp],#16
mov $key_,$key
aese $dat2,q9
aesmc $tmp2,$dat2
vorr $dat2,$ivec,$ivec
add $tctr0,$ctr,#1
aese $tmp0,q12
aesmc $tmp0,$tmp0
aese $tmp1,q12
aesmc $tmp1,$tmp1
veor $in0,$in0,$rndlast
add $tctr1,$ctr,#2
aese $tmp2,q12
aesmc $tmp2,$tmp2
veor $in1,$in1,$rndlast
add $ctr,$ctr,#3
aese $tmp0,q13
aesmc $tmp0,$tmp0
aese $tmp1,q13
aesmc $tmp1,$tmp1
veor $in2,$in2,$rndlast
rev $tctr0,$tctr0
aese $tmp2,q13
aesmc $tmp2,$tmp2
vmov.32 ${dat0}[3], $tctr0
rev $tctr1,$tctr1
aese $tmp0,q14
aesmc $tmp0,$tmp0
aese $tmp1,q14
aesmc $tmp1,$tmp1
vmov.32 ${dat1}[3], $tctr1
rev $tctr2,$ctr
aese $tmp2,q14
aesmc $tmp2,$tmp2
vmov.32 ${dat2}[3], $tctr2
subs $len,$len,#3
aese $tmp0,q15
aese $tmp1,q15
aese $tmp2,q15
veor $in0,$in0,$tmp0
vld1.32 {q8},[$key_],#16 // re-pre-load rndkey[0]
vst1.8 {$in0},[$out],#16
veor $in1,$in1,$tmp1
mov $cnt,$rounds
vst1.8 {$in1},[$out],#16
veor $in2,$in2,$tmp2
vld1.32 {q9},[$key_],#16 // re-pre-load rndkey[1]
vst1.8 {$in2},[$out],#16
b.hs .Loop3x_ctr32
adds $len,$len,#3
b.eq .Lctr32_done
cmp $len,#1
mov $step,#16
cclr $step,eq
.Lctr32_tail:
aese $dat0,q8
aesmc $dat0,$dat0
aese $dat1,q8
aesmc $dat1,$dat1
vld1.32 {q8},[$key_],#16
subs $cnt,$cnt,#2
aese $dat0,q9
aesmc $dat0,$dat0
aese $dat1,q9
aesmc $dat1,$dat1
vld1.32 {q9},[$key_],#16
b.gt .Lctr32_tail
aese $dat0,q8
aesmc $dat0,$dat0
aese $dat1,q8
aesmc $dat1,$dat1
aese $dat0,q9
aesmc $dat0,$dat0
aese $dat1,q9
aesmc $dat1,$dat1
vld1.8 {$in0},[$inp],$step
aese $dat0,q12
aesmc $dat0,$dat0
aese $dat1,q12
aesmc $dat1,$dat1
vld1.8 {$in1},[$inp]
aese $dat0,q13
aesmc $dat0,$dat0
aese $dat1,q13
aesmc $dat1,$dat1
veor $in0,$in0,$rndlast
aese $dat0,q14
aesmc $dat0,$dat0
aese $dat1,q14
aesmc $dat1,$dat1
veor $in1,$in1,$rndlast
aese $dat0,q15
aese $dat1,q15
cmp $len,#1
veor $in0,$in0,$dat0
veor $in1,$in1,$dat1
vst1.8 {$in0},[$out],#16
b.eq .Lctr32_done
vst1.8 {$in1},[$out]
.Lctr32_done:
___
$code.=<<___ if ($flavour !~ /64/);
vldmia sp!,{d8-d15}
ldmia sp!,{r4-r10,pc}
___
$code.=<<___ if ($flavour =~ /64/);
ldr x29,[sp],#16
ret
___
$code.=<<___;
.size ${prefix}_ctr32_encrypt_blocks,.-${prefix}_ctr32_encrypt_blocks
___
}}}
$code.=<<___;
#endif
___
########################################
if ($flavour =~ /64/) { ######## 64-bit code
my %opcode = (
"aesd" => 0x4e285800, "aese" => 0x4e284800,
"aesimc"=> 0x4e287800, "aesmc" => 0x4e286800 );
local *unaes = sub {
my ($mnemonic,$arg)=@_;
$arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o &&
sprintf ".inst\t0x%08x\t//%s %s",
$opcode{$mnemonic}|$1|($2<<5),
$mnemonic,$arg;
};
foreach(split("\n",$code)) {
s/\`([^\`]*)\`/eval($1)/geo;
s/\bq([0-9]+)\b/"v".($1<8?$1:$1+8).".16b"/geo; # old->new registers
s/@\s/\/\//o; # old->new style commentary
#s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or
s/cclr\s+([wx])([^,]+),\s*([a-z]+)/csel $1$2,$1zr,$1$2,$3/o or
s/mov\.([a-z]+)\s+([wx][0-9]+),\s*([wx][0-9]+)/csel $2,$3,$2,$1/o or
s/vmov\.i8/movi/o or # fix up legacy mnemonics
s/vext\.8/ext/o or
s/vrev32\.8/rev32/o or
s/vtst\.8/cmtst/o or
s/vshr/ushr/o or
s/^(\s+)v/$1/o or # strip off v prefix
s/\bbx\s+lr\b/ret/o;
# fix up remainig legacy suffixes
s/\.[ui]?8//o;
m/\],#8/o and s/\.16b/\.8b/go;
s/\.[ui]?32//o and s/\.16b/\.4s/go;
s/\.[ui]?64//o and s/\.16b/\.2d/go;
s/\.[42]([sd])\[([0-3])\]/\.$1\[$2\]/o;
print $_,"\n";
}
} else { ######## 32-bit code
my %opcode = (
"aesd" => 0xf3b00340, "aese" => 0xf3b00300,
"aesimc"=> 0xf3b003c0, "aesmc" => 0xf3b00380 );
local *unaes = sub {
my ($mnemonic,$arg)=@_;
if ($arg =~ m/[qv]([0-9]+)[^,]*,\s*[qv]([0-9]+)/o) {
my $word = $opcode{$mnemonic}|(($1&7)<<13)|(($1&8)<<19)
|(($2&7)<<1) |(($2&8)<<2);
# since ARMv7 instructions are always encoded little-endian.
# correct solution is to use .inst directive, but older
# assemblers don't implement it:-(
sprintf ".byte\t0x%02x,0x%02x,0x%02x,0x%02x\t@ %s %s",
$word&0xff,($word>>8)&0xff,
($word>>16)&0xff,($word>>24)&0xff,
$mnemonic,$arg;
}
};
sub unvtbl {
my $arg=shift;
$arg =~ m/q([0-9]+),\s*\{q([0-9]+)\},\s*q([0-9]+)/o &&
sprintf "vtbl.8 d%d,{q%d},d%d\n\t".
"vtbl.8 d%d,{q%d},d%d", 2*$1,$2,2*$3, 2*$1+1,$2,2*$3+1;
}
sub unvdup32 {
my $arg=shift;
$arg =~ m/q([0-9]+),\s*q([0-9]+)\[([0-3])\]/o &&
sprintf "vdup.32 q%d,d%d[%d]",$1,2*$2+($3>>1),$3&1;
}
sub unvmov32 {
my $arg=shift;
$arg =~ m/q([0-9]+)\[([0-3])\],(.*)/o &&
sprintf "vmov.32 d%d[%d],%s",2*$1+($2>>1),$2&1,$3;
}
foreach(split("\n",$code)) {
s/\`([^\`]*)\`/eval($1)/geo;
s/\b[wx]([0-9]+)\b/r$1/go; # new->old registers
s/\bv([0-9])\.[12468]+[bsd]\b/q$1/go; # new->old registers
s/\/\/\s?/@ /o; # new->old style commentary
# fix up remainig new-style suffixes
s/\{q([0-9]+)\},\s*\[(.+)\],#8/sprintf "{d%d},[$2]!",2*$1/eo or
s/\],#[0-9]+/]!/o;
s/[v]?(aes\w+)\s+([qv].*)/unaes($1,$2)/geo or
s/cclr\s+([^,]+),\s*([a-z]+)/mov$2 $1,#0/o or
s/vtbl\.8\s+(.*)/unvtbl($1)/geo or
s/vdup\.32\s+(.*)/unvdup32($1)/geo or
s/vmov\.32\s+(.*)/unvmov32($1)/geo or
s/^(\s+)b\./$1b/o or
s/^(\s+)mov\./$1mov/o or
s/^(\s+)ret/$1bx\tlr/o;
print $_,"\n";
}
}
close STDOUT;

2469
crypto/aes/asm/bsaes-armv7.pl Executable file

File diff suppressed because it is too large Load Diff

View File

@ -38,8 +38,9 @@
# Emilia's this(*) difference
#
# Core 2 9.30 8.69 +7%
# Nehalem(**) 7.63 6.98 +9%
# Atom 17.1 17.4 -2%(***)
# Nehalem(**) 7.63 6.88 +11%
# Atom 17.1 16.4 +4%
# Silvermont - 12.9
#
# (*) Comparison is not completely fair, because "this" is ECB,
# i.e. no extra processing such as counter values calculation
@ -50,14 +51,6 @@
# (**) Results were collected on Westmere, which is considered to
# be equivalent to Nehalem for this code.
#
# (***) Slowdown on Atom is rather strange per se, because original
# implementation has a number of 9+-bytes instructions, which
# are bad for Atom front-end, and which I eliminated completely.
# In attempt to address deterioration sbox() was tested in FP
# SIMD "domain" (movaps instead of movdqa, xorps instead of
# pxor, etc.). While it resulted in nominal 4% improvement on
# Atom, it hurted Westmere by more than 2x factor.
#
# As for key schedule conversion subroutine. Interface to OpenSSL
# relies on per-invocation on-the-fly conversion. This naturally
# has impact on performance, especially for short inputs. Conversion
@ -67,7 +60,7 @@
# conversion conversion/8x block
# Core 2 240 0.22
# Nehalem 180 0.20
# Atom 430 0.19
# Atom 430 0.20
#
# The ratio values mean that 128-byte blocks will be processed
# 16-18% slower, 256-byte blocks - 9-10%, 384-byte blocks - 6-7%,
@ -83,9 +76,10 @@
# Add decryption procedure. Performance in CPU cycles spent to decrypt
# one byte out of 4096-byte buffer with 128-bit key is:
#
# Core 2 9.83
# Nehalem 7.74
# Atom 19.0
# Core 2 9.98
# Nehalem 7.80
# Atom 17.9
# Silvermont 14.0
#
# November 2011.
#
@ -434,21 +428,21 @@ my $mask=pop;
$code.=<<___;
pxor 0x00($key),@x[0]
pxor 0x10($key),@x[1]
pshufb $mask,@x[0]
pxor 0x20($key),@x[2]
pshufb $mask,@x[1]
pxor 0x30($key),@x[3]
pshufb $mask,@x[2]
pshufb $mask,@x[0]
pshufb $mask,@x[1]
pxor 0x40($key),@x[4]
pshufb $mask,@x[3]
pxor 0x50($key),@x[5]
pshufb $mask,@x[4]
pshufb $mask,@x[2]
pshufb $mask,@x[3]
pxor 0x60($key),@x[6]
pshufb $mask,@x[5]
pxor 0x70($key),@x[7]
pshufb $mask,@x[4]
pshufb $mask,@x[5]
pshufb $mask,@x[6]
lea 0x80($key),$key
pshufb $mask,@x[7]
lea 0x80($key),$key
___
}
@ -820,18 +814,18 @@ _bsaes_encrypt8:
movdqa 0x50($const), @XMM[8] # .LM0SR
pxor @XMM[9], @XMM[0] # xor with round0 key
pxor @XMM[9], @XMM[1]
pshufb @XMM[8], @XMM[0]
pxor @XMM[9], @XMM[2]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[3]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[0]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[4]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[5]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[6]
pshufb @XMM[8], @XMM[5]
pxor @XMM[9], @XMM[7]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[5]
pshufb @XMM[8], @XMM[6]
pshufb @XMM[8], @XMM[7]
_bsaes_encrypt8_bitslice:
@ -884,18 +878,18 @@ _bsaes_decrypt8:
movdqa -0x30($const), @XMM[8] # .LM0ISR
pxor @XMM[9], @XMM[0] # xor with round0 key
pxor @XMM[9], @XMM[1]
pshufb @XMM[8], @XMM[0]
pxor @XMM[9], @XMM[2]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[3]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[0]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[4]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[5]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[6]
pshufb @XMM[8], @XMM[5]
pxor @XMM[9], @XMM[7]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[5]
pshufb @XMM[8], @XMM[6]
pshufb @XMM[8], @XMM[7]
___
@ -1937,21 +1931,21 @@ $code.=<<___;
movdqa -0x10(%r11), @XMM[8] # .LSWPUPM0SR
pxor @XMM[9], @XMM[0] # xor with round0 key
pxor @XMM[9], @XMM[1]
pshufb @XMM[8], @XMM[0]
pxor @XMM[9], @XMM[2]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[3]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[0]
pshufb @XMM[8], @XMM[1]
pxor @XMM[9], @XMM[4]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[5]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[2]
pshufb @XMM[8], @XMM[3]
pxor @XMM[9], @XMM[6]
pshufb @XMM[8], @XMM[5]
pxor @XMM[9], @XMM[7]
pshufb @XMM[8], @XMM[4]
pshufb @XMM[8], @XMM[5]
pshufb @XMM[8], @XMM[6]
lea .LBS0(%rip), %r11 # constants table
pshufb @XMM[8], @XMM[7]
lea .LBS0(%rip), %r11 # constants table
mov %ebx,%r10d # pass rounds
call _bsaes_encrypt8_bitslice

1512
crypto/aes/asm/vpaes-ppc.pl Executable file

File diff suppressed because it is too large Load Diff

View File

@ -27,9 +27,10 @@
#
# aes-586.pl vpaes-x86.pl
#
# Core 2(**) 29.1/42.3/18.3 22.0/25.6(***)
# Nehalem 27.9/40.4/18.1 10.3/12.0
# Atom 102./119./60.1 64.5/85.3(***)
# Core 2(**) 28.1/41.4/18.3 21.9/25.2(***)
# Nehalem 27.9/40.4/18.1 10.2/11.9
# Atom 70.7/92.1/60.1 61.1/75.4(***)
# Silvermont 45.4/62.9/24.1 49.2/61.1(***)
#
# (*) "Hyper-threading" in the context refers rather to cache shared
# among multiple cores, than to specifically Intel HTT. As vast
@ -40,8 +41,8 @@
# (**) "Core 2" refers to initial 65nm design, a.k.a. Conroe.
#
# (***) Less impressive improvement on Core 2 and Atom is due to slow
# pshufb, yet it's respectable +32%/65% improvement on Core 2
# and +58%/40% on Atom (as implied, over "hyper-threading-safe"
# pshufb, yet it's respectable +28%/64% improvement on Core 2
# and +15% on Atom (as implied, over "hyper-threading-safe"
# code path).
#
# <appro@openssl.org>
@ -183,35 +184,35 @@ $k_dsbo=0x2c0; # decryption sbox final output
&movdqa ("xmm1","xmm6")
&movdqa ("xmm2",&QWP($k_ipt,$const));
&pandn ("xmm1","xmm0");
&movdqu ("xmm5",&QWP(0,$key));
&psrld ("xmm1",4);
&pand ("xmm0","xmm6");
&movdqu ("xmm5",&QWP(0,$key));
&pshufb ("xmm2","xmm0");
&movdqa ("xmm0",&QWP($k_ipt+16,$const));
&pshufb ("xmm0","xmm1");
&pxor ("xmm2","xmm5");
&pxor ("xmm0","xmm2");
&psrld ("xmm1",4);
&add ($key,16);
&pshufb ("xmm0","xmm1");
&lea ($base,&DWP($k_mc_backward,$const));
&pxor ("xmm0","xmm2");
&jmp (&label("enc_entry"));
&set_label("enc_loop",16);
# middle of middle round
&movdqa ("xmm4",&QWP($k_sb1,$const)); # 4 : sb1u
&pshufb ("xmm4","xmm2"); # 4 = sb1u
&pxor ("xmm4","xmm5"); # 4 = sb1u + k
&movdqa ("xmm0",&QWP($k_sb1+16,$const));# 0 : sb1t
&pshufb ("xmm4","xmm2"); # 4 = sb1u
&pshufb ("xmm0","xmm3"); # 0 = sb1t
&pxor ("xmm0","xmm4"); # 0 = A
&pxor ("xmm4","xmm5"); # 4 = sb1u + k
&movdqa ("xmm5",&QWP($k_sb2,$const)); # 4 : sb2u
&pshufb ("xmm5","xmm2"); # 4 = sb2u
&pxor ("xmm0","xmm4"); # 0 = A
&movdqa ("xmm1",&QWP(-0x40,$base,$magic));# .Lk_mc_forward[]
&pshufb ("xmm5","xmm2"); # 4 = sb2u
&movdqa ("xmm2",&QWP($k_sb2+16,$const));# 2 : sb2t
&pshufb ("xmm2","xmm3"); # 2 = sb2t
&pxor ("xmm2","xmm5"); # 2 = 2A
&movdqa ("xmm4",&QWP(0,$base,$magic)); # .Lk_mc_backward[]
&pshufb ("xmm2","xmm3"); # 2 = sb2t
&movdqa ("xmm3","xmm0"); # 3 = A
&pxor ("xmm2","xmm5"); # 2 = 2A
&pshufb ("xmm0","xmm1"); # 0 = B
&add ($key,16); # next key
&pxor ("xmm0","xmm2"); # 0 = 2A+B
@ -220,30 +221,30 @@ $k_dsbo=0x2c0; # decryption sbox final output
&pxor ("xmm3","xmm0"); # 3 = 2A+B+D
&pshufb ("xmm0","xmm1"); # 0 = 2B+C
&and ($magic,0x30); # ... mod 4
&pxor ("xmm0","xmm3"); # 0 = 2A+3B+C+D
&sub ($round,1); # nr--
&pxor ("xmm0","xmm3"); # 0 = 2A+3B+C+D
&set_label("enc_entry");
# top of round
&movdqa ("xmm1","xmm6"); # 1 : i
&movdqa ("xmm5",&QWP($k_inv+16,$const));# 2 : a/k
&pandn ("xmm1","xmm0"); # 1 = i<<4
&psrld ("xmm1",4); # 1 = i
&pand ("xmm0","xmm6"); # 0 = k
&movdqa ("xmm5",&QWP($k_inv+16,$const));# 2 : a/k
&pshufb ("xmm5","xmm0"); # 2 = a/k
&pxor ("xmm0","xmm1"); # 0 = j
&movdqa ("xmm3","xmm7"); # 3 : 1/i
&pxor ("xmm0","xmm1"); # 0 = j
&pshufb ("xmm3","xmm1"); # 3 = 1/i
&pxor ("xmm3","xmm5"); # 3 = iak = 1/i + a/k
&movdqa ("xmm4","xmm7"); # 4 : 1/j
&pxor ("xmm3","xmm5"); # 3 = iak = 1/i + a/k
&pshufb ("xmm4","xmm0"); # 4 = 1/j
&pxor ("xmm4","xmm5"); # 4 = jak = 1/j + a/k
&movdqa ("xmm2","xmm7"); # 2 : 1/iak
&pxor ("xmm4","xmm5"); # 4 = jak = 1/j + a/k
&pshufb ("xmm2","xmm3"); # 2 = 1/iak
&pxor ("xmm2","xmm0"); # 2 = io
&movdqa ("xmm3","xmm7"); # 3 : 1/jak
&movdqu ("xmm5",&QWP(0,$key));
&pxor ("xmm2","xmm0"); # 2 = io
&pshufb ("xmm3","xmm4"); # 3 = 1/jak
&movdqu ("xmm5",&QWP(0,$key));
&pxor ("xmm3","xmm1"); # 3 = jo
&jnz (&label("enc_loop"));
@ -265,8 +266,8 @@ $k_dsbo=0x2c0; # decryption sbox final output
## Same API as encryption core.
##
&function_begin_B("_vpaes_decrypt_core");
&mov ($round,&DWP(240,$key));
&lea ($base,&DWP($k_dsbd,$const));
&mov ($round,&DWP(240,$key));
&movdqa ("xmm1","xmm6");
&movdqa ("xmm2",&QWP($k_dipt-$k_dsbd,$base));
&pandn ("xmm1","xmm0");
@ -292,62 +293,61 @@ $k_dsbo=0x2c0; # decryption sbox final output
## Inverse mix columns
##
&movdqa ("xmm4",&QWP(-0x20,$base)); # 4 : sb9u
&movdqa ("xmm1",&QWP(-0x10,$base)); # 0 : sb9t
&pshufb ("xmm4","xmm2"); # 4 = sb9u
&pxor ("xmm4","xmm0");
&movdqa ("xmm0",&QWP(-0x10,$base)); # 0 : sb9t
&pshufb ("xmm0","xmm3"); # 0 = sb9t
&pxor ("xmm0","xmm4"); # 0 = ch
&add ($key,16); # next round key
&pshufb ("xmm0","xmm5"); # MC ch
&pshufb ("xmm1","xmm3"); # 0 = sb9t
&pxor ("xmm0","xmm4");
&movdqa ("xmm4",&QWP(0,$base)); # 4 : sbdu
&pxor ("xmm0","xmm1"); # 0 = ch
&movdqa ("xmm1",&QWP(0x10,$base)); # 0 : sbdt
&pshufb ("xmm4","xmm2"); # 4 = sbdu
&pxor ("xmm4","xmm0"); # 4 = ch
&movdqa ("xmm0",&QWP(0x10,$base)); # 0 : sbdt
&pshufb ("xmm0","xmm3"); # 0 = sbdt
&pxor ("xmm0","xmm4"); # 0 = ch
&sub ($round,1); # nr--
&pshufb ("xmm0","xmm5"); # MC ch
&pshufb ("xmm1","xmm3"); # 0 = sbdt
&pxor ("xmm0","xmm4"); # 4 = ch
&movdqa ("xmm4",&QWP(0x20,$base)); # 4 : sbbu
&pxor ("xmm0","xmm1"); # 0 = ch
&movdqa ("xmm1",&QWP(0x30,$base)); # 0 : sbbt
&pshufb ("xmm4","xmm2"); # 4 = sbbu
&pxor ("xmm4","xmm0"); # 4 = ch
&movdqa ("xmm0",&QWP(0x30,$base)); # 0 : sbbt
&pshufb ("xmm0","xmm3"); # 0 = sbbt
&pxor ("xmm0","xmm4"); # 0 = ch
&pshufb ("xmm0","xmm5"); # MC ch
&pshufb ("xmm1","xmm3"); # 0 = sbbt
&pxor ("xmm0","xmm4"); # 4 = ch
&movdqa ("xmm4",&QWP(0x40,$base)); # 4 : sbeu
&pshufb ("xmm4","xmm2"); # 4 = sbeu
&pxor ("xmm4","xmm0"); # 4 = ch
&movdqa ("xmm0",&QWP(0x50,$base)); # 0 : sbet
&pshufb ("xmm0","xmm3"); # 0 = sbet
&pxor ("xmm0","xmm4"); # 0 = ch
&pxor ("xmm0","xmm1"); # 0 = ch
&movdqa ("xmm1",&QWP(0x50,$base)); # 0 : sbet
&pshufb ("xmm4","xmm2"); # 4 = sbeu
&pshufb ("xmm0","xmm5"); # MC ch
&pshufb ("xmm1","xmm3"); # 0 = sbet
&pxor ("xmm0","xmm4"); # 4 = ch
&add ($key,16); # next round key
&palignr("xmm5","xmm5",12);
&pxor ("xmm0","xmm1"); # 0 = ch
&sub ($round,1); # nr--
&set_label("dec_entry");
# top of round
&movdqa ("xmm1","xmm6"); # 1 : i
&pandn ("xmm1","xmm0"); # 1 = i<<4
&psrld ("xmm1",4); # 1 = i
&pand ("xmm0","xmm6"); # 0 = k
&movdqa ("xmm2",&QWP($k_inv+16,$const));# 2 : a/k
&pandn ("xmm1","xmm0"); # 1 = i<<4
&pand ("xmm0","xmm6"); # 0 = k
&psrld ("xmm1",4); # 1 = i
&pshufb ("xmm2","xmm0"); # 2 = a/k
&pxor ("xmm0","xmm1"); # 0 = j
&movdqa ("xmm3","xmm7"); # 3 : 1/i
&pxor ("xmm0","xmm1"); # 0 = j
&pshufb ("xmm3","xmm1"); # 3 = 1/i
&pxor ("xmm3","xmm2"); # 3 = iak = 1/i + a/k
&movdqa ("xmm4","xmm7"); # 4 : 1/j
&pxor ("xmm3","xmm2"); # 3 = iak = 1/i + a/k
&pshufb ("xmm4","xmm0"); # 4 = 1/j
&pxor ("xmm4","xmm2"); # 4 = jak = 1/j + a/k
&movdqa ("xmm2","xmm7"); # 2 : 1/iak
&pshufb ("xmm2","xmm3"); # 2 = 1/iak
&pxor ("xmm2","xmm0"); # 2 = io
&movdqa ("xmm3","xmm7"); # 3 : 1/jak
&pxor ("xmm2","xmm0"); # 2 = io
&pshufb ("xmm3","xmm4"); # 3 = 1/jak
&pxor ("xmm3","xmm1"); # 3 = jo
&movdqu ("xmm0",&QWP(0,$key));
&pxor ("xmm3","xmm1"); # 3 = jo
&jnz (&label("dec_loop"));
# middle of last round
@ -542,12 +542,12 @@ $k_dsbo=0x2c0; # decryption sbox final output
## %xmm0: b+c+d b+c b a
##
&function_begin_B("_vpaes_schedule_192_smear");
&pshufd ("xmm0","xmm6",0x80); # d c 0 0 -> c 0 0 0
&pxor ("xmm6","xmm0"); # -> c+d c 0 0
&pshufd ("xmm1","xmm6",0x80); # d c 0 0 -> c 0 0 0
&pshufd ("xmm0","xmm7",0xFE); # b a _ _ -> b b b a
&pxor ("xmm6","xmm1"); # -> c+d c 0 0
&pxor ("xmm1","xmm1");
&pxor ("xmm6","xmm0"); # -> b+c+d b+c b a
&movdqa ("xmm0","xmm6");
&pxor ("xmm1","xmm1");
&movhlps("xmm6","xmm1"); # clobber low side with zeros
&ret ();
&function_end_B("_vpaes_schedule_192_smear");

View File

@ -27,9 +27,10 @@
#
# aes-x86_64.pl vpaes-x86_64.pl
#
# Core 2(**) 30.5/43.7/14.3 21.8/25.7(***)
# Nehalem 30.5/42.2/14.6 9.8/11.8
# Atom 63.9/79.0/32.1 64.0/84.8(***)
# Core 2(**) 29.6/41.1/14.3 21.9/25.2(***)
# Nehalem 29.6/40.3/14.6 10.0/11.8
# Atom 57.3/74.2/32.1 60.9/77.2(***)
# Silvermont 52.7/64.0/19.5 48.8/60.8(***)
#
# (*) "Hyper-threading" in the context refers rather to cache shared
# among multiple cores, than to specifically Intel HTT. As vast
@ -40,7 +41,7 @@
# (**) "Core 2" refers to initial 65nm design, a.k.a. Conroe.
#
# (***) Less impressive improvement on Core 2 and Atom is due to slow
# pshufb, yet it's respectable +40%/78% improvement on Core 2
# pshufb, yet it's respectable +36%/62% improvement on Core 2
# (as implied, over "hyper-threading-safe" code path).
#
# <appro@openssl.org>
@ -95,8 +96,8 @@ _vpaes_encrypt_core:
movdqa .Lk_ipt+16(%rip), %xmm0 # ipthi
pshufb %xmm1, %xmm0
pxor %xmm5, %xmm2
pxor %xmm2, %xmm0
add \$16, %r9
pxor %xmm2, %xmm0
lea .Lk_mc_backward(%rip),%r10
jmp .Lenc_entry
@ -104,19 +105,19 @@ _vpaes_encrypt_core:
.Lenc_loop:
# middle of middle round
movdqa %xmm13, %xmm4 # 4 : sb1u
pshufb %xmm2, %xmm4 # 4 = sb1u
pxor %xmm5, %xmm4 # 4 = sb1u + k
movdqa %xmm12, %xmm0 # 0 : sb1t
pshufb %xmm2, %xmm4 # 4 = sb1u
pshufb %xmm3, %xmm0 # 0 = sb1t
pxor %xmm4, %xmm0 # 0 = A
pxor %xmm5, %xmm4 # 4 = sb1u + k
movdqa %xmm15, %xmm5 # 4 : sb2u
pshufb %xmm2, %xmm5 # 4 = sb2u
pxor %xmm4, %xmm0 # 0 = A
movdqa -0x40(%r11,%r10), %xmm1 # .Lk_mc_forward[]
pshufb %xmm2, %xmm5 # 4 = sb2u
movdqa (%r11,%r10), %xmm4 # .Lk_mc_backward[]
movdqa %xmm14, %xmm2 # 2 : sb2t
pshufb %xmm3, %xmm2 # 2 = sb2t
pxor %xmm5, %xmm2 # 2 = 2A
movdqa (%r11,%r10), %xmm4 # .Lk_mc_backward[]
movdqa %xmm0, %xmm3 # 3 = A
pxor %xmm5, %xmm2 # 2 = 2A
pshufb %xmm1, %xmm0 # 0 = B
add \$16, %r9 # next key
pxor %xmm2, %xmm0 # 0 = 2A+B
@ -125,30 +126,30 @@ _vpaes_encrypt_core:
pxor %xmm0, %xmm3 # 3 = 2A+B+D
pshufb %xmm1, %xmm0 # 0 = 2B+C
and \$0x30, %r11 # ... mod 4
pxor %xmm3, %xmm0 # 0 = 2A+3B+C+D
sub \$1,%rax # nr--
pxor %xmm3, %xmm0 # 0 = 2A+3B+C+D
.Lenc_entry:
# top of round
movdqa %xmm9, %xmm1 # 1 : i
movdqa %xmm11, %xmm5 # 2 : a/k
pandn %xmm0, %xmm1 # 1 = i<<4
psrld \$4, %xmm1 # 1 = i
pand %xmm9, %xmm0 # 0 = k
movdqa %xmm11, %xmm5 # 2 : a/k
pshufb %xmm0, %xmm5 # 2 = a/k
pxor %xmm1, %xmm0 # 0 = j
movdqa %xmm10, %xmm3 # 3 : 1/i
pxor %xmm1, %xmm0 # 0 = j
pshufb %xmm1, %xmm3 # 3 = 1/i
pxor %xmm5, %xmm3 # 3 = iak = 1/i + a/k
movdqa %xmm10, %xmm4 # 4 : 1/j
pxor %xmm5, %xmm3 # 3 = iak = 1/i + a/k
pshufb %xmm0, %xmm4 # 4 = 1/j
pxor %xmm5, %xmm4 # 4 = jak = 1/j + a/k
movdqa %xmm10, %xmm2 # 2 : 1/iak
pxor %xmm5, %xmm4 # 4 = jak = 1/j + a/k
pshufb %xmm3, %xmm2 # 2 = 1/iak
pxor %xmm0, %xmm2 # 2 = io
movdqa %xmm10, %xmm3 # 3 : 1/jak
movdqu (%r9), %xmm5
pxor %xmm0, %xmm2 # 2 = io
pshufb %xmm4, %xmm3 # 3 = 1/jak
movdqu (%r9), %xmm5
pxor %xmm1, %xmm3 # 3 = jo
jnz .Lenc_loop
@ -201,62 +202,61 @@ _vpaes_decrypt_core:
## Inverse mix columns
##
movdqa -0x20(%r10),%xmm4 # 4 : sb9u
movdqa -0x10(%r10),%xmm1 # 0 : sb9t
pshufb %xmm2, %xmm4 # 4 = sb9u
pxor %xmm0, %xmm4
movdqa -0x10(%r10),%xmm0 # 0 : sb9t
pshufb %xmm3, %xmm0 # 0 = sb9t
pxor %xmm4, %xmm0 # 0 = ch
add \$16, %r9 # next round key
pshufb %xmm5, %xmm0 # MC ch
pshufb %xmm3, %xmm1 # 0 = sb9t
pxor %xmm4, %xmm0
movdqa 0x00(%r10),%xmm4 # 4 : sbdu
pshufb %xmm2, %xmm4 # 4 = sbdu
pxor %xmm0, %xmm4 # 4 = ch
movdqa 0x10(%r10),%xmm0 # 0 : sbdt
pshufb %xmm3, %xmm0 # 0 = sbdt
pxor %xmm4, %xmm0 # 0 = ch
sub \$1,%rax # nr--
pshufb %xmm5, %xmm0 # MC ch
movdqa 0x20(%r10),%xmm4 # 4 : sbbu
pshufb %xmm2, %xmm4 # 4 = sbbu
pxor %xmm0, %xmm4 # 4 = ch
movdqa 0x30(%r10),%xmm0 # 0 : sbbt
pshufb %xmm3, %xmm0 # 0 = sbbt
pxor %xmm4, %xmm0 # 0 = ch
pshufb %xmm5, %xmm0 # MC ch
movdqa 0x40(%r10),%xmm4 # 4 : sbeu
pshufb %xmm2, %xmm4 # 4 = sbeu
pxor %xmm0, %xmm4 # 4 = ch
movdqa 0x50(%r10),%xmm0 # 0 : sbet
pshufb %xmm3, %xmm0 # 0 = sbet
pxor %xmm4, %xmm0 # 0 = ch
pxor %xmm1, %xmm0 # 0 = ch
movdqa 0x10(%r10),%xmm1 # 0 : sbdt
pshufb %xmm2, %xmm4 # 4 = sbdu
pshufb %xmm5, %xmm0 # MC ch
pshufb %xmm3, %xmm1 # 0 = sbdt
pxor %xmm4, %xmm0 # 4 = ch
movdqa 0x20(%r10),%xmm4 # 4 : sbbu
pxor %xmm1, %xmm0 # 0 = ch
movdqa 0x30(%r10),%xmm1 # 0 : sbbt
pshufb %xmm2, %xmm4 # 4 = sbbu
pshufb %xmm5, %xmm0 # MC ch
pshufb %xmm3, %xmm1 # 0 = sbbt
pxor %xmm4, %xmm0 # 4 = ch
movdqa 0x40(%r10),%xmm4 # 4 : sbeu
pxor %xmm1, %xmm0 # 0 = ch
movdqa 0x50(%r10),%xmm1 # 0 : sbet
pshufb %xmm2, %xmm4 # 4 = sbeu
pshufb %xmm5, %xmm0 # MC ch
pshufb %xmm3, %xmm1 # 0 = sbet
pxor %xmm4, %xmm0 # 4 = ch
add \$16, %r9 # next round key
palignr \$12, %xmm5, %xmm5
pxor %xmm1, %xmm0 # 0 = ch
sub \$1,%rax # nr--
.Ldec_entry:
# top of round
movdqa %xmm9, %xmm1 # 1 : i
pandn %xmm0, %xmm1 # 1 = i<<4
movdqa %xmm11, %xmm2 # 2 : a/k
psrld \$4, %xmm1 # 1 = i
pand %xmm9, %xmm0 # 0 = k
movdqa %xmm11, %xmm2 # 2 : a/k
pshufb %xmm0, %xmm2 # 2 = a/k
pxor %xmm1, %xmm0 # 0 = j
movdqa %xmm10, %xmm3 # 3 : 1/i
pxor %xmm1, %xmm0 # 0 = j
pshufb %xmm1, %xmm3 # 3 = 1/i
pxor %xmm2, %xmm3 # 3 = iak = 1/i + a/k
movdqa %xmm10, %xmm4 # 4 : 1/j
pxor %xmm2, %xmm3 # 3 = iak = 1/i + a/k
pshufb %xmm0, %xmm4 # 4 = 1/j
pxor %xmm2, %xmm4 # 4 = jak = 1/j + a/k
movdqa %xmm10, %xmm2 # 2 : 1/iak
pshufb %xmm3, %xmm2 # 2 = 1/iak
pxor %xmm0, %xmm2 # 2 = io
movdqa %xmm10, %xmm3 # 3 : 1/jak
pxor %xmm0, %xmm2 # 2 = io
pshufb %xmm4, %xmm3 # 3 = 1/jak
pxor %xmm1, %xmm3 # 3 = jo
movdqu (%r9), %xmm0
pxor %xmm1, %xmm3 # 3 = jo
jnz .Ldec_loop
# middle of last round
@ -464,12 +464,12 @@ _vpaes_schedule_core:
.type _vpaes_schedule_192_smear,\@abi-omnipotent
.align 16
_vpaes_schedule_192_smear:
pshufd \$0x80, %xmm6, %xmm0 # d c 0 0 -> c 0 0 0
pxor %xmm0, %xmm6 # -> c+d c 0 0
pshufd \$0x80, %xmm6, %xmm1 # d c 0 0 -> c 0 0 0
pshufd \$0xFE, %xmm7, %xmm0 # b a _ _ -> b b b a
pxor %xmm1, %xmm6 # -> c+d c 0 0
pxor %xmm1, %xmm1
pxor %xmm0, %xmm6 # -> b+c+d b+c b a
movdqa %xmm6, %xmm0
pxor %xmm1, %xmm1
movhlps %xmm1, %xmm6 # clobber low side with zeros
ret
.size _vpaes_schedule_192_smear,.-_vpaes_schedule_192_smear

46
crypto/arm64cpuid.S Normal file
View File

@ -0,0 +1,46 @@
#include "arm_arch.h"
.text
.arch armv8-a+crypto
.align 5
.global _armv7_neon_probe
.type _armv7_neon_probe,%function
_armv7_neon_probe:
orr v15.16b, v15.16b, v15.16b
ret
.size _armv7_neon_probe,.-_armv7_neon_probe
.global _armv7_tick
.type _armv7_tick,%function
_armv7_tick:
mrs x0, CNTVCT_EL0
ret
.size _armv7_tick,.-_armv7_tick
.global _armv8_aes_probe
.type _armv8_aes_probe,%function
_armv8_aes_probe:
aese v0.16b, v0.16b
ret
.size _armv8_aes_probe,.-_armv8_aes_probe
.global _armv8_sha1_probe
.type _armv8_sha1_probe,%function
_armv8_sha1_probe:
sha1h s0, s0
ret
.size _armv8_sha1_probe,.-_armv8_sha1_probe
.global _armv8_sha256_probe
.type _armv8_sha256_probe,%function
_armv8_sha256_probe:
sha256su0 v0.4s, v0.4s
ret
.size _armv8_sha256_probe,.-_armv8_sha256_probe
.global _armv8_pmull_probe
.type _armv8_pmull_probe,%function
_armv8_pmull_probe:
pmull v0.1q, v0.1d, v0.1d
ret
.size _armv8_pmull_probe,.-_armv8_pmull_probe

View File

@ -10,13 +10,24 @@
# define __ARMEL__
# endif
# elif defined(__GNUC__)
# if defined(__aarch64__)
# define __ARM_ARCH__ 8
# if __BYTE_ORDER__==__ORDER_BIG_ENDIAN__
# define __ARMEB__
# else
# define __ARMEL__
# endif
/*
* Why doesn't gcc define __ARM_ARCH__? Instead it defines
* bunch of below macros. See all_architectires[] table in
* gcc/config/arm/arm.c. On a side note it defines
* __ARMEL__/__ARMEB__ for little-/big-endian.
*/
# if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
# elif defined(__ARM_ARCH)
# define __ARM_ARCH__ __ARM_ARCH
# elif defined(__ARM_ARCH_8A__)
# define __ARM_ARCH__ 8
# elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
defined(__ARM_ARCH_7R__)|| defined(__ARM_ARCH_7M__) || \
defined(__ARM_ARCH_7EM__)
# define __ARM_ARCH__ 7
@ -41,11 +52,27 @@
# include <openssl/fipssyms.h>
# endif
# if !__ASSEMBLER__
extern unsigned int OPENSSL_armcap_P;
# define ARMV7_NEON (1<<0)
# define ARMV7_TICK (1<<1)
# if !defined(__ARM_MAX_ARCH__)
# define __ARM_MAX_ARCH__ __ARM_ARCH__
# endif
# if __ARM_MAX_ARCH__<__ARM_ARCH__
# error "__ARM_MAX_ARCH__ can't be less than __ARM_ARCH__"
# elif __ARM_MAX_ARCH__!=__ARM_ARCH__
# if __ARM_ARCH__<7 && __ARM_MAX_ARCH__>=7 && defined(__ARMEB__)
# error "can't build universal big-endian binary"
# endif
# endif
# if !__ASSEMBLER__
extern unsigned int OPENSSL_armcap_P;
# endif
# define ARMV7_NEON (1<<0)
# define ARMV7_TICK (1<<1)
# define ARMV8_AES (1<<2)
# define ARMV8_SHA1 (1<<3)
# define ARMV8_SHA256 (1<<4)
# define ARMV8_PMULL (1<<5)
#endif

View File

@ -7,8 +7,18 @@
#include "arm_arch.h"
unsigned int OPENSSL_armcap_P;
unsigned int OPENSSL_armcap_P = 0;
#if __ARM_MAX_ARCH__<7
void OPENSSL_cpuid_setup(void)
{
}
unsigned long OPENSSL_rdtsc(void)
{
return 0;
}
#else
static sigset_t all_masked;
static sigjmp_buf ill_jmp;
@ -22,9 +32,13 @@ static void ill_handler(int sig)
* ARM compilers support inline assembler...
*/
void _armv7_neon_probe(void);
unsigned int _armv7_tick(void);
void _armv8_aes_probe(void);
void _armv8_sha1_probe(void);
void _armv8_sha256_probe(void);
void _armv8_pmull_probe(void);
unsigned long _armv7_tick(void);
unsigned int OPENSSL_rdtsc(void)
unsigned long OPENSSL_rdtsc(void)
{
if (OPENSSL_armcap_P & ARMV7_TICK)
return _armv7_tick();
@ -32,9 +46,44 @@ unsigned int OPENSSL_rdtsc(void)
return 0;
}
#if defined(__GNUC__) && __GNUC__>=2
/*
* Use a weak reference to getauxval() so we can use it if it is available but
* don't break the build if it is not.
*/
# if defined(__GNUC__) && __GNUC__>=2
void OPENSSL_cpuid_setup(void) __attribute__ ((constructor));
#endif
extern unsigned long getauxval(unsigned long type) __attribute__ ((weak));
# else
static unsigned long (*getauxval) (unsigned long) = NULL;
# endif
/*
* ARM puts the the feature bits for Crypto Extensions in AT_HWCAP2, whereas
* AArch64 used AT_HWCAP.
*/
# if defined(__arm__) || defined (__arm)
# define HWCAP 16
/* AT_HWCAP */
# define HWCAP_NEON (1 << 12)
# define HWCAP_CE 26
/* AT_HWCAP2 */
# define HWCAP_CE_AES (1 << 0)
# define HWCAP_CE_PMULL (1 << 1)
# define HWCAP_CE_SHA1 (1 << 2)
# define HWCAP_CE_SHA256 (1 << 3)
# elif defined(__aarch64__)
# define HWCAP 16
/* AT_HWCAP */
# define HWCAP_NEON (1 << 1)
# define HWCAP_CE HWCAP
# define HWCAP_CE_AES (1 << 3)
# define HWCAP_CE_PMULL (1 << 4)
# define HWCAP_CE_SHA1 (1 << 5)
# define HWCAP_CE_SHA256 (1 << 6)
# endif
void OPENSSL_cpuid_setup(void)
{
char *e;
@ -47,7 +96,7 @@ void OPENSSL_cpuid_setup(void)
trigger = 1;
if ((e = getenv("OPENSSL_armcap"))) {
OPENSSL_armcap_P = strtoul(e, NULL, 0);
OPENSSL_armcap_P = (unsigned int)strtoul(e, NULL, 0);
return;
}
@ -67,9 +116,42 @@ void OPENSSL_cpuid_setup(void)
sigprocmask(SIG_SETMASK, &ill_act.sa_mask, &oset);
sigaction(SIGILL, &ill_act, &ill_oact);
if (sigsetjmp(ill_jmp, 1) == 0) {
if (getauxval != NULL) {
if (getauxval(HWCAP) & HWCAP_NEON) {
unsigned long hwcap = getauxval(HWCAP_CE);
OPENSSL_armcap_P |= ARMV7_NEON;
if (hwcap & HWCAP_CE_AES)
OPENSSL_armcap_P |= ARMV8_AES;
if (hwcap & HWCAP_CE_PMULL)
OPENSSL_armcap_P |= ARMV8_PMULL;
if (hwcap & HWCAP_CE_SHA1)
OPENSSL_armcap_P |= ARMV8_SHA1;
if (hwcap & HWCAP_CE_SHA256)
OPENSSL_armcap_P |= ARMV8_SHA256;
}
} else if (sigsetjmp(ill_jmp, 1) == 0) {
_armv7_neon_probe();
OPENSSL_armcap_P |= ARMV7_NEON;
if (sigsetjmp(ill_jmp, 1) == 0) {
_armv8_pmull_probe();
OPENSSL_armcap_P |= ARMV8_PMULL | ARMV8_AES;
} else if (sigsetjmp(ill_jmp, 1) == 0) {
_armv8_aes_probe();
OPENSSL_armcap_P |= ARMV8_AES;
}
if (sigsetjmp(ill_jmp, 1) == 0) {
_armv8_sha1_probe();
OPENSSL_armcap_P |= ARMV8_SHA1;
}
if (sigsetjmp(ill_jmp, 1) == 0) {
_armv8_sha256_probe();
OPENSSL_armcap_P |= ARMV8_SHA256;
}
}
if (sigsetjmp(ill_jmp, 1) == 0) {
_armv7_tick();
@ -79,3 +161,4 @@ void OPENSSL_cpuid_setup(void)
sigaction(SIGILL, &ill_oact, NULL);
sigprocmask(SIG_SETMASK, &oset, NULL);
}
#endif

View File

@ -4,20 +4,6 @@
.code 32
.align 5
.global _armv7_neon_probe
.type _armv7_neon_probe,%function
_armv7_neon_probe:
.word 0xf26ee1fe @ vorr q15,q15,q15
.word 0xe12fff1e @ bx lr
.size _armv7_neon_probe,.-_armv7_neon_probe
.global _armv7_tick
.type _armv7_tick,%function
_armv7_tick:
mrc p15,0,r0,c9,c13,0
.word 0xe12fff1e @ bx lr
.size _armv7_tick,.-_armv7_tick
.global OPENSSL_atomic_add
.type OPENSSL_atomic_add,%function
OPENSSL_atomic_add:
@ -28,7 +14,7 @@ OPENSSL_atomic_add:
cmp r2,#0
bne .Ladd
mov r0,r3
.word 0xe12fff1e @ bx lr
bx lr
#else
stmdb sp!,{r4-r6,lr}
ldr r2,.Lspinlock
@ -81,62 +67,131 @@ OPENSSL_cleanse:
adds r1,r1,#4
bne .Little
.Lcleanse_done:
#if __ARM_ARCH__>=5
bx lr
#else
tst lr,#1
moveq pc,lr
.word 0xe12fff1e @ bx lr
#endif
.size OPENSSL_cleanse,.-OPENSSL_cleanse
#if __ARM_MAX_ARCH__>=7
.arch armv7-a
.fpu neon
.align 5
.global _armv7_neon_probe
.type _armv7_neon_probe,%function
_armv7_neon_probe:
vorr q0,q0,q0
bx lr
.size _armv7_neon_probe,.-_armv7_neon_probe
.global _armv7_tick
.type _armv7_tick,%function
_armv7_tick:
mrrc p15,1,r0,r1,c14 @ CNTVCT
bx lr
.size _armv7_tick,.-_armv7_tick
.global _armv8_aes_probe
.type _armv8_aes_probe,%function
_armv8_aes_probe:
.byte 0x00,0x03,0xb0,0xf3 @ aese.8 q0,q0
bx lr
.size _armv8_aes_probe,.-_armv8_aes_probe
.global _armv8_sha1_probe
.type _armv8_sha1_probe,%function
_armv8_sha1_probe:
.byte 0x40,0x0c,0x00,0xf2 @ sha1c.32 q0,q0,q0
bx lr
.size _armv8_sha1_probe,.-_armv8_sha1_probe
.global _armv8_sha256_probe
.type _armv8_sha256_probe,%function
_armv8_sha256_probe:
.byte 0x40,0x0c,0x00,0xf3 @ sha256h.32 q0,q0,q0
bx lr
.size _armv8_sha256_probe,.-_armv8_sha256_probe
.global _armv8_pmull_probe
.type _armv8_pmull_probe,%function
_armv8_pmull_probe:
.byte 0x00,0x0e,0xa0,0xf2 @ vmull.p64 q0,d0,d0
bx lr
.size _armv8_pmull_probe,.-_armv8_pmull_probe
#endif
.global OPENSSL_wipe_cpu
.type OPENSSL_wipe_cpu,%function
OPENSSL_wipe_cpu:
#if __ARM_MAX_ARCH__>=7
ldr r0,.LOPENSSL_armcap
adr r1,.LOPENSSL_armcap
ldr r0,[r1,r0]
#endif
eor r2,r2,r2
eor r3,r3,r3
eor ip,ip,ip
#if __ARM_MAX_ARCH__>=7
tst r0,#1
beq .Lwipe_done
.word 0xf3000150 @ veor q0, q0, q0
.word 0xf3022152 @ veor q1, q1, q1
.word 0xf3044154 @ veor q2, q2, q2
.word 0xf3066156 @ veor q3, q3, q3
.word 0xf34001f0 @ veor q8, q8, q8
.word 0xf34221f2 @ veor q9, q9, q9
.word 0xf34441f4 @ veor q10, q10, q10
.word 0xf34661f6 @ veor q11, q11, q11
.word 0xf34881f8 @ veor q12, q12, q12
.word 0xf34aa1fa @ veor q13, q13, q13
.word 0xf34cc1fc @ veor q14, q14, q14
.word 0xf34ee1fe @ veor q15, q15, q15
veor q0, q0, q0
veor q1, q1, q1
veor q2, q2, q2
veor q3, q3, q3
veor q8, q8, q8
veor q9, q9, q9
veor q10, q10, q10
veor q11, q11, q11
veor q12, q12, q12
veor q13, q13, q13
veor q14, q14, q14
veor q15, q15, q15
.Lwipe_done:
#endif
mov r0,sp
#if __ARM_ARCH__>=5
bx lr
#else
tst lr,#1
moveq pc,lr
.word 0xe12fff1e @ bx lr
#endif
.size OPENSSL_wipe_cpu,.-OPENSSL_wipe_cpu
.global OPENSSL_instrument_bus
.type OPENSSL_instrument_bus,%function
OPENSSL_instrument_bus:
eor r0,r0,r0
#if __ARM_ARCH__>=5
bx lr
#else
tst lr,#1
moveq pc,lr
.word 0xe12fff1e @ bx lr
#endif
.size OPENSSL_instrument_bus,.-OPENSSL_instrument_bus
.global OPENSSL_instrument_bus2
.type OPENSSL_instrument_bus2,%function
OPENSSL_instrument_bus2:
eor r0,r0,r0
#if __ARM_ARCH__>=5
bx lr
#else
tst lr,#1
moveq pc,lr
.word 0xe12fff1e @ bx lr
#endif
.size OPENSSL_instrument_bus2,.-OPENSSL_instrument_bus2
.align 5
#if __ARM_MAX_ARCH__>=7
.LOPENSSL_armcap:
.word OPENSSL_armcap_P-.LOPENSSL_armcap
#endif
#if __ARM_ARCH__>=6
.align 5
#else

View File

@ -176,7 +176,7 @@ a_gentm.o: ../../include/openssl/err.h ../../include/openssl/lhash.h
a_gentm.o: ../../include/openssl/opensslconf.h ../../include/openssl/opensslv.h
a_gentm.o: ../../include/openssl/ossl_typ.h ../../include/openssl/safestack.h
a_gentm.o: ../../include/openssl/stack.h ../../include/openssl/symhacks.h
a_gentm.o: ../cryptlib.h ../o_time.h a_gentm.c
a_gentm.o: ../cryptlib.h ../o_time.h a_gentm.c asn1_locl.h
a_i2d_fp.o: ../../e_os.h ../../include/openssl/asn1.h
a_i2d_fp.o: ../../include/openssl/bio.h ../../include/openssl/buffer.h
a_i2d_fp.o: ../../include/openssl/crypto.h ../../include/openssl/e_os2.h
@ -277,6 +277,7 @@ a_time.o: ../../include/openssl/lhash.h ../../include/openssl/opensslconf.h
a_time.o: ../../include/openssl/opensslv.h ../../include/openssl/ossl_typ.h
a_time.o: ../../include/openssl/safestack.h ../../include/openssl/stack.h
a_time.o: ../../include/openssl/symhacks.h ../cryptlib.h ../o_time.h a_time.c
a_time.o: asn1_locl.h
a_type.o: ../../e_os.h ../../include/openssl/asn1.h
a_type.o: ../../include/openssl/asn1t.h ../../include/openssl/bio.h
a_type.o: ../../include/openssl/buffer.h ../../include/openssl/crypto.h
@ -293,7 +294,7 @@ a_utctm.o: ../../include/openssl/err.h ../../include/openssl/lhash.h
a_utctm.o: ../../include/openssl/opensslconf.h ../../include/openssl/opensslv.h
a_utctm.o: ../../include/openssl/ossl_typ.h ../../include/openssl/safestack.h
a_utctm.o: ../../include/openssl/stack.h ../../include/openssl/symhacks.h
a_utctm.o: ../cryptlib.h ../o_time.h a_utctm.c
a_utctm.o: ../cryptlib.h ../o_time.h a_utctm.c asn1_locl.h
a_utf8.o: ../../e_os.h ../../include/openssl/asn1.h ../../include/openssl/bio.h
a_utf8.o: ../../include/openssl/buffer.h ../../include/openssl/crypto.h
a_utf8.o: ../../include/openssl/e_os2.h ../../include/openssl/err.h

View File

@ -65,6 +65,7 @@
#include "cryptlib.h"
#include "o_time.h"
#include <openssl/asn1.h>
#include "asn1_locl.h"
#if 0
@ -117,7 +118,7 @@ ASN1_GENERALIZEDTIME *d2i_ASN1_GENERALIZEDTIME(ASN1_GENERALIZEDTIME **a,
#endif
int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
int asn1_generalizedtime_to_tm(struct tm *tm, const ASN1_GENERALIZEDTIME *d)
{
static const int min[9] = { 0, 0, 1, 1, 0, 0, 0, 0, 0 };
static const int max[9] = { 99, 99, 12, 31, 23, 59, 59, 12, 59 };
@ -139,6 +140,8 @@ int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
for (i = 0; i < 7; i++) {
if ((i == 6) && ((a[o] == 'Z') || (a[o] == '+') || (a[o] == '-'))) {
i++;
if (tm)
tm->tm_sec = 0;
break;
}
if ((a[o] < '0') || (a[o] > '9'))
@ -155,6 +158,31 @@ int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
if ((n < min[i]) || (n > max[i]))
goto err;
if (tm) {
switch (i) {
case 0:
tm->tm_year = n * 100 - 1900;
break;
case 1:
tm->tm_year += n;
break;
case 2:
tm->tm_mon = n - 1;
break;
case 3:
tm->tm_mday = n;
break;
case 4:
tm->tm_hour = n;
break;
case 5:
tm->tm_min = n;
break;
case 6:
tm->tm_sec = n;
break;
}
}
}
/*
* Optional fractional seconds: decimal point followed by one or more
@ -174,6 +202,7 @@ int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
if (a[o] == 'Z')
o++;
else if ((a[o] == '+') || (a[o] == '-')) {
int offsign = a[o] == '-' ? -1 : 1, offset = 0;
o++;
if (o + 4 > l)
goto err;
@ -187,9 +216,17 @@ int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
n = (n * 10) + a[o] - '0';
if ((n < min[i]) || (n > max[i]))
goto err;
if (tm) {
if (i == 7)
offset = n * 3600;
else if (i == 8)
offset += n * 60;
}
o++;
}
} else {
if (offset && !OPENSSL_gmtime_adj(tm, 0, offset * offsign))
return 0;
} else if (a[o]) {
/* Missing time zone information. */
goto err;
}
@ -198,6 +235,11 @@ int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *d)
return (0);
}
int ASN1_GENERALIZEDTIME_check(const ASN1_GENERALIZEDTIME *d)
{
return asn1_generalizedtime_to_tm(NULL, d);
}
int ASN1_GENERALIZEDTIME_set_string(ASN1_GENERALIZEDTIME *s, const char *str)
{
ASN1_GENERALIZEDTIME t;

View File

@ -66,6 +66,7 @@
#include "cryptlib.h"
#include "o_time.h"
#include <openssl/asn1t.h>
#include "asn1_locl.h"
IMPLEMENT_ASN1_MSTRING(ASN1_TIME, B_ASN1_TIME)
@ -196,3 +197,32 @@ int ASN1_TIME_set_string(ASN1_TIME *s, const char *str)
return 1;
}
static int asn1_time_to_tm(struct tm *tm, const ASN1_TIME *t)
{
if (t == NULL) {
time_t now_t;
time(&now_t);
if (OPENSSL_gmtime(&now_t, tm))
return 1;
return 0;
}
if (t->type == V_ASN1_UTCTIME)
return asn1_utctime_to_tm(tm, t);
else if (t->type == V_ASN1_GENERALIZEDTIME)
return asn1_generalizedtime_to_tm(tm, t);
return 0;
}
int ASN1_TIME_diff(int *pday, int *psec,
const ASN1_TIME *from, const ASN1_TIME *to)
{
struct tm tm_from, tm_to;
if (!asn1_time_to_tm(&tm_from, from))
return 0;
if (!asn1_time_to_tm(&tm_to, to))
return 0;
return OPENSSL_gmtime_diff(pday, psec, &tm_from, &tm_to);
}

View File

@ -61,6 +61,7 @@
#include "cryptlib.h"
#include "o_time.h"
#include <openssl/asn1.h>
#include "asn1_locl.h"
#if 0
int i2d_ASN1_UTCTIME(ASN1_UTCTIME *a, unsigned char **pp)
@ -109,7 +110,7 @@ ASN1_UTCTIME *d2i_ASN1_UTCTIME(ASN1_UTCTIME **a, unsigned char **pp,
#endif
int ASN1_UTCTIME_check(ASN1_UTCTIME *d)
int asn1_utctime_to_tm(struct tm *tm, const ASN1_UTCTIME *d)
{
static const int min[8] = { 0, 1, 1, 0, 0, 0, 0, 0 };
static const int max[8] = { 99, 12, 31, 23, 59, 59, 12, 59 };
@ -127,6 +128,8 @@ int ASN1_UTCTIME_check(ASN1_UTCTIME *d)
for (i = 0; i < 6; i++) {
if ((i == 5) && ((a[o] == 'Z') || (a[o] == '+') || (a[o] == '-'))) {
i++;
if (tm)
tm->tm_sec = 0;
break;
}
if ((a[o] < '0') || (a[o] > '9'))
@ -143,10 +146,33 @@ int ASN1_UTCTIME_check(ASN1_UTCTIME *d)
if ((n < min[i]) || (n > max[i]))
goto err;
if (tm) {
switch (i) {
case 0:
tm->tm_year = n < 50 ? n + 100 : n;
break;
case 1:
tm->tm_mon = n - 1;
break;
case 2:
tm->tm_mday = n;
break;
case 3:
tm->tm_hour = n;
break;
case 4:
tm->tm_min = n;
break;
case 5:
tm->tm_sec = n;
break;
}
}
}
if (a[o] == 'Z')
o++;
else if ((a[o] == '+') || (a[o] == '-')) {
int offsign = a[o] == '-' ? -1 : 1, offset = 0;
o++;
if (o + 4 > l)
goto err;
@ -160,12 +186,25 @@ int ASN1_UTCTIME_check(ASN1_UTCTIME *d)
n = (n * 10) + a[o] - '0';
if ((n < min[i]) || (n > max[i]))
goto err;
if (tm) {
if (i == 6)
offset = n * 3600;
else if (i == 7)
offset += n * 60;
}
o++;
}
if (offset && !OPENSSL_gmtime_adj(tm, 0, offset * offsign))
return 0;
}
return (o == l);
return o == l;
err:
return (0);
return 0;
}
int ASN1_UTCTIME_check(const ASN1_UTCTIME *d)
{
return asn1_utctime_to_tm(NULL, d);
}
int ASN1_UTCTIME_set_string(ASN1_UTCTIME *s, const char *str)
@ -249,43 +288,26 @@ ASN1_UTCTIME *ASN1_UTCTIME_adj(ASN1_UTCTIME *s, time_t t,
int ASN1_UTCTIME_cmp_time_t(const ASN1_UTCTIME *s, time_t t)
{
struct tm *tm;
struct tm data;
int offset;
int year;
struct tm stm, ttm;
int day, sec;
#define g2(p) (((p)[0]-'0')*10+(p)[1]-'0')
if (s->data[12] == 'Z')
offset = 0;
else {
offset = g2(s->data + 13) * 60 + g2(s->data + 15);
if (s->data[12] == '-')
offset = -offset;
}
t -= offset * 60; /* FIXME: may overflow in extreme cases */
tm = OPENSSL_gmtime(&t, &data);
/*
* NB: -1, 0, 1 already valid return values so use -2 to indicate error.
*/
if (tm == NULL)
if (!asn1_utctime_to_tm(&stm, s))
return -2;
#define return_cmp(a,b) if ((a)<(b)) return -1; else if ((a)>(b)) return 1
year = g2(s->data);
if (year < 50)
year += 100;
return_cmp(year, tm->tm_year);
return_cmp(g2(s->data + 2) - 1, tm->tm_mon);
return_cmp(g2(s->data + 4), tm->tm_mday);
return_cmp(g2(s->data + 6), tm->tm_hour);
return_cmp(g2(s->data + 8), tm->tm_min);
return_cmp(g2(s->data + 10), tm->tm_sec);
#undef g2
#undef return_cmp
if (!OPENSSL_gmtime(&t, &ttm))
return -2;
if (!OPENSSL_gmtime_diff(&day, &sec, &ttm, &stm))
return -2;
if (day > 0)
return 1;
if (day < 0)
return -1;
if (sec > 0)
return 1;
if (sec < 0)
return -1;
return 0;
}

View File

@ -68,6 +68,7 @@
extern const EVP_PKEY_ASN1_METHOD rsa_asn1_meths[];
extern const EVP_PKEY_ASN1_METHOD dsa_asn1_meths[];
extern const EVP_PKEY_ASN1_METHOD dh_asn1_meth;
extern const EVP_PKEY_ASN1_METHOD dhx_asn1_meth;
extern const EVP_PKEY_ASN1_METHOD eckey_asn1_meth;
extern const EVP_PKEY_ASN1_METHOD hmac_asn1_meth;
extern const EVP_PKEY_ASN1_METHOD cmac_asn1_meth;
@ -92,7 +93,10 @@ static const EVP_PKEY_ASN1_METHOD *standard_methods[] = {
&eckey_asn1_meth,
#endif
&hmac_asn1_meth,
&cmac_asn1_meth
&cmac_asn1_meth,
#ifndef OPENSSL_NO_DH
&dhx_asn1_meth
#endif
};
typedef int sk_cmp_fn_type(const char *const *a, const char *const *b);
@ -460,3 +464,21 @@ void EVP_PKEY_asn1_set_ctrl(EVP_PKEY_ASN1_METHOD *ameth,
{
ameth->pkey_ctrl = pkey_ctrl;
}
void EVP_PKEY_asn1_set_item(EVP_PKEY_ASN1_METHOD *ameth,
int (*item_verify) (EVP_MD_CTX *ctx,
const ASN1_ITEM *it,
void *asn,
X509_ALGOR *a,
ASN1_BIT_STRING *sig,
EVP_PKEY *pkey),
int (*item_sign) (EVP_MD_CTX *ctx,
const ASN1_ITEM *it,
void *asn,
X509_ALGOR *alg1,
X509_ALGOR *alg2,
ASN1_BIT_STRING *sig))
{
ameth->item_sign = item_sign;
ameth->item_verify = item_verify;
}

View File

@ -207,13 +207,13 @@ typedef struct asn1_const_ctx_st {
# define ASN1_OBJECT_FLAG_CRITICAL 0x02/* critical x509v3 object id */
# define ASN1_OBJECT_FLAG_DYNAMIC_STRINGS 0x04/* internal use */
# define ASN1_OBJECT_FLAG_DYNAMIC_DATA 0x08/* internal use */
typedef struct asn1_object_st {
struct asn1_object_st {
const char *sn, *ln;
int nid;
int length;
const unsigned char *data; /* data remains const after init */
int flags; /* Should we free this one */
} ASN1_OBJECT;
};
# define ASN1_STRING_FLAG_BITS_LEFT 0x08/* Set if 0x07 has bits left value */
/*
@ -843,7 +843,7 @@ int ASN1_INTEGER_cmp(const ASN1_INTEGER *x, const ASN1_INTEGER *y);
DECLARE_ASN1_FUNCTIONS(ASN1_ENUMERATED)
int ASN1_UTCTIME_check(ASN1_UTCTIME *a);
int ASN1_UTCTIME_check(const ASN1_UTCTIME *a);
ASN1_UTCTIME *ASN1_UTCTIME_set(ASN1_UTCTIME *s, time_t t);
ASN1_UTCTIME *ASN1_UTCTIME_adj(ASN1_UTCTIME *s, time_t t,
int offset_day, long offset_sec);
@ -853,13 +853,15 @@ int ASN1_UTCTIME_cmp_time_t(const ASN1_UTCTIME *s, time_t t);
time_t ASN1_UTCTIME_get(const ASN1_UTCTIME *s);
# endif
int ASN1_GENERALIZEDTIME_check(ASN1_GENERALIZEDTIME *a);
int ASN1_GENERALIZEDTIME_check(const ASN1_GENERALIZEDTIME *a);
ASN1_GENERALIZEDTIME *ASN1_GENERALIZEDTIME_set(ASN1_GENERALIZEDTIME *s,
time_t t);
ASN1_GENERALIZEDTIME *ASN1_GENERALIZEDTIME_adj(ASN1_GENERALIZEDTIME *s,
time_t t, int offset_day,
long offset_sec);
int ASN1_GENERALIZEDTIME_set_string(ASN1_GENERALIZEDTIME *s, const char *str);
int ASN1_TIME_diff(int *pday, int *psec,
const ASN1_TIME *from, const ASN1_TIME *to);
DECLARE_ASN1_FUNCTIONS(ASN1_OCTET_STRING)
ASN1_OCTET_STRING *ASN1_OCTET_STRING_dup(const ASN1_OCTET_STRING *a);

View File

@ -59,6 +59,9 @@
/* Internal ASN1 structures and functions: not for application use */
int asn1_utctime_to_tm(struct tm *tm, const ASN1_UTCTIME *d);
int asn1_generalizedtime_to_tm(struct tm *tm, const ASN1_GENERALIZEDTIME *d);
/* ASN1 print context structure */
struct asn1_pctx_st {

View File

@ -228,6 +228,21 @@ int X509_print_ex(BIO *bp, X509 *x, unsigned long nmflags,
}
}
if (!(cflag & X509_FLAG_NO_IDS)) {
if (ci->issuerUID) {
if (BIO_printf(bp, "%8sIssuer Unique ID: ", "") <= 0)
goto err;
if (!X509_signature_dump(bp, ci->issuerUID, 12))
goto err;
}
if (ci->subjectUID) {
if (BIO_printf(bp, "%8sSubject Unique ID: ", "") <= 0)
goto err;
if (!X509_signature_dump(bp, ci->subjectUID, 12))
goto err;
}
}
if (!(cflag & X509_FLAG_NO_EXTENSIONS))
X509V3_extensions_print(bp, "X509v3 extensions",
ci->extensions, cflag, 8);

View File

@ -58,8 +58,8 @@
#include <stdio.h>
#include "cryptlib.h"
#include "asn1_locl.h"
#include <openssl/asn1t.h>
#include "asn1_locl.h"
#include <openssl/x509.h>
#include <openssl/x509v3.h>
@ -341,6 +341,8 @@ ASN1_SEQUENCE_ref(X509_CRL, crl_cb, CRYPTO_LOCK_X509_CRL) = {
IMPLEMENT_ASN1_FUNCTIONS(X509_REVOKED)
IMPLEMENT_ASN1_DUP_FUNCTION(X509_REVOKED)
IMPLEMENT_ASN1_FUNCTIONS(X509_CRL_INFO)
IMPLEMENT_ASN1_FUNCTIONS(X509_CRL)

View File

@ -208,3 +208,23 @@ int i2d_X509_AUX(X509 *a, unsigned char **pp)
length += i2d_X509_CERT_AUX(a->aux, pp);
return length;
}
int i2d_re_X509_tbs(X509 *x, unsigned char **pp)
{
x->cert_info->enc.modified = 1;
return i2d_X509_CINF(x->cert_info, pp);
}
void X509_get0_signature(ASN1_BIT_STRING **psig, X509_ALGOR **palg,
const X509 *x)
{
if (psig)
*psig = x->signature;
if (palg)
*palg = x->sig_alg;
}
int X509_get_signature_nid(const X509 *x)
{
return OBJ_obj2nid(x->sig_alg->algorithm);
}

View File

@ -182,3 +182,28 @@ int BIO_dump_indent(BIO *bp, const char *s, int len, int indent)
{
return BIO_dump_indent_cb(write_bio, bp, s, len, indent);
}
int BIO_hex_string(BIO *out, int indent, int width, unsigned char *data,
int datalen)
{
int i, j = 0;
if (datalen < 1)
return 1;
for (i = 0; i < datalen - 1; i++) {
if (i && !j)
BIO_printf(out, "%*s", indent, "");
BIO_printf(out, "%02X:", data[i]);
j = (j + 1) % width;
if (!j)
BIO_printf(out, "\n");
}
if (i && !j)
BIO_printf(out, "%*s", indent, "");
BIO_printf(out, "%02X", data[datalen - 1]);
return 1;
}

View File

@ -225,13 +225,17 @@ int BIO_get_port(const char *str, unsigned short *port_ptr)
int BIO_sock_error(int sock)
{
int j, i;
int size;
union {
size_t s;
int i;
} size;
# if defined(OPENSSL_SYS_BEOS_R5)
return 0;
# endif
size = sizeof(int);
/* heuristic way to adapt for platforms that expect 64-bit optlen */
size.s = 0, size.i = sizeof(j);
/*
* Note: under Windows the third parameter is of type (char *) whereas
* under other systems it is (void *) if you don't have a cast it will

View File

@ -174,6 +174,7 @@ extern "C" {
# define BIO_CTRL_DGRAM_SET_NEXT_TIMEOUT 45/* Next DTLS handshake timeout
* to adjust socket timeouts */
# define BIO_CTRL_DGRAM_SET_DONT_FRAG 48
# define BIO_CTRL_DGRAM_GET_MTU_OVERHEAD 49
@ -725,6 +726,9 @@ int BIO_dump_indent(BIO *b, const char *bytes, int len, int indent);
int BIO_dump_fp(FILE *fp, const char *s, int len);
int BIO_dump_indent_fp(FILE *fp, const char *s, int len, int indent);
# endif
int BIO_hex_string(BIO *out, int indent, int width, unsigned char *data,
int datalen);
struct hostent *BIO_gethostbyname(const char *name);
/*-
* We might want a thread-safe interface too:
@ -761,8 +765,8 @@ int BIO_dgram_sctp_wait_for_dry(BIO *b);
int BIO_dgram_sctp_msg_waiting(BIO *b);
# endif
BIO *BIO_new_fd(int fd, int close_flag);
BIO *BIO_new_connect(char *host_port);
BIO *BIO_new_accept(char *host_port);
BIO *BIO_new_connect(const char *host_port);
BIO *BIO_new_accept(const char *host_port);
int BIO_new_bio_pair(BIO **bio1, size_t writebuf1,
BIO **bio2, size_t writebuf2);

View File

@ -1,6 +1,6 @@
/* crypto/bio/bio_err.c */
/* ====================================================================
* Copyright (c) 1999-2011 The OpenSSL Project. All rights reserved.
* Copyright (c) 1999-2015 The OpenSSL Project. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions

View File

@ -445,7 +445,7 @@ static int acpt_puts(BIO *bp, const char *str)
return (ret);
}
BIO *BIO_new_accept(char *str)
BIO *BIO_new_accept(const char *str)
{
BIO *ret;

View File

@ -585,7 +585,7 @@ static int conn_puts(BIO *bp, const char *str)
return (ret);
}
BIO *BIO_new_connect(char *str)
BIO *BIO_new_connect(const char *str)
{
BIO *ret;

View File

@ -65,7 +65,7 @@
#include <openssl/bio.h>
#ifndef OPENSSL_NO_DGRAM
# if defined(OPENSSL_SYS_WIN32) || defined(OPENSSL_SYS_VMS)
# if defined(OPENSSL_SYS_VMS)
# include <sys/timeb.h>
# endif
@ -80,6 +80,10 @@
# define IP_MTU 14 /* linux is lame */
# endif
# if OPENSSL_USE_IPV6 && !defined(IPPROTO_IPV6)
# define IPPROTO_IPV6 41 /* windows is lame */
# endif
# if defined(__FreeBSD__) && defined(IN6_IS_ADDR_V4MAPPED)
/* Standard definition causes type-punning problems. */
# undef IN6_IS_ADDR_V4MAPPED
@ -496,8 +500,8 @@ static long dgram_ctrl(BIO *b, int cmd, long num, void *ptr)
int *ip;
struct sockaddr *to = NULL;
bio_dgram_data *data = NULL;
# if defined(OPENSSL_SYS_LINUX) && (defined(IP_MTU_DISCOVER) || defined(IP_MTU))
int sockopt_val = 0;
# if defined(OPENSSL_SYS_LINUX) && (defined(IP_MTU_DISCOVER) || defined(IP_MTU))
socklen_t sockopt_len; /* assume that system supporting IP_MTU is
* modern enough to define socklen_t */
socklen_t addr_len;
@ -882,6 +886,61 @@ static long dgram_ctrl(BIO *b, int cmd, long num, void *ptr)
ret = 0;
break;
# endif
case BIO_CTRL_DGRAM_SET_DONT_FRAG:
sockopt_val = num ? 1 : 0;
switch (data->peer.sa.sa_family) {
case AF_INET:
# if defined(IP_DONTFRAG)
if ((ret = setsockopt(b->num, IPPROTO_IP, IP_DONTFRAG,
&sockopt_val, sizeof(sockopt_val))) < 0) {
perror("setsockopt");
ret = -1;
}
# elif defined(OPENSSL_SYS_LINUX) && defined(IP_MTU_DISCOVER) && defined (IP_PMTUDISC_PROBE)
if ((sockopt_val = num ? IP_PMTUDISC_PROBE : IP_PMTUDISC_DONT),
(ret = setsockopt(b->num, IPPROTO_IP, IP_MTU_DISCOVER,
&sockopt_val, sizeof(sockopt_val))) < 0) {
perror("setsockopt");
ret = -1;
}
# elif defined(OPENSSL_SYS_WINDOWS) && defined(IP_DONTFRAGMENT)
if ((ret = setsockopt(b->num, IPPROTO_IP, IP_DONTFRAGMENT,
(const char *)&sockopt_val,
sizeof(sockopt_val))) < 0) {
perror("setsockopt");
ret = -1;
}
# else
ret = -1;
# endif
break;
# if OPENSSL_USE_IPV6
case AF_INET6:
# if defined(IPV6_DONTFRAG)
if ((ret = setsockopt(b->num, IPPROTO_IPV6, IPV6_DONTFRAG,
(const void *)&sockopt_val,
sizeof(sockopt_val))) < 0) {
perror("setsockopt");
ret = -1;
}
# elif defined(OPENSSL_SYS_LINUX) && defined(IPV6_MTUDISCOVER)
if ((sockopt_val = num ? IP_PMTUDISC_PROBE : IP_PMTUDISC_DONT),
(ret = setsockopt(b->num, IPPROTO_IPV6, IPV6_MTU_DISCOVER,
&sockopt_val, sizeof(sockopt_val))) < 0) {
perror("setsockopt");
ret = -1;
}
# else
ret = -1;
# endif
break;
# endif
default:
ret = -1;
break;
}
break;
case BIO_CTRL_DGRAM_GET_MTU_OVERHEAD:
ret = dgram_get_mtu_overhead(data);
break;
@ -1995,11 +2054,22 @@ int BIO_dgram_non_fatal_error(int err)
static void get_current_time(struct timeval *t)
{
# ifdef OPENSSL_SYS_WIN32
struct _timeb tb;
_ftime(&tb);
t->tv_sec = (long)tb.time;
t->tv_usec = (long)tb.millitm * 1000;
# if defined(_WIN32)
SYSTEMTIME st;
union {
unsigned __int64 ul;
FILETIME ft;
} now;
GetSystemTime(&st);
SystemTimeToFileTime(&st, &now.ft);
# ifdef __MINGW32__
now.ul -= 116444736000000000ULL;
# else
now.ul -= 116444736000000000UI64; /* re-bias to 1/1/1970 */
# endif
t->tv_sec = (long)(now.ul / 10000000);
t->tv_usec = ((int)(now.ul % 10000000)) / 10;
# elif defined(OPENSSL_SYS_VMS)
struct timeb tb;
ftime(&tb);

View File

@ -63,9 +63,27 @@
#if defined(OPENSSL_NO_POSIX_IO)
/*
* One can argue that one should implement dummy placeholder for
* BIO_s_fd here...
* Dummy placeholder for BIO_s_fd...
*/
BIO *BIO_new_fd(int fd, int close_flag)
{
return NULL;
}
int BIO_fd_non_fatal_error(int err)
{
return 0;
}
int BIO_fd_should_retry(int i)
{
return 0;
}
BIO_METHOD *BIO_s_fd(void)
{
return NULL;
}
#else
/*
* As for unconditional usage of "UPLINK" interface in this module.

View File

@ -77,6 +77,12 @@ sparcv9a-mont.s: asm/sparcv9a-mont.pl
$(PERL) asm/sparcv9a-mont.pl $(CFLAGS) > $@
sparcv9-mont.s: asm/sparcv9-mont.pl
$(PERL) asm/sparcv9-mont.pl $(CFLAGS) > $@
vis3-mont.s: asm/vis3-mont.pl
$(PERL) asm/vis3-mont.pl $(CFLAGS) > $@
sparct4-mont.S: asm/sparct4-mont.pl
$(PERL) asm/sparct4-mont.pl $(CFLAGS) > $@
sparcv9-gf2m.S: asm/sparcv9-gf2m.pl
$(PERL) asm/sparcv9-gf2m.pl $(CFLAGS) > $@
bn-mips3.o: asm/mips3.s
@if [ "$(CC)" = "gcc" ]; then \
@ -102,8 +108,10 @@ x86_64-mont5.s: asm/x86_64-mont5.pl
$(PERL) asm/x86_64-mont5.pl $(PERLASM_SCHEME) > $@
x86_64-gf2m.s: asm/x86_64-gf2m.pl
$(PERL) asm/x86_64-gf2m.pl $(PERLASM_SCHEME) > $@
modexp512-x86_64.s: asm/modexp512-x86_64.pl
$(PERL) asm/modexp512-x86_64.pl $(PERLASM_SCHEME) > $@
rsaz-x86_64.s: asm/rsaz-x86_64.pl
$(PERL) asm/rsaz-x86_64.pl $(PERLASM_SCHEME) > $@
rsaz-avx2.s: asm/rsaz-avx2.pl
$(PERL) asm/rsaz-avx2.pl $(PERLASM_SCHEME) > $@
bn-ia64.s: asm/ia64.S
$(CC) $(CFLAGS) -E asm/ia64.S > $@
@ -125,14 +133,15 @@ ppc-mont.s: asm/ppc-mont.pl;$(PERL) asm/ppc-mont.pl $(PERLASM_SCHEME) $@
ppc64-mont.s: asm/ppc64-mont.pl;$(PERL) asm/ppc64-mont.pl $(PERLASM_SCHEME) $@
alpha-mont.s: asm/alpha-mont.pl
(preproc=/tmp/$$$$.$@; trap "rm $$preproc" INT; \
(preproc=$$$$.$@.S; trap "rm $$preproc" INT; \
$(PERL) asm/alpha-mont.pl > $$preproc && \
$(CC) -E $$preproc > $@ && rm $$preproc)
$(CC) -E -P $$preproc > $@ && rm $$preproc)
# GNU make "catch all"
%-mont.s: asm/%-mont.pl; $(PERL) $< $(PERLASM_SCHEME) $@
%-mont.S: asm/%-mont.pl; $(PERL) $< $(PERLASM_SCHEME) $@
%-gf2m.S: asm/%-gf2m.pl; $(PERL) $< $(PERLASM_SCHEME) $@
armv4-mont.o: armv4-mont.S
armv4-gf2m.o: armv4-gf2m.S
files:
@ -244,6 +253,7 @@ bn_exp.o: ../../include/openssl/lhash.h ../../include/openssl/opensslconf.h
bn_exp.o: ../../include/openssl/opensslv.h ../../include/openssl/ossl_typ.h
bn_exp.o: ../../include/openssl/safestack.h ../../include/openssl/stack.h
bn_exp.o: ../../include/openssl/symhacks.h ../cryptlib.h bn_exp.c bn_lcl.h
bn_exp.o: rsaz_exp.h
bn_exp2.o: ../../e_os.h ../../include/openssl/bio.h ../../include/openssl/bn.h
bn_exp2.o: ../../include/openssl/buffer.h ../../include/openssl/crypto.h
bn_exp2.o: ../../include/openssl/e_os2.h ../../include/openssl/err.h

View File

@ -20,48 +20,26 @@
# length, more for longer keys. Even though NEON 1x1 multiplication
# runs in even less cycles, ~30, improvement is measurable only on
# longer keys. One has to optimize code elsewhere to get NEON glow...
#
# April 2014
#
# Double bn_GF2m_mul_2x2 performance by using algorithm from paper
# referred below, which improves ECDH and ECDSA verify benchmarks
# by 18-40%.
#
# Câmara, D.; Gouvêa, C. P. L.; López, J. & Dahab, R.: Fast Software
# Polynomial Multiplication on ARM Processors using the NEON Engine.
#
# http://conradoplg.cryptoland.net/files/2010/12/mocrysen13.pdf
while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
open STDOUT,">$output";
sub Dlo() { shift=~m|q([1]?[0-9])|?"d".($1*2):""; }
sub Dhi() { shift=~m|q([1]?[0-9])|?"d".($1*2+1):""; }
sub Q() { shift=~m|d([1-3]?[02468])|?"q".($1/2):""; }
$code=<<___;
#include "arm_arch.h"
.text
.code 32
#if __ARM_ARCH__>=7
.fpu neon
.type mul_1x1_neon,%function
.align 5
mul_1x1_neon:
vshl.u64 `&Dlo("q1")`,d16,#8 @ q1-q3 are slided $a
vmull.p8 `&Q("d0")`,d16,d17 @ a·bb
vshl.u64 `&Dlo("q2")`,d16,#16
vmull.p8 q1,`&Dlo("q1")`,d17 @ a<<8·bb
vshl.u64 `&Dlo("q3")`,d16,#24
vmull.p8 q2,`&Dlo("q2")`,d17 @ a<<16·bb
vshr.u64 `&Dlo("q1")`,#8
vmull.p8 q3,`&Dlo("q3")`,d17 @ a<<24·bb
vshl.u64 `&Dhi("q1")`,#24
veor d0,`&Dlo("q1")`
vshr.u64 `&Dlo("q2")`,#16
veor d0,`&Dhi("q1")`
vshl.u64 `&Dhi("q2")`,#16
veor d0,`&Dlo("q2")`
vshr.u64 `&Dlo("q3")`,#24
veor d0,`&Dhi("q2")`
vshl.u64 `&Dhi("q3")`,#8
veor d0,`&Dlo("q3")`
veor d0,`&Dhi("q3")`
bx lr
.size mul_1x1_neon,.-mul_1x1_neon
#endif
___
################
# private interface to mul_1x1_ialu
@ -159,56 +137,17 @@ ___
# void bn_GF2m_mul_2x2(BN_ULONG *r,
# BN_ULONG a1,BN_ULONG a0,
# BN_ULONG b1,BN_ULONG b0); # r[3..0]=a1a0·b1b0
($A1,$B1,$A0,$B0,$A1B1,$A0B0)=map("d$_",(18..23));
{
$code.=<<___;
.global bn_GF2m_mul_2x2
.type bn_GF2m_mul_2x2,%function
.align 5
bn_GF2m_mul_2x2:
#if __ARM_ARCH__>=7
#if __ARM_MAX_ARCH__>=7
ldr r12,.LOPENSSL_armcap
.Lpic: ldr r12,[pc,r12]
tst r12,#1
beq .Lialu
veor $A1,$A1
vmov.32 $B1,r3,r3 @ two copies of b1
vmov.32 ${A1}[0],r1 @ a1
veor $A0,$A0
vld1.32 ${B0}[],[sp,:32] @ two copies of b0
vmov.32 ${A0}[0],r2 @ a0
mov r12,lr
vmov d16,$A1
vmov d17,$B1
bl mul_1x1_neon @ a1·b1
vmov $A1B1,d0
vmov d16,$A0
vmov d17,$B0
bl mul_1x1_neon @ a0·b0
vmov $A0B0,d0
veor d16,$A0,$A1
veor d17,$B0,$B1
veor $A0,$A0B0,$A1B1
bl mul_1x1_neon @ (a0+a1)·(b0+b1)
veor d0,$A0 @ (a0+a1)·(b0+b1)-a0·b0-a1·b1
vshl.u64 d1,d0,#32
vshr.u64 d0,d0,#32
veor $A0B0,d1
veor $A1B1,d0
vst1.32 {${A0B0}[0]},[r0,:32]!
vst1.32 {${A0B0}[1]},[r0,:32]!
vst1.32 {${A1B1}[0]},[r0,:32]!
vst1.32 {${A1B1}[1]},[r0,:32]
bx r12
.align 4
.Lialu:
bne .LNEON
#endif
___
$ret="r10"; # reassigned 1st argument
@ -260,8 +199,72 @@ $code.=<<___;
moveq pc,lr @ be binary compatible with V4, yet
bx lr @ interoperable with Thumb ISA:-)
#endif
___
}
{
my ($r,$t0,$t1,$t2,$t3)=map("q$_",(0..3,8..12));
my ($a,$b,$k48,$k32,$k16)=map("d$_",(26..31));
$code.=<<___;
#if __ARM_MAX_ARCH__>=7
.arch armv7-a
.fpu neon
.align 5
.LNEON:
ldr r12, [sp] @ 5th argument
vmov.32 $a, r2, r1
vmov.32 $b, r12, r3
vmov.i64 $k48, #0x0000ffffffffffff
vmov.i64 $k32, #0x00000000ffffffff
vmov.i64 $k16, #0x000000000000ffff
vext.8 $t0#lo, $a, $a, #1 @ A1
vmull.p8 $t0, $t0#lo, $b @ F = A1*B
vext.8 $r#lo, $b, $b, #1 @ B1
vmull.p8 $r, $a, $r#lo @ E = A*B1
vext.8 $t1#lo, $a, $a, #2 @ A2
vmull.p8 $t1, $t1#lo, $b @ H = A2*B
vext.8 $t3#lo, $b, $b, #2 @ B2
vmull.p8 $t3, $a, $t3#lo @ G = A*B2
vext.8 $t2#lo, $a, $a, #3 @ A3
veor $t0, $t0, $r @ L = E + F
vmull.p8 $t2, $t2#lo, $b @ J = A3*B
vext.8 $r#lo, $b, $b, #3 @ B3
veor $t1, $t1, $t3 @ M = G + H
vmull.p8 $r, $a, $r#lo @ I = A*B3
veor $t0#lo, $t0#lo, $t0#hi @ t0 = (L) (P0 + P1) << 8
vand $t0#hi, $t0#hi, $k48
vext.8 $t3#lo, $b, $b, #4 @ B4
veor $t1#lo, $t1#lo, $t1#hi @ t1 = (M) (P2 + P3) << 16
vand $t1#hi, $t1#hi, $k32
vmull.p8 $t3, $a, $t3#lo @ K = A*B4
veor $t2, $t2, $r @ N = I + J
veor $t0#lo, $t0#lo, $t0#hi
veor $t1#lo, $t1#lo, $t1#hi
veor $t2#lo, $t2#lo, $t2#hi @ t2 = (N) (P4 + P5) << 24
vand $t2#hi, $t2#hi, $k16
vext.8 $t0, $t0, $t0, #15
veor $t3#lo, $t3#lo, $t3#hi @ t3 = (K) (P6 + P7) << 32
vmov.i64 $t3#hi, #0
vext.8 $t1, $t1, $t1, #14
veor $t2#lo, $t2#lo, $t2#hi
vmull.p8 $r, $a, $b @ D = A*B
vext.8 $t3, $t3, $t3, #12
vext.8 $t2, $t2, $t2, #13
veor $t0, $t0, $t1
veor $t2, $t2, $t3
veor $r, $r, $t0
veor $r, $r, $t2
vst1.32 {$r}, [r0]
ret @ bx lr
#endif
___
}
$code.=<<___;
.size bn_GF2m_mul_2x2,.-bn_GF2m_mul_2x2
#if __ARM_ARCH__>=7
#if __ARM_MAX_ARCH__>=7
.align 5
.LOPENSSL_armcap:
.word OPENSSL_armcap_P-(.Lpic+8)
@ -269,10 +272,18 @@ $code.=<<___;
.asciz "GF(2^m) Multiplication for ARMv4/NEON, CRYPTOGAMS by <appro\@openssl.org>"
.align 5
#if __ARM_MAX_ARCH__>=7
.comm OPENSSL_armcap_P,4,4
#endif
___
$code =~ s/\`([^\`]*)\`/eval $1/gem;
$code =~ s/\bbx\s+lr\b/.word\t0xe12fff1e/gm; # make it possible to compile with -march=armv4
print $code;
foreach (split("\n",$code)) {
s/\`([^\`]*)\`/eval $1/geo;
s/\bq([0-9]+)#(lo|hi)/sprintf "d%d",2*$1+($2 eq "hi")/geo or
s/\bret\b/bx lr/go or
s/\bbx\s+lr\b/.word\t0xe12fff1e/go; # make it possible to compile with -march=armv4
print $_,"\n";
}
close STDOUT; # enforce flush

View File

@ -1,7 +1,7 @@
#!/usr/bin/env perl
# ====================================================================
# Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
@ -23,6 +23,21 @@
# than 1/2KB. Windows CE port would be trivial, as it's exclusively
# about decorations, ABI and instruction syntax are identical.
# November 2013
#
# Add NEON code path, which handles lengths divisible by 8. RSA/DSA
# performance improvement on Cortex-A8 is ~45-100% depending on key
# length, more for longer keys. On Cortex-A15 the span is ~10-105%.
# On Snapdragon S4 improvement was measured to vary from ~70% to
# incredible ~380%, yes, 4.8x faster, for RSA4096 sign. But this is
# rather because original integer-only code seems to perform
# suboptimally on S4. Situation on Cortex-A9 is unfortunately
# different. It's being looked into, but the trouble is that
# performance for vectors longer than 256 bits is actually couple
# of percent worse than for integer-only code. The code is chosen
# for execution on all NEON-capable processors, because gain on
# others outweighs the marginal loss on Cortex-A9.
while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
open STDOUT,">$output";
@ -52,16 +67,40 @@ $_n0="$num,#14*4";
$_num="$num,#15*4"; $_bpend=$_num;
$code=<<___;
#include "arm_arch.h"
.text
.code 32
#if __ARM_MAX_ARCH__>=7
.align 5
.LOPENSSL_armcap:
.word OPENSSL_armcap_P-bn_mul_mont
#endif
.global bn_mul_mont
.type bn_mul_mont,%function
.align 2
.align 5
bn_mul_mont:
ldr ip,[sp,#4] @ load num
stmdb sp!,{r0,r2} @ sp points at argument block
ldr $num,[sp,#3*4] @ load num
cmp $num,#2
#if __ARM_MAX_ARCH__>=7
tst ip,#7
bne .Lialu
adr r0,bn_mul_mont
ldr r2,.LOPENSSL_armcap
ldr r0,[r0,r2]
tst r0,#1 @ NEON available?
ldmia sp, {r0,r2}
beq .Lialu
add sp,sp,#8
b bn_mul8x_mont_neon
.align 4
.Lialu:
#endif
cmp ip,#2
mov $num,ip @ load num
movlt r0,#0
addlt sp,sp,#2*4
blt .Labrt
@ -191,14 +230,447 @@ bn_mul_mont:
ldmia sp!,{r4-r12,lr} @ restore registers
add sp,sp,#2*4 @ skip over {r0,r2}
mov r0,#1
.Labrt: tst lr,#1
.Labrt:
#if __ARM_ARCH__>=5
ret @ bx lr
#else
tst lr,#1
moveq pc,lr @ be binary compatible with V4, yet
bx lr @ interoperable with Thumb ISA:-)
#endif
.size bn_mul_mont,.-bn_mul_mont
.asciz "Montgomery multiplication for ARMv4, CRYPTOGAMS by <appro\@openssl.org>"
___
{
sub Dlo() { shift=~m|q([1]?[0-9])|?"d".($1*2):""; }
sub Dhi() { shift=~m|q([1]?[0-9])|?"d".($1*2+1):""; }
my ($A0,$A1,$A2,$A3)=map("d$_",(0..3));
my ($N0,$N1,$N2,$N3)=map("d$_",(4..7));
my ($Z,$Temp)=("q4","q5");
my ($A0xB,$A1xB,$A2xB,$A3xB,$A4xB,$A5xB,$A6xB,$A7xB)=map("q$_",(6..13));
my ($Bi,$Ni,$M0)=map("d$_",(28..31));
my $zero=&Dlo($Z);
my $temp=&Dlo($Temp);
my ($rptr,$aptr,$bptr,$nptr,$n0,$num)=map("r$_",(0..5));
my ($tinptr,$toutptr,$inner,$outer)=map("r$_",(6..9));
$code.=<<___;
#if __ARM_MAX_ARCH__>=7
.arch armv7-a
.fpu neon
.type bn_mul8x_mont_neon,%function
.align 5
bn_mul8x_mont_neon:
mov ip,sp
stmdb sp!,{r4-r11}
vstmdb sp!,{d8-d15} @ ABI specification says so
ldmia ip,{r4-r5} @ load rest of parameter block
sub $toutptr,sp,#16
vld1.32 {${Bi}[0]}, [$bptr,:32]!
sub $toutptr,$toutptr,$num,lsl#4
vld1.32 {$A0-$A3}, [$aptr]! @ can't specify :32 :-(
and $toutptr,$toutptr,#-64
vld1.32 {${M0}[0]}, [$n0,:32]
mov sp,$toutptr @ alloca
veor $zero,$zero,$zero
subs $inner,$num,#8
vzip.16 $Bi,$zero
vmull.u32 $A0xB,$Bi,${A0}[0]
vmull.u32 $A1xB,$Bi,${A0}[1]
vmull.u32 $A2xB,$Bi,${A1}[0]
vshl.i64 $temp,`&Dhi("$A0xB")`,#16
vmull.u32 $A3xB,$Bi,${A1}[1]
vadd.u64 $temp,$temp,`&Dlo("$A0xB")`
veor $zero,$zero,$zero
vmul.u32 $Ni,$temp,$M0
vmull.u32 $A4xB,$Bi,${A2}[0]
vld1.32 {$N0-$N3}, [$nptr]!
vmull.u32 $A5xB,$Bi,${A2}[1]
vmull.u32 $A6xB,$Bi,${A3}[0]
vzip.16 $Ni,$zero
vmull.u32 $A7xB,$Bi,${A3}[1]
bne .LNEON_1st
@ special case for num=8, everything is in register bank...
vmlal.u32 $A0xB,$Ni,${N0}[0]
sub $outer,$num,#1
vmlal.u32 $A1xB,$Ni,${N0}[1]
vmlal.u32 $A2xB,$Ni,${N1}[0]
vmlal.u32 $A3xB,$Ni,${N1}[1]
vmlal.u32 $A4xB,$Ni,${N2}[0]
vmov $Temp,$A0xB
vmlal.u32 $A5xB,$Ni,${N2}[1]
vmov $A0xB,$A1xB
vmlal.u32 $A6xB,$Ni,${N3}[0]
vmov $A1xB,$A2xB
vmlal.u32 $A7xB,$Ni,${N3}[1]
vmov $A2xB,$A3xB
vmov $A3xB,$A4xB
vshr.u64 $temp,$temp,#16
vmov $A4xB,$A5xB
vmov $A5xB,$A6xB
vadd.u64 $temp,$temp,`&Dhi("$Temp")`
vmov $A6xB,$A7xB
veor $A7xB,$A7xB
vshr.u64 $temp,$temp,#16
b .LNEON_outer8
.align 4
.LNEON_outer8:
vld1.32 {${Bi}[0]}, [$bptr,:32]!
veor $zero,$zero,$zero
vzip.16 $Bi,$zero
vadd.u64 `&Dlo("$A0xB")`,`&Dlo("$A0xB")`,$temp
vmlal.u32 $A0xB,$Bi,${A0}[0]
vmlal.u32 $A1xB,$Bi,${A0}[1]
vmlal.u32 $A2xB,$Bi,${A1}[0]
vshl.i64 $temp,`&Dhi("$A0xB")`,#16
vmlal.u32 $A3xB,$Bi,${A1}[1]
vadd.u64 $temp,$temp,`&Dlo("$A0xB")`
veor $zero,$zero,$zero
subs $outer,$outer,#1
vmul.u32 $Ni,$temp,$M0
vmlal.u32 $A4xB,$Bi,${A2}[0]
vmlal.u32 $A5xB,$Bi,${A2}[1]
vmlal.u32 $A6xB,$Bi,${A3}[0]
vzip.16 $Ni,$zero
vmlal.u32 $A7xB,$Bi,${A3}[1]
vmlal.u32 $A0xB,$Ni,${N0}[0]
vmlal.u32 $A1xB,$Ni,${N0}[1]
vmlal.u32 $A2xB,$Ni,${N1}[0]
vmlal.u32 $A3xB,$Ni,${N1}[1]
vmlal.u32 $A4xB,$Ni,${N2}[0]
vmov $Temp,$A0xB
vmlal.u32 $A5xB,$Ni,${N2}[1]
vmov $A0xB,$A1xB
vmlal.u32 $A6xB,$Ni,${N3}[0]
vmov $A1xB,$A2xB
vmlal.u32 $A7xB,$Ni,${N3}[1]
vmov $A2xB,$A3xB
vmov $A3xB,$A4xB
vshr.u64 $temp,$temp,#16
vmov $A4xB,$A5xB
vmov $A5xB,$A6xB
vadd.u64 $temp,$temp,`&Dhi("$Temp")`
vmov $A6xB,$A7xB
veor $A7xB,$A7xB
vshr.u64 $temp,$temp,#16
bne .LNEON_outer8
vadd.u64 `&Dlo("$A0xB")`,`&Dlo("$A0xB")`,$temp
mov $toutptr,sp
vshr.u64 $temp,`&Dlo("$A0xB")`,#16
mov $inner,$num
vadd.u64 `&Dhi("$A0xB")`,`&Dhi("$A0xB")`,$temp
add $tinptr,sp,#16
vshr.u64 $temp,`&Dhi("$A0xB")`,#16
vzip.16 `&Dlo("$A0xB")`,`&Dhi("$A0xB")`
b .LNEON_tail2
.align 4
.LNEON_1st:
vmlal.u32 $A0xB,$Ni,${N0}[0]
vld1.32 {$A0-$A3}, [$aptr]!
vmlal.u32 $A1xB,$Ni,${N0}[1]
subs $inner,$inner,#8
vmlal.u32 $A2xB,$Ni,${N1}[0]
vmlal.u32 $A3xB,$Ni,${N1}[1]
vmlal.u32 $A4xB,$Ni,${N2}[0]
vld1.32 {$N0-$N1}, [$nptr]!
vmlal.u32 $A5xB,$Ni,${N2}[1]
vst1.64 {$A0xB-$A1xB}, [$toutptr,:256]!
vmlal.u32 $A6xB,$Ni,${N3}[0]
vmlal.u32 $A7xB,$Ni,${N3}[1]
vst1.64 {$A2xB-$A3xB}, [$toutptr,:256]!
vmull.u32 $A0xB,$Bi,${A0}[0]
vld1.32 {$N2-$N3}, [$nptr]!
vmull.u32 $A1xB,$Bi,${A0}[1]
vst1.64 {$A4xB-$A5xB}, [$toutptr,:256]!
vmull.u32 $A2xB,$Bi,${A1}[0]
vmull.u32 $A3xB,$Bi,${A1}[1]
vst1.64 {$A6xB-$A7xB}, [$toutptr,:256]!
vmull.u32 $A4xB,$Bi,${A2}[0]
vmull.u32 $A5xB,$Bi,${A2}[1]
vmull.u32 $A6xB,$Bi,${A3}[0]
vmull.u32 $A7xB,$Bi,${A3}[1]
bne .LNEON_1st
vmlal.u32 $A0xB,$Ni,${N0}[0]
add $tinptr,sp,#16
vmlal.u32 $A1xB,$Ni,${N0}[1]
sub $aptr,$aptr,$num,lsl#2 @ rewind $aptr
vmlal.u32 $A2xB,$Ni,${N1}[0]
vld1.64 {$Temp}, [sp,:128]
vmlal.u32 $A3xB,$Ni,${N1}[1]
sub $outer,$num,#1
vmlal.u32 $A4xB,$Ni,${N2}[0]
vst1.64 {$A0xB-$A1xB}, [$toutptr,:256]!
vmlal.u32 $A5xB,$Ni,${N2}[1]
vshr.u64 $temp,$temp,#16
vld1.64 {$A0xB}, [$tinptr, :128]!
vmlal.u32 $A6xB,$Ni,${N3}[0]
vst1.64 {$A2xB-$A3xB}, [$toutptr,:256]!
vmlal.u32 $A7xB,$Ni,${N3}[1]
vst1.64 {$A4xB-$A5xB}, [$toutptr,:256]!
vadd.u64 $temp,$temp,`&Dhi("$Temp")`
veor $Z,$Z,$Z
vst1.64 {$A6xB-$A7xB}, [$toutptr,:256]!
vld1.64 {$A1xB-$A2xB}, [$tinptr, :256]!
vst1.64 {$Z}, [$toutptr,:128]
vshr.u64 $temp,$temp,#16
b .LNEON_outer
.align 4
.LNEON_outer:
vld1.32 {${Bi}[0]}, [$bptr,:32]!
sub $nptr,$nptr,$num,lsl#2 @ rewind $nptr
vld1.32 {$A0-$A3}, [$aptr]!
veor $zero,$zero,$zero
mov $toutptr,sp
vzip.16 $Bi,$zero
sub $inner,$num,#8
vadd.u64 `&Dlo("$A0xB")`,`&Dlo("$A0xB")`,$temp
vmlal.u32 $A0xB,$Bi,${A0}[0]
vld1.64 {$A3xB-$A4xB},[$tinptr,:256]!
vmlal.u32 $A1xB,$Bi,${A0}[1]
vmlal.u32 $A2xB,$Bi,${A1}[0]
vld1.64 {$A5xB-$A6xB},[$tinptr,:256]!
vmlal.u32 $A3xB,$Bi,${A1}[1]
vshl.i64 $temp,`&Dhi("$A0xB")`,#16
veor $zero,$zero,$zero
vadd.u64 $temp,$temp,`&Dlo("$A0xB")`
vld1.64 {$A7xB},[$tinptr,:128]!
vmul.u32 $Ni,$temp,$M0
vmlal.u32 $A4xB,$Bi,${A2}[0]
vld1.32 {$N0-$N3}, [$nptr]!
vmlal.u32 $A5xB,$Bi,${A2}[1]
vmlal.u32 $A6xB,$Bi,${A3}[0]
vzip.16 $Ni,$zero
vmlal.u32 $A7xB,$Bi,${A3}[1]
.LNEON_inner:
vmlal.u32 $A0xB,$Ni,${N0}[0]
vld1.32 {$A0-$A3}, [$aptr]!
vmlal.u32 $A1xB,$Ni,${N0}[1]
subs $inner,$inner,#8
vmlal.u32 $A2xB,$Ni,${N1}[0]
vmlal.u32 $A3xB,$Ni,${N1}[1]
vst1.64 {$A0xB-$A1xB}, [$toutptr,:256]!
vmlal.u32 $A4xB,$Ni,${N2}[0]
vld1.64 {$A0xB}, [$tinptr, :128]!
vmlal.u32 $A5xB,$Ni,${N2}[1]
vst1.64 {$A2xB-$A3xB}, [$toutptr,:256]!
vmlal.u32 $A6xB,$Ni,${N3}[0]
vld1.64 {$A1xB-$A2xB}, [$tinptr, :256]!
vmlal.u32 $A7xB,$Ni,${N3}[1]
vst1.64 {$A4xB-$A5xB}, [$toutptr,:256]!
vmlal.u32 $A0xB,$Bi,${A0}[0]
vld1.64 {$A3xB-$A4xB}, [$tinptr, :256]!
vmlal.u32 $A1xB,$Bi,${A0}[1]
vst1.64 {$A6xB-$A7xB}, [$toutptr,:256]!
vmlal.u32 $A2xB,$Bi,${A1}[0]
vld1.64 {$A5xB-$A6xB}, [$tinptr, :256]!
vmlal.u32 $A3xB,$Bi,${A1}[1]
vld1.32 {$N0-$N3}, [$nptr]!
vmlal.u32 $A4xB,$Bi,${A2}[0]
vld1.64 {$A7xB}, [$tinptr, :128]!
vmlal.u32 $A5xB,$Bi,${A2}[1]
vmlal.u32 $A6xB,$Bi,${A3}[0]
vmlal.u32 $A7xB,$Bi,${A3}[1]
bne .LNEON_inner
vmlal.u32 $A0xB,$Ni,${N0}[0]
add $tinptr,sp,#16
vmlal.u32 $A1xB,$Ni,${N0}[1]
sub $aptr,$aptr,$num,lsl#2 @ rewind $aptr
vmlal.u32 $A2xB,$Ni,${N1}[0]
vld1.64 {$Temp}, [sp,:128]
vmlal.u32 $A3xB,$Ni,${N1}[1]
subs $outer,$outer,#1
vmlal.u32 $A4xB,$Ni,${N2}[0]
vst1.64 {$A0xB-$A1xB}, [$toutptr,:256]!
vmlal.u32 $A5xB,$Ni,${N2}[1]
vld1.64 {$A0xB}, [$tinptr, :128]!
vshr.u64 $temp,$temp,#16
vst1.64 {$A2xB-$A3xB}, [$toutptr,:256]!
vmlal.u32 $A6xB,$Ni,${N3}[0]
vld1.64 {$A1xB-$A2xB}, [$tinptr, :256]!
vmlal.u32 $A7xB,$Ni,${N3}[1]
vst1.64 {$A4xB-$A5xB}, [$toutptr,:256]!
vadd.u64 $temp,$temp,`&Dhi("$Temp")`
vst1.64 {$A6xB-$A7xB}, [$toutptr,:256]!
vshr.u64 $temp,$temp,#16
bne .LNEON_outer
mov $toutptr,sp
mov $inner,$num
.LNEON_tail:
vadd.u64 `&Dlo("$A0xB")`,`&Dlo("$A0xB")`,$temp
vld1.64 {$A3xB-$A4xB}, [$tinptr, :256]!
vshr.u64 $temp,`&Dlo("$A0xB")`,#16
vadd.u64 `&Dhi("$A0xB")`,`&Dhi("$A0xB")`,$temp
vld1.64 {$A5xB-$A6xB}, [$tinptr, :256]!
vshr.u64 $temp,`&Dhi("$A0xB")`,#16
vld1.64 {$A7xB}, [$tinptr, :128]!
vzip.16 `&Dlo("$A0xB")`,`&Dhi("$A0xB")`
.LNEON_tail2:
vadd.u64 `&Dlo("$A1xB")`,`&Dlo("$A1xB")`,$temp
vst1.32 {`&Dlo("$A0xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A1xB")`,#16
vadd.u64 `&Dhi("$A1xB")`,`&Dhi("$A1xB")`,$temp
vshr.u64 $temp,`&Dhi("$A1xB")`,#16
vzip.16 `&Dlo("$A1xB")`,`&Dhi("$A1xB")`
vadd.u64 `&Dlo("$A2xB")`,`&Dlo("$A2xB")`,$temp
vst1.32 {`&Dlo("$A1xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A2xB")`,#16
vadd.u64 `&Dhi("$A2xB")`,`&Dhi("$A2xB")`,$temp
vshr.u64 $temp,`&Dhi("$A2xB")`,#16
vzip.16 `&Dlo("$A2xB")`,`&Dhi("$A2xB")`
vadd.u64 `&Dlo("$A3xB")`,`&Dlo("$A3xB")`,$temp
vst1.32 {`&Dlo("$A2xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A3xB")`,#16
vadd.u64 `&Dhi("$A3xB")`,`&Dhi("$A3xB")`,$temp
vshr.u64 $temp,`&Dhi("$A3xB")`,#16
vzip.16 `&Dlo("$A3xB")`,`&Dhi("$A3xB")`
vadd.u64 `&Dlo("$A4xB")`,`&Dlo("$A4xB")`,$temp
vst1.32 {`&Dlo("$A3xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A4xB")`,#16
vadd.u64 `&Dhi("$A4xB")`,`&Dhi("$A4xB")`,$temp
vshr.u64 $temp,`&Dhi("$A4xB")`,#16
vzip.16 `&Dlo("$A4xB")`,`&Dhi("$A4xB")`
vadd.u64 `&Dlo("$A5xB")`,`&Dlo("$A5xB")`,$temp
vst1.32 {`&Dlo("$A4xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A5xB")`,#16
vadd.u64 `&Dhi("$A5xB")`,`&Dhi("$A5xB")`,$temp
vshr.u64 $temp,`&Dhi("$A5xB")`,#16
vzip.16 `&Dlo("$A5xB")`,`&Dhi("$A5xB")`
vadd.u64 `&Dlo("$A6xB")`,`&Dlo("$A6xB")`,$temp
vst1.32 {`&Dlo("$A5xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A6xB")`,#16
vadd.u64 `&Dhi("$A6xB")`,`&Dhi("$A6xB")`,$temp
vld1.64 {$A0xB}, [$tinptr, :128]!
vshr.u64 $temp,`&Dhi("$A6xB")`,#16
vzip.16 `&Dlo("$A6xB")`,`&Dhi("$A6xB")`
vadd.u64 `&Dlo("$A7xB")`,`&Dlo("$A7xB")`,$temp
vst1.32 {`&Dlo("$A6xB")`[0]}, [$toutptr, :32]!
vshr.u64 $temp,`&Dlo("$A7xB")`,#16
vadd.u64 `&Dhi("$A7xB")`,`&Dhi("$A7xB")`,$temp
vld1.64 {$A1xB-$A2xB}, [$tinptr, :256]!
vshr.u64 $temp,`&Dhi("$A7xB")`,#16
vzip.16 `&Dlo("$A7xB")`,`&Dhi("$A7xB")`
subs $inner,$inner,#8
vst1.32 {`&Dlo("$A7xB")`[0]}, [$toutptr, :32]!
bne .LNEON_tail
vst1.32 {${temp}[0]}, [$toutptr, :32] @ top-most bit
sub $nptr,$nptr,$num,lsl#2 @ rewind $nptr
subs $aptr,sp,#0 @ clear carry flag
add $bptr,sp,$num,lsl#2
.LNEON_sub:
ldmia $aptr!, {r4-r7}
ldmia $nptr!, {r8-r11}
sbcs r8, r4,r8
sbcs r9, r5,r9
sbcs r10,r6,r10
sbcs r11,r7,r11
teq $aptr,$bptr @ preserves carry
stmia $rptr!, {r8-r11}
bne .LNEON_sub
ldr r10, [$aptr] @ load top-most bit
veor q0,q0,q0
sub r11,$bptr,sp @ this is num*4
veor q1,q1,q1
mov $aptr,sp
sub $rptr,$rptr,r11 @ rewind $rptr
mov $nptr,$bptr @ second 3/4th of frame
sbcs r10,r10,#0 @ result is carry flag
.LNEON_copy_n_zap:
ldmia $aptr!, {r4-r7}
ldmia $rptr, {r8-r11}
movcc r8, r4
vst1.64 {q0-q1}, [$nptr,:256]! @ wipe
movcc r9, r5
movcc r10,r6
vst1.64 {q0-q1}, [$nptr,:256]! @ wipe
movcc r11,r7
ldmia $aptr, {r4-r7}
stmia $rptr!, {r8-r11}
sub $aptr,$aptr,#16
ldmia $rptr, {r8-r11}
movcc r8, r4
vst1.64 {q0-q1}, [$aptr,:256]! @ wipe
movcc r9, r5
movcc r10,r6
vst1.64 {q0-q1}, [$nptr,:256]! @ wipe
movcc r11,r7
teq $aptr,$bptr @ preserves carry
stmia $rptr!, {r8-r11}
bne .LNEON_copy_n_zap
sub sp,ip,#96
vldmia sp!,{d8-d15}
ldmia sp!,{r4-r11}
ret @ bx lr
.size bn_mul8x_mont_neon,.-bn_mul8x_mont_neon
#endif
___
}
$code.=<<___;
.asciz "Montgomery multiplication for ARMv4/NEON, CRYPTOGAMS by <appro\@openssl.org>"
.align 2
#if __ARM_MAX_ARCH__>=7
.comm OPENSSL_armcap_P,4,4
#endif
___
$code =~ s/\`([^\`]*)\`/eval $1/gem;
$code =~ s/\bbx\s+lr\b/.word\t0xe12fff1e/gm; # make it possible to compile with -march=armv4
$code =~ s/\bret\b/bx lr/gm;
print $code;
close STDOUT;

View File

@ -46,7 +46,7 @@
# ($s0,$s1,$s2,$s3,$s4,$s5,$s6,$s7)=map("\$$_",(16..23));
# ($gp,$sp,$fp,$ra)=map("\$$_",(28..31));
#
$flavour = shift; # supported flavours are o32,n32,64,nubi32,nubi64
$flavour = shift || "o32"; # supported flavours are o32,n32,64,nubi32,nubi64
if ($flavour =~ /64|n32/i) {
$PTR_ADD="dadd"; # incidentally works even on n32

View File

@ -48,7 +48,7 @@
# has to content with 40-85% improvement depending on benchmark and
# key length, more for longer keys.
$flavour = shift;
$flavour = shift || "o32";
while (($output=shift) && ($output!~/^\w[\w\-]*\.\w+$/)) {}
open STDOUT,">$output";

2201
crypto/bn/asm/mips3.s Normal file

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -325,6 +325,7 @@ Lcopy: ; copy or in-place refresh
.long 0
.byte 0,12,4,0,0x80,12,6,0
.long 0
.size .bn_mul_mont_int,.-.bn_mul_mont_int
.asciz "Montgomery Multiplication for PPC, CRYPTOGAMS by <appro\@openssl.org>"
___

View File

@ -392,6 +392,7 @@ $data=<<EOF;
.long 0
.byte 0,12,0x14,0,0,0,2,0
.long 0
.size .bn_sqr_comba4,.-.bn_sqr_comba4
#
# NOTE: The following label name should be changed to
@ -819,6 +820,7 @@ $data=<<EOF;
.long 0
.byte 0,12,0x14,0,0,0,2,0
.long 0
.size .bn_sqr_comba8,.-.bn_sqr_comba8
#
# NOTE: The following label name should be changed to
@ -972,6 +974,7 @@ $data=<<EOF;
.long 0
.byte 0,12,0x14,0,0,0,3,0
.long 0
.size .bn_mul_comba4,.-.bn_mul_comba4
#
# NOTE: The following label name should be changed to
@ -1510,6 +1513,7 @@ $data=<<EOF;
.long 0
.byte 0,12,0x14,0,0,0,3,0
.long 0
.size .bn_mul_comba8,.-.bn_mul_comba8
#
# NOTE: The following label name should be changed to
@ -1560,6 +1564,7 @@ Lppcasm_sub_adios:
.long 0
.byte 0,12,0x14,0,0,0,4,0
.long 0
.size .bn_sub_words,.-.bn_sub_words
#
# NOTE: The following label name should be changed to
@ -1605,6 +1610,7 @@ Lppcasm_add_adios:
.long 0
.byte 0,12,0x14,0,0,0,4,0
.long 0
.size .bn_add_words,.-.bn_add_words
#
# NOTE: The following label name should be changed to
@ -1720,6 +1726,7 @@ Lppcasm_div9:
.long 0
.byte 0,12,0x14,0,0,0,3,0
.long 0
.size .bn_div_words,.-.bn_div_words
#
# NOTE: The following label name should be changed to
@ -1761,6 +1768,7 @@ Lppcasm_sqr_adios:
.long 0
.byte 0,12,0x14,0,0,0,3,0
.long 0
.size .bn_sqr_words,.-.bn_sqr_words
#
# NOTE: The following label name should be changed to
@ -1866,6 +1874,7 @@ Lppcasm_mw_OVER:
.long 0
.byte 0,12,0x14,0,0,0,4,0
.long 0
.size bn_mul_words,.-bn_mul_words
#
# NOTE: The following label name should be changed to
@ -1991,6 +2000,7 @@ Lppcasm_maw_adios:
.long 0
.byte 0,12,0x14,0,0,0,4,0
.long 0
.size .bn_mul_add_words,.-.bn_mul_add_words
.align 4
EOF
$data =~ s/\`([^\`]*)\`/eval $1/gem;

View File

@ -1,7 +1,7 @@
#!/usr/bin/env perl
# ====================================================================
# Written by Andy Polyakov <appro@fy.chalmers.se> for the OpenSSL
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
@ -65,6 +65,14 @@
# others alternative would be to break dependence on upper halves of
# GPRs by sticking to 32-bit integer operations...
# December 2012
# Remove above mentioned dependence on GPRs' upper halves in 32-bit
# build. No signal masking overhead, but integer instructions are
# *more* numerous... It's still "universally" faster than 32-bit
# ppc-mont.pl, but improvement coefficient is not as impressive
# for longer keys...
$flavour = shift;
if ($flavour =~ /32/) {
@ -110,6 +118,9 @@ $tp="r10";
$j="r11";
$i="r12";
# non-volatile registers
$c1="r19";
$n1="r20";
$a1="r21";
$nap_d="r22"; # interleaved ap and np in double format
$a0="r23"; # ap[0]
$t0="r24"; # temporary registers
@ -180,8 +191,8 @@ $T3a="f30"; $T3b="f31";
# . .
# +-------------------------------+
# . .
# -12*size_t +-------------------------------+
# | 10 saved gpr, r22-r31 |
# -13*size_t +-------------------------------+
# | 13 saved gpr, r19-r31 |
# . .
# . .
# -12*8 +-------------------------------+
@ -215,6 +226,9 @@ $code=<<___;
mr $i,$sp
$STUX $sp,$sp,$tp ; alloca
$PUSH r19,`-12*8-13*$SIZE_T`($i)
$PUSH r20,`-12*8-12*$SIZE_T`($i)
$PUSH r21,`-12*8-11*$SIZE_T`($i)
$PUSH r22,`-12*8-10*$SIZE_T`($i)
$PUSH r23,`-12*8-9*$SIZE_T`($i)
$PUSH r24,`-12*8-8*$SIZE_T`($i)
@ -237,40 +251,26 @@ $code=<<___;
stfd f29,`-3*8`($i)
stfd f30,`-2*8`($i)
stfd f31,`-1*8`($i)
___
$code.=<<___ if ($SIZE_T==8);
ld $a0,0($ap) ; pull ap[0] value
ld $n0,0($n0) ; pull n0[0] value
ld $t3,0($bp) ; bp[0]
___
$code.=<<___ if ($SIZE_T==4);
mr $t1,$n0
lwz $a0,0($ap) ; pull ap[0,1] value
lwz $t0,4($ap)
lwz $n0,0($t1) ; pull n0[0,1] value
lwz $t1,4($t1)
lwz $t3,0($bp) ; bp[0,1]
lwz $t2,4($bp)
insrdi $a0,$t0,32,0
insrdi $n0,$t1,32,0
insrdi $t3,$t2,32,0
___
$code.=<<___;
addi $tp,$sp,`$FRAME+$TRANSFER+8+64`
li $i,-64
add $nap_d,$tp,$num
and $nap_d,$nap_d,$i ; align to 64 bytes
mulld $t7,$a0,$t3 ; ap[0]*bp[0]
; nap_d is off by 1, because it's used with stfdu/lfdu
addi $nap_d,$nap_d,-8
srwi $j,$num,`3+1` ; counter register, num/2
mulld $t7,$t7,$n0 ; tp[0]*n0
addi $j,$j,-1
addi $tp,$sp,`$FRAME+$TRANSFER-8`
li $carry,0
mtctr $j
___
$code.=<<___ if ($SIZE_T==8);
ld $a0,0($ap) ; pull ap[0] value
ld $t3,0($bp) ; bp[0]
ld $n0,0($n0) ; pull n0[0] value
mulld $t7,$a0,$t3 ; ap[0]*bp[0]
; transfer bp[0] to FPU as 4x16-bit values
extrdi $t0,$t3,16,48
extrdi $t1,$t3,16,32
@ -280,6 +280,8 @@ $code.=<<___;
std $t1,`$FRAME+8`($sp)
std $t2,`$FRAME+16`($sp)
std $t3,`$FRAME+24`($sp)
mulld $t7,$t7,$n0 ; tp[0]*n0
; transfer (ap[0]*bp[0])*n0 to FPU as 4x16-bit values
extrdi $t4,$t7,16,48
extrdi $t5,$t7,16,32
@ -289,21 +291,61 @@ $code.=<<___;
std $t5,`$FRAME+40`($sp)
std $t6,`$FRAME+48`($sp)
std $t7,`$FRAME+56`($sp)
___
$code.=<<___ if ($SIZE_T==8);
lwz $t0,4($ap) ; load a[j] as 32-bit word pair
lwz $t1,0($ap)
lwz $t2,12($ap) ; load a[j+1] as 32-bit word pair
extrdi $t0,$a0,32,32 ; lwz $t0,4($ap)
extrdi $t1,$a0,32,0 ; lwz $t1,0($ap)
lwz $t2,12($ap) ; load a[1] as 32-bit word pair
lwz $t3,8($ap)
lwz $t4,4($np) ; load n[j] as 32-bit word pair
lwz $t4,4($np) ; load n[0] as 32-bit word pair
lwz $t5,0($np)
lwz $t6,12($np) ; load n[j+1] as 32-bit word pair
lwz $t6,12($np) ; load n[1] as 32-bit word pair
lwz $t7,8($np)
___
$code.=<<___ if ($SIZE_T==4);
lwz $t0,0($ap) ; load a[j..j+3] as 32-bit word pairs
lwz $t1,4($ap)
lwz $t2,8($ap)
lwz $a0,0($ap) ; pull ap[0,1] value
mr $n1,$n0
lwz $a1,4($ap)
li $c1,0
lwz $t1,0($bp) ; bp[0,1]
lwz $t3,4($bp)
lwz $n0,0($n1) ; pull n0[0,1] value
lwz $n1,4($n1)
mullw $t4,$a0,$t1 ; mulld ap[0]*bp[0]
mulhwu $t5,$a0,$t1
mullw $t6,$a1,$t1
mullw $t7,$a0,$t3
add $t5,$t5,$t6
add $t5,$t5,$t7
; transfer bp[0] to FPU as 4x16-bit values
extrwi $t0,$t1,16,16
extrwi $t1,$t1,16,0
extrwi $t2,$t3,16,16
extrwi $t3,$t3,16,0
std $t0,`$FRAME+0`($sp) ; yes, std in 32-bit build
std $t1,`$FRAME+8`($sp)
std $t2,`$FRAME+16`($sp)
std $t3,`$FRAME+24`($sp)
mullw $t0,$t4,$n0 ; mulld tp[0]*n0
mulhwu $t1,$t4,$n0
mullw $t2,$t5,$n0
mullw $t3,$t4,$n1
add $t1,$t1,$t2
add $t1,$t1,$t3
; transfer (ap[0]*bp[0])*n0 to FPU as 4x16-bit values
extrwi $t4,$t0,16,16
extrwi $t5,$t0,16,0
extrwi $t6,$t1,16,16
extrwi $t7,$t1,16,0
std $t4,`$FRAME+32`($sp) ; yes, std in 32-bit build
std $t5,`$FRAME+40`($sp)
std $t6,`$FRAME+48`($sp)
std $t7,`$FRAME+56`($sp)
mr $t0,$a0 ; lwz $t0,0($ap)
mr $t1,$a1 ; lwz $t1,4($ap)
lwz $t2,8($ap) ; load a[j..j+3] as 32-bit word pairs
lwz $t3,12($ap)
lwz $t4,0($np) ; load n[j..j+3] as 32-bit word pairs
lwz $t5,4($np)
@ -319,7 +361,7 @@ $code.=<<___;
lfd $nb,`$FRAME+40`($sp)
lfd $nc,`$FRAME+48`($sp)
lfd $nd,`$FRAME+56`($sp)
std $t0,`$FRAME+64`($sp)
std $t0,`$FRAME+64`($sp) ; yes, std even in 32-bit build
std $t1,`$FRAME+72`($sp)
std $t2,`$FRAME+80`($sp)
std $t3,`$FRAME+88`($sp)
@ -441,7 +483,7 @@ $code.=<<___ if ($SIZE_T==4);
lwz $t7,12($np)
___
$code.=<<___;
std $t0,`$FRAME+64`($sp)
std $t0,`$FRAME+64`($sp) ; yes, std even in 32-bit build
std $t1,`$FRAME+72`($sp)
std $t2,`$FRAME+80`($sp)
std $t3,`$FRAME+88`($sp)
@ -449,6 +491,9 @@ $code.=<<___;
std $t5,`$FRAME+104`($sp)
std $t6,`$FRAME+112`($sp)
std $t7,`$FRAME+120`($sp)
___
if ($SIZE_T==8 or $flavour =~ /osx/) {
$code.=<<___;
ld $t0,`$FRAME+0`($sp)
ld $t1,`$FRAME+8`($sp)
ld $t2,`$FRAME+16`($sp)
@ -457,6 +502,20 @@ $code.=<<___;
ld $t5,`$FRAME+40`($sp)
ld $t6,`$FRAME+48`($sp)
ld $t7,`$FRAME+56`($sp)
___
} else {
$code.=<<___;
lwz $t1,`$FRAME+0`($sp)
lwz $t0,`$FRAME+4`($sp)
lwz $t3,`$FRAME+8`($sp)
lwz $t2,`$FRAME+12`($sp)
lwz $t5,`$FRAME+16`($sp)
lwz $t4,`$FRAME+20`($sp)
lwz $t7,`$FRAME+24`($sp)
lwz $t6,`$FRAME+28`($sp)
___
}
$code.=<<___;
lfd $A0,`$FRAME+64`($sp)
lfd $A1,`$FRAME+72`($sp)
lfd $A2,`$FRAME+80`($sp)
@ -488,7 +547,9 @@ $code.=<<___;
fmadd $T0b,$A0,$bb,$dotb
stfd $A2,24($nap_d) ; save a[j+1] in double format
stfd $A3,32($nap_d)
___
if ($SIZE_T==8 or $flavour =~ /osx/) {
$code.=<<___;
fmadd $T1a,$A0,$bc,$T1a
fmadd $T1b,$A0,$bd,$T1b
fmadd $T2a,$A1,$bc,$T2a
@ -561,11 +622,123 @@ $code.=<<___;
stfd $T3b,`$FRAME+56`($sp)
std $t0,8($tp) ; tp[j-1]
stdu $t4,16($tp) ; tp[j]
___
} else {
$code.=<<___;
fmadd $T1a,$A0,$bc,$T1a
fmadd $T1b,$A0,$bd,$T1b
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
fmadd $T2a,$A1,$bc,$T2a
fmadd $T2b,$A1,$bd,$T2b
stfd $N0,40($nap_d) ; save n[j] in double format
stfd $N1,48($nap_d)
srwi $c1,$t1,16
insrwi $carry,$t1,16,0
fmadd $T3a,$A2,$bc,$T3a
fmadd $T3b,$A2,$bd,$T3b
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
fmul $dota,$A3,$bc
fmul $dotb,$A3,$bd
stfd $N2,56($nap_d) ; save n[j+1] in double format
stfdu $N3,64($nap_d)
insrwi $t0,$t2,16,0 ; 0..31 bits
srwi $c1,$t3,16
insrwi $carry,$t3,16,0
fmadd $T1a,$N1,$na,$T1a
fmadd $T1b,$N1,$nb,$T1b
lwz $t3,`$FRAME+32`($sp) ; permuted $t1
lwz $t2,`$FRAME+36`($sp) ; permuted $t0
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
fmadd $T2a,$N2,$na,$T2a
fmadd $T2b,$N2,$nb,$T2b
srwi $c1,$t5,16
insrwi $carry,$t5,16,0
fmadd $T3a,$N3,$na,$T3a
fmadd $T3b,$N3,$nb,$T3b
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
fmadd $T0a,$N0,$na,$T0a
fmadd $T0b,$N0,$nb,$T0b
insrwi $t4,$t6,16,0 ; 32..63 bits
srwi $c1,$t7,16
insrwi $carry,$t7,16,0
fmadd $T1a,$N0,$nc,$T1a
fmadd $T1b,$N0,$nd,$T1b
lwz $t7,`$FRAME+40`($sp) ; permuted $t3
lwz $t6,`$FRAME+44`($sp) ; permuted $t2
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
fmadd $T2a,$N1,$nc,$T2a
fmadd $T2b,$N1,$nd,$T2b
stw $t0,12($tp) ; tp[j-1]
stw $t4,8($tp)
srwi $c1,$t3,16
insrwi $carry,$t3,16,0
fmadd $T3a,$N2,$nc,$T3a
fmadd $T3b,$N2,$nd,$T3b
lwz $t1,`$FRAME+48`($sp) ; permuted $t5
lwz $t0,`$FRAME+52`($sp) ; permuted $t4
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
fmadd $dota,$N3,$nc,$dota
fmadd $dotb,$N3,$nd,$dotb
insrwi $t2,$t6,16,0 ; 64..95 bits
srwi $c1,$t7,16
insrwi $carry,$t7,16,0
fctid $T0a,$T0a
fctid $T0b,$T0b
lwz $t5,`$FRAME+56`($sp) ; permuted $t7
lwz $t4,`$FRAME+60`($sp) ; permuted $t6
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
fctid $T1a,$T1a
fctid $T1b,$T1b
srwi $c1,$t1,16
insrwi $carry,$t1,16,0
fctid $T2a,$T2a
fctid $T2b,$T2b
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
fctid $T3a,$T3a
fctid $T3b,$T3b
insrwi $t0,$t4,16,0 ; 96..127 bits
srwi $c1,$t5,16
insrwi $carry,$t5,16,0
stfd $T0a,`$FRAME+0`($sp)
stfd $T0b,`$FRAME+8`($sp)
stfd $T1a,`$FRAME+16`($sp)
stfd $T1b,`$FRAME+24`($sp)
stfd $T2a,`$FRAME+32`($sp)
stfd $T2b,`$FRAME+40`($sp)
stfd $T3a,`$FRAME+48`($sp)
stfd $T3b,`$FRAME+56`($sp)
stw $t2,20($tp) ; tp[j]
stwu $t0,16($tp)
___
}
$code.=<<___;
bdnz- L1st
fctid $dota,$dota
fctid $dotb,$dotb
___
if ($SIZE_T==8 or $flavour =~ /osx/) {
$code.=<<___;
ld $t0,`$FRAME+0`($sp)
ld $t1,`$FRAME+8`($sp)
ld $t2,`$FRAME+16`($sp)
@ -611,33 +784,117 @@ $code.=<<___;
insrdi $t6,$t7,48,0
srdi $ovf,$t7,48
std $t6,8($tp) ; tp[num-1]
___
} else {
$code.=<<___;
lwz $t1,`$FRAME+0`($sp)
lwz $t0,`$FRAME+4`($sp)
lwz $t3,`$FRAME+8`($sp)
lwz $t2,`$FRAME+12`($sp)
lwz $t5,`$FRAME+16`($sp)
lwz $t4,`$FRAME+20`($sp)
lwz $t7,`$FRAME+24`($sp)
lwz $t6,`$FRAME+28`($sp)
stfd $dota,`$FRAME+64`($sp)
stfd $dotb,`$FRAME+72`($sp)
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
insrwi $carry,$t1,16,0
srwi $c1,$t1,16
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
insrwi $t0,$t2,16,0 ; 0..31 bits
insrwi $carry,$t3,16,0
srwi $c1,$t3,16
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
insrwi $carry,$t5,16,0
srwi $c1,$t5,16
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
insrwi $t4,$t6,16,0 ; 32..63 bits
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
stw $t0,12($tp) ; tp[j-1]
stw $t4,8($tp)
lwz $t3,`$FRAME+32`($sp) ; permuted $t1
lwz $t2,`$FRAME+36`($sp) ; permuted $t0
lwz $t7,`$FRAME+40`($sp) ; permuted $t3
lwz $t6,`$FRAME+44`($sp) ; permuted $t2
lwz $t1,`$FRAME+48`($sp) ; permuted $t5
lwz $t0,`$FRAME+52`($sp) ; permuted $t4
lwz $t5,`$FRAME+56`($sp) ; permuted $t7
lwz $t4,`$FRAME+60`($sp) ; permuted $t6
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
insrwi $carry,$t3,16,0
srwi $c1,$t3,16
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
insrwi $t2,$t6,16,0 ; 64..95 bits
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
insrwi $carry,$t1,16,0
srwi $c1,$t1,16
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
insrwi $t0,$t4,16,0 ; 96..127 bits
insrwi $carry,$t5,16,0
srwi $c1,$t5,16
stw $t2,20($tp) ; tp[j]
stwu $t0,16($tp)
lwz $t7,`$FRAME+64`($sp)
lwz $t6,`$FRAME+68`($sp)
lwz $t5,`$FRAME+72`($sp)
lwz $t4,`$FRAME+76`($sp)
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
addc $t4,$t4,$carry
adde $t5,$t5,$c1
insrwi $t6,$t4,16,0
srwi $t4,$t4,16
insrwi $t4,$t5,16,0
srwi $ovf,$t5,16
stw $t6,12($tp) ; tp[num-1]
stw $t4,8($tp)
___
}
$code.=<<___;
slwi $t7,$num,2
subf $nap_d,$t7,$nap_d ; rewind pointer
li $i,8 ; i=1
.align 5
Louter:
addi $tp,$sp,`$FRAME+$TRANSFER`
li $carry,0
mtctr $j
___
$code.=<<___ if ($SIZE_T==8);
ldx $t3,$bp,$i ; bp[i]
___
$code.=<<___ if ($SIZE_T==4);
add $t0,$bp,$i
lwz $t3,0($t0) ; bp[i,i+1]
lwz $t0,4($t0)
insrdi $t3,$t0,32,0
___
$code.=<<___;
ldx $t3,$bp,$i ; bp[i]
ld $t6,`$FRAME+$TRANSFER+8`($sp) ; tp[0]
mulld $t7,$a0,$t3 ; ap[0]*bp[i]
addi $tp,$sp,`$FRAME+$TRANSFER`
add $t7,$t7,$t6 ; ap[0]*bp[i]+tp[0]
li $carry,0
mulld $t7,$t7,$n0 ; tp[0]*n0
mtctr $j
mulld $t7,$a0,$t3 ; ap[0]*bp[i]
add $t7,$t7,$t6 ; ap[0]*bp[i]+tp[0]
; transfer bp[i] to FPU as 4x16-bit values
extrdi $t0,$t3,16,48
extrdi $t1,$t3,16,32
@ -647,6 +904,8 @@ $code.=<<___;
std $t1,`$FRAME+8`($sp)
std $t2,`$FRAME+16`($sp)
std $t3,`$FRAME+24`($sp)
mulld $t7,$t7,$n0 ; tp[0]*n0
; transfer (ap[0]*bp[i]+tp[0])*n0 to FPU as 4x16-bit values
extrdi $t4,$t7,16,48
extrdi $t5,$t7,16,32
@ -656,7 +915,50 @@ $code.=<<___;
std $t5,`$FRAME+40`($sp)
std $t6,`$FRAME+48`($sp)
std $t7,`$FRAME+56`($sp)
___
$code.=<<___ if ($SIZE_T==4);
add $t0,$bp,$i
li $c1,0
lwz $t1,0($t0) ; bp[i,i+1]
lwz $t3,4($t0)
mullw $t4,$a0,$t1 ; ap[0]*bp[i]
lwz $t0,`$FRAME+$TRANSFER+8+4`($sp) ; tp[0]
mulhwu $t5,$a0,$t1
lwz $t2,`$FRAME+$TRANSFER+8`($sp) ; tp[0]
mullw $t6,$a1,$t1
mullw $t7,$a0,$t3
add $t5,$t5,$t6
add $t5,$t5,$t7
addc $t4,$t4,$t0 ; ap[0]*bp[i]+tp[0]
adde $t5,$t5,$t2
; transfer bp[i] to FPU as 4x16-bit values
extrwi $t0,$t1,16,16
extrwi $t1,$t1,16,0
extrwi $t2,$t3,16,16
extrwi $t3,$t3,16,0
std $t0,`$FRAME+0`($sp) ; yes, std in 32-bit build
std $t1,`$FRAME+8`($sp)
std $t2,`$FRAME+16`($sp)
std $t3,`$FRAME+24`($sp)
mullw $t0,$t4,$n0 ; mulld tp[0]*n0
mulhwu $t1,$t4,$n0
mullw $t2,$t5,$n0
mullw $t3,$t4,$n1
add $t1,$t1,$t2
add $t1,$t1,$t3
; transfer (ap[0]*bp[i]+tp[0])*n0 to FPU as 4x16-bit values
extrwi $t4,$t0,16,16
extrwi $t5,$t0,16,0
extrwi $t6,$t1,16,16
extrwi $t7,$t1,16,0
std $t4,`$FRAME+32`($sp) ; yes, std in 32-bit build
std $t5,`$FRAME+40`($sp)
std $t6,`$FRAME+48`($sp)
std $t7,`$FRAME+56`($sp)
___
$code.=<<___;
lfd $A0,8($nap_d) ; load a[j] in double format
lfd $A1,16($nap_d)
lfd $A2,24($nap_d) ; load a[j+1] in double format
@ -769,7 +1071,9 @@ Linner:
fmul $dotb,$A3,$bd
lfd $A2,24($nap_d) ; load a[j+1] in double format
lfd $A3,32($nap_d)
___
if ($SIZE_T==8 or $flavour =~ /osx/) {
$code.=<<___;
fmadd $T1a,$N1,$na,$T1a
fmadd $T1b,$N1,$nb,$T1b
ld $t0,`$FRAME+0`($sp)
@ -856,10 +1160,131 @@ $code.=<<___;
addze $carry,$carry
std $t3,-16($tp) ; tp[j-1]
std $t5,-8($tp) ; tp[j]
___
} else {
$code.=<<___;
fmadd $T1a,$N1,$na,$T1a
fmadd $T1b,$N1,$nb,$T1b
lwz $t1,`$FRAME+0`($sp)
lwz $t0,`$FRAME+4`($sp)
fmadd $T2a,$N2,$na,$T2a
fmadd $T2b,$N2,$nb,$T2b
lwz $t3,`$FRAME+8`($sp)
lwz $t2,`$FRAME+12`($sp)
fmadd $T3a,$N3,$na,$T3a
fmadd $T3b,$N3,$nb,$T3b
lwz $t5,`$FRAME+16`($sp)
lwz $t4,`$FRAME+20`($sp)
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
fmadd $T0a,$N0,$na,$T0a
fmadd $T0b,$N0,$nb,$T0b
lwz $t7,`$FRAME+24`($sp)
lwz $t6,`$FRAME+28`($sp)
srwi $c1,$t1,16
insrwi $carry,$t1,16,0
fmadd $T1a,$N0,$nc,$T1a
fmadd $T1b,$N0,$nd,$T1b
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
fmadd $T2a,$N1,$nc,$T2a
fmadd $T2b,$N1,$nd,$T2b
insrwi $t0,$t2,16,0 ; 0..31 bits
srwi $c1,$t3,16
insrwi $carry,$t3,16,0
fmadd $T3a,$N2,$nc,$T3a
fmadd $T3b,$N2,$nd,$T3b
lwz $t2,12($tp) ; tp[j]
lwz $t3,8($tp)
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
fmadd $dota,$N3,$nc,$dota
fmadd $dotb,$N3,$nd,$dotb
srwi $c1,$t5,16
insrwi $carry,$t5,16,0
fctid $T0a,$T0a
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
fctid $T0b,$T0b
insrwi $t4,$t6,16,0 ; 32..63 bits
srwi $c1,$t7,16
insrwi $carry,$t7,16,0
fctid $T1a,$T1a
addc $t0,$t0,$t2
adde $t4,$t4,$t3
lwz $t3,`$FRAME+32`($sp) ; permuted $t1
lwz $t2,`$FRAME+36`($sp) ; permuted $t0
fctid $T1b,$T1b
addze $carry,$carry
addze $c1,$c1
stw $t0,4($tp) ; tp[j-1]
stw $t4,0($tp)
fctid $T2a,$T2a
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
lwz $t7,`$FRAME+40`($sp) ; permuted $t3
lwz $t6,`$FRAME+44`($sp) ; permuted $t2
fctid $T2b,$T2b
srwi $c1,$t3,16
insrwi $carry,$t3,16,0
lwz $t1,`$FRAME+48`($sp) ; permuted $t5
lwz $t0,`$FRAME+52`($sp) ; permuted $t4
fctid $T3a,$T3a
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
lwz $t5,`$FRAME+56`($sp) ; permuted $t7
lwz $t4,`$FRAME+60`($sp) ; permuted $t6
fctid $T3b,$T3b
insrwi $t2,$t6,16,0 ; 64..95 bits
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
lwz $t6,20($tp)
lwzu $t7,16($tp)
addc $t0,$t0,$carry
stfd $T0a,`$FRAME+0`($sp)
adde $t1,$t1,$c1
srwi $carry,$t0,16
stfd $T0b,`$FRAME+8`($sp)
insrwi $carry,$t1,16,0
srwi $c1,$t1,16
addc $t4,$t4,$carry
stfd $T1a,`$FRAME+16`($sp)
adde $t5,$t5,$c1
srwi $carry,$t4,16
insrwi $t0,$t4,16,0 ; 96..127 bits
stfd $T1b,`$FRAME+24`($sp)
insrwi $carry,$t5,16,0
srwi $c1,$t5,16
addc $t2,$t2,$t6
stfd $T2a,`$FRAME+32`($sp)
adde $t0,$t0,$t7
stfd $T2b,`$FRAME+40`($sp)
addze $carry,$carry
stfd $T3a,`$FRAME+48`($sp)
addze $c1,$c1
stfd $T3b,`$FRAME+56`($sp)
stw $t2,-4($tp) ; tp[j]
stw $t0,-8($tp)
___
}
$code.=<<___;
bdnz- Linner
fctid $dota,$dota
fctid $dotb,$dotb
___
if ($SIZE_T==8 or $flavour =~ /osx/) {
$code.=<<___;
ld $t0,`$FRAME+0`($sp)
ld $t1,`$FRAME+8`($sp)
ld $t2,`$FRAME+16`($sp)
@ -926,7 +1351,116 @@ $code.=<<___;
insrdi $t6,$t7,48,0
srdi $ovf,$t7,48
std $t6,0($tp) ; tp[num-1]
___
} else {
$code.=<<___;
lwz $t1,`$FRAME+0`($sp)
lwz $t0,`$FRAME+4`($sp)
lwz $t3,`$FRAME+8`($sp)
lwz $t2,`$FRAME+12`($sp)
lwz $t5,`$FRAME+16`($sp)
lwz $t4,`$FRAME+20`($sp)
lwz $t7,`$FRAME+24`($sp)
lwz $t6,`$FRAME+28`($sp)
stfd $dota,`$FRAME+64`($sp)
stfd $dotb,`$FRAME+72`($sp)
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
insrwi $carry,$t1,16,0
srwi $c1,$t1,16
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
insrwi $t0,$t2,16,0 ; 0..31 bits
lwz $t2,12($tp) ; tp[j]
insrwi $carry,$t3,16,0
srwi $c1,$t3,16
lwz $t3,8($tp)
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
insrwi $carry,$t5,16,0
srwi $c1,$t5,16
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
insrwi $t4,$t6,16,0 ; 32..63 bits
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
addc $t0,$t0,$t2
adde $t4,$t4,$t3
addze $carry,$carry
addze $c1,$c1
stw $t0,4($tp) ; tp[j-1]
stw $t4,0($tp)
lwz $t3,`$FRAME+32`($sp) ; permuted $t1
lwz $t2,`$FRAME+36`($sp) ; permuted $t0
lwz $t7,`$FRAME+40`($sp) ; permuted $t3
lwz $t6,`$FRAME+44`($sp) ; permuted $t2
lwz $t1,`$FRAME+48`($sp) ; permuted $t5
lwz $t0,`$FRAME+52`($sp) ; permuted $t4
lwz $t5,`$FRAME+56`($sp) ; permuted $t7
lwz $t4,`$FRAME+60`($sp) ; permuted $t6
addc $t2,$t2,$carry
adde $t3,$t3,$c1
srwi $carry,$t2,16
insrwi $carry,$t3,16,0
srwi $c1,$t3,16
addc $t6,$t6,$carry
adde $t7,$t7,$c1
srwi $carry,$t6,16
insrwi $t2,$t6,16,0 ; 64..95 bits
lwz $t6,20($tp)
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
lwzu $t7,16($tp)
addc $t0,$t0,$carry
adde $t1,$t1,$c1
srwi $carry,$t0,16
insrwi $carry,$t1,16,0
srwi $c1,$t1,16
addc $t4,$t4,$carry
adde $t5,$t5,$c1
srwi $carry,$t4,16
insrwi $t0,$t4,16,0 ; 96..127 bits
insrwi $carry,$t5,16,0
srwi $c1,$t5,16
addc $t2,$t2,$t6
adde $t0,$t0,$t7
lwz $t7,`$FRAME+64`($sp)
lwz $t6,`$FRAME+68`($sp)
addze $carry,$carry
addze $c1,$c1
lwz $t5,`$FRAME+72`($sp)
lwz $t4,`$FRAME+76`($sp)
addc $t6,$t6,$carry
adde $t7,$t7,$c1
stw $t2,-4($tp) ; tp[j]
stw $t0,-8($tp)
addc $t6,$t6,$ovf
addze $t7,$t7
srwi $carry,$t6,16
insrwi $carry,$t7,16,0
srwi $c1,$t7,16
addc $t4,$t4,$carry
adde $t5,$t5,$c1
insrwi $t6,$t4,16,0
srwi $t4,$t4,16
insrwi $t4,$t5,16,0
srwi $ovf,$t5,16
stw $t6,4($tp) ; tp[num-1]
stw $t4,0($tp)
___
}
$code.=<<___;
slwi $t7,$num,2
addi $i,$i,8
subf $nap_d,$t7,$nap_d ; rewind pointer
@ -994,14 +1528,14 @@ $code.=<<___ if ($SIZE_T==4);
mtctr $j
.align 4
Lsub: ld $t0,8($tp) ; load tp[j..j+3] in 64-bit word order
ldu $t2,16($tp)
Lsub: lwz $t0,12($tp) ; load tp[j..j+3] in 64-bit word order
lwz $t1,8($tp)
lwz $t2,20($tp)
lwzu $t3,16($tp)
lwz $t4,4($np) ; load np[j..j+3] in 32-bit word order
lwz $t5,8($np)
lwz $t6,12($np)
lwzu $t7,16($np)
extrdi $t1,$t0,32,0
extrdi $t3,$t2,32,0
subfe $t4,$t4,$t0 ; tp[j]-np[j]
stw $t0,4($ap) ; save tp[j..j+3] in 32-bit word order
subfe $t5,$t5,$t1 ; tp[j+1]-np[j+1]
@ -1052,6 +1586,9 @@ ___
$code.=<<___;
$POP $i,0($sp)
li r3,1 ; signal "handled"
$POP r19,`-12*8-13*$SIZE_T`($i)
$POP r20,`-12*8-12*$SIZE_T`($i)
$POP r21,`-12*8-11*$SIZE_T`($i)
$POP r22,`-12*8-10*$SIZE_T`($i)
$POP r23,`-12*8-9*$SIZE_T`($i)
$POP r24,`-12*8-8*$SIZE_T`($i)
@ -1077,8 +1614,9 @@ $code.=<<___;
mr $sp,$i
blr
.long 0
.byte 0,12,4,0,0x8c,10,6,0
.byte 0,12,4,0,0x8c,13,6,0
.long 0
.size .$fname,.-.$fname
.asciz "Montgomery Multiplication for PPC64, CRYPTOGAMS by <appro\@openssl.org>"
___

1898
crypto/bn/asm/rsaz-avx2.pl Executable file

File diff suppressed because it is too large Load Diff

2144
crypto/bn/asm/rsaz-x86_64.pl Executable file

File diff suppressed because it is too large Load Diff

1222
crypto/bn/asm/sparct4-mont.pl Executable file

File diff suppressed because it is too large Load Diff

190
crypto/bn/asm/sparcv9-gf2m.pl Executable file
View File

@ -0,0 +1,190 @@
#!/usr/bin/env perl
#
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
#
# October 2012
#
# The module implements bn_GF2m_mul_2x2 polynomial multiplication used
# in bn_gf2m.c. It's kind of low-hanging mechanical port from C for
# the time being... Except that it has two code paths: one suitable
# for all SPARCv9 processors and one for VIS3-capable ones. Former
# delivers ~25-45% more, more for longer keys, heaviest DH and DSA
# verify operations on venerable UltraSPARC II. On T4 VIS3 code is
# ~100-230% faster than gcc-generated code and ~35-90% faster than
# the pure SPARCv9 code path.
$locals=16*8;
$tab="%l0";
@T=("%g2","%g3");
@i=("%g4","%g5");
($a1,$a2,$a4,$a8,$a12,$a48)=map("%o$_",(0..5));
($lo,$hi,$b)=("%g1",$a8,"%o7"); $a=$lo;
$code.=<<___;
#include <sparc_arch.h>
#ifdef __arch64__
.register %g2,#scratch
.register %g3,#scratch
#endif
#ifdef __PIC__
SPARC_PIC_THUNK(%g1)
#endif
.globl bn_GF2m_mul_2x2
.align 16
bn_GF2m_mul_2x2:
SPARC_LOAD_ADDRESS_LEAF(OPENSSL_sparcv9cap_P,%g1,%g5)
ld [%g1+0],%g1 ! OPENSSL_sparcv9cap_P[0]
andcc %g1, SPARCV9_VIS3, %g0
bz,pn %icc,.Lsoftware
nop
sllx %o1, 32, %o1
sllx %o3, 32, %o3
or %o2, %o1, %o1
or %o4, %o3, %o3
.word 0x95b262ab ! xmulx %o1, %o3, %o2
.word 0x99b262cb ! xmulxhi %o1, %o3, %o4
srlx %o2, 32, %o1 ! 13 cycles later
st %o2, [%o0+0]
st %o1, [%o0+4]
srlx %o4, 32, %o3
st %o4, [%o0+8]
retl
st %o3, [%o0+12]
.align 16
.Lsoftware:
save %sp,-STACK_FRAME-$locals,%sp
sllx %i1,32,$a
mov -1,$a12
sllx %i3,32,$b
or %i2,$a,$a
srlx $a12,1,$a48 ! 0x7fff...
or %i4,$b,$b
srlx $a12,2,$a12 ! 0x3fff...
add %sp,STACK_BIAS+STACK_FRAME,$tab
sllx $a,2,$a4
mov $a,$a1
sllx $a,1,$a2
srax $a4,63,@i[1] ! broadcast 61st bit
and $a48,$a4,$a4 ! (a<<2)&0x7fff...
srlx $a48,2,$a48
srax $a2,63,@i[0] ! broadcast 62nd bit
and $a12,$a2,$a2 ! (a<<1)&0x3fff...
srax $a1,63,$lo ! broadcast 63rd bit
and $a48,$a1,$a1 ! (a<<0)&0x1fff...
sllx $a1,3,$a8
and $b,$lo,$lo
and $b,@i[0],@i[0]
and $b,@i[1],@i[1]
stx %g0,[$tab+0*8] ! tab[0]=0
xor $a1,$a2,$a12
stx $a1,[$tab+1*8] ! tab[1]=a1
stx $a2,[$tab+2*8] ! tab[2]=a2
xor $a4,$a8,$a48
stx $a12,[$tab+3*8] ! tab[3]=a1^a2
xor $a4,$a1,$a1
stx $a4,[$tab+4*8] ! tab[4]=a4
xor $a4,$a2,$a2
stx $a1,[$tab+5*8] ! tab[5]=a1^a4
xor $a4,$a12,$a12
stx $a2,[$tab+6*8] ! tab[6]=a2^a4
xor $a48,$a1,$a1
stx $a12,[$tab+7*8] ! tab[7]=a1^a2^a4
xor $a48,$a2,$a2
stx $a8,[$tab+8*8] ! tab[8]=a8
xor $a48,$a12,$a12
stx $a1,[$tab+9*8] ! tab[9]=a1^a8
xor $a4,$a1,$a1
stx $a2,[$tab+10*8] ! tab[10]=a2^a8
xor $a4,$a2,$a2
stx $a12,[$tab+11*8] ! tab[11]=a1^a2^a8
xor $a4,$a12,$a12
stx $a48,[$tab+12*8] ! tab[12]=a4^a8
srlx $lo,1,$hi
stx $a1,[$tab+13*8] ! tab[13]=a1^a4^a8
sllx $lo,63,$lo
stx $a2,[$tab+14*8] ! tab[14]=a2^a4^a8
srlx @i[0],2,@T[0]
stx $a12,[$tab+15*8] ! tab[15]=a1^a2^a4^a8
sllx @i[0],62,$a1
sllx $b,3,@i[0]
srlx @i[1],3,@T[1]
and @i[0],`0xf<<3`,@i[0]
sllx @i[1],61,$a2
ldx [$tab+@i[0]],@i[0]
srlx $b,4-3,@i[1]
xor @T[0],$hi,$hi
and @i[1],`0xf<<3`,@i[1]
xor $a1,$lo,$lo
ldx [$tab+@i[1]],@i[1]
xor @T[1],$hi,$hi
xor @i[0],$lo,$lo
srlx $b,8-3,@i[0]
xor $a2,$lo,$lo
and @i[0],`0xf<<3`,@i[0]
___
for($n=1;$n<14;$n++) {
$code.=<<___;
sllx @i[1],`$n*4`,@T[0]
ldx [$tab+@i[0]],@i[0]
srlx @i[1],`64-$n*4`,@T[1]
xor @T[0],$lo,$lo
srlx $b,`($n+2)*4`-3,@i[1]
xor @T[1],$hi,$hi
and @i[1],`0xf<<3`,@i[1]
___
push(@i,shift(@i)); push(@T,shift(@T));
}
$code.=<<___;
sllx @i[1],`$n*4`,@T[0]
ldx [$tab+@i[0]],@i[0]
srlx @i[1],`64-$n*4`,@T[1]
xor @T[0],$lo,$lo
sllx @i[0],`($n+1)*4`,@T[0]
xor @T[1],$hi,$hi
srlx @i[0],`64-($n+1)*4`,@T[1]
xor @T[0],$lo,$lo
xor @T[1],$hi,$hi
srlx $lo,32,%i1
st $lo,[%i0+0]
st %i1,[%i0+4]
srlx $hi,32,%i2
st $hi,[%i0+8]
st %i2,[%i0+12]
ret
restore
.type bn_GF2m_mul_2x2,#function
.size bn_GF2m_mul_2x2,.-bn_GF2m_mul_2x2
.asciz "GF(2^m) Multiplication for SPARCv9, CRYPTOGAMS by <appro\@openssl.org>"
.align 4
___
$code =~ s/\`([^\`]*)\`/eval($1)/gem;
print $code;
close STDOUT;

373
crypto/bn/asm/vis3-mont.pl Executable file
View File

@ -0,0 +1,373 @@
#!/usr/bin/env perl
# ====================================================================
# Written by Andy Polyakov <appro@openssl.org> for the OpenSSL
# project. The module is, however, dual licensed under OpenSSL and
# CRYPTOGAMS licenses depending on where you obtain it. For further
# details see http://www.openssl.org/~appro/cryptogams/.
# ====================================================================
# October 2012.
#
# SPARCv9 VIS3 Montgomery multiplicaion procedure suitable for T3 and
# onward. There are three new instructions used here: umulxhi,
# addxc[cc] and initializing store. On T3 RSA private key operations
# are 1.54/1.87/2.11/2.26 times faster for 512/1024/2048/4096-bit key
# lengths. This is without dedicated squaring procedure. On T4
# corresponding coefficients are 1.47/2.10/2.80/2.90x, which is mostly
# for reference purposes, because T4 has dedicated Montgomery
# multiplication and squaring *instructions* that deliver even more.
$bits=32;
for (@ARGV) { $bits=64 if (/\-m64/ || /\-xarch\=v9/); }
if ($bits==64) { $bias=2047; $frame=192; }
else { $bias=0; $frame=112; }
$code.=<<___ if ($bits==64);
.register %g2,#scratch
.register %g3,#scratch
___
$code.=<<___;
.section ".text",#alloc,#execinstr
___
($n0,$m0,$m1,$lo0,$hi0, $lo1,$hi1,$aj,$alo,$nj,$nlo,$tj)=
(map("%g$_",(1..5)),map("%o$_",(0..5,7)));
# int bn_mul_mont(
$rp="%o0"; # BN_ULONG *rp,
$ap="%o1"; # const BN_ULONG *ap,
$bp="%o2"; # const BN_ULONG *bp,
$np="%o3"; # const BN_ULONG *np,
$n0p="%o4"; # const BN_ULONG *n0,
$num="%o5"; # int num); # caller ensures that num is even
# and >=6
$code.=<<___;
.globl bn_mul_mont_vis3
.align 32
bn_mul_mont_vis3:
add %sp, $bias, %g4 ! real top of stack
sll $num, 2, $num ! size in bytes
add $num, 63, %g5
andn %g5, 63, %g5 ! buffer size rounded up to 64 bytes
add %g5, %g5, %g1
add %g5, %g1, %g1 ! 3*buffer size
sub %g4, %g1, %g1
andn %g1, 63, %g1 ! align at 64 byte
sub %g1, $frame, %g1 ! new top of stack
sub %g1, %g4, %g1
save %sp, %g1, %sp
___
# +-------------------------------+<----- %sp
# . .
# +-------------------------------+<----- aligned at 64 bytes
# | __int64 tmp[0] |
# +-------------------------------+
# . .
# . .
# +-------------------------------+<----- aligned at 64 bytes
# | __int64 ap[1..0] | converted ap[]
# +-------------------------------+
# | __int64 np[1..0] | converted np[]
# +-------------------------------+
# | __int64 ap[3..2] |
# . .
# . .
# +-------------------------------+
($rp,$ap,$bp,$np,$n0p,$num)=map("%i$_",(0..5));
($t0,$t1,$t2,$t3,$cnt,$tp,$bufsz,$anp)=map("%l$_",(0..7));
($ovf,$i)=($t0,$t1);
$code.=<<___;
ld [$n0p+0], $t0 ! pull n0[0..1] value
add %sp, $bias+$frame, $tp
ld [$n0p+4], $t1
add $tp, %g5, $anp
ld [$bp+0], $t2 ! m0=bp[0]
sllx $t1, 32, $n0
ld [$bp+4], $t3
or $t0, $n0, $n0
add $bp, 8, $bp
ld [$ap+0], $t0 ! ap[0]
sllx $t3, 32, $m0
ld [$ap+4], $t1
or $t2, $m0, $m0
ld [$ap+8], $t2 ! ap[1]
sllx $t1, 32, $aj
ld [$ap+12], $t3
or $t0, $aj, $aj
add $ap, 16, $ap
stx $aj, [$anp] ! converted ap[0]
mulx $aj, $m0, $lo0 ! ap[0]*bp[0]
umulxhi $aj, $m0, $hi0
ld [$np+0], $t0 ! np[0]
sllx $t3, 32, $aj
ld [$np+4], $t1
or $t2, $aj, $aj
ld [$np+8], $t2 ! np[1]
sllx $t1, 32, $nj
ld [$np+12], $t3
or $t0, $nj, $nj
add $np, 16, $np
stx $nj, [$anp+8] ! converted np[0]
mulx $lo0, $n0, $m1 ! "tp[0]"*n0
stx $aj, [$anp+16] ! converted ap[1]
mulx $aj, $m0, $alo ! ap[1]*bp[0]
umulxhi $aj, $m0, $aj ! ahi=aj
mulx $nj, $m1, $lo1 ! np[0]*m1
umulxhi $nj, $m1, $hi1
sllx $t3, 32, $nj
or $t2, $nj, $nj
stx $nj, [$anp+24] ! converted np[1]
add $anp, 32, $anp
addcc $lo0, $lo1, $lo1
addxc %g0, $hi1, $hi1
mulx $nj, $m1, $nlo ! np[1]*m1
umulxhi $nj, $m1, $nj ! nhi=nj
ba .L1st
sub $num, 24, $cnt ! cnt=num-3
.align 16
.L1st:
ld [$ap+0], $t0 ! ap[j]
addcc $alo, $hi0, $lo0
ld [$ap+4], $t1
addxc $aj, %g0, $hi0
sllx $t1, 32, $aj
add $ap, 8, $ap
or $t0, $aj, $aj
stx $aj, [$anp] ! converted ap[j]
ld [$np+0], $t2 ! np[j]
addcc $nlo, $hi1, $lo1
ld [$np+4], $t3
addxc $nj, %g0, $hi1 ! nhi=nj
sllx $t3, 32, $nj
add $np, 8, $np
mulx $aj, $m0, $alo ! ap[j]*bp[0]
or $t2, $nj, $nj
umulxhi $aj, $m0, $aj ! ahi=aj
stx $nj, [$anp+8] ! converted np[j]
add $anp, 16, $anp ! anp++
mulx $nj, $m1, $nlo ! np[j]*m1
addcc $lo0, $lo1, $lo1 ! np[j]*m1+ap[j]*bp[0]
umulxhi $nj, $m1, $nj ! nhi=nj
addxc %g0, $hi1, $hi1
stx $lo1, [$tp] ! tp[j-1]
add $tp, 8, $tp ! tp++
brnz,pt $cnt, .L1st
sub $cnt, 8, $cnt ! j--
!.L1st
addcc $alo, $hi0, $lo0
addxc $aj, %g0, $hi0 ! ahi=aj
addcc $nlo, $hi1, $lo1
addxc $nj, %g0, $hi1
addcc $lo0, $lo1, $lo1 ! np[j]*m1+ap[j]*bp[0]
addxc %g0, $hi1, $hi1
stx $lo1, [$tp] ! tp[j-1]
add $tp, 8, $tp
addcc $hi0, $hi1, $hi1
addxc %g0, %g0, $ovf ! upmost overflow bit
stx $hi1, [$tp]
add $tp, 8, $tp
ba .Louter
sub $num, 16, $i ! i=num-2
.align 16
.Louter:
ld [$bp+0], $t2 ! m0=bp[i]
ld [$bp+4], $t3
sub $anp, $num, $anp ! rewind
sub $tp, $num, $tp
sub $anp, $num, $anp
add $bp, 8, $bp
sllx $t3, 32, $m0
ldx [$anp+0], $aj ! ap[0]
or $t2, $m0, $m0
ldx [$anp+8], $nj ! np[0]
mulx $aj, $m0, $lo0 ! ap[0]*bp[i]
ldx [$tp], $tj ! tp[0]
umulxhi $aj, $m0, $hi0
ldx [$anp+16], $aj ! ap[1]
addcc $lo0, $tj, $lo0 ! ap[0]*bp[i]+tp[0]
mulx $aj, $m0, $alo ! ap[1]*bp[i]
addxc %g0, $hi0, $hi0
mulx $lo0, $n0, $m1 ! tp[0]*n0
umulxhi $aj, $m0, $aj ! ahi=aj
mulx $nj, $m1, $lo1 ! np[0]*m1
umulxhi $nj, $m1, $hi1
ldx [$anp+24], $nj ! np[1]
add $anp, 32, $anp
addcc $lo1, $lo0, $lo1
mulx $nj, $m1, $nlo ! np[1]*m1
addxc %g0, $hi1, $hi1
umulxhi $nj, $m1, $nj ! nhi=nj
ba .Linner
sub $num, 24, $cnt ! cnt=num-3
.align 16
.Linner:
addcc $alo, $hi0, $lo0
ldx [$tp+8], $tj ! tp[j]
addxc $aj, %g0, $hi0 ! ahi=aj
ldx [$anp+0], $aj ! ap[j]
addcc $nlo, $hi1, $lo1
mulx $aj, $m0, $alo ! ap[j]*bp[i]
addxc $nj, %g0, $hi1 ! nhi=nj
ldx [$anp+8], $nj ! np[j]
add $anp, 16, $anp
umulxhi $aj, $m0, $aj ! ahi=aj
addcc $lo0, $tj, $lo0 ! ap[j]*bp[i]+tp[j]
mulx $nj, $m1, $nlo ! np[j]*m1
addxc %g0, $hi0, $hi0
umulxhi $nj, $m1, $nj ! nhi=nj
addcc $lo1, $lo0, $lo1 ! np[j]*m1+ap[j]*bp[i]+tp[j]
addxc %g0, $hi1, $hi1
stx $lo1, [$tp] ! tp[j-1]
add $tp, 8, $tp
brnz,pt $cnt, .Linner
sub $cnt, 8, $cnt
!.Linner
ldx [$tp+8], $tj ! tp[j]
addcc $alo, $hi0, $lo0
addxc $aj, %g0, $hi0 ! ahi=aj
addcc $lo0, $tj, $lo0 ! ap[j]*bp[i]+tp[j]
addxc %g0, $hi0, $hi0
addcc $nlo, $hi1, $lo1
addxc $nj, %g0, $hi1 ! nhi=nj
addcc $lo1, $lo0, $lo1 ! np[j]*m1+ap[j]*bp[i]+tp[j]
addxc %g0, $hi1, $hi1
stx $lo1, [$tp] ! tp[j-1]
subcc %g0, $ovf, %g0 ! move upmost overflow to CCR.xcc
addxccc $hi1, $hi0, $hi1
addxc %g0, %g0, $ovf
stx $hi1, [$tp+8]
add $tp, 16, $tp
brnz,pt $i, .Louter
sub $i, 8, $i
sub $anp, $num, $anp ! rewind
sub $tp, $num, $tp
sub $anp, $num, $anp
ba .Lsub
subcc $num, 8, $cnt ! cnt=num-1 and clear CCR.xcc
.align 16
.Lsub:
ldx [$tp], $tj
add $tp, 8, $tp
ldx [$anp+8], $nj
add $anp, 16, $anp
subccc $tj, $nj, $t2 ! tp[j]-np[j]
srlx $tj, 32, $tj
srlx $nj, 32, $nj
subccc $tj, $nj, $t3
add $rp, 8, $rp
st $t2, [$rp-4] ! reverse order
st $t3, [$rp-8]
brnz,pt $cnt, .Lsub
sub $cnt, 8, $cnt
sub $anp, $num, $anp ! rewind
sub $tp, $num, $tp
sub $anp, $num, $anp
sub $rp, $num, $rp
subc $ovf, %g0, $ovf ! handle upmost overflow bit
and $tp, $ovf, $ap
andn $rp, $ovf, $np
or $np, $ap, $ap ! ap=borrow?tp:rp
ba .Lcopy
sub $num, 8, $cnt
.align 16
.Lcopy: ! copy or in-place refresh
ld [$ap+0], $t2
ld [$ap+4], $t3
add $ap, 8, $ap
stx %g0, [$tp] ! zap
add $tp, 8, $tp
stx %g0, [$anp] ! zap
stx %g0, [$anp+8]
add $anp, 16, $anp
st $t3, [$rp+0] ! flip order
st $t2, [$rp+4]
add $rp, 8, $rp
brnz $cnt, .Lcopy
sub $cnt, 8, $cnt
mov 1, %o0
ret
restore
.type bn_mul_mont_vis3, #function
.size bn_mul_mont_vis3, .-bn_mul_mont_vis3
.asciz "Montgomery Multiplication for SPARCv9 VIS3, CRYPTOGAMS by <appro\@openssl.org>"
.align 4
___
# Purpose of these subroutines is to explicitly encode VIS instructions,
# so that one can compile the module without having to specify VIS
# extentions on compiler command line, e.g. -xarch=v9 vs. -xarch=v9a.
# Idea is to reserve for option to produce "universal" binary and let
# programmer detect if current CPU is VIS capable at run-time.
sub unvis3 {
my ($mnemonic,$rs1,$rs2,$rd)=@_;
my %bias = ( "g" => 0, "o" => 8, "l" => 16, "i" => 24 );
my ($ref,$opf);
my %visopf = ( "addxc" => 0x011,
"addxccc" => 0x013,
"umulxhi" => 0x016 );
$ref = "$mnemonic\t$rs1,$rs2,$rd";
if ($opf=$visopf{$mnemonic}) {
foreach ($rs1,$rs2,$rd) {
return $ref if (!/%([goli])([0-9])/);
$_=$bias{$1}+$2;
}
return sprintf ".word\t0x%08x !%s",
0x81b00000|$rd<<25|$rs1<<14|$opf<<5|$rs2,
$ref;
} else {
return $ref;
}
}
foreach (split("\n",$code)) {
s/\`([^\`]*)\`/eval $1/ge;
s/\b(umulxhi|addxc[c]{0,2})\s+(%[goli][0-7]),\s*(%[goli][0-7]),\s*(%[goli][0-7])/
&unvis3($1,$2,$3,$4)
/ge;
print $_,"\n";
}
close STDOUT;

View File

@ -55,7 +55,7 @@
* machine.
*/
# ifdef _WIN64
# if defined(_WIN64) || !defined(__LP64__)
# define BN_ULONG unsigned long long
# else
# define BN_ULONG unsigned long
@ -63,7 +63,6 @@
# undef mul
# undef mul_add
# undef sqr
/*-
* "m"(a), "+m"(r) is the way to favor DirectPath µ-code;
@ -99,8 +98,8 @@
: "cc"); \
(r)=carry, carry=high; \
} while (0)
# define sqr(r0,r1,a) \
# undef sqr
# define sqr(r0,r1,a) \
asm ("mulq %2" \
: "=a"(r0),"=d"(r1) \
: "a"(a) \
@ -204,20 +203,22 @@ BN_ULONG bn_div_words(BN_ULONG h, BN_ULONG l, BN_ULONG d)
BN_ULONG bn_add_words(BN_ULONG *rp, const BN_ULONG *ap, const BN_ULONG *bp,
int n)
{
BN_ULONG ret = 0, i = 0;
BN_ULONG ret;
size_t i = 0;
if (n <= 0)
return 0;
asm volatile (" subq %2,%2 \n"
asm volatile (" subq %0,%0 \n" /* clear carry */
" jmp 1f \n"
".p2align 4 \n"
"1: movq (%4,%2,8),%0 \n"
" adcq (%5,%2,8),%0 \n"
" movq %0,(%3,%2,8) \n"
" leaq 1(%2),%2 \n"
" lea 1(%2),%2 \n"
" loop 1b \n"
" sbbq %0,%0 \n":"=&a" (ret), "+c"(n),
"=&r"(i)
" sbbq %0,%0 \n":"=&r" (ret), "+c"(n),
"+r"(i)
:"r"(rp), "r"(ap), "r"(bp)
:"cc", "memory");
@ -228,20 +229,22 @@ BN_ULONG bn_add_words(BN_ULONG *rp, const BN_ULONG *ap, const BN_ULONG *bp,
BN_ULONG bn_sub_words(BN_ULONG *rp, const BN_ULONG *ap, const BN_ULONG *bp,
int n)
{
BN_ULONG ret = 0, i = 0;
BN_ULONG ret;
size_t i = 0;
if (n <= 0)
return 0;
asm volatile (" subq %2,%2 \n"
asm volatile (" subq %0,%0 \n" /* clear borrow */
" jmp 1f \n"
".p2align 4 \n"
"1: movq (%4,%2,8),%0 \n"
" sbbq (%5,%2,8),%0 \n"
" movq %0,(%3,%2,8) \n"
" leaq 1(%2),%2 \n"
" lea 1(%2),%2 \n"
" loop 1b \n"
" sbbq %0,%0 \n":"=&a" (ret), "+c"(n),
"=&r"(i)
" sbbq %0,%0 \n":"=&r" (ret), "+c"(n),
"+r"(i)
:"r"(rp), "r"(ap), "r"(bp)
:"cc", "memory");
@ -313,55 +316,58 @@ BN_ULONG bn_sub_words(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b, int n)
*/
# if 0
/* original macros are kept for reference purposes */
# define mul_add_c(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b); \
t1 = ta * tb; \
t2 = BN_UMULT_HIGH(ta,tb); \
c0 += t1; t2 += (c0<t1)?1:0; \
c1 += t2; c2 += (c1<t2)?1:0; \
}
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b); \
BN_ULONG lo, hi; \
BN_UMULT_LOHI(lo,hi,ta,tb); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define mul_add_c2(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b),t0; \
t1 = BN_UMULT_HIGH(ta,tb); \
t0 = ta * tb; \
c0 += t0; t2 = t1+((c0<t0)?1:0);\
c1 += t2; c2 += (c1<t2)?1:0; \
c0 += t0; t1 += (c0<t0)?1:0; \
c1 += t1; c2 += (c1<t1)?1:0; \
}
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b); \
BN_ULONG lo, hi, tt; \
BN_UMULT_LOHI(lo,hi,ta,tb); \
c0 += lo; tt = hi+((c0<lo)?1:0); \
c1 += tt; c2 += (c1<tt)?1:0; \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG ta = (a)[i]; \
BN_ULONG lo, hi; \
BN_UMULT_LOHI(lo,hi,ta,ta); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# else
# define mul_add_c(a,b,c0,c1,c2) do { \
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG t1,t2; \
asm ("mulq %3" \
: "=a"(t1),"=d"(t2) \
: "a"(a),"m"(b) \
: "cc"); \
asm ("addq %2,%0; adcq %3,%1" \
: "+r"(c0),"+d"(t2) \
: "a"(t1),"g"(0) \
: "cc"); \
asm ("addq %2,%0; adcq %3,%1" \
: "+r"(c1),"+r"(c2) \
: "d"(t2),"g"(0) \
: "cc"); \
asm ("addq %3,%0; adcq %4,%1; adcq %5,%2" \
: "+r"(c0),"+r"(c1),"+r"(c2) \
: "r"(t1),"r"(t2),"g"(0) \
: "cc"); \
} while (0)
# define sqr_add_c(a,i,c0,c1,c2) do { \
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG t1,t2; \
asm ("mulq %2" \
: "=a"(t1),"=d"(t2) \
: "a"(a[i]) \
: "cc"); \
asm ("addq %2,%0; adcq %3,%1" \
: "+r"(c0),"+d"(t2) \
: "a"(t1),"g"(0) \
: "cc"); \
asm ("addq %2,%0; adcq %3,%1" \
: "+r"(c1),"+r"(c2) \
: "d"(t2),"g"(0) \
: "cc"); \
asm ("addq %3,%0; adcq %4,%1; adcq %5,%2" \
: "+r"(c0),"+r"(c1),"+r"(c2) \
: "r"(t1),"r"(t2),"g"(0) \
: "cc"); \
} while (0)
# define mul_add_c2(a,b,c0,c1,c2) do { \
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG t1,t2; \
asm ("mulq %3" \
: "=a"(t1),"=d"(t2) \
: "a"(a),"m"(b) \
@ -382,7 +388,6 @@ BN_ULONG bn_sub_words(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b, int n)
void bn_mul_comba8(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
{
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -486,7 +491,6 @@ void bn_mul_comba8(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
void bn_mul_comba4(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
{
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -526,7 +530,6 @@ void bn_mul_comba4(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
void bn_sqr_comba8(BN_ULONG *r, const BN_ULONG *a)
{
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -602,7 +605,6 @@ void bn_sqr_comba8(BN_ULONG *r, const BN_ULONG *a)
void bn_sqr_comba4(BN_ULONG *r, const BN_ULONG *a)
{
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@ -256,24 +256,6 @@ extern "C" {
# define BN_HEX_FMT2 "%08X"
# endif
/*
* 2011-02-22 SMS. In various places, a size_t variable or a type cast to
* size_t was used to perform integer-only operations on pointers. This
* failed on VMS with 64-bit pointers (CC /POINTER_SIZE = 64) because size_t
* is still only 32 bits. What's needed in these cases is an integer type
* with the same size as a pointer, which size_t is not certain to be. The
* only fix here is VMS-specific.
*/
# if defined(OPENSSL_SYS_VMS)
# if __INITIAL_POINTER_SIZE == 64
# define PTR_SIZE_INT long long
# else /* __INITIAL_POINTER_SIZE == 64 */
# define PTR_SIZE_INT int
# endif /* __INITIAL_POINTER_SIZE == 64 [else] */
# else /* defined(OPENSSL_SYS_VMS) */
# define PTR_SIZE_INT size_t
# endif /* defined(OPENSSL_SYS_VMS) [else] */
# define BN_DEFAULT_BITS 1280
# define BN_FLG_MALLOCED 0x01

View File

@ -489,121 +489,144 @@ BN_ULONG bn_sub_words(BN_ULONG *r, const BN_ULONG *a, const BN_ULONG *b,
* c=(c2,c1,c0)
*/
/*
* Keep in mind that carrying into high part of multiplication result
* can not overflow, because it cannot be all-ones.
*/
# ifdef BN_LLONG
# define mul_add_c(a,b,c0,c1,c2) \
t=(BN_ULLONG)a*b; \
t1=(BN_ULONG)Lw(t); \
t2=(BN_ULONG)Hw(t); \
c0=(c0+t1)&BN_MASK2; if ((c0) < t1) t2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
/*
* Keep in mind that additions to multiplication result can not
* overflow, because its high half cannot be all-ones.
*/
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG hi; \
BN_ULLONG t = (BN_ULLONG)(a)*(b); \
t += c0; /* no carry */ \
c0 = (BN_ULONG)Lw(t); \
hi = (BN_ULONG)Hw(t); \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define mul_add_c2(a,b,c0,c1,c2) \
t=(BN_ULLONG)a*b; \
tt=(t+t)&BN_MASK; \
if (tt < t) c2++; \
t1=(BN_ULONG)Lw(tt); \
t2=(BN_ULONG)Hw(tt); \
c0=(c0+t1)&BN_MASK2; \
if ((c0 < t1) && (((++t2)&BN_MASK2) == 0)) c2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG hi; \
BN_ULLONG t = (BN_ULLONG)(a)*(b); \
BN_ULLONG tt = t+c0; /* no carry */ \
c0 = (BN_ULONG)Lw(tt); \
hi = (BN_ULONG)Hw(tt); \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
t += c0; /* no carry */ \
c0 = (BN_ULONG)Lw(t); \
hi = (BN_ULONG)Hw(t); \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define sqr_add_c(a,i,c0,c1,c2) \
t=(BN_ULLONG)a[i]*a[i]; \
t1=(BN_ULONG)Lw(t); \
t2=(BN_ULONG)Hw(t); \
c0=(c0+t1)&BN_MASK2; if ((c0) < t1) t2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG hi; \
BN_ULLONG t = (BN_ULLONG)a[i]*a[i]; \
t += c0; /* no carry */ \
c0 = (BN_ULONG)Lw(t); \
hi = (BN_ULONG)Hw(t); \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define sqr_add_c2(a,i,j,c0,c1,c2) \
mul_add_c2((a)[i],(a)[j],c0,c1,c2)
# elif defined(BN_UMULT_LOHI)
/*
* Keep in mind that additions to hi can not overflow, because
* the high word of a multiplication result cannot be all-ones.
*/
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b); \
BN_ULONG lo, hi; \
BN_UMULT_LOHI(lo,hi,ta,tb); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define mul_add_c(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b); \
BN_UMULT_LOHI(t1,t2,ta,tb); \
c0 += t1; t2 += (c0<t1)?1:0; \
c1 += t2; c2 += (c1<t2)?1:0; \
}
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b); \
BN_ULONG lo, hi, tt; \
BN_UMULT_LOHI(lo,hi,ta,tb); \
c0 += lo; tt = hi+((c0<lo)?1:0); \
c1 += tt; c2 += (c1<tt)?1:0; \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define mul_add_c2(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b),t0; \
BN_UMULT_LOHI(t0,t1,ta,tb); \
c0 += t0; t2 = t1+((c0<t0)?1:0);\
c1 += t2; c2 += (c1<t2)?1:0; \
c0 += t0; t1 += (c0<t0)?1:0; \
c1 += t1; c2 += (c1<t1)?1:0; \
}
# define sqr_add_c(a,i,c0,c1,c2) { \
BN_ULONG ta=(a)[i]; \
BN_UMULT_LOHI(t1,t2,ta,ta); \
c0 += t1; t2 += (c0<t1)?1:0; \
c1 += t2; c2 += (c1<t2)?1:0; \
}
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG ta = (a)[i]; \
BN_ULONG lo, hi; \
BN_UMULT_LOHI(lo,hi,ta,ta); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define sqr_add_c2(a,i,j,c0,c1,c2) \
mul_add_c2((a)[i],(a)[j],c0,c1,c2)
# elif defined(BN_UMULT_HIGH)
/*
* Keep in mind that additions to hi can not overflow, because
* the high word of a multiplication result cannot be all-ones.
*/
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b); \
BN_ULONG lo = ta * tb; \
BN_ULONG hi = BN_UMULT_HIGH(ta,tb); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define mul_add_c(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b); \
t1 = ta * tb; \
t2 = BN_UMULT_HIGH(ta,tb); \
c0 += t1; t2 += (c0<t1)?1:0; \
c1 += t2; c2 += (c1<t2)?1:0; \
}
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG ta = (a), tb = (b), tt; \
BN_ULONG lo = ta * tb; \
BN_ULONG hi = BN_UMULT_HIGH(ta,tb); \
c0 += lo; tt = hi + ((c0<lo)?1:0); \
c1 += tt; c2 += (c1<tt)?1:0; \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define mul_add_c2(a,b,c0,c1,c2) { \
BN_ULONG ta=(a),tb=(b),t0; \
t1 = BN_UMULT_HIGH(ta,tb); \
t0 = ta * tb; \
c0 += t0; t2 = t1+((c0<t0)?1:0);\
c1 += t2; c2 += (c1<t2)?1:0; \
c0 += t0; t1 += (c0<t0)?1:0; \
c1 += t1; c2 += (c1<t1)?1:0; \
}
# define sqr_add_c(a,i,c0,c1,c2) { \
BN_ULONG ta=(a)[i]; \
t1 = ta * ta; \
t2 = BN_UMULT_HIGH(ta,ta); \
c0 += t1; t2 += (c0<t1)?1:0; \
c1 += t2; c2 += (c1<t2)?1:0; \
}
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG ta = (a)[i]; \
BN_ULONG lo = ta * ta; \
BN_ULONG hi = BN_UMULT_HIGH(ta,ta); \
c0 += lo; hi += (c0<lo)?1:0; \
c1 += hi; c2 += (c1<hi)?1:0; \
} while(0)
# define sqr_add_c2(a,i,j,c0,c1,c2) \
mul_add_c2((a)[i],(a)[j],c0,c1,c2)
# else /* !BN_LLONG */
# define mul_add_c(a,b,c0,c1,c2) \
t1=LBITS(a); t2=HBITS(a); \
bl=LBITS(b); bh=HBITS(b); \
mul64(t1,t2,bl,bh); \
c0=(c0+t1)&BN_MASK2; if ((c0) < t1) t2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
/*
* Keep in mind that additions to hi can not overflow, because
* the high word of a multiplication result cannot be all-ones.
*/
# define mul_add_c(a,b,c0,c1,c2) do { \
BN_ULONG lo = LBITS(a), hi = HBITS(a); \
BN_ULONG bl = LBITS(b), bh = HBITS(b); \
mul64(lo,hi,bl,bh); \
c0 = (c0+lo)&BN_MASK2; if (c0<lo) hi++; \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define mul_add_c2(a,b,c0,c1,c2) \
t1=LBITS(a); t2=HBITS(a); \
bl=LBITS(b); bh=HBITS(b); \
mul64(t1,t2,bl,bh); \
if (t2 & BN_TBIT) c2++; \
t2=(t2+t2)&BN_MASK2; \
if (t1 & BN_TBIT) t2++; \
t1=(t1+t1)&BN_MASK2; \
c0=(c0+t1)&BN_MASK2; \
if ((c0 < t1) && (((++t2)&BN_MASK2) == 0)) c2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
# define mul_add_c2(a,b,c0,c1,c2) do { \
BN_ULONG tt; \
BN_ULONG lo = LBITS(a), hi = HBITS(a); \
BN_ULONG bl = LBITS(b), bh = HBITS(b); \
mul64(lo,hi,bl,bh); \
tt = hi; \
c0 = (c0+lo)&BN_MASK2; if (c0<lo) tt++; \
c1 = (c1+tt)&BN_MASK2; if (c1<tt) c2++; \
c0 = (c0+lo)&BN_MASK2; if (c0<lo) hi++; \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define sqr_add_c(a,i,c0,c1,c2) \
sqr64(t1,t2,(a)[i]); \
c0=(c0+t1)&BN_MASK2; if ((c0) < t1) t2++; \
c1=(c1+t2)&BN_MASK2; if ((c1) < t2) c2++;
# define sqr_add_c(a,i,c0,c1,c2) do { \
BN_ULONG lo, hi; \
sqr64(lo,hi,(a)[i]); \
c0 = (c0+lo)&BN_MASK2; if (c0<lo) hi++; \
c1 = (c1+hi)&BN_MASK2; if (c1<hi) c2++; \
} while(0)
# define sqr_add_c2(a,i,j,c0,c1,c2) \
mul_add_c2((a)[i],(a)[j],c0,c1,c2)
@ -611,12 +634,6 @@ BN_ULONG bn_sub_words(BN_ULONG *r, const BN_ULONG *a, const BN_ULONG *b,
void bn_mul_comba8(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
{
# ifdef BN_LLONG
BN_ULLONG t;
# else
BN_ULONG bl, bh;
# endif
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -720,12 +737,6 @@ void bn_mul_comba8(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
void bn_mul_comba4(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
{
# ifdef BN_LLONG
BN_ULLONG t;
# else
BN_ULONG bl, bh;
# endif
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -765,12 +776,6 @@ void bn_mul_comba4(BN_ULONG *r, BN_ULONG *a, BN_ULONG *b)
void bn_sqr_comba8(BN_ULONG *r, const BN_ULONG *a)
{
# ifdef BN_LLONG
BN_ULLONG t, tt;
# else
BN_ULONG bl, bh;
# endif
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;
@ -846,12 +851,6 @@ void bn_sqr_comba8(BN_ULONG *r, const BN_ULONG *a)
void bn_sqr_comba4(BN_ULONG *r, const BN_ULONG *a)
{
# ifdef BN_LLONG
BN_ULLONG t, tt;
# else
BN_ULONG bl, bh;
# endif
BN_ULONG t1, t2;
BN_ULONG c1, c2, c3;
c1 = 0;

View File

@ -122,6 +122,17 @@
# ifndef alloca
# define alloca(s) __builtin_alloca((s))
# endif
#elif defined(__sun)
# include <alloca.h>
#endif
#include "rsaz_exp.h"
#undef SPARC_T4_MONT
#if defined(OPENSSL_BN_ASM_MONT) && (defined(__sparc__) || defined(__sparc))
# include "sparc_arch.h"
extern unsigned int OPENSSL_sparcv9cap_P[];
# define SPARC_T4_MONT
#endif
/* maximum precomputation table size for *variable* sliding windows */
@ -464,6 +475,23 @@ int BN_mod_exp_mont(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
wstart = bits - 1; /* The top bit of the window */
wend = 0; /* The bottom bit of the window */
#if 1 /* by Shay Gueron's suggestion */
j = m->top; /* borrow j */
if (m->d[j - 1] & (((BN_ULONG)1) << (BN_BITS2 - 1))) {
if (bn_wexpand(r, j) == NULL)
goto err;
/* 2^(top*BN_BITS2) - m */
r->d[0] = (0 - m->d[0]) & BN_MASK2;
for (i = 1; i < j; i++)
r->d[i] = (~m->d[i]) & BN_MASK2;
r->top = j;
/*
* Upper words will be zero if the corresponding words of 'm' were
* 0xfff[...], so decrement r->top accordingly.
*/
bn_correct_top(r);
} else
#endif
if (!BN_to_montgomery(r, BN_value_one(), mont, ctx))
goto err;
for (;;) {
@ -515,6 +543,17 @@ int BN_mod_exp_mont(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
if (wstart < 0)
break;
}
#if defined(SPARC_T4_MONT)
if (OPENSSL_sparcv9cap_P[0] & (SPARCV9_VIS3 | SPARCV9_PREFER_FPU)) {
j = mont->N.top; /* borrow j */
val[0]->d[0] = 1; /* borrow val[0] */
for (i = 1; i < j; i++)
val[0]->d[i] = 0;
val[0]->top = j;
if (!BN_mod_mul_montgomery(rr, r, val[0], mont, ctx))
goto err;
} else
#endif
if (!BN_from_montgomery(rr, r, mont, ctx))
goto err;
ret = 1;
@ -526,6 +565,27 @@ int BN_mod_exp_mont(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
return (ret);
}
#if defined(SPARC_T4_MONT)
static BN_ULONG bn_get_bits(const BIGNUM *a, int bitpos)
{
BN_ULONG ret = 0;
int wordpos;
wordpos = bitpos / BN_BITS2;
bitpos %= BN_BITS2;
if (wordpos >= 0 && wordpos < a->top) {
ret = a->d[wordpos] & BN_MASK2;
if (bitpos) {
ret >>= bitpos;
if (++wordpos < a->top)
ret |= a->d[wordpos] << (BN_BITS2 - bitpos);
}
}
return ret & BN_MASK2;
}
#endif
/*
* BN_mod_exp_mont_consttime() stores the precomputed powers in a specific
* layout so that accessing any of these table values shows the same access
@ -594,6 +654,9 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
int powerbufLen = 0;
unsigned char *powerbuf = NULL;
BIGNUM tmp, am;
#if defined(SPARC_T4_MONT)
unsigned int t4 = 0;
#endif
bn_check_top(a);
bn_check_top(p);
@ -626,21 +689,62 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
goto err;
}
#ifdef RSAZ_ENABLED
/*
* If the size of the operands allow it, perform the optimized
* RSAZ exponentiation. For further information see
* crypto/bn/rsaz_exp.c and accompanying assembly modules.
*/
if ((16 == a->top) && (16 == p->top) && (BN_num_bits(m) == 1024)
&& rsaz_avx2_eligible()) {
if (NULL == bn_wexpand(rr, 16))
goto err;
RSAZ_1024_mod_exp_avx2(rr->d, a->d, p->d, m->d, mont->RR.d,
mont->n0[0]);
rr->top = 16;
rr->neg = 0;
bn_correct_top(rr);
ret = 1;
goto err;
} else if ((8 == a->top) && (8 == p->top) && (BN_num_bits(m) == 512)) {
if (NULL == bn_wexpand(rr, 8))
goto err;
RSAZ_512_mod_exp(rr->d, a->d, p->d, m->d, mont->n0[0], mont->RR.d);
rr->top = 8;
rr->neg = 0;
bn_correct_top(rr);
ret = 1;
goto err;
}
#endif
/* Get the window size to use with size of p. */
window = BN_window_bits_for_ctime_exponent_size(bits);
#if defined(OPENSSL_BN_ASM_MONT5)
if (window == 6 && bits <= 1024)
window = 5; /* ~5% improvement of 2048-bit RSA sign */
#if defined(SPARC_T4_MONT)
if (window >= 5 && (top & 15) == 0 && top <= 64 &&
(OPENSSL_sparcv9cap_P[1] & (CFR_MONTMUL | CFR_MONTSQR)) ==
(CFR_MONTMUL | CFR_MONTSQR) && (t4 = OPENSSL_sparcv9cap_P[0]))
window = 5;
else
#endif
#if defined(OPENSSL_BN_ASM_MONT5)
if (window >= 5) {
window = 5; /* ~5% improvement for RSA2048 sign, and even
* for RSA4096 */
if ((top & 7) == 0)
powerbufLen += 2 * top * sizeof(m->d[0]);
}
#endif
(void)0;
/*
* Allocate a buffer large enough to hold all of the pre-computed powers
* of am, am itself and tmp.
*/
numPowers = 1 << window;
powerbufLen = sizeof(m->d[0]) * (top * numPowers +
((2 * top) >
numPowers ? (2 * top) : numPowers));
powerbufLen += sizeof(m->d[0]) * (top * numPowers +
((2 * top) >
numPowers ? (2 * top) : numPowers));
#ifdef alloca
if (powerbufLen < 3072)
powerbufFree =
@ -670,15 +774,17 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
tmp.flags = am.flags = BN_FLG_STATIC_DATA;
/* prepare a^0 in Montgomery domain */
#if 1
#if 1 /* by Shay Gueron's suggestion */
if (m->d[top - 1] & (((BN_ULONG)1) << (BN_BITS2 - 1))) {
/* 2^(top*BN_BITS2) - m */
tmp.d[0] = (0 - m->d[0]) & BN_MASK2;
for (i = 1; i < top; i++)
tmp.d[i] = (~m->d[i]) & BN_MASK2;
tmp.top = top;
} else
#endif
if (!BN_to_montgomery(&tmp, BN_value_one(), mont, ctx))
goto err;
#else
tmp.d[0] = (0 - m->d[0]) & BN_MASK2; /* 2^(top*BN_BITS2) - m */
for (i = 1; i < top; i++)
tmp.d[i] = (~m->d[i]) & BN_MASK2;
tmp.top = top;
#endif
/* prepare a^1 in Montgomery domain */
if (a->neg || BN_ucmp(a, m) >= 0) {
@ -689,6 +795,138 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
} else if (!BN_to_montgomery(&am, a, mont, ctx))
goto err;
#if defined(SPARC_T4_MONT)
if (t4) {
typedef int (*bn_pwr5_mont_f) (BN_ULONG *tp, const BN_ULONG *np,
const BN_ULONG *n0, const void *table,
int power, int bits);
int bn_pwr5_mont_t4_8(BN_ULONG *tp, const BN_ULONG *np,
const BN_ULONG *n0, const void *table,
int power, int bits);
int bn_pwr5_mont_t4_16(BN_ULONG *tp, const BN_ULONG *np,
const BN_ULONG *n0, const void *table,
int power, int bits);
int bn_pwr5_mont_t4_24(BN_ULONG *tp, const BN_ULONG *np,
const BN_ULONG *n0, const void *table,
int power, int bits);
int bn_pwr5_mont_t4_32(BN_ULONG *tp, const BN_ULONG *np,
const BN_ULONG *n0, const void *table,
int power, int bits);
static const bn_pwr5_mont_f pwr5_funcs[4] = {
bn_pwr5_mont_t4_8, bn_pwr5_mont_t4_16,
bn_pwr5_mont_t4_24, bn_pwr5_mont_t4_32
};
bn_pwr5_mont_f pwr5_worker = pwr5_funcs[top / 16 - 1];
typedef int (*bn_mul_mont_f) (BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0);
int bn_mul_mont_t4_8(BN_ULONG *rp, const BN_ULONG *ap, const void *bp,
const BN_ULONG *np, const BN_ULONG *n0);
int bn_mul_mont_t4_16(BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0);
int bn_mul_mont_t4_24(BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0);
int bn_mul_mont_t4_32(BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0);
static const bn_mul_mont_f mul_funcs[4] = {
bn_mul_mont_t4_8, bn_mul_mont_t4_16,
bn_mul_mont_t4_24, bn_mul_mont_t4_32
};
bn_mul_mont_f mul_worker = mul_funcs[top / 16 - 1];
void bn_mul_mont_vis3(BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0, int num);
void bn_mul_mont_t4(BN_ULONG *rp, const BN_ULONG *ap,
const void *bp, const BN_ULONG *np,
const BN_ULONG *n0, int num);
void bn_mul_mont_gather5_t4(BN_ULONG *rp, const BN_ULONG *ap,
const void *table, const BN_ULONG *np,
const BN_ULONG *n0, int num, int power);
void bn_flip_n_scatter5_t4(const BN_ULONG *inp, size_t num,
void *table, size_t power);
void bn_gather5_t4(BN_ULONG *out, size_t num,
void *table, size_t power);
void bn_flip_t4(BN_ULONG *dst, BN_ULONG *src, size_t num);
BN_ULONG *np = mont->N.d, *n0 = mont->n0;
int stride = 5 * (6 - (top / 16 - 1)); /* multiple of 5, but less
* than 32 */
/*
* BN_to_montgomery can contaminate words above .top [in
* BN_DEBUG[_DEBUG] build]...
*/
for (i = am.top; i < top; i++)
am.d[i] = 0;
for (i = tmp.top; i < top; i++)
tmp.d[i] = 0;
bn_flip_n_scatter5_t4(tmp.d, top, powerbuf, 0);
bn_flip_n_scatter5_t4(am.d, top, powerbuf, 1);
if (!(*mul_worker) (tmp.d, am.d, am.d, np, n0) &&
!(*mul_worker) (tmp.d, am.d, am.d, np, n0))
bn_mul_mont_vis3(tmp.d, am.d, am.d, np, n0, top);
bn_flip_n_scatter5_t4(tmp.d, top, powerbuf, 2);
for (i = 3; i < 32; i++) {
/* Calculate a^i = a^(i-1) * a */
if (!(*mul_worker) (tmp.d, tmp.d, am.d, np, n0) &&
!(*mul_worker) (tmp.d, tmp.d, am.d, np, n0))
bn_mul_mont_vis3(tmp.d, tmp.d, am.d, np, n0, top);
bn_flip_n_scatter5_t4(tmp.d, top, powerbuf, i);
}
/* switch to 64-bit domain */
np = alloca(top * sizeof(BN_ULONG));
top /= 2;
bn_flip_t4(np, mont->N.d, top);
bits--;
for (wvalue = 0, i = bits % 5; i >= 0; i--, bits--)
wvalue = (wvalue << 1) + BN_is_bit_set(p, bits);
bn_gather5_t4(tmp.d, top, powerbuf, wvalue);
/*
* Scan the exponent one window at a time starting from the most
* significant bits.
*/
while (bits >= 0) {
if (bits < stride)
stride = bits + 1;
bits -= stride;
wvalue = bn_get_bits(p, bits + 1);
if ((*pwr5_worker) (tmp.d, np, n0, powerbuf, wvalue, stride))
continue;
/* retry once and fall back */
if ((*pwr5_worker) (tmp.d, np, n0, powerbuf, wvalue, stride))
continue;
bits += stride - 5;
wvalue >>= stride - 5;
wvalue &= 31;
bn_mul_mont_t4(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_t4(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_t4(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_t4(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_t4(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_gather5_t4(tmp.d, tmp.d, powerbuf, np, n0, top,
wvalue);
}
bn_flip_t4(tmp.d, tmp.d, top);
top *= 2;
/* back to 32-bit domain */
tmp.top = top;
bn_correct_top(&tmp);
OPENSSL_cleanse(np, top * sizeof(BN_ULONG));
} else
#endif
#if defined(OPENSSL_BN_ASM_MONT5)
if (window == 5 && top > 1) {
/*
@ -707,8 +945,15 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
void bn_scatter5(const BN_ULONG *inp, size_t num,
void *table, size_t power);
void bn_gather5(BN_ULONG *out, size_t num, void *table, size_t power);
void bn_power5(BN_ULONG *rp, const BN_ULONG *ap,
const void *table, const BN_ULONG *np,
const BN_ULONG *n0, int num, int power);
int bn_get_bits5(const BN_ULONG *ap, int off);
int bn_from_montgomery(BN_ULONG *rp, const BN_ULONG *ap,
const BN_ULONG *not_used, const BN_ULONG *np,
const BN_ULONG *n0, int num);
BN_ULONG *np = mont->N.d, *n0 = mont->n0;
BN_ULONG *np = mont->N.d, *n0 = mont->n0, *np2;
/*
* BN_to_montgomery can contaminate words above .top [in
@ -719,6 +964,12 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
for (i = tmp.top; i < top; i++)
tmp.d[i] = 0;
if (top & 7)
np2 = np;
else
for (np2 = am.d + top, i = 0; i < top; i++)
np2[2 * i] = np[i];
bn_scatter5(tmp.d, top, powerbuf, 0);
bn_scatter5(am.d, am.top, powerbuf, 1);
bn_mul_mont(tmp.d, am.d, am.d, np, n0, top);
@ -727,7 +978,7 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
# if 0
for (i = 3; i < 32; i++) {
/* Calculate a^i = a^(i-1) * a */
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np, n0, top, i - 1);
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np2, n0, top, i - 1);
bn_scatter5(tmp.d, top, powerbuf, i);
}
# else
@ -738,7 +989,7 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
}
for (i = 3; i < 8; i += 2) {
int j;
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np, n0, top, i - 1);
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np2, n0, top, i - 1);
bn_scatter5(tmp.d, top, powerbuf, i);
for (j = 2 * i; j < 32; j *= 2) {
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
@ -746,13 +997,13 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
}
}
for (; i < 16; i += 2) {
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np, n0, top, i - 1);
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np2, n0, top, i - 1);
bn_scatter5(tmp.d, top, powerbuf, i);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_scatter5(tmp.d, top, powerbuf, 2 * i);
}
for (; i < 32; i += 2) {
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np, n0, top, i - 1);
bn_mul_mont_gather5(tmp.d, am.d, powerbuf, np2, n0, top, i - 1);
bn_scatter5(tmp.d, top, powerbuf, i);
}
# endif
@ -765,20 +1016,34 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
* Scan the exponent one window at a time starting from the most
* significant bits.
*/
while (bits >= 0) {
for (wvalue = 0, i = 0; i < 5; i++, bits--)
wvalue = (wvalue << 1) + BN_is_bit_set(p, bits);
if (top & 7)
while (bits >= 0) {
for (wvalue = 0, i = 0; i < 5; i++, bits--)
wvalue = (wvalue << 1) + BN_is_bit_set(p, bits);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_gather5(tmp.d, tmp.d, powerbuf, np, n0, top, wvalue);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont(tmp.d, tmp.d, tmp.d, np, n0, top);
bn_mul_mont_gather5(tmp.d, tmp.d, powerbuf, np, n0, top,
wvalue);
} else {
while (bits >= 0) {
wvalue = bn_get_bits5(p->d, bits - 4);
bits -= 5;
bn_power5(tmp.d, tmp.d, powerbuf, np2, n0, top, wvalue);
}
}
ret = bn_from_montgomery(tmp.d, tmp.d, NULL, np2, n0, top);
tmp.top = top;
bn_correct_top(&tmp);
if (ret) {
if (!BN_copy(rr, &tmp))
ret = 0;
goto err; /* non-zero ret means it's not error */
}
} else
#endif
{
@ -844,6 +1109,15 @@ int BN_mod_exp_mont_consttime(BIGNUM *rr, const BIGNUM *a, const BIGNUM *p,
}
/* Convert the final result from montgomery to standard format */
#if defined(SPARC_T4_MONT)
if (OPENSSL_sparcv9cap_P[0] & (SPARCV9_VIS3 | SPARCV9_PREFER_FPU)) {
am.d[0] = 1; /* borrow am */
for (i = 1; i < top; i++)
am.d[i] = 0;
if (!BN_mod_mul_montgomery(rr, &tmp, &am, mont, ctx))
goto err;
} else
#endif
if (!BN_from_montgomery(rr, &tmp, mont, ctx))
goto err;
ret = 1;

View File

@ -450,8 +450,7 @@ int BN_GF2m_mod_arr(BIGNUM *r, const BIGNUM *a, const int p[])
d0 = p[k] % BN_BITS2;
d1 = BN_BITS2 - d0;
z[n] ^= (zz << d0);
tmp_ulong = zz >> d1;
if (d0 && tmp_ulong)
if (d0 && (tmp_ulong = zz >> d1))
z[n + 1] ^= tmp_ulong;
}

View File

@ -204,6 +204,24 @@ extern "C" {
# define BN_MUL_LOW_RECURSIVE_SIZE_NORMAL (32)/* 32 */
# define BN_MONT_CTX_SET_SIZE_WORD (64)/* 32 */
/*
* 2011-02-22 SMS. In various places, a size_t variable or a type cast to
* size_t was used to perform integer-only operations on pointers. This
* failed on VMS with 64-bit pointers (CC /POINTER_SIZE = 64) because size_t
* is still only 32 bits. What's needed in these cases is an integer type
* with the same size as a pointer, which size_t is not certain to be. The
* only fix here is VMS-specific.
*/
# if defined(OPENSSL_SYS_VMS)
# if __INITIAL_POINTER_SIZE == 64
# define PTR_SIZE_INT long long
# else /* __INITIAL_POINTER_SIZE == 64 */
# define PTR_SIZE_INT int
# endif /* __INITIAL_POINTER_SIZE == 64 [else] */
# elif !defined(PTR_SIZE_INT) /* defined(OPENSSL_SYS_VMS) */
# define PTR_SIZE_INT size_t
# endif /* defined(OPENSSL_SYS_VMS) [else] */
# if !defined(OPENSSL_NO_ASM) && !defined(OPENSSL_NO_INLINE_ASM) && !defined(PEDANTIC)
/*
* BN_UMULT_HIGH section.
@ -295,6 +313,15 @@ unsigned __int64 _umul128(unsigned __int64 a, unsigned __int64 b,
: "r"(a), "r"(b));
# endif
# endif
# elif defined(__aarch64__) && defined(SIXTY_FOUR_BIT_LONG)
# if defined(__GNUC__) && __GNUC__>=2
# define BN_UMULT_HIGH(a,b) ({ \
register BN_ULONG ret; \
asm ("umulh %0,%1,%2" \
: "=r"(ret) \
: "r"(a), "r"(b)); \
ret; })
# endif
# endif /* cpu */
# endif /* OPENSSL_NO_ASM */

View File

@ -1042,7 +1042,6 @@ int test_mod_exp_mont_consttime(BIO *bp, BN_CTX *ctx)
int test_mod_exp_mont5(BIO *bp, BN_CTX *ctx)
{
BIGNUM *a, *p, *m, *d, *e;
BN_MONT_CTX *mont;
a = BN_new();
@ -1050,7 +1049,6 @@ int test_mod_exp_mont5(BIO *bp, BN_CTX *ctx)
m = BN_new();
d = BN_new();
e = BN_new();
mont = BN_MONT_CTX_new();
BN_bntest_rand(m, 1024, 0, 1); /* must be odd for montgomery */
@ -1099,6 +1097,7 @@ int test_mod_exp_mont5(BIO *bp, BN_CTX *ctx)
fprintf(stderr, "Modular exponentiation test failed!\n");
return 0;
}
BN_MONT_CTX_free(mont);
BN_free(a);
BN_free(p);
BN_free(m);

Some files were not shown because too many files have changed in this diff Show More