100 Commits

Author SHA1 Message Date
Konstantin Belousov
04c49e68de Use the fpu_kern_enter() interface to properly separate usermode FPU
context from in-kernel execution of padlock instructions and to handle
spurious FPUDNA exceptions that sometime are raised when doing padlock
calculations.

Globally mark crypto(9) kthread as using FPU.

Reviewed by:	pjd
Hardware provided by:	Sentex Communications
Tested by:	  pho
PR:    amd64/135014
MFC after:    1 month
2010-06-05 16:00:53 +00:00
Bjoern A. Zeeb
77680d964f Add comments trying to explain what bad things happen here, i.e.
how hashed MD5/SHA are implemented, abusing Final() for padding and
sw_octx to transport the key from the beginning to the end.

Enlightened about what was going on here by: cperciva
Reviewed by:	cperciva
MFC After:	3 days
X-MFC with:	r187826
PR:		kern/126468
2010-01-09 15:43:47 +00:00
Bjoern A. Zeeb
df4dece102 In case the compression result is the same size as the orignal version,
the compression was useless as well.  Make sure to not update the data
and return, else we would waste resources when decompressing.

This also avoids the copyback() changing data other consumers like
xform_ipcomp.c would have ignored because of no win and sent out without
noting that compression was used, resulting in invalid packets at the
receiver.

MFC after:	5 days
2009-11-29 17:53:57 +00:00
Bjoern A. Zeeb
d8ce467304 Add SDT iter probes forgotten in r199885.
MFC after:	5 days
2009-11-29 17:46:40 +00:00
Bjoern A. Zeeb
6f443bec58 Change memory managment from a fixed size array to a list.
This is needed to avoid running into out of buffer situations
where we cannot alloc a new buffer because we hit the array size
limit (ZBUF).
Use a combined allocation for the struct and the actual data buffer
to not increase the number of malloc calls. [1]

Defer initialization of zbuf until we actually need it.

Make sure the output buffer will be large enough in all cases.

Details discussed with:	kib [1]
Reviewed by:		kib [1]
MFC after:		6 days
2009-11-28 21:08:19 +00:00
Bjoern A. Zeeb
9699afb61b Z_PARTIAL_FLUSH is marked deprecated. Z_SYNC_FLUSH is the suggested
replacement but only use it for inflate. For deflate use Z_FINISH
as Z_SYNC_FLUSH adds a trailing marker in some cases that inflate(),
despite the comment in zlib, does npt seem to cope well with, resulting
in errors when uncompressing exactly fills the outbut buffer without
a Z_STREAM_END and a successive call returns an error.

MFC after:	6 days
2009-11-28 17:44:57 +00:00
Bjoern A. Zeeb
d9c18e5627 Add SDT probes for opencrypto:deflate:deflate_gobal:*.
They are not nice but they were helpful.

MFC after:	6 days
2009-11-28 17:20:41 +00:00
Bjoern A. Zeeb
df21ad6e41 Define an SDT provider for "opencrypto".
MFC after:	6 days
2009-11-28 16:54:18 +00:00
Pawel Jakub Dawidek
f49861e199 If crypto operation is finished with EAGAIN, don't repeat operation from
the return context, but from the original context.
Before repeating operation clear DONE flag and error.

Reviewed by:	sam
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)
2009-09-04 09:48:18 +00:00
Rafal Jaworowski
ae184a6a7d Fix cryptodev UIO creation.
Cryptodev uses UIO structure do get data from userspace and pass it to
cryptographic engines. Initially UIO size is equal to size of data passed to
engine, but if UIO is prepared for hash calculation an additional small space
is created to hold result of operation.

While creating space for the result, UIO I/O vector size is correctly
extended, but uio_resid field in UIO structure is not modified.

As bus_dma code uses uio_resid field to determine size of UIO DMA mapping,
resulting mapping hasn't correct size. This leads to a crash if all the
following conditions are met:

     1. Hardware cryptographic accelerator writes result of hash operation
        using DMA.
     2. Size of input data is less or equal than (n * PAGE_SIZE),
     3. Size of input data plus size of hash result is grather than
        (n * PAGE_SIZE, where n is the same as in point 2.

This patch fixes this problem by adding size of the extenstion to uio_resid
field in UIO structure.

Submitted by:	Piotr Ziecik kosmo ! semihalf dot com
Reviewed by:	philip
Obtained from:	Semihalf
2009-05-23 13:23:46 +00:00
Warner Losh
3f147ab251 Fix return type for detach routine (should be int)
Fix first parameter for identify routine (should be driver_t *)
2009-02-05 17:43:12 +00:00
Bjoern A. Zeeb
1f4990a6b5 While OpenBSD's crypto/ framework has sha1 and md5 implementations that
can cope with a result buffer of NULL in the "Final" function, we cannot.
Thus pass in a temporary buffer long enough for either md5 or sha1 results
so that we do not panic.

PR:		bin/126468
MFC after:	1 week
2009-01-28 15:31:16 +00:00
Doug Rabson
bfd50e2732 Don't hang if encrypting/decrypting using struct iovecs where one of the
iovecs ends on a crypto block boundary.
2008-10-30 16:11:07 +00:00
Dag-Erling Smørgrav
e11e3f187d Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.
2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
John Baldwin
e46502943a Make ftruncate a 'struct file' operation rather than a vnode operation.
This makes it possible to support ftruncate() on non-vnode file types in
the future.
- 'struct fileops' grows a 'fo_truncate' method to handle an ftruncate() on
  a given file descriptor.
- ftruncate() moves to kern/sys_generic.c and now just fetches a file
  object and invokes fo_truncate().
- The vnode-specific portions of ftruncate() move to vn_truncate() in
  vfs_vnops.c which implements fo_truncate() for vnode file types.
- Non-vnode file types return EINVAL in their fo_truncate() method.

Submitted by:	rwatson
2008-01-07 20:05:19 +00:00
Jeff Roberson
397c19d175 Remove explicit locking of struct file.
- Introduce a finit() which is used to initailize the fields of struct file
   in such a way that the ops vector is only valid after the data, type,
   and flags are valid.
 - Protect f_flag and f_count with atomic operations.
 - Remove the global list of all files and associated accounting.
 - Rewrite the unp garbage collection such that it no longer requires
   the global list of all files and instead uses a list of all unp sockets.
 - Mark sockets in the accept queue so we don't incorrectly gc them.

Tested by:	kris, pho
2007-12-30 01:42:15 +00:00
Julian Elischer
3745c395ec Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
Konstantin Belousov
1649bbbb94 Deny attempt to malloc unbounded amount of the memory.
Convert malloc()/bzero() to malloc(M_ZERO).

Obtained from:  OpenBSD
MFC after:      3 days
Approved by:    re (kensmith)
2007-10-08 20:08:34 +00:00
Peter Wemm
0278f1c0a3 Quiet warnings. These do not appear to be actually used uninitialized,
but gcc's optimizer isn't smart enough to see that.  Pre-initializing
seems harmless enough.

Approved by:  re (rwatson)
2007-07-05 06:59:14 +00:00
George V. Neville-Neil
559d3390d0 Integrate the Camellia Block Cipher. For more information see RFC 4132
and its bibliography.

Submitted by:   Tomoyuki Okazaki <okazaki at kick dot gr dot jp>
MFC after:      1 month
2007-05-09 19:37:02 +00:00
Robert Watson
5e3f7694b1 Replace custom file descriptor array sleep lock constructed using a mutex
and flags with an sxlock.  This leads to a significant and measurable
performance improvement as a result of access to shared locking for
frequent lookup operations, reduced general overhead, and reduced overhead
in the event of contention.  All of these are imported for threaded
applications where simultaneous access to a shared file descriptor array
occurs frequently.  Kris has reported 2x-4x transaction rate improvements
on 8-core MySQL benchmarks; smaller improvements can be expected for many
workloads as a result of reduced overhead.

- Generally eliminate the distinction between "fast" and regular
  acquisisition of the filedesc lock; the plan is that they will now all
  be fast.  Change all locking instances to either shared or exclusive
  locks.

- Correct a bug (pointed out by kib) in fdfree() where previously msleep()
  was called without the mutex held; sx_sleep() is now always called with
  the sxlock held exclusively.

- Universally hold the struct file lock over changes to struct file,
  rather than the filedesc lock or no lock.  Always update the f_ops
  field last. A further memory barrier is required here in the future
  (discussed with jhb).

- Improve locking and reference management in linux_at(), which fails to
  properly acquire vnode references before using vnode pointers.  Annotate
  improper use of vn_fullpath(), which will be replaced at a future date.

In fcntl(), we conservatively acquire an exclusive lock, even though in
some cases a shared lock may be sufficient, which should be revisited.
The dropping of the filedesc lock in fdgrowtable() is no longer required
as the sxlock can be held over the sleep operation; we should consider
removing that (pointed out by attilio).

Tested by:	kris
Discussed with:	jhb, kris, attilio, jeff
2007-04-04 09:11:34 +00:00
Sam Leffler
faf5485263 add missing file from last commit that overhauls crypto/driver api's 2007-03-21 03:43:33 +00:00
Sam Leffler
6810ad6f2a Overhaul driver/subsystem api's:
o make all crypto drivers have a device_t; pseudo drivers like the s/w
  crypto driver synthesize one
o change the api between the crypto subsystem and drivers to use kobj;
  cryptodev_if.m defines this api
o use the fact that all crypto drivers now have a device_t to add support
  for specifying which of several potential devices to use when doing
  crypto operations
o add new ioctls that allow user apps to select a specific crypto device
  to use (previous ioctls maintained for compatibility)
o overhaul crypto subsystem code to eliminate lots of cruft and hide
  implementation details from drivers
o bring in numerous fixes from Michale Richardson/hifn; mostly for
  795x parts
o add an optional mechanism for mmap'ing the hifn 795x public key h/w
  to user space for use by openssl (not enabled by default)
o update crypto test tools to use new ioctl's and add cmd line options
  to specify a device to use for tests

These changes will also enable much future work on improving the core
crypto subsystem; including proper load balancing and interposing code
between the core and drivers to dispatch small operations to the s/w
driver as appropriate.

These changes were instigated by the work of Michael Richardson.

Reviewed by:	pjd
Approved by:	re
2007-03-21 03:42:51 +00:00
Pawel Jakub Dawidek
0d5c337bef When DIAGNOSTIC is defined, verify if we don't free crypto requests from
the crypto queue or from the return queue.
2006-06-06 15:04:52 +00:00
Pawel Jakub Dawidek
f34a967b01 Use newly added functions to simplify the code. 2006-06-04 22:17:25 +00:00
Pawel Jakub Dawidek
11d2e1e8ff - Replace COPYDATA() and COPYBACK() macros with crypto_copydata() and
crypto_copyback() functions.
- Add crypto_apply() function.

This will allow for more code simplification.
2006-06-04 22:15:13 +00:00
Pawel Jakub Dawidek
694e011306 Prefer hardware crypto over software crypto.
Before the change if a hardware crypto driver was loaded after
the software crypto driver, calling crypto_newsession() with
hard=0, will always choose software crypto.
2006-06-04 22:12:08 +00:00
Pawel Jakub Dawidek
f8e422e5f8 Use newly added defines instead of magic values. 2006-06-04 15:11:59 +00:00
Pawel Jakub Dawidek
d905998c95 Move COPYDATA() and COPYBACK() macros to cryptodev.h, they will be used
in padlock(4) as well.
2006-06-04 15:10:12 +00:00
Pawel Jakub Dawidek
082a4bab02 - Remove HMAC_BLOCK_LEN, it serves no purpose.
- Use defines of used algorithm instead of HMAC_BLOCK_LEN.
2006-06-04 14:49:34 +00:00
Pawel Jakub Dawidek
eec31f224d - Use define of an algorithm with the biggest block length to describe
EALG_MAX_BLOCK_LEN instead of hardcoded value.
- Kill an unused define.
2006-06-04 14:36:42 +00:00
Pawel Jakub Dawidek
bc58b0ec67 Rename HMAC_BLOCK_MAXLEN to HMAC_MAX_BLOCK_LEN to be consistent with
EALG_MAX_BLOCK_LEN.
2006-06-04 14:29:42 +00:00
Pawel Jakub Dawidek
0bbc4bf97d Rename AALG_MAX_RESULT_LEN to HASH_MAX_LEN to look more constent with
other defines.
2006-06-04 14:25:16 +00:00
Pawel Jakub Dawidek
9ea7e4210f - Add defines with hash length for each hash algorithm.
- Add defines with block length for each HMAC algorithm.
- Add AES_BLOCK_LEN define which is an alias for RIJNDAEL128_BLOCK_LEN.
- Add NULL_BLOCK_LEN define.
2006-06-04 14:20:47 +00:00
Pawel Jakub Dawidek
38d2f8d63c Kill an unused argument. 2006-06-04 12:15:59 +00:00
Pawel Jakub Dawidek
6c68b224b0 Remove (now unused) crp_mac field. 2006-05-22 16:27:27 +00:00
Pawel Jakub Dawidek
cd80523efc Fix usage of HMAC algorithms via /dev/crypto. 2006-05-22 16:24:11 +00:00
Pawel Jakub Dawidek
3a865c827a Improve the code responsible for waking up the crypto_proc thread.
Checking if the queues are empty is not enough for the crypto_proc thread
(it is enough for the crypto_ret_thread), because drivers can be marked
as blocked. In a situation where we have operations related to different
crypto drivers in the queue, it is possible that one driver is marked as
blocked. In this case, the queue will not be empty and we won't wakeup
the crypto_proc thread to execute operations for the others drivers.

Simply setting a global variable to 1 when we goes to sleep and setting
it back to 0 when we wake up is sufficient. The variable is protected
with the queue lock.
2006-05-22 10:05:23 +00:00
Pawel Jakub Dawidek
9c12ca29d6 Don't wakeup the crypto_ret_proc thread if it is running already.
Before the change if the thread was working on symmetric operation, we
would send unnecessary wakeup after adding asymmetric operation (when
asym queue was empty) and vice versa.
2006-05-22 09:58:34 +00:00
Pawel Jakub Dawidek
04d8f36a4f Don't set cc_kqblocked twice and don't increment cryptostats.cs_kblocks
twice if we call crypto_kinvoke() from crypto_proc thread.
This change also removes unprotected access to cc_kqblocked field
(CRYPTO_Q_LOCK() should be used for protection).
2006-05-22 09:37:28 +00:00
Pawel Jakub Dawidek
3aaf7145c5 Document how we synchronize access to the fields in the cryptocap
structure.
2006-05-22 07:49:42 +00:00
Pawel Jakub Dawidek
bda0abc627 We must synchronize access to cc_qblocked, because there could be a race
where crypto_invoke() returns ERESTART and before we set cc_qblocked to 1,
crypto_unblock() is called and sets it to 0. This way we mark device as
blocked forever.

Fix it by not setting cc_qblocked in the fast path and by protecting
crypto_invoke() in the crypto_proc thread with CRYPTO_Q_LOCK().
This won't slow things down, because there is no contention - we have
only one crypto thread. Actually it can be slightly faster, because we
save two atomic ops per crypto request.
The fast code path remains lock-less.
2006-05-22 07:48:45 +00:00
Pawel Jakub Dawidek
c3c820369e Silent Coverity Prevent report by asserting that cap != NULL.
Coverity ID:	1414
2006-05-18 06:28:39 +00:00
Pawel Jakub Dawidek
f6c4bc3b91 - Fix a very old bug in HMAC/SHA{384,512}. When HMAC is using SHA384
or SHA512, the blocksize is 128 bytes, not 64 bytes as anywhere else.
  The bug also exists in NetBSD, OpenBSD and various other independed
  implementations I look at.
- We cannot decide which hash function to use for HMAC based on the key
  length, because any HMAC function can use any key length.
  To fix it split CRYPTO_SHA2_HMAC into three algorithm:
  CRYPTO_SHA2_256_HMAC, CRYPTO_SHA2_384_HMAC and CRYPTO_SHA2_512_HMAC.
  Those names are consistent with OpenBSD's naming.
- Remove authsize field from auth_hash structure.
- Allow consumer to define size of hash he wants to receive.
  This allows to use HMAC not only for IPsec, where 96 bits MAC is requested.
  The size of requested MAC is defined at newsession time in the cri_mlen
  field - when 0, entire MAC will be returned.
- Add swcr_authprepare() function which prepares authentication key.
- Allow to provide key for every authentication operation, not only at
  newsession time by honoring CRD_F_KEY_EXPLICIT flag.
- Make giving key at newsession time optional - don't try to operate on it
  if its NULL.
- Extend COPYBACK()/COPYDATA() macros to handle CRYPTO_BUF_CONTIG buffer
  type as well.
- Accept CRYPTO_BUF_IOV buffer type in swcr_authcompute() as we have
  cuio_apply() now.
- 16 bits for key length (SW_klen) is more than enough.

Reviewed by:	sam
2006-05-17 18:24:17 +00:00
Pawel Jakub Dawidek
4acae0ac29 - Make opencrypto more SMP friendly by dropping the queue lock around
crypto_invoke(). This allows to serve multiple crypto requests in
  parallel and not bached requests are served lock-less.
  Drivers should not depend on the queue lock beeing held around
  crypto_invoke() and if they do, that's an error in the driver - it
  should do its own synchronization.
- Don't forget to wakeup the crypto thread when new requests is
  queued and only if both symmetric and asymmetric queues are empty.
- Symmetric requests use sessions and there is no way driver can
  disappear when there is an active session, so we don't need to check
  this, but assert this. This is also safe to not use the driver lock
  in this case.
- Assymetric requests don't use sessions, so don't check the driver
  in crypto_kinvoke().
- Protect assymetric operation with the driver lock, because if there
  is no symmetric session, driver can disappear.
- Don't send assymetric request to the driver if it is marked as
  blocked.
- Add an XXX comment, because I don't think migration to another driver
  is safe when there are pending requests using freed session.
- Remove 'hint' argument from crypto_kinvoke(), as it serves no purpose.
- Don't hold the driver lock around kprocess method call, instead use
  cc_koperations to track number of in-progress requests.
- Cleanup register/unregister code a bit.
- Other small simplifications and cleanups.

Reviewed by:	sam
2006-05-17 18:12:44 +00:00
Pawel Jakub Dawidek
645df8d06e Remove cri_rnd. It is not used.
Reviewed by:	sam
2006-05-17 18:04:51 +00:00
Pawel Jakub Dawidek
613894d047 If kern.cryptodevallowsoft is TRUE allow also for symmetric software crypto
in kernel. Useful for testing.

Reviewed by:	sam
2006-05-17 18:01:51 +00:00
Pawel Jakub Dawidek
b5161eb7b5 Forgot about adding cuio_apply() here.
Reviewed by:	sam
2006-05-17 17:58:05 +00:00
Pawel Jakub Dawidek
8f91d4abe9 - Implement cuio_apply(), an equivalent to m_apply(9).
- Implement CUIO_SKIP() macro which is only responsible for skipping the given
  number of bytes from iovec list. This allows to avoid duplicating the same
  code in three functions.

Reviewed by:	sam
2006-05-17 17:56:00 +00:00