79 Commits

Author SHA1 Message Date
Robert Watson
5e3f7694b1 Replace custom file descriptor array sleep lock constructed using a mutex
and flags with an sxlock.  This leads to a significant and measurable
performance improvement as a result of access to shared locking for
frequent lookup operations, reduced general overhead, and reduced overhead
in the event of contention.  All of these are imported for threaded
applications where simultaneous access to a shared file descriptor array
occurs frequently.  Kris has reported 2x-4x transaction rate improvements
on 8-core MySQL benchmarks; smaller improvements can be expected for many
workloads as a result of reduced overhead.

- Generally eliminate the distinction between "fast" and regular
  acquisisition of the filedesc lock; the plan is that they will now all
  be fast.  Change all locking instances to either shared or exclusive
  locks.

- Correct a bug (pointed out by kib) in fdfree() where previously msleep()
  was called without the mutex held; sx_sleep() is now always called with
  the sxlock held exclusively.

- Universally hold the struct file lock over changes to struct file,
  rather than the filedesc lock or no lock.  Always update the f_ops
  field last. A further memory barrier is required here in the future
  (discussed with jhb).

- Improve locking and reference management in linux_at(), which fails to
  properly acquire vnode references before using vnode pointers.  Annotate
  improper use of vn_fullpath(), which will be replaced at a future date.

In fcntl(), we conservatively acquire an exclusive lock, even though in
some cases a shared lock may be sufficient, which should be revisited.
The dropping of the filedesc lock in fdgrowtable() is no longer required
as the sxlock can be held over the sleep operation; we should consider
removing that (pointed out by attilio).

Tested by:	kris
Discussed with:	jhb, kris, attilio, jeff
2007-04-04 09:11:34 +00:00
Sam Leffler
faf5485263 add missing file from last commit that overhauls crypto/driver api's 2007-03-21 03:43:33 +00:00
Sam Leffler
6810ad6f2a Overhaul driver/subsystem api's:
o make all crypto drivers have a device_t; pseudo drivers like the s/w
  crypto driver synthesize one
o change the api between the crypto subsystem and drivers to use kobj;
  cryptodev_if.m defines this api
o use the fact that all crypto drivers now have a device_t to add support
  for specifying which of several potential devices to use when doing
  crypto operations
o add new ioctls that allow user apps to select a specific crypto device
  to use (previous ioctls maintained for compatibility)
o overhaul crypto subsystem code to eliminate lots of cruft and hide
  implementation details from drivers
o bring in numerous fixes from Michale Richardson/hifn; mostly for
  795x parts
o add an optional mechanism for mmap'ing the hifn 795x public key h/w
  to user space for use by openssl (not enabled by default)
o update crypto test tools to use new ioctl's and add cmd line options
  to specify a device to use for tests

These changes will also enable much future work on improving the core
crypto subsystem; including proper load balancing and interposing code
between the core and drivers to dispatch small operations to the s/w
driver as appropriate.

These changes were instigated by the work of Michael Richardson.

Reviewed by:	pjd
Approved by:	re
2007-03-21 03:42:51 +00:00
Pawel Jakub Dawidek
0d5c337bef When DIAGNOSTIC is defined, verify if we don't free crypto requests from
the crypto queue or from the return queue.
2006-06-06 15:04:52 +00:00
Pawel Jakub Dawidek
f34a967b01 Use newly added functions to simplify the code. 2006-06-04 22:17:25 +00:00
Pawel Jakub Dawidek
11d2e1e8ff - Replace COPYDATA() and COPYBACK() macros with crypto_copydata() and
crypto_copyback() functions.
- Add crypto_apply() function.

This will allow for more code simplification.
2006-06-04 22:15:13 +00:00
Pawel Jakub Dawidek
694e011306 Prefer hardware crypto over software crypto.
Before the change if a hardware crypto driver was loaded after
the software crypto driver, calling crypto_newsession() with
hard=0, will always choose software crypto.
2006-06-04 22:12:08 +00:00
Pawel Jakub Dawidek
f8e422e5f8 Use newly added defines instead of magic values. 2006-06-04 15:11:59 +00:00
Pawel Jakub Dawidek
d905998c95 Move COPYDATA() and COPYBACK() macros to cryptodev.h, they will be used
in padlock(4) as well.
2006-06-04 15:10:12 +00:00
Pawel Jakub Dawidek
082a4bab02 - Remove HMAC_BLOCK_LEN, it serves no purpose.
- Use defines of used algorithm instead of HMAC_BLOCK_LEN.
2006-06-04 14:49:34 +00:00
Pawel Jakub Dawidek
eec31f224d - Use define of an algorithm with the biggest block length to describe
EALG_MAX_BLOCK_LEN instead of hardcoded value.
- Kill an unused define.
2006-06-04 14:36:42 +00:00
Pawel Jakub Dawidek
bc58b0ec67 Rename HMAC_BLOCK_MAXLEN to HMAC_MAX_BLOCK_LEN to be consistent with
EALG_MAX_BLOCK_LEN.
2006-06-04 14:29:42 +00:00
Pawel Jakub Dawidek
0bbc4bf97d Rename AALG_MAX_RESULT_LEN to HASH_MAX_LEN to look more constent with
other defines.
2006-06-04 14:25:16 +00:00
Pawel Jakub Dawidek
9ea7e4210f - Add defines with hash length for each hash algorithm.
- Add defines with block length for each HMAC algorithm.
- Add AES_BLOCK_LEN define which is an alias for RIJNDAEL128_BLOCK_LEN.
- Add NULL_BLOCK_LEN define.
2006-06-04 14:20:47 +00:00
Pawel Jakub Dawidek
38d2f8d63c Kill an unused argument. 2006-06-04 12:15:59 +00:00
Pawel Jakub Dawidek
6c68b224b0 Remove (now unused) crp_mac field. 2006-05-22 16:27:27 +00:00
Pawel Jakub Dawidek
cd80523efc Fix usage of HMAC algorithms via /dev/crypto. 2006-05-22 16:24:11 +00:00
Pawel Jakub Dawidek
3a865c827a Improve the code responsible for waking up the crypto_proc thread.
Checking if the queues are empty is not enough for the crypto_proc thread
(it is enough for the crypto_ret_thread), because drivers can be marked
as blocked. In a situation where we have operations related to different
crypto drivers in the queue, it is possible that one driver is marked as
blocked. In this case, the queue will not be empty and we won't wakeup
the crypto_proc thread to execute operations for the others drivers.

Simply setting a global variable to 1 when we goes to sleep and setting
it back to 0 when we wake up is sufficient. The variable is protected
with the queue lock.
2006-05-22 10:05:23 +00:00
Pawel Jakub Dawidek
9c12ca29d6 Don't wakeup the crypto_ret_proc thread if it is running already.
Before the change if the thread was working on symmetric operation, we
would send unnecessary wakeup after adding asymmetric operation (when
asym queue was empty) and vice versa.
2006-05-22 09:58:34 +00:00
Pawel Jakub Dawidek
04d8f36a4f Don't set cc_kqblocked twice and don't increment cryptostats.cs_kblocks
twice if we call crypto_kinvoke() from crypto_proc thread.
This change also removes unprotected access to cc_kqblocked field
(CRYPTO_Q_LOCK() should be used for protection).
2006-05-22 09:37:28 +00:00
Pawel Jakub Dawidek
3aaf7145c5 Document how we synchronize access to the fields in the cryptocap
structure.
2006-05-22 07:49:42 +00:00
Pawel Jakub Dawidek
bda0abc627 We must synchronize access to cc_qblocked, because there could be a race
where crypto_invoke() returns ERESTART and before we set cc_qblocked to 1,
crypto_unblock() is called and sets it to 0. This way we mark device as
blocked forever.

Fix it by not setting cc_qblocked in the fast path and by protecting
crypto_invoke() in the crypto_proc thread with CRYPTO_Q_LOCK().
This won't slow things down, because there is no contention - we have
only one crypto thread. Actually it can be slightly faster, because we
save two atomic ops per crypto request.
The fast code path remains lock-less.
2006-05-22 07:48:45 +00:00
Pawel Jakub Dawidek
c3c820369e Silent Coverity Prevent report by asserting that cap != NULL.
Coverity ID:	1414
2006-05-18 06:28:39 +00:00
Pawel Jakub Dawidek
f6c4bc3b91 - Fix a very old bug in HMAC/SHA{384,512}. When HMAC is using SHA384
or SHA512, the blocksize is 128 bytes, not 64 bytes as anywhere else.
  The bug also exists in NetBSD, OpenBSD and various other independed
  implementations I look at.
- We cannot decide which hash function to use for HMAC based on the key
  length, because any HMAC function can use any key length.
  To fix it split CRYPTO_SHA2_HMAC into three algorithm:
  CRYPTO_SHA2_256_HMAC, CRYPTO_SHA2_384_HMAC and CRYPTO_SHA2_512_HMAC.
  Those names are consistent with OpenBSD's naming.
- Remove authsize field from auth_hash structure.
- Allow consumer to define size of hash he wants to receive.
  This allows to use HMAC not only for IPsec, where 96 bits MAC is requested.
  The size of requested MAC is defined at newsession time in the cri_mlen
  field - when 0, entire MAC will be returned.
- Add swcr_authprepare() function which prepares authentication key.
- Allow to provide key for every authentication operation, not only at
  newsession time by honoring CRD_F_KEY_EXPLICIT flag.
- Make giving key at newsession time optional - don't try to operate on it
  if its NULL.
- Extend COPYBACK()/COPYDATA() macros to handle CRYPTO_BUF_CONTIG buffer
  type as well.
- Accept CRYPTO_BUF_IOV buffer type in swcr_authcompute() as we have
  cuio_apply() now.
- 16 bits for key length (SW_klen) is more than enough.

Reviewed by:	sam
2006-05-17 18:24:17 +00:00
Pawel Jakub Dawidek
4acae0ac29 - Make opencrypto more SMP friendly by dropping the queue lock around
crypto_invoke(). This allows to serve multiple crypto requests in
  parallel and not bached requests are served lock-less.
  Drivers should not depend on the queue lock beeing held around
  crypto_invoke() and if they do, that's an error in the driver - it
  should do its own synchronization.
- Don't forget to wakeup the crypto thread when new requests is
  queued and only if both symmetric and asymmetric queues are empty.
- Symmetric requests use sessions and there is no way driver can
  disappear when there is an active session, so we don't need to check
  this, but assert this. This is also safe to not use the driver lock
  in this case.
- Assymetric requests don't use sessions, so don't check the driver
  in crypto_kinvoke().
- Protect assymetric operation with the driver lock, because if there
  is no symmetric session, driver can disappear.
- Don't send assymetric request to the driver if it is marked as
  blocked.
- Add an XXX comment, because I don't think migration to another driver
  is safe when there are pending requests using freed session.
- Remove 'hint' argument from crypto_kinvoke(), as it serves no purpose.
- Don't hold the driver lock around kprocess method call, instead use
  cc_koperations to track number of in-progress requests.
- Cleanup register/unregister code a bit.
- Other small simplifications and cleanups.

Reviewed by:	sam
2006-05-17 18:12:44 +00:00
Pawel Jakub Dawidek
645df8d06e Remove cri_rnd. It is not used.
Reviewed by:	sam
2006-05-17 18:04:51 +00:00
Pawel Jakub Dawidek
613894d047 If kern.cryptodevallowsoft is TRUE allow also for symmetric software crypto
in kernel. Useful for testing.

Reviewed by:	sam
2006-05-17 18:01:51 +00:00
Pawel Jakub Dawidek
b5161eb7b5 Forgot about adding cuio_apply() here.
Reviewed by:	sam
2006-05-17 17:58:05 +00:00
Pawel Jakub Dawidek
8f91d4abe9 - Implement cuio_apply(), an equivalent to m_apply(9).
- Implement CUIO_SKIP() macro which is only responsible for skipping the given
  number of bytes from iovec list. This allows to avoid duplicating the same
  code in three functions.

Reviewed by:	sam
2006-05-17 17:56:00 +00:00
Pawel Jakub Dawidek
71af8134f7 Be sure to wakeup the crypto thread when new request was queued.
This should fix a hang when starting cryptokeytest (and more).

MFC after:	1 month
2006-04-11 18:01:04 +00:00
Pawel Jakub Dawidek
48b0f2e10f - Simplify the code by using arc4rand(9) instead of arc4random(9) in a loop.
- Correct a comment.

MFC after:	2 weeks
2006-04-10 18:24:59 +00:00
Pawel Jakub Dawidek
4b465da26f Fix memory leak which occurs when crypto.ko module is unloaded.
Discussed with:	sam
MFC after	3 days
2006-03-28 08:33:30 +00:00
Wojciech A. Koszek
0a0eb0e8db crypto.ko depends on zlib.
Submitted by:	Ben Kelly <bkelly at vadev.org>
Approved by:	rwatson
Point hat to:	me
MFC after:	1 day
2006-03-04 15:50:46 +00:00
Wojciech A. Koszek
51b4ccb464 This patch fixes a problem, which exists if you have IPSEC in your kernel
and want to have crypto support loaded as KLD. By moving zlib to separate
module and adding MODULE_DEPEND directives, it is possible to use such
configuration without complication. Otherwise, since IPSEC is linked with
zlib (just like crypto.ko) you'll get following error:

	interface zlib.1 already present in the KLD 'kernel'!

Approved by:	cognet (mentor)
2006-02-27 16:56:22 +00:00
Pawel Jakub Dawidek
e6d944d7c3 Fix bogus check. It was possible to panic the kernel by giving 0 length.
This is actually a local DoS, as every user can use /dev/crypto if there
is crypto hardware in the system and cryptodev.ko is loaded (or compiled
into the kernel).

Reported by:	Mike Tancsa <mike@sentex.net>
MFC after:	1 day
2005-08-18 11:58:03 +00:00
Pawel Jakub Dawidek
36c51ae068 Check key size for rijndael, as invalid key size can lead to kernel panic.
It checked other algorithms against this bug and it seems they aren't
affected.

Reported by:	Mike Tancsa <mike@sentex.net>
PR:		i386/84860
Reviewed by:	phk, cperciva(x2)
2005-08-16 18:59:00 +00:00
Scott Long
e39e116ca2 malloc.h relies on param.h for a definition of MAXCPU. I guess that there is
other header pollution that makes this work right now, but it falls over when
doing a RELENG_5 -> HEAD upgrade.
2005-05-30 05:01:44 +00:00
Hajimu UMEMOTO
df3c03a773 just use crypto/rijndael, and nuke opencrypto/rindael.[ch].
the two became almost identical since latest KAME merge.

Discussed with:	sam
2005-03-11 17:24:46 +00:00
Hajimu UMEMOTO
a40be31edb - use 1/2 space for rijndael context in ipsec
- rijndael_set_key() always sets up full context
- rijndaelKeySetupDec() gets back original protoype

Reviewed by:	sam
Obtained from:	OpenBSD
2005-03-11 12:45:09 +00:00
Hajimu UMEMOTO
9f65b10b0f refer opencrypto/cast.h directly. 2005-03-11 12:37:07 +00:00
Poul-Henning Kamp
78b7c8d68d Use dynamic major number allocation. 2005-02-27 22:11:02 +00:00
Warner Losh
60727d8b86 /* -> /*- for license, minor formatting changes 2005-01-07 02:29:27 +00:00
Poul-Henning Kamp
a0fbccc9e7 Push Giant down through ioctl.
Don't grab Giant in the upper syscall/wrapper code

NET_LOCK_GIANT in the socket code (sockets/fifos).

mtx_lock(&Giant) in the vnode code.

mtx_lock(&Giant) in the opencrypto code.  (This may actually not be
needed, but better safe than sorry).

Devfs grabs Giant if the driver is marked as needing Giant.
2004-11-17 09:09:55 +00:00
Robert Watson
d7aed12f45 Don't acquire Giant in cryptof_close(), as the code is intended to be
able to run MPsafe (and appears to be MPsafe).

Discussed with (some time ago):	sam
2004-08-10 03:26:17 +00:00
Robert Watson
1c1ce9253f Push acquisition of Giant from fdrop_closed() into fo_close() so that
individual file object implementations can optionally acquire Giant if
they require it:

- soo_close(): depends on debug.mpsafenet
- pipe_close(): Giant not acquired
- kqueue_close(): Giant required
- vn_close(): Giant required
- cryptof_close(): Giant required (conservative)

Notes:

  Giant is still acquired in close() even when closing MPSAFE objects
  due to kqueue requiring Giant in the calling closef() code.
  Microbenchmarks indicate that this removal of Giant cuts 3%-3% off
  of pipe create/destroy pairs from user space with SMP compiled into
  the kernel.

  The cryptodev and opencrypto code appears MPSAFE, but I'm unable to
  test it extensively and so have left Giant over fo_close().  It can
  probably be removed given some testing and review.
2004-07-22 18:35:43 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Poul-Henning Kamp
5dba30f15a add missing #include <sys/module.h> 2004-05-30 20:27:19 +00:00
John Baldwin
6074439965 kthread_exit() no longer requires Giant, so don't force callers to acquire
Giant just to call kthread_exit().

Requested by:	many
2004-03-05 22:42:17 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Poul-Henning Kamp
08b21ed2da Do not aggressively unroll the AES implementation, in non-benchmarking use
it is same speed on small cache cpus and slower on largecache cpus.

Approved by:	sam@
2004-02-04 08:44:10 +00:00