Commit Graph

278 Commits

Author SHA1 Message Date
Warner Losh
031beb4e23 sys: Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:58 -06:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Colin Percival
0811ce5723 random: Ingest extra fast entropy when !seeded
We periodically ingest entropy from pollable entropy sources, but only
8 bytes at a time and only occasionally enough to feed all of Fortuna's
pools once per second.  This can result in Fortuna remaining unseeded
for a nontrivial amount of time when there is no entropy passed in from
the boot loader, even if RDRAND is available to quickly provide a large
amount of entropy.

Detect in random_sources_feed if we are not yet seeded, and increase the
amount of immediate entropy harvesting we perform, in order to "fill"
Fortuna's entropy pools and avoid having
  random: randomdev_wait_until_seeded unblock wait
stall the boot process when entropy is available.

This speeds up the FreeBSD boot in the Firecracker VM by 2.3 seconds.

Approved by:	csprng (delphij)
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35802
2022-07-19 23:59:40 -07:00
Andrew Turner
0b040a4809 Fix the random source descriptions
- Add the missing RANDOM_PURE_QUALCOMM description
 - Make RANDOM_PURE_VMGENID consistent with the other pure sources
   by including "PURE_" in the description.

Approved by:	csprng (cem)
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35412
2022-06-17 10:36:17 +01:00
John Baldwin
56f5947a71 Remove checks for __GNUCLIKE_ASM assuming it is always true.
All supported compilers (modern versions of GCC and clang) support
this.

Many places didn't have an #else so would just silently do the wrong
thing.  Ancient versions of icc (the original motivation for this) are
no longer a compiler FreeBSD supports.

PR:		263102 (exp-run)
Reviewed by:	brooks, imp
Differential Revision:	https://reviews.freebsd.org/D34797
2022-04-12 10:05:45 -07:00
Gordon Bergling
474df59def random(3): Fix a typo in a source code comment
- s/psuedo/pseudo/

MFC after:	3 days
2022-04-09 09:14:22 +02:00
Colin Percival
5c73b3e0a3 Add support for getting early entropy from UEFI
UEFI provides a protocol for accessing randomness. This is a good way
to gather early entropy, especially when there's no driver for the RNG
on the platform (as is the case on the Marvell Armada8k (MACCHIATObin)
for now).

If the entropy_efi_seed option is enabled in loader.conf (default: YES)
obtain 2048 bytes of entropy from UEFI and pass is to the kernel as a
"module" of name "efi_rng_seed" and type "boot_entropy_platform"; if
present, ingest it into the kernel RNG.

Submitted by:	Greg V
Reviewed by:	markm, kevans
Approved by:	csprng (markm)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D20780
2022-02-17 13:01:11 -08:00
Kyle Evans
642701abc8 kern: harvest entropy from callouts
74cf7cae4d ("softclock: Use dedicated ithreads for running callouts.")
switched callouts away from the swi infrastructure.  It turns out that
this was a major source of entropy in early boot, which we've now lost.

As a result, first boot on hardware without a 'fast' entropy source
would block waiting for fortuna to be seeded with little hope of
progressing without manual intervention.

Let's resolve it by explicitly harvesting entropy in callout_process()
if we've handled any callouts.  cc/curthread/now seem to be reasonable
sources of entropy, so use those.

Discussed with:	jhb (also proposed initial patch)
Reported by:	many
Reviewed by:	cem, markm (both csprng)
Differential Revision:	https://reviews.freebsd.org/D34150
2022-02-03 10:05:06 -06:00
Colin Percival
1580afcd6e randomdev: Remove 100 ms sleep from write routine
This was introduced in 2014 along with the comment (which has since
been deleted):
	/* Introduce an annoying delay to stop swamping */

Modern cryptographic random number generators can ingest arbitrarily
large amounts of non-random (or even maliciously selected) input
without losing their security.

Depending on the number of "boot entropy files" present on the system,
this can speed up the boot process by up to 1 second.

Reviewed by:	cem
MFC ater:	1 week
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D32984
2021-11-16 10:27:27 -08:00
Konstantin Belousov
362c6d8dec nehemiah: manually assemble xstore(-rng)
It seems that clang IAS erronously adds repz prefix which should not be
there.  Cpu would try to store around %ecx bytes of random, while we
only expect a word.

PR:	259218
Reported and tested by:	 Dennis Clarke <dclarke@blastwave.org>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-23 02:31:16 +03:00
Kyle Evans
5e79bba562 kern: random: collect ~16x less from fast-entropy sources
Previously, we were collecting at a base rate of:

64 bits x 32 pools x 10 Hz = 2.5 kB/s

This change drops it to closer to 64-ish bits per pool per second, to
work a little better with entropy providers in virtualized environments
without compromising the security goals of Fortuna.

Reviewed by:	#csprng (cem, delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D32021
2021-09-23 01:03:02 -05:00
Kyle Evans
6895cade94 kern: random: drop read_rate and associated functionality
Refer to discussion in PR 230808 for a less incomplete discussion, but
the gist of this change is that we currently collect orders of magnitude
more entropy than we need.

The excess comes from bytes being read out of /dev/*random.  The default
rate at which we collect entropy without the read_rate increase is
already more than we need to recover from a compromise of an internal
state.

Reviewed by:	#csprng (cem, delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D32021
2021-09-23 01:03:01 -05:00
Conrad Meyer
f8e8a06d23 random(4) FenestrasX: Push root seed version to arc4random(3)
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled.  Otherwise, there is no
functional change.  The mechanism can be disabled with
debug.fxrng_vdso_enable=0.

arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time.  Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed.  On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.

This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator.  That change is
left for future discussion/work.

Reviewed by:	kib (previous version)
Approved by:	csprng (me -- only touching FXRNG here)
Differential Revision:	https://reviews.freebsd.org/D22839
2020-10-10 21:52:00 +00:00
Conrad Meyer
10b1a17594 arc4random(9): Integrate with RANDOM_FENESTRASX push-reseed
There is no functional change for the existing Fortuna random(4)
implementation, which remains the default in GENERIC.

In the FenestrasX model, when the root CSPRNG is reseeded from pools due to
an (infrequent) timer, child CSPRNGs can cheaply detect this condition and
reseed.  To do so, they just need to track an additional 64-bit value in the
associated state, and compare it against the root seed version (generation)
on random reads.

This revision integrates arc4random(9) into that model without substantially
changing the design or implementation of arc4random(9).  The motivation is
that arc4random(9) is immediately reseeded when the backing random(4)
implementation has additional entropy.  This is arguably most important
during boot, when fenestrasX is reseeding at 1, 3, 9, 27, etc., second
intervals.  Today, arc4random(9) has a hardcoded 300 second reseed window.
Without this mechanism, if arc4random(9) gets weak entropy during initial
seed (and arc4random(9) is used early in boot, so this is quite possible),
it may continue to emit poorly seeded output for 5 minutes.  The FenestrasX
push-reseed scheme corrects consumers, like arc4random(9), as soon as
possible.

Reviewed by:	markm
Approved by:	csprng (markm)
Differential Revision:	https://reviews.freebsd.org/D22838
2020-10-10 21:48:06 +00:00
Conrad Meyer
a3c41f8bfb Add "Fenestras X" alternative /dev/random implementation
Fortuna remains the default; no functional change to GENERIC.

Big picture:
- Scalable entropy generation with per-CPU, buffered local generators.
- "Push" system for reseeding child generators when root PRNG is
  reseeded.  (Design can be extended to arc4random(9) and userspace
  generators.)
- Similar entropy pooling system to Fortuna, but starts with a single
  pool to quickly bootstrap as much entropy as possible early on.
- Reseeding from pooled entropy based on time schedule.  The time
  interval starts small and grows exponentially until reaching a cap.
  Again, the goal is to have the RNG state depend on as much entropy as
  possible quickly, but still periodically incorporate new entropy for
  the same reasons as Fortuna.

Notable design choices in this implementation that differ from those
specified in the whitepaper:
- Blake2B instead of SHA-2 512 for entropy pooling
- Chacha20 instead of AES-CTR DRBG
- Initial seeding.  We support more platforms and not all of them use
  loader(8).  So we have to grab the initial entropy sources in kernel
  mode instead, as much as possible.  Fortuna didn't have any mechanism
  for this aside from the special case of loader-provided previous-boot
  entropy, so most of these sources remain TODO after this commit.

Reviewed by:	markm
Approved by:	csprng (markm)
Differential Revision:	https://reviews.freebsd.org/D22837
2020-10-10 21:45:59 +00:00
John Baldwin
4a711b8d04 Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by:	cem
Reviewed by:	cem, delphij
Approved by:	csprng (cem)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25435
2020-06-25 20:17:34 +00:00
John Baldwin
97e251327f Remove ubsec(4).
This driver was previously marked for deprecation in r360710.

Approved by:	csprng (cem, gordon, delphij)
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24766
2020-05-11 20:30:28 +00:00
Pawel Biernacki
4312ebfe0b Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (18 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Reviewed by:	cem
Approved by:	csprng, kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23841
2020-02-27 13:12:14 +00:00
Conrad Meyer
767991d2be vmgenid(4): Integrate as a random(4) source
The number is public and has no "entropy," but should be integrated quickly
on VM rewind events to avoid duplicate sequences.

Approved by:	csprng(markm)
Differential Revision:	https://reviews.freebsd.org/D22946
2020-01-01 00:35:02 +00:00
Conrad Meyer
374c99911e random(4): Make entropy source deregistration safe
Allow loadable modules that provide random entropy source(s) to safely
unload.  Prior to this change, no driver could ensure that their
random_source structure was not being used by random_harvestq.c for any
period of time after invoking random_source_deregister().

This change converts the source_list LIST to a ConcurrencyKit CK_LIST and
uses an epoch(9) to protect typical read accesses of the list.  The existing
HARVEST_LOCK spin mutex is used to safely add and remove list entries.
random_source_deregister() uses epoch_wait() to ensure no concurrent
source_list readers are accessing a random_source before freeing the list
item and returning to the caller.

Callers can safely unload immediately after random_source_deregister()
returns.

Reviewed by:	markj
Approved by:	csprng(markm)
Discussed with:	jhb
Differential Revision:	https://reviews.freebsd.org/D22489
2019-12-30 01:38:19 +00:00
Conrad Meyer
3ee1d5bb9d random(4): Simplify RANDOM_LOADABLE
Simplify RANDOM_LOADABLE by removing the ability to unload a LOADABLE
random(4) implementation.  This allows one-time random module selection
at boot, by loader(8).  Swapping modules on the fly doesn't seem
especially useful.

This removes the need to hold a lock over the sleepable module calls
read_random and read_random_uio.

init/deinit have been pulled out of random_algorithm entirely.  Algorithms
can run their own sysinits to initialize; deinit is removed entirely, as
algorithms can not be unloaded.  Algorithms should initialize at
SI_SUB_RANDOM:SI_ORDER_SECOND.  In LOADABLE systems, algorithms install
a pointer to their local random_algorithm context in p_random_alg_context at
that time.

Go ahead and const'ify random_algorithm objects; there is no need to mutate
them at runtime.

LOADABLE kernel NULL checks are removed from random_harvestq by ordering
random_harvestq initialization at SI_SUB_RANDOM:SI_ORDER_THIRD, after
algorithm init.  Prior to random_harvestq init, hc_harvest_mask is zero and
no events are forwarded to algorithms; after random_harvestq init, the
relevant pointers will already have been installed.

Remove the bulk of random_infra shim wrappers and instead expose the bare
function pointers in sys/random.h.  In LOADABLE systems, read_random(9) et
al are just thin shim macros around invoking the associated function
pointer.  We do not provide a registration system but instead expect
LOADABLE modules to register themselves at SI_SUB_RANDOM:SI_ORDER_SECOND.
An example is provided in randomdev.c, as used in the random_fortuna.ko
module.

Approved by:	csprng(markm)
Discussed with:	gordon
Differential Revision:	https://reviews.freebsd.org/D22512
2019-12-26 19:32:11 +00:00
Conrad Meyer
68b97d40fb random(4): Flip default Fortuna generator over to Chacha20
The implementation was landed in r344913 and has had some bake time (at
least on my personal systems).  There is some discussion of the motivation
for defaulting to this cipher as a PRF in the commit log for r344913.

As documented in that commit, administrators can retain the prior (AES-ICM)
mode of operation by setting the 'kern.random.use_chacha20_cipher' tunable
to 0 in loader.conf(5).

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22878
2019-12-20 21:11:00 +00:00
Conrad Meyer
548dca90ae random(4): Fortuna: Enable concurrent generation by default for 13
Flip the knob added in r349154 to "enabled."  The commit message from that
revision and associated code comment describe the rationale, implementation,
and motivation for the new default in detail.  I have dog-fooded this
configuration on my own systems for six months, for what that's worth.

For end-users: the result is just as secure.  The benefit is a faster, more
responsive system when processes produce significant demand on random(4).

As mentioned in the earlier commit, the prior behavior may be restored by
setting the kern.random.fortuna.concurrent_read="0" knob in loader.conf(5).

This scales the random generation side of random(4) somewhat, although there
is still a global mutex being shared by all cores and rand_harvestq; the
situation is generally much better than it was before on small CPU systems,
but do not expect miracles on 256-core systems running 256-thread full-rate
random(4) read.  Work is ongoing to address both the generation-side (in
more depth) and the harvest-side scaling problems.

Approved by:	csprng(delphij, markm)
Tested by:	markm
Differential Revision:	https://reviews.freebsd.org/D22879
2019-12-20 08:31:23 +00:00
Conrad Meyer
b6db1cc710 random(4): De-export random_sources list
The internal datastructures do not need to be visible outside of
random_harvestq, and this helps ensure they are not misused.

No functional change.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22485
2019-11-22 20:24:15 +00:00
Conrad Meyer
d7a23f9f6b random(4): Use ordinary sysctl definitions
There's no need to dynamically populate them; the SYSCTL_ macros take care
of load/unload appropriately already (and random_harvestq is 'standard' and
cannot be unloaded anyway).

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22484
2019-11-22 20:22:29 +00:00
Conrad Meyer
f19de0a945 random(4): Abstract loader entropy injection
Break random_harvestq_prime up into some logical subroutines.  The goal
is that it becomes easier to add other early entropy sources.

While here, drop pre-12.0 compatibility logic.  loader default configuration
should preload the file as expeced since 12.0.

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22482
2019-11-22 20:20:37 +00:00
Conrad Meyer
92ebf15da5 random(4): Remove unused definitions
Approved by:	csprng(gordon, markm)
Differential Revision:	https://reviews.freebsd.org/D22481
2019-11-22 20:18:07 +00:00
Conrad Meyer
cb285f7c7c random/ivy: Provide mechanism to read independent seed values from rdrand
On x86 platforms with the intrinsic, rdrand is a deterministic bit generator
(AES-CTR) seeded from an entropic source.  On x86 platforms with rdseed, it
is something closer to the upstream entropic source.  (There is more nuance;
a block diagram is provided in [1].)

On devices with rdrand and without rdseed, there is no good intrinsic for
acecssing the good entropic soure directly.  However, the DRBG is guaranteed
to reseed every 8 kB on these platforms.  As a conservative option, on such
hardware we can read an extra 7.99kB samples every time we want a sample
from an independent seed.

As one can imagine, this drastically slows the effective read rate of
RDRAND (a factor of 1024 on amd64 and 2048 on ia32).  Microbenchmarks on AMD
Zen (has RDSEED) show an RDRAND rate of 25 MB/s and Intel Haswell (no
RDSEED) show RDRAND of 170 MB/s.  This would reduce the read rate on Haswell
to ~170 kB/s (at 100% CPU).  random(4)'s harvestq thread periodically
"feeds" from pure sources in amounts of 128-1024 bytes.  On Haswell,
enabling this feature increases the CPU time of RDRAND in each "feed" from
approximately 0.7-6 µs to 0.7-6 ms.

Because there is some performance penalty to this more conservative option,
a knob is provided to enable the change.  The change does not affect
platforms with RDSEED.

[1]: https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide#inpage-nav-4-2

Approved by:	csprng(delphij, markm)
Differential Revision:	https://reviews.freebsd.org/D22455
2019-11-22 19:30:31 +00:00
Conrad Meyer
c41faf5591 random/ivy: Trivial refactoring
It is clearer to me to return success/error (true/false) instead of some
retry count linked to the inline assembly implementation.

No functional change.

Approved by:	core(csprng) => csprng(markm)
Differential Revision:	https://reviews.freebsd.org/D22454
2019-11-20 19:55:43 +00:00
Conrad Meyer
7384206a94 random(4): Reorder configuration of random source modules
Move fast entropy source registration to the earlier
SI_SUB_RANDOM:SI_ORDER_FOURTH and move random_harvestq_prime after that.
Relocate the registration routines out of the much later randomdev module
and into random_harvestq.

This is necessary for the fast random sources to actually register before we
perform random_harvestq_prime() early in the kernel boot.

No functional change.

Reviewed by:	delphij, markjm
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D21308
2019-08-18 16:04:01 +00:00
Conrad Meyer
878a05a4e6 random(4): Remove "EXPERIMENTAL" verbiage from concurrent operation
No functional change.

Add a verbose comment giving an example side-by-side comparison between the
prior and Concurrent modes of Fortuna, and why one should believe they
produce the same result.

The intent is to flip this on by default prior to 13.0, so testing is
encouraged.  To enable, add the following to loader.conf:

    kern.random.fortuna.concurrent_read="1"

The intent is also to flip the default blockcipher to the faster Chacha-20
prior to 13.0, so testing of that mode of operation is also appreciated.
To enable, add the following to loader.conf:

    kern.random.use_chacha20_cipher="1"

Approved by:	secteam(implicit)
2019-08-15 00:39:53 +00:00
Conrad Meyer
22eedc9722 random(4): Fix a regression in short AES mode reads
In r349154, random device reads of size < 16 bytes (AES block size) were
accidentally broken to loop forever.  Correct the loop condition for small
reads.

Reported by:	pho
Reviewed by:	delphij
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D20686
2019-06-18 18:50:58 +00:00
Conrad Meyer
179f62805c random(4): Fortuna: allow increased concurrency
Add experimental feature to increase concurrency in Fortuna.  As this
diverges slightly from canonical Fortuna, and due to the security
sensitivity of random(4), it is off by default.  To enable it, set the
tunable kern.random.fortuna.concurrent_read="1".  The rest of this commit
message describes the behavior when enabled.

Readers continue to update shared Fortuna state under global mutex, as they
do in the status quo implementation of the algorithm, but shift the actual
PRF generation out from under the global lock.  This massively reduces the
CPU time readers spend holding the global lock, allowing for increased
concurrency on SMP systems and less bullying of the harvestq kthread.

It is somewhat of a deviation from FS&K.  I think the primary difference is
that the specific sequence of AES keys will differ if READ_RANDOM_UIO is
accessed concurrently (as the 2nd thread to take the mutex will no longer
receive a key derived from rekeying the first thread).  However, I believe
the goals of rekeying AES are maintained: trivially, we continue to rekey
every 1MB for the statistical property; and each consumer gets a
forward-secret, independent AES key for their PRF.

Since Chacha doesn't need to rekey for sequences of any length, this change
makes no difference to the sequence of Chacha keys and PRF generated when
Chacha is used in place of AES.

On a GENERIC 4-thread VM (so, INVARIANTS/WITNESS, numbers not necessarily
representative), 3x concurrent AES performance jumped from ~55 MiB/s per
thread to ~197 MB/s per thread.  Concurrent Chacha20 at 3 threads went from
roughly ~113 MB/s per thread to ~430 MB/s per thread.

Prior to this change, the system was extremely unresponsive with 3-4
concurrent random readers; each thread had high variance in latency and
throughput, depending on who got lucky and won the lock.  "rand_harvestq"
thread CPU use was high (double digits), seemingly due to spinning on the
global lock.

After the change, concurrent random readers and the system in general are
much more responsive, and rand_harvestq CPU use dropped to basically zero.

Tests are added to the devrandom suite to ensure the uint128_add64 primitive
utilized by unlocked read functions to specification.

Reviewed by:	markm
Approved by:	secteam(delphij)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20313
2019-06-17 20:29:13 +00:00
Conrad Meyer
d0d71d818c random(4): Generalize algorithm-independent APIs
At a basic level, remove assumptions about the underlying algorithm (such as
output block size and reseeding requirements) from the algorithm-independent
logic in randomdev.c.  Chacha20 does not have many of the restrictions that
AES-ICM does as a PRF (Pseudo-Random Function), because it has a cipher
block size of 512 bits.  The motivation is that by generalizing the API,
Chacha is not penalized by the limitations of AES.

In READ_RANDOM_UIO, first attempt to NOWAIT allocate a large enough buffer
for the entire user request, or the maximal input we'll accept between
signal checking, whichever is smaller.  The idea is that the implementation
of any randomdev algorithm is then free to divide up large requests in
whatever fashion it sees fit.

As part of this, two responsibilities from the "algorithm-generic" randomdev
code are pushed down into the Fortuna ra_read implementation (and any other
future or out-of-tree ra_read implementations):

  1. If an algorithm needs to rekey every N bytes, it is responsible for
  handling that in ra_read(). (I.e., Fortuna's 1MB rekey interval for AES
  block generation.)

  2. If an algorithm uses a block cipher that doesn't tolerate partial-block
  requests (again, e.g., AES), it is also responsible for handling that in
  ra_read().

Several APIs are changed from u_int buffer length to the more canonical
size_t.  Several APIs are changed from taking a blockcount to a bytecount,
to permit PRFs like Chacha20 to directly generate quantities of output that
are not multiples of RANDOM_BLOCKSIZE (AES block size).

The Fortuna algorithm is changed to NOT rekey every 1MiB when in Chacha20
mode (kern.random.use_chacha20_cipher="1").  This is explicitly supported by
the math in FS&K §9.4 (Ferguson, Schneier, and Kohno; "Cryptography
Engineering"), as well as by their conclusion: "If we had a block cipher
with a 256-bit [or greater] block size, then the collisions would not
have been an issue at all."

For now, continue to break up reads into PAGE_SIZE chunks, as they were
before.  So, no functional change, mostly.

Reviewed by:	markm
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D20312
2019-06-17 15:09:12 +00:00
Conrad Meyer
403c041316 random(4): Add regression tests for uint128 implementation, Chacha CTR
Add some basic regression tests to verify behavior of both uint128
implementations at typical boundary conditions, to run on all architectures.

Test uint128 increment behavior of Chacha in keystream mode, as used by
'kern.random.use_chacha20_cipher=1' (r344913) to verify assumptions at edge
cases.  These assumptions are critical to the safety of using Chacha as a
PRF in Fortuna (as implemented).

(Chacha's use in arc4random is safe regardless of these tests, as it is
limited to far less than 4 billion blocks of output in that API.)

Reviewed by:	markm
Approved by:	secteam(gordon)
Differential Revision:	https://reviews.freebsd.org/D20392
2019-06-17 14:59:45 +00:00
Conrad Meyer
5ca5dfe938 random(4): Fix RANDOM_LOADABLE build
I introduced an obvious compiler error in r346282, so this change fixes
that.

Unfortunately, RANDOM_LOADABLE isn't covered by our existing tinderbox, and
it seems like there were existing latent linking problems.  I believe these
were introduced on accident in r338324 during reduction of the boolean
expression(s) adjacent to randomdev.c and hash.c.  It seems the
RANDOM_LOADABLE build breakage has gone unnoticed for nine months.

This change correctly annotates randomdev.c and hash.c with !random_loadable
to match the pre-r338324 logic; and additionally updates the HWRNG drivers
in MD 'files.*', which depend on random_device symbols, with
!random_loadable (it is invalid for the kernel to depend on symbols from a
module).

(The expression for both randomdev.c and hash.c was the same, prior to
r338324: "optional random random_yarrow | random !random_yarrow
!random_loadable".  I.e., "random && (yarrow || !loadable)."  When Yarrow
was removed ("yarrow := False"), the expression was incorrectly reduced to
"optional random" when it should have retained "random && !loadable".)

Additionally, I discovered that virtio_random was missing a MODULE_DEPEND on
random_device, which breaks kld load/link of the driver on RANDOM_LOADABLE
kernels.  Address that issue as well.

PR:		238223
Reported by:	Eir Nym <eirnym AT gmail.com>
Reviewed by:	delphij, markm
Approved by:	secteam(delphij)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20466
2019-06-01 01:22:21 +00:00
Conrad Meyer
00e0e488a0 random(4): deduplicate explicit_bzero() in harvest
Pull the responsibility for zeroing events, which is general to any
conceivable implementation of a random device algorithm, out of the
algorithm-specific Fortuna code and into the callers.  Most callers
indirect through random_fortuna_process_event(), so add the logic there.
Most callers already explicitly bzeroed the events they provided, so the
logic in Fortuna was mostly redundant.

Add one missing bzero in randomdev_accumulate().  Also, remove a redundant
bzero in the same function -- randomdev_hash_finish() is obliged to bzero
the hash state.

Reviewed by:	delphij
Approved by:	secteam(delphij)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20318
2019-05-23 21:02:27 +00:00
Konstantin Belousov
7c5a46a1bc Remove resolver_qual from DEFINE_IFUNC/DEFINE_UIFUNC macros.
In all practical situations, the resolver visibility is static.

Requested by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	so (emaste)
Differential revision:	https://reviews.freebsd.org/D20281
2019-05-16 22:20:54 +00:00
Conrad Meyer
e8e1f0b420 Fortuna: Fix false negatives in is_random_seeded()
(1) We may have had sufficient entropy to consider Fortuna seeded, but the
random_fortuna_seeded() function would produce a false negative if
fs_counter was still zero.  This condition could arise after
random_harvestq_prime() processed the /boot/entropy file and before any
read-type operation invoked "pre_read()."  Fortuna's fs_counter variable is
only incremented (if certain conditions are met) by reseeding, which is
invoked by random_fortuna_pre_read().

is_random_seeded(9) was introduced in r346282, but the function was unused
prior to r346358, which introduced this regression.  The regression broke
initial seeding of arc4random(9) and broke periodic reseeding[A], until something
other than arc4random(9) invoked read_random(9) or read_random_uio(9) directly.
(Such as userspace getrandom(2) or read(2) of /dev/random.  By default,
/etc/rc.d/random does this during multiuser start-up.)

(2) The conditions under which Fortuna will reseed (including initial seeding)
are: (a) sufficient "entropy" (by sheer byte count; default 64) is collected
in the zeroth pool (of 32 pools), and (b) it has been at least 100ms since
the last reseed (to prevent trivial DoS; part of FS&K design).  Prior to
this revision, initial seeding might have been prevented if the reseed
function was invoked during the first 100ms of boot.

This revision addresses both of these issues.  If random_fortuna_seeded()
observes a zero fs_counter, it invokes random_fortuna_pre_read() and checks
again.  This addresses the problem where entropy actually was sufficient,
but nothing had attempted a read -> pre_read yet.

The second change is to disable the 100ms reseed guard when Fortuna has
never been seeded yet (fs_lasttime == 0).  The guard is intended to prevent
gratuitous subsequent reseeds, not initial seeding!

Machines running CURRENT between r346358 and this revision are encouraged to
refresh when possible.  Keys generated by userspace with /dev/random or
getrandom(9) during this timeframe are safe, but any long-term session keys
generated by kernel arc4random consumers are potentially suspect.

[A]: Broken in the sense that is_random_seeded(9) false negatives would cause
arc4random(9) to (re-)seed with weak entropy (SHA256(cyclecount ||
FreeBSD_version)).

PR:		237869
Reported by:	delphij, dim
Reviewed by:	delphij
Approved by:	secteam(delphij)
X-MFC-With:	r346282, r346358 (if ever)
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20239
2019-05-13 19:35:35 +00:00
Mark Johnston
b870199522 Avoid returning a NULL pointer from the Intel hw PRNG ifunc resolver.
DTrace expects kernel function symbols of a non-zero size to have an
implementation, which is a reasonable invariant to preserve.

Reported and tested by:	ler
Reviewed by:	cem, kib
Approved by:	so (delphij)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20218
2019-05-10 04:28:17 +00:00
Conrad Meyer
e01ada5c44 random(4): Don't complain noisily when an entropy source is slow
Mjg@ reports that RDSEED (r347239) causes a lot of logspam from this printf,
and I don't feel that it is especially useful (even ratelimited).  There are
many other quality/quantity checks we're not performing on entropy sources;
lack of high frequency availability does not disqualify a good entropy
source.

There is some discussion in the linked Differential about what logging might
be appropriate and/or polling policy for slower TRNG sources.  Please feel
free to chime in if you have opinions.

Reported by:	mjg
Reviewed by:	markm, delphij
Approved by:	secteam(delphij)
X-MFC-With:	r347239
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20195
2019-05-08 14:54:32 +00:00
Conrad Meyer
2cb54a800c random: x86 driver: Prefer RDSEED over RDRAND when available
Per
https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed
, RDRAND is a PRNG seeded from the same source as RDSEED.  The source is
more suitable as PRNG seed material, so prefer it when the RDSEED intrinsic
is available (indicated in CPU feature bits).

Reviewed by:	delphij, jhb, imp (earlier version)
Approved by:	secteam(delphij)
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D20192
2019-05-08 00:45:16 +00:00
Conrad Meyer
3782136ff1 random(4): Restore availability tradeoff prior to r346250
As discussed in that commit message, it is a dangerous default.  But the
safe default causes enough pain on a variety of platforms that for now,
restore the prior default.

Some of this is self-induced pain we should/could do better about; for
example, programmatic CI systems and VM managers should introduce entropy
from the host for individual VM instances.  This is considered a future work
item.

On modern x86 and Power9 systems, this may be wholly unnecessary after
D19928 lands (even in the non-ideal case where early /boot/entropy is
unavailable), because they have fast hardware random sources available early
in boot.  But D19928 is not yet landed and we have a host of architectures
which do not provide fast random sources.

This change adds several tunables and diagnostic sysctls, documented
thoroughly in UPDATING and sys/dev/random/random_infra.c.

PR:		230875 (reopens)
Reported by:	adrian, jhb, imp, and probably others
Reviewed by:	delphij, imp (earlier version), markm (earlier version)
Discussed with:	adrian
Approved by:	secteam(delphij)
Relnotes:	yeah
Security:	related
Differential Revision:	https://reviews.freebsd.org/D19944
2019-04-18 20:48:54 +00:00
Conrad Meyer
f3d2512db6 random(4): Add is_random_seeded(9) KPI
The imagined use is for early boot consumers of random to be able to make
decisions based on whether random is available yet or not.  One such
consumer seems to be __stack_chk_init(), which runs immediately after random
is initialized.  A follow-up patch will attempt to address that.

Reported by:	many
Reviewed by:	delphij (except man page)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19926
2019-04-16 17:12:17 +00:00
Conrad Meyer
13774e8228 random(4): Block read_random(9) on initial seeding
read_random() is/was used, mostly without error checking, in a lot of
very sensitive places in the kernel -- including seeding the widely used
arc4random(9).

Most uses, especially arc4random(9), should block until the device is seeded
rather than proceeding with a bogus or empty seed.  I did not spy any
obvious kernel consumers where blocking would be inappropriate (in the
sense that lack of entropy would be ok -- I did not investigate locking
angle thoroughly).  In many instances, arc4random_buf(9) or that family
of APIs would be more appropriate anyway; that work was done in r345865.

A minor cleanup was made to the implementation of the READ_RANDOM function:
instead of using a variable-length array on the stack to temporarily store
all full random blocks sufficient to satisfy the requested 'len', only store
a single block on the stack.  This has some benefit in terms of reducing
stack usage, reducing memcpy overhead and reducing devrandom output leakage
via the stack.  Additionally, the stack block is now safely zeroed if it was
used.

One caveat of this change is that the kern.arandom sysctl no longer returns
zero bytes immediately if the random device is not seeded.  This means that
FreeBSD-specific userspace applications which attempted to handle an
unseeded random device may be broken by this change.  If such behavior is
needed, it can be replaced by the more portable getrandom(2) GRND_NONBLOCK
option.

On any typical FreeBSD system, entropy is persisted on read/write media and
used to seed the random device very early in boot, and blocking is never a
problem.

This change primarily impacts the behavior of /dev/random on embedded
systems with read-only media that do not configure "nodevice random".  We
toggle the default from 'charge on blindly with no entropy' to 'block
indefinitely.'  This default is safer, but may cause frustration.  Embedded
system designers using FreeBSD have several options.  The most obvious is to
plan to have a small writable NVRAM or NAND to persist entropy, like larger
systems.  Early entropy can be fed from any loader, or by writing directly
to /dev/random during boot.  Some embedded SoCs now provide a fast hardware
entropy source; this would also work for quickly seeding Fortuna.  A 3rd
option would be creating an embedded-specific, more simplistic random
module, like that designed by DJB in [1] (this design still requires a small
rewritable media for forward secrecy).  Finally, the least preferred option
might be "nodevice random", although I plan to remove this in a subsequent
revision.

To help developers emulate the behavior of these embedded systems on
ordinary workstations, the tunable kern.random.block_seeded_status was
added.  When set to 1, it blocks the random device.

I attempted to document this change in random.4 and random.9 and ran into a
bunch of out-of-date or irrelevant or inaccurate content and ended up
rototilling those documents more than I intended to.  Sorry.  I think
they're in a better state now.

PR:		230875
Reviewed by:	delphij, markm (earlier version)
Approved by:	secteam(delphij), devrandom(markm)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D19744
2019-04-15 18:40:36 +00:00
Marcin Wojtas
4ee7d3b011 Allow using TPM as entropy source.
TPM has a built-in RNG, with its own entropy source.
The driver was extended to harvest 16 random bytes from TPM every 10 seconds.
A new build option "TPM_HARVEST" was introduced - for now, however, it
is not enabled by default in the GENERIC config.

Submitted by: Kornel Duleba <mindal@semihalf.com>
Reviewed by: markm, delphij
Approved by: secteam
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D19620
2019-03-23 05:13:51 +00:00
Conrad Meyer
ab69c4858c Fortuna: Add Chacha20 as an alternative stream cipher
Chacha20 with a 256 bit key and 128 bit counter size is a good match for an
AES256-ICM replacement.

In userspace, Chacha20 is typically marginally slower than AES-ICM on
machines with AESNI intrinsics, but typically much faster than AES on
machines without special intrinsics.  ChaCha20 does well on typical modern
architectures with SIMD instructions, which includes most types of machines
FreeBSD runs on.

In the kernel, we can't (or don't) make use of AESNI intrinsics for
random(4) anyway.  So even on amd64, using Chacha provides a modest
performance improvement in random device throughput today.

This change makes the stream cipher used by random(4) configurable at boot
time with the 'kern.random.use_chacha20_cipher' tunable.

Very rough, non-scientific measurements at the /dev/random device, on a
GENERIC-NODEBUG amd64 VM with 'pv', show a factor of 2.2x higher throughput
for Chacha20 over the existing AES-ICM mode.

Reviewed by:	delphij, markm
Approved by:	secteam (delphij)
Differential Revision:	https://reviews.freebsd.org/D19475
2019-03-08 01:17:20 +00:00
Conrad Meyer
e66ccbeaa3 fortuna: Deduplicate kernel vs user includes
No functional change.

Reviewed by:	markj, markm
Approved by:	secteam (delphij), core (brooks)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19409
2019-03-01 22:51:45 +00:00