Commit Graph

2330 Commits

Author SHA1 Message Date
eugen
22eefa8b94 geom_part: make it possible recovering broken GPT after some LBAs cut off
This is followup to r365477.

If pre-formatted device has GPT and a partition covering
last available LBAs and the device is attached using
a bridge reducing amount of LBAs, then it could be not enough
forcing GEOM to use primary GPT. Also, we should make it possible
to recover GPT and this requires either deleting or resizing the partition.

This change enables "gpart delete" and "gpart resize" commands
on corrupted GPT with following "gpart recover".

It still does not allow modifying corrupted GPT without
preliminary setting sysctl kern.geom.part.check_integrity=0

For example:

# gpart show da0
=>        34  3906963389  da0  GPT  (1.8T) [CORRUPT]
          34      262144    1  ms-reserved  (128M)
      262178        2014       - free -  (1.0M)
      264192  3906764943    2  freebsd-swap  (1.8T)
# gpart resize -i 2 -s 3900000000 da0
# gpart recover da0

Reported by:	Alex Korchmar
MFC after:	3 days
2020-09-17 04:39:39 +00:00
imp
fd6740f61e We don't need the sc_ekeys_lock in standalone environment.
When we bring in geli into the boot loader, we are single threaded so
we don't have to worry about locking. We have no mutexes, and don't need
to use them, so comment it out.

MFC After: 3 days
2020-09-14 23:51:14 +00:00
trasz
067e33842f Move TDP_GEOM check from userret() to ast(); this code path is quite
infrequent.

Reviewed by:	kib
No objections:	mav
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26374
2020-09-14 10:14:03 +00:00
eugen
ad24dd082a geom_part: extend kern.geom.part.check_integrity to work on GPT
There are multiple USB/SATA bridges on the market that unconditionally
cut some LBAs off connected media. This could be a problem
for pre-partitioned drives so GEOM complains and does not create
devices in /dev for slices/partitions preventing access to existing data.

We have kern.geom.part.check_integrity that allows us to correct
partitioning if changed from default 1 to 0 but it works for MBR only.
If backup copy of GPT is unavailable due to decreases number of LBAs,
kernel still does not give access to partitions and prints to dmesg:

GEOM: md0: corrupt or invalid GPT detected.
GEOM: md0: GPT rejected -- may not be recoverable.

This change makes it work for GPT too, so it created partitions in /dev
and prints to dmesg this instead:

GEOM: md0: the secondary GPT table is corrupt or invalid.
GEOM: md0: using the primary only -- recovery suggested.

Then "gpart recover" re-created backup copy of GPT
and allows further manipulations with partitions.

This change is no-op for default configuration having
kern.geom.part.check_integrity=1

Reported by:	Alex Korchmar
MFC after:	3 days.
2020-09-08 22:23:53 +00:00
mjg
246370ba36 geom: clean up empty lines in .c and .h files 2020-09-01 22:14:09 +00:00
imp
9763b7f570 Retire devctl_notify_f()
devctl_notify_f isn't needed, so retire it. The flags argument is now
unused, so rather than keep it around, retire it. Convert all old
users of it to devctl_notify(). This path no longer sleeps, so is safe
to call from any context. Since it doesn't sleep, it doesn't need to
know if it is OK to sleep or not.

Reviewed by: markj@
Differential Revision: https://reviews.freebsd.org/D26140
2020-08-29 04:30:06 +00:00
asomers
3ab78c10af geli: use unmapped I/O
Use unmapped I/O for geli. Unlike most geom providers, geli needs to
manipulate data on every read or write. Previously it would always map bios.

On my 16-core, dual socket server using geli atop md(4) devices, with 512B
sectors, this change increases geli IOPs by about 3x.

Note that geli still can't use unmapped I/O when data integrity verification
is enabled (but it could, with a little more work).  And it can't use
unmapped I/O in combination with ZFS, because ZFS uses mapped bios.

Reviewed by:	markj, kib, jhb, mjg, mat, bcr (manpages)
MFC after:	1 week
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D25671
2020-08-26 02:44:35 +00:00
imp
0e02faf73c Use devctl.h instead of bus.h to reduce newbus pollution.
There's no need for these parts of the kernel to know about newbus,
so narrow what is included to devctl.h for device_notify_*.

Suggested by: kib@
2020-08-21 00:03:24 +00:00
cem
aa185fa69b gpart(8): Recognize apple-zfs and solaris-reserved partition ids
Introduce G_PART_ALIAS_SOLARIS_RESERVED, GPT_ENT_TYPE_SOLARIS_RESERVED et al.,
to make gpart show output more convenient on systems with illumos/openindiana
disks visible.

Submitted by:	Juraj Lutter <otis AT sk.FreeBSD.org>
Reviewed by:	bcr(manpages), delphij, myself
Differential Revision:	https://reviews.freebsd.org/D26012
2020-08-17 17:07:05 +00:00
jhb
003c0b2eaa Fix indentation. 2020-07-27 16:31:21 +00:00
delphij
9b50683aa7 gctl_get_geom: Skip validation of g_class.
The caller from kernel is expected to provide an valid g_class
pointer, instead of traversing the global g_class list, just
use that pointer directly instead.

Reviewed by:		mav
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D25811
2020-07-26 22:30:55 +00:00
delphij
42ecd2556b geom_map and geom_redboot: Remove unused ctlreq handler.
The two classes do not take any verbs and always gctl_error for
all requests, so don't bother to provide a ctlreq handler.

Reviewed by:		mav
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D25810
2020-07-26 22:30:01 +00:00
delphij
8d66daf2fc Use snprintf instead of sprintf.
MFC after:	2 weeks
2020-07-26 01:45:26 +00:00
delphij
6a2ece89b4 geom_label: Make glabel labels more trivial by separating the tasting
routines out.

While there, also simplify the creation of label paths a little bit
by requiring the / suffix for label directory prefixes (ld_dir renamed
to ld_dirprefix to indicate the change) and stop defining macros for
these when they are only used once.

Reviewed by:		cem
MFC after:		2 weeks
Differential Revision:	https://reviews.freebsd.org/D25597
2020-07-26 00:44:59 +00:00
delphij
dd22d53dd6 Consistently use gctl_get_provider instead of home-grown variants.
Reviewed by:		cem, imp
MFC after:		2 weeks
Differential revision:	https://reviews.freebsd.org/D25739
2020-07-22 02:15:21 +00:00
delphij
1e8583399f gctl_get_class, gctl_get_geom and gctl_get_provider: provide feedback
when the requested argument is missing.

Reviewed by:		cem
MFC after:		2 weeks
Differential revision:	https://reviews.freebsd.org/D25738
2020-07-22 02:14:27 +00:00
asomers
a4d0ce6eb3 Fix geli's null cipher, and add a test case
PR:		247954
Submitted by:	jhb (sys), asomers (tests)
Reviewed by:	jhb (tests), asomers (sys)
MFC after:	2 weeks
Sponsored by:	Axcient
2020-07-21 19:18:29 +00:00
delphij
0f5603fc77 Fix indent for if clause.
MFC after:	2 weeks
2020-07-20 01:55:19 +00:00
delphij
de021239d7 g_concat_find_device: trim /dev/ if it is present, like other GEOM
classes.

Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25596
2020-07-09 08:00:46 +00:00
delphij
3249097b0a sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/".
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25565
2020-07-09 02:52:39 +00:00
asomers
bd9a5e4721 geli: enable direct dispatch
geli does all of its crypto operations in a separate thread pool, so
g_eli_start, g_eli_read_done, and g_eli_write_done don't actually do very
much work. Enabling direct dispatch eliminates the g_up/g_down bottlenecks,
doubling IOPs on my system. This change does not affect the thread pool.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D25587
2020-07-08 17:12:12 +00:00
cem
6e934c3644 geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.

Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234."  However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.

Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0.  This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered.  That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.

(This change does not add support for the 'bootcode' verb to EBR.)

PR:		232463
Reported by:	Manish Jain <bourne.identity AT hotmail.com>
Discussed with:	ae (no objection)
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D24939
2020-07-01 02:16:36 +00:00
jhb
c8e0782a7e Use explicit_bzero() instead of bzero() for sensitive data.
Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25441
2020-06-25 20:25:35 +00:00
jhb
f90bf52ef5 Use zfree() instead of bzero() and free().
These bzero's should have been explicit_bzero's.

Reviewed by:	cem, delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25437
2020-06-25 20:20:22 +00:00
jhb
a900668f4a Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by:	cem
Reviewed by:	cem, delphij
Approved by:	csprng (cem)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25435
2020-06-25 20:17:34 +00:00
mckusick
9e5277ef92 Optimize g_journal's superblock update by noting that the summary
information is neither read nor written so it need not be written
out when updating the superblock.

PR:           247425
Sponsored by: Netflix
2020-06-23 21:44:00 +00:00
bapt
a095da0a47 Revert r362466
Such change should not have happen without prior discussion and review.

With hat:	transitioning core
2020-06-22 07:46:24 +00:00
hselasky
2333998f66 Improve wording to be more precise and clear.
No functional change intended.

s/Master Boot/Main Boot/ (also called MBR)

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-21 13:34:08 +00:00
mckusick
fc56f5d127 Move the pointers stored in the superblock into a separate
fs_summary_info structure. This change was originally done
by the CheriBSD project as they need larger pointers that
do not fit in the existing superblock.

This cleanup of the superblock eases the task of the commit
that immediately follows this one.

Suggested by: brooks
Reviewed by:  kib
PR:           246983
Sponsored by: Netflix
2020-06-19 01:02:53 +00:00
jhb
bbd694b98b Add a crypto capability flag for accelerated software drivers.
Use this in GELI to print out a different message when accelerated
software such as AESNI is used vs plain software crypto.

While here, simplify the logic in GELI a bit for determing which type
of crypto driver was chosen the first time by examining the
capabilities of the matched driver after a single call to
crypto_newsession rather than making separate calls with different
flags.

Reviewed by:	delphij
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25126
2020-06-09 22:26:07 +00:00
cem
d4d4471117 Revert r361838
Reported by:	delphij
2020-06-06 14:19:16 +00:00
cem
14cc84cb6a geom_label: Use provider aliasing to alias upstream geoms
For synthetic aliases (just pseudonyms inferred from metadata like GPT or
UFS labels, GPT UUIDs, etc), use the GEOM provider aliasing system to create
a symlink to the real device instead of creating an independent device.
This makes it more clear which labels and devices correspond, and we can
safely have multiple labels to a single device accessed at once.

The confusingly named geom_label on-disk construct continues to behave
identically to how it did before.

This requires teaching GEOM's provider aliasing about the possibility
that aliases might be added later in time, and GEOM's devfs interaction
layer not to worry about existing aliases during retaste.

Discussed with:	imp
Relnotes:	sure, if we don't end up reverting it
Differential Revision:	https://reviews.freebsd.org/D24968
2020-06-05 16:12:21 +00:00
cem
a2ca650df6 geom: Don't re-add duplicate aliases
Reviewed by:	imp (informal +1; extracted from phab 24968)
2020-06-05 16:05:09 +00:00
cem
83e0af4c93 geom_part: Dispatch to partitions to create providers and aliases
This allows partitions to create additional aliases of their own.  The
default method implementations preserve the existing behavior.

No functional change.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D24938
2020-05-29 19:44:18 +00:00
asomers
d10d581667 geli: fix a livelock during panic
During any kind of shutdown, kern_reboot calls geli's pre_sync event hook,
which tries to destroy all unused geli devices. But during a panic, geli
can't destroy any devices, because the scheduler is stopped, so it can't
switch threads. A livelock results, and the system never dumps core.

This commit fixes the problem by refusing to destroy any devices during
panic, used or otherwise.

PR:		246207
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	Axcient
Differential Revision:	https://reviews.freebsd.org/D24697
2020-05-27 19:13:26 +00:00
chs
c18fd01602 This commit enables a UFS filesystem to do a forcible unmount when
the underlying media fails or becomes inaccessible. For example
when a USB flash memory card hosting a UFS filesystem is unplugged.

The strategy for handling disk I/O errors when soft updates are
enabled is to stop writing to the disk of the affected file system
but continue to accept I/O requests and report that all future
writes by the file system to that disk actually succeed. Then
initiate an asynchronous forced unmount of the affected file system.

There are two cases for disk I/O errors:

   - ENXIO, which means that this disk is gone and the lower layers
     of the storage stack already guarantee that no future I/O to
     this disk will succeed.

   - EIO (or most other errors), which means that this particular
     I/O request has failed but subsequent I/O requests to this
     disk might still succeed.

For ENXIO, we can just clear the error and continue, because we
know that the file system cannot affect the on-disk state after we
see this error. For EIO or other errors, we arrange for the geom_vfs
layer to reject all future I/O requests with ENXIO just like is
done when the geom_vfs is orphaned. In both cases, the file system
code can just clear the error and proceed with the forcible unmount.

This new treatment of I/O errors is needed for writes of any buffer
that is involved in a dependency. Most dependencies are described
by a structure attached to the buffer's b_dep field. But some are
created and processed as a result of the completion of the dependencies
attached to the buffer.

Clearing of some dependencies require a read. For example if there
is a dependency that requires an inode to be written, the disk block
containing that inode must be read, the updated inode copied into
place in that buffer, and the buffer then written back to disk.

Often the needed buffer is already in memory and can be used. But
if it needs to be read from the disk, the read will fail, so we
fabricate a buffer full of zeroes and pretend that the read succeeded.
This zero'ed buffer can be updated and written back to disk.

The only case where a buffer full of zeros causes the code to do
the wrong thing is when reading an inode buffer containing an inode
that still has an inode dependency in memory that will reinitialize
the effective link count (i_effnlink) based on the actual link count
(i_nlink) that we read. To handle this case we now store the i_nlink
value that we wrote in the inode dependency so that it can be
restored into the zero'ed buffer thus keeping the tracking of the
inode link count consistent.

Because applications depend on knowing when an attempt to write
their data to stable storage has failed, the fsync(2) and msync(2)
system calls need to return errors if data fails to be written to
stable storage. So these operations return ENXIO for every call
made on files in a file system where we have otherwise been ignoring
I/O errors.

Coauthered by: mckusick
Reviewed by:   kib
Tested by:     Peter Holm
Approved by:   mckusick (mentor)
Sponsored by:  Netflix
Differential Revision:  https://reviews.freebsd.org/D24088
2020-05-25 23:47:31 +00:00
jhb
8f001f91aa Add support for optional separate output buffers to in-kernel crypto.
Some crypto consumers such as GELI and KTLS for file-backed sendfile
need to store their output in a separate buffer from the input.
Currently these consumers copy the contents of the input buffer into
the output buffer and queue an in-place crypto operation on the output
buffer.  Using a separate output buffer avoids this copy.

- Create a new 'struct crypto_buffer' describing a crypto buffer
  containing a type and type-specific fields.  crp_ilen is gone,
  instead buffers that use a flat kernel buffer have a cb_buf_len
  field for their length.  The length of other buffer types is
  inferred from the backing store (e.g. uio_resid for a uio).
  Requests now have two such structures: crp_buf for the input buffer,
  and crp_obuf for the output buffer.

- Consumers now use helper functions (crypto_use_*,
  e.g. crypto_use_mbuf()) to configure the input buffer.  If an output
  buffer is not configured, the request still modifies the input
  buffer in-place.  A consumer uses a second set of helper functions
  (crypto_use_output_*) to configure an output buffer.

- Consumers must request support for separate output buffers when
  creating a crypto session via the CSP_F_SEPARATE_OUTPUT flag and are
  only permitted to queue a request with a separate output buffer on
  sessions with this flag set.  Existing drivers already reject
  sessions with unknown flags, so this permits drivers to be modified
  to support this extension without requiring all drivers to change.

- Several data-related functions now have matching versions that
  operate on an explicit buffer (e.g. crypto_apply_buf,
  crypto_contiguous_subsegment_buf, bus_dma_load_crp_buf).

- Most of the existing data-related functions operate on the input
  buffer.  However crypto_copyback always writes to the output buffer
  if a request uses a separate output buffer.

- For the regions in input/output buffers, the following conventions
  are followed:
  - AAD and IV are always present in input only and their
    fields are offsets into the input buffer.
  - payload is always present in both buffers.  If a request uses a
    separate output buffer, it must set a new crp_payload_start_output
    field to the offset of the payload in the output buffer.
  - digest is in the input buffer for verify operations, and in the
    output buffer for compute operations.  crp_digest_start is relative
    to the appropriate buffer.

- Add a crypto buffer cursor abstraction.  This is a more general form
  of some bits in the cryptosoft driver that tried to always use uio's.
  However, compared to the original code, this avoids rewalking the uio
  iovec array for requests with multiple vectors.  It also avoids
  allocate an iovec array for mbufs and populating it by instead walking
  the mbuf chain directly.

- Update the cryptosoft(4) driver to support separate output buffers
  making use of the cursor abstraction.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24545
2020-05-25 22:12:04 +00:00
imp
4ea1f7a5c3 Reimplement aliases in geom
The alias needs to be part of the provider instead of the geom to work
properly. To bind the DEV geom, we need to look at the provider's names and
aliases and create the dev entries from there. If this lives in the GEOM, then
it won't propigate down the tree properly. Remove it from geom, add it provider.

Update geli, gmountver, gnop, gpart, and guzip to use it, which handles the bulk
of the uses in FreeBSD. I think this is all the providers that create a new name
based on their parent's name.
2020-05-13 19:17:28 +00:00
cem
b7a52bdf7c geom(4) mirror: Do not panic on gmirror(8) insert, resize
Geom_mirror initialization occurs in spurts and the present of a
non-destroyed g_mirror softc does not always indicate that the geom has
launched (i.e., has an sc_provider).

Some gmirror(8) commands (via g_mirror_ctl) depend on a g_mirror's
sc_provider (insert and resize).  For those commands, g_mirror_ctl is
modified to sleep-poll in an interruptible way until the target geom is
either launched or destroyed.

Reviewed by:	markj
Tested by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D24780
2020-05-11 22:39:53 +00:00
pjd
c41a141aca Add g_topology_locked() macro that returns true if we already hold the GEOM
topology lock.
2020-04-25 21:41:09 +00:00
jhb
b9b909056c Mark eli_metadata_crypto_supported inline.
This quiets warnings about it not being always used.

Reported by:	kevans
2020-04-15 18:27:28 +00:00
jhb
2314192d69 Remove support for geli(4) algorithms deprecated in r348206.
This removes support for reading and writing volumes using the
following algorithms:

- Triple DES
- Blowfish
- MD5 HMAC integrity

In addition, this commit adds an explicit whitelist of supported
algorithms to give a better error message when an invalid or
unsupported algorithm is used by an existing volume.

Reviewed by:	cem
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24343
2020-04-15 00:14:50 +00:00
imp
45cf829984 Now that we don't have special-case geom hacking defined in md_var.h, stop
including it. sparc64 was the last straggler here, but these weren't removed at
the time.
2020-04-07 22:23:22 +00:00
markj
1781f9ef6f geom_journal: Only stop the switcher process if one was started.
PR:		243196
MFC after:	1 week
2020-04-03 13:57:41 +00:00
jhb
ddcef18974 Refactor driver and consumer interfaces for OCF (in-kernel crypto).
- The linked list of cryptoini structures used in session
  initialization is replaced with a new flat structure: struct
  crypto_session_params.  This session includes a new mode to define
  how the other fields should be interpreted.  Available modes
  include:

  - COMPRESS (for compression/decompression)
  - CIPHER (for simply encryption/decryption)
  - DIGEST (computing and verifying digests)
  - AEAD (combined auth and encryption such as AES-GCM and AES-CCM)
  - ETA (combined auth and encryption using encrypt-then-authenticate)

  Additional modes could be added in the future (e.g. if we wanted to
  support TLS MtE for AES-CBC in the kernel we could add a new mode
  for that.  TLS modes might also affect how AAD is interpreted, etc.)

  The flat structure also includes the key lengths and algorithms as
  before.  However, code doesn't have to walk the linked list and
  switch on the algorithm to determine which key is the auth key vs
  encryption key.  The 'csp_auth_*' fields are always used for auth
  keys and settings and 'csp_cipher_*' for cipher.  (Compression
  algorithms are stored in csp_cipher_alg.)

- Drivers no longer register a list of supported algorithms.  This
  doesn't quite work when you factor in modes (e.g. a driver might
  support both AES-CBC and SHA2-256-HMAC separately but not combined
  for ETA).  Instead, a new 'crypto_probesession' method has been
  added to the kobj interface for symmteric crypto drivers.  This
  method returns a negative value on success (similar to how
  device_probe works) and the crypto framework uses this value to pick
  the "best" driver.  There are three constants for hardware
  (e.g. ccr), accelerated software (e.g. aesni), and plain software
  (cryptosoft) that give preference in that order.  One effect of this
  is that if you request only hardware when creating a new session,
  you will no longer get a session using accelerated software.
  Another effect is that the default setting to disallow software
  crypto via /dev/crypto now disables accelerated software.

  Once a driver is chosen, 'crypto_newsession' is invoked as before.

- Crypto operations are now solely described by the flat 'cryptop'
  structure.  The linked list of descriptors has been removed.

  A separate enum has been added to describe the type of data buffer
  in use instead of using CRYPTO_F_* flags to make it easier to add
  more types in the future if needed (e.g. wired userspace buffers for
  zero-copy).  It will also make it easier to re-introduce separate
  input and output buffers (in-kernel TLS would benefit from this).

  Try to make the flags related to IV handling less insane:

  - CRYPTO_F_IV_SEPARATE means that the IV is stored in the 'crp_iv'
    member of the operation structure.  If this flag is not set, the
    IV is stored in the data buffer at the 'crp_iv_start' offset.

  - CRYPTO_F_IV_GENERATE means that a random IV should be generated
    and stored into the data buffer.  This cannot be used with
    CRYPTO_F_IV_SEPARATE.

  If a consumer wants to deal with explicit vs implicit IVs, etc. it
  can always generate the IV however it needs and store partial IVs in
  the buffer and the full IV/nonce in crp_iv and set
  CRYPTO_F_IV_SEPARATE.

  The layout of the buffer is now described via fields in cryptop.
  crp_aad_start and crp_aad_length define the boundaries of any AAD.
  Previously with GCM and CCM you defined an auth crd with this range,
  but for ETA your auth crd had to span both the AAD and plaintext
  (and they had to be adjacent).

  crp_payload_start and crp_payload_length define the boundaries of
  the plaintext/ciphertext.  Modes that only do a single operation
  (COMPRESS, CIPHER, DIGEST) should only use this region and leave the
  AAD region empty.

  If a digest is present (or should be generated), it's starting
  location is marked by crp_digest_start.

  Instead of using the CRD_F_ENCRYPT flag to determine the direction
  of the operation, cryptop now includes an 'op' field defining the
  operation to perform.  For digests I've added a new VERIFY digest
  mode which assumes a digest is present in the input and fails the
  request with EBADMSG if it doesn't match the internally-computed
  digest.  GCM and CCM already assumed this, and the new AEAD mode
  requires this for decryption.  The new ETA mode now also requires
  this for decryption, so IPsec and GELI no longer do their own
  authentication verification.  Simple DIGEST operations can also do
  this, though there are no in-tree consumers.

  To eventually support some refcounting to close races, the session
  cookie is now passed to crypto_getop() and clients should no longer
  set crp_sesssion directly.

- Assymteric crypto operation structures should be allocated via
  crypto_getkreq() and freed via crypto_freekreq().  This permits the
  crypto layer to track open asym requests and close races with a
  driver trying to unregister while asym requests are in flight.

- crypto_copyback, crypto_copydata, crypto_apply, and
  crypto_contiguous_subsegment now accept the 'crp' object as the
  first parameter instead of individual members.  This makes it easier
  to deal with different buffer types in the future as well as
  separate input and output buffers.  It's also simpler for driver
  writers to use.

- bus_dmamap_load_crp() loads a DMA mapping for a crypto buffer.
  This understands the various types of buffers so that drivers that
  use DMA do not have to be aware of different buffer types.

- Helper routines now exist to build an auth context for HMAC IPAD
  and OPAD.  This reduces some duplicated work among drivers.

- Key buffers are now treated as const throughout the framework and in
  device drivers.  However, session key buffers provided when a session
  is created are expected to remain alive for the duration of the
  session.

- GCM and CCM sessions now only specify a cipher algorithm and a cipher
  key.  The redundant auth information is not needed or used.

- For cryptosoft, split up the code a bit such that the 'process'
  callback now invokes a function pointer in the session.  This
  function pointer is set based on the mode (in effect) though it
  simplifies a few edge cases that would otherwise be in the switch in
  'process'.

  It does split up GCM vs CCM which I think is more readable even if there
  is some duplication.

- I changed /dev/crypto to support GMAC requests using CRYPTO_AES_NIST_GMAC
  as an auth algorithm and updated cryptocheck to work with it.

- Combined cipher and auth sessions via /dev/crypto now always use ETA
  mode.  The COP_F_CIPHER_FIRST flag is now a no-op that is ignored.
  This was actually documented as being true in crypto(4) before, but
  the code had not implemented this before I added the CIPHER_FIRST
  flag.

- I have not yet updated /dev/crypto to be aware of explicit modes for
  sessions.  I will probably do that at some point in the future as well
  as teach it about IV/nonce and tag lengths for AEAD so we can support
  all of the NIST KAT tests for GCM and CCM.

- I've split up the exising crypto.9 manpage into several pages
  of which many are written from scratch.

- I have converted all drivers and consumers in the tree and verified
  that they compile, but I have not tested all of them.  I have tested
  the following drivers:

  - cryptosoft
  - aesni (AES only)
  - blake2
  - ccr

  and the following consumers:

  - cryptodev
  - IPsec
  - ktls_ocf
  - GELI (lightly)

  I have not tested the following:

  - ccp
  - aesni with sha
  - hifn
  - kgssapi_krb5
  - ubsec
  - padlock
  - safe
  - armv8_crypto (aarch64)
  - glxsb (i386)
  - sec (ppc)
  - cesa (armv7)
  - cryptocteon (mips64)
  - nlmsec (mips64)

Discussed with:	cem
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23677
2020-03-27 18:25:23 +00:00
jhb
4d92c70724 Use the newer EINTEGRITY error when authentication fails.
GELI used to fail with EINVAL when a read request spanned a disk
sector whose contents did not match the sector's authentication tag.
The recently-added EINTEGRITY more closely matches to the error in
this case.

Reviewed by:	cem, mckusick
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24131
2020-03-23 21:26:32 +00:00
kaktus
ad355b0a9d Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
kaktus
b3eed6299b Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (12 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23637
2020-02-24 10:42:56 +00:00
kevans
3540e3cf2d geli taste: allow GELIBOOT tagged providers as well
Currently the installer will tag geliboot partitions with both BOOT and
GELIBOOT; the former allows the kernel to taste it at boot, while the latter
is what loaders keys off of.

However, it seems reasonable to assume that if a provider's been tagged with
GELIBOOT that the kernel should also take that as a hint to taste/attach at
boot. This would allow us to stop tagging GELIBOOT partitions with BOOT in
bsdinstall, but I'm not sure that there's a compelling reason to do so any
time soon.

Reviewed by:	oshogbo
Differential Revision:	https://reviews.freebsd.org/D23387
2020-02-07 21:36:14 +00:00
imp
2ae8935cbe Supress not supported message
For the moment, supress the operation not supported messages at this level.  In
the fullness of time, we will have better error tracking so we can diagnose
issues in the future.

Reviewed by: scottl@
2020-02-07 17:47:08 +00:00