Commit Graph

2103 Commits

Author SHA1 Message Date
Warner Losh
36d6e01474 Eliminate useless adjustments of aliased device.
No need to set any fields in the cloned device. devfs uses symlinks,
so the adev entries returned won't be presented to the drivers. Since
we don't save copies, nothing else will see them. This code came from
the old compat code, and it appears to be obsolete or never needed.

Submitted by: kib@
Differential Review: https://reviews.freebsd.org/D11919
2017-08-07 22:42:46 +00:00
Warner Losh
d3517d306c Expose API to allow disks to ask for alias names in devfs.
Implement disk_add_alias to allow aliases to be added to disks. All
disk have a primary name (say "foo") can also have secondary names
(say "bar") such that all instances of "foo" also have a "bar"
alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes
created by the foo driver and gpart, device nodes bar0, bar0p1, bar1,
bar1s1 and bar1s1a will appear as symlinks back to the original nodes.
This generalizes to multiple aliases. However, since the unit number
follows the primary name, multiple device drivers can't create the
same aliases unless those drives coorinate the unit number space (eg
you couldn't add an alias 'disk' to both 'da' and 'ada' because it's
possible to have da0 and ada0, because 'disk0' is ambiguous).

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:38 +00:00
Warner Losh
5d7d13290a Add alias support to gpart.
When we're creating new providers for each of the partitions, add
aliases to the geom before we create the provider so when geom_dev
tastes the provider, the aliases are in place so the proper /dev
entries are created. So foo5p6 gets created as an alias for bar5p6
when foo is an alias for bar in the geom we're partitioning with
g_part. This also copies aliases from the container geom (eg disk) to
the label geom (the disk with GPT partitioning) so that aliases nest
properly.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:33 +00:00
Warner Losh
c624eb2598 Add aliasing concept to geom.
Add an alias name list to geoms. Use them in geom_dev to create
aliases. Previously, geom_dev would create an device node for the name
of the geom. Now, additional nodes are created pointing back to the
primary node with make_dev_alias_p. Aliases must be in place on the
geom before any tasting occurs.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:28 +00:00
Kirk McKusick
6c6118b390 gjournal is broken in handling its flush_queue. If we have 10 bio's
in the flush_queue:
         1 2 3 4 5 6 7 8 9 10
and another 10 bio's go into the flush queue after only the first five
bio's are removed from the flush queue, the queue should look like:
         6 7 8 9 10 11 12 13 14 15 16 17 18 19 20,
but because of the bug we end up with
         6 11 12 13 14  15 16 17 18 19 20 7 8 9 10.
So the sequence of the bio's is damaged in the flush queue (and
therefore in the journal on disk !). This error can be triggered by
ffs_snapshot() when a block is read with readblock() and gjournal finds
this block in the broken flush queue before it goes to the correct
active queue.

The fix is to place all new blocks at the end of the queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-07 19:40:03 +00:00
Kirk McKusick
683590b642 sysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64
system having over 4GB RAM. That's due to:

1) the limit being u_int instead of u_long like vm.kmem_size (the limit is
   half of vm.kmem_size by default for amd64);
2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long.

The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit
sysctl variable.

PR: 198500
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Reported by: Eugene Grosbein
Discussed with: kib
MFC after: 1 week
2017-08-07 19:18:27 +00:00
Alexander Motin
1631690677 Add GEOM::descr attribute for symmetry with GEOM::ident.
MFC after:	2 weeks
2017-07-06 08:36:14 +00:00
Ryan Libby
fb0e3235ea g_virstor.h: macro parenthesization
Build with gcc -Wint-in-bool-context revealed a macro parenthesization
error (invoking LOG_MSG with a ternary expression for lvl).

Reviewed by:	markj
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential revision:	https://reviews.freebsd.org/D11411
2017-06-30 22:01:18 +00:00
Marcelo Araujo
4323355e76 With r318394 seems it breaks gpart(8) in some embedded systems such like PCEngines,
RPI1-B, Alix and APU2 boards as well as NanoBSD with the following message:

vnode_pager_generic_getpages_done: I/O read error 5

Seems the breakage was because it was missed to include acr in glabel update.

Reported by:	Peter Blok <pblok@bsd4all.org>,
		madpilot, imp and trasz.
Reviewed by:	trasz
Tested by:	Peter Blok and madpilot.
MFC after:	3 days.
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D11365
2017-06-27 01:22:27 +00:00
Stephen J. Kiernan
9a81ba0f24 Add MD_VERIFY option to enable O_VERIFY in open for vnode type.
Add -o [no]verify option to mdconfig (and document in man page.)
Implement GEOM attribute MNT::verified to ask md if the backing vnode is
  verified.
Check for MNT::verified in cd9660 mount to flag the mount as MNT_VERIFIED if
  the underlying device has been verified.

Reviewed by:	rwatson
Approved by:	sjg (mentor)
Obtained from:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D2902
2017-05-31 21:18:11 +00:00
Edward Tomasz Napierala
6635c8ed2f Fix typo.
MFC after:	2 weeks
2017-05-18 08:25:07 +00:00
Mark Johnston
db7c508323 Synchronize unclean mirrors before adding them to a running gmirror.
During gmirror startup, if component mirrors are found to be dirty as is
typical after a system crash, the mirrors are synchronized to the mirror
with highest priority. However if a gmirror starts without all of its
mirrors present, for example because of some transient delays during
tasting, the remaining mirrors must be synchronized before they may become
active.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-05-02 23:29:42 +00:00
Alexander Motin
d109d8adc7 Dump md_iterations as signed, which it really is.
PR:		208305
PR:		196834
MFC after:	2 weeks
2017-04-21 07:43:44 +00:00
Alexander Motin
d8880fd450 Always allow setting number of iterations for the first time.
Before this change it was impossible to set number of PKCS#5v2 iterations,
required to set passphrase, if it has two keys and never had any passphrase.
Due to present metadata format limitations there are still cases when number
of iterations can not be changed, but now it works in cases when it can.

PR:		218512
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D10338
2017-04-21 07:16:07 +00:00
Mark Johnston
a7d94fcc3e Rename two gmirror state flags to make their meanings slightly clearer.
No functional change.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 17:13:57 +00:00
Mark Johnston
1e91412e40 Don't set the mirror GEOM softc to NULL in g_mirror_destroy().
At this point we have not rendezvous'ed with the mirror worker thread, and
I/O may still be in flight. Various I/O completion paths expect to be able
to obtain a reference to the mirror softc from the GEOM, so setting it to
NULL may result in various NULL pointer dereferences if the mirror is
stopped with -f or the kernel is shut down while a mirror is
synchronizing. The worker thread will clear the softc pointer before
exiting.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 17:08:37 +00:00
Mark Johnston
77011eac86 Check for a provider error before enqueuing mirror I/O.
We are otherwise susceptible to a race with a concurrent teardown of the
mirror provider, causing the I/O to be left uncompleted after the mirror
started withering.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 17:03:32 +00:00
Mark Johnston
a65d524afc Stop mirror synchronization before draining the I/O queue.
Regular I/O requests may be blocked by concurrent synchronization requests
targeted to the same LBAs, in which case they are moved to a holding queue
until the conflicting I/O completes. We therefore want to stop
synchronization before completing pending I/O in g_mirror_destroy_provider()
since this ensures that blocked I/O requests are completed as well.

Tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-14 16:54:50 +00:00
Mark Johnston
a4834289d6 Handle NULL entries in gmirror disk ds_bios arrays.
Entries may be removed and freed if an I/O error occurs during mirror
synchronization, so we cannot assume that all entries of ds_bios are
valid.

Also ensure that a synchronization BIO's array index is preserved after
a successful write.

Reported and tested by:	pho
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-04-10 17:15:59 +00:00
Allan Jude
ec5c0e5be9 Implement boot-time encryption key passing (keybuf)
This patch adds a general mechanism for providing encryption keys to the
kernel from the boot loader. This is intended to enable GELI support at
boot time, providing a better mechanism for passing keys to the kernel
than environment variables. It is designed to be extensible to other
applications, and can easily handle multiple encrypted volumes with
different keys.

This mechanism is currently used by the pending GELI EFI work.
Additionally, this mechanism can potentially be used to interface with
GRUB, opening up options for coreboot+GRUB configurations with completely
encrypted disks.

Another benefit over the existing system is that it does not require
re-deriving the user key from the password at each boot stage.

Most of this patch was written by Eric McCorkle. It was extended by
Allan Jude with a number of minor enhancements and extending the keybuf
feature into boot2.

GELI user keys are now derived once, in boot2, then passed to the loader,
which reuses the key, then passes it to the kernel, where the GELI module
destroys the keybuf after decrypting the volumes.

Submitted by:	Eric McCorkle <eric@metricspace.net> (Original Version)
Reviewed by:	oshogbo (earlier version), cem (earlier version)
MFC after:	3 weeks
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D9575
2017-04-01 05:05:22 +00:00
Allan Jude
39b7ca4533 sys/geom/eli: Switch bzero() to explicit_bzero() for sensitive data
In GELI, anywhere we are zeroing out possibly sensitive data, like
the metadata struct, the metadata sector (both contain the encrypted
master key), the user key, or the master key, use explicit_bzero.

Didn't touch the bzero() used to initialize structs.

Reviewed by:	delphij, oshogbo
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D9809
2017-03-31 00:07:03 +00:00
Mark Johnston
0d75d0dfbc Avoid sleeping when the mirror I/O queue is non-empty.
A request may be queued while the queue lock is dropped when the mirror is
being destroyed. The corresponding wakeup would be lost, possibly resulting
in an apparent hang of the mirror worker thread.

Tested by:	pho (part of a larger patch)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-03-29 19:39:07 +00:00
Mark Johnston
c1ab409cba Remove an unneeded g_mirror_destroy_provider() call.
The worker thread will destroy the mirror provider as part of its teardown
sequence. The call made sense in the initial revision of gmirror, but
became unnecessary in r137248.

Tested by:	pho (part of a larger diff)
MFC afteR:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-03-29 19:30:22 +00:00
Mark Johnston
819cd913f4 Refine r301173 a bit.
- Don't execute any of g_mirror_shutdown_post_sync() when panicking. We
  cannot safely idle the mirror or stop synchronization in that state, and
  the current attempts to do so complicate debugging of gmirror itself.
- Check for a non-NULL panicstr instead of using SCHEDULER_STOPPED(). The
  latter was added for use in the locking primitives.

Reviewed by:	mav, pjd
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-03-27 16:25:58 +00:00
Marcelo Araujo
7f5f84f08f After r315112 I broke the tests with eli, instead to pass 0, I should pass
M_NOWAIT to g_media_changed() that will call g_post_event() with this flag.

Reported by:	lwhsu, ngie and ae
2017-03-13 13:56:01 +00:00
Scott Long
d8474e52e3 Report disk flags via the sysctl tree 2017-03-13 11:09:17 +00:00
Marcelo Araujo
2ae0afa8ee Add the capability to refresh the gpart(8) label without need a reboot.
gpart(8) has functionality to change the label of an GPT partition.
This functionality works like it should, however, after a label change
the /dev/gpt/ entries remain unchanged. glabel(8) status output remains
unchanged. The change only takes effect after a reboot.

PR:		162690
Submitted by:	sub.mesa@gmail, Ben RUBSON <ben.rubson@gmail.com>, ae
Reviewed by:	allanjude, bapt, bcr
MFC after:	6 weeks.
Differential Revision:	https://reviews.freebsd.org/D9935
2017-03-12 04:15:56 +00:00
Alexander Motin
4d5832bc12 When chunking large DIOCGDELETE, do it on stripe edge.
MFC after:	2 weeks
2017-03-08 12:18:58 +00:00
Mariusz Zaborski
c27fb0b589 The kern.geom.part.auto_resize should be tunable. 2017-02-28 20:51:20 +00:00
Mariusz Zaborski
01ad653a81 Add sysctl to control auto resize of the GEOM metadata.
Reviewed by:	AllanJude
Differential Revision:	https://reviews.freebsd.org/D9603
2017-02-27 17:54:01 +00:00
Marius Strobl
4874af73c1 - Allow different slicers for different flash types to be registered
with geom_flashmap(4) and teach it about MMC for slicing enhanced
  user data area partitions. The FDT slicer still is the default for
  CFI, NAND and SPI flash on FDT-enabled platforms.
- In addition to a device_t, also pass the name of the GEOM provider
  in question to the slicers as a single device may provide more than
  provider.
- Build a geom_flashmap.ko.
- Use MODULE_VERSION() so other modules can depend on geom_flashmap(4).
- Remove redundant/superfluous GEOM routines that either do nothing
  or provide/just call default GEOM (slice) functionality.
- Trim/adjust includes

Submitted by:	jhibbits (RouterBoard bits)
Reviewed by:	jhibbits
2017-02-22 10:21:39 +00:00
Allan Jude
85c15ab853 improve PBKDF2 performance
The PBKDF2 in sys/geom/eli/pkcs5v2.c is around half the speed it could be

GELI's PBKDF2 uses a simple benchmark to determine a number of iterations
that will takes approximately 2 seconds. The security provided is actually
half what is expected, because an attacker could use the optimized
algorithm to brute force the key in half the expected time.

With this change, all newly generated GELI keys will be approximately 2x
as strong. Previously generated keys will talk half as long to calculate,
resulting in faster mounting of encrypted volumes. Users may choose to
rekey, to generate a new key with the larger default number of iterations
using the geli(8) setkey command.

Security of existing data is not compromised, as ~1 second per brute force
attempt is still a very high threshold.

PR:		202365
Original Research:	https://jbp.io/2015/08/11/pbkdf2-performance-matters/
Submitted by:	Joe Pixton <jpixton@gmail.com> (Original Version), jmg (Later Version)
Reviewed by:	ed, pjd, delphij
Approved by:	secteam, pjd (maintainer)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D8236
2017-02-19 19:30:31 +00:00
John Baldwin
dcbe5188da Defer startup of gjournal switcher kproc.
Don't start switcher kproc until the first GEOM is created.

Reviewed by:	pjd
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D8576
2017-02-07 22:45:59 +00:00
Andrey V. Elsukov
9ef6004352 Check that primary GPT header is valid before wiping partitioning.
This allows safely destroy corrupted GPT when primary header was
rewritten by some data, that do not want to destroy.

MFC after:	1 week
2017-02-04 05:09:47 +00:00
Yoshihiro Takahashi
2b375b4edd Remove pc98 support completely.
I thank all developers and contributors for pc98.

Relnotes:	yes
2017-01-28 02:22:15 +00:00
Alexander Motin
d3fef0a092 Report disk addition errors on add or create subcommand.
MFC after:	1 week
2017-01-20 13:49:04 +00:00
Alexander Motin
17160457b4 Report random flash storage as non-rotating to GEOM_DISK.
While doing it, introduce respective constants in geom_disk.h.

MFC after:	1 week
2017-01-12 08:53:10 +00:00
Conrad Meyer
b28ea2c250 g_raid: Prevent tasters from attempting excessively large reads
Some g_raid tasters attempt metadata reads in multiples of the provider
sectorsize.  Reads larger than MAXPHYS are invalid, so detect and abort
in such situations.

Spiritually similar to r217305 / PR 147851.

PR:		214721
Sponsored by:	Dell EMC Isilon
2017-01-12 06:58:31 +00:00
Dimitry Andric
012039fd55 Fix logic error in gvinum's gv_set_sd_state()
With clang 4.0.0, I'm getting the following warnings:

    sys/geom/vinum/geom_vinum_state.c:186:7: error: logical not is only
    applied to the left hand side of this bitwise operator
    [-Werror,-Wlogical-not-parentheses]
                    if (!flags & GV_SETSTATE_FORCE)
                        ^      ~

The logical not operator should obiously be called after masking.

Reviewed by:	mav, pfg
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D9093
2017-01-08 17:56:54 +00:00
Sepherosa Ziehau
c22dceff9d build: Unbreak LINT
Sponsored by:	Microsoft
2016-12-21 01:39:11 +00:00
Konrad Witaszczyk
480f31c214 Add support for encrypted kernel crash dumps.
Changes include modifications in kernel crash dump routines, dumpon(8) and
savecore(8). A new tool called decryptcore(8) was added.

A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump
configuration in the diocskerneldump_arg structure to the kernel.
The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for
backward ABI compatibility.

dumpon(8) generates an one-time random symmetric key and encrypts it using
an RSA public key in capability mode. Currently only AES-256-CBC is supported
but EKCD was designed to implement support for other algorithms in the future.
The public key is chosen using the -k flag. The dumpon rc(8) script can do this
automatically during startup using the dumppubkey rc.conf(5) variable.  Once the
keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O
control.

When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random
IV and sets up the key schedule for the specified algorithm. Each time the
kernel tries to write a crash dump to the dump device, the IV is replaced by
a SHA-256 hash of the previous value. This is intended to make a possible
differential cryptanalysis harder since it is possible to write multiple crash
dumps without reboot by repeating the following commands:
# sysctl debug.kdb.enter=1
db> call doadump(0)
db> continue
# savecore

A kernel dump key consists of an algorithm identifier, an IV and an encrypted
symmetric key. The kernel dump key size is included in a kernel dump header.
The size is an unsigned 32-bit integer and it is aligned to a block size.
The header structure has 512 bytes to match the block size so it was required to
make a panic string 4 bytes shorter to add a new field to the header structure.
If the kernel dump key size in the header is nonzero it is assumed that the
kernel dump key is placed after the first header on the dump device and the core
dump is encrypted.

Separate functions were implemented to write the kernel dump header and the
kernel dump key as they need to be unencrypted. The dump_write function encrypts
data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps
are not supported due to the way they are constructed which makes it impossible
to use the CBC mode for encryption. It should be also noted that textdumps don't
contain sensitive data by design as a user decides what information should be
dumped.

savecore(8) writes the kernel dump key to a key.# file if its size in the header
is nonzero. # is the number of the current core dump.

decryptcore(8) decrypts the core dump using a private RSA key and the kernel
dump key. This is performed by a child process in capability mode.
If the decryption was not successful the parent process removes a partially
decrypted core dump.

Description on how to encrypt crash dumps was added to the decryptcore(8),
dumpon(8), rc.conf(5) and savecore(8) manual pages.

EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU.
The feature still has to be tested on arm and arm64 as it wasn't possible to run
FreeBSD due to the problems with QEMU emulation and lack of hardware.

Designed by:	def, pjd
Reviewed by:	cem, oshogbo, pjd
Partial review:	delphij, emaste, jhb, kib
Approved by:	pjd (mentor)
Differential Revision:	https://reviews.freebsd.org/D4712
2016-12-10 16:20:39 +00:00
Alexander Motin
b6fe583c55 Add gmirror create subcommand, alike to gstripe, gconcat, etc.
It is quite specific mode of operation without storing on-disk metadata.
It can be useful in some cases in combination with some external control
tools handling mirror creation and disks hot-plug.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-11-30 09:27:08 +00:00
Alexander Motin
dc399583ba Use providergone method to cover race between destroy and g_access().
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-11-13 03:56:26 +00:00
Alexander Motin
80f0a89c62 Do not report error on close even if we have no paths left.
MFC after:	 2 weeks
2016-11-12 18:57:38 +00:00
Bryan Drewery
28323add09 Fix improper use of "its".
Sponsored by:	Dell EMC Isilon
2016-11-08 23:59:41 +00:00
Conrad Meyer
8532d381a9 Add BUF_TRACKING and FULL_BUF_TRACKING buffer debugging
Upstream the BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code.
This can be handy in tracking down what code touched hung bios and bufs
last. The full history is especially useful, but adds enough bloat that
it shouldn't be enabled in release builds.

Function names (or arbitrary string constants) are tracked in a
fixed-size ring in bufs. Bios gain a pointer to the upper buf for
tracking. SCSI CCBs gain a pointer to the upper bio for tracking.

Reviewed by:	markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8366
2016-10-31 23:09:52 +00:00
Ruslan Bukin
ae8b1f90fe Fix alignment issues on MIPS: align the pointers properly.
All the 5520 GEOM_ELI tests passed successfully on MIPS64EB.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
Differential Revision:	https://reviews.freebsd.org/D7905
2016-10-31 16:55:14 +00:00
Mark Johnston
5c2ac5cf2a gmirror: Add a subroutine to free synchronization BIOs.
This addresses a memory leak that occurs upon an I/O error during a mirror
synchronization.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2016-10-20 23:08:40 +00:00
Mark Johnston
b450976dc2 gmirror: Release pending regular requests when synchronization stops.
Normally gmirror allows colliding requests to proceed whenever a
synchronization request completes and advances to the next offset. However
if an I/O request collides with one of the final g_mirror_syncreqs, nothing
releases it once synchronization completes, resulting in an apparent I/O
hang. The same problem can occur if synchronization is aborted by an
I/O error. Therefore, be sure to requeue pending requests when
mirror synchronization is stopped for any reason.

While here, remove some dead code from g_mirror_regular_release().

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2016-10-20 23:02:30 +00:00
Alexander Motin
5a236b0ef9 Fix possible geom destruction before final provider close.
Introduce internal counter to track opens.  Using provider's counters is
not very successfull after calling g_wither_provider().

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2016-10-06 15:20:05 +00:00