Commit Graph

2142 Commits

Author SHA1 Message Date
Alexander Kabaev
151ba7933a Do pass removing some write-only variables from the kernel.
This reduces noise when kernel is compiled by newer GCC versions,
such as one used by external toolchain ports.

Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial)
Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c)
Differential Revision: https://reviews.freebsd.org/D10385
2017-12-25 04:48:39 +00:00
Mark Johnston
9abe2e7e98 Avoid using bioq_* in gmirror.
gmirror does not perform any sorting of I/O requests, so the bioq API
doesn't provide any advantages over plain TAILQs. The API also does not
provide operations needed by an upcoming change.

No functional change intended. The diff shrinks the geom_mirror.ko
text and the gmirror softc slightly.

Tested by:	pho (part of a larger patch)
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-19 17:13:04 +00:00
Mark Johnston
68eadcec0f Give a couple of predication functions a bool return type.
No functional change intended.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-15 19:14:21 +00:00
Mark Johnston
204d94f161 Typo.
MFC after:	1 week
2017-12-15 19:03:03 +00:00
Mark Johnston
8b93770503 Address a possible lost wakeup for gmirror events.
g_mirror_event_send() acquires the I/O queue lock to deliver a wakeup
to the worker thread, and this is done after enqueuing the event.
So it's sufficient to check the event queue before atomically releasing
the queue lock and going to sleep.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-12 17:29:34 +00:00
Mark Johnston
b634781eac Give g_mirror_event_get() a more accurate name.
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-12 17:25:25 +00:00
Mark Johnston
a3584ee355 Decrement sc_writes when BIO_DELETE requests complete.
Otherwise a gmirror that has received a BIO_DELETE request will never be
marked clean (unless sc_writes overflows).

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-12-12 17:24:30 +00:00
Eugene Grosbein
1b8ea9beff geom_raid (RAID5): do not lose bp->bio_error, keep it in pbp->bio_error
and return it by passing to g_raid_iodone()

Approved by:	mav (mentor)
MFC after:	3 days
2017-12-07 20:09:17 +00:00
Eugene Grosbein
01a51aba38 Fix use-after-free that sometimes results in a garbage returned
instead of right error code after requests to SINGLE/CONCAT volumes, f.e:

# dd if=/dev/raid/r0 bs=512 of=/dev/null
dd: /dev/raid/r0: Unknown error: -559038242

Reviewed by:	avg (mentor), mav (mentor)
MFC after:	3 days
2017-12-07 05:55:18 +00:00
Warner Losh
700b6f8e23 When building standalone, include stand.h rather than the kernel
includes or the userland includes.

Sponsored by: Netflix
2017-12-05 21:37:32 +00:00
Warner Losh
3a7d67e741 We don't need both _STAND and _STANDALONE. There's more places that
use _STANDALONE, so change the former to the latter.

Sponsored by: Netflix
2017-12-02 00:07:09 +00:00
Mark Johnston
2ceafb776e Update gmirror metadata less frequently when synchronizing.
We periodically record synchronization progress in the metadata
block of the disk being synchronized; this allows an interrupted
synchronization to be resumed. However, the frequency of these
updates heavily pessimized synchronization time on some media. This
change modifies gmirror to update metadata based on a time period,
and adds a sysctl to control that period. The default value results
in a much lower update frequency and increases the completion time
for an interrupted rebuild only marginally.

Reported by:	Andre Albsmeier <andre@fbsd.e4m.org>
MFC after:	3 weeks
2017-11-30 20:36:29 +00:00
Pedro F. Giffuni
3728855a0f sys/geom: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:17:37 +00:00
Mark Johnston
0349817103 Allow kern.geom.mirror.debug to be negative.
A negative value can be used to suppress all prints from the gmirror
kernel code, which can be useful when attempting to trigger race
conditions using stress tests.

MFC after:	1 week
2017-11-23 14:07:52 +00:00
Warner Losh
fdcf0c7477 While the EFI spec allows numbers to be in many forms, libefivar
produces hex numbers for the dsn. Since that come is from EDK2, change
this for symmetry, by generating the dsn as a hex number.

Noticed by: gpart list | grep efimedia | awk -F: '{print $2;}' | \
	sed -e 's/^ *//g;s/,,/,/' | grep MBR | efidp -p | efidp -f
Sponsored by: Netflix
2017-11-21 06:12:21 +00:00
Warner Losh
2ab9683565 Remove trailing whitespace (one I just introduced and a bunch of
others in the same directory).

Sponsored by: Netflix
2017-11-21 05:42:13 +00:00
Warner Losh
d65b6588d6 Implement efi media tagging for MBR partitioning types.
Sponsored by: Netflix
2017-11-21 05:35:21 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Andriy Gapon
f0fa2af656 geom_slice: fix r325227, protect against multiple calls to g_slice_free
This geom does not immediately detach its consumer relying on the
wither-washer to do that.  Since that happens asynchronously we may get
additional spoiling events.  So, we need to account for that.

There are multiple options for fixing this issue like detaching
immediately or checking for G_CF_ORPHAN in g_slice_spoiled().
The most reliable and least intrusive fix seems to be setting
geom->softc to NULL on the first call and checking for NULL on
subsequent calls.  This is something that the code did before r325227.

Reported by:	David Wolfskill <david@catwhisker.org>,
		O. Hartmann <o.hartmann@walstatt.org>
Tested by:	David Wolfskill <david@catwhisker.org> (earlier version)
Discussed with:	mav
MFC after:	1 week
X-MFC with:	r325227
2017-11-01 10:53:10 +00:00
Andriy Gapon
9662d80af5 geom_slice: do not destroy softc until providers are gone
At present, g_slice_orphan and g_slice_spoiled destroy the softc
(struct g_slicer) even before calling g_wither_geom, so there can
be active and incoming io requests at that time and g_slice_start
can access the softc.

This commit changes the code to destroy the softc only after all
providers are closed.

While there, a couple of small cleanups.

Reported by:	Ben RUBSON <ben.rubson@gmail.com>
Tested by:	Ben RUBSON <ben.rubson@gmail.com>
Reviewed by:	mav, smh (earlier version)
MFC after:	2 weeks
Sponsored by:	Panzura
Differential Revision: https://reviews.freebsd.org/D12809
2017-10-31 10:10:13 +00:00
Edward Tomasz Napierala
338ed98ad2 Add back missing MTX_DEF, it still needs to be there.
(Although it's defined to be 0, so there's no functional change.)

Reported by:	glebius
MFC after:	2 weeks
2017-10-29 12:03:06 +00:00
Mark Johnston
cef5abd140 Fix a lock leak in g_mirror_destroy().
g_mirror_destroy() is supposed to unlock the softc before indicating
success, but it wasn't doing so if the caller raced with another
thread destroying the mirror.

MFC after:	1 week
Sponsored by:	Dell EMC Isilon
2017-10-27 17:05:14 +00:00
Edward Tomasz Napierala
0d73fface2 Make gmountver(8) use direct dispatch.
MFC after:	2 weeks
2017-10-26 10:18:31 +00:00
Edward Tomasz Napierala
0a8cfed8cf Make gmountver(8) use G_PF_ACCEPT_UNMAPPED.
MFC after:	2 weeks
2017-10-26 09:29:35 +00:00
Mark Johnston
64a16434d8 Add support for compressed kernel dumps.
When using a kernel built with the GZIO config option, dumpon -z can be
used to configure gzip compression using the in-kernel copy of zlib.
This is useful on systems with large amounts of RAM, which require a
correspondingly large dump device. Recovery of compressed dumps is also
faster since fewer bytes need to be copied from the dump device.

Because we have no way of knowing the final size of a compressed dump
until it is written, the kernel will always attempt to dump when
compression is configured, regardless of the dump device size. If the
dump is aborted because we run out of space, an error is reported on
the console.

savecore(8) is modified to handle compressed dumps and save them to
vmcore.<index>.gz, as it does when given the -z option.

A new rc.conf variable, dumpon_flags, is added. Its value is added to
the boot-time dumpon(8) invocation that occurs when a dump device is
configured in rc.conf.

Reviewed by:	cem (earlier version)
Discussed with:	def, rgrimes
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11723
2017-10-25 00:51:00 +00:00
Alan Somers
27f0f2ec5f Display rotation rate and TRIM/UNMAP support in diskinfo(8)
Bump __FreeBSD_version due to the expansion of struct diocgattr_arg.

Reviewed by:	mav, allanjude, imp
MFC after:	3 weeks
Relnotes:	yes
Sponsored by:	Spectra Logic Corp
Differential Revision:	https://reviews.freebsd.org/D12578
2017-10-04 15:09:49 +00:00
Edward Tomasz Napierala
b73e1f746a Don't destroy gmountver(8) devices on shutdown, unless they are orphaned.
Otherwise we would fail to sync the filesystem on reboot.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-10-04 12:25:39 +00:00
Edward Tomasz Napierala
2b4490a5d3 Clear G_CF_ORPHAN when attaching. This fixes cases where the same
GEOM consumer can be orphaned, and then reattach to another provider.

From a user point of view, this makes gmountver(4) work again.

Reviewed by:	avg, mav
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D12228
2017-10-02 11:57:00 +00:00
Conrad Meyer
a523de2365 g_resize_provider_event: Do not invoke orphan method twice
Like r266444, g_resize_provider_event can attempt to orphan an already
orphaned geom_dev consumer.  This will cause a panic in g_dev_orphan.  Apply
the same fix as was applied to g_orphan_register.

Reviewed by:	ae
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12469
2017-09-24 19:59:26 +00:00
Andriy Gapon
7103ac8ad6 gmirror: treat ENXIO as disk disconnect, not media error
In theory, all data access errors mean that a member is out of sync
at most.  But they were treated as more serious errors to avoid the
situation where a flaky disk gets repeatedly disconnected, re-synchronized,
reconnected and then disconnected again.

ENXIO is a special error that means that the member disk disappeared,
so it should get the same handling as the GEOM orphaning event.
There is a better chance that when the disk is reconnected, it will be
a good member again.

When ENXIO happens on a read we use the exisiting G_MIRROR_BUMP_SYNCID
mechanism which means that the mirror's syncid is increased as soon
as there is a write to the mirror.  That's because no data has got out
of sync yet, but the problematic memeber is disconnected, so the future
write will make it stale.

When ENXIO happens on a write we use a new G_MIRROR_BUMP_SYNCID_NOW
mechanism which means that we update the mirror metadata as soon as
possible because the problematic memeber is already behind.

Reviewed by:	markj, imp
MFC after:	3 weeks
Differential Revision: https://reviews.freebsd.org/D9463
2017-09-15 13:57:08 +00:00
Conrad Meyer
ea5eee641e Fix information leak in geli(8) integrity mode
In integrity mode, a larger logical sector (e.g., 4096 bytes) spans several
physical sectors (e.g., 512 bytes) on the backing device.  Due to hash
overhead, a 4096 byte logical sector takes 8.5625 512-byte physical sectors.
This means that only 288 bytes (256 data + 32 hash) of the last 512 byte
sector are used.

The memory allocation used to store the encrypted data to be written to the
physical sectors comes from malloc(9) and does not use M_ZERO.

Previously, nothing initialized the final physical sector backing each
logical sector, aside from the hash + encrypted data portion.  So 224 bytes
of kernel heap memory was leaked to every block :-(.

This patch addresses the issue by initializing the trailing portion of the
physical sector in every logical sector to zeros before use.  A much simpler
but higher overhead fix would be to tag the entire allocation M_ZERO.

PR:		222077
Reported by:	Maxim Khitrov <max AT mxcrypt.com>
Reviewed by:	emaste
Security:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D12272
2017-09-09 01:41:01 +00:00
Warner Losh
a905e3962c The hard drive media device path contains the size of the partition,
not its end. This makes the GEOM efimedia attribute match the
FreeBSD:Boot1Device environment variable now.

Sponsored by: Netflix
2017-09-02 07:04:06 +00:00
Warner Losh
ab4effdc68 Add efimedia attribute for all GPT partitions.
Sposnored by: Netflix
Differential Revision: https://reviews.freebsd.org/D12206
2017-09-01 17:55:25 +00:00
Konstantin Belousov
58d8f357c7 Let g_access() log the actual error number.
Submitted by:	 Fabian Keil <fk@fabiankeil.de>
PR:	221855
MFC after:	1 week
2017-08-27 12:24:25 +00:00
Mariusz Zaborski
3453dc72ad Hide length of geli passphrase during boot.
Introduce additional flag to the geli which allows to restore previous
behavior.

Reviewed by:	AllanJude@, cem@ (previous version)
MFC:		1 month
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D11751
2017-08-26 14:07:24 +00:00
Kirk McKusick
037331ddbd When read requests are sent from a filesystem running above g_journal,
the g_journal level needs to check whether it is holding a newer
copy of the block than that which exists on the disk. If so, it
needs to return its copy. If not, it should pass the request down
to the disk to fulfill. It currently considers six queues:

0) delayed queue,
1) unsent (current queue),
2) in-flight to the journal (flush queue),
3) active journal (active queue),
4) inactive journal (inactive queue), and
5) inflight to the disk (copy queue).

Checking on two of these queues is unnecessary:

0) The delayed requests should not be used for reads because they
   have not yet been entered into the journal, so their value should
   reflect the disk contents, not the future contents that are not
   yet committed.

2) Because all the bio's in the flush queue are also found on the
   active queue, there is no need to inspect the flush queue for
   reads since they will be found when searching the active queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-13 18:09:22 +00:00
Kirk McKusick
8fccf8ffd7 Eliminate a variable that is only ever set.
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-13 18:06:38 +00:00
Warner Losh
0038725697 Also provide a warning for geom_fox.
Differential Review: https://reviews.freebsd.org/D11935
Requested by: jhb@
MFC After: 3 days
2017-08-09 16:37:37 +00:00
Warner Losh
20995eab57 Mark geom classes as deprecated.
geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel
Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC
since FreeBSD 8. Add warning when used.

geom_vol_ffs has been obsolete since ufs support to geom_label was
committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5.
Add warning when used.

geom_fox has been obsolete since gmultipath was committed in FreeBSD 7.
(no warning added, since this is a very obscure class).

These will all be removed in FreeBSD 12.

MFC After: 3 days
Differential Revision: https://reviews.freebsd.org/D11935

Note: Classes will be removed after MFC
2017-08-09 16:15:24 +00:00
Warner Losh
36d6e01474 Eliminate useless adjustments of aliased device.
No need to set any fields in the cloned device. devfs uses symlinks,
so the adev entries returned won't be presented to the drivers. Since
we don't save copies, nothing else will see them. This code came from
the old compat code, and it appears to be obsolete or never needed.

Submitted by: kib@
Differential Review: https://reviews.freebsd.org/D11919
2017-08-07 22:42:46 +00:00
Warner Losh
d3517d306c Expose API to allow disks to ask for alias names in devfs.
Implement disk_add_alias to allow aliases to be added to disks. All
disk have a primary name (say "foo") can also have secondary names
(say "bar") such that all instances of "foo" also have a "bar"
alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes
created by the foo driver and gpart, device nodes bar0, bar0p1, bar1,
bar1s1 and bar1s1a will appear as symlinks back to the original nodes.
This generalizes to multiple aliases. However, since the unit number
follows the primary name, multiple device drivers can't create the
same aliases unless those drives coorinate the unit number space (eg
you couldn't add an alias 'disk' to both 'da' and 'ada' because it's
possible to have da0 and ada0, because 'disk0' is ambiguous).

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:38 +00:00
Warner Losh
5d7d13290a Add alias support to gpart.
When we're creating new providers for each of the partitions, add
aliases to the geom before we create the provider so when geom_dev
tastes the provider, the aliases are in place so the proper /dev
entries are created. So foo5p6 gets created as an alias for bar5p6
when foo is an alias for bar in the geom we're partitioning with
g_part. This also copies aliases from the container geom (eg disk) to
the label geom (the disk with GPT partitioning) so that aliases nest
properly.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:33 +00:00
Warner Losh
c624eb2598 Add aliasing concept to geom.
Add an alias name list to geoms. Use them in geom_dev to create
aliases. Previously, geom_dev would create an device node for the name
of the geom. Now, additional nodes are created pointing back to the
primary node with make_dev_alias_p. Aliases must be in place on the
geom before any tasting occurs.

Differential Revision: https://reviews.freebsd.org/D11873
2017-08-07 21:12:28 +00:00
Kirk McKusick
6c6118b390 gjournal is broken in handling its flush_queue. If we have 10 bio's
in the flush_queue:
         1 2 3 4 5 6 7 8 9 10
and another 10 bio's go into the flush queue after only the first five
bio's are removed from the flush queue, the queue should look like:
         6 7 8 9 10 11 12 13 14 15 16 17 18 19 20,
but because of the bug we end up with
         6 11 12 13 14  15 16 17 18 19 20 7 8 9 10.
So the sequence of the bio's is damaged in the flush queue (and
therefore in the journal on disk !). This error can be triggered by
ffs_snapshot() when a block is read with readblock() and gjournal finds
this block in the broken flush queue before it goes to the correct
active queue.

The fix is to place all new blocks at the end of the queue.

Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Discussed with: kib
MFC after: 1 week
2017-08-07 19:40:03 +00:00
Kirk McKusick
683590b642 sysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64
system having over 4GB RAM. That's due to:

1) the limit being u_int instead of u_long like vm.kmem_size (the limit is
   half of vm.kmem_size by default for amd64);
2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long.

The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit
sysctl variable.

PR: 198500
Submitted by: Dr. Andreas Longwitz <longwitz@incore.de>
Reported by: Eugene Grosbein
Discussed with: kib
MFC after: 1 week
2017-08-07 19:18:27 +00:00
Alexander Motin
1631690677 Add GEOM::descr attribute for symmetry with GEOM::ident.
MFC after:	2 weeks
2017-07-06 08:36:14 +00:00
Ryan Libby
fb0e3235ea g_virstor.h: macro parenthesization
Build with gcc -Wint-in-bool-context revealed a macro parenthesization
error (invoking LOG_MSG with a ternary expression for lvl).

Reviewed by:	markj
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential revision:	https://reviews.freebsd.org/D11411
2017-06-30 22:01:18 +00:00
Marcelo Araujo
4323355e76 With r318394 seems it breaks gpart(8) in some embedded systems such like PCEngines,
RPI1-B, Alix and APU2 boards as well as NanoBSD with the following message:

vnode_pager_generic_getpages_done: I/O read error 5

Seems the breakage was because it was missed to include acr in glabel update.

Reported by:	Peter Blok <pblok@bsd4all.org>,
		madpilot, imp and trasz.
Reviewed by:	trasz
Tested by:	Peter Blok and madpilot.
MFC after:	3 days.
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D11365
2017-06-27 01:22:27 +00:00
Stephen J. Kiernan
9a81ba0f24 Add MD_VERIFY option to enable O_VERIFY in open for vnode type.
Add -o [no]verify option to mdconfig (and document in man page.)
Implement GEOM attribute MNT::verified to ask md if the backing vnode is
  verified.
Check for MNT::verified in cd9660 mount to flag the mount as MNT_VERIFIED if
  the underlying device has been verified.

Reviewed by:	rwatson
Approved by:	sjg (mentor)
Obtained from:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D2902
2017-05-31 21:18:11 +00:00
Edward Tomasz Napierala
6635c8ed2f Fix typo.
MFC after:	2 weeks
2017-05-18 08:25:07 +00:00