Commit Graph

1883 Commits

Author SHA1 Message Date
delphij
47f8d13b54 In g_eli_crypto_hmac_init(), zero out after using the ipad buffer,
k_ipad.

Note that the two consumers in geli(4) are not affected by this
issue because the way the code is constructed and as such, we
believe there is no security impact with or without this change
with geli(4)'s usage.

Reported by:	Serge van den Boom <serge vdboom.org>
Reviewed by:	pjd
MFC after:	2 weeks
2014-02-08 05:17:49 +00:00
loos
3fd6ea64d7 Fix the build with DEBUG enabled. Where possible, fix style(9) issues.
Reviewed by:	bde
Approved by:	adrian (mentor)
2014-02-07 13:06:48 +00:00
loos
df1c99a134 Fix a logic error. Because of this inflateReset() wasn't being called and
the output buffer wasn't being cleared between the inflate() calls,
producing zeroed output after the first inflate() call.

This fixes the read of mkuzip(8) images with geom_uncompress(4).

Reviewed by:	ray
Approved by:	adrian (mentor)
2014-02-03 17:25:36 +00:00
loos
dc52643210 Remove some unnecessary code. The offsets read from the first block are
overwritten a few lines bellow.

Reviewed by:	ray
Approved by:	adrian (mentor)
2014-02-03 17:21:36 +00:00
ae
165a2b7500 Always free sbuf in gctl_free().
MFC after:	1 week
2014-01-23 21:30:31 +00:00
ae
cb5cd1d9fc Remove another unneeded NULL check from geom_alloc_copyin().
Do copyout in case of gctl version mismatch and fix sbuf leak in
g_ctl_ioctl_ctl().

MFC after:	1 week
2014-01-23 20:25:38 +00:00
ae
f9c14b9c5e In gctl_copyin() remove unused error variable.
geom_alloc_copyin() can't return ENOMEM, so describe its fail as bad
control request. Add check for NULL pointer in gctl_dump(), since it
can be NULL when geom_alloc_copyin() failed.

MFC after:	1 week
2014-01-23 19:55:02 +00:00
ae
ce531f97ec Fix typo in r261084.
Add to the gctl_error() an ability to specify error description even
if numeric error code is already specified. Also by default set
error code to EINVAL.

PR:		185852
MFC after:	1 week
2014-01-23 19:31:17 +00:00
ae
3ec50b516b malloc() with M_WAITOK doesn't return NULL.
MFC after:	1 week
2014-01-23 19:07:22 +00:00
mav
d0581e230a Removed unneeded and dangerous assignment. It would probably cause NULL
refererence panic if compiler not optimize it out.

Found with:	Clang static analyzer
MFC after:	2 weeks
2014-01-19 16:37:57 +00:00
loos
74ff5d934c Build the geom_uncompress(4) module by default.
Fix geom_uncompress(4) module loading.  Don't link zlib.c (which is a module
itself) directly.

The built module was verified and used to read a few mkulzma(8) images on
amd64 to validate some of the informations on the manual page.

While here, don't overwrite CFLAGS.

Reviewed by:	ray
Approved by:	adrian (mentor)
2014-01-10 20:29:46 +00:00
ae
150bba5c8f Add an ability to stop gmirror and clear its metadata in one command.
This fixes the problem, when gmirror starts again just after stop.

The problem occurs when gmirror's component has geom label with equal size.
E.g. gpt and gptid have the same size as partition, diskid has the same
size as entire disk. When gmirror's geom has been destroyed, glabel
creates its providers and this initiate retaste.

Now "gmirror destroy" command is available. It destroys geom and also
erases gmirror's metadata.

MFC after:	2 weeks
2013-12-27 02:43:53 +00:00
marck
e207b96998 Add GPT UUID for VMware vSAN meta-data partition.
Approved by:	ae
MFC after:	2 weeks
2013-12-26 21:06:12 +00:00
ae
86d162fb60 Prevent users from deactivating the last component of a mirror.
PR:		184985
MFC after:	1 week
2013-12-19 22:13:12 +00:00
pjd
8f9b4c6a1e Clear some more places with potentially sensitive data.
MFC after:	1 week
2013-12-15 22:52:18 +00:00
pjd
170007786b Clear content of keyfiles loaded by the loader after processing them.
Pointed out by:	rwatson
MFC after:	1 week
2013-12-15 22:51:26 +00:00
mav
4dd7fdae6a Fix bug introduced at r256607. We have to recalculate bp_resid here since
sizes of original and completed requests may differ due to end of media.

Bisected by:	pho
2013-12-12 08:23:28 +00:00
jhibbits
d6564a729a Partially revert r259080. bde@ pointed out that there are a lot more style bugs
going on in here than can be fixed, and I introduced some of my own.  Rather
than fix the whole host of them, back out my bugs.

Found by:	bde
X-MFC with:	r259080
2013-12-08 09:34:56 +00:00
jhibbits
76f5b95a89 Fix some integer signs. These unsigned integers should all be signed.
Found by:	clang (powerpc64)
2013-12-07 19:55:34 +00:00
eadler
44c01df173 Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this
shifts into the sign bit.  Instead use (1U << 31) which gets the
expected result.

This fix is not ideal as it assumes a 32 bit int, but does fix the issue
for most cases.

A similar change was made in OpenBSD.

Discussed with:	-arch, rdivacky
Reviewed by:	cperciva
2013-11-30 22:17:27 +00:00
mav
cf37ee63fb Escape special XML chars, returned by some devices, confusing XML parsers.
MFC after:	1 month
2013-11-27 14:25:06 +00:00
marcel
3ce6c52230 Have the GPT probe return a lower priority when the MBR is not a PMBR
The purpose of the PMBR is to have the disk appear in use to GPT
unaware utilities (like fdisk).  However, if the PMBR has been changed
by a GPT unaware utlity then we must assume that this was deliberate
(as it involved removal of the special slice) and we should not treat
the unmodified GPT-specific sectors as being valid.  By lowering the
probe priority in that case, the MBR scheme will take precedence and
the kernel will end up using the MBR and not the GPT. We will still
use the GPT if the kernel does not support the MBR scheme.
2013-11-21 22:02:59 +00:00
ae
0b2e834032 Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4).
Now it is easy to expand the size of the mirror when all its components
are replaced. Also add g_resize method to geom_mirror class. It will write
updated metadata to new last sector, when parent provider is resized.

Silence from:	geom@
MFC after:	1 month
2013-11-19 22:55:17 +00:00
mav
c684e93584 In addition to r258220 allow shrinking in "automatic" mode if there is
already valid metadata found at the new location.  This should allow easy
transparent recovery if first resize was done by mistake.

While there, unify metadata write code and fix minor memory leak.

MFC after:	1 month
2013-11-17 05:38:54 +00:00
mav
fb7bca5b5a Implement automatic live resize support for GEOM MULTIPATH class.
In "manual" mode just automatically resize provider in any direction.
In "automatic" mode allow only growth (with new metadata write); in case
of shrinking destroy the multipath device same as before since it may be
undesirable to write new metadata within old user area.

MFC after:	1 month
2013-11-16 14:31:49 +00:00
ae
69b3043c7e Add missing line breaks.
PR:		181900
MFC after:	1 week
2013-11-11 11:13:12 +00:00
delphij
f04c07285f When zero'ing out a buffer, make sure we are using right size.
Without this change, in the worst but unlikely case scenario, certain
administrative operations, including change of configuration, set or
delete key from a GEOM ELI provider, may leave potentially sensitive
information in buffer allocated from kernel memory.

We believe that it is not possible to actively exploit these issues, nor
does it impact the security of normal usage of GEOM ELI providers when
these operations are not performed after system boot.

Security:	possible sensitive information disclosure
Submitted by:	Clement Lecigne <clecigne google com>
MFC after:	3 days
2013-11-02 01:16:10 +00:00
jhb
b132568f90 Reject attempts to attack a disk device that has the old NEEDSGIANT
flag set.

Reviewed by:	mav
2013-10-25 19:19:12 +00:00
smh
5c7a6f5d92 Improve ZFS N-way mirror read performance by using load and locality
information.

The existing algorithm selects a preferred leaf vdev based on offset of the zio
request modulo the number of members in the mirror. It assumes the devices are
of equal performance and that spreading the requests randomly over both drives
will be sufficient to saturate them. In practice this results in the leaf vdevs
being under utilized.

The new algorithm takes into the following additional factors:
* Load of the vdevs (number outstanding I/O requests)
* The locality of last queued I/O vs the new I/O request.

Within the locality calculation additional knowledge about the underlying vdev
is considered such as; is the device backing the vdev a rotating media device.

This results in performance increases across the board as well as significant
increases for predominantly streaming loads and for configurations which don't
have evenly performing devices.

The following are results from a setup with 3 Way Mirror with 2 x HD's and
1 x SSD from a basic test running multiple parrallel dd's.

With pre-fetch disabled (vfs.zfs.prefetch_disable=1):

== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 161 seconds @ 95 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 297 seconds @ 51 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 54 seconds @ 284 MB/s

With pre-fetch enabled (vfs.zfs.prefetch_disable=0):

== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 91 seconds @ 168 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 108 seconds @ 142 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 48 seconds @ 320 MB/s

In addition to the performance changes the code was also restructured, with
the help of Justin Gibbs, to provide a more logical flow which also ensures
vdevs loads are only calculated from the set of valid candidates.

The following additional sysctls where added to allow the administrator
to tune the behaviour of the load algorithm:
* vfs.zfs.vdev.mirror.rotating_inc
* vfs.zfs.vdev.mirror.rotating_seek_inc
* vfs.zfs.vdev.mirror.rotating_seek_offset
* vfs.zfs.vdev.mirror.non_rotating_inc
* vfs.zfs.vdev.mirror.non_rotating_seek_inc

These changes where based on work started by the zfsonlinux developers:
https://github.com/zfsonlinux/zfs/pull/1487

Reviewed by:	gibbs, mav, will
MFC after:	2 weeks
Sponsored by:	Multiplay
2013-10-23 09:54:58 +00:00
mjg
c2c7122a9a gnop: make sure that newly allocated memory for softc is zeroed
This prevents mtx_init from encountering non-zeros and panicking
the kernel as a result.

Reported by:	Keith White <kwhite site.uottawa.ca>
2013-10-23 01:34:18 +00:00
mav
81c5dcd662 Remove Giant-locked drivers support (DISKFLAG_NEEDSGIANT flag) from disk(9).
Since at least FreeBSD 7 we had only four of them in the base tree, and
in head branch, thanks to jhb@, we have no any for more then a year.
2013-10-22 10:21:20 +00:00
mav
4219fc0074 Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
 - caller should not hold any locks and should be reenterable;
 - callee should not depend on GEOM dual-threaded concurency semantics;
 - on the way down, if request is unmapped while callee doesn't support it,
   the context should be sleepable;
 - kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
 - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
 - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
 - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
 - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe.  If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.  da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by:	iXsystems, Inc.
MFC after:	2 months
2013-10-22 08:22:19 +00:00
trasz
26d2e3d5ff Fix build with gcc by spelling unused format string as "unused" instead of NULL.
MFC after:	29 days
2013-10-19 08:20:00 +00:00
trasz
fb4e4700a1 Make geom_label(4) resize-aware. This fixes a situation when "gpart resize"
would resize a partition, but label providers - e.g. /dev/gptid/XXX - would
stay the same size.

Reviewed by:	mav
MFC after:	1 month
Sponsored by:	FreeBSD Foundation
2013-10-18 09:14:19 +00:00
ae
05ca533af6 Add an automatic resize support to the GEOM_PART class.
When parent provider has been resized, the scheme specific G_PART_RESIZE
method does an update of scheme's metadata. But all changes are not saved
to disk, until `gpart commit` will be called.

Discussed with:	trasz
MFC after:	1 month
2013-10-17 16:18:43 +00:00
mav
bf87deb4e7 MFprojects/camlock r256445:
Add unmapped I/O support to GEOM RAID.
2013-10-16 09:33:23 +00:00
mav
9059e21206 MFprojects/camlock r256371:
Fix passing uninitialized bio_resid argument to g_trace().
2013-10-16 09:21:40 +00:00
mav
9a5bcf65b3 MFprojects/camlock r254907:
Move g_io_deliver() out of the lock, as required for direct dispatch.
Move g_destroy_bio() out too to reduce lock scope even more.
2013-10-16 09:18:01 +00:00
mav
2080607889 MFprojects/camlock r254905:
Introduce new function devstat_end_transaction_bio_bt(), adding new argument
to specify present time.  Use this function to move binuptime() out of lock,
substantially reducing lock congestion when slow timecounter is used.
2013-10-16 09:12:40 +00:00
des
140807754c Introduce a kern.geom.notaste sysctl that can be used to temporarily
disable GEOM tasting to avoid the "bouncing GEOM" problem where, when
you shut down the consumer of a provider which can be viewed in multiple
ways (typically a mirror whose members are labeled partitions), GEOM
will immediately taste that provider's alter ego and reattach the
consumer.

Approved by:	re (glebius)
2013-09-24 20:05:16 +00:00
ae
7f30f5be1c Remove stub implementation.
MFC after:	1 week
2013-09-05 09:44:09 +00:00
mav
8324fc3480 Make ELI destruction (including orphanization) less aggressive, making it
always wait for provider close.  Old algorithm was reported to cause NULL
dereference panic on attempt to close provider after softc destruction.
If not global workaroung in GEOM, that could even cause destruction with
requests still in flight.
2013-09-02 10:44:54 +00:00
mav
3380a03b00 MFprojects/camlock r254895:
Add unmapped BIO support to GEOM ZERO if kern.geom.zero.clear is cleared.
2013-08-26 20:39:02 +00:00
mav
e8031ce26c Add new attribute lunname to report only textual LUN-specific device IDs.
While lunid attribute prefers to report numeric ones, having both may be
useful in some situations.
2013-08-24 09:42:14 +00:00
ken
5591de079d Change the way that unmapped I/O capability is advertised.
The previous method was to set the D_UNMAPPED_IO flag in the cdevsw
for the driver.  The problem with this is that in many cases (e.g.
sa(4)) there may be some instances of the driver that can handle
unmapped I/O and some that can't.  The isp(4) driver can handle
unmapped I/O, but the esp(4) driver currently cannot.  The cdevsw
is shared among all driver instances.

So instead of setting a flag on the cdevsw, set a flag on the cdev.
This allows drivers to indicate support for unmapped I/O on a
per-instance basis.

sys/conf.h:	Remove the D_UNMAPPED_IO cdevsw flag and replace it
		with an SI_UNMAPPED cdev flag.

kern_physio.c:	Look at the cdev SI_UNMAPPED flag to determine
		whether or not a particular driver can handle
		unmapped I/O.

geom_dev.c:	Set the SI_UNMAPPED flag for all GEOM cdevs.
		Since GEOM will create a temporary mapping when
		needed, setting SI_UNMAPPED unconditionally will
		work.

		Remove the D_UNMAPPED_IO flag.

nvme_ns.c:	Set the SI_UNMAPPED flag on cdevs created here
		if NVME_UNMAPPED_BIO_SUPPORT is enabled.

vfs_aio.c:	In aio_qphysio(), check the SI_UNMAPPED flag on a
		cdev instead of the D_UNMAPPED_IO flag on the cdevsw.

sys/param.h:	Bump __FreeBSD_version to 1000045 for the switch from
		setting the D_UNMAPPED_IO flag in the cdevsw to setting
		SI_UNMAPPED in the cdev.

Reviewed by:	kib, jimharris
MFC after:	1 week
Sponsored by:	Spectra Logic
2013-08-15 22:52:39 +00:00
mav
eba4a485b2 Return error when opening read-only volumes (like RAID4/5/...) for writing.
Previously opens succeeded, but actual write operations returned errors.

Requested by:	peter
MFC after:	2 weeks
2013-08-13 07:56:40 +00:00
mav
d9e76bbffc Oops, wrong constant at r254269. 2013-08-13 06:25:34 +00:00
mav
1ddae2c9b4 Fix reasonable but safe Clang warnings. 2013-08-13 06:21:36 +00:00
ed
e591d48c3e Fix the formatting of the error message.
The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the
log messages emitted by gmirror start with an uppercase letter.
2013-08-12 18:17:45 +00:00
ae
4c7750d3a8 gpt_entries is used as limit for the number of partition entries in
the GEOM_PART. Instead of just using number of entries from the GPT
header, calculate this limit based on the reserved space between
GPT header and first available LBA.

MFC after:	2 weeks
2013-08-08 16:09:20 +00:00