Commit Graph

1981 Commits

Author SHA1 Message Date
Warner Losh
3f2e5b8584 After the introduction of direct dispatch, the pacing code in g_down()
broke in two ways. One, the pacing variable was accessed in multiple
threads in an unsafe way. Two, since large numbers of I/O could come
down from the buf layer at one time, large numbers of allocation
failures could happen all at once, resulting in a huge pace value that
would limit I/Os to 10 IOPS for minutes (or even hours) at a
time. While a real solution to these problems requires substantial
work (to go to a no-allocation after the first model, or to have some
way to wait for more memory with some kind of reserve for pager and
swapper requests), it is relatively easy to make this simplistic
pacing less pathological.

Move to using a volatile variable with loads and stores. While this is
a little racy, losing the race is safe: either you get memory and
proceed, or you don't and queue. Second, sleep for 1ms (or one tick, whichever
is larger) instead of 100ms. This removes the artificial 10 IOPS limit
while still easing up on new I/Os during memory shortages. Remove
tying the amount of time we do this to the number of failed requests
and do it only as long as we keep failing requests.

Finally, to avoid needless recursion when memory is tight (start ->
g_io_deliver() -> g_io_request() -> start -> ... until we use 1/2 the
stack), don't do direct dispatch while pacing. This should be a rare
event (not steady state) so the performance hit here is worth the
extra safety of not starving g_down() with directly dispatched I/O.

Differential Review: https://reviews.freebsd.org/D3546
2015-09-02 17:29:30 +00:00
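
The pacing scheme described in the commit above can be sketched in user-space C as follows. This is only an illustration under stated assumptions: `pace_flag`, `note_alloc_failure()`, `pace_delay_ms()` and `may_direct_dispatch()` are hypothetical names, not the identifiers used in g_down(), and the sleep is modelled as a returned delay rather than an actual kernel sleep.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pacing variable; plain loads and stores only. */
static volatile int pace_flag;

/* Called whenever an allocation for an I/O request fails. */
static void
note_alloc_failure(void)
{
	pace_flag = 1;
}

/*
 * Returns how long the worker should sleep before taking the next request:
 * 0 when not pacing, otherwise 1ms or one tick, whichever is larger.
 * The flag is cleared after each sleep, so pacing lasts only as long as
 * requests keep failing. Losing a race on the flag is harmless: the
 * request either gets memory and proceeds, or fails and is queued.
 */
static int
pace_delay_ms(int ms_per_tick)
{
	if (pace_flag == 0)
		return (0);
	pace_flag = 0;
	return (ms_per_tick > 1 ? ms_per_tick : 1);
}

/* Direct dispatch is skipped while pacing to avoid deep recursion. */
static bool
may_direct_dispatch(void)
{
	return (pace_flag == 0);
}
```

The delay no longer depends on how many requests failed, which is what removes the long stalls described in the message.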
Justin Hibbits
6aabc119b6 Create a RouterBoard platform and use it to create a flash map
Summary:
The RouterBoard uses a predefined partition map which doesn't exist in the fdt.
This change allows overriding the fdt slicer with a custom slicer, and uses this
custom slicer to define the flash map on the RouterBoard RB800.
D3305 converts the mpc85xx platform into a base class, so that systems based on
the mpc85xx platform can add their own overrides.  This change builds on D3305,
and creates a RouterBoard (RB800) platform to initialize the slicer override.

Reviewed By: nwhitehorn, imp
Differential Revision: https://reviews.freebsd.org/D3345
2015-08-22 05:50:18 +00:00
Pedro F. Giffuni
6bc3fe5f4e Clean out some externally visible "more then" grammar
MFC after:	3 days
2015-08-11 03:12:09 +00:00
Enji Cooper
604083d74c Make some debug printf's into DPRINTF's to reduce noise on attach/detach
Similar reasoning to what was done in r286367 with geom_uzip(4)

MFC after: 2 weeks
Differential Revision: D3320
Sponsored by: EMC / Isilon Storage Division
2015-08-09 06:58:06 +00:00
Pawel Jakub Dawidek
46e3447026 Enable BIO_DELETE passthru in GELI, so TRIM/UNMAP can work as expected when
GELI is used on a SSD or inside virtual machine, so that guest can tell
host that it is no longer using some of the storage.

Enabling BIO_DELETE passthru comes with a small security consequence - an
attacker can tell how much space is being really used on encrypted device and
has less data to analyse then. This is why the -T option can be given to the
init subcommand to turn off this behaviour and -t/T options for the configure
subcommand can be used to adjust this setting later.

PR:		198863
Submitted by:	Matthew D. Fuller fullermd at over-yonder dot net

This commit also includes a fix from Fabian Keil freebsd-listen at
fabiankeil.de for 'configure' on onetime providers which is not strictly
related, but is entangled in the same code, so would cause conflicts if
separated out.
2015-08-08 09:51:38 +00:00
Konstantin Belousov
347e9d5495 Minor style cleanup of the code surrounding r286404.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-08-07 08:24:12 +00:00
Konstantin Belousov
9b34965019 The condition to use direct processing for the unmapped bio is
reverted.  We can do direct processing when g_io_check() does not need
to perform transient remapping of the bio, otherwise the thread has to
sleep.

Reviewed by:	mav (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2015-08-07 08:13:34 +00:00
Pawel Jakub Dawidek
5ee9ea19fe After crypto_dispatch() bio might be already delivered and destroyed,
so we cannot access it anymore. Setting an error later led to memory
corruption.

Assert that crypto_dispatch() was successful. It can fail only if we pass a
bogus crypto request, which is a bug in the program, not a runtime condition.

PR:		199705
Submitted by:	luke.tw
Reviewed by:	emaste
MFC after:	3 days
2015-08-06 17:13:34 +00:00
Enji Cooper
fcc8461cfb Make some debug printf's into DPRINTF's to reduce noise on attach/detach
Differential Revision: https://reviews.freebsd.org/D3306
MFC after: 1 week
Reviewed by: loos
Sponsored by: EMC / Isilon Storage Division
2015-08-06 15:30:14 +00:00
Edward Tomasz Napierala
72800098bf Fix panic triggered by code like this:
open("/dev/md0", O_EXEC);

Discussed with:	kib@, mav@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3051
2015-08-04 10:40:08 +00:00
Edward Tomasz Napierala
d6cc35b287 Fix panic that would happen on forcibly unmounting devfs (note that
as it is now, devfs ignores MNT_FORCE anyway, so it needs to be modified
to trigger the panic) with consumers still opened.

Note that this still results in a leak of r/w/e counters.  It seems
to be harmless, though.  If anyone knows a better way to approach
this - please tell.

Discussed with:	kib@, mav@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D3050
2015-08-03 16:35:18 +00:00
Andrey V. Elsukov
da6c24e123 Report the scheme and provider names in warning message about unaligned
partition.

PR:		201873
MFC after:	1 week
2015-07-26 11:16:48 +00:00
Allan Jude
ce808c7ad8 Add a new option to gpart(8) to fix Lenovo BIOS boot issue
PR:		184910
Reviewed by:	ae, wblock
Approved by:	marcel
MFC after:	3 days
Relnotes:	yes
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D3065
2015-07-15 02:23:55 +00:00
Pawel Jakub Dawidek
4273d41299 Spoil event can happen for some time now even on providers opened exclusively
(on the media change event). Update GELI to handle that situation.

PR:		201185
Submitted by:	Matthew D. Fuller
2015-07-10 19:27:19 +00:00
Pawel Jakub Dawidek
fefb6a143a Properly propagate errors in metadata reading.
PR:		198860
Submitted by:	Matthew D. Fuller
2015-07-02 10:57:34 +00:00
Pawel Jakub Dawidek
edaa9008ff Allow to omit keyfile number for the first keyfile.
2015-07-02 10:55:32 +00:00
Edward Tomasz Napierala
628b712826 Fix off-by-one error in fstyp(8) and geom_label(4) that made them use
a single space (" ") as a CD9660 label name when no label was present.
Similar problem was also present in msdosfs label recognition.

PR:		200828
Differential Revision:	https://reviews.freebsd.org/D2830
Reviewed by:	asomers@, emaste@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2015-06-18 21:55:55 +00:00
Andrey V. Elsukov
e7d0c7e458 Teach G_PART_GPT class to handle g_resize_provider event.
MFC after:	10 days
2015-06-08 12:52:41 +00:00
Jung-uk Kim
fd90e2ed54 CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten
years for head.  However, it is continuously misused as the mpsafe argument
for callout_init(9).  Deprecate the flag and clean up callout_init() calls
to make them more consistent.

Differential Revision:	https://reviews.freebsd.org/D2613
Reviewed by:	jhb
MFC after:	2 weeks
2015-05-22 17:05:21 +00:00
Andrey V. Elsukov
153c57b5b4 Read GEOM_UNCOMPRESS metadata using several requests that fit into
MAXPHYS. For large compressed images the metadata size can be bigger
than MAXPHYS and this triggers KASSERT in g_read_data().
Also use g_free() to free memory allocated by g_read_data().

PR:		199476
MFC after:	2 weeks
2015-05-19 09:28:52 +00:00
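
The idea of splitting one large metadata read into several requests that each fit a size limit can be sketched like this. It is a user-space illustration only: `read_in_chunks()` is a hypothetical helper, `memcpy()` stands in for the actual read request, and `max_chunk` plays the role that MAXPHYS plays in the commit above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy `total` bytes from src to dst using requests of at most
 * `max_chunk` bytes each. Returns the number of requests issued,
 * or 0 if max_chunk is 0 (no progress would be possible).
 */
static size_t
read_in_chunks(const unsigned char *src, unsigned char *dst,
    size_t total, size_t max_chunk)
{
	size_t done = 0, requests = 0;

	if (max_chunk == 0)
		return (0);
	while (done < total) {
		size_t n = total - done;

		if (n > max_chunk)
			n = max_chunk;		/* never exceed the per-request limit */
		memcpy(dst + done, src + done, n);	/* stand-in for one read */
		done += n;
		requests++;
	}
	return (requests);
}
```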
Andrey V. Elsukov
4b8d4f97b0 Add apple-boot, apple-hfs and apple-ufs aliases to MBR scheme.
Sort DOSPTYP_* entries in diskmbr.h by value.
Document these scheme-specific types in gpart(8).

MFC after:	1 week
2015-05-05 09:33:02 +00:00
Craig Rodrigues
d9db52256e Move zlib.c from net to libkern.
It is not network-specific code and would
be better as part of libkern instead.
Move zlib.h and zutil.h from net/ to sys/
Update includes to use sys/zlib.h and sys/zutil.h instead of net/

Submitted by:		Steve Kiernan stevek@juniper.net
Obtained from:		Juniper Networks, Inc.
GitHub Pull Request:	https://github.com/freebsd/freebsd/pull/28
Relnotes:		yes
2015-04-22 14:38:58 +00:00
Pedro F. Giffuni
4a5e6b854d g_uncompress_taste: prevent a double free.
Found by:	Clang Static Analyzer
MFC after:	1 week
2015-04-20 16:31:27 +00:00
Alexander Motin
0ada3afc25 Remove sleeps from geom_up thread on device destruction.
MFC after:	3 days.
2015-04-09 13:09:05 +00:00
Alexander Motin
5d85cd2d11 Remove extra semicolon.
MFC after:	1 week
2015-03-27 12:45:20 +00:00
Alexander Motin
3ab0187add Remove request sorting from GEOM_MIRROR and GEOM_RAID.
When CPU is not busy, those queues are typically empty.  When CPU is busy,
then one more extra sorting is the last thing it needs.  If specific device
(HDD) really needs sorting, then it will be done later by CAM.

This is supposed to fix a livelock reported for a mirror of two SSDs, where UFS
fires a zillion BIO_DELETE requests that totally block the I/O subsystem with
pointless sorting of requests and responses under a single mutex lock.

MFC after:	2 weeks
2015-03-27 12:44:28 +00:00
Alexander Motin
41fe4ba647 Fix bug on memory allocation error in split method.
While there, use bioq_takefirst() in place where it is convenient.

MFC after:	1 week
2015-03-27 11:14:12 +00:00
Alexander Motin
5523c82c1a Make GEOM_PART work in presence of previous withered self.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-03-26 12:17:47 +00:00
Alexander Motin
2f36085dcf Report withered providers as such alike to GEOMs.
MFC after:	2 weeks
2015-03-26 11:19:24 +00:00
Alexander Motin
ba772028db When searching for provider by name, prefer non-withered one.
MFC after:	2 weeks
2015-03-26 11:02:29 +00:00
Adrian Chadd
28d507fcec Fix the label search routine in geom_map to not trip up on '\0' bytes.
* Just do the buf check early and fail out
* If the offset being searched is:

00110000  00 b5 7e 45 61 e2 76 d3  c1 78 dd 15 95 cd 1f f1  |..~Ea.v..x......|

.. and the match string is '.!/bin/sh'

.. then it'll set the match string[0] to '\0', do a strncmp() against
the read buffer, find it's matching two zero-length strings, and think
that's where to start.

MFC after:	2 weeks
2015-03-19 03:58:25 +00:00
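
The failure mode described in the commit above comes from strncmp() stopping at the first NUL byte. A small sketch, using hypothetical helper names, shows the false match; memcmp() is included only for contrast and is not what the commit itself does (it checks the buffer early and fails out).

```c
#include <stddef.h>
#include <string.h>

/* String comparison: stops as soon as both sides hit a '\0' byte. */
static int
matches_with_strncmp(const unsigned char *buf, const char *key, size_t len)
{
	return (strncmp((const char *)buf, key, len) == 0);
}

/* Byte comparison: always compares all len bytes, NULs included. */
static int
matches_with_memcmp(const unsigned char *buf, const char *key, size_t len)
{
	return (memcmp(buf, key, len) == 0);
}
```

With a key whose first byte is '\0' and a buffer that also starts with 0x00, the strncmp() variant reports a match because it compares two zero-length strings, while the byte-wise variant does not.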
Andrey V. Elsukov
4fb4ebe0a4 Add GUID and alias for Apple Core Storage partition.
PR:		196241
MFC after:	1 week
2015-03-12 18:51:31 +00:00
Alexander Motin
7715befdf2 Fix couple BIO_DELETE bugs in geom_mirror.
Do not report GEOM::candelete if none of providers support BIO_DELETE.
If consumer still requests BIO_DELETE, report error instead of hanging.

MFC after:	2 weeks
2015-03-12 10:20:53 +00:00
Alexander Motin
0b1b7c2cec Replace constant with proper sizeof().
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2015-02-25 10:18:11 +00:00
Edward Tomasz Napierala
01de1a0650 Add devd(8) notifications for creation and destruction of GEOM devices.
Differential Revision:	https://reviews.freebsd.org/D1211
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2015-01-14 11:15:57 +00:00
Warner Losh
a91275f72f Remove old ioctl use and support, once and for all.
2015-01-06 05:28:37 +00:00
Warner Losh
0acf08d985 Remove support for FreeBSD 7 and really old FreeBSD 8. The classifiers
have been in the base for a while, so the gymnastics here aren't
needed. In addition, the bugs in subr_disk.c have been fixed since
2009, so there's no need for an identical copy of it in the tree
anymore. There's really no need to binary patch g_io_request, so let's
get rid of the code (not compiled in anymore) lest others think it is
a good idea.
2014-12-20 00:04:01 +00:00
John-Mark Gurney
08fca7a56b Add some new modes to OpenCrypto. These modes are AES-ICM (can be used
for counter mode), and AES-GCM.  Both of these modes have been added to
the aesni module.

Included is a set of tests to validate that the software and aesni
module calculate the correct values.  These use the NIST KAT test
vectors.  To run the test, you will need to install a soon to be
committed port, nist-kat that will install the vectors.  Using a port
is necessary as the test vectors are around 25MB.

All the man pages were updated.  I have added a new man page, crypto.7,
which includes a description of how to use each mode.  All the new modes
and some other AES modes are present.  It would be good for someone
else to go through and document the other modes.

A new ioctl was added to support AEAD modes, of which AES-GCM is one.
Without this ioctl, it is not possible to test AEAD modes from userland.

Add a timing safe bcmp for use to compare MACs.  Previously we were using
bcmp which could leak timing info and result in the ability to forge
messages.

Add a minor optimization to the aesni module so that single segment
mbufs don't get copied and instead are updated in place.  The aesni
module needs to be updated to support blocked IO so segmented mbufs
don't have to be copied.

We require that the IV be specified for all calls for both GCM and ICM.
This is to ensure proper use of these functions.

Obtained from:	p4: //depot/projects/opencrypto
Relnotes:	yes
Sponsored by:	FreeBSD Foundation
Sponsored by:	NetGate
2014-12-12 19:56:36 +00:00
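
A timing-safe comparison of the kind mentioned in the commit above can be sketched as follows. `ct_bcmp_sketch()` is a hypothetical name and this is not the code that was committed; it only illustrates the general technique of accumulating differences instead of returning at the first mismatch.

```c
#include <stddef.h>

/*
 * Compare n bytes without an early exit. Every byte is always visited,
 * so the running time does not depend on where the first difference is.
 * Returns 0 if the buffers are equal, 1 otherwise.
 */
static int
ct_bcmp_sketch(const void *a, const void *b, size_t n)
{
	const unsigned char *p = a, *q = b;
	unsigned char diff = 0;
	size_t i;

	for (i = 0; i < n; i++)
		diff |= p[i] ^ q[i];	/* collect any differing bits */
	return (diff != 0);
}
```

An ordinary bcmp() may return as soon as it sees a mismatch, which lets an attacker learn how many leading bytes of a forged MAC were correct.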
Alexander Motin
1e68fe9c33 Avoid unneeded malloc/memcpy/free if there is no metadata on disk.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2014-12-05 10:23:18 +00:00
Alexander Motin
26f0f92fa2 Decode some binary fields of Intel metadata.
Submitted by:	Dmitry Luhtionov <dmitryluhtionov@gmail.com>
MFC after:	2 weeks
2014-12-04 15:54:45 +00:00
Warner Losh
66cc25a224 Actually, that was a bad idea. Go back to MAXPARTITIONS.
Submitted by: bruce
2014-11-20 17:31:25 +00:00
Warner Losh
dd87e2c610 The number of BSD partitions is variable. Return the proper number
(which is in basetable->gpt_entries).

Submitted by: ae@
2014-11-19 18:55:27 +00:00
Warner Losh
73f49e9eef Implement the historic DIOCGDINFO ioctl for gpart on BSD
partitions. Several utilities still use this interface and, since gpart was
activated, require more information than before. This
allows fsck of a UFS partition without having to specify it is UFS,
per historic behavior.
2014-11-18 17:06:40 +00:00
Pawel Jakub Dawidek
5ebb15b942 Add missing privilege check when setting the dump device. Before that change it
was possible for a regular user to set up the dump device if he had write access
to the given device. In theory it is a security issue as the user might get access
to the kernel's memory after provoking a kernel crash, but in practice it is not
recommended to give regular users direct access to storage devices.

Rework the code so that we do the privilege check within the set_dumper() function
to avoid similar problems in the future.

Discussed with:	secteam
2014-11-11 04:48:09 +00:00
Dag-Erling Smørgrav
133cdd9e13 Constify the AES code and propagate to consumers. This allows us to
update the Fortuna code to use SHAd-256 as defined in FS&K.

Approved by:	so (self)
2014-11-10 09:44:38 +00:00
Poul-Henning Kamp
cd15a01091 Translate the errno to gctl_error() texts.
Spotted by:	mwlucas
2014-11-09 15:52:11 +00:00
Alexander Motin
c3e7ba3e6d Add to CTL support for logical block provisioning threshold notifications.
For ZVOL-backed LUNs this allows to inform initiators if storage's used or
available spaces get above/below the configured thresholds.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-11-06 00:48:36 +00:00
Alexander Motin
ccf8a5688a Revert somewhat hackish geom_disk optimization, committed as part of r256880,
and the following r273143 commit, supposed to workaround introduced issue by
quite innocent-looking change.

While there is no clear understanding of why, r273143 is blamed for data
corruption in some environments with high I/O load.  I personally don't see
any problem in that commit, and possibly it is just a trigger for some other
bug somewhere, but better safe than sorry for now.

Requested by:	scottl@
MFC after:	3 days
2014-10-25 15:16:19 +00:00
Colin Percival
66427784c1 Populate the GELI passphrase cache with the kern.geom.eli.passphrase
variable (if any) provided in the boot environment.  Unset it from
the kernel environment after doing this, so that the passphrase is
no longer present in kernel memory once we enter userland.

This will make it possible to provide a GELI passphrase via the boot
loader; FreeBSD's loader does not yet do this, but GRUB (and PCBSD)
will have support for this soon.

Tested by:	kmoore
2014-10-22 23:41:15 +00:00
Hans Petter Selasky
f0188618f2 Fix multiple incorrect SYSCTL arguments in the kernel:
- Wrong integer type was specified.

- Wrong or missing "access" specifier. The "access" specifier
sometimes included the SYSCTL type, which it should not, except for
procedural SYSCTL nodes.

- Logical OR where binary OR was expected.

- Properly assert the "access" argument passed to all SYSCTL macros,
using the CTASSERT macro. This applies to both static- and dynamically
created SYSCTLs.

- Properly assert the data type for both static and dynamic
SYSCTLs. In the case of static SYSCTLs we only assert that the data
pointed to by the SYSCTL data pointer has the correct size, since
there is no easy way to assert types in the C language outside a
C-function.

- Rewrote some code which doesn't pass a constant "access" specifier
when creating dynamic SYSCTL nodes, which is now a requirement.

- Updated "EXAMPLES" section in SYSCTL manual page.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2014-10-21 07:31:21 +00:00
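
The "logical OR where binary OR was expected" item in the commit above is easy to reproduce in isolation. The flag names and values below are hypothetical and chosen only for illustration; they are not the real SYSCTL access flags. The `_Static_assert` line is a C11 analogue of a compile-time check, used here in place of the kernel's CTASSERT macro.

```c
/* Hypothetical flag bits, for illustration only. */
#define DEMO_FLAG_RD	0x2u
#define DEMO_FLAG_WR	0x4u

/* Compile-time check that the combined mask has the intended value. */
_Static_assert((DEMO_FLAG_RD | DEMO_FLAG_WR) == 0x6u,
    "combined flags must contain both bits");

/* Bug: logical OR yields 1 (true), not a combined bit mask. */
static unsigned
flags_logical_or(void)
{
	return (DEMO_FLAG_RD || DEMO_FLAG_WR);
}

/* Intended: bitwise OR merges both flag bits. */
static unsigned
flags_bitwise_or(void)
{
	return (DEMO_FLAG_RD | DEMO_FLAG_WR);
}
```

Both forms compile, which is why a compile-time assertion on the resulting value is a useful guard.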