freebsd-dev

Author	SHA1	Message	Date
Mariusz Zaborski	4d7486c30f	gnop: style nits	2019-07-31 17:51:06 +00:00
Mariusz Zaborski	4f80c85519	gnop: Introduce requests delay. This allows to simulated disk that is responding slowly to the IO requests. Reviewed by: markj, bcr, pjd (previous version) Differential Revision: https://reviews.freebsd.org/D21052	2019-07-31 17:47:12 +00:00
Ryan Libby	9167705c8c	g_mirror_taste: avoid deadlock, always clear tasting flag If g_mirror_taste encountered an error at g_mirror_add_disk, it might try to g_mirror_destroy the device with the G_MIRROR_DEVICE_FLAG_TASTING flag still set. This would wait on a worker to complete the destruction with g_mirror_try_destroy, but that function bails out if the tasting flag is set, resulting in a deadlock. Clear the tasting flag before trying to destroy the device. Test Plan: sysctl debug.fail_point.mnowait="1%return" kyua test -k /usr/tests/sys/geom/class/mirror/Kyuafile Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20744	2019-07-01 22:06:36 +00:00
Ryan Libby	3bb6e0f0c7	g_eli_create: only dec g_access acw if we inc'd it Reviewed by: cem, markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20743	2019-07-01 22:06:16 +00:00
Warner Losh	f5a95d9a07	Remove NAND and NANDFS support NANDFS has been broken for years. Remove it. The NAND drivers that remain are for ancient parts that are no longer relevant. They are polled, have terrible performance and just for ancient arm hardware. NAND parts have evolved significantly from this early work and little to none of it would be relevant should someone need to update to support raw nand. This code has been off by default for years and has violated the vnode protocol leading to panics since it was committed. Numerous posts to arch@ and other locations have found no actual users for this software. Relnotes: Yes No Objection From: arch@ Differential Revision: https://reviews.freebsd.org/D20745	2019-06-25 04:50:09 +00:00
Alexander Motin	49ee0fcea5	Use sbuf_cat() in GEOM confxml generation. When it comes to megabytes of text, difference between sbuf_printf() and sbuf_cat() becomes substantial. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-06-19 15:36:02 +00:00
Alexander Motin	5c32e9fcb2	Optimize kern.geom.conf* sysctls. On large systems those sysctls may generate megabytes of output. Before this change sbuf(9) code was resizing buffer by 4KB each time many times, generating tons of TLB shootdowns. Unfortunately in this case existing sbuf_new_for_sysctl() mechanism, supposed to help with this issue, is not applicable, since all the sbuf writes are done in different kernel thread. This change improves situation in two ways: - on first sysctl call, not providing any output buffer, it sets special sbuf drain function, just counting the data and so not needing big buffer; - on second sysctl call it uses as initial buffer size value saved on previous call, so that in most cases there will be no reallocation, unless GEOM topology changed significantly. MFC after: 1 week Sponsored by: iXsystems, Inc.	2019-06-18 21:05:10 +00:00
Xin LI	f89d207279	Separate kernel crc32() implementation to its own header (gsb_crc32.h) and rename the source to gsb_crc32.c. This is a prerequisite of unifying kernel zlib instances. PR: 229763 Submitted by: Yoshihiro Ota <ota at j.email.ne.jp> Differential Revision: https://reviews.freebsd.org/D20193	2019-06-17 19:49:08 +00:00
Mariusz Zaborski	a802439365	geli: style nits	2019-06-12 19:29:48 +00:00
Mariusz Zaborski	e7630efbe6	geli: partially revert r348709 Let's change the unsigned arguments to the signed one, but let's don't change pointers to the array notation. Requested by: pjd	2019-06-12 19:29:12 +00:00
Mariusz Zaborski	1808673cc4	geli: build warning fixes Submitted by: Aaron Prieger <aprieger@llnw.com> Reviewed by: sbruno Differential Revision: https://reviews.freebsd.org/D11068	2019-06-05 22:46:18 +00:00
Kirk McKusick	10519e1398	When using the destroy option to shut down a nop GEOM module, I/O operations already in its queue were not being properly drained. The GEOM framework does the queue draining, but the module needs to wait for the draining to happen. The waiting is done by adding a g_nop_providergone() function to wait for the I/O operations to finish up. This change is similar to change -r345758 made to the memory-disk driver. Submitted by: Chuck Silvers Tested by: Chuck Silvers MFC after: 1 week Sponsored by: Netflix	2019-05-25 00:07:49 +00:00
John Baldwin	5c420aae3b	Add deprecation warnings for weaker algorithms to geli(4). - Triple DES has been formally deprecated in Kerberos (RFC 8429) and is soon to be deprecated in IPsec (RFC 8221). - Blowfish is deprecated. FreeBSD doesn't support its successor (Twofish). - MD5 is generally considered a weak digest that has known attacks. geli refuses to create new volumes using these algorithms via 'geli init'. It also warns when attaching to existing volumes or creating temporary volumes via 'geli onetime' . The plan is to fully remove support for these algorithms in FreeBSD 13. Note that none of these algorithms have ever been the default algorithm used by geli(8). Users would have had to explicitly select these algorithms when creating volumes in the past. Reviewed by: cem, delphij MFC after: 3 days Relnotes: yes Sponsored by: Chelsio Communications Differential Revision: https://reviews.freebsd.org/D20344	2019-05-23 22:31:55 +00:00
Conrad Meyer	6b6e2954dd	List-ify kernel dump device configuration Allow users to specify multiple dump configurations in a prioritized list. This enables fallback to secondary device(s) if primary dump fails. E.g., one might configure a preference for netdump, but fallback to disk dump as a second choice if netdump is unavailable. This change does not list-ify netdump configuration, which is tracked separately from ordinary disk dumps internally; only one netdump configuration can be made at a time, for now. It also does not implement IPv6 netdump. savecore(8) is already capable of scanning and iterating multiple devices from /etc/fstab or passed on the command line. This change doesn't update the rc or loader variables 'dumpdev' in any way; it can still be set to configure a single dump device, and rc.d/savecore still uses it as a single device. Only dumpon(8) is updated to be able to configure the more complicated configurations for now. As part of revving the ABI, unify netdump and disk dump configuration ioctl / structure, and leave room for ipv6 netdump as a future possibility. Backwards-compatibility ioctls are added to smooth ABI transition, especially for developers who may not keep kernel and userspace perfectly synced. Reviewed by: markj, scottl (earlier version) Relnotes: maybe Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D19996	2019-05-06 18:24:07 +00:00
Roger Pau Monné	b951b8f721	geom: fix initialization order There's a race between the initialization of devsoftc.mtx (by devinit) and the creation of the geom worker thread g_run_events, which calls devctl_queue_data_f. Both of those are initialized at SI_SUB_DRIVERS and SI_ORDER_FIRST, which means the geom worked thread can be created before the mutex has been initialized, leading to the panic below: wpanic: mtx_lock() of spin mutex (null) @ /usr/home/osstest/build.135317.build-amd64-freebsd/freebsd/sys/kern/subr_bus.c:620 cpuid = 3 time = 1 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe003b968710 vpanic() at vpanic+0x19d/frame 0xfffffe003b968760 panic() at panic+0x43/frame 0xfffffe003b9687c0 __mtx_lock_flags() at __mtx_lock_flags+0x145/frame 0xfffffe003b968810 devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe003b968840 g_dev_taste() at g_dev_taste+0x463/frame 0xfffffe003b968a00 g_load_class() at g_load_class+0x1bc/frame 0xfffffe003b968a30 g_run_events() at g_run_events+0x197/frame 0xfffffe003b968a70 fork_exit() at fork_exit+0x84/frame 0xfffffe003b968ab0 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe003b968ab0 --- trap 0, rip = 0, rsp = 0, rbp = 0 --- KDB: enter: panic [ thread pid 13 tid 100029 ] Stopped at kdb_enter+0x3b: movq $0,kdb_why Fix this by initializing geom at SI_ORDER_SECOND instead of SI_ORDER_FIRST. Sponsored by: Citrix Systems R&D Reviewed by: kevans, markj Differential revision: https://reviews.freebsd.org/D20148	2019-05-06 09:48:34 +00:00
Alexander Motin	9c498bd5c3	Call delist_dev() before destroy_dev_sched_cb(). destroy_dev_sched_cb() is excessively asynchronous, and during media change retaste new provider may appear sooner then device of the previous one get destroyed. MFC after: 1 week Sponsored by: iXsystems, Inc.	2019-04-24 19:56:02 +00:00
Conrad Meyer	83efd2885e	gnop(8): Nopify configuration as a kernel dump device As a dummy / no-op dump device, to facilitate dumpon(8) testing. Reviewed by: markj (earlier version) Differential Revision: https://reviews.freebsd.org/D19991	2019-04-22 03:25:49 +00:00
Pawel Jakub Dawidek	2f07cdf871	Implement automatic online expansion of GELI providers - if the underlying provider grows, GELI will expand automatically and will move the metadata to the new location of the last sector. This functionality is turned on by default. It can be turned off with the -R flag, but it is not recommended - if the underlying provider grows and automatic expansion is turned off, it won't be possible to attach this provider again, as the metadata is no longer located in the last sector. If the automatic expansion is turned off and the underlying provider grows, GELI will only log a message with the previous size of the provider, so recovery can be easier. Obtained from: Fudo Security	2019-04-03 23:57:37 +00:00
Pawel Jakub Dawidek	e6b0d5eb9f	Introduce new event SIZECHANGE within GEOM system to inform about GEOM providers mediasize changes. While here, use GEOM nomenclature to describe providers instead of calling them device nodes. Obtained from: Fudo Security Tested in: AWS	2019-03-30 07:24:34 +00:00
Ian Lepore	91a3f3588a	Support device-independent labels for geom_flashmap slices. While geom_flashmap has always supported label names for its slices, it does so by appending "s.labelname" to the provider device name, meaning you still have to know the name and unit of the hardware device to use the labels. These changes add support for device-independent geom_flashmap labels, using the standard geom_label infrastructure. geom_flashmap now creates a softc struct attached to its geom, and as it creates slices it stores the label into an array in the softc. The new geom_label_flashmap uses those labels when tasting a geom_flashmap provider. Differential Revision: https://reviews.freebsd.org/D19535	2019-03-24 19:11:45 +00:00
Conrad Meyer	54533f66c9	stack(9): Drop unused API mode and comment that referenced it Reviewed by: markj Differential Revision: https://reviews.freebsd.org/D19601	2019-03-15 22:39:55 +00:00
Marcel Moolenaar	96937e3b23	Revert revision 254095 In revision 254095, gpt_entries is not set to match the on-disk hdr_entries, but rather is computed based on available space. There are 2 problems with this: 1. The GPT backend respects hdr_entries and only reads and writes that number of partition entries. On top of that, CRC32 is computed over the table that has hdr_entries elements. When the common code works on what is possibly a larger number, the behaviour becomes inconsistent and problematic. In particular, it would be possible to add a new partition that on a reboot isn't there anymore. 2. The calculation of gpt_entries is based on flawed assumptions. The GPT specification does not dictate that sectors are layed out in a particular way that the available space can be determined by looking at LBAs. In practice, implementations do the same thing, because there's no reason to do it any other way. Still, GPT allows certain freedoms that can be exploited in some form or shape if the need arises. PR: 229977 MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D19438	2019-03-05 04:15:34 +00:00
Konstantin Belousov	e8643b01e6	Modularize xz. Embedded lzma decompression library becomes a module usable by other consumers, in addition to geom_uzip. Most important code changes are - removal of XZ_DEC_SINGLE define, we need the code to work with XZ_DEC_DYNALLOC; - xz_crc32_init() call is removed from geom_uzip, xz module handles initialization on its own. xz is no longer embedded into geom_uzip, instead the depend line for the module is provided, and corresponding kernel option is added to each MIPS kernel config file using geom_uzip. The commit also carries unrelated cleanup by removing excess "device geom_uzip" in places which were missed in r344479. Reviewed by: cem, hselasky, ray, slavash (previous versions) Sponsored by: Mellanox Technologies Differential revision: https://reviews.freebsd.org/D19266 MFC after: 3 weeks	2019-02-26 19:55:03 +00:00
Mark Johnston	3843d88ca8	Add a missing return statement to g_concat_kernel_dump(). The error occurs when upper layers attempt an out-of-bounds write. Submitted by: Noah Bergbauer <noah.bergbauer@tum.de> MFC after: 1 week	2019-02-26 18:30:51 +00:00
Mark Johnston	cd2e908669	Define a constant for the maximum number of GEOM_CTL arguments. Reviewed by: eugen MFC with: r344305 Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19271	2019-02-20 17:07:08 +00:00
Mark Johnston	d4fbe32c65	Limit the number of entries allocated for a REPORT_ZONES command. The DIOCGETZONE ioctl can be used to fetch the zone list of an SMR drive, and the caller specifies the number of entries it wants to fetch. Clamp the caller's request to a sane limit so that a user cannot attempt large allocations. Callers already need to invoke the ioctl multiple times to fetch the full list in general, so there's no harm in limiting the number of entries returned. Fix style while here. admbug: 807 Reported by: Ilja Van Sprundel <ivansprundel@ioactive.com> Reviewed by: asomers, ken Tested by: ken MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19249	2019-02-19 21:33:02 +00:00
Mark Johnston	60a92c781d	Impose a limit on the number of GEOM_CTL arguments. Otherwise a privileged user can trigger a memory allocation of unbounded size, or an integer overflow in the subsequent geom_alloc_copyin() call, leading to out-of-bounds accesses. Hard-code a large limit to circumvent this problem. admbug: 854 Reported by: Anonymous of the Shellphish Grill Team Reviewed by: ae MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D19251	2019-02-19 21:22:22 +00:00
Andriy Voskoboinyk	81df432ecf	geom_uzip(4): set 'gp != NULL' assertion on top of the function There was yet another access to this variable in g_trace() few lines upper. PR: 203499 Reported by: cem MFC after: 5 days MFC with: 343473	2019-01-26 17:17:25 +00:00
Andriy Voskoboinyk	34fd9d7000	geom_uzip(4): move NULL pointer KASSERT check before it is dereferenced PR: 203499 Submitted by: <chadf@triularity.org> MFC after: 5 days	2019-01-26 14:54:06 +00:00
Conrad Meyer	797f009d59	gmirror: Relocate DEVICE_FLAGS to adjacent lines gmirror's sc_flags is shared between some on-disk state and some runtime only state. There's no real reason for that and they could probably be split up. Until they are, locate all of the flags for the same field nearby each other in the source, for clarity. No functional change. Sponsored by: Dell EMC Isilon	2019-01-23 16:44:21 +00:00
Mark Johnston	438622af06	Use g_handleattr() to reply to GEOM::candelete queries. g_handleattr() fills out bp->bio_completed; otherwise, g_getattr() returns an error in response to the query. This caused BIO_DELETE support to not be propagated through stacked configurations, e.g., a gconcat of gmirror volumes would not handle BIO_DELETE even when the gmirrors do. g_io_getattr() was not affected by the problem. PR: 232676 Reported and tested by: noah.bergbauer@tum.de MFC after: 1 week	2019-01-02 15:52:16 +00:00
Alexander Motin	02a9923034	Switch from mutexes to atomics in GEOM_DEV I/O path. Mutexes in I/O path there were used twice per I/O to atomically access several variables to close and/or destroy the device on last request completion. I found the way to fit all required info into one integer, suitable for atomic operations. It opened race window on device close, but addition of timeout to the msleep() there should cover it. Profiling shows removal of significant spinning time on those mutexes and IOPS increase from ~600K to >800K to NVMe on 72-core systems. MFC after: 1 month Sponsored by: iXsystems, Inc.	2018-12-27 19:15:24 +00:00
Conrad Meyer	d2d82bfc90	gmirror: Remove a last-minute INVARIANTS breakage in r341840 I mistakenly added a lock assertion to this routine at the last minute without confirming it was held during g_mirror_create. It isn't (it isn't even initialized yet). Mea culpa. Access is exclusive in both callers, just not always by that particular lock. Reported by: lwhsu X-MFC-With: r341840, r341674	2018-12-12 18:13:56 +00:00
Conrad Meyer	23c25bd8b1	gmirror: Fix a bug introduced in r341674 r341674 inadvertently introduced a bug where newer mirror components being tasted would clear the high sc_flags that are not controlled by component metadata, such as G_MIRROR_DEVICE_FLAG_TASTING. This could plausibly expose a small window of time during STARTING where device destruction might race with mirror component addition, probably resulting in a crash. Reviewed by: markj X-MFC-With: r341674 Differential Revision: https://reviews.freebsd.org/D18521	2018-12-12 05:48:27 +00:00
Conrad Meyer	af7dcae0e2	gmirror: Evaluate mirror components against newest metadata copy Re-apply r341665 with format strings fixed. If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-07 02:44:04 +00:00
Conrad Meyer	c4e87bdfc1	Revert r341665 due to tinderbox breakage I didn't notice that some format strings were non-portable. Will fix and re-commit later.	2018-12-07 00:47:05 +00:00
Conrad Meyer	bc1ee0be2d	gmirror: Evaluate mirror components against newest metadata copy If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. When we are instructed to forget mirror components, bump the generation id to avoid confusion with such stale components later. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-06 23:55:39 +00:00
Kirk McKusick	fb14e73cb4	Normally when an attempt is made to mount a UFS/FFS filesystem whose superblock has a check-hash error, an error message noting the superblock check-hash failure is printed and the mount fails. The administrator then runs fsck to repair the filesystem and when successful, the filesystem can once again be mounted. This approach fails if the filesystem in question is a root filesystem from which you are trying to boot. Here, the loader fails when trying to access the filesystem to get the kernel to boot. So it is necessary to allow the loader to ignore the superblock check-hash error and make a best effort to read the kernel. The filesystem may be suffiently corrupted that the read attempt fails, but there is no harm in trying since the loader makes no attempt to write to the filesystem. Once the kernel is loaded and starts to run, it attempts to mount its root filesystem. Once again, failure means that it breaks to its prompt to ask where to get its root filesystem. Unless you have an alternate root filesystem, you are stuck. Since the root filesystem is initially mounted read-only, it is safe to make an attempt to mount the root filesystem with the failed superblock check-hash. Thus, when asked to mount a root filesystem with a failed superblock check-hash, the kernel prints a warning message that the root filesystem superblock check-hash needs repair, but notes that it is ignoring the error and proceeding. It does mark the filesystem as needing an fsck which prevents it from being enabled for writing until fsck has been run on it. The net effect is that the reboot fails to single user, but at least at that point the administrator has the tools at hand to fix the problem. Reported by: Rick Macklem (rmacklem@) Discussed with: Warner Losh (imp@) Sponsored by: Netflix	2018-12-06 00:09:39 +00:00
Maxim Sobolev	9dcafe16d4	Another attempt to fix issue with the DIOCGDELETE ioctl(2) not handling slightly out-of-bound requests properly (r340187). Perform range check here rather then rely on g_delete_data() to DTRT. The g_delete_data() would always return success for requests starting just the next byte after providers media boundary. MFC after: 4 weeks	2018-12-04 21:48:56 +00:00
Dag-Erling Smørgrav	cdd2df880d	Add a “skip_dsn” option to g_part's bootcode verb to prevent g_part_mbr from setting the volume serial number. This unbreaks older boot blocks that don't support serial numbers, and allows boot0cfg to set the serial number itself if requested by the user. Submitted by: lev@, yuripv@ MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D17386	2018-11-27 14:58:19 +00:00
Maxim Sobolev	de66da7374	Revert r340187, it breaks EOD (end-of-device) detection logic. Turns out, i/o into last_sector+N is handled differently for N==1 and N>1 cases to accomodate that, so some other approach would be needed to fix DIOCGDELETE ioctl(2).	2018-11-07 16:28:09 +00:00
Maxim Sobolev	8948179aba	Don't allow BIO_READ, BIO_WRITE or BIO_DELETE requests that are fully beyond the end of providers media. The only exception is made for the zero length transfers which are allowed to be just on the boundary. Previously, any requests starting on the boundary (i.e. next byte after the last one) have been allowed to go through. No response from: freebsd-geom@, phk MFC after: 1 month	2018-11-06 15:55:41 +00:00
Mark Johnston	25c9cca757	Have gconcat advertise delete support if one of its disks does. This follows the example set by other multi-disk GEOM classes. PR: 232676 Tested by: noah.bergbauer@tum.de MFC after: 1 month	2018-10-30 00:22:14 +00:00
Eugene Grosbein	6d305ab0b2	Extend stripeoffset and stripesize of GEOMs from u_int to off_t GEOM's stripeoffset overflows at 4 gigabyte margin (2^32) because of its u_int type. This leads to incorrect data in the output generated by "sysctl kern.geom.confxml" command, "graid list" etc. when GEOM array has volumes larger than 4G, for example. This change does not affect ABI but changes KBI. No MFC planned. Differential Revision: https://reviews.freebsd.org/D13426	2018-10-27 16:14:42 +00:00
Xin LI	0db665bb98	Restore backward compatibility for "attach" verb. In r332361 and r333439, two new parameters were added to geli attach verb using gctl_get_paraml, which requires the value to be present. This would prevent old geli(8) binary from attaching geli(4) device as they have no knowledge about the new parameters. Restore backward compatibility by treating the absense of these two values as seeing the default value supplied by userland. PR: 232595 Reviewed by: oshogbo MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D17680	2018-10-27 03:37:14 +00:00
Glen Barber	01d4e2149e	MFH r338661 through r339200. Sponsored by: The FreeBSD Foundation	2018-10-05 17:53:47 +00:00
Alexander Motin	cf4a52cf67	Fix use-after-free in RAID0 error reporting of GEOM_RAID. PR: 231510 Submitted by: yangx92@hotmail.com Approved by: re (gjb) MFC after: 1 week	2018-09-24 16:58:55 +00:00
Jung-uk Kim	9c40dcbe5f	Make geli(8) buildable.	2018-09-19 07:08:04 +00:00
Conrad Meyer	1b0909d51a	OpenCrypto: Convert sessions to opaque handles instead of integers Track session objects in the framework, and pass handles between the framework (OCF), consumers, and drivers. Avoid redundancy and complexity in individual drivers by allocating session memory in the framework and providing it to drivers in ::newsession(). Session handles are no longer integers with information encoded in various high bits. Use of the CRYPTO_SESID2FOO() macros should be replaced with the appropriate crypto_ses2foo() function on the opaque session handle. Convert OCF drivers (in particular, cryptosoft, as well as myriad others) to the opaque handle interface. Discard existing session tracking as much as possible (quick pass). There may be additional code ripe for deletion. Convert OCF consumers (ipsec, geom_eli, krb5, cryptodev) to handle-style interface. The conversion is largely mechnical. The change is documented in crypto.9. Inspired by https://lists.freebsd.org/pipermail/freebsd-arch/2018-January/018835.html . No objection from: ae (ipsec portion) Reported by: jhb	2018-07-18 00:56:25 +00:00
Conrad Meyer	1df7f41560	OCF: Convert consumers to the session id typedef These were missed in the earlier r336269. No functional change. Sponsored by: Dell EMC Isilon	2018-07-16 19:01:05 +00:00
Mariusz Zaborski	78f79a9a08	Let geli deal with lost devices without crashing. PR: 162036 Submitted by: Fabian Keil <fk@fabiankeil.de> Obtained from: ElectroBSD Discussed with: pjd@	2018-07-15 18:03:19 +00:00
Warner Losh	4bae19e9b8	g_eli_key_cmp is used only in the kernel, so only define it in the kernel.	2018-07-13 18:21:38 +00:00
Mikolaj Golub	874774c5d4	geom_gate: enable resize Reviewed By: pjd Approved By: pjd Differential Revision: https://reviews.freebsd.org/D11531	2018-07-13 07:08:06 +00:00
Ed Maste	76db6c8773	gpart: add EFI alias for MBR partition scheme Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D15870	2018-06-17 20:10:48 +00:00
Ed Maste	a0a8412b2a	Sort geom/part mbr/ebr/ldm alias table entries Having the table entries in alpha order simplifies future additions. Sponsored by: The FreeBSD Foundation	2018-06-17 20:06:27 +00:00
Mariusz Zaborski	31f7586d73	Introduce the 'n' flag for the geli attach command. If the 'n' flag is provided the provided key number will be used to decrypt device. This can be used combined with dryrun to verify if the key is set correctly. This can be also used to determine which key slot we want to change on already attached device. Reviewed by: allanjude Differential Revision: https://reviews.freebsd.org/D15309	2018-05-09 20:53:38 +00:00
Mark Johnston	bd92e6b6f5	Refactor some of the MI kernel dump code in preparation for netdump. - Add clear_dumper() to complement set_dumper(). - Drain netdump's preallocated mbuf pool when clearing the dumper. - Don't do bounds checking for dumpers with mediasize 0. - Add dumper callbacks for initialization for writing out headers. Reviewed by: sbruno MFC after: 1 month Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D15252	2018-05-06 00:22:38 +00:00
Mark Johnston	681554d70b	Remove a redundant assertion. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-05-06 00:05:03 +00:00
Mark Johnston	40e805221b	Avoid dropping the topology lock in gmirror's dumpconf implementation. Doing so introduces races which can lead to a use-after-free when grabbing a snapshot of the GEOM mesh. To ensure that a mirror's disk list remains stable, change its locking protocol: both the softc lock and the topology lock are now required to modify the list, so either lock is sufficient for traversal. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-05-06 00:03:24 +00:00
Ed Maste	b525a10ac0	gpart: add fat32lba MBR partition type FAT32 partition with LBA addressing. Reviewed by: marcel MFC after: 3 days Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D15266	2018-05-04 00:34:27 +00:00
Kyle Evans	74d6c131cb	Annotate geom modules with MODULE_VERSION GEOM ELI may double ask the password during boot. Once at loader time, and once at init time. This happens due a module loading bug. By default GEOM ELI caches the password in the kernel, but without the MODULE_VERSION annotation, the kernel loads over the kernel module, even if the GEOM ELI was compiled into the kernel. In this case, the newly loaded module purges/invalidates/overwrites the GEOM ELI's password cache, which causes the double asking. MFC Note: There's a pc98 component to the original submission that is omitted here due to pc98 removal in head. This part will need to be revived upon MFC. Reviewed by: imp Submitted by: op Obtained from: opBSD MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D14992	2018-04-10 19:18:16 +00:00
Mariusz Zaborski	8f1c45c20a	Introduce dry run option for attaching the device. This will allow us to verify if passphrase and key is valid without decrypting whole device. Reviewed by: cem@, allanjude@ Differential Revision: https://reviews.freebsd.org/D15000	2018-04-10 13:22:48 +00:00
Kyle Evans	2967ace894	Retire the geom_aes class It's had a good life, but it's not really configurable and not really used. Obtained from: opBSD (with some changes) Differential Revision: https://reviews.freebsd.org/D14991	2018-04-09 17:30:30 +00:00
Brooks Davis	6469bdcdb6	Move most of the contents of opt_compat.h to opt_global.h. opt_compat.h is mentioned in nearly 180 files. In-progress network driver compabibility improvements may add over 100 more so this is closer to "just about everywhere" than "only some files" per the guidance in sys/conf/options. Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h is created on all architectures. Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the set of compiled files. Reviewed by: kib, cem, jhb, jtl Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D14941	2018-04-06 17:35:35 +00:00
Sean Bruno	2c385d51ce	Squash error from geom by sizing ident strings to DISK_IDENT_SIZE. Display attribute in future error strings and differentiate g_handleattr() error messages for ease of debugging in the future. "g_handleattr: md1 bio_length 24 strlen 31 -> EFAULT" Reported by: swills Reviewed by: imp cem avg Sponsored by: Limelight Networks Differential Revision: https://reviews.freebsd.org/D14962	2018-04-05 13:56:40 +00:00
Kirk McKusick	fb15890a8c	When freeing a superblock returned by ffs_sbget, be sure to also free the superblock summary information. Reported by: Peter Holm (pho@) Tested by: Peter Holm (pho@)	2018-03-24 15:36:25 +00:00
Mariusz Zaborski	9ea857cf0f	Remove unneeded variable which was introduced in r328472. Pointed out by: pjd@	2018-03-18 15:09:55 +00:00
Andriy Gapon	aca41af247	g_access: deal with races created by geoms that drop the topology lock The problem is that g_access() must be called with the GEOM topology lock held. And that gives a false impression that the lock is indeed held across the call. But this isn't always true because many classes, ZVOL being one of the many, need to drop the lock. It's either to perform an I/O on the first open or to acquire a different lock (like in g_mirror_access). That, of course, can break many assumptions. For example, g_slice_access() adds an extra exclusive count on the first open. As described above, an underlying geom may drop the topology lock and that would open a race with another thread that would also request another extra exclusive count. In general, two consumers may be granted incompatible accesses. To avoid this problem the code is changed to mark a geom with special flag before calling its access method and clear the flag afterwards. If another thread sees that flag, then it means that the topology lock has been dropped (either by the geom in question or downstream from it), so it is not safe to make another access call. So, the second thread would use g_topology_sleep() to wait until the flag is cleared and only then would it proceed with the access. Also see http://docs.freebsd.org/cgi/mid.cgi?809d9254-ee56-59d8-69a4-08838e985cea PR: 225960 Reported by: asomers Reviewed by: markj, mav MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D14533	2018-03-15 09:16:10 +00:00
Conrad Meyer	ee4d316fe7	g_part_gpt: Fix memory leak in error path If g_part_gpt_read() encountered a disk with bad primary and secondary tables, it could leak memory. Reported by: Coverity Sponsored by: Dell EMC Isilon	2018-03-07 01:55:50 +00:00
Conrad Meyer	90575a0ec9	g_label_ufs: Fix typo from r330264 Reported by: O. Hartmann <o.hartmann AT walstatt.org> Sponsored by: Dell EMC Isilon	2018-03-02 06:02:54 +00:00
Kirk McKusick	efbf396426	This change is some refactoring of Mark Johnston's changes in r329375 to fix the memory leak that I introduced in r328426. Instead of trying to clear up the possible memory leak in all the clients, I ensure that it gets cleaned up in the source (e.g., ffs_sbget ensures that memory is always freed if it returns an error). The original change in r328426 was a bit sparse in its description. So I am expanding on its description here (thanks cem@ and rgrimes@ for your encouragement for my longer commit messages). In preparation for adding check hashing to superblocks, r328426 is a refactoring of the code to get the reading/writing of the superblock into one place. Unlike the cylinder group reading/writing which ends up in two places (ffs_getcg/ffs_geom_strategy in the kernel and cgget/cgput in libufs), I have the core superblock functions just in the kernel (ffs_sbfetch/ffs_sbput in ffs_subr.c which is already imported into utilities like fsck_ffs as well as libufs to implement sbget/sbput). The ffs_sbfetch and ffs_sbput functions take a function pointer to do the actual I/O for which there are four variants: ffs_use_bread / ffs_use_bwrite for the in-kernel filesystem g_use_g_read_data / g_use_g_write_data for kernel geom clients ufs_use_sa_read for the standalone code (stand/libsa/ufs.c but not stand/libsa/ufsread.c which is size constrained) use_pread / use_pwrite for libufs Uses of these interfaces are in the UFS filesystem, geoms journal & label, libsa changes, and libufs. They also permeate out into the filesystem utilities fsck_ffs, newfs, growfs, clri, dump, quotacheck, fsirand, fstyp, and quot. Some of these utilities should probably be converted to directly use libufs (like dumpfs was for example), but there does not seem to be much win in doing so. Tested by: Peter Holm (pho@)	2018-03-02 04:34:53 +00:00
Mark Johnston	16759360d4	Fix a memory leak introduced in r328426. ffs_sbget() may return a superblock buffer even if it fails, so the caller must be prepared to free it in this case. Moreover, when tasting alternate superblock locations in a loop, ffs_sbget()'s readfunc callback must free the previously allocated buffer. Reported and tested by: pho Reviewed by: kib (previous version) Differential Revision: https://reviews.freebsd.org/D14390	2018-02-16 15:41:03 +00:00
Alan Somers	834063202a	gpart: append partition name to the underlying provider's physical path If the underlying provider's physical path is null, then the gpart device's physical path will be, too. Otherwise, it will append the partition name, such as "/p1" or "/s1/a". This will make gpart work better with zfsd(8). PR: 224965 MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D14010	2018-02-14 20:26:09 +00:00
Alan Somers	0bab7fa8a7	geli: append "/eli" to the underlying provider's physical path If the underlying provider's physical path is null, then the geli device's physical path will be, too. Otherwise, it will append "/eli". This will make geli work better with zfsd(8). PR: 224962 MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D13979	2018-02-14 20:15:32 +00:00
Justin Hibbits	d793587fe2	Fix a panic introduced in r329225 Some GEOM partition tables may be destroyed with incomplete partition entries. Guard against this with NULL checks. Reported by: pholm,others Reviewed by: markj Tested by: pholm	2018-02-14 15:12:09 +00:00
Justin Hibbits	08a3b42fdb	Narrow a race, and fix a leak, in g_part_wither A race in g_part_wither() can lead to I/O being performed with a freed GEOM when the device disappears. Close the race as best as we can for now, following the code patterns from g_part_ctl_destroy() and g_part_ctl_undo(). This also fixes a leak, as g_wither_geom() does not wither providers, it only orphans them, so the partition entries would never get destroyed in g_wither_washer(). Note, this is not a complete fix, it can still race with g_part_start(), the race has merely been narrowed. Reviewed by: markj Sponsored by: Dell EMC Isilon	2018-02-13 17:40:09 +00:00
Conrad Meyer	b42712a8b7	Add GUID and alias for Apple APFS partition PR: 225813 Submitted by: James Wright <james.wright AT jigsawdezign.com>	2018-02-11 06:57:20 +00:00
Mark Johnston	0d02f6c201	Simplify synchronization read error handling. Since synchronization reads are performed by submitting a request to the external mirror provider, we know that the request returns with an error only when gmirror was unable to read a copy of the block from any mirror. Thus, there is no need to retry the request from the synchronization error handler. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-02-06 16:02:33 +00:00
Alan Somers	f5b4099e6b	geom: don't write stack garbage in disk labels Most consumers of g_metadata_store were passing in partially unallocated memory, resulting in stack garbage being written to disk labels. Fix them by zeroing the memory first. gvirstor repeated the same mistake, but in the kernel. Also, glabel's label contained a fixed-size string that wasn't initialized to zero. PR: 222077 Reported by: Maxim Khitrov <max@mxcrypt.com> Reviewed by: cem MFC after: 3 weeks X-MFC-With: 323314 X-MFC-With: 323338 Differential Revision: https://reviews.freebsd.org/D14164	2018-02-04 14:49:55 +00:00
Xin LI	90a48fba23	After r328426, g_label depends on UFS (option FFS) code to read UFS superblock, and the kernel will fail to link when UFS is not built in. This commit makes it depend on a small portion of FFS bits and thereby fixes build for this situation. This is intended as an interim bandaid, and the actual superblock reading code should probably be made independent of UFS, so we do not need to depend on it (see kib@'s comment in the review for details), and we will revisit this once the superblock check hashes are all in place. Differential Revision: https://reviews.freebsd.org/D14092	2018-02-03 09:15:13 +00:00
Kirk McKusick	5d84ae8b49	Null out journal softc pointer earlier to avoid a segment fault that can otherwise occur. PR: 221804 Submitted by: Andreas Longwitz <longwitz at incore.de> MFC after: 1 week	2018-01-31 23:30:49 +00:00
Mariusz Zaborski	0fc4adbe06	Don't truncate name of glabel. If it's to long just report that. Reviewed by: trasz@ Differential Revision: https://reviews.freebsd.org/D13746	2018-01-27 12:28:52 +00:00
Kirk McKusick	dffce2150e	Refactoring of reading and writing of the UFS/FFS superblock. Specifically reading is done if ffs_sbget() and writing is done in ffs_sbput(). These functions are exported to libufs via the sbget() and sbput() functions which then used in the various filesystem utilities. This work is in preparation for adding subperblock check hashes. No functional change intended. Reviewed by: kib	2018-01-26 00:58:32 +00:00
Pedro F. Giffuni	ac2fffa4b7	Revert r327828, r327949, r327953, r328016-r328026, r328041: Uses of mallocarray(9). The use of mallocarray(9) has rocketed the required swap to build FreeBSD. This is likely caused by the allocation size attributes which put extra pressure on the compiler. Given that most of these checks are superfluous we have to choose better where to use mallocarray(9). We still have more uses of mallocarray(9) but hopefully this is enough to bring swap usage to a reasonable level. Reported by: wosch PR: 225197	2018-01-21 15:42:36 +00:00
Alan Somers	6f7f85e0e1	gnop(8): add the ability to set a nop provider's physical path While I'm here, expand the existing tests a bit. MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D13579	2018-01-18 05:57:10 +00:00
Pedro F. Giffuni	0cee6dcdb0	misc geom and gnu: make some use of mallocarray(9). Focus on code where we are doing multiplications within malloc(9). None of these ire likely to overflow, however the change is still useful as some static checkers can benefit from the allocation attributes we use for mallocarray. This initial sweep only covers malloc(9) calls with M_NOWAIT. No good reason but I started doing the changes before r327796 and at that time it was convenient to make sure the sorrounding code could handle NULL values. Differential revision: https://reviews.freebsd.org/D13837	2018-01-15 21:23:16 +00:00
Andriy Gapon	6ce374aa94	geom_disk / scsi_da: deny opening write-protected disks for writing Ths change consists of two parts. geom_disk: deny opening a disk for writing if it's marked as write-protected. A new disk(9) flag is added to mark write protected disks. A possible alternative could be to add another parameter to d_open, so that the open mode could be passed to it and the disk drivers could make the decision internally, but the flag required less churn. scsi_da: add a new phase of disk probing to query the all pages mode sense page. We can determine if the disk is write protected using bit 7 of the device specific field in the mode parameter header returned by MODE SENSE. PR: 224037 Reviewed by: mav MFC after: 4 weeks Differential Revision: https://reviews.freebsd.org/D13360	2018-01-15 11:20:00 +00:00
Mark Johnston	762f440f15	Fix handling of read errors during mirror synchronization. We would previously just free the request BIO, which would either cause the disk to stay stuck in the SYNCHRONIZING state, or result in synchronization completing without having copied the block which returned an error. With this change, if the disk which returned an error is the only active disk in the mirror, the synchronizing disk is kicked out. Otherwise, the read is retried. Reported and tested by: pho (previous version) MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-01-10 19:37:21 +00:00
Mark Johnston	792f0c3b09	Clarify the use of the gmirror flag mask constants. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-01-10 15:21:36 +00:00
Mark Johnston	aed882a9fb	Avoid referencing a possibly freed consumer after r327496. g_mirror_regular_request() may free the gmirror consumer for a disk if that disk is being disconnected, after which we must not dereference the consumer pointer. CID: 1384280 X-MFC with: r327496	2018-01-10 05:06:21 +00:00
Mark Johnston	8b0a00b745	Sort and remove unneeded includes. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-01-08 15:56:40 +00:00
Mark Johnston	7653e6d781	Release the queue lock before restarting the worker loop. Reported and tested by: pho MFC after: 3 days Sponsored by: Dell EMC Isilon	2018-01-08 15:41:49 +00:00
Mark Johnston	1787c3feb4	Fix some I/O ordering issues in gmirror. - BIO_FLUSH requests were dispatched to the disks directly from g_mirror_start() rather than going through the mirror's I/O request queue, so they could have been reordered with preceding writes. Address this by processing such requests from the queue, avoiding direct dispatch. - Handling for collisions with synchronization requests was too fine-grained and could cause reordering of writes. In particular, BIO_ORDERED was not being honoured. Address this by effectively freezing the request queue any time a collision with a synchronization request occurs. The queue is unfrozen once the collision with the first frozen request is over. - The above-mentioned collision handling allowed reads to jump ahead of writes to the same offset. Address this by freezing all request types when a collision occurs, not just BIO_WRITEs and BIO_DELETEs. Also add some more fail points for use in testing error handling. Reviewed by: imp MFC after: 3 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D13559	2018-01-02 18:11:54 +00:00
Colin Percival	8b8a7c43a9	Instrument "boot holds" for the benefit of the TSLOG framework. These are places where the "main thread" of the booting kernel (either the thread which later becomes swapper or the thread which later becomes init) has to stop and wait for action to take place in another thread before continuing. There are currently three such holds: 1. The intr_config_hooks SYSINIT waits for hooks registered via the config_intrhook_establish function; this allows (typically) devices which need interrupts enabled to complete their initialization to do so before root is mounted. 2. The g_waitidle function waits for the GEOM event queue to be empty; this ensures that all of the disks which have been attached have been tasted before we attempt to mount root. 3. The vfs_mountroot_wait function (in addition to calling g_waitidle) waits for holds registered via root_mount_hold; among other things, this is used by the USB subsystem to ensure that we don't fail to mount root if it's located on a USB disk which takes a while to probe.	2017-12-31 09:23:52 +00:00
Pedro F. Giffuni	2afb21f309	geom_ccd.c: Fix the licenses properly The license merging in r109471 didn't take into account that licensing could change. Just removing the 3rd clause obviates the copyright assignment to the NetBSD Foundation. We do have plenty of files that have two or more licensing as in this case, so fix this properly by splitting back the licenses as they are upstream. Obtained from: NetBSD	2017-12-30 02:07:18 +00:00
Pedro F. Giffuni	68689f580b	geom_ccd.c: Update the license with changes from upstream. Part of this file originated in NetBSD, with the original file carrying two versions of 4-clause BSD licenses. r109471 attempted to simplify the situation by putting both licenses together. Meanwhile, NetBSD dropped Clauses 3 and 4 from their own license, and eventually NetBSD got permission from the University of Utah to drop the 3rd clause. Keep the license "simple" by dropping the third clause since both TNF, Utah/Berkeley and phk agree in principle that it can be dropped. Obtained from: NetBSD (ccd.c CVS 1.128, 1.138)	2017-12-30 01:37:08 +00:00
Alexander Kabaev	151ba7933a	Do pass removing some write-only variables from the kernel. This reduces noise when kernel is compiled by newer GCC versions, such as one used by external toolchain ports. Reviewed by: kib, andrew(sys/arm and sys/arm64), emaste(partial), erj(partial) Reviewed by: jhb (sys/dev/pci/* sys/kern/vfs_aio.c and sys/kern/kern_synch.c) Differential Revision: https://reviews.freebsd.org/D10385	2017-12-25 04:48:39 +00:00
Mark Johnston	9abe2e7e98	Avoid using bioq_* in gmirror. gmirror does not perform any sorting of I/O requests, so the bioq API doesn't provide any advantages over plain TAILQs. The API also does not provide operations needed by an upcoming change. No functional change intended. The diff shrinks the geom_mirror.ko text and the gmirror softc slightly. Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-19 17:13:04 +00:00
Mark Johnston	68eadcec0f	Give a couple of predication functions a bool return type. No functional change intended. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-15 19:14:21 +00:00
Mark Johnston	204d94f161	Typo. MFC after: 1 week	2017-12-15 19:03:03 +00:00
Mark Johnston	8b93770503	Address a possible lost wakeup for gmirror events. g_mirror_event_send() acquires the I/O queue lock to deliver a wakeup to the worker thread, and this is done after enqueuing the event. So it's sufficient to check the event queue before atomically releasing the queue lock and going to sleep. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:29:34 +00:00
Mark Johnston	b634781eac	Give g_mirror_event_get() a more accurate name. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:25:25 +00:00
Mark Johnston	a3584ee355	Decrement sc_writes when BIO_DELETE requests complete. Otherwise a gmirror that has received a BIO_DELETE request will never be marked clean (unless sc_writes overflows). MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:24:30 +00:00
Eugene Grosbein	1b8ea9beff	geom_raid (RAID5): do not lose bp->bio_error, keep it in pbp->bio_error and return it by passing to g_raid_iodone() Approved by: mav (mentor) MFC after: 3 days	2017-12-07 20:09:17 +00:00
Eugene Grosbein	01a51aba38	Fix use-after-free that sometimes results in a garbage returned instead of right error code after requests to SINGLE/CONCAT volumes, f.e: # dd if=/dev/raid/r0 bs=512 of=/dev/null dd: /dev/raid/r0: Unknown error: -559038242 Reviewed by: avg (mentor), mav (mentor) MFC after: 3 days	2017-12-07 05:55:18 +00:00
Warner Losh	700b6f8e23	When building standalone, include stand.h rather than the kernel includes or the userland includes. Sponsored by: Netflix	2017-12-05 21:37:32 +00:00
Warner Losh	3a7d67e741	We don't need both _STAND and _STANDALONE. There's more places that use _STANDALONE, so change the former to the latter. Sponsored by: Netflix	2017-12-02 00:07:09 +00:00
Mark Johnston	2ceafb776e	Update gmirror metadata less frequently when synchronizing. We periodically record synchronization progress in the metadata block of the disk being synchronized; this allows an interrupted synchronization to be resumed. However, the frequency of these updates heavily pessimized synchronization time on some media. This change modifies gmirror to update metadata based on a time period, and adds a sysctl to control that period. The default value results in a much lower update frequency and increases the completion time for an interrupted rebuild only marginally. Reported by: Andre Albsmeier <andre@fbsd.e4m.org> MFC after: 3 weeks	2017-11-30 20:36:29 +00:00
Pedro F. Giffuni	3728855a0f	sys/geom: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts.	2017-11-27 15:17:37 +00:00
Mark Johnston	0349817103	Allow kern.geom.mirror.debug to be negative. A negative value can be used to suppress all prints from the gmirror kernel code, which can be useful when attempting to trigger race conditions using stress tests. MFC after: 1 week	2017-11-23 14:07:52 +00:00
Warner Losh	fdcf0c7477	While the EFI spec allows numbers to be in many forms, libefivar produces hex numbers for the dsn. Since that come is from EDK2, change this for symmetry, by generating the dsn as a hex number. Noticed by: gpart list \| grep efimedia \| awk -F: '{print $2;}' \| \ sed -e 's/^ *//g;s/,,/,/' \| grep MBR \| efidp -p \| efidp -f Sponsored by: Netflix	2017-11-21 06:12:21 +00:00
Warner Losh	2ab9683565	Remove trailing whitespace (one I just introduced and a bunch of others in the same directory). Sponsored by: Netflix	2017-11-21 05:42:13 +00:00
Warner Losh	d65b6588d6	Implement efi media tagging for MBR partitioning types. Sponsored by: Netflix	2017-11-21 05:35:21 +00:00
Pedro F. Giffuni	df57947f08	spdx: initial adoption of licensing ID tags. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, superceed or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point. Initially, only tag files that use BSD 4-Clause "Original" license. RelNotes: yes Differential Revision: https://reviews.freebsd.org/D13133	2017-11-18 14:26:50 +00:00
Andriy Gapon	f0fa2af656	geom_slice: fix r325227, protect against multiple calls to g_slice_free This geom does not immediately detach its consumer relying on the wither-washer to do that. Since that happens asynchronously we may get additional spoiling events. So, we need to account for that. There are multiple options for fixing this issue like detaching immediately or checking for G_CF_ORPHAN in g_slice_spoiled(). The most reliable and least intrusive fix seems to be setting geom->softc to NULL on the first call and checking for NULL on subsequent calls. This is something that the code did before r325227. Reported by: David Wolfskill <david@catwhisker.org>, O. Hartmann <o.hartmann@walstatt.org> Tested by: David Wolfskill <david@catwhisker.org> (earlier version) Discussed with: mav MFC after: 1 week X-MFC with: r325227	2017-11-01 10:53:10 +00:00
Andriy Gapon	9662d80af5	geom_slice: do not destroy softc until providers are gone At present, g_slice_orphan and g_slice_spoiled destroy the softc (struct g_slicer) even before calling g_wither_geom, so there can be active and incoming io requests at that time and g_slice_start can access the softc. This commit changes the code to destroy the softc only after all providers are closed. While there, a couple of small cleanups. Reported by: Ben RUBSON <ben.rubson@gmail.com> Tested by: Ben RUBSON <ben.rubson@gmail.com> Reviewed by: mav, smh (earlier version) MFC after: 2 weeks Sponsored by: Panzura Differential Revision: https://reviews.freebsd.org/D12809	2017-10-31 10:10:13 +00:00
Edward Tomasz Napierala	338ed98ad2	Add back missing MTX_DEF, it still needs to be there. (Although it's defined to be 0, so there's no functional change.) Reported by: glebius MFC after: 2 weeks	2017-10-29 12:03:06 +00:00
Mark Johnston	cef5abd140	Fix a lock leak in g_mirror_destroy(). g_mirror_destroy() is supposed to unlock the softc before indicating success, but it wasn't doing so if the caller raced with another thread destroying the mirror. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-10-27 17:05:14 +00:00
Edward Tomasz Napierala	0d73fface2	Make gmountver(8) use direct dispatch. MFC after: 2 weeks	2017-10-26 10:18:31 +00:00
Edward Tomasz Napierala	0a8cfed8cf	Make gmountver(8) use G_PF_ACCEPT_UNMAPPED. MFC after: 2 weeks	2017-10-26 09:29:35 +00:00
Mark Johnston	64a16434d8	Add support for compressed kernel dumps. When using a kernel built with the GZIO config option, dumpon -z can be used to configure gzip compression using the in-kernel copy of zlib. This is useful on systems with large amounts of RAM, which require a correspondingly large dump device. Recovery of compressed dumps is also faster since fewer bytes need to be copied from the dump device. Because we have no way of knowing the final size of a compressed dump until it is written, the kernel will always attempt to dump when compression is configured, regardless of the dump device size. If the dump is aborted because we run out of space, an error is reported on the console. savecore(8) is modified to handle compressed dumps and save them to vmcore.<index>.gz, as it does when given the -z option. A new rc.conf variable, dumpon_flags, is added. Its value is added to the boot-time dumpon(8) invocation that occurs when a dump device is configured in rc.conf. Reviewed by: cem (earlier version) Discussed with: def, rgrimes Relnotes: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D11723	2017-10-25 00:51:00 +00:00
Alan Somers	27f0f2ec5f	Display rotation rate and TRIM/UNMAP support in diskinfo(8) Bump __FreeBSD_version due to the expansion of struct diocgattr_arg. Reviewed by: mav, allanjude, imp MFC after: 3 weeks Relnotes: yes Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D12578	2017-10-04 15:09:49 +00:00
Edward Tomasz Napierala	b73e1f746a	Don't destroy gmountver(8) devices on shutdown, unless they are orphaned. Otherwise we would fail to sync the filesystem on reboot. MFC after: 2 weeks Sponsored by: DARPA, AFRL	2017-10-04 12:25:39 +00:00
Edward Tomasz Napierala	2b4490a5d3	Clear G_CF_ORPHAN when attaching. This fixes cases where the same GEOM consumer can be orphaned, and then reattach to another provider. From a user point of view, this makes gmountver(4) work again. Reviewed by: avg, mav MFC after: 2 weeks Sponsored by: DARPA, AFRL Differential Revision: https://reviews.freebsd.org/D12228	2017-10-02 11:57:00 +00:00
Conrad Meyer	a523de2365	g_resize_provider_event: Do not invoke orphan method twice Like r266444, g_resize_provider_event can attempt to orphan an already orphaned geom_dev consumer. This will cause a panic in g_dev_orphan. Apply the same fix as was applied to g_orphan_register. Reviewed by: ae Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12469	2017-09-24 19:59:26 +00:00
Andriy Gapon	7103ac8ad6	gmirror: treat ENXIO as disk disconnect, not media error In theory, all data access errors mean that a member is out of sync at most. But they were treated as more serious errors to avoid the situation where a flaky disk gets repeatedly disconnected, re-synchronized, reconnected and then disconnected again. ENXIO is a special error that means that the member disk disappeared, so it should get the same handling as the GEOM orphaning event. There is a better chance that when the disk is reconnected, it will be a good member again. When ENXIO happens on a read we use the exisiting G_MIRROR_BUMP_SYNCID mechanism which means that the mirror's syncid is increased as soon as there is a write to the mirror. That's because no data has got out of sync yet, but the problematic memeber is disconnected, so the future write will make it stale. When ENXIO happens on a write we use a new G_MIRROR_BUMP_SYNCID_NOW mechanism which means that we update the mirror metadata as soon as possible because the problematic memeber is already behind. Reviewed by: markj, imp MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D9463	2017-09-15 13:57:08 +00:00
Conrad Meyer	ea5eee641e	Fix information leak in geli(8) integrity mode In integrity mode, a larger logical sector (e.g., 4096 bytes) spans several physical sectors (e.g., 512 bytes) on the backing device. Due to hash overhead, a 4096 byte logical sector takes 8.5625 512-byte physical sectors. This means that only 288 bytes (256 data + 32 hash) of the last 512 byte sector are used. The memory allocation used to store the encrypted data to be written to the physical sectors comes from malloc(9) and does not use M_ZERO. Previously, nothing initialized the final physical sector backing each logical sector, aside from the hash + encrypted data portion. So 224 bytes of kernel heap memory was leaked to every block :-(. This patch addresses the issue by initializing the trailing portion of the physical sector in every logical sector to zeros before use. A much simpler but higher overhead fix would be to tag the entire allocation M_ZERO. PR: 222077 Reported by: Maxim Khitrov <max AT mxcrypt.com> Reviewed by: emaste Security: yes Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D12272	2017-09-09 01:41:01 +00:00
Warner Losh	a905e3962c	The hard drive media device path contains the size of the partition, not its end. This makes the GEOM efimedia attribute match the FreeBSD:Boot1Device environment variable now. Sponsored by: Netflix	2017-09-02 07:04:06 +00:00
Warner Losh	ab4effdc68	Add efimedia attribute for all GPT partitions. Sposnored by: Netflix Differential Revision: https://reviews.freebsd.org/D12206	2017-09-01 17:55:25 +00:00
Konstantin Belousov	58d8f357c7	Let g_access() log the actual error number. Submitted by: Fabian Keil <fk@fabiankeil.de> PR: 221855 MFC after: 1 week	2017-08-27 12:24:25 +00:00
Mariusz Zaborski	3453dc72ad	Hide length of geli passphrase during boot. Introduce additional flag to the geli which allows to restore previous behavior. Reviewed by: AllanJude@, cem@ (previous version) MFC: 1 month Relnotes: yes Differential Revision: https://reviews.freebsd.org/D11751	2017-08-26 14:07:24 +00:00
Kirk McKusick	037331ddbd	When read requests are sent from a filesystem running above g_journal, the g_journal level needs to check whether it is holding a newer copy of the block than that which exists on the disk. If so, it needs to return its copy. If not, it should pass the request down to the disk to fulfill. It currently considers six queues: 0) delayed queue, 1) unsent (current queue), 2) in-flight to the journal (flush queue), 3) active journal (active queue), 4) inactive journal (inactive queue), and 5) inflight to the disk (copy queue). Checking on two of these queues is unnecessary: 0) The delayed requests should not be used for reads because they have not yet been entered into the journal, so their value should reflect the disk contents, not the future contents that are not yet committed. 2) Because all the bio's in the flush queue are also found on the active queue, there is no need to inspect the flush queue for reads since they will be found when searching the active queue. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-13 18:09:22 +00:00
Kirk McKusick	8fccf8ffd7	Eliminate a variable that is only ever set. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-13 18:06:38 +00:00
Warner Losh	0038725697	Also provide a warning for geom_fox. Differential Review: https://reviews.freebsd.org/D11935 Requested by: jhb@ MFC After: 3 days	2017-08-09 16:37:37 +00:00
Warner Losh	20995eab57	Mark geom classes as deprecated. geom_bsd, geom_mbr and geom_sunlabel have been obsolete since Marcel Moolenaar's geom_part was in FreeBSD 7. They haven't been in GENERIC since FreeBSD 8. Add warning when used. geom_vol_ffs has been obsolete since ufs support to geom_label was committed in FreeBSD 5. It hasn't been in GENERIC since FreeBSD 5. Add warning when used. geom_fox has been obsolete since gmultipath was committed in FreeBSD 7. (no warning added, since this is a very obscure class). These will all be removed in FreeBSD 12. MFC After: 3 days Differential Revision: https://reviews.freebsd.org/D11935 Note: Classes will be removed after MFC	2017-08-09 16:15:24 +00:00
Warner Losh	36d6e01474	Eliminate useless adjustments of aliased device. No need to set any fields in the cloned device. devfs uses symlinks, so the adev entries returned won't be presented to the drivers. Since we don't save copies, nothing else will see them. This code came from the old compat code, and it appears to be obsolete or never needed. Submitted by: kib@ Differential Review: https://reviews.freebsd.org/D11919	2017-08-07 22:42:46 +00:00
Warner Losh	d3517d306c	Expose API to allow disks to ask for alias names in devfs. Implement disk_add_alias to allow aliases to be added to disks. All disk have a primary name (say "foo") can also have secondary names (say "bar") such that all instances of "foo" also have a "bar" alias. So if you have foo0, foo0p1, foo1, foo1s1 and foo1s1a nodes created by the foo driver and gpart, device nodes bar0, bar0p1, bar1, bar1s1 and bar1s1a will appear as symlinks back to the original nodes. This generalizes to multiple aliases. However, since the unit number follows the primary name, multiple device drivers can't create the same aliases unless those drives coorinate the unit number space (eg you couldn't add an alias 'disk' to both 'da' and 'ada' because it's possible to have da0 and ada0, because 'disk0' is ambiguous). Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:38 +00:00
Warner Losh	5d7d13290a	Add alias support to gpart. When we're creating new providers for each of the partitions, add aliases to the geom before we create the provider so when geom_dev tastes the provider, the aliases are in place so the proper /dev entries are created. So foo5p6 gets created as an alias for bar5p6 when foo is an alias for bar in the geom we're partitioning with g_part. This also copies aliases from the container geom (eg disk) to the label geom (the disk with GPT partitioning) so that aliases nest properly. Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:33 +00:00
Warner Losh	c624eb2598	Add aliasing concept to geom. Add an alias name list to geoms. Use them in geom_dev to create aliases. Previously, geom_dev would create an device node for the name of the geom. Now, additional nodes are created pointing back to the primary node with make_dev_alias_p. Aliases must be in place on the geom before any tasting occurs. Differential Revision: https://reviews.freebsd.org/D11873	2017-08-07 21:12:28 +00:00
Kirk McKusick	6c6118b390	gjournal is broken in handling its flush_queue. If we have 10 bio's in the flush_queue: 1 2 3 4 5 6 7 8 9 10 and another 10 bio's go into the flush queue after only the first five bio's are removed from the flush queue, the queue should look like: 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20, but because of the bug we end up with 6 11 12 13 14 15 16 17 18 19 20 7 8 9 10. So the sequence of the bio's is damaged in the flush queue (and therefore in the journal on disk !). This error can be triggered by ffs_snapshot() when a block is read with readblock() and gjournal finds this block in the broken flush queue before it goes to the correct active queue. The fix is to place all new blocks at the end of the queue. Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Discussed with: kib MFC after: 1 week	2017-08-07 19:40:03 +00:00
Kirk McKusick	683590b642	sysctl kern.geom.journal.cache.limit shows negative value for FreeBSD/amd64 system having over 4GB RAM. That's due to: 1) the limit being u_int instead of u_long like vm.kmem_size (the limit is half of vm.kmem_size by default for amd64); 2) sysctl handler g_journal_cache_limit_sysctl() using u_int instead of u_long. The fix is to replace u_int with u_long for the kern.geom.journal.cache.limit sysctl variable. PR: 198500 Submitted by: Dr. Andreas Longwitz <longwitz@incore.de> Reported by: Eugene Grosbein Discussed with: kib MFC after: 1 week	2017-08-07 19:18:27 +00:00
Alexander Motin	1631690677	Add GEOM::descr attribute for symmetry with GEOM::ident. MFC after: 2 weeks	2017-07-06 08:36:14 +00:00
Ryan Libby	fb0e3235ea	g_virstor.h: macro parenthesization Build with gcc -Wint-in-bool-context revealed a macro parenthesization error (invoking LOG_MSG with a ternary expression for lvl). Reviewed by: markj Approved by: markj (mentor) Sponsored by: Dell EMC Isilon Differential revision: https://reviews.freebsd.org/D11411	2017-06-30 22:01:18 +00:00
Marcelo Araujo	4323355e76	With r318394 seems it breaks gpart(8) in some embedded systems such like PCEngines, RPI1-B, Alix and APU2 boards as well as NanoBSD with the following message: vnode_pager_generic_getpages_done: I/O read error 5 Seems the breakage was because it was missed to include acr in glabel update. Reported by: Peter Blok <pblok@bsd4all.org>, madpilot, imp and trasz. Reviewed by: trasz Tested by: Peter Blok and madpilot. MFC after: 3 days. Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D11365	2017-06-27 01:22:27 +00:00
Stephen J. Kiernan	9a81ba0f24	Add MD_VERIFY option to enable O_VERIFY in open for vnode type. Add -o [no]verify option to mdconfig (and document in man page.) Implement GEOM attribute MNT::verified to ask md if the backing vnode is verified. Check for MNT::verified in cd9660 mount to flag the mount as MNT_VERIFIED if the underlying device has been verified. Reviewed by: rwatson Approved by: sjg (mentor) Obtained from: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D2902	2017-05-31 21:18:11 +00:00
Edward Tomasz Napierala	6635c8ed2f	Fix typo. MFC after: 2 weeks	2017-05-18 08:25:07 +00:00
Mark Johnston	db7c508323	Synchronize unclean mirrors before adding them to a running gmirror. During gmirror startup, if component mirrors are found to be dirty as is typical after a system crash, the mirrors are synchronized to the mirror with highest priority. However if a gmirror starts without all of its mirrors present, for example because of some transient delays during tasting, the remaining mirrors must be synchronized before they may become active. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-05-02 23:29:42 +00:00
Alexander Motin	d109d8adc7	Dump md_iterations as signed, which it really is. PR: 208305 PR: 196834 MFC after: 2 weeks	2017-04-21 07:43:44 +00:00
Alexander Motin	d8880fd450	Always allow setting number of iterations for the first time. Before this change it was impossible to set number of PKCS#5v2 iterations, required to set passphrase, if it has two keys and never had any passphrase. Due to present metadata format limitations there are still cases when number of iterations can not be changed, but now it works in cases when it can. PR: 218512 MFC after: 2 weeks Sponsored by: iXsystems, Inc. Differential Revision: https://reviews.freebsd.org/D10338	2017-04-21 07:16:07 +00:00
Mark Johnston	a7d94fcc3e	Rename two gmirror state flags to make their meanings slightly clearer. No functional change. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:13:57 +00:00
Mark Johnston	1e91412e40	Don't set the mirror GEOM softc to NULL in g_mirror_destroy(). At this point we have not rendezvous'ed with the mirror worker thread, and I/O may still be in flight. Various I/O completion paths expect to be able to obtain a reference to the mirror softc from the GEOM, so setting it to NULL may result in various NULL pointer dereferences if the mirror is stopped with -f or the kernel is shut down while a mirror is synchronizing. The worker thread will clear the softc pointer before exiting. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:08:37 +00:00
Mark Johnston	77011eac86	Check for a provider error before enqueuing mirror I/O. We are otherwise susceptible to a race with a concurrent teardown of the mirror provider, causing the I/O to be left uncompleted after the mirror started withering. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:03:32 +00:00
Mark Johnston	a65d524afc	Stop mirror synchronization before draining the I/O queue. Regular I/O requests may be blocked by concurrent synchronization requests targeted to the same LBAs, in which case they are moved to a holding queue until the conflicting I/O completes. We therefore want to stop synchronization before completing pending I/O in g_mirror_destroy_provider() since this ensures that blocked I/O requests are completed as well. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 16:54:50 +00:00
Mark Johnston	a4834289d6	Handle NULL entries in gmirror disk ds_bios arrays. Entries may be removed and freed if an I/O error occurs during mirror synchronization, so we cannot assume that all entries of ds_bios are valid. Also ensure that a synchronization BIO's array index is preserved after a successful write. Reported and tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-10 17:15:59 +00:00
Allan Jude	ec5c0e5be9	Implement boot-time encryption key passing (keybuf) This patch adds a general mechanism for providing encryption keys to the kernel from the boot loader. This is intended to enable GELI support at boot time, providing a better mechanism for passing keys to the kernel than environment variables. It is designed to be extensible to other applications, and can easily handle multiple encrypted volumes with different keys. This mechanism is currently used by the pending GELI EFI work. Additionally, this mechanism can potentially be used to interface with GRUB, opening up options for coreboot+GRUB configurations with completely encrypted disks. Another benefit over the existing system is that it does not require re-deriving the user key from the password at each boot stage. Most of this patch was written by Eric McCorkle. It was extended by Allan Jude with a number of minor enhancements and extending the keybuf feature into boot2. GELI user keys are now derived once, in boot2, then passed to the loader, which reuses the key, then passes it to the kernel, where the GELI module destroys the keybuf after decrypting the volumes. Submitted by: Eric McCorkle <eric@metricspace.net> (Original Version) Reviewed by: oshogbo (earlier version), cem (earlier version) MFC after: 3 weeks Relnotes: yes Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D9575	2017-04-01 05:05:22 +00:00
Allan Jude	39b7ca4533	sys/geom/eli: Switch bzero() to explicit_bzero() for sensitive data In GELI, anywhere we are zeroing out possibly sensitive data, like the metadata struct, the metadata sector (both contain the encrypted master key), the user key, or the master key, use explicit_bzero. Didn't touch the bzero() used to initialize structs. Reviewed by: delphij, oshogbo Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D9809	2017-03-31 00:07:03 +00:00
Mark Johnston	0d75d0dfbc	Avoid sleeping when the mirror I/O queue is non-empty. A request may be queued while the queue lock is dropped when the mirror is being destroyed. The corresponding wakeup would be lost, possibly resulting in an apparent hang of the mirror worker thread. Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-03-29 19:39:07 +00:00
Mark Johnston	c1ab409cba	Remove an unneeded g_mirror_destroy_provider() call. The worker thread will destroy the mirror provider as part of its teardown sequence. The call made sense in the initial revision of gmirror, but became unnecessary in r137248. Tested by: pho (part of a larger diff) MFC afteR: 2 weeks Sponsored by: Dell EMC Isilon	2017-03-29 19:30:22 +00:00
Mark Johnston	819cd913f4	Refine r301173 a bit. - Don't execute any of g_mirror_shutdown_post_sync() when panicking. We cannot safely idle the mirror or stop synchronization in that state, and the current attempts to do so complicate debugging of gmirror itself. - Check for a non-NULL panicstr instead of using SCHEDULER_STOPPED(). The latter was added for use in the locking primitives. Reviewed by: mav, pjd MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-03-27 16:25:58 +00:00
Marcelo Araujo	7f5f84f08f	After r315112 I broke the tests with eli, instead to pass 0, I should pass M_NOWAIT to g_media_changed() that will call g_post_event() with this flag. Reported by: lwhsu, ngie and ae	2017-03-13 13:56:01 +00:00
Scott Long	d8474e52e3	Report disk flags via the sysctl tree	2017-03-13 11:09:17 +00:00
Marcelo Araujo	2ae0afa8ee	Add the capability to refresh the gpart(8) label without need a reboot. gpart(8) has functionality to change the label of an GPT partition. This functionality works like it should, however, after a label change the /dev/gpt/ entries remain unchanged. glabel(8) status output remains unchanged. The change only takes effect after a reboot. PR: 162690 Submitted by: sub.mesa@gmail, Ben RUBSON <ben.rubson@gmail.com>, ae Reviewed by: allanjude, bapt, bcr MFC after: 6 weeks. Differential Revision: https://reviews.freebsd.org/D9935	2017-03-12 04:15:56 +00:00
Alexander Motin	4d5832bc12	When chunking large DIOCGDELETE, do it on stripe edge. MFC after: 2 weeks	2017-03-08 12:18:58 +00:00
Mariusz Zaborski	c27fb0b589	The kern.geom.part.auto_resize should be tunable.	2017-02-28 20:51:20 +00:00
Mariusz Zaborski	01ad653a81	Add sysctl to control auto resize of the GEOM metadata. Reviewed by: AllanJude Differential Revision: https://reviews.freebsd.org/D9603	2017-02-27 17:54:01 +00:00
Marius Strobl	4874af73c1	- Allow different slicers for different flash types to be registered with geom_flashmap(4) and teach it about MMC for slicing enhanced user data area partitions. The FDT slicer still is the default for CFI, NAND and SPI flash on FDT-enabled platforms. - In addition to a device_t, also pass the name of the GEOM provider in question to the slicers as a single device may provide more than provider. - Build a geom_flashmap.ko. - Use MODULE_VERSION() so other modules can depend on geom_flashmap(4). - Remove redundant/superfluous GEOM routines that either do nothing or provide/just call default GEOM (slice) functionality. - Trim/adjust includes Submitted by: jhibbits (RouterBoard bits) Reviewed by: jhibbits	2017-02-22 10:21:39 +00:00
Allan Jude	85c15ab853	improve PBKDF2 performance The PBKDF2 in sys/geom/eli/pkcs5v2.c is around half the speed it could be GELI's PBKDF2 uses a simple benchmark to determine a number of iterations that will takes approximately 2 seconds. The security provided is actually half what is expected, because an attacker could use the optimized algorithm to brute force the key in half the expected time. With this change, all newly generated GELI keys will be approximately 2x as strong. Previously generated keys will talk half as long to calculate, resulting in faster mounting of encrypted volumes. Users may choose to rekey, to generate a new key with the larger default number of iterations using the geli(8) setkey command. Security of existing data is not compromised, as ~1 second per brute force attempt is still a very high threshold. PR: 202365 Original Research: https://jbp.io/2015/08/11/pbkdf2-performance-matters/ Submitted by: Joe Pixton <jpixton@gmail.com> (Original Version), jmg (Later Version) Reviewed by: ed, pjd, delphij Approved by: secteam, pjd (maintainer) MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D8236	2017-02-19 19:30:31 +00:00
John Baldwin	dcbe5188da	Defer startup of gjournal switcher kproc. Don't start switcher kproc until the first GEOM is created. Reviewed by: pjd MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D8576	2017-02-07 22:45:59 +00:00
Andrey V. Elsukov	9ef6004352	Check that primary GPT header is valid before wiping partitioning. This allows safely destroy corrupted GPT when primary header was rewritten by some data, that do not want to destroy. MFC after: 1 week	2017-02-04 05:09:47 +00:00
Yoshihiro Takahashi	2b375b4edd	Remove pc98 support completely. I thank all developers and contributors for pc98. Relnotes: yes	2017-01-28 02:22:15 +00:00
Alexander Motin	d3fef0a092	Report disk addition errors on `add` or `create` subcommand. MFC after: 1 week	2017-01-20 13:49:04 +00:00
Alexander Motin	17160457b4	Report random flash storage as non-rotating to GEOM_DISK. While doing it, introduce respective constants in geom_disk.h. MFC after: 1 week	2017-01-12 08:53:10 +00:00
Conrad Meyer	b28ea2c250	g_raid: Prevent tasters from attempting excessively large reads Some g_raid tasters attempt metadata reads in multiples of the provider sectorsize. Reads larger than MAXPHYS are invalid, so detect and abort in such situations. Spiritually similar to r217305 / PR 147851. PR: 214721 Sponsored by: Dell EMC Isilon	2017-01-12 06:58:31 +00:00
Dimitry Andric	012039fd55	Fix logic error in gvinum's gv_set_sd_state() With clang 4.0.0, I'm getting the following warnings: sys/geom/vinum/geom_vinum_state.c:186:7: error: logical not is only applied to the left hand side of this bitwise operator [-Werror,-Wlogical-not-parentheses] if (!flags & GV_SETSTATE_FORCE) ^ ~ The logical not operator should obiously be called after masking. Reviewed by: mav, pfg MFC after: 3 days Differential Revision: https://reviews.freebsd.org/D9093	2017-01-08 17:56:54 +00:00
Sepherosa Ziehau	c22dceff9d	build: Unbreak LINT Sponsored by: Microsoft	2016-12-21 01:39:11 +00:00
Konrad Witaszczyk	480f31c214	Add support for encrypted kernel crash dumps. Changes include modifications in kernel crash dump routines, dumpon(8) and savecore(8). A new tool called decryptcore(8) was added. A new DIOCSKERNELDUMP I/O control was added to send a kernel crash dump configuration in the diocskerneldump_arg structure to the kernel. The old DIOCSKERNELDUMP I/O control was renamed to DIOCSKERNELDUMP_FREEBSD11 for backward ABI compatibility. dumpon(8) generates an one-time random symmetric key and encrypts it using an RSA public key in capability mode. Currently only AES-256-CBC is supported but EKCD was designed to implement support for other algorithms in the future. The public key is chosen using the -k flag. The dumpon rc(8) script can do this automatically during startup using the dumppubkey rc.conf(5) variable. Once the keys are calculated dumpon sends them to the kernel via DIOCSKERNELDUMP I/O control. When the kernel receives the DIOCSKERNELDUMP I/O control it generates a random IV and sets up the key schedule for the specified algorithm. Each time the kernel tries to write a crash dump to the dump device, the IV is replaced by a SHA-256 hash of the previous value. This is intended to make a possible differential cryptanalysis harder since it is possible to write multiple crash dumps without reboot by repeating the following commands: # sysctl debug.kdb.enter=1 db> call doadump(0) db> continue # savecore A kernel dump key consists of an algorithm identifier, an IV and an encrypted symmetric key. The kernel dump key size is included in a kernel dump header. The size is an unsigned 32-bit integer and it is aligned to a block size. The header structure has 512 bytes to match the block size so it was required to make a panic string 4 bytes shorter to add a new field to the header structure. If the kernel dump key size in the header is nonzero it is assumed that the kernel dump key is placed after the first header on the dump device and the core dump is encrypted. Separate functions were implemented to write the kernel dump header and the kernel dump key as they need to be unencrypted. The dump_write function encrypts data if the kernel was compiled with the EKCD option. Encrypted kernel textdumps are not supported due to the way they are constructed which makes it impossible to use the CBC mode for encryption. It should be also noted that textdumps don't contain sensitive data by design as a user decides what information should be dumped. savecore(8) writes the kernel dump key to a key.# file if its size in the header is nonzero. # is the number of the current core dump. decryptcore(8) decrypts the core dump using a private RSA key and the kernel dump key. This is performed by a child process in capability mode. If the decryption was not successful the parent process removes a partially decrypted core dump. Description on how to encrypt crash dumps was added to the decryptcore(8), dumpon(8), rc.conf(5) and savecore(8) manual pages. EKCD was tested on amd64 using bhyve and i386, mipsel and sparc64 using QEMU. The feature still has to be tested on arm and arm64 as it wasn't possible to run FreeBSD due to the problems with QEMU emulation and lack of hardware. Designed by: def, pjd Reviewed by: cem, oshogbo, pjd Partial review: delphij, emaste, jhb, kib Approved by: pjd (mentor) Differential Revision: https://reviews.freebsd.org/D4712	2016-12-10 16:20:39 +00:00
Alexander Motin	b6fe583c55	Add `gmirror create` subcommand, alike to gstripe, gconcat, etc. It is quite specific mode of operation without storing on-disk metadata. It can be useful in some cases in combination with some external control tools handling mirror creation and disks hot-plug. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-30 09:27:08 +00:00
Alexander Motin	dc399583ba	Use providergone method to cover race between destroy and g_access(). Reviewed by: markj MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-13 03:56:26 +00:00
Alexander Motin	80f0a89c62	Do not report error on close even if we have no paths left. MFC after: 2 weeks	2016-11-12 18:57:38 +00:00
Bryan Drewery	28323add09	Fix improper use of "its". Sponsored by: Dell EMC Isilon	2016-11-08 23:59:41 +00:00
Conrad Meyer	8532d381a9	Add BUF_TRACKING and FULL_BUF_TRACKING buffer debugging Upstream the BUF_TRACKING and FULL_BUF_TRACKING buffer debugging code. This can be handy in tracking down what code touched hung bios and bufs last. The full history is especially useful, but adds enough bloat that it shouldn't be enabled in release builds. Function names (or arbitrary string constants) are tracked in a fixed-size ring in bufs. Bios gain a pointer to the upper buf for tracking. SCSI CCBs gain a pointer to the upper bio for tracking. Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D8366	2016-10-31 23:09:52 +00:00
Ruslan Bukin	ae8b1f90fe	Fix alignment issues on MIPS: align the pointers properly. All the 5520 GEOM_ELI tests passed successfully on MIPS64EB. Sponsored by: DARPA, AFRL Sponsored by: HEIF5 Differential Revision: https://reviews.freebsd.org/D7905	2016-10-31 16:55:14 +00:00
Mark Johnston	5c2ac5cf2a	gmirror: Add a subroutine to free synchronization BIOs. This addresses a memory leak that occurs upon an I/O error during a mirror synchronization. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:08:40 +00:00
Mark Johnston	b450976dc2	gmirror: Release pending regular requests when synchronization stops. Normally gmirror allows colliding requests to proceed whenever a synchronization request completes and advances to the next offset. However if an I/O request collides with one of the final g_mirror_syncreqs, nothing releases it once synchronization completes, resulting in an apparent I/O hang. The same problem can occur if synchronization is aborted by an I/O error. Therefore, be sure to requeue pending requests when mirror synchronization is stopped for any reason. While here, remove some dead code from g_mirror_regular_release(). MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:02:30 +00:00
Alexander Motin	5a236b0ef9	Fix possible geom destruction before final provider close. Introduce internal counter to track opens. Using provider's counters is not very successfull after calling g_wither_provider(). MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-10-06 15:20:05 +00:00
Mark Johnston	4dea20be45	gmirror: Write an updated syncid before queuing writes. When a syncid bump is pending, any write to the mirror results in the updated syncid being written to each component's metadata block. However, the update was only being performed after the writes to the mirror componenents were queued. Instead, synchronously update the metadata block first. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:13:55 +00:00
Mark Johnston	903618cd65	gmirror: Bump the syncid if broken disks are found during startup. Consider a mirror with two components, m1 and m2. Suppose a hardware error results in the removal of m2, with m1's genid bumped. Suppose further that a replacement mirror component m3 is created and synchronized, after which the system is shut down uncleanly. During a subsequent bootup, if gmirror tastes m1 and m2 first, m2 will be removed from the mirror because it is broken, but the mirror will be started without bumping the syncid on m1 because all elements of the mirror are accounted for. Then m3 will be added to the already-running mirror with the same syncid as m1, so the components will not be synchronized despite the unclean shutdown. Handle this scenario by bumping the syncid of healthy components if any broken mirrors are discovered during mirror startup. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:05:45 +00:00
Mark Johnston	fff048e4bc	gmirror: Use bool instead of boolean_t. MFC after: 1 week Sponsored by: Dell EMC Isilon	2016-10-05 23:55:01 +00:00
Adrian Chadd	85ab1aeccf	[geom_redboot] Extend geom_redboot to handle non-zero fis offset. Submitted by: Mori Hiroki <yamori813@yahoo.co.jp> Differential Revision: https://reviews.freebsd.org/D7237	2016-10-04 16:35:38 +00:00
Alexander Motin	8b64f3ca6c	Use g_wither_provider() where applicable. It is just a helper function combining G_PF_WITHER setting with g_orphan_provider().	2016-09-23 21:29:40 +00:00
Edward Tomasz Napierala	0c4440c3aa	Follow up r305988 by removing g_bio_run_task and related code. The g_io_schedule_up() gets its "if" condition swapped to make it more similar to g_io_schedule_down(). Suggested by: mav@ Reviewed by: mav@ MFC after: 1 month	2016-09-20 09:18:33 +00:00
Edward Tomasz Napierala	bbdd6614bd	Remove unused bio_taskqueue(). MFC after: 1 month	2016-09-19 17:46:15 +00:00
Mark Johnston	4bfb585351	Don't treat an error from g_mirror_clear_metadata() as fatal. Such errors can occur as the result of a write error or because the disk backing the mirror element was removed. They result in a generation ID bump on all active elements of the mirror, so we can safely disconnect the mirror component rather than destroy it. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7750	2016-09-06 23:42:59 +00:00
Mark Johnston	40c5032d32	Add some fail points to gmirror. These are useful for testing changes to I/O error handling, and for reproducing existing bugs in a controlled manner. The fail points are g_mirror_regular_request_read g_mirror_regular_request_write g_mirror_sync_request_read g_mirror_sync_request_write g_mirror_metadata_write They all effectively allow one to inject an error value into the bio_error field of a corresponding BIO request as it is being completed. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-09-06 23:35:48 +00:00
Andrey V. Elsukov	0428336393	Do not invoke resize event if initial disk size is zero. Some disks report the size only after first opening. And due to the events are asynchronous, some consumers can receive this event too late and this confuses them. This partially restores previous behaviour, and at the same time this should fix the problem, when already opened provider loses resize event. PR: 211028 MFC after: 3 weeks	2016-08-01 20:54:54 +00:00
Andrey V. Elsukov	1f353a2315	Do not invoke resize method if geom is being withered. PR: 211028 MFC after: 2 weeks	2016-07-25 09:12:08 +00:00
Andrey V. Elsukov	f1ff88cf8c	Use g_resize_provider() to change the size of GEOM_DISK provider, when it is being opened. This should fix the possible loss of a resize event when disk capacity changed. PR: 211028 Reported by: Dexuan Cui <decui at microsoft dot com> MFC after: 3 weeks	2016-07-19 05:36:21 +00:00
Maxim Sobolev	55f9588af4	Relax checking if the privider size matches size recorded in the superblock, allowing provider to be bit bigger, i.e. have some extra padding after the FS image. That in some cases might be a side-effect of using CLOOP format which enforces certain block size and trying to compress image that is not exactly the number of those blocks in size. The UFS itself does not have any issues mounting such padded file systems, so it's what GEOM_LABEL should do. Submitted by: @mizhka_gmail.com Differential Revision: https://reviews.freebsd.org/D6208	2016-07-18 05:00:01 +00:00
Mark Johnston	7d31c3939a	Move some gmirror metadata update messages to a higher debug level. These can be printed quite frequently from a mostly-idle mirror, cluttering the console. MFC after: 1 week	2016-07-14 00:40:24 +00:00
Maxim Sobolev	74ba4047a3	1.Improve handling around last compressed block of the file, which is necessary because CLOOP format lacks explicit EOF or length, so that in the presence of padding or when the CLOOP is put onto a larger partition upper level provider size may be larger. Bound amount of extra data that we might touch to the max length of the compressed block and detect zero-padding in the last cluster, which when sector is all-zero might cause us to emit bogus I/O error after decompression of that fails. To not make code any more complicated that it needs to be deal with it in lazy-manner, i.e. when we first access that specific cluster. This change also fixes stupid mistake in the LZMA code, inherited from geom_lzma, which does not share length of the output buffer buffer with the decompression routine, so that in the presence of corrupted or purposedly tailored data may easily cause heap overflow and kernel memory corruption. Beef up validation of the CLOOP TOC by checking that lengths of all but the last compressed clusters match upper limit set by the decompressor and improve some error diagnostic output while I am here. 2.Add kern.geom.uzip.attach_to tunable to artifically limit attaching uzip to certain devices in the dev tree only. For example the following only makes us attaching to the GPT labels: kern.geom.uzip.attach_to="gpt/" 3.Add kern.geom.uzip.noattach_to, which does opposite to the (2) above, i.e. prevents geom_uzip from tasting / attaching to providers matching some pattern. By default we don't attach to our own kind, i.e. kern.geom.uzip.noattach_to=".uzip". It saves us quite some CPU cycles, esp on low-end embedded systems. Approved by: re (gjb) Differential Revision: https://reviews.freebsd.org/D7013	2016-06-29 18:19:05 +00:00
Kenneth D. Merry	a02e196edd	Switch geom_disk over to using a pool mutex. The GEOM disk d_mtx is only acquired on disk creation and destruction. It is a good candidate for replacement with a pool mutex. This eliminates the mutex initialization and teardown and the mutex and name variables themselves from struct disk. sys/geom/geom_disk.h: Take d_mtx and d_mtx_name out of struct disk. sys/geom/geom_disk.c: Use mtx_pool_lock() and mtx_pool_unlock() to guard the disk initialization state instead of a dedicated mutex. This allows removing the initialization and destruction of d_mtx. sys/sys/param.h: Bump __FreeBSD_version to 1100119 for the change to struct disk. Suggested by: jhb Sponsored by: Spectra Logic Approved by: re (gjb)	2016-06-23 20:05:59 +00:00
Mark Johnston	be20fc2e90	Do not complete pending gmirror BIOs when tearing down the provider. This will result in lock recursion and is more generally incorrect since the completion handlers will just reinsert the BIOs into the queue we're trying to drain. Reviewed by: imp, ngie Approved by: re (gjb) MFC after: 3 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D6908	2016-06-22 21:00:28 +00:00
Kenneth D. Merry	e5616d65d0	Fix a bug that caused da(4) peripheral drivers to not fully go away after the underlying device went away. The problem was that callers who queue the GEOM resize provider event didn't check to make sure that the provider had not been withered. For the other equivalent case, g_new_provider_event(), the code checks to see whether the provider has been withered before queueing a g_new_provider_event() to the event thread. In some cases, a resize provider event would come through after the provider had been withered and all of the existing consumers had been orphaned. When the resize event triggered a taste of the provider, that would attach a new consumer to the now withered provider. The wither washer (g_wither_washer() would never be able to completely tear down the GEOM because of the consumers that were hanging around. The solution was to check the G_PF_WITHER provider flag before queueing the g_resize_provider_event(), and add an assert to g_resize_provider_event() to insure that it isn't called on a withered provider. sys/geom/geom_subr.c: In g_resize_provider(), don't try to continue if the G_PF_WITHER flag is set. In g_resize_provider_event(), add an assert that the G_PF_WITHER flag is not set. In g_access(), if a provider has an error, print out the name of the provider with the error. Sponsored by: Spectra Logic Approved by: re (marius) MFC after: 3 days	2016-06-22 14:39:13 +00:00
Kenneth D. Merry	1ff824e786	Fix a bug that caused da(4) instances to hang around after the underlying device is gone. The problem was that when disk_gone() is called, if the GEOM disk creation process has not yet happened, the withering process couldn't start. We didn't record any state in the GEOM disk code, and so the d_gone() callback to the da(4) driver never happened. The solution is to track the state of the creation process, and initiate the withering process from g_disk_create() if the disk is being created. This change does add fields to struct disk, and so I have bumped DISK_VERSION. geom_disk.c: Track where we are in the disk creation process, and check to see whether our underlying disk has gone away or not. In disk_gone(), set a new d_goneflag variable that g_disk_create() can check to see if it needs to clean up the disk instance. geom_disk.h: Add a mutex to struct disk (for internal use) disk init level, and a gone flag. Bump DISK_VERSION because the size of struct disk has changed and fields have been added at the beginning. Sponsored by: Spectra Logic Approved by: re (marius)	2016-06-21 20:18:19 +00:00
Gleb Smirnoff	a7c5163b5f	When we are in panic, always go the asynchronous path in g_mirror_destroy(), otherwise the system will hang. This is a temporarily least intrusive crutch to get certain panicing systems dumping. The proper fix should question is g_mirror_destroy() should be called on a panicing system at all. Discussed with: mav	2016-06-01 22:11:54 +00:00
Alan Somers	151746b244	Avoid issuing spa config updates for physical path when not necessary ZFS's configuration needs to be updated whenever the physical path for a device changes, but not when a new device is introduced. This is because new devices necessarily cause config updates, but only if they are actually accepted into the pool. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c Split vdev_geom_set_physpath out of vdev_geom_attrchanged. When setting the vdev's physical path, only request a config update if the physical path has changed. Don't request it when opening a device for the first time, because the config sync will happen anyway upstack. sys/geom/geom_dev.c Split g_dev_set_physpath and g_dev_set_media out of g_dev_attrchanged Submitted by: will, asomers MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D6428	2016-05-27 22:32:44 +00:00
Konstantin Belousov	d5446cc8f4	Remove unneeded Giant locking around kthreads creation. Sponsored by: The FreeBSD Foundation	2016-05-20 08:28:11 +00:00
Konstantin Belousov	4e2732b550	Removal of Giant droping wrappers for GEOM classes. Sponsored by: The FreeBSD Foundation	2016-05-20 08:25:37 +00:00
Konstantin Belousov	dff9131e58	Remove asserts that Giant is not held on entrance into geom KPI, which outlived their usefulness. This allows to remove drop/pickup Giant wrappers around GEOM calls. Discussed with: alfred, imp, phk Sponsored by: The FreeBSD Foundation	2016-05-20 08:22:20 +00:00
Kenneth D. Merry	9a6844d55f	Add support for managing Shingled Magnetic Recording (SMR) drives. This change includes support for SCSI SMR drives (which conform to the Zoned Block Commands or ZBC spec) and ATA SMR drives (which conform to the Zoned ATA Command Set or ZAC spec) behind SAS expanders. This includes full management support through the GEOM BIO interface, and through a new userland utility, zonectl(8), and through camcontrol(8). This is now ready for filesystems to use to detect and manage zoned drives. (There is no work in progress that I know of to use this for ZFS or UFS, if anyone is interested, let me know and I may have some suggestions.) Also, improve ATA command passthrough and dispatch support, both via ATA and ATA passthrough over SCSI. Also, add support to camcontrol(8) for the ATA Extended Power Conditions feature set. You can now manage ATA device power states, and set various idle time thresholds for a drive to enter lower power states. Note that this change cannot be MFCed in full, because it depends on changes to the struct bio API that break compatilibity. In order to avoid breaking the stable API, only changes that don't touch or depend on the struct bio changes can be merged. For example, the camcontrol(8) changes don't depend on the new bio API, but zonectl(8) and the probe changes to the da(4) and ada(4) drivers do depend on it. Also note that the SMR changes have not yet been tested with an actual SCSI ZBC device, or a SCSI to ATA translation layer (SAT) that supports ZBC to ZAC translation. I have not yet gotten a suitable drive or SAT layer, so any testing help would be appreciated. These changes have been tested with Seagate Host Aware SATA drives attached to both SAS and SATA controllers. Also, I do not have any SATA Host Managed devices, and I suspect that it may take additional (hopefully minor) changes to support them. Thanks to Seagate for supplying the test hardware and answering questions. sbin/camcontrol/Makefile: Add epc.c and zone.c. sbin/camcontrol/camcontrol.8: Document the zone and epc subcommands. sbin/camcontrol/camcontrol.c: Add the zone and epc subcommands. Add auxiliary register support to build_ata_cmd(). Make sure to set the CAM_ATAIO_NEEDRESULT, CAM_ATAIO_DMA, and CAM_ATAIO_FPDMA flags as appropriate for ATA commands. Add a new get_ata_status() function to parse ATA result from SCSI sense descriptors (for ATA passthrough over SCSI) and ATA I/O requests. sbin/camcontrol/camcontrol.h: Update the build_ata_cmd() prototype Add get_ata_status(), zone(), and epc(). sbin/camcontrol/epc.c: Support for ATA Extended Power Conditions features. This includes support for all features documented in the ACS-4 Revision 12 specification from t13.org (dated February 18, 2016). The EPC feature set allows putting a drive into a power power mode immediately, or setting timeouts so that the drive will automatically enter progressively lower power states after various idle times. sbin/camcontrol/fwdownload.c: Update the firmware download code for the new build_ata_cmd() arguments. sbin/camcontrol/zone.c: Implement support for Shingled Magnetic Recording (SMR) drives via SCSI Zoned Block Commands (ZBC) and ATA Zoned Device ATA Command Set (ZAC). These specs were developed in concert, and are functionally identical. The primary differences are due to SCSI and ATA differences. (SCSI is big endian, ATA is little endian, for example.) This includes support for all commands defined in the ZBC and ZAC specs. sys/cam/ata/ata_all.c: Decode a number of additional ATA command names in ata_op_string(). Add a new CCB building function, ata_read_log(). Add ata_zac_mgmt_in() and ata_zac_mgmt_out() CCB building functions. These support both DMA and NCQ encapsulation. sys/cam/ata/ata_all.h: Add prototypes for ata_read_log(), ata_zac_mgmt_out(), and ata_zac_mgmt_in(). sys/cam/ata/ata_da.c: Revamp the ada(4) driver to support zoned devices. Add four new probe states to gather information needed for zone support. Add a new adasetflags() function to avoid duplication of large blocks of flag setting between the async handler and register functions. Add new sysctl variables that describe zone support and paramters. Add support for the new BIO_ZONE bio, and all of its subcommands: DISK_ZONE_OPEN, DISK_ZONE_CLOSE, DISK_ZONE_FINISH, DISK_ZONE_RWP, DISK_ZONE_REPORT_ZONES, and DISK_ZONE_GET_PARAMS. sys/cam/scsi/scsi_all.c: Add command descriptions for the ZBC IN/OUT commands. Add descriptions for ZBC Host Managed devices. Add a new function, scsi_ata_pass() to do ATA passthrough over SCSI. This will eventually replace scsi_ata_pass_16() -- it can create the 12, 16, and 32-byte variants of the ATA PASS-THROUGH command, and supports setting all of the registers defined as of SAT-4, Revision 5 (March 11, 2016). Change scsi_ata_identify() to use scsi_ata_pass() instead of scsi_ata_pass_16(). Add a new scsi_ata_read_log() function to facilitate reading ATA logs via SCSI. sys/cam/scsi/scsi_all.h: Add the new ATA PASS-THROUGH(32) command CDB. Add extended and variable CDB opcodes. Add Zoned Block Device Characteristics VPD page. Add ATA Return SCSI sense descriptor. Add prototypes for scsi_ata_read_log() and scsi_ata_pass(). sys/cam/scsi/scsi_da.c: Revamp the da(4) driver to support zoned devices. Add five new probe states, four of which are needed for ATA devices. Add five new sysctl variables that describe zone support and parameters. The da(4) driver supports SCSI ZBC devices, as well as ATA ZAC devices when they are attached via a SCSI to ATA Translation (SAT) layer. Since ZBC -> ZAC translation is a new feature in the T10 SAT-4 spec, most SATA drives will be supported via ATA commands sent via the SCSI ATA PASS-THROUGH command. The da(4) driver will prefer the ZBC interface, if it is available, for performance reasons, but will use the ATA PASS-THROUGH interface to the ZAC command set if the SAT layer doesn't support translation yet. As I mentioned above, ZBC command support is untested. Add support for the new BIO_ZONE bio, and all of its subcommands: DISK_ZONE_OPEN, DISK_ZONE_CLOSE, DISK_ZONE_FINISH, DISK_ZONE_RWP, DISK_ZONE_REPORT_ZONES, and DISK_ZONE_GET_PARAMS. Add scsi_zbc_in() and scsi_zbc_out() CCB building functions. Add scsi_ata_zac_mgmt_out() and scsi_ata_zac_mgmt_in() CCB/CDB building functions. Note that these have return values, unlike almost all other CCB building functions in CAM. The reason is that they can fail, depending upon the particular combination of input parameters. The primary failure case is if the user wants NCQ, but fails to specify additional CDB storage. NCQ requires using the 32-byte version of the SCSI ATA PASS-THROUGH command, and the current CAM CDB size is 16 bytes. sys/cam/scsi/scsi_da.h: Add ZBC IN and ZBC OUT CDBs and opcodes. Add SCSI Report Zones data structures. Add scsi_zbc_in(), scsi_zbc_out(), scsi_ata_zac_mgmt_out(), and scsi_ata_zac_mgmt_in() prototypes. sys/dev/ahci/ahci.c: Fix SEND / RECEIVE FPDMA QUEUED in the ahci(4) driver. ahci_setup_fis() previously set the top bits of the sector count register in the FIS to 0 for FPDMA commands. This is okay for read and write, because the PRIO field is in the only thing in those bits, and we don't implement that further up the stack. But, for SEND and RECEIVE FPDMA QUEUED, the subcommand is in that byte, so it needs to be transmitted to the drive. In ahci_setup_fis(), always set the the top 8 bits of the sector count register. We need it in both the standard and NCQ / FPDMA cases. sys/geom/eli/g_eli.c: Pass BIO_ZONE commands through the GELI class. sys/geom/geom.h: Add g_io_zonecmd() prototype. sys/geom/geom_dev.c: Add new DIOCZONECMD ioctl, which allows sending zone commands to disks. sys/geom/geom_disk.c: Add support for BIO_ZONE commands. sys/geom/geom_disk.h: Add a new flag, DISKFLAG_CANZONE, that indicates that a given GEOM disk client can handle BIO_ZONE commands. sys/geom/geom_io.c: Add a new function, g_io_zonecmd(), that handles execution of BIO_ZONE commands. Add permissions check for BIO_ZONE commands. Add command decoding for BIO_ZONE commands. sys/geom/geom_subr.c: Add DDB command decoding for BIO_ZONE commands. sys/kern/subr_devstat.c: Record statistics for REPORT ZONES commands. Note that the number of bytes transferred for REPORT ZONES won't quite match what is received from the harware. This is because we're necessarily counting bytes coming from the da(4) / ada(4) drivers, which are using the disk_zone.h interface to communicate up the stack. The structure sizes it uses are slightly different than the SCSI and ATA structure sizes. sys/sys/ata.h: Add many bit and structure definitions for ZAC, NCQ, and EPC command support. sys/sys/bio.h: Convert the bio_cmd field to a straight enumeration. This will yield more space for additional commands in the future. After change r297955 and other related changes, this is now possible. Converting to an enumeration will also prevent use as a bitmask in the future. sys/sys/disk.h: Define the DIOCZONECMD ioctl. sys/sys/disk_zone.h: Add a new API for managing zoned disks. This is very close to the SCSI ZBC and ATA ZAC standards, but uses integers in native byte order instead of big endian (SCSI) or little endian (ATA) byte arrays. This is intended to offer to the complete feature set of the ZBC and ZAC disk management without requiring the application developer to include SCSI or ATA headers. We also use one set of headers for ioctl consumers and kernel bio-level consumers. sys/sys/param.h: Bump __FreeBSD_version for sys/bio.h command changes, and inclusion of SMR support. usr.sbin/Makefile: Add the zonectl utility. usr.sbin/diskinfo/diskinfo.c Add disk zoning capability to the 'diskinfo -v' output. usr.sbin/zonectl/Makefile: Add zonectl makefile. usr.sbin/zonectl/zonectl.8 zonectl(8) man page. usr.sbin/zonectl/zonectl.c The zonectl(8) utility. This allows managing SCSI or ATA zoned disks via the disk_zone.h API. You can report zones, reset write pointers, get parameters, etc. Sponsored by: Spectra Logic Differential Revision: https://reviews.freebsd.org/D6147 Reviewed by: wblock (documentation)	2016-05-19 14:08:36 +00:00
John Baldwin	fdce57a042	Add an EARLY_AP_STARTUP option to start APs earlier during boot. Currently, Application Processors (non-boot CPUs) are started by MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until SI_SUB_SMP at which point they are released to run kernel threads. SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter the scheduler and start running threads until fairly late in the boot. This change moves SI_SUB_SMP up to just before software interrupt threads are created allowing the APs to start executing kernel threads much sooner (before any devices are probed). This allows several initialization routines that need to perform initialization on all CPUs to now perform that initialization in one step rather than having to defer the AP initialization to a second SYSINIT run at SI_SUB_SMP. It also permits all CPUs to be available for handling interrupts before any devices are probed. This last feature fixes a problem on with interrupt vector exhaustion. Specifically, in the old model all device interrupts were routed onto the boot CPU during boot. Later after the APs were released at SI_SUB_SMP, interrupts were redistributed across all CPUs. However, several drivers for multiqueue hardware allocate N interrupts per CPU in the system. In a system with many CPUs, just a few drivers doing this could exhaust the available pool of interrupt vectors on the boot CPU as each driver was allocating N * mp_ncpu vectors on the boot CPU. Now, drivers will allocate interrupts on their desired CPUs during boot meaning that only N interrupts are allocated from the boot CPU instead of N * mp_ncpu. Some other bits of code can also be simplified as smp_started is now true much earlier and will now always be true for these bits of code. This removes the need to treat the single-CPU boot environment as a special case. As a transition aid, the new behavior is available under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I plan to enable this on x86 by default in a followup commit in the next few days and to have all platforms moved over before 11.0. Once the transition is complete, the option will be removed along with the !EARLY_AP_STARTUP code. These changes have only been tested on x86. Other platform maintainers are encouraged to port their architectures over as well. The main things to check for are any uses of smp_started in MD code that can be simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in the EARLY_AP_STARTUP case (e.g. the interrupt shuffling). PR: kern/199321 Reviewed by: markj, gnn, kib Sponsored by: Netflix	2016-05-14 18:22:52 +00:00
Maxim Sobolev	e3d7ead7df	Add missing include "opt_geom.h" to make GEOM_UZIP_DEBUG option working, also rename enum member so it does not conflict with GEOM_UZIP option name. Submitted by: mizhka@gmail.com Differential Revision: https://reviews.freebsd.org/D6207	2016-05-06 20:32:39 +00:00
Pedro F. Giffuni	4ed3c0e713	sys: Make use of our rounddown() macro when sys/param.h is available. No functional change.	2016-04-30 14:41:18 +00:00
Pedro F. Giffuni	e8d5712284	sys/geom: spelling fixes in comments. No functional change.	2016-04-29 20:56:58 +00:00
Pedro F. Giffuni	310aef3257	sys/geom: spelling fixes. These affect debugging messages. MFC after: 2 weeks	2016-04-28 19:26:46 +00:00
Pedro F. Giffuni	b99bce73e2	geom: unsign some types to match their definitions and avoid overflows. In struct:gctl_req, nargs is unsigned. In mirror: g_mirror_syncreqs is unsigned. In raid: in struct:g_raid_volume, v_disks_count is unsigned. In virstor: in struct:g_virstor_softc, n_components is unsigned. MFC after: 2 weeks	2016-04-27 15:10:40 +00:00
Conrad Meyer	4a2776e538	g_part_bsd64: Delete duplicate/dead code RAW_PART is handled earlier in the loop. Reported by: Coverity CID: 1223201 Sponsored by: EMC / Isilon Storage Division	2016-04-26 22:32:33 +00:00
Conrad Meyer	5ad33e776f	g_part_bsd64: Check for valid on-disk npartitions value This value is u32 on disk, but assigned to an int in memory. After we do the implicit conversion via assignment, check that the result is at least one[1] (non-negative[2]). 1. The subsequent for-loop iterates from gpt_entries minus one, down, until reaching zero. A negative or zero initial index results in undefined signed integer overflow. 2. It is also used to index into arrays later. In practice, we expected non-malicious disks to contain small positive values. Reported by: Coverity CID: 1223202 Sponsored by: EMC / Isilon Storage Division	2016-04-26 22:30:54 +00:00
Pedro F. Giffuni	55e0987aea	sys: extend use of the howmany() macro when available. We have a howmany() macro in the <sys/param.h> header that is convenient to re-use as it makes things easier to read.	2016-04-26 15:38:17 +00:00
Maxim Sobolev	f260c3eadc	Relax TOC offsets checking somewhat, allowing offset pointing to the next byte past EOF to denote zero-block(s) at the very end of the file.	2016-04-26 06:50:38 +00:00
Maxim Sobolev	416ee66e25	o Fix handling of images with compression block sizes comparable to MAXPHYS. o Improve debug somewhat; o Convert "BUG BUG BUG message" into a proper KASSERT.	2016-04-23 06:31:46 +00:00
Alan Somers	1c2c346f09	DRY on buffer sizes. Update to r298420. sys/geom/geom_disk.c: In disk_attr_changed, don't repeat a buffer size. Reported by: ngie, hselasky MFC after: 4 weeks X-MFC-With: 298420 Sponsored by: Spectra Logic Corp	2016-04-21 21:13:41 +00:00
Pedro F. Giffuni	d9c9c81c08	sys: use our roundup2/rounddown2() macros when param.h is available. rounddown2 tends to produce longer lines than the original code and when the code has a high indentation level it was not really advantageous to do the replacement. This tries to strike a balance between readability using the macros and flexibility of having the expressions, so not everything is converted.	2016-04-21 19:57:40 +00:00
Alan Somers	42f42c9942	Notify userspace listeners when geom disk attributes have changed sys/geom/geom_disk.c: disk_attr_changed(): Generate a devctl event of type GEOM:<attr> for every call. MFC after: 4 weeks Sponsored by: Spectra Logic Corp Differential Revision: https://reviews.freebsd.org/D5952	2016-04-21 16:43:15 +00:00
Pedro F. Giffuni	63b6b7a74a	Indentation issues. Contract some lines leftover from r298310. Mea culpa.	2016-04-20 16:19:44 +00:00
Pedro F. Giffuni	02abd40029	kernel: use our nitems() macro when it is available through param.h. No functional change, only trivial cases are done in this sweep, Discussed in: freebsd-current	2016-04-19 23:48:27 +00:00
Pedro F. Giffuni	01b5c6f73e	g_gate: for pointers replace 0 with NULL. These are mostly cosmetical, no functional change. Found with devel/coccinelle.	2016-04-15 16:18:07 +00:00
Warner Losh	9a8fa125c1	Bump bio_cmd and bio_*flags from 8 bits to 16. Differential Revision: https://reviews.freebsd.org/D5784	2016-04-14 05:10:41 +00:00
Pedro F. Giffuni	74b8d63dcc	Cleanup unnecessary semicolons from the kernel. Found with devel/coccinelle.	2016-04-10 23:07:00 +00:00
Allan Jude	d873662594	Create the GELIBOOT GEOM_ELI flag This flag indicates that the user wishes to use the GELIBOOT feature to boot from a fully encrypted root file system. Currently, GELIBOOT does not support key files, and in the future when it does, they will be loaded differently. Due to the design of GELI, and the desire for secrecy, the GELI metadata does not know if key files are used or not, it just adds the key material (if any) to the HMAC before the optional passphrase, so there is no way to tell if a GELI partition requires key files or not. Since the GELIBOOT code in boot2 and the loader does not support keys, they will now only attempt to attach if this flag is set. This will stop GELIBOOT from prompting for passwords to GELIs that it cannot decrypt, disrupting the boot process PR: 208251 Reviewed by: ed, oshogbo, wblock Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D5867	2016-04-08 01:25:25 +00:00
Pedro F. Giffuni	21ff1f7469	g_sched_destroy(): prevent return of uninitialized scalar variable. For the !gsp case there some chance of returning an uninitialized return value. Prevent that from happening by initializing the error value. CID: 1006421	2016-04-03 16:25:51 +00:00
Warner Losh	ca19dfe480	Don't assume that bio_cmd is a bit mask. Differential Revision: https://reviews.freebsd.org/D5592	2016-03-10 06:25:39 +00:00
Warner Losh	8076d204da	Don't assume that bio_cmd is bit mask. Differential Revision: https://reviews.freebsd.org/D5593	2016-03-10 06:25:31 +00:00
Adrian Chadd	443a0f85dd	Fixes to make it compile under gcc-4.2.	2016-02-24 02:52:49 +00:00
Maxim Sobolev	5497acc527	Obsolete mkulzma(8) and geom_uncompress(4), their functionality is now provided by mkuzip(8) and geom_uzip(4) respectively. MFC after: 1 month	2016-02-24 00:39:36 +00:00
Maxim Sobolev	8f8cb840b0	Improve mkuzip(8) and geom_uzip(4), merge in LZMA support from mkulzma(8) and geom_uncompress(4): 1. mkuzip(8): - Proper support for eliminating all-zero blocks when compressing an image. This feature is already supported by the geom_uzip(4) module and CLOOP format in general, so it's just a matter of making mkuzip(8) match. It should be noted, however that this feature while it sounds great, results in very slight improvement in the overall compression ratio, since compressing default 16k all-zero block produces only 39 bytes compressed output block, which is 99.8% compression ratio. With typical average compression ratio of amd64 binaries and data being around 60-70% the difference between 99.8% and 100.0% is not that great further diluted by the ratio of number of zero blocks in the uncompressed image to the overall number of blocks being less than 0.5 (typically). However, this may be important from performance standpoint, so that kernel are not spinning its wheels decompressing those empty blocks every time this zero region is read. It could also be important when you create huge image mostly filled with zero blocks for testing purposes. - New feature allowing to de-duplicate output image. It turns out that if you twist CLOOP format a bit you can do that as well. And unlike zero-blocks elimination, this gives a noticeable improvement in the overall compression ratio, reducing output image by something like 3-4% on my test UFS2 3GB image consisting of full FreeBSD base system plus some of the packages (openjdk, apache etc), about 2.3GB worth of file data (800+MB compressed). The only caveat is that images created with this feature "on" would not work on older versions of FeeBSDxi kernel, hence it's turned off by default. - provide options to control both features and document them in manual page. - merge in all relevant LZMA compression support from the mkulzma(8), add new option to select between both. - switch license from ad-hoc beerware into standard 2-clause BSD. 2. geom_uzip(4): - implement support for de-duplicated images; - optimize some code paths to handle "all-zero" blocks without reading any compressed data; - beef up manual page to explain that geom_uzip(4) is not limited only to md(4) images. The compressed data can be written to the block device and accessed directly via magic of GEOM(4) and devfs(4), including to mount root fs from a compressed drive. - convert debug log code from being compiled in conditionally into being present all the time and provide two sysctls to turn it on or off. Due to intended use of the module, it can be used in environments where there may not be a luxury to put new kernel with debug code enabled. Having those options handy allows debug issues without as much problem by just having access to serial console or network shell access to a box/appliance. The resulting additional CPU cycles are just few int comparisons and branches, and those are minuscule when compared to data decompression which is the main feature of the module. - hopefully improve robustness and resiliency of the geom_uzip(4) by performing some of the data validation / range checking on the TOC entries and rejecting to attach to an image if those checks fail. - merge in all relevant LZMA decompression support from the geom_uncompress(4), enable automatically when appropriate format is indicated in the header. - move compilation work into its own worker thread so that it does not clog g_up. This allows multiple instances work in parallel utilizing smp cores. - document new knobs in the manual page. Reviewed by: adrian MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D5333	2016-02-23 23:59:08 +00:00
Warner Losh	bd4c1dd6d6	Use the right size for zeroing. Submitted by: rpokala@	2016-02-17 18:28:38 +00:00
Warner Losh	c55f57071a	Create an API to reset a struct bio (g_reset_bio). This is mandatory for all struct bio you get back from g_{new,alloc}_bio. Temporary bios that you create on the stack or elsewhere should use this before first use of the bio, and between uses of the bio. At the moment, it is nothing more than a wrapper around bzero, but that may change in the future. The wrapper also removes one place where we encode the size of struct bio in the KBI.	2016-02-17 17:16:02 +00:00
Adrian Chadd	61789a9a76	Teach the flashmap code about the SPI flash. PR: kern/206227 Submitted by: Stanislav Galabov <sgalabov@gmail.com>	2016-01-23 05:26:29 +00:00
Ravi Pokala	cb03a5029b	Add rotationrate to geom disk dumpconf Parse and report the nominal rotation rate reported by the drive. Reviewed by: sbruno, jhb Approved by: jhb MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D4483 Requested by: Kevin Bowling < kevin.bowling @ kev009.com >	2016-01-14 21:52:21 +00:00
Allan Jude	4332feca4b	Make additional parts of sys/geom/eli more usable in userspace The upcoming GELI support in the loader reuses parts of this code Some ifdefs are added, and some code is moved outside of existing ifdefs The HMAC parts of GELI are broken out into their own file, to separate them from the kernel crypto/openssl dependant parts that are replaced in the boot code. Passed the GELI regression suite (tools/regression/geom/eli) Files=20 Tests=14996 Result: PASS Reviewed by: pjd, delphij MFC after: 1 week Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D4699	2016-01-07 05:47:34 +00:00
Allan Jude	9c0c355f2a	Add some additional GPT partition types 4 ChromeOS GPT types 2 Microsoft partition types the new OpenBSD partition type Approved by: marcel (mentor) MFC after: 1 week Relnotes: yes Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D3841	2015-12-27 18:12:13 +00:00
Allan Jude	7a3f5d11fb	Replace sys/crypto/sha2/sha2.c with lib/libmd/sha512c.c cperciva's libmd implementation is 5-30% faster The same was done for SHA256 previously in r263218 cperciva's implementation was lacking SHA-384 which I implemented, validated against OpenSSL and the NIST documentation Extend sbin/md5 to create sha384(1) Chase dependancies on sys/crypto/sha2/sha2.{c,h} and replace them with sha512{c.c,.h} Reviewed by: cperciva, des, delphij Approved by: secteam, bapt (mentor) MFC after: 2 weeks Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D3929	2015-12-27 17:33:59 +00:00
Allan Jude	1747e1d875	Fix incorrect error message in geom map If geom_map fails to find the end of a mapped partition based on a search, it would return the incorrect error message, stating it could not parse the START value Reviewed by: adrian Approved by: bapt (mentor) Sponsored by: ScaleEngine Inc. Differential Revision: https://reviews.freebsd.org/D4187	2015-12-27 17:09:23 +00:00
Warner Losh	268f69f40b	It turns out that it's OK to sleep in this context, so use M_WAITOK for the softc for the delay module. Noticed by: rpokala@	2015-12-18 14:10:00 +00:00
Warner Losh	6a607537da	Scheduling module to introduce a fixed delay into the I/O path.	2015-12-18 05:39:25 +00:00
Steven Hartland	25080ac4d4	Prevent g_access calls to bad multipath members When a multipath member is orphaned its access members are zeroed before its removed if marked for wither, so prevent any future calls to g_access on such members. This prevents a panic on debug kernels which validates the resultant values aren't negative. Reviewed by: mav MFC after: 2 weeks Sponsored by: Multiplay Differential Revision: https://reviews.freebsd.org/D4416	2015-12-15 21:11:41 +00:00
Andrey V. Elsukov	af90a87209	Make detection of GPT a bit more reliable. When we are detecting a partition table and didn't find PMBR, try to read backup GPT header from the last sector and if it is correct, assume that we have GPT. Reviewed by: rpokala MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D4282	2015-12-10 10:35:07 +00:00
Kenneth D. Merry	985108aeb1	Fix a style issue in g_disk_limit(). Noticed by: bdrewery MFC after: 1 week	2015-12-04 03:44:12 +00:00
Kenneth D. Merry	42fbdde413	Fix g_disk_vlist_limit() to work properly with deletes. Add a new bp argument to g_disk_maxsegs(), and add a new function, g_disk_maxsize() tha will properly determine the maximum I/O size for a delete or non-delete bio. Submitted by: will MFC after: 1 week Sponsored by: Spectra Logic	2015-12-04 03:38:35 +00:00

... 3 4 5 6 7 ...

2438 Commits