freebsd-skq

Author	SHA1	Message	Date
markj	67a31ca9fc	Synchronize unclean mirrors before adding them to a running gmirror. During gmirror startup, if component mirrors are found to be dirty as is typical after a system crash, the mirrors are synchronized to the mirror with highest priority. However if a gmirror starts without all of its mirrors present, for example because of some transient delays during tasting, the remaining mirrors must be synchronized before they may become active. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-05-02 23:29:42 +00:00
markj	cf195e3f52	Rename two gmirror state flags to make their meanings slightly clearer. No functional change. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:13:57 +00:00
markj	61b9e5a5da	Don't set the mirror GEOM softc to NULL in g_mirror_destroy(). At this point we have not rendezvous'ed with the mirror worker thread, and I/O may still be in flight. Various I/O completion paths expect to be able to obtain a reference to the mirror softc from the GEOM, so setting it to NULL may result in various NULL pointer dereferences if the mirror is stopped with -f or the kernel is shut down while a mirror is synchronizing. The worker thread will clear the softc pointer before exiting. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:08:37 +00:00
markj	63b619ac47	Check for a provider error before enqueuing mirror I/O. We are otherwise susceptible to a race with a concurrent teardown of the mirror provider, causing the I/O to be left uncompleted after the mirror started withering. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:03:32 +00:00
markj	cc56dafb0b	Stop mirror synchronization before draining the I/O queue. Regular I/O requests may be blocked by concurrent synchronization requests targeted to the same LBAs, in which case they are moved to a holding queue until the conflicting I/O completes. We therefore want to stop synchronization before completing pending I/O in g_mirror_destroy_provider() since this ensures that blocked I/O requests are completed as well. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 16:54:50 +00:00
markj	cb88fec38a	Handle NULL entries in gmirror disk ds_bios arrays. Entries may be removed and freed if an I/O error occurs during mirror synchronization, so we cannot assume that all entries of ds_bios are valid. Also ensure that a synchronization BIO's array index is preserved after a successful write. Reported and tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-10 17:15:59 +00:00
markj	15ebdcc4bf	Avoid sleeping when the mirror I/O queue is non-empty. A request may be queued while the queue lock is dropped when the mirror is being destroyed. The corresponding wakeup would be lost, possibly resulting in an apparent hang of the mirror worker thread. Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-03-29 19:39:07 +00:00
markj	f1678a8682	Remove an unneeded g_mirror_destroy_provider() call. The worker thread will destroy the mirror provider as part of its teardown sequence. The call made sense in the initial revision of gmirror, but became unnecessary in r137248. Tested by: pho (part of a larger diff) MFC afteR: 2 weeks Sponsored by: Dell EMC Isilon	2017-03-29 19:30:22 +00:00
markj	4740c5564a	Refine r301173 a bit. - Don't execute any of g_mirror_shutdown_post_sync() when panicking. We cannot safely idle the mirror or stop synchronization in that state, and the current attempts to do so complicate debugging of gmirror itself. - Check for a non-NULL panicstr instead of using SCHEDULER_STOPPED(). The latter was added for use in the locking primitives. Reviewed by: mav, pjd MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-03-27 16:25:58 +00:00
mav	7acb1dbb48	Add `gmirror create` subcommand, alike to gstripe, gconcat, etc. It is quite specific mode of operation without storing on-disk metadata. It can be useful in some cases in combination with some external control tools handling mirror creation and disks hot-plug. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-30 09:27:08 +00:00
mav	e6c44edd72	Use providergone method to cover race between destroy and g_access(). Reviewed by: markj MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-13 03:56:26 +00:00
markj	645711f058	gmirror: Add a subroutine to free synchronization BIOs. This addresses a memory leak that occurs upon an I/O error during a mirror synchronization. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:08:40 +00:00
markj	18cd64ee0d	gmirror: Release pending regular requests when synchronization stops. Normally gmirror allows colliding requests to proceed whenever a synchronization request completes and advances to the next offset. However if an I/O request collides with one of the final g_mirror_syncreqs, nothing releases it once synchronization completes, resulting in an apparent I/O hang. The same problem can occur if synchronization is aborted by an I/O error. Therefore, be sure to requeue pending requests when mirror synchronization is stopped for any reason. While here, remove some dead code from g_mirror_regular_release(). MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:02:30 +00:00
mav	d1e2d58d18	Fix possible geom destruction before final provider close. Introduce internal counter to track opens. Using provider's counters is not very successfull after calling g_wither_provider(). MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-10-06 15:20:05 +00:00
markj	8c59e4b664	gmirror: Write an updated syncid before queuing writes. When a syncid bump is pending, any write to the mirror results in the updated syncid being written to each component's metadata block. However, the update was only being performed after the writes to the mirror componenents were queued. Instead, synchronously update the metadata block first. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:13:55 +00:00
markj	780b5bd00e	gmirror: Bump the syncid if broken disks are found during startup. Consider a mirror with two components, m1 and m2. Suppose a hardware error results in the removal of m2, with m1's genid bumped. Suppose further that a replacement mirror component m3 is created and synchronized, after which the system is shut down uncleanly. During a subsequent bootup, if gmirror tastes m1 and m2 first, m2 will be removed from the mirror because it is broken, but the mirror will be started without bumping the syncid on m1 because all elements of the mirror are accounted for. Then m3 will be added to the already-running mirror with the same syncid as m1, so the components will not be synchronized despite the unclean shutdown. Handle this scenario by bumping the syncid of healthy components if any broken mirrors are discovered during mirror startup. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:05:45 +00:00
markj	33afbd593e	gmirror: Use bool instead of boolean_t. MFC after: 1 week Sponsored by: Dell EMC Isilon	2016-10-05 23:55:01 +00:00
mav	990a95381b	Use g_wither_provider() where applicable. It is just a helper function combining G_PF_WITHER setting with g_orphan_provider().	2016-09-23 21:29:40 +00:00
markj	866957573f	Don't treat an error from g_mirror_clear_metadata() as fatal. Such errors can occur as the result of a write error or because the disk backing the mirror element was removed. They result in a generation ID bump on all active elements of the mirror, so we can safely disconnect the mirror component rather than destroy it. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7750	2016-09-06 23:42:59 +00:00
markj	0d7f42693b	Add some fail points to gmirror. These are useful for testing changes to I/O error handling, and for reproducing existing bugs in a controlled manner. The fail points are g_mirror_regular_request_read g_mirror_regular_request_write g_mirror_sync_request_read g_mirror_sync_request_write g_mirror_metadata_write They all effectively allow one to inject an error value into the bio_error field of a corresponding BIO request as it is being completed. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-09-06 23:35:48 +00:00
markj	32d16ca9fb	Move some gmirror metadata update messages to a higher debug level. These can be printed quite frequently from a mostly-idle mirror, cluttering the console. MFC after: 1 week	2016-07-14 00:40:24 +00:00
markj	a8b2bf9492	Do not complete pending gmirror BIOs when tearing down the provider. This will result in lock recursion and is more generally incorrect since the completion handlers will just reinsert the BIOs into the queue we're trying to drain. Reviewed by: imp, ngie Approved by: re (gjb) MFC after: 3 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D6908	2016-06-22 21:00:28 +00:00
glebius	7461cb05f9	When we are in panic, always go the asynchronous path in g_mirror_destroy(), otherwise the system will hang. This is a temporarily least intrusive crutch to get certain panicing systems dumping. The proper fix should question is g_mirror_destroy() should be called on a panicing system at all. Discussed with: mav	2016-06-01 22:11:54 +00:00
kib	f05c84067d	Removal of Giant droping wrappers for GEOM classes. Sponsored by: The FreeBSD Foundation	2016-05-20 08:25:37 +00:00
pfg	fafa173c28	sys/geom: spelling fixes in comments. No functional change.	2016-04-29 20:56:58 +00:00
pfg	863c16cbbd	geom: unsign some types to match their definitions and avoid overflows. In struct:gctl_req, nargs is unsigned. In mirror: g_mirror_syncreqs is unsigned. In raid: in struct:g_raid_volume, v_disks_count is unsigned. In virstor: in struct:g_virstor_softc, n_components is unsigned. MFC after: 2 weeks	2016-04-27 15:10:40 +00:00
imp	19f97c184f	Bump bio_cmd and bio_*flags from 8 bits to 16. Differential Revision: https://reviews.freebsd.org/D5784	2016-04-14 05:10:41 +00:00
imp	0bfb5dbc86	Create an API to reset a struct bio (g_reset_bio). This is mandatory for all struct bio you get back from g_{new,alloc}_bio. Temporary bios that you create on the stack or elsewhere should use this before first use of the bio, and between uses of the bio. At the moment, it is nothing more than a wrapper around bzero, but that may change in the future. The wrapper also removes one place where we encode the size of struct bio in the KBI.	2016-02-17 17:16:02 +00:00
jkim	318c4f97e6	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
mav	176aa76341	Remove extra semicolon. MFC after: 1 week	2015-03-27 12:45:20 +00:00
mav	3e52cc4fb2	Remove request sorting from GEOM_MIRROR and GEOM_RAID. When CPU is not busy, those queues are typically empty. When CPU is busy, then one more extra sorting is the last thing it needs. If specific device (HDD) really needs sorting, then it will be done later by CAM. This supposed to fix livelock reported for mirror of two SSDs, when UFS fires zillion of BIO_DELETE requests, that totally blocks I/O subsystem by pointless sorting of requests and responses under single mutex lock. MFC after: 2 weeks	2015-03-27 12:44:28 +00:00
mav	89133d309e	Fix bug on memory allocation error in split method. While there, use bioq_takefirst() in place where it is convenient. MFC after: 1 week	2015-03-27 11:14:12 +00:00
mav	19bf0d882b	Fix couple BIO_DELETE bugs in geom_mirror. Do not report GEOM::candelete if none of providers support BIO_DELETE. If consumer still requests BIO_DELETE, report error instead of hanging. MFC after: 2 weeks	2015-03-12 10:20:53 +00:00
hselasky	35b126e324	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
gjb	fc21f40567	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
hselasky	bd1ed65f0f	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
bdrewery	2eab8fff0d	Show error code when failing to destroy a mirror on delay Sponsored by: EMC / Isilon Storage Division MFC after: 2 weeks	2014-04-05 03:01:29 +00:00
ae	150bba5c8f	Add an ability to stop gmirror and clear its metadata in one command. This fixes the problem, when gmirror starts again just after stop. The problem occurs when gmirror's component has geom label with equal size. E.g. gpt and gptid have the same size as partition, diskid has the same size as entire disk. When gmirror's geom has been destroyed, glabel creates its providers and this initiate retaste. Now "gmirror destroy" command is available. It destroys geom and also erases gmirror's metadata. MFC after: 2 weeks	2013-12-27 02:43:53 +00:00
ae	86d162fb60	Prevent users from deactivating the last component of a mirror. PR: 184985 MFC after: 1 week	2013-12-19 22:13:12 +00:00
ae	0b2e834032	Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4). Now it is easy to expand the size of the mirror when all its components are replaced. Also add g_resize method to geom_mirror class. It will write updated metadata to new last sector, when parent provider is resized. Silence from: geom@ MFC after: 1 month	2013-11-19 22:55:17 +00:00
mav	4219fc0074	Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O. The defined now safety requirements are: - caller should not hold any locks and should be reenterable; - callee should not depend on GEOM dual-threaded concurency semantics; - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable; - kernel thread stack usage should be below 50%. To keep compatibility with GEOM classes not meeting above requirements new provider and consumer flags added: - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request); - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done); - G_PF_DIRECT_SEND -- provider code meets caller requirements (done); - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request). Capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of requirements are not met, request is queued to g_up or g_down thread same as before. Such GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc). To declare direct completion capability disk(9) KPI got new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk drivers got it set now thanks to earlier CAM locking work. This change more then twice increases peak block storage performance on systems with manu CPUs, together with earlier CAM locking changes reaching more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads). Sponsored by: iXsystems, Inc. MFC after: 2 months	2013-10-22 08:22:19 +00:00
ed	e591d48c3e	Fix the formatting of the error message. The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the log messages emitted by gmirror start with an uppercase letter.	2013-08-12 18:17:45 +00:00
scottl	0aa8657901	Fix a mystery cut-n-paste corruption from the previous commit. Submitted by: Brenden Fabeny	2013-06-19 23:09:10 +00:00
scottl	ecb835e525	Mark geom_mirror as capable of unmapped i/o Obtained from: Netflix MFC after: 3 days	2013-06-19 21:52:32 +00:00
avg	c245f91f95	g_mirror: g_getattr() failure should not be fatal This allows to use gmirror e.g. on top of ZVOLs. PR: kern/175323 Submitted by: Alexei.Volkov@softlynx.ru, mav Reported by: Alexei.Volkov@softlynx.ru Tested by: Alexei.Volkov@softlynx.ru Reviewed by: ae, mav, pjd MFC after: 1 week	2013-01-26 10:50:04 +00:00
mav	19b676e81d	Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. PR: kern/113957 MFC after: 2 weeks	2013-01-15 01:13:55 +00:00
glebius	d14a6db068	When synchronizing, include in the config dump amount of bytes syncronized. The rationale behind this is the following: for large disks the percent synchronisation counter ticks too seldom, and monitoring software (as well as human operator) can't tell whether synchronisation goes on or one of disks got stuck. On an idle server one can look into gstat and see whether synchronisation goes on or not, but on a busy server that won't work. Also, new value monitored can be differentiated obtaining the synchronisation speed quite precisely. Submitted by: Konstantin Kukushkin <dark ramtel.ru> Reviewed by: pjd	2012-09-11 20:20:13 +00:00
glebius	1a62bb3224	Make geom_mirror more friendly to SSDs. To properly support TRIM, we need to pass BIO_DELETE requests down to providers that support it. Also, we need to announce our support for BIO_DELETE to upper consumer. This requires: - In g_mirror_start() return true for "GEOM::candelete" request. - In g_mirror_init_disk() probe below provider for "GEOM::candelete" attribute, and mark disk with a flag if it does support BIO_DELETE. - In g_mirror_register_request() distribute BIO_DELETE requests only to those disks, that do support it. Note that we announce "GEOM::candelete" as true unconditionally of whether we have TRIM-capable media down below or not. This is made intentionally, because upper consumer (usually UFS) requests the attribite only once at mount time. And if user ever migrates his mirror from HDDs to SSDs, then he/she would get TRIM working without remounting filesystem. Reviewed by: pjd	2012-07-01 15:43:52 +00:00
glebius	2c999b9f21	In g_mirror_regular_request() upon successful delivery treat BIO_DELETE requests same way as BIO_WRITE removing them from queue. This fixes panic with BIO_DELETE operations on geom_mirror. Reviewed by: pjd	2012-07-01 15:30:43 +00:00
ae	c16abb371b	Prevent removing of the last active component from a mirror. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:22:21 +00:00

1 2 3 4

168 Commits