freebsd-dev

Author	SHA1	Message	Date
Alexander Motin	b6fe583c55	Add `gmirror create` subcommand, alike to gstripe, gconcat, etc. It is quite specific mode of operation without storing on-disk metadata. It can be useful in some cases in combination with some external control tools handling mirror creation and disks hot-plug. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-30 09:27:08 +00:00
Alexander Motin	dc399583ba	Use providergone method to cover race between destroy and g_access(). Reviewed by: markj MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-11-13 03:56:26 +00:00
Mark Johnston	5c2ac5cf2a	gmirror: Add a subroutine to free synchronization BIOs. This addresses a memory leak that occurs upon an I/O error during a mirror synchronization. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:08:40 +00:00
Mark Johnston	b450976dc2	gmirror: Release pending regular requests when synchronization stops. Normally gmirror allows colliding requests to proceed whenever a synchronization request completes and advances to the next offset. However if an I/O request collides with one of the final g_mirror_syncreqs, nothing releases it once synchronization completes, resulting in an apparent I/O hang. The same problem can occur if synchronization is aborted by an I/O error. Therefore, be sure to requeue pending requests when mirror synchronization is stopped for any reason. While here, remove some dead code from g_mirror_regular_release(). MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2016-10-20 23:02:30 +00:00
Alexander Motin	5a236b0ef9	Fix possible geom destruction before final provider close. Introduce internal counter to track opens. Using provider's counters is not very successful after calling g_wither_provider(). MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2016-10-06 15:20:05 +00:00
Mark Johnston	4dea20be45	gmirror: Write an updated syncid before queuing writes. When a syncid bump is pending, any write to the mirror results in the updated syncid being written to each component's metadata block. However, the update was only being performed after the writes to the mirror components were queued. Instead, synchronously update the metadata block first. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:13:55 +00:00
Mark Johnston	903618cd65	gmirror: Bump the syncid if broken disks are found during startup. Consider a mirror with two components, m1 and m2. Suppose a hardware error results in the removal of m2, with m1's genid bumped. Suppose further that a replacement mirror component m3 is created and synchronized, after which the system is shut down uncleanly. During a subsequent bootup, if gmirror tastes m1 and m2 first, m2 will be removed from the mirror because it is broken, but the mirror will be started without bumping the syncid on m1 because all elements of the mirror are accounted for. Then m3 will be added to the already-running mirror with the same syncid as m1, so the components will not be synchronized despite the unclean shutdown. Handle this scenario by bumping the syncid of healthy components if any broken mirrors are discovered during mirror startup. MFC after: 3 weeks Sponsored by: Dell EMC Isilon	2016-10-06 00:05:45 +00:00
Mark Johnston	fff048e4bc	gmirror: Use bool instead of boolean_t. MFC after: 1 week Sponsored by: Dell EMC Isilon	2016-10-05 23:55:01 +00:00
Alexander Motin	8b64f3ca6c	Use g_wither_provider() where applicable. It is just a helper function combining G_PF_WITHER setting with g_orphan_provider().	2016-09-23 21:29:40 +00:00
Mark Johnston	4bfb585351	Don't treat an error from g_mirror_clear_metadata() as fatal. Such errors can occur as the result of a write error or because the disk backing the mirror element was removed. They result in a generation ID bump on all active elements of the mirror, so we can safely disconnect the mirror component rather than destroy it. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D7750	2016-09-06 23:42:59 +00:00
Mark Johnston	40c5032d32	Add some fail points to gmirror. These are useful for testing changes to I/O error handling, and for reproducing existing bugs in a controlled manner. The fail points are g_mirror_regular_request_read g_mirror_regular_request_write g_mirror_sync_request_read g_mirror_sync_request_write g_mirror_metadata_write They all effectively allow one to inject an error value into the bio_error field of a corresponding BIO request as it is being completed. MFC after: 2 weeks Sponsored by: EMC / Isilon Storage Division	2016-09-06 23:35:48 +00:00
Mark Johnston	7d31c3939a	Move some gmirror metadata update messages to a higher debug level. These can be printed quite frequently from a mostly-idle mirror, cluttering the console. MFC after: 1 week	2016-07-14 00:40:24 +00:00
Mark Johnston	be20fc2e90	Do not complete pending gmirror BIOs when tearing down the provider. This will result in lock recursion and is more generally incorrect since the completion handlers will just reinsert the BIOs into the queue we're trying to drain. Reviewed by: imp, ngie Approved by: re (gjb) MFC after: 3 weeks Sponsored by: EMC / Isilon Storage Division Differential Revision: https://reviews.freebsd.org/D6908	2016-06-22 21:00:28 +00:00
Gleb Smirnoff	a7c5163b5f	When we are in panic, always go the asynchronous path in g_mirror_destroy(), otherwise the system will hang. This is a temporary, least intrusive crutch to get certain panicking systems dumping. The proper fix should question whether g_mirror_destroy() should be called on a panicking system at all. Discussed with: mav	2016-06-01 22:11:54 +00:00
Konstantin Belousov	4e2732b550	Removal of Giant dropping wrappers for GEOM classes. Sponsored by: The FreeBSD Foundation	2016-05-20 08:25:37 +00:00
Pedro F. Giffuni	e8d5712284	sys/geom: spelling fixes in comments. No functional change.	2016-04-29 20:56:58 +00:00
Pedro F. Giffuni	b99bce73e2	geom: unsign some types to match their definitions and avoid overflows. In struct:gctl_req, nargs is unsigned. In mirror: g_mirror_syncreqs is unsigned. In raid: in struct:g_raid_volume, v_disks_count is unsigned. In virstor: in struct:g_virstor_softc, n_components is unsigned. MFC after: 2 weeks	2016-04-27 15:10:40 +00:00
Warner Losh	9a8fa125c1	Bump bio_cmd and bio_*flags from 8 bits to 16. Differential Revision: https://reviews.freebsd.org/D5784	2016-04-14 05:10:41 +00:00
Warner Losh	c55f57071a	Create an API to reset a struct bio (g_reset_bio). This is mandatory for all struct bio you get back from g_{new,alloc}_bio. Temporary bios that you create on the stack or elsewhere should use this before first use of the bio, and between uses of the bio. At the moment, it is nothing more than a wrapper around bzero, but that may change in the future. The wrapper also removes one place where we encode the size of struct bio in the KBI.	2016-02-17 17:16:02 +00:00
Jung-uk Kim	fd90e2ed54	CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten years for head. However, it is continuously misused as the mpsafe argument for callout_init(9). Deprecate the flag and clean up callout_init() calls to make them more consistent. Differential Revision: https://reviews.freebsd.org/D2613 Reviewed by: jhb MFC after: 2 weeks	2015-05-22 17:05:21 +00:00
Alexander Motin	5d85cd2d11	Remove extra semicolon. MFC after: 1 week	2015-03-27 12:45:20 +00:00
Alexander Motin	3ab0187add	Remove request sorting from GEOM_MIRROR and GEOM_RAID. When CPU is not busy, those queues are typically empty. When CPU is busy, then one more extra sorting is the last thing it needs. If specific device (HDD) really needs sorting, then it will be done later by CAM. This supposed to fix livelock reported for mirror of two SSDs, when UFS fires zillion of BIO_DELETE requests, that totally blocks I/O subsystem by pointless sorting of requests and responses under single mutex lock. MFC after: 2 weeks	2015-03-27 12:44:28 +00:00
Alexander Motin	41fe4ba647	Fix bug on memory allocation error in split method. While there, use bioq_takefirst() in place where it is convenient. MFC after: 1 week	2015-03-27 11:14:12 +00:00
Alexander Motin	7715befdf2	Fix couple BIO_DELETE bugs in geom_mirror. Do not report GEOM::candelete if none of providers support BIO_DELETE. If consumer still requests BIO_DELETE, report error instead of hanging. MFC after: 2 weeks	2015-03-12 10:20:53 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Bryan Drewery	09adfca39f	Show error code when failing to destroy a mirror on delay Sponsored by: EMC / Isilon Storage Division MFC after: 2 weeks	2014-04-05 03:01:29 +00:00
Andrey V. Elsukov	ae3bc0acff	Add an ability to stop gmirror and clear its metadata in one command. This fixes the problem, when gmirror starts again just after stop. The problem occurs when gmirror's component has geom label with equal size. E.g. gpt and gptid have the same size as partition, diskid has the same size as entire disk. When gmirror's geom has been destroyed, glabel creates its providers and this initiate retaste. Now "gmirror destroy" command is available. It destroys geom and also erases gmirror's metadata. MFC after: 2 weeks	2013-12-27 02:43:53 +00:00
Andrey V. Elsukov	7c5710dbaf	Prevent users from deactivating the last component of a mirror. PR: 184985 MFC after: 1 week	2013-12-19 22:13:12 +00:00
Andrey V. Elsukov	32cea4ca0f	Add "resize" verb to gmirror(8) and such functionality to geom_mirror(4). Now it is easy to expand the size of the mirror when all its components are replaced. Also add g_resize method to geom_mirror class. It will write updated metadata to new last sector, when parent provider is resized. Silence from: geom@ MFC after: 1 month	2013-11-19 22:55:17 +00:00
Alexander Motin	40ea77a036	Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O. The defined now safety requirements are: - caller should not hold any locks and should be reenterable; - callee should not depend on GEOM dual-threaded concurrency semantics; - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable; - kernel thread stack usage should be below 50%. To keep compatibility with GEOM classes not meeting above requirements new provider and consumer flags added: - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request); - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done); - G_PF_DIRECT_SEND -- provider code meets caller requirements (done); - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request). Capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of requirements are not met, request is queued to g_up or g_down thread same as before. Such GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc). To declare direct completion capability disk(9) KPI got new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk drivers got it set now thanks to earlier CAM locking work. This change more than twice increases peak block storage performance on systems with many CPUs, together with earlier CAM locking changes reaching more than 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads). Sponsored by: iXsystems, Inc. MFC after: 2 months	2013-10-22 08:22:19 +00:00
Ed Schouten	647a92d62b	Fix the formatting of the error message. The G_MIRROR_DEBUG() macro already appends a newline. Also, most of the log messages emitted by gmirror start with an uppercase letter.	2013-08-12 18:17:45 +00:00
Scott Long	f07b69478e	Fix a mystery cut-n-paste corruption from the previous commit. Submitted by: Brenden Fabeny	2013-06-19 23:09:10 +00:00
Scott Long	2084cbe975	Mark geom_mirror as capable of unmapped i/o Obtained from: Netflix MFC after: 3 days	2013-06-19 21:52:32 +00:00
Andriy Gapon	1f1088b843	g_mirror: g_getattr() failure should not be fatal This allows to use gmirror e.g. on top of ZVOLs. PR: kern/175323 Submitted by: Alexei.Volkov@softlynx.ru, mav Reported by: Alexei.Volkov@softlynx.ru Tested by: Alexei.Volkov@softlynx.ru Reviewed by: ae, mav, pjd MFC after: 1 week	2013-01-26 10:50:04 +00:00
Alexander Motin	cbab616174	Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. PR: kern/113957 MFC after: 2 weeks	2013-01-15 01:13:55 +00:00
Gleb Smirnoff	4a7f7b10b5	When synchronizing, include in the config dump amount of bytes synchronized. The rationale behind this is the following: for large disks the percent synchronisation counter ticks too seldom, and monitoring software (as well as human operator) can't tell whether synchronisation goes on or one of disks got stuck. On an idle server one can look into gstat and see whether synchronisation goes on or not, but on a busy server that won't work. Also, new value monitored can be differentiated obtaining the synchronisation speed quite precisely. Submitted by: Konstantin Kukushkin <dark ramtel.ru> Reviewed by: pjd	2012-09-11 20:20:13 +00:00
Gleb Smirnoff	d89862ac87	Make geom_mirror more friendly to SSDs. To properly support TRIM, we need to pass BIO_DELETE requests down to providers that support it. Also, we need to announce our support for BIO_DELETE to upper consumer. This requires: - In g_mirror_start() return true for "GEOM::candelete" request. - In g_mirror_init_disk() probe below provider for "GEOM::candelete" attribute, and mark disk with a flag if it does support BIO_DELETE. - In g_mirror_register_request() distribute BIO_DELETE requests only to those disks, that do support it. Note that we announce "GEOM::candelete" as true unconditionally of whether we have TRIM-capable media down below or not. This is made intentionally, because upper consumer (usually UFS) requests the attribute only once at mount time. And if user ever migrates his mirror from HDDs to SSDs, then he/she would get TRIM working without remounting filesystem. Reviewed by: pjd	2012-07-01 15:43:52 +00:00
Gleb Smirnoff	b0ae63ca25	In g_mirror_regular_request() upon successful delivery treat BIO_DELETE requests same way as BIO_WRITE removing them from queue. This fixes panic with BIO_DELETE operations on geom_mirror. Reviewed by: pjd	2012-07-01 15:30:43 +00:00
Andrey V. Elsukov	f931cd70af	Prevent removing of the last active component from a mirror. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:22:21 +00:00
Andrey V. Elsukov	1ee0138d2f	Introduce new device flag G_MIRROR_DEVICE_FLAG_TASTING. It should protect geom from destroying while it is tasting. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:19:07 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Andrey V. Elsukov	5d807a0e1a	Include sys/sbuf.h directly. Reviewed by: pjd	2011-07-11 05:22:31 +00:00
Alexander Motin	90f2be2430	Implement relaxed comparison for hardcoded provider names to make it ignore adX/adaY difference in both directions to simplify migration to the CAM-based ATA or back.	2011-04-27 00:10:26 +00:00
Alexander Leidinger	cb08c2cc83	Add some FEATURE macros for various GEOM classes. No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: silence on geom@ during 2 weeks X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:24:35 +00:00
Pawel Jakub Dawidek	a478ea7490	- Allow to specify value as const pointers. - Make optional string values always an empty string.	2010-09-13 08:56:07 +00:00
Alexander Motin	3d7cfb15f5	Remove bintime_cmp() function, unused since r200086. MFC after: 1 week	2010-08-18 15:38:10 +00:00
Alexander Motin	86de0ca52c	Move wakeup() out of mutex to reduce contention.	2010-01-05 10:30:56 +00:00
Alexander Motin	92f60381d9	As soon as mirror has no own stripes, report largest stripe of underlying components, hoping others fit, if they are not equal.	2009-12-24 12:17:22 +00:00

159 Commits