freebsd-skq

Author	SHA1	Message	Date
Mark Johnston	2f1cfb7f63	gmirror: Pre-allocate the timeout event structure We can't call malloc(M_WAITOK) in a callout handler. Reviewed by: imp Reported by: pho Tested by: pho MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D29223	2021-03-11 15:45:15 -05:00
Konstantin Belousov	cd85379104	Make MAXPHYS tunable. Bump MAXPHYS to 1M. Replace MAXPHYS by runtime variable maxphys. It is initialized from MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys. Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer cache buffers exactly to atop(maxbcachebuf) (currently it is sized to atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1. The +1 for pbufs allows several pbuf consumers, among them vmapbuf(), to use unaligned buffers still sized to maxphys, esp. when such buffers come from userspace (). Overall, we save significant amount of otherwise wasted memory in b_pages[] for buffer cache buffers, while bumping MAXPHYS to desired high value. Eliminate all direct uses of the MAXPHYS constant in kernel and driver sources, except a place which initializes maxphys. Some random (and arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted straight. Some drivers, which use MAXPHYS to size embedded structures, get private MAXPHYS-like constant; their conversion is out of scope for this work. Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs, dev/siis, were either submitted by, or based on changes by mav. Suggested by: mav () Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D27225	2020-11-28 12:12:51 +00:00
Edward Tomasz Napierala	d22ff249d9	Make g_attach() return ENXIO for orphaned providers; update various classes to add missing error checking. Reviewed by: imp MFC after: 2 weeks Sponsored by: NetApp, Inc. Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D26658	2020-10-18 16:24:08 +00:00
Mateusz Guzik	d40bc60752	geom: clean up empty lines in .c and .h files	2020-09-01 22:14:09 +00:00
Xin LI	fcf69f3dbc	Consistently use gctl_get_provider instead of home-grown variants. Reviewed by: cem, imp MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D25739	2020-07-22 02:15:21 +00:00
Xin LI	8510f61acd	sys/geom: consistently use _PATH_DEV instead of hardcoding "/dev/". Reviewed by: cem MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D25565	2020-07-09 02:52:39 +00:00
Conrad Meyer	844b743d31	geom(4) mirror: Do not panic on gmirror(8) insert, resize Geom_mirror initialization occurs in spurts and the presence of a non-destroyed g_mirror softc does not always indicate that the geom has launched (i.e., has an sc_provider). Some gmirror(8) commands (via g_mirror_ctl) depend on a g_mirror's sc_provider (insert and resize). For those commands, g_mirror_ctl is modified to sleep-poll in an interruptible way until the target geom is either launched or destroyed. Reviewed by: markj Tested by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D24780	2020-05-11 22:39:53 +00:00
Pawel Biernacki	7029da5c36	Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many) r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are still not MPSAFE (or already are but aren’t properly marked). Use it in preparation for a general review of all nodes. This is non-functional change that adds annotations to SYSCTL_NODE and SYSCTL_PROC nodes using one of the soon-to-be-required flags. Mark all obvious cases as MPSAFE. All entries that haven't been marked as MPSAFE before are by default marked as NEEDGIANT Approved by: kib (mentor, blanket) Commented by: kib, gallatin, melifaro Differential Revision: https://reviews.freebsd.org/D23718	2020-02-26 14:26:36 +00:00
Warner Losh	8b522bdae6	Pass BIO_SPEEDUP through all the geom layers While some geom layers pass unknown commands down, not all do. For the ones that don't, pass BIO_SPEEDUP down to the providers that constitute the geom, as applicable. No changes to vinum or virstor because I was unsure how to add this support, and I'm also unsure how to test these. gvinum doesn't implement BIO_FLUSH either, so it may just be poorly maintained. gvirstor is for testing and not supporting BIO_SPEEDUP is fine. Reviewed by: chs Differential Revision: https://reviews.freebsd.org/D23183	2020-01-17 01:15:55 +00:00
Mateusz Guzik	879e0604ee	Add KERNEL_PANICKED macro for use in place of direct panicstr tests	2020-01-12 06:07:54 +00:00
Alexander Motin	c4c88d4718	Remove duplicate g_debugflags declaration. While there, define G_F_FOOTSHOOTING instead of numeric constants. MFC after: 13 days X-MFX-with: r355412	2019-12-05 15:07:32 +00:00
Conrad Meyer	ac03832ef3	GEOM: Reduce unnecessary log interleaving with sbufs Similar to what was done for device_printfs in r347229. Convert g_print_bio() to a thin shim around g_format_bio(), which acts on an sbuf; documented in g_bio.9. Reviewed by: markj Discussed with: rlibby Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D21165	2019-08-07 19:28:35 +00:00
Ryan Libby	9167705c8c	g_mirror_taste: avoid deadlock, always clear tasting flag If g_mirror_taste encountered an error at g_mirror_add_disk, it might try to g_mirror_destroy the device with the G_MIRROR_DEVICE_FLAG_TASTING flag still set. This would wait on a worker to complete the destruction with g_mirror_try_destroy, but that function bails out if the tasting flag is set, resulting in a deadlock. Clear the tasting flag before trying to destroy the device. Test Plan: sysctl debug.fail_point.mnowait="1%return" kyua test -k /usr/tests/sys/geom/class/mirror/Kyuafile Reviewed by: markj Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D20744	2019-07-01 22:06:36 +00:00
Alexander Motin	49ee0fcea5	Use sbuf_cat() in GEOM confxml generation. When it comes to megabytes of text, difference between sbuf_printf() and sbuf_cat() becomes substantial. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2019-06-19 15:36:02 +00:00
Conrad Meyer	797f009d59	gmirror: Relocate DEVICE_FLAGS to adjacent lines gmirror's sc_flags is shared between some on-disk state and some runtime only state. There's no real reason for that and they could probably be split up. Until they are, locate all of the flags for the same field nearby each other in the source, for clarity. No functional change. Sponsored by: Dell EMC Isilon	2019-01-23 16:44:21 +00:00
Mark Johnston	438622af06	Use g_handleattr() to reply to GEOM::candelete queries. g_handleattr() fills out bp->bio_completed; otherwise, g_getattr() returns an error in response to the query. This caused BIO_DELETE support to not be propagated through stacked configurations, e.g., a gconcat of gmirror volumes would not handle BIO_DELETE even when the gmirrors do. g_io_getattr() was not affected by the problem. PR: 232676 Reported and tested by: noah.bergbauer@tum.de MFC after: 1 week	2019-01-02 15:52:16 +00:00
Conrad Meyer	d2d82bfc90	gmirror: Remove a last-minute INVARIANTS breakage in r341840 I mistakenly added a lock assertion to this routine at the last minute without confirming it was held during g_mirror_create. It isn't (it isn't even initialized yet). Mea culpa. Access is exclusive in both callers, just not always by that particular lock. Reported by: lwhsu X-MFC-With: r341840, r341674	2018-12-12 18:13:56 +00:00
Conrad Meyer	23c25bd8b1	gmirror: Fix a bug introduced in r341674 r341674 inadvertently introduced a bug where newer mirror components being tasted would clear the high sc_flags that are not controlled by component metadata, such as G_MIRROR_DEVICE_FLAG_TASTING. This could plausibly expose a small window of time during STARTING where device destruction might race with mirror component addition, probably resulting in a crash. Reviewed by: markj X-MFC-With: r341674 Differential Revision: https://reviews.freebsd.org/D18521	2018-12-12 05:48:27 +00:00
Conrad Meyer	af7dcae0e2	gmirror: Evaluate mirror components against newest metadata copy Re-apply r341665 with format strings fixed. If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-07 02:44:04 +00:00
Conrad Meyer	c4e87bdfc1	Revert r341665 due to tinderbox breakage I didn't notice that some format strings were non-portable. Will fix and re-commit later.	2018-12-07 00:47:05 +00:00
Conrad Meyer	bc1ee0be2d	gmirror: Evaluate mirror components against newest metadata copy If we happen to taste a stale mirror component first, don't reject valid, newer components that have differing metadata from the stale component (during STARTING). Instead, update our view of the most recent metadata as we taste components. Like mediasize beforehand, remove some checks from g_mirror_check_metadata which would evict valid components due to metadata that can change over a mirror's lifetime. g_mirror_check_metadata is invoked long before we check genid/syncid and decide which component(s) are newest and whether or not we have quorum. Before checking if we can enter RUNNING (i.e., we have quorum) after a NEW component is added, first remove any known stale or inconsistent disks from the mirrorset, rather than removing them after deciding we have quorum. Check if we have quorum after removing these components. Additionally, add a knob, kern.geom.mirror.launch_mirror_before_timeout, to force gmirrors to wait out the full timeout (kern.geom.mirror.timeout) before transitioning from STARTING to RUNNING. This is a kludge to help ensure all eligible, boot-time available mirror components are tasted before RUNNING a gmirror. When we are instructed to forget mirror components, bump the generation id to avoid confusion with such stale components later. Add a basic test case for STARTING -> RUNNING startup behavior around stale genids. PR: 232671, 232835 Submitted by: Cindy Yang <cyang AT isilon.com> (previous version) Reviewed by: markj (kernel portions) Discussed with: asomers, Cindy Yang Tested by: pho Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D18062	2018-12-06 23:55:39 +00:00
Mark Johnston	681554d70b	Remove a redundant assertion. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-05-06 00:05:03 +00:00
Mark Johnston	40e805221b	Avoid dropping the topology lock in gmirror's dumpconf implementation. Doing so introduces races which can lead to a use-after-free when grabbing a snapshot of the GEOM mesh. To ensure that a mirror's disk list remains stable, change its locking protocol: both the softc lock and the topology lock are now required to modify the list, so either lock is sufficient for traversal. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-05-06 00:03:24 +00:00
Kyle Evans	74d6c131cb	Annotate geom modules with MODULE_VERSION GEOM ELI may double ask the password during boot. Once at loader time, and once at init time. This happens due a module loading bug. By default GEOM ELI caches the password in the kernel, but without the MODULE_VERSION annotation, the kernel loads over the kernel module, even if the GEOM ELI was compiled into the kernel. In this case, the newly loaded module purges/invalidates/overwrites the GEOM ELI's password cache, which causes the double asking. MFC Note: There's a pc98 component to the original submission that is omitted here due to pc98 removal in head. This part will need to be revived upon MFC. Reviewed by: imp Submitted by: op Obtained from: opBSD MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D14992	2018-04-10 19:18:16 +00:00
Mark Johnston	0d02f6c201	Simplify synchronization read error handling. Since synchronization reads are performed by submitting a request to the external mirror provider, we know that the request returns with an error only when gmirror was unable to read a copy of the block from any mirror. Thus, there is no need to retry the request from the synchronization error handler. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-02-06 16:02:33 +00:00
Mark Johnston	762f440f15	Fix handling of read errors during mirror synchronization. We would previously just free the request BIO, which would either cause the disk to stay stuck in the SYNCHRONIZING state, or result in synchronization completing without having copied the block which returned an error. With this change, if the disk which returned an error is the only active disk in the mirror, the synchronizing disk is kicked out. Otherwise, the read is retried. Reported and tested by: pho (previous version) MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2018-01-10 19:37:21 +00:00
Mark Johnston	792f0c3b09	Clarify the use of the gmirror flag mask constants. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-01-10 15:21:36 +00:00
Mark Johnston	aed882a9fb	Avoid referencing a possibly freed consumer after r327496. g_mirror_regular_request() may free the gmirror consumer for a disk if that disk is being disconnected, after which we must not dereference the consumer pointer. CID: 1384280 X-MFC with: r327496	2018-01-10 05:06:21 +00:00
Mark Johnston	8b0a00b745	Sort and remove unneeded includes. MFC after: 1 week Sponsored by: Dell EMC Isilon	2018-01-08 15:56:40 +00:00
Mark Johnston	7653e6d781	Release the queue lock before restarting the worker loop. Reported and tested by: pho MFC after: 3 days Sponsored by: Dell EMC Isilon	2018-01-08 15:41:49 +00:00
Mark Johnston	1787c3feb4	Fix some I/O ordering issues in gmirror. - BIO_FLUSH requests were dispatched to the disks directly from g_mirror_start() rather than going through the mirror's I/O request queue, so they could have been reordered with preceding writes. Address this by processing such requests from the queue, avoiding direct dispatch. - Handling for collisions with synchronization requests was too fine-grained and could cause reordering of writes. In particular, BIO_ORDERED was not being honoured. Address this by effectively freezing the request queue any time a collision with a synchronization request occurs. The queue is unfrozen once the collision with the first frozen request is over. - The above-mentioned collision handling allowed reads to jump ahead of writes to the same offset. Address this by freezing all request types when a collision occurs, not just BIO_WRITEs and BIO_DELETEs. Also add some more fail points for use in testing error handling. Reviewed by: imp MFC after: 3 weeks Sponsored by: Dell EMC Isilon Differential Revision: https://reviews.freebsd.org/D13559	2018-01-02 18:11:54 +00:00
Mark Johnston	9abe2e7e98	Avoid using bioq_* in gmirror. gmirror does not perform any sorting of I/O requests, so the bioq API doesn't provide any advantages over plain TAILQs. The API also does not provide operations needed by an upcoming change. No functional change intended. The diff shrinks the geom_mirror.ko text and the gmirror softc slightly. Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-19 17:13:04 +00:00
Mark Johnston	68eadcec0f	Give a couple of predicate functions a bool return type. No functional change intended. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-15 19:14:21 +00:00
Mark Johnston	204d94f161	Typo. MFC after: 1 week	2017-12-15 19:03:03 +00:00
Mark Johnston	8b93770503	Address a possible lost wakeup for gmirror events. g_mirror_event_send() acquires the I/O queue lock to deliver a wakeup to the worker thread, and this is done after enqueuing the event. So it's sufficient to check the event queue before atomically releasing the queue lock and going to sleep. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:29:34 +00:00
Mark Johnston	b634781eac	Give g_mirror_event_get() a more accurate name. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:25:25 +00:00
Mark Johnston	a3584ee355	Decrement sc_writes when BIO_DELETE requests complete. Otherwise a gmirror that has received a BIO_DELETE request will never be marked clean (unless sc_writes overflows). MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-12-12 17:24:30 +00:00
Mark Johnston	2ceafb776e	Update gmirror metadata less frequently when synchronizing. We periodically record synchronization progress in the metadata block of the disk being synchronized; this allows an interrupted synchronization to be resumed. However, the frequency of these updates heavily pessimized synchronization time on some media. This change modifies gmirror to update metadata based on a time period, and adds a sysctl to control that period. The default value results in a much lower update frequency and increases the completion time for an interrupted rebuild only marginally. Reported by: Andre Albsmeier <andre@fbsd.e4m.org> MFC after: 3 weeks	2017-11-30 20:36:29 +00:00
Pedro F. Giffuni	3728855a0f	sys/geom: adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 2-Clause license, however the tool I was using misidentified many licenses so this was mostly a manual - error prone - task. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts.	2017-11-27 15:17:37 +00:00
Mark Johnston	0349817103	Allow kern.geom.mirror.debug to be negative. A negative value can be used to suppress all prints from the gmirror kernel code, which can be useful when attempting to trigger race conditions using stress tests. MFC after: 1 week	2017-11-23 14:07:52 +00:00
Mark Johnston	cef5abd140	Fix a lock leak in g_mirror_destroy(). g_mirror_destroy() is supposed to unlock the softc before indicating success, but it wasn't doing so if the caller raced with another thread destroying the mirror. MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-10-27 17:05:14 +00:00
Andriy Gapon	7103ac8ad6	gmirror: treat ENXIO as disk disconnect, not media error In theory, all data access errors mean that a member is out of sync at most. But they were treated as more serious errors to avoid the situation where a flaky disk gets repeatedly disconnected, re-synchronized, reconnected and then disconnected again. ENXIO is a special error that means that the member disk disappeared, so it should get the same handling as the GEOM orphaning event. There is a better chance that when the disk is reconnected, it will be a good member again. When ENXIO happens on a read we use the existing G_MIRROR_BUMP_SYNCID mechanism which means that the mirror's syncid is increased as soon as there is a write to the mirror. That's because no data has got out of sync yet, but the problematic member is disconnected, so the future write will make it stale. When ENXIO happens on a write we use a new G_MIRROR_BUMP_SYNCID_NOW mechanism which means that we update the mirror metadata as soon as possible because the problematic member is already behind. Reviewed by: markj, imp MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D9463	2017-09-15 13:57:08 +00:00
Mark Johnston	db7c508323	Synchronize unclean mirrors before adding them to a running gmirror. During gmirror startup, if component mirrors are found to be dirty as is typical after a system crash, the mirrors are synchronized to the mirror with highest priority. However if a gmirror starts without all of its mirrors present, for example because of some transient delays during tasting, the remaining mirrors must be synchronized before they may become active. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-05-02 23:29:42 +00:00
Mark Johnston	a7d94fcc3e	Rename two gmirror state flags to make their meanings slightly clearer. No functional change. MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:13:57 +00:00
Mark Johnston	1e91412e40	Don't set the mirror GEOM softc to NULL in g_mirror_destroy(). At this point we have not rendezvous'ed with the mirror worker thread, and I/O may still be in flight. Various I/O completion paths expect to be able to obtain a reference to the mirror softc from the GEOM, so setting it to NULL may result in various NULL pointer dereferences if the mirror is stopped with -f or the kernel is shut down while a mirror is synchronizing. The worker thread will clear the softc pointer before exiting. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:08:37 +00:00
Mark Johnston	77011eac86	Check for a provider error before enqueuing mirror I/O. We are otherwise susceptible to a race with a concurrent teardown of the mirror provider, causing the I/O to be left uncompleted after the mirror started withering. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 17:03:32 +00:00
Mark Johnston	a65d524afc	Stop mirror synchronization before draining the I/O queue. Regular I/O requests may be blocked by concurrent synchronization requests targeted to the same LBAs, in which case they are moved to a holding queue until the conflicting I/O completes. We therefore want to stop synchronization before completing pending I/O in g_mirror_destroy_provider() since this ensures that blocked I/O requests are completed as well. Tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-14 16:54:50 +00:00
Mark Johnston	a4834289d6	Handle NULL entries in gmirror disk ds_bios arrays. Entries may be removed and freed if an I/O error occurs during mirror synchronization, so we cannot assume that all entries of ds_bios are valid. Also ensure that a synchronization BIO's array index is preserved after a successful write. Reported and tested by: pho MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-04-10 17:15:59 +00:00
Mark Johnston	0d75d0dfbc	Avoid sleeping when the mirror I/O queue is non-empty. A request may be queued while the queue lock is dropped when the mirror is being destroyed. The corresponding wakeup would be lost, possibly resulting in an apparent hang of the mirror worker thread. Tested by: pho (part of a larger patch) MFC after: 1 week Sponsored by: Dell EMC Isilon	2017-03-29 19:39:07 +00:00
Mark Johnston	c1ab409cba	Remove an unneeded g_mirror_destroy_provider() call. The worker thread will destroy the mirror provider as part of its teardown sequence. The call made sense in the initial revision of gmirror, but became unnecessary in r137248. Tested by: pho (part of a larger diff) MFC after: 2 weeks Sponsored by: Dell EMC Isilon	2017-03-29 19:30:22 +00:00

210 Commits