freebsd-skq

Author	SHA1	Message	Date
Andriy Gapon	1f1088b843	g_mirror: g_getattr() failure should not be fatal This allows to use gmirror e.g. on top of ZVOLs. PR: kern/175323 Submitted by: Alexei.Volkov@softlynx.ru, mav Reported by: Alexei.Volkov@softlynx.ru Tested by: Alexei.Volkov@softlynx.ru Reviewed by: ae, mav, pjd MFC after: 1 week	2013-01-26 10:50:04 +00:00
Alexander Motin	c3ec009a97	- Fix rebuild position broken at r245522. - Identify one more metadata field.	2013-01-17 03:27:08 +00:00
Alexander Motin	821a0f639e	For Promise/AMD metadata add support for disks with capacity above 2TiB and for volumes with sector size above 512 bytes.	2013-01-17 00:50:25 +00:00
Alexander Motin	ed8180e665	Recalculate volume size only for real CONCATs. For SINGLE trust volume size given by metadata, as it should be correct and in some cases can be smaller than subdisk size.	2013-01-17 00:09:50 +00:00
Alexander Motin	2c6a273750	Allow to insert new component to geom_raid3 without specifying number. PR: kern/160562 MFC after: 2 weeks	2013-01-15 10:06:35 +00:00
Alexander Motin	f62c1a47d6	Alike to r242314 for GRAID make GRAID3 more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. MFC after: 2 weeks	2013-01-15 01:27:04 +00:00
Alexander Motin	cbab616174	Alike to r242314 for GRAID make GMIRROR more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync stage to shutdown_post_sync to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. PR: kern/113957 MFC after: 2 weeks	2013-01-15 01:13:55 +00:00
Alexander Motin	4c10c25e33	Keep value of orig_config_id metadata field. Windows driver writes there previous value of config_id when it is changed in some cases. I guess it may be used to avoid some split-brain conditions.	2013-01-14 20:31:45 +00:00
Alexander Motin	eb84fc957c	Small cosmetic tuning of the IRRT status constants.	2013-01-14 16:38:43 +00:00
Alexander Motin	511c69d9ce	Print some more metadata fields.	2013-01-14 13:06:35 +00:00
Alexander Motin	898a4b74f4	Windows driver writes relative volume IDs to metadata field. Use that value as a hint for raid/rX device number to make it persistent across reboots.	2013-01-14 00:38:51 +00:00
Alexander Motin	f9462b9bbe	- Add checks for Intel metadata version and attributes. Ignore disks with unsupported metadata types like Intel Smart Response to not corrupt them. - Improve setting of these things during metadata writing to protect from incapable BIOS'es and other implementations.	2013-01-13 23:00:40 +00:00
Alexander Motin	b99586c25f	Improve support for disabled disks. If disabled disk disconnected and then reconnected back, leave it as disconnected. If new disk inserted instead of disabled, rebuild it and leave as enabled.	2013-01-13 14:30:37 +00:00
Alexander Motin	865aea63c3	Windows handles INIT and VERIFY as array-wide and it doesn't specify which disks should be rebuilt. Our rebuild code is same time disk-centric. To handle this situation properly check all disks for RBLD flags, and if no disk specified try rebuild/resync all of them except newly inserted.	2013-01-12 21:51:49 +00:00
Alexander Motin	4c95a24141	Implement migration from single disk to RAID1/IRRT for Intel metadata. Windows driver uses such migration when it creates new arrays. While GEOM RAID has no mechanism to implement migration in general case, this specific case still can be handled easily via degraded RAID1 creation followed by regular rebuild.	2013-01-12 18:25:48 +00:00
Alexander Motin	26c538bc0b	Add basic support for Intel Rapid Recover Technology (Intel RRT). It is alike to RAID1, but with dedicating master and recovery disks and providing manual control over synchronization. It allows to use recovery disk as snapshot of the master disk from the time of the last sync. This implementation is not functionally complete comparing to Windows, but it is better than silent conversion to RAID1 on first boot.	2013-01-12 09:35:44 +00:00
Konstantin Belousov	ddd6b3fc33	Add flags argument to vfs_write_resume() and remove vfs_write_resume_flags(). Sponsored by: The FreeBSD Foundation	2013-01-11 06:08:32 +00:00
Pawel Jakub Dawidek	6011443800	Reset provider-specific fields when resending I/O request in low memory conditions. This fixes assertion which checks those fields when kernel is compiled with DIAGNOSTIC. Reported by: kib, pho MFC after: 1 week	2012-12-26 20:07:47 +00:00
Jaakko Heinonen	efec959c2c	Mangle label names containing spaces, non-printable characters '%' or '"'. Mangling is only done for label names read from file system metadata. Encoding resembles URL encoding. For example, the space character becomes %20. Help by: kib Discussed with: imp, kib, pjd	2012-12-22 13:43:12 +00:00
Jaakko Heinonen	02c62349c9	- Don't pass geom and provider names as format strings. - Add __printflike() attributes. - Remove an extra argument for the g_new_geomf() call in swapongeom_ev(). Reviewed by: pjd	2012-11-20 12:32:18 +00:00
Alfred Perlstein	bad7e7f3dd	Provide a device name in the sysctl tree for programs to query the state of crashdump target devices. This will be used to add a "-l" (ell) flag to dumpon(8) to list the currently configured dumpdev. Reviewed by: phk	2012-11-01 17:01:05 +00:00
Edward Tomasz Napierala	549f62fa42	Fix problem with geom_label(4) not recognizing UFS labels on filesystems extended using growfs(8). The problem here is that geom_label checks if the filesystem size recorded in UFS superblock is equal to the provider (i.e. device) size. This check cannot be removed due to backward compatibility. On the other hand, in most cases growfs(8) cannot set fs_size in the superblock to match the provider size, because, differently from newfs(8), it cannot recompute cylinder group sizes. To fix this problem, add another superblock field, fs_providersize, used only for this purpose. The geom_label(4) will attach if either fs_size (filesystem created with newfs(8)) or fs_providersize (filesystem expanded using growfs(8)) matches the device size. PR: kern/165962 Reviewed by: mckusick Sponsored by: FreeBSD Foundation	2012-10-30 21:32:10 +00:00
Alexander Motin	650e245ebf	Minor addition to r242323: Alike to BIO_WRITE, report success if at least one subdisk succeeded with BIO_DELETE. But unlike BIO_WRITE don't fail disk on BIO_DELETE error. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 21:08:06 +00:00
Alexander Motin	609a74746a	Add basic BIO_DELETE support to GEOM RAID class for all RAID levels. If at least one subdisk in the volume supports it, BIO_DELETE requests will be propagated down. Unfortunately, for RAID levels with redundancy unmapped blocks will be mapped back during first rebuild/resync process. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 18:04:38 +00:00
Edward Tomasz Napierala	1af2d09b49	Fix locking problem in disk_resize(); previously it would run without topology lock, resulting in assertion when running with DIAGNOSTIC. Reviewed by: mav (earlier version)	2012-10-29 17:52:43 +00:00
Alexander Motin	a479c51be3	Make GEOM RAID more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync to shutdown_post_sync stage to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. MFC after: 2 weeks	2012-10-29 14:18:54 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
Attilio Rao	682ee99e7a	It seems that it is preferable to keep support for glabel also for filesystems that we don't support natively. Revert part of r241636 to do so. This patch is not targeted for MFC. Requested by: gleb, jhb	2012-10-18 22:18:11 +00:00
Attilio Rao	a42ac676f5	Disconnect non-MPSAFE NTFS from the build in preparation for dropping GIANT from VFS. This code is particularly broken and fragile and other in-kernel implementations around, found in other operating systems, don't really seem clean and solid enough to be imported at all. If someone wants to reconsider in-kernel NTFS implementation for inclusion again, a fair effort for completely fixing and cleaning it up is expected. In the while NTFS regular users can use FUSE interface and ntfs-3g port to work with their NTFS partitions. This is not targeted for MFC.	2012-10-17 11:30:00 +00:00
Alexander Motin	c6f0cd57e3	NULL-ify last previously used pointer instead of last possible pointer. This should be only a cosmetic change. Found by: Clang Static Analyzer	2012-10-10 20:41:37 +00:00
Alexander Motin	6871a543f9	Make graid command line a bit more friendly by allowing volume name or provider name to be specified instead of geom name (first argument in all subcommands except label). In most cases there is only one array used any way, so it is not really useful to make user type ugly geom names like Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-07 19:30:16 +00:00
Andriy Gapon	a90c9dfeab	g_part_taste: directly destroy consumer and geom here, no need for withering Besides withered but still alive consumers may interfere with re-tasting. MFC after: 16 days	2012-10-06 19:52:50 +00:00
Pawel Jakub Dawidek	5d8a6a1078	Remove the topology lock from disk_gone(), it might be called with regular mutexes held and the topology lock is an sx lock. The topology lock was there to protect traversing through the list of providers of disk's geom, but it seems that disk's geom has always exactly one provider. Change the code to call g_wither_provider() for this one provider, which is safe to do without holding the topology lock and assert that there is indeed only one provider. Discussed with: ken MFC after: 1 week	2012-09-28 08:22:51 +00:00
Pawel Jakub Dawidek	171f6b3a34	Use the topology lock to protect list of providers while withering them. It is possible that provider is destroyed while we are iterating over the list. Reported by: Brian Parkison <parkison@panzura.com> Discussed with: phk MFC after: 1 week	2012-09-22 12:41:49 +00:00
Andriy Gapon	85f5b9aa70	g_disk_flushcache definitely should not be traced under G_T_TOPOLOGY ... use G_T_BIO instead MFC after: 1 week	2012-09-18 07:57:34 +00:00
Alexander Motin	c89d2fbe18	Add global and per-module sysctls/tunables to enable/disable metadata taste. That should help to handle some cases when disk has some RAID metadata that should be ignored, especially during boot. MFC after: 3 days	2012-09-13 13:27:09 +00:00
Gleb Smirnoff	4a7f7b10b5	When synchronizing, include in the config dump amount of bytes synchronized. The rationale behind this is the following: for large disks the percent synchronisation counter ticks too seldom, and monitoring software (as well as human operator) can't tell whether synchronisation goes on or one of disks got stuck. On an idle server one can look into gstat and see whether synchronisation goes on or not, but on a busy server that won't work. Also, new value monitored can be differentiated obtaining the synchronisation speed quite precisely. Submitted by: Konstantin Kukushkin <dark ramtel.ru> Reviewed by: pjd	2012-09-11 20:20:13 +00:00
Pawel Jakub Dawidek	769afdc71e	Allow to pass providers with /dev/ prefix to g_provider_by_name(). MFC after: 3 days	2012-09-01 10:52:19 +00:00
Ed Schouten	24d1105dde	Remove unneeded G_PF_CANDELETE flag. This flag is only used by GEOM so it can be propagated to the character device's SI_CANDELETE. Unfortunately, SI_CANDELETE seems to do nothing.	2012-08-28 19:28:31 +00:00
Thomas Quinot	8fb378d6b1	(g_multipath_rotate): Fix algorithm so that it does rotate over all good providers, not just the last two. PR: kern/170379 Reviewed by: mav MFC after: 2 weeks	2012-08-25 10:36:31 +00:00
Pawel Jakub Dawidek	9d18043979	Always initialize sc_ekey, because as of r238116 it is always used. If GELI provider was created on FreeBSD HEAD r238116 or later (but before this change), it is using very weak keys and the data is not protected. The bug was introduced on 4th July 2012. One can verify if its provider was created with weak keys by running: # geli dump <provider> | grep version If the version is 7 and the system didn't include this fix when provider was initialized, then the data has to be backed up, underlying provider overwritten with random data, system upgraded and provider recreated. Reported by: Fabian Keil <fk@fabiankeil.de> Tested by: Fabian Keil <fk@fabiankeil.de> Discussed with: so MFC after: 3 days	2012-08-10 18:43:29 +00:00
Alexander Motin	d9d6849693	Add missing FAILED event to g_raid_subdisk_event2str() to print it properly in debug messages. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>	2012-08-10 13:36:33 +00:00
Jim Harris	82a6ae1009	Clone BIO_ORDERED flag, for disk drivers (namely CAM) that try to consume it. Sponsored by: Intel Discussed with: gibbs, scottl	2012-08-07 20:16:10 +00:00
Mikolaj Golub	1d9db37c77	In g_gate_dumpconf() always check the result of g_gate_hold(). This fixes "Negative sc_ref" panic possible when sysctl_kern_geom_confxml() is run simultaneously with destroying GATE device. Reviewed by: pjd MFC after: 3 days	2012-08-07 18:50:33 +00:00
Jim Harris	c1d00eabe8	In virstor_ctl_stop(), check for a valid softc before trying to update metadata. Sponsored by: Intel Reported and tested by: Marcelo Gondim <gondim at bsdinfo dot com dot br> PR: kern/170199 MFC after: 3 days	2012-08-03 20:24:16 +00:00
Thomas Quinot	71ee4ef0d9	New command "gmultipath prefer" to force selection of a specified provider in an Active/Passive configuration. Reviewed by: mav MFC after: 4 weeks	2012-08-03 14:55:35 +00:00
Alexander Motin	e521fb0558	Partially revert r238886 in part of GEOM_VFS spoiling. This change triggered interesting foot shooting condition in GEOM when RW access to root partition by fsck spoils VFS geom there, which has it opened RO at the same time. Seems spoiling concept needs some rework.	2012-07-29 20:04:09 +00:00
Alexander Motin	3631c6382f	Implement media change notification for DA and CD removable media devices. It includes three parts: 1) Modifications to CAM to detect media changes and report them to disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes Asynchronous Notification mechanism to receive events from hardware. Active polling with TEST UNIT READY commands with 3 seconds period is used for incapable hardware. After that both CD and DA drivers work the same way, detecting two conditions: "NOT READY: Medium not present" after medium was detected previously, and "UNIT ATTENTION: Not ready to ready change, medium may have changed". First one reported to disk(9) as media removal, second as media insert/change. To reliably receive second event new AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by generic error handling code in cam_periph_error(). 2) Modifications to GEOM core to handle media remove and change events. Media removal handled by spoiling all consumers attached to the provider. Media change event also schedules provider retaste after spoiling to probe new media. New flag G_CF_ORPHAN was added to consumers to reflect that consumer is in process of destruction. It allows retaste to create new geom instance of the same class, while previous one is still dying. 3) Modifications to some GEOM classes: DEV -- to report media change events to devd; VFS -- to handle spoiling same as orphan to prevent accessing replaced media. PART class already handles spoiling alike to orphan. Reviewed by: silence on geom@ and scsi@ Tested by: avg Sponsored by: iXsystems, Inc. / PC-BSD MFC after: 2 months	2012-07-29 11:51:48 +00:00
Mikolaj Golub	a277f47bd2	Reorder things in g_gate_create() so at the moment when g_new_geomf() is called name is properly initialized. Discussed with: pjd MFC after: 2 weeks	2012-07-28 16:30:50 +00:00
Edward Tomasz Napierala	a1cf7f75a6	Make it possible to resize opened partitions. Sponsored by: FreeBSD Foundation	2012-07-20 17:51:20 +00:00
Edward Tomasz Napierala	3a3ef28e15	Add missing free.	2012-07-18 07:26:20 +00:00
Kenneth D. Merry	edad9799e8	Add back spare fields consumed in r237545. It seems that these should only be consumed to maintain backward compatibility in stable, but should not be consumed in head. Submitted by: trasz, attilio (indirectly)	2012-07-17 22:16:10 +00:00
Edward Tomasz Napierala	9e9d445ed1	The resize GEOM event has no references, thus cannot be canceled.	2012-07-16 17:41:38 +00:00
Edward Tomasz Napierala	8fe7677998	Add back spare fields reused in r238213. According to Attilio, the rule is to use reuse spares only when MFC-ing, not in CURRENT.	2012-07-16 16:50:28 +00:00
Edward Tomasz Napierala	7027e4dac4	Add trivial resize handling to gnop(8). Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 22:22:13 +00:00
Edward Tomasz Napierala	74badfa6ba	Add trivial resize handling to gmountver(8). Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 22:20:47 +00:00
Edward Tomasz Napierala	bc97ce36f7	Add disk_resize(), to make it possible for the disk drivers such as da(4) to notify GEOM about LUN size change. Reviewed by: mav (earlier version) Sponsored by: FreeBSD Foundation	2012-07-07 21:28:31 +00:00
Edward Tomasz Napierala	245899cc97	Add a new GEOM method, resize(), which is called after provider size changes. Add a new routine, g_resize_provider(), to use to notify GEOM about provider change. Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 20:13:40 +00:00
Edward Tomasz Napierala	ad624005b3	Fix orphan() methods of several GEOM classes to not assume that there is an error set on the provider. With GEOM resizing, class can become orphaned when it doesn't implement resize() method and the provider size decreases. Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 17:09:44 +00:00
Edward Tomasz Napierala	aaaf515fde	Fix typo in the comment.	2012-07-06 15:46:38 +00:00
Pawel Jakub Dawidek	e08ec03778	Extend GEOM Gate class to handle read I/O requests directly within the kernel. This will allow HAST to read directly from the local component without even communicating userland daemon. Sponsored by: Panzura, http://www.panzura.com MFC after: 1 month	2012-07-04 20:16:28 +00:00
Pawel Jakub Dawidek	457bbc4f3a	Use correct part of the Master-Key for generating encryption keys. Before this change the IV-Key was used to generate encryption keys, which was incorrect, but safe - for the XTS mode this key was unused anyway and for CBC mode it was used differently to generate IV vectors, so there is no risk that IV vector collides with encryption key somehow. Bump version number and keep compatibility for older versions. MFC after: 2 weeks	2012-07-04 17:54:17 +00:00
Pawel Jakub Dawidek	3d47ea3324	Correct comment. MFC after: 3 days	2012-07-04 17:44:39 +00:00
Pawel Jakub Dawidek	ec58140a27	Correct a comment and correct style of a flag check. MFC after: 3 days	2012-07-04 17:43:25 +00:00
Gleb Smirnoff	d89862ac87	Make geom_mirror more friendly to SSDs. To properly support TRIM, we need to pass BIO_DELETE requests down to providers that support it. Also, we need to announce our support for BIO_DELETE to upper consumer. This requires: - In g_mirror_start() return true for "GEOM::candelete" request. - In g_mirror_init_disk() probe below provider for "GEOM::candelete" attribute, and mark disk with a flag if it does support BIO_DELETE. - In g_mirror_register_request() distribute BIO_DELETE requests only to those disks, that do support it. Note that we announce "GEOM::candelete" as true unconditionally of whether we have TRIM-capable media down below or not. This is made intentionally, because upper consumer (usually UFS) requests the attribute only once at mount time. And if user ever migrates his mirror from HDDs to SSDs, then he/she would get TRIM working without remounting filesystem. Reviewed by: pjd	2012-07-01 15:43:52 +00:00
Gleb Smirnoff	b0ae63ca25	In g_mirror_regular_request() upon successful delivery treat BIO_DELETE requests same way as BIO_WRITE removing them from queue. This fixes panic with BIO_DELETE operations on geom_mirror. Reviewed by: pjd	2012-07-01 15:30:43 +00:00
Warner Losh	a920522660	Use %j to match intmax_t.	2012-07-01 05:22:13 +00:00
Brooks Davis	9e81f117f9	MFP4 #212266 Fix compile on MIPS64. Sponsored by: DARPA, AFRL	2012-06-29 20:15:00 +00:00
Kenneth D. Merry	c76a6fe732	In g_disk_providergone(), don't continue if the softc is NULL. This may be the case if we've already gone through g_disk_destroy(). Reported by: Michael Butler <imb@protected-networks.net> MFC after: 3 days	2012-06-27 16:05:09 +00:00
Kenneth D. Merry	365e076ed2	Consume spare fields for the providergone pointers added to the g_class and g_geom structures in change 237518. The original change would have broken the ABI. Suggested by: ae MFC after: 4 days	2012-06-25 04:26:10 +00:00
Kenneth D. Merry	c3fb2891f0	Fix a bug which causes a panic in daopen(). The panic is caused by a da(4) instance going away while GEOM is still probing it. In this case, the GEOM disk class instance has been created by disk_create(), and the taste of the disk is queued in the GEOM event queue. While that event is queued, the da(4) instance goes away. When the open call comes into the da(4) driver, it dereferences the freed (but non-NULL) peripheral pointer provided by GEOM, which results in a panic. The solution is to add a callback to the GEOM disk code that is called when all of its resources are cleaned up. This is implemented inside GEOM by adding an optional callback that is called when all consumers have detached from a provider, and the provider is about to be deleted. scsi_cd.c, scsi_da.c: In the register routine for the cd(4) and da(4) routines, acquire a reference to the CAM peripheral instance just before we call disk_create(). Use the new GEOM disk d_gone() callback to register a callback (dadiskgonecb()/cddiskgonecb()) that decrements the peripheral reference count once GEOM has finished cleaning up its resources. In the cd(4) driver, clean up open and close behavior slightly. GEOM makes sure we only get one open() and one close call, so there is no need to set an open flag and decrement the reference count if we are not the first open. In the cd(4) driver, use cam_periph_release_locked() in a couple of error scenarios to avoid extra mutex calls. geom.h: Add a new, optional, providergone callback that is called when a provider is about to be deleted. geom_disk.h: Add a new d_gone() callback to the GEOM disk interface. Bump the DISK_VERSION to version 2. This probably should have been done after a couple of previous changes, especially the addition of the d_getattr() callback. geom_disk.c: Add a providergone callback for the disk class, g_disk_providergone(), that calls the user's d_gone() callback if it exists. Bump the DISK_VERSION to 2. geom_subr.c: In g_destroy_provider(), call the providergone callback if it has been provided. In g_new_geomf(), propagate the class's providergone callback to the new geom instance. blkfront.c: Callers of disk_create() are supposed to pass in DISK_VERSION, not an explicit disk API version number. Update the blkfront driver to do that. disk.9: Update the disk(9) man page to include information on the new d_gone() callback, as well as the previously added d_getattr() callback, d_descr field, and HBA PCI ID fields. MFC after: 5 days	2012-06-24 04:29:03 +00:00
Andrey V. Elsukov	d4746e107f	Always reconstruct partition entries in the PMBR when Boot Camp is disabled. This helps to easily recover from situations when PMBR is damaged and contains no entries. MFC after: 1 week	2012-06-14 11:17:54 +00:00
Alexander Motin	a839e33278	Add missing newlines into XML output. MFC after: 3 days Sponsored by: iXsystems, Inc.	2012-06-05 16:46:34 +00:00
Marcel Moolenaar	f24a8224b2	Add a partition type for nandfs to the apm, bsd, gpt and vtoc8 schemes. The gpart alias for these partition types is "freebsd-nandfs".	2012-05-25 20:33:34 +00:00
Edward Tomasz Napierala	d87e55886e	Revert r235918 for now and add comment explaining the reason for the size check.	2012-05-25 10:08:48 +00:00
Edward Tomasz Napierala	202f0f2a02	Make g_label(4) ignore provider size when looking for UFS labels. Without it, it fails to create labels for filesystems resized by growfs(8). PR: kern/165962 Submitted by: Olivier Cochard-Labbe <olivier at cochard dot me>	2012-05-24 16:48:33 +00:00
Xin LI	8287ee1bbe	- Correct signedness for casts; - Wrap long line while I'm there. Noticed by: pjd, avg	2012-05-23 20:51:21 +00:00
Xin LI	dc89cfa691	Use %ju to match uintmax_t usage	2012-05-23 18:17:02 +00:00
Xin LI	2920997423	Use %j and cast off_t to intmax_t for now to fix build. Noticed by: bz	2012-05-23 17:49:59 +00:00
Grzegorz Bernacki	4ffd4dfe17	Add a new geom class which allows to divide NAND Flash chip into partitions. Partitions are created based on data in dts file which are extracted and interpreted by slicer. Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks	2012-05-22 08:33:14 +00:00
Andrey V. Elsukov	f931cd70af	Prevent removing of the last active component from a mirror. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:22:21 +00:00
Andrey V. Elsukov	1ee0138d2f	Introduce new device flag G_MIRROR_DEVICE_FLAG_TASTING. It should protect geom from destroying while it is tasting. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:19:07 +00:00
Eitan Adler	615a3e398d	Add missing period at the end of the error message Submitted by: pjd Approved by: cperciva (implicit) MFC after: 3 days X-MFC-With: r235201	2012-05-13 23:27:06 +00:00
Alexander Motin	ef844ef76f	- Prevent error status leak if write to some of the RAID1/1E volume disks failed while write to some other succeeded. Instead mark disk as failed. - Make RAID1E less aggressive in failing disks to avoid volume breakage. MFC after: 2 weeks	2012-05-11 13:20:17 +00:00
Eitan Adler	af23b88b5c	Clarify error that geli generates when it finds corrupt data. PR: kern/165695 Submitted by: Robert Simmons <rsimmons0@gmail.com> Reviewed by: pjd Approved by: cperciva MFC after: 1 week	2012-05-09 17:26:52 +00:00
Alexander Motin	14f9f25ba0	Remove some hardcoded constants from code.	2012-05-06 16:41:27 +00:00
Alexander Motin	eb3b1cd0de	Plug small memory leaks.	2012-05-06 12:55:20 +00:00
Alexander Motin	8f12ca2ee1	Add support for RAID5R. Slightly improve support for RAIDMDF.	2012-05-06 11:32:36 +00:00
Alexander Motin	c0b1ef6661	Fix `gmultipath configure` for big-endian machines. MFC after: 1 week	2012-05-06 05:49:23 +00:00
Alexander Motin	86b0366909	Fix bug causing memory corruption and panics with big-endian metadata.	2012-05-04 08:59:19 +00:00
Alexander Motin	4b97ff6137	Implement read-only support for volumes in optimal state (without using redundancy) for the following RAID levels: RAID4/5E/5EE/6/MDF.	2012-05-04 07:32:57 +00:00
Alexander Motin	8df8e26adc	Add optional -o argument to the `graid label` to specify some metadata format options. Use it for specifying byte order for the DDF metadata: big-endian defined by specification and little-endian used by Adaptec.	2012-05-03 05:32:56 +00:00
Alexander Motin	d525d87560	Improve spare disks support. Unluckily, for some reason Adaptec 1430SA RAID BIOS doesn't want to understand spare disks created by graid. But at least spares created by BIOS are working fine now.	2012-05-01 18:00:31 +00:00
Alexander Motin	2b9c925ff0	Implement volume deletion if disk has more than one partition.	2012-05-01 09:21:21 +00:00
Alexander Motin	47e980965c	Improve DDF metadata writing.	2012-05-01 08:19:29 +00:00
Alexander Motin	00f32ecbd0	Add to GEOM RAID class module, supporting the DDF metadata format, as defined by the SNIA Common RAID Disk Data Format Specification v2.0. Supports multiple volumes per array and multiple partitions per disk. Supports standard big-endian and Adaptec's little-endian byte ordering. Supports all single-layer RAID levels. Dual-layer RAID levels except RAID10 are not supported now because of GEOM RAID design limitations. Some work is still to be done, but the present code already manages basic interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller. MFC after: 1 month Sponsored by: iXsystems, Inc.	2012-04-30 17:53:02 +00:00
Alexander Motin	c9f545e5f9	s/gmirror/graid/	2012-04-29 19:40:50 +00:00
Alexander Motin	7b2a8d7823	Fix RAID5 level names changed at r234603.	2012-04-27 08:49:15 +00:00
Alexander Motin	bafd0b5b0a	Fix copy-paste typo in r234603. Submitted by: kan	2012-04-23 16:35:19 +00:00
Alexander Motin	dbb2e75504	Add names for all primary RAID levels defined by DDF 2.0 specification.	2012-04-23 13:04:02 +00:00
Alexander Motin	e26083ca69	Add sos@ copyrights to RAID metadata modules, respecting his efforts in decoding metadata formats in ataraid(4) code.	2012-04-23 09:39:39 +00:00
Alexander Motin	fc1de96060	Add to GEOM RAID class module for reading non-degraded RAID5 volumes and some environment to differentiate 4 possible RAID5 on-disk layouts. Tested with Intel and AMD RAID BIOSes. MFC after: 2 weeks	2012-04-19 12:30:12 +00:00
Dmitry Morozovsky	b20e4de387	VMware environments are not unusual now. Add VMware partitions recognition (both MBR for ESXi <= 4.1 and GPT for ESXi 5) to g_part. Reviewed by: ae Approved by: ae MFC after: 2 weeks	2012-04-18 11:59:03 +00:00
Alexander Motin	63297dfd4a	Some improvements to GEOM MULTIPATH: - Implement "configure" command to allow switching operation mode of running device on-fly without destroying and recreation. - Implement Active/Read mode as hybrid of Active/Active and Active/Passive. In this mode all paths not marked FAIL may handle reads same time, but unlike Active/Active only one path handles write requests at any point in time. It allows to closer follow original write request order if above layers need it for data consistency (not waiting for requisite write completion before sending dependent write). - Hide duplicate messages about device status change. - Remove periodic thread wake up with 10Hz rate. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2012-04-18 09:42:14 +00:00
Kirk McKusick	85121b0979	Expand locking around identification of filesystem mount point when accounting for I/O counts at completion of I/O operation. Also switch from using global devmtx to vnode mutex to reduce contention. Suggested and reviewed by: kib	2012-04-08 06:20:21 +00:00
Andrey V. Elsukov	ba289b84b0	VMDB offset should be greater than logical volume size only for MBR.	2012-03-29 07:29:27 +00:00
Andrey V. Elsukov	1c45872b03	Do proper cleanup for the GPT case when an error occurs.	2012-03-29 06:37:02 +00:00
Kirk McKusick	1faacf5d09	Keep track of the mount point associated with a special device to enable the collection of counts of synchronous and asynchronous reads and writes for its associated filesystem. The counts are displayed using `mount -v'. Ensure that buffers used for paging indicate the vnode from which they are operating so that counts of paging I/O operations from the filesystem are collected. This checkin only adds the setting of the mount point for the UFS/FFS filesystem, but it would be trivial to add the setting and clearing of the mount point at filesystem mount/unmount time for other filesystems too. Reviewed by: kib	2012-03-28 20:49:11 +00:00
Andrey V. Elsukov	472794bb9f	Check that scheme is not already registered. This may happen when a KLD is preloaded with loader(8) and leads to infinite loops. Also do not return EEXIST error code from MOD_LOAD handler, because we have undocumented(?) ability replace kernel's module with preloaded one. And if we have so, then preloaded module will be initialized first. Thus error in MOD_LOAD handler will be triggered for the kernel. PR: kern/165573 MFC after: 3 weeks	2012-03-23 07:26:17 +00:00
Andrey V. Elsukov	f1104f7190	Add CTLFLAG_TUN to sysctls. MFC after: 1 month	2012-03-19 13:21:10 +00:00
Andrey V. Elsukov	37d1a121d9	Add new GEOM_PART_LDM module that implements the Logical Disk Manager scheme. The LDM is a logical volume manager for MS Windows NT and it is also known as dynamic volumes. It supports about 2000 partitions and also provides the capability for software RAID implementations. This version implements only partitioning scheme capability and based on the linux-ntfs project documentation and several publications across the Web. NOTE: JBOD, RAID0 and RAID5 volumes aren't supported. An access to the LDM metadata is read-only. When LDM is on the disk partitioned with MBR we can also destroy metadata. For the GPT partitioned disks destroy action is not supported. Reviewed by: ivoras (previous version) MFC after: 1 month	2012-03-19 13:14:44 +00:00
Andrey V. Elsukov	422783e365	Make kern.geom.part node not static. Also add CTLFLAG_TUN to the check_integrity sysctl. MFC after: 1 month	2012-03-19 12:57:52 +00:00
Andrey V. Elsukov	5284aff594	Add MODULE_DEPEND() to geom_part modules. MFC after: 2 weeks	2012-03-15 08:39:10 +00:00
Ed Maste	972f6945b8	Remove unactionable message about label geometry It's not clear to a user what they should do after seeing the "geometry does not match label" kernel message, and it does not appear to present a problem in practice. Thus, just remove the messages. Approved by: marcel	2012-03-08 01:48:44 +00:00
Andrey V. Elsukov	5357f27569	If nested scheme allows dump kernel to its partition, we may allow dump for the parent partition too. MFC after: 2 weeks	2012-02-20 06:35:52 +00:00
Andrey V. Elsukov	c3f9f306d2	Add alias for the partition type 0x0f. Now "ebr" name is used for both types 0x05 and 0x0f, but 0x05 is preferred and used when partition is created with "gpart add -t ebr ...". This should keep EBR partitions accessible after r231754 for those, who have EBR on the partition with type 0x0f.	2012-02-20 05:48:57 +00:00
Andrey V. Elsukov	3bcf7d7191	Add additional check to EBR probe and create methods: don't try probe and create EBR scheme when parent partition type is not "ebr". This fixes error messages about corrupted EBR for some partitions where is actually another partition scheme. NOTE: if you have EBR on the partition with different than "ebr" (0x05) type, then you will lose access to partitions until it will be changed. MFC after: 2 weeks	2012-02-15 10:33:29 +00:00
Andrey V. Elsukov	0d8bc07eba	Add PART::type attribute handler. It returns partition type as string. MFC after: 2 weeks	2012-02-15 10:02:19 +00:00
Andrey V. Elsukov	48ef46e55a	Add alias for the partition with type 0x42 to the MBR scheme. MFC after: 1 week	2012-02-10 09:55:18 +00:00
Andrey V. Elsukov	f44d97bd0c	Let's be more realistic and limit maximum number of partition to 4k. MFC after: 1 week	2012-02-10 06:44:30 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Ed Maste	23f6856fff	Correct typo in comment (numbver)	2012-02-04 18:14:39 +00:00
Andrey V. Elsukov	7b540236bb	The scheme code may not know about some inconsistency in the metadata. So, add an integrity check after recovery attempt. MFC after: 1 week	2012-02-01 09:28:16 +00:00
Attilio Rao	5d7380f8e3	Avoid to check the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly. STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage. In collabouration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424	2012-01-28 14:00:21 +00:00
Nathan Whitehorn	090dd24636	Experimental support for booting CHRP-type PowerPC systems from hard disks.	2012-01-25 03:37:39 +00:00
Don Lewis	b5bad28182	Allow an MBR primary or extended Linux swap partition to be specified as the system dump device. This was already allowed for GPT. The Linux swap metadata at the beginning of the partition should not be disturbed because the crash dump is written at the end. Reviewed by: alfred, pjd, marcel MFC after: 2 weeks	2012-01-13 18:32:56 +00:00
Jim Harris	c1ad3fcf6a	Add support for >2TB disks in GEOM RAID for Intel metadata format. Reviewed by: mav Approved by: scottl MFC after: 1 week	2012-01-09 23:01:42 +00:00
Aleksandr Rybalko	ce96bb7942	GEOM_UNCOMPRESS module, can be used with uzip images and with new ulzma images. Approved by: adrian (mentor)	2012-01-04 23:39:11 +00:00
Andriy Gapon	f6ce353e58	replace uses of libkern gets with cngets MFC after: 2 months	2011-12-17 15:26:34 +00:00
Alexander Motin	a2fa37fe67	Close race between geom destruction on g_vfs_close() when softc destroyed and g_vfs_orphan() call that tries to access softc, introduced at r227015. PR: kern/162997	2011-12-02 17:09:48 +00:00
Andrey V. Elsukov	a85a0d469e	Add an ability to increase number of allocated APM entries when we have reserved free space in the APM area. Also instead of one write request per each APM entry, use MAXPHY sized writes when we are updating APM. MFC after: 1 month	2011-11-28 16:07:26 +00:00
Andrey V. Elsukov	64c4a83782	The size of APM could be bigger than number of already allocated entries. And the first usable sector should not start from the inside of APM area. MFC after: 1 month	2011-11-28 12:38:24 +00:00
Alexander Motin	107c1508fa	Temporary revert r227009 to fix freeze on UP systems without PREEMPTION. Before r215687, if some withered geom or provider could not be destroyed, g_event thread went to sleep for 0.1s before retrying. After that change it is just restarting immediately. r227009 made orphaned (withered) provider to not detach immediately, but only after context switch. That made loop inside g_event thread infinite on UP systems without PREEMPTION. To address original problem with possible dead lock addressed by r227009 we have to fix r215687 change first, that needs some time to think and test.	2011-11-14 19:32:05 +00:00
Alexander Motin	0c883cef45	Major GEOM MULTIPATH class rewrite: - Improved locking and destruction process to fix crashes. - Improved "automatic" configuration method to make it consistent and safe by reading metadata back from all specified paths after writing to one. - Added provider size check to reduce chance of ordering conflict with other GEOM classes. - Added "manual" configuration method without using on-disk metadata. - Added "add" and "remove" commands to allow manage paths manually. - Failed paths are no longer dropped from geom, but only marked as FAIL and excluded from I/O operations. - Automatically restore failed paths when all others paths are marked as failed, for example, because of device-caused (not transport) errors. - Added "fail" and "restore" commands to manually control FAIL flag. - geom is now destroyed on last path disconnection. - Added optional Active/Active mode support. Unlike Active/Passive mode, load evenly distributed between all working paths. If supported by the device, it allows to significantly improve performance, utilizing bandwidth of all paths. It is controlled by -A option during creation. Disabled by default now. - Improved `status` and `list` commands output. Sponsored by: iXsystems, inc. MFC after: 1 month	2011-11-12 09:52:27 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Ed Schouten	d745c852be	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
Alexander Motin	ea5791d7ab	Add mutex and two flags to make orphan() call properly asynchronous: - delay consumer closing and detaching on orphan() until all I/Os complete; - prevent new I/Os submission after orphan() called. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-02 09:24:59 +00:00
Alexander Motin	755d1ea5b5	Make orphan() method in geom_dev asynchronous using destroy_dev_sched_cb() instead of destroy_dev(). It moves device destruction waiting out of the topology lock and so fixes dead lock between orphanization and closing. Real provider and geom destruction called from swi context after device destroyed as callback of the destroy_dev_sched_cb().	2011-11-01 23:12:22 +00:00
Alexander Motin	df96fd6e14	Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-01 20:56:19 +00:00
Alexander Motin	0849a53fc0	Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-01 17:04:42 +00:00
Alexander Motin	20a5d5dc60	Workaround the problem introduced by combination of r162200 and r215687. r162200 delays provider orphanization until all running requests complete, to workaround broken orphan() method implementation in some classes. r215687 removes persistent periodic (10Hz) event thread wake ups. Together these changes can indefinitely delay orphanization until some other event wake up the event thread. One consequence of this is inability of CAM to destroy device disconnected when busy and, as consequence, create new one after reconnection. While the best solution would be to revert r162200, it is not easy, as some classes still look broken in that way. Instead conditionally wake up event thread if there are some providers waiting for orphanization. MFC after: 1 week	2011-11-01 08:57:49 +00:00
Andrey V. Elsukov	aea26bc05a	Our geom withering function could take some time before geom with its providers and consumers will be destroyed. Before take some actions with a geom, check that it is not destroyed at the moment. Tested by: nwhitehorn MFC after: 1 week	2011-10-28 11:45:24 +00:00
Pawel Jakub Dawidek	0c879bd990	Before this change when GELI detected hardware crypto acceleration it will start only one worker thread. For software crypto it will start by default N worker threads where N is the number of available CPUs. This is not optimal if hardware crypto is AES-NI, which uses CPU for AES calculations. Change that to always start one worker thread for every available CPU. Number of worker threads per GELI provider can be easily reduced with kern.geom.eli.threads sysctl/tunable and even for software crypto it should be reduced when using more providers. While here, when number of threads exceeds number of CPUs available don't reduce this number, assume the user knows what he is doing. Reported by: Yuri Karaban <dev@dev97.com> MFC after: 3 days	2011-10-27 16:12:25 +00:00
Alexander Motin	733a1f3f52	Clarify disks/volumes above 2TiB support in geom_raid: - add support for volumes above 2TiB with Promise metadata format; - enforce and document other limitations: - Intel and Promise metadata formats do not support disks above 2TiB; - NVIDIA metadata format does not support volumes above 2TiB. Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2011-10-26 21:50:10 +00:00
Pawel Jakub Dawidek	92f84a9fae	Allow upper layers to discover that BIO_DELETE and/or BIO_FLUSH is not supported by returning EOPNOTSUPP instead of 0 or ENODEV. MFC after: 3 days	2011-10-25 14:07:17 +00:00
Pawel Jakub Dawidek	37f0f0a75e	Improve style a bit. MFC after: 3 days	2011-10-25 14:05:39 +00:00
Pawel Jakub Dawidek	9495476273	Simplify disk_alloc(). MFC after: 3 days	2011-10-25 14:04:59 +00:00
Pawel Jakub Dawidek	1f8c92e6fa	Add support for creating GELI devices with older metadata version for use with older FreeBSD versions: - Add -V option to 'geli init' to specify version number. If no -V is given the most recent version is used. - If -V is given don't allow to use features not supported by this version. - Print version in 'geli list' output. - Update manual page and add table describing which GELI version is supported by which FreeBSD version, so one can use it when preparing GELI device for older FreeBSD version. Inspired by: Garrett Cooper <yanegomi@gmail.com> MFC after: 3 days	2011-10-25 13:57:50 +00:00
Pawel Jakub Dawidek	effb9912c7	When decoding metadata, check magic string, so we know this is not GELI device before we check its version. We don't want to report that some garbage is unsupported version if this is not even GELI provider. MFC after: 3 days	2011-10-25 13:44:23 +00:00
Pawel Jakub Dawidek	0e236b6c47	Prefer G_ELI_VERSION_* defines for version numbers over plain digits. MFC after: 3 days	2011-10-25 13:09:22 +00:00

1884 Commits