freebsd-nq

Author	SHA1	Message	Date
Alfred Perlstein	bad7e7f3dd	Provide a device name in the sysctl tree for programs to query the state of crashdump target devices. This will be used to add a "-l" (ell) flag to dumpon(8) to list the currently configured dumpdev. Reviewed by: phk	2012-11-01 17:01:05 +00:00
Edward Tomasz Napierala	549f62fa42	Fix problem with geom_label(4) not recognizing UFS labels on filesystems extended using growfs(8). The problem here is that geom_label checks if the filesystem size recorded in UFS superblock is equal to the provider (i.e. device) size. This check cannot be removed due to backward compatibility. On the other hand, in most cases growfs(8) cannot set fs_size in the superblock to match the provider size, because, differently from newfs(8), it cannot recompute cylinder group sizes. To fix this problem, add another superblock field, fs_providersize, used only for this purpose. The geom_label(4) will attach if either fs_size (filesystem created with newfs(8)) or fs_providersize (filesystem expanded using growfs(8)) matches the device size. PR: kern/165962 Reviewed by: mckusick Sponsored by: FreeBSD Foundation	2012-10-30 21:32:10 +00:00
Alexander Motin	650e245ebf	Minor addition to r242323: Alike to BIO_WRITE, report success if at least one subdisk succeeded with BIO_DELETE. But unlike BIO_WRITE don't fail disk on BIO_DELETE error. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 21:08:06 +00:00
Alexander Motin	609a74746a	Add basic BIO_DELETE support to GEOM RAID class for all RAID levels. If at least one subdisk in the volume supports it, BIO_DELETE requests will be propagated down. Unfortunately, for RAID levels with redundancy, unmapped blocks will be mapped back during the first rebuild/resync process. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 18:04:38 +00:00
Edward Tomasz Napierala	1af2d09b49	Fix locking problem in disk_resize(); previously it would run without topology lock, resulting in assertion when running with DIAGNOSTIC. Reviewed by: mav (earlier version)	2012-10-29 17:52:43 +00:00
Alexander Motin	a479c51be3	Make GEOM RAID more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync to shutdown_post_sync stage to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. MFC after: 2 weeks	2012-10-29 14:18:54 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
Attilio Rao	682ee99e7a	It seems that it is preferable to keep support for glabel also for filesystems that we don't support natively. Revert part of r241636 to do so. This patch is not targeted for MFC. Requested by: gleb, jhb	2012-10-18 22:18:11 +00:00
Attilio Rao	a42ac676f5	Disconnect non-MPSAFE NTFS from the build in preparation for dropping GIANT from VFS. This code is particularly broken and fragile and other in-kernel implementations around, found in other operating systems, don't really seem clean and solid enough to be imported at all. If someone wants to reconsider in-kernel NTFS implementation for inclusion again, a fair effort for completely fixing and cleaning it up is expected. In the meantime, regular NTFS users can use the FUSE interface and the ntfs-3g port to work with their NTFS partitions. This is not targeted for MFC.	2012-10-17 11:30:00 +00:00
Alexander Motin	c6f0cd57e3	NULL-ify last previously used pointer instead of last possible pointer. This should be only a cosmetic change. Found by: Clang Static Analyzer	2012-10-10 20:41:37 +00:00
Alexander Motin	6871a543f9	Make graid command line a bit more friendly by allowing volume name or provider name to be specified instead of geom name (first argument in all subcommands except label). In most cases there is only one array used any way, so it is not really useful to make user type ugly geom names like Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-07 19:30:16 +00:00
Andriy Gapon	a90c9dfeab	g_part_taste: directly destroy consumer and geom here, no need for withering. Besides, withered but still alive consumers may interfere with re-tasting. MFC after: 16 days	2012-10-06 19:52:50 +00:00
Pawel Jakub Dawidek	5d8a6a1078	Remove the topology lock from disk_gone(), it might be called with regular mutexes held and the topology lock is an sx lock. The topology lock was there to protect traversing through the list of providers of disk's geom, but it seems that disk's geom has always exactly one provider. Change the code to call g_wither_provider() for this one provider, which is safe to do without holding the topology lock and assert that there is indeed only one provider. Discussed with: ken MFC after: 1 week	2012-09-28 08:22:51 +00:00
Pawel Jakub Dawidek	171f6b3a34	Use the topology lock to protect list of providers while withering them. It is possible that provider is destroyed while we are iterating over the list. Reported by: Brian Parkison <parkison@panzura.com> Discussed with: phk MFC after: 1 week	2012-09-22 12:41:49 +00:00
Andriy Gapon	85f5b9aa70	g_disk_flushcache definitely should not be traced under G_T_TOPOLOGY ... use G_T_BIO instead MFC after: 1 week	2012-09-18 07:57:34 +00:00
Alexander Motin	c89d2fbe18	Add global and per-module sysctls/tunables to enable/disable metadata taste. That should help to handle some cases when disk has some RAID metadata that should be ignored, especially during boot. MFC after: 3 days	2012-09-13 13:27:09 +00:00
Gleb Smirnoff	4a7f7b10b5	When synchronizing, include in the config dump the amount of bytes synchronized. The rationale behind this is the following: for large disks the percent synchronisation counter ticks too seldom, and monitoring software (as well as human operator) can't tell whether synchronisation goes on or one of disks got stuck. On an idle server one can look into gstat and see whether synchronisation goes on or not, but on a busy server that won't work. Also, new value monitored can be differentiated obtaining the synchronisation speed quite precisely. Submitted by: Konstantin Kukushkin <dark ramtel.ru> Reviewed by: pjd	2012-09-11 20:20:13 +00:00
Pawel Jakub Dawidek	769afdc71e	Allow to pass providers with /dev/ prefix to g_provider_by_name(). MFC after: 3 days	2012-09-01 10:52:19 +00:00
Ed Schouten	24d1105dde	Remove unneeded G_PF_CANDELETE flag. This flag is only used by GEOM so it can be propagated to the character device's SI_CANDELETE. Unfortunately, SI_CANDELETE seems to do nothing.	2012-08-28 19:28:31 +00:00
Thomas Quinot	8fb378d6b1	(g_multipath_rotate): Fix algorithm so that it does rotate over all good providers, not just the last two. PR: kern/170379 Reviewed by: mav MFC after: 2 weeks	2012-08-25 10:36:31 +00:00
Pawel Jakub Dawidek	9d18043979	Always initialize sc_ekey, because as of r238116 it is always used. If GELI provider was created on FreeBSD HEAD r238116 or later (but before this change), it is using very weak keys and the data is not protected. The bug was introduced on 4th July 2012. One can verify if its provider was created with weak keys by running: # geli dump <provider> | grep version If the version is 7 and the system didn't include this fix when provider was initialized, then the data has to be backed up, underlying provider overwritten with random data, system upgraded and provider recreated. Reported by: Fabian Keil <fk@fabiankeil.de> Tested by: Fabian Keil <fk@fabiankeil.de> Discussed with: so MFC after: 3 days	2012-08-10 18:43:29 +00:00
Alexander Motin	d9d6849693	Add missing FAILED event to g_raid_subdisk_event2str() to print it properly in debug messages. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>	2012-08-10 13:36:33 +00:00
Jim Harris	82a6ae1009	Clone BIO_ORDERED flag, for disk drivers (namely CAM) that try to consume it. Sponsored by: Intel Discussed with: gibbs, scottl	2012-08-07 20:16:10 +00:00
Mikolaj Golub	1d9db37c77	In g_gate_dumpconf() always check the result of g_gate_hold(). This fixes "Negative sc_ref" panic possible when sysctl_kern_geom_confxml() is run simultaneously with destroying GATE device. Reviewed by: pjd MFC after: 3 days	2012-08-07 18:50:33 +00:00
Jim Harris	c1d00eabe8	In virstor_ctl_stop(), check for a valid softc before trying to update metadata. Sponsored by: Intel Reported and tested by: Marcelo Gondim <gondim at bsdinfo dot com dot br> PR: kern/170199 MFC after: 3 days	2012-08-03 20:24:16 +00:00
Thomas Quinot	71ee4ef0d9	New command "gmultipath prefer" to force selection of a specified provider in an Active/Passive configuration. Reviewed by: mav MFC after: 4 weeks	2012-08-03 14:55:35 +00:00
Alexander Motin	e521fb0558	Partially revert r238886 in part of GEOM_VFS spoiling. This change triggered interesting foot shooting condition in GEOM when RW access to root partition by fsck spoils VFS geom there, which has it opened RO at the same time. Seems spoiling concept needs some rework.	2012-07-29 20:04:09 +00:00
Alexander Motin	3631c6382f	Implement media change notification for DA and CD removable media devices. It includes three parts: 1) Modifications to CAM to detect media changes and report them to disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes Asynchronous Notification mechanism to receive events from hardware. Active polling with TEST UNIT READY commands with 3 seconds period is used for incapable hardware. After that both CD and DA drivers work the same way, detecting two conditions: "NOT READY: Medium not present" after medium was detected previously, and "UNIT ATTENTION: Not ready to ready change, medium may have changed". First one reported to disk(9) as media removal, second as media insert/change. To reliably receive second event new AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by generic error handling code in cam_periph_error(). 2) Modifications to GEOM core to handle media remove and change events. Media removal handled by spoiling all consumers attached to the provider. Media change event also schedules provider retaste after spoiling to probe new media. New flag G_CF_ORPHAN was added to consumers to reflect that consumer is in process of destruction. It allows retaste to create new geom instance of the same class, while previous one is still dying. 3) Modifications to some GEOM classes: DEV -- to report media change events to devd; VFS -- to handle spoiling same as orphan to prevent accessing replaced media. PART class already handles spoiling alike to orphan. Reviewed by: silence on geom@ and scsi@ Tested by: avg Sponsored by: iXsystems, Inc. / PC-BSD MFC after: 2 months	2012-07-29 11:51:48 +00:00
Mikolaj Golub	a277f47bd2	Reorder things in g_gate_create() so at the moment when g_new_geomf() is called name is properly initialized. Discussed with: pjd MFC after: 2 weeks	2012-07-28 16:30:50 +00:00
Edward Tomasz Napierala	a1cf7f75a6	Make it possible to resize opened partitions. Sponsored by: FreeBSD Foundation	2012-07-20 17:51:20 +00:00
Edward Tomasz Napierala	3a3ef28e15	Add missing free.	2012-07-18 07:26:20 +00:00
Kenneth D. Merry	edad9799e8	Add back spare fields consumed in r237545. It seems that these should only be consumed to maintain backward compatibility in stable, but should not be consumed in head. Submitted by: trasz, attilio (indirectly)	2012-07-17 22:16:10 +00:00
Edward Tomasz Napierala	9e9d445ed1	The resize GEOM event has no references, thus cannot be canceled.	2012-07-16 17:41:38 +00:00
Edward Tomasz Napierala	8fe7677998	Add back spare fields reused in r238213. According to Attilio, the rule is to reuse spares only when MFC-ing, not in CURRENT.	2012-07-16 16:50:28 +00:00
Edward Tomasz Napierala	7027e4dac4	Add trivial resize handling to gnop(8). Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 22:22:13 +00:00
Edward Tomasz Napierala	74badfa6ba	Add trivial resize handling to gmountver(8). Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 22:20:47 +00:00
Edward Tomasz Napierala	bc97ce36f7	Add disk_resize(), to make it possible for the disk drivers such as da(4) to notify GEOM about LUN size change. Reviewed by: mav (earlier version) Sponsored by: FreeBSD Foundation	2012-07-07 21:28:31 +00:00
Edward Tomasz Napierala	245899cc97	Add a new GEOM method, resize(), which is called after provider size changes. Add a new routine, g_resize_provider(), to use to notify GEOM about provider change. Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 20:13:40 +00:00
Edward Tomasz Napierala	ad624005b3	Fix orphan() methods of several GEOM classes to not assume that there is an error set on the provider. With GEOM resizing, class can become orphaned when it doesn't implement resize() method and the provider size decreases. Reviewed by: mav Sponsored by: FreeBSD Foundation	2012-07-07 17:09:44 +00:00
Edward Tomasz Napierala	aaaf515fde	Fix typo in the comment.	2012-07-06 15:46:38 +00:00
Pawel Jakub Dawidek	e08ec03778	Extend GEOM Gate class to handle read I/O requests directly within the kernel. This will allow HAST to read directly from the local component without even communicating userland daemon. Sponsored by: Panzura, http://www.panzura.com MFC after: 1 month	2012-07-04 20:16:28 +00:00
Pawel Jakub Dawidek	457bbc4f3a	Use correct part of the Master-Key for generating encryption keys. Before this change the IV-Key was used to generate encryption keys, which was incorrect, but safe - for the XTS mode this key was unused anyway and for CBC mode it was used differently to generate IV vectors, so there is no risk that IV vector collides with encryption key somehow. Bump version number and keep compatibility for older versions. MFC after: 2 weeks	2012-07-04 17:54:17 +00:00
Pawel Jakub Dawidek	3d47ea3324	Correct comment. MFC after: 3 days	2012-07-04 17:44:39 +00:00
Pawel Jakub Dawidek	ec58140a27	Correct a comment and correct style of a flag check. MFC after: 3 days	2012-07-04 17:43:25 +00:00
Gleb Smirnoff	d89862ac87	Make geom_mirror more friendly to SSDs. To properly support TRIM, we need to pass BIO_DELETE requests down to providers that support it. Also, we need to announce our support for BIO_DELETE to upper consumer. This requires: - In g_mirror_start() return true for "GEOM::candelete" request. - In g_mirror_init_disk() probe below provider for "GEOM::candelete" attribute, and mark disk with a flag if it does support BIO_DELETE. - In g_mirror_register_request() distribute BIO_DELETE requests only to those disks, that do support it. Note that we announce "GEOM::candelete" as true unconditionally of whether we have TRIM-capable media down below or not. This is made intentionally, because upper consumer (usually UFS) requests the attribute only once at mount time. And if user ever migrates his mirror from HDDs to SSDs, then he/she would get TRIM working without remounting filesystem. Reviewed by: pjd	2012-07-01 15:43:52 +00:00
Gleb Smirnoff	b0ae63ca25	In g_mirror_regular_request() upon successful delivery treat BIO_DELETE requests same way as BIO_WRITE removing them from queue. This fixes panic with BIO_DELETE operations on geom_mirror. Reviewed by: pjd	2012-07-01 15:30:43 +00:00
Warner Losh	a920522660	Use %j to match intmax_t.	2012-07-01 05:22:13 +00:00
Brooks Davis	9e81f117f9	MFP4 #212266 Fix compile on MIPS64. Sponsored by: DARPA, AFRL	2012-06-29 20:15:00 +00:00
Kenneth D. Merry	c76a6fe732	In g_disk_providergone(), don't continue if the softc is NULL. This may be the case if we've already gone through g_disk_destroy(). Reported by: Michael Butler <imb@protected-networks.net> MFC after: 3 days	2012-06-27 16:05:09 +00:00
Kenneth D. Merry	365e076ed2	Consume spare fields for the providergone pointers added to the g_class and g_geom structures in change 237518. The original change would have broken the ABI. Suggested by: ae MFC after: 4 days	2012-06-25 04:26:10 +00:00
Kenneth D. Merry	c3fb2891f0	Fix a bug which causes a panic in daopen(). The panic is caused by a da(4) instance going away while GEOM is still probing it. In this case, the GEOM disk class instance has been created by disk_create(), and the taste of the disk is queued in the GEOM event queue. While that event is queued, the da(4) instance goes away. When the open call comes into the da(4) driver, it dereferences the freed (but non-NULL) peripheral pointer provided by GEOM, which results in a panic. The solution is to add a callback to the GEOM disk code that is called when all of its resources are cleaned up. This is implemented inside GEOM by adding an optional callback that is called when all consumers have detached from a provider, and the provider is about to be deleted. scsi_cd.c, scsi_da.c: In the register routine for the cd(4) and da(4) routines, acquire a reference to the CAM peripheral instance just before we call disk_create(). Use the new GEOM disk d_gone() callback to register a callback (dadiskgonecb()/cddiskgonecb()) that decrements the peripheral reference count once GEOM has finished cleaning up its resources. In the cd(4) driver, clean up open and close behavior slightly. GEOM makes sure we only get one open() and one close call, so there is no need to set an open flag and decrement the reference count if we are not the first open. In the cd(4) driver, use cam_periph_release_locked() in a couple of error scenarios to avoid extra mutex calls. geom.h: Add a new, optional, providergone callback that is called when a provider is about to be deleted. geom_disk.h: Add a new d_gone() callback to the GEOM disk interface. Bump the DISK_VERSION to version 2. This probably should have been done after a couple of previous changes, especially the addition of the d_getattr() callback. geom_disk.c: Add a providergone callback for the disk class, g_disk_providergone(), that calls the user's d_gone() callback if it exists. Bump the DISK_VERSION to 2. geom_subr.c: In g_destroy_provider(), call the providergone callback if it has been provided. In g_new_geomf(), propagate the class's providergone callback to the new geom instance. blkfront.c: Callers of disk_create() are supposed to pass in DISK_VERSION, not an explicit disk API version number. Update the blkfront driver to do that. disk.9: Update the disk(9) man page to include information on the new d_gone() callback, as well as the previously added d_getattr() callback, d_descr field, and HBA PCI ID fields. MFC after: 5 days	2012-06-24 04:29:03 +00:00
Andrey V. Elsukov	d4746e107f	Always reconstruct partition entries in the PMBR when Boot Camp is disabled. This helps to easily recover from situations when PMBR is damaged and contains no entries. MFC after: 1 week	2012-06-14 11:17:54 +00:00
Alexander Motin	a839e33278	Add missing newlines into XML output. MFC after: 3 days Sponsored by: iXsystems, Inc.	2012-06-05 16:46:34 +00:00
Marcel Moolenaar	f24a8224b2	Add a partition type for nandfs to the apm, bsd, gpt and vtoc8 schemes. The gpart alias for these partition types is "freebsd-nandfs".	2012-05-25 20:33:34 +00:00
Edward Tomasz Napierala	d87e55886e	Revert r235918 for now and add comment explaining the reason for the size check.	2012-05-25 10:08:48 +00:00
Edward Tomasz Napierala	202f0f2a02	Make g_label(4) ignore provider size when looking for UFS labels. Without it, it fails to create labels for filesystems resized by growfs(8). PR: kern/165962 Submitted by: Olivier Cochard-Labbe <olivier at cochard dot me>	2012-05-24 16:48:33 +00:00
Xin LI	8287ee1bbe	- Correct signedness for casts; - Wrap long line while I'm there. Noticed by: pjd, avg	2012-05-23 20:51:21 +00:00
Xin LI	dc89cfa691	Use %ju to match uintmax_t usage	2012-05-23 18:17:02 +00:00
Xin LI	2920997423	Use %j and cast off_t to intmax_t for now to fix build. Noticed by: bz	2012-05-23 17:49:59 +00:00
Grzegorz Bernacki	4ffd4dfe17	Add a new geom class which allows to divide NAND Flash chip into partitions. Partitions are created based on data in dts file which are extracted and interpreted by slicer. Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks	2012-05-22 08:33:14 +00:00
Andrey V. Elsukov	f931cd70af	Prevent removing of the last active component from a mirror. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:22:21 +00:00
Andrey V. Elsukov	1ee0138d2f	Introduce new device flag G_MIRROR_DEVICE_FLAG_TASTING. It should protect geom from destroying while it is tasting. PR: kern/154860 Reviewed by: pjd MFC after: 1 week	2012-05-18 09:19:07 +00:00
Eitan Adler	615a3e398d	Add missing period at the end of the error message Submitted by: pjd Approved by: cperciva (implicit) MFC after: 3 days X-MFC-With: r235201	2012-05-13 23:27:06 +00:00
Alexander Motin	ef844ef76f	- Prevent error status leak if write to some of the RAID1/1E volume disks failed while write to some other succeeded. Instead mark disk as failed. - Make RAID1E less aggressive in failing disks to avoid volume breakage. MFC after: 2 weeks	2012-05-11 13:20:17 +00:00
Eitan Adler	af23b88b5c	Clarify error that geli generates when it finds corrupt data. PR: kern/165695 Submitted by: Robert Simmons <rsimmons0@gmail.com> Reviewed by: pjd Approved by: cperciva MFC after: 1 week	2012-05-09 17:26:52 +00:00
Alexander Motin	14f9f25ba0	Remove some hardcoded constants from code.	2012-05-06 16:41:27 +00:00
Alexander Motin	eb3b1cd0de	Plug small memory leaks.	2012-05-06 12:55:20 +00:00
Alexander Motin	8f12ca2ee1	Add support for RAID5R. Slightly improve support for RAIDMDF.	2012-05-06 11:32:36 +00:00
Alexander Motin	c0b1ef6661	Fix `gmultipath configure` for big-endian machines. MFC after: 1 week	2012-05-06 05:49:23 +00:00
Alexander Motin	86b0366909	Fix bug causing memory corruption and panics with big-endian metadata.	2012-05-04 08:59:19 +00:00
Alexander Motin	4b97ff6137	Implement read-only support for volumes in optimal state (without using redundancy) for the following RAID levels: RAID4/5E/5EE/6/MDF.	2012-05-04 07:32:57 +00:00
Alexander Motin	8df8e26adc	Add optional -o argument to the `graid label` to specify some metadata format options. Use it for specifying byte order for the DDF metadata: big-endian defined by specification and little-endian used by Adaptec.	2012-05-03 05:32:56 +00:00
Alexander Motin	d525d87560	Improve spare disks support. Unluckily, for some reason Adaptec 1430SA RAID BIOS doesn't want to understand spare disks created by graid. But at least spares created by BIOS are working fine now.	2012-05-01 18:00:31 +00:00
Alexander Motin	2b9c925ff0	Implement volume deletion if disk has more than one partition.	2012-05-01 09:21:21 +00:00
Alexander Motin	47e980965c	Improve DDF metadata writing.	2012-05-01 08:19:29 +00:00
Alexander Motin	00f32ecbd0	Add to GEOM RAID class module, supporting the DDF metadata format, as defined by the SNIA Common RAID Disk Data Format Specification v2.0. Supports multiple volumes per array and multiple partitions per disk. Supports standard big-endian and Adaptec's little-endian byte ordering. Supports all single-layer RAID levels. Dual-layer RAID levels except RAID10 are not supported now because of GEOM RAID design limitations. Some work is still to be done, but the present code already manages basic interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller. MFC after: 1 month Sponsored by: iXsystems, Inc.	2012-04-30 17:53:02 +00:00
Alexander Motin	c9f545e5f9	s/gmirror/graid/	2012-04-29 19:40:50 +00:00
Alexander Motin	7b2a8d7823	Fix RAID5 level names changed at r234603.	2012-04-27 08:49:15 +00:00
Alexander Motin	bafd0b5b0a	Fix copy-paste typo in r234603. Submitted by: kan	2012-04-23 16:35:19 +00:00
Alexander Motin	dbb2e75504	Add names for all primary RAID levels defined by DDF 2.0 specification.	2012-04-23 13:04:02 +00:00
Alexander Motin	e26083ca69	Add sos@ copyrights to RAID metadata modules, respecting his efforts in decoding metadata formats in ataraid(4) code.	2012-04-23 09:39:39 +00:00
Alexander Motin	fc1de96060	Add to GEOM RAID class module for reading non-degraded RAID5 volumes and some environment to differentiate 4 possible RAID5 on-disk layouts. Tested with Intel and AMD RAID BIOSes. MFC after: 2 weeks	2012-04-19 12:30:12 +00:00
Dmitry Morozovsky	b20e4de387	VMware environments are not unusual now. Add VMware partitions recognition (both MBR for ESXi <= 4.1 and GPT for ESXi 5) to g_part. Reviewed by: ae Approved by: ae MFC after: 2 weeks	2012-04-18 11:59:03 +00:00
Alexander Motin	63297dfd4a	Some improvements to GEOM MULTIPATH: - Implement "configure" command to allow switching operation mode of running device on-fly without destroying and recreation. - Implement Active/Read mode as hybrid of Active/Active and Active/Passive. In this mode all paths not marked FAIL may handle reads same time, but unlike Active/Active only one path handles write requests at any point in time. It allows to closer follow original write request order if above layers need it for data consistency (not waiting for requisite write completion before sending dependent write). - Hide duplicate messages about device status change. - Remove periodic thread wake up with 10Hz rate. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2012-04-18 09:42:14 +00:00
Kirk McKusick	85121b0979	Expand locking around identification of filesystem mount point when accounting for I/O counts at completion of I/O operation. Also switch from using global devmtx to vnode mutex to reduce contention. Suggested and reviewed by: kib	2012-04-08 06:20:21 +00:00
Andrey V. Elsukov	ba289b84b0	VMDB offset should be greater than logical volume size only for MBR.	2012-03-29 07:29:27 +00:00
Andrey V. Elsukov	1c45872b03	Do proper cleanup for the GPT case when an error occurs.	2012-03-29 06:37:02 +00:00
Kirk McKusick	1faacf5d09	Keep track of the mount point associated with a special device to enable the collection of counts of synchronous and asynchronous reads and writes for its associated filesystem. The counts are displayed using `mount -v'. Ensure that buffers used for paging indicate the vnode from which they are operating so that counts of paging I/O operations from the filesystem are collected. This checkin only adds the setting of the mount point for the UFS/FFS filesystem, but it would be trivial to add the setting and clearing of the mount point at filesystem mount/unmount time for other filesystems too. Reviewed by: kib	2012-03-28 20:49:11 +00:00
Andrey V. Elsukov	472794bb9f	Check that scheme is not already registered. This may happen when a KLD is preloaded with loader(8) and leads to infinite loops. Also do not return EEXIST error code from MOD_LOAD handler, because we have undocumented(?) ability to replace kernel's module with preloaded one. And if we have so, then preloaded module will be initialized first. Thus error in MOD_LOAD handler will be triggered for the kernel. PR: kern/165573 MFC after: 3 weeks	2012-03-23 07:26:17 +00:00
Andrey V. Elsukov	f1104f7190	Add CTLFLAG_TUN to sysctls. MFC after: 1 month	2012-03-19 13:21:10 +00:00
Andrey V. Elsukov	37d1a121d9	Add new GEOM_PART_LDM module that implements the Logical Disk Manager scheme. The LDM is a logical volume manager for MS Windows NT and it is also known as dynamic volumes. It supports about 2000 partitions and also provides the capability for software RAID implementations. This version implements only partitioning scheme capability and based on the linux-ntfs project documentation and several publications across the Web. NOTE: JBOD, RAID0 and RAID5 volumes aren't supported. An access to the LDM metadata is read-only. When LDM is on the disk partitioned with MBR we can also destroy metadata. For the GPT partitioned disks destroy action is not supported. Reviewed by: ivoras (previous version) MFC after: 1 month	2012-03-19 13:14:44 +00:00
Andrey V. Elsukov	422783e365	Make kern.geom.part node not static. Also add CTLFLAG_TUN to the check_integrity sysctl. MFC after: 1 month	2012-03-19 12:57:52 +00:00
Andrey V. Elsukov	5284aff594	Add MODULE_DEPEND() to geom_part modules. MFC after: 2 weeks	2012-03-15 08:39:10 +00:00
Ed Maste	972f6945b8	Remove unactionable message about label geometry It's not clear to a user what they should do after seeing the "geometry does not match label" kernel message, and it does not appear to present a problem in practice. Thus, just remove the messages. Approved by: marcel	2012-03-08 01:48:44 +00:00
Andrey V. Elsukov	5357f27569	If nested scheme allows dump kernel to its partition, we may allow dump for the parent partition too. MFC after: 2 weeks	2012-02-20 06:35:52 +00:00
Andrey V. Elsukov	c3f9f306d2	Add alias for the partition type 0x0f. Now "ebr" name is used for both types 0x05 and 0x0f, but 0x05 is preferred and used when partition is created with "gpart add -t ebr ...". This should keep EBR partitions accessible after r231754 for those, who have EBR on the partition with type 0x0f.	2012-02-20 05:48:57 +00:00
Andrey V. Elsukov	3bcf7d7191	Add additional check to EBR probe and create methods: don't try to probe and create EBR scheme when parent partition type is not "ebr". This fixes error messages about corrupted EBR for some partitions where there is actually another partition scheme. NOTE: if you have EBR on the partition with different than "ebr" (0x05) type, then you will lose access to partitions until it is changed. MFC after: 2 weeks	2012-02-15 10:33:29 +00:00
Andrey V. Elsukov	0d8bc07eba	Add PART::type attribute handler. It returns partition type as string. MFC after: 2 weeks	2012-02-15 10:02:19 +00:00
Andrey V. Elsukov	48ef46e55a	Add alias for the partition with type 0x42 to the MBR scheme. MFC after: 1 week	2012-02-10 09:55:18 +00:00
Andrey V. Elsukov	f44d97bd0c	Let's be more realistic and limit maximum number of partition to 4k. MFC after: 1 week	2012-02-10 06:44:30 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Ed Maste	23f6856fff	Correct typo in comment (numbver)	2012-02-04 18:14:39 +00:00
Andrey V. Elsukov	7b540236bb	The scheme code may not know about some inconsistency in the metadata. So, add an integrity check after recovery attempt. MFC after: 1 week	2012-02-01 09:28:16 +00:00
Attilio Rao	5d7380f8e3	Avoid to check the same cache line/variable from all the locking primitives by breaking stop_scheduler into a per-thread variable. Also, store the new td_stopsched very close to td_*locks members as they will be accessed mostly in the same codepaths as td_stopsched and this results in avoiding a further cache-line pollution, possibly. STOP_SCHEDULER() was pondered to use a new 'thread' argument, in order to take advantage of already cached curthread, but in the end there should not really be a performance benefit, while introducing a KPI breakage. In collaboration with: flo Reviewed by: avg MFC after: 3 months (or never) X-MFC: r228424	2012-01-28 14:00:21 +00:00
Nathan Whitehorn	090dd24636	Experimental support for booting CHRP-type PowerPC systems from hard disks.	2012-01-25 03:37:39 +00:00
Don Lewis	b5bad28182	Allow an MBR primary or extended Linux swap partition to be specified as the system dump device. This was already allowed for GPT. The Linux swap metadata at the beginning of the partition should not be disturbed because the crash dump is written at the end. Reviewed by: alfred, pjd, marcel MFC after: 2 weeks	2012-01-13 18:32:56 +00:00
Jim Harris	c1ad3fcf6a	Add support for >2TB disks in GEOM RAID for Intel metadata format. Reviewed by: mav Approved by: scottl MFC after: 1 week	2012-01-09 23:01:42 +00:00
Aleksandr Rybalko	ce96bb7942	GEOM_UNCOMPRESS module, can be used with uzip images and with new ulzma images. Approved by: adrian (mentor)	2012-01-04 23:39:11 +00:00
Andriy Gapon	f6ce353e58	replace uses of libkern gets with cngets MFC after: 2 months	2011-12-17 15:26:34 +00:00
Alexander Motin	a2fa37fe67	Close race between geom destruction on g_vfs_close() when softc destroyed and g_vfs_orphan() call that tries to access softc, introduced in r227015. PR: kern/162997	2011-12-02 17:09:48 +00:00
Andrey V. Elsukov	a85a0d469e	Add an ability to increase number of allocated APM entries when we have reserved free space in the APM area. Also instead of one write request per each APM entry, use MAXPHY sized writes when we are updating APM. MFC after: 1 month	2011-11-28 16:07:26 +00:00
Andrey V. Elsukov	64c4a83782	The size of APM could be bigger than number of already allocated entries. And the first usable sector should not start from the inside of APM area. MFC after: 1 month	2011-11-28 12:38:24 +00:00
Alexander Motin	107c1508fa	Temporary revert r227009 to fix freeze on UP systems without PREEMPTION. Before r215687, if some withered geom or provider could not be destroyed, g_event thread went to sleep for 0.1s before retrying. After that change it is just restarting immediately. r227009 made orphaned (withered) provider to not detach immediately, but only after context switch. That made loop inside g_event thread infinite on UP systems without PREEMPTION. To address original problem with possible dead lock addressed by r227009 we have to fix r215687 change first, that needs some time to think and test.	2011-11-14 19:32:05 +00:00
Alexander Motin	0c883cef45	Major GEOM MULTIPATH class rewrite: - Improved locking and destruction process to fix crashes. - Improved "automatic" configuration method to make it consistent and safe by reading metadata back from all specified paths after writing to one. - Added provider size check to reduce chance of ordering conflict with other GEOM classes. - Added "manual" configuration method without using on-disk metadata. - Added "add" and "remove" commands to allow manage paths manually. - Failed paths are no longer dropped from geom, but only marked as FAIL and excluded from I/O operations. - Automatically restore failed paths when all others paths are marked as failed, for example, because of device-caused (not transport) errors. - Added "fail" and "restore" commands to manually control FAIL flag. - geom is now destroyed on last path disconnection. - Added optional Active/Active mode support. Unlike Active/Passive mode, load evenly distributed between all working paths. If supported by the device, it allows to significantly improve performance, utilizing bandwidth of all paths. It is controlled by -A option during creation. Disabled by default now. - Improved `status` and `list` commands output. Sponsored by: iXsystems, inc. MFC after: 1 month	2011-11-12 09:52:27 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Ed Schouten	d745c852be	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
Alexander Motin	ea5791d7ab	Add mutex and two flags to make orphan() call properly asynchronous: - delay consumer closing and detaching on orphan() until all I/Os complete; - prevent new I/Os submission after orphan() called. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-02 09:24:59 +00:00
Alexander Motin	755d1ea5b5	Make orphan() method in geom_dev asynchronous using destroy_dev_sched_cb() instead of destroy_dev(). It moves device destruction waiting out of the topology lock and so fixes dead lock between orphanization and closing. Real provider and geom destruction called from swi context after device destroyed as callback of the destroy_dev_sched_cb().	2011-11-01 23:12:22 +00:00
Alexander Motin	df96fd6e14	Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-01 20:56:19 +00:00
Alexander Motin	0849a53fc0	Refactor disk disconnection and geom destruction handling sequences. Do not close/destroy opened consumer directly in case of disconnect. Instead keep it existing until it will be closed in regular way in response to upstream provider destruction. Delay geom destruction in the same way. Previous implementation could destroy consumers still having active requests and worked only because of global workaround made on GEOM level.	2011-11-01 17:04:42 +00:00
Alexander Motin	20a5d5dc60	Workaround the problem introduced by combination of r162200 and r215687. r162200 delays provider orphanization until all running requests complete, to workaround broken orphan() method implementation in some classes. r215687 removes persistent periodic (10Hz) event thread wake ups. Together these changes can indefinitely delay orphanization until some other event wake up the event thread. One consequence of this is inability of CAM to destroy device disconnected when busy and, as consequence, create new one after reconnection. While the best solution would be to revert r162200, it is not easy, as some classes still look broken in that way. Instead conditionally wake up event thread if there are some providers waiting for orphanization. MFC after: 1 week	2011-11-01 08:57:49 +00:00
Andrey V. Elsukov	aea26bc05a	Our geom withering function could take some time before geom with its providers and consumers will be destroyed. Before take some actions with a geom, check that it is not destroyed at the moment. Tested by: nwhitehorn MFC after: 1 week	2011-10-28 11:45:24 +00:00
Pawel Jakub Dawidek	0c879bd990	Before this change when GELI detected hardware crypto acceleration it would start only one worker thread. For software crypto it would start by default N worker threads where N is the number of available CPUs. This is not optimal if hardware crypto is AES-NI, which uses CPU for AES calculations. Change that to always start one worker thread for every available CPU. Number of worker threads per GELI provider can be easily reduced with kern.geom.eli.threads sysctl/tunable and even for software crypto it should be reduced when using more providers. While here, when number of threads exceeds number of CPUs available don't reduce this number, assume the user knows what he is doing. Reported by: Yuri Karaban <dev@dev97.com> MFC after: 3 days	2011-10-27 16:12:25 +00:00
Alexander Motin	733a1f3f52	Clarify disks/volumes above 2TiB support in geom_raid: - add support for volumes above 2TiB with Promise metadata format; - enforce and document other limitations: - Intel and Promise metadata formats do not support disks above 2TiB; - NVIDIA metadata format does not support volumes above 2TiB. Sponsored by: iXsystems, Inc. MFC after: 2 weeks	2011-10-26 21:50:10 +00:00
Pawel Jakub Dawidek	92f84a9fae	Allow upper layers to discover that BIO_DELETE and/or BIO_FLUSH is not supported by returning EOPNOTSUPP instead of 0 or ENODEV. MFC after: 3 days	2011-10-25 14:07:17 +00:00
Pawel Jakub Dawidek	37f0f0a75e	Improve style a bit. MFC after: 3 days	2011-10-25 14:05:39 +00:00
Pawel Jakub Dawidek	9495476273	Simplify disk_alloc(). MFC after: 3 days	2011-10-25 14:04:59 +00:00
Pawel Jakub Dawidek	1f8c92e6fa	Add support for creating GELI devices with older metadata version for use with older FreeBSD versions: - Add -V option to 'geli init' to specify version number. If no -V is given the most recent version is used. - If -V is given don't allow to use features not supported by this version. - Print version in 'geli list' output. - Update manual page and add table describing which GELI version is supported by which FreeBSD version, so one can use it when preparing GELI device for older FreeBSD version. Inspired by: Garrett Cooper <yanegomi@gmail.com> MFC after: 3 days	2011-10-25 13:57:50 +00:00
Pawel Jakub Dawidek	effb9912c7	When decoding metadata, check magic string, so we know this is not GELI device before we check its version. We don't want to report that some garbage is unsupported version if this is not even GELI provider. MFC after: 3 days	2011-10-25 13:44:23 +00:00
Pawel Jakub Dawidek	0e236b6c47	Prefer G_ELI_VERSION_* defines for version numbers over plain digits. MFC after: 3 days	2011-10-25 13:09:22 +00:00
Pawel Jakub Dawidek	038c55adcc	Fit lines into 80 chars. MFC after: 3 days	2011-10-25 13:08:03 +00:00
Pawel Jakub Dawidek	e880ff0062	When metadata is at newer version than the highest supported, return EOPNOTSUPP when decoding. MFC after: 3 days	2011-10-25 07:48:53 +00:00
Marcel Moolenaar	369fe59de8	Add support for Boot Camp. The support is defined as follows: o Detect when Boot Camp is enabled (i.e. the MBR mirrors the GPT). o When Boot Camp is enabled, update the MBR whenever we write the GPT. o Creation of a Boot Camp enabled GPT is not supported. o Automatically disable Boot Camp when the GPT has been changed so that there's either no EFI partition or no HFS+ partition. o The first 4 partitions (by index) get mirrored in the MBR. Requested by, discussed with and tested by: kris@pcbsd.org MFC after: 1 week	2011-10-23 02:51:23 +00:00
Marius Strobl	479a4ef021	Allow to dump on Solaris swap partitions. PR: 161764 Submitted by: Peter Jeremy	2011-10-18 20:16:02 +00:00
Pawel Jakub Dawidek	8d680f2cc9	Add some spare fields to the g_class and g_geom structures needed to implement direct I/O handling and provider's property changes handling.	2011-07-17 20:35:30 +00:00
Andrey V. Elsukov	0857ee8cb8	Remove include of sys/sbuf.h from geom/geom.h. sbuf support is not always required for geom/geom.h users, and no need to depend from it. PR: kern/158398	2011-07-11 10:02:27 +00:00
Andrey V. Elsukov	5d807a0e1a	Include sys/sbuf.h directly. Reviewed by: pjd	2011-07-11 05:22:31 +00:00
Kirk McKusick	8795189c98	Allow disk partitions associated with UFS read-only mounted filesystems to be opened for writing. This functionality used to be special-cased for just the root filesystem, but with this change is now available for all UFS filesystems. This change is needed for journaled soft updates recovery. Discussed with: Jeff Roberson	2011-07-10 00:41:31 +00:00
Andrey V. Elsukov	2b9be05588	Initialize elements of state array when creating the GPT table. This fixes the problem, when the secondary GPT header is not erased when partition table destroyed. Move equal operations from g_part_gpt_create and g_part_gpt_recover to the separate function g_gpt_set_defaults. Reported by: dwhite MFC after: 1 week	2011-06-29 05:41:14 +00:00
Andrey V. Elsukov	671dfdbf11	EBR could contain an early stage of boot code. But we do not support it. Remove message about non empty bootcode, we can not break something while GEOM_PART_EBR_COMPAT is defined. But without GEOM_PART_EBR_COMPAT any changes in EBR are allowed and we can accidentally wipe the boot code. To do not break anything save the first EBR chunk and keep it untouched each time when we are changing EBR. Note that we are still not support boot code for EBR. PR: kern/141235 MFC after: 1 month	2011-06-27 12:42:48 +00:00
Andrey V. Elsukov	61162e857a	MS Windows NT+ uses 4 bytes at offset 0x1b8 in the MBR to identify disk drive. The boot0cfg(8) utility preserves these 4 bytes when is writing bootcode to keep a multiboot ability. Change gpart's bootcode method to keep DSN if it is not zero. Also do not allow writing bootcode with size not equal to MBRSIZE. PR: kern/157819 Tested by: Eir Nym MFC after: 1 month	2011-06-27 10:42:06 +00:00
Andrey V. Elsukov	503e6682cd	Change the way how we update bootcode for BSD scheme. Since the only parameter that we check is size of bootcode, then allow only two sizes: size of boot1 and size of /boot/boot. This partially protects users from losing ability to boot if incorrect bootcode is specified. Requested by: ru	2011-06-20 12:22:30 +00:00
Justin T. Gibbs	416494d7c9	Plumb device physical path reporting from CAM devices, through GEOM and DEVFS, and make it accessible via the diskinfo utility. Extend GEOM's generic attribute query mechanism into generic disk consumers. sys/geom/geom_disk.c: sys/geom/geom_disk.h: sys/cam/scsi/scsi_da.c: sys/cam/ata/ata_da.c: - Allow disk providers to implement a new method which can override the default BIO_GETATTR response, d_getattr(struct bio *). This function returns -1 if not handled, otherwise it returns 0 or an errno to be passed to g_io_deliver(). sys/cam/scsi/scsi_da.c: sys/cam/ata/ata_da.c: - Don't copy the serial number to dp->d_ident anymore, as the CAM XPT is now responsible for returning this information via d_getattr()->(a)dagetattr()->xpt_getatr(). sys/geom/geom_dev.c: - Implement a new ioctl, DIOCGPHYSPATH, which returns the GEOM attribute "GEOM::physpath", if possible. If the attribute request returns a zero-length string, ENOENT is returned. usr.sbin/diskinfo/diskinfo.c: - If the DIOCGPHYSPATH ioctl is successful, report physical path data when diskinfo is executed with the '-v' option. Submitted by: will Reviewed by: gibbs Sponsored by: Spectra Logic Corporation Add generic attribute change notification support to GEOM. sys/sys/geom/geom.h: Add a new attrchanged method field to both g_class and g_geom. sys/sys/geom/geom.h: sys/geom/geom_event.c: - Provide the g_attr_changed() function that providers can use to advertise attribute changes. - Perform delivery of attribute change notifications from a thread context via the standard GEOM event mechanism. sys/geom/geom_subr.c: Inherit the attrchanged method from class to geom (class instance). sys/geom/geom_disk.c: Provide disk_attr_changed() to provide g_attr_changed() access to consumers of the disk API. sys/cam/scsi/scsi_pass.c: sys/cam/scsi/scsi_da.c: sys/geom/geom_dev.c: sys/geom/geom_disk.c: Use attribute changed events to track updates to physical path information. 
sys/cam/scsi/scsi_da.c: Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, and the updated buffer type references our physical path attribute, emit a GEOM attribute changed event via the disk_attr_changed() API. sys/cam/scsi/scsi_pass.c: Add AC_ADVINFO_CHANGED to the registered asynchronous CAM events for this driver. When this event occurs, update the physical path devfs alias for this pass instance. Submitted by: gibbs Sponsored by: Spectra Logic Corporation	2011-06-14 17:10:32 +00:00
Attilio Rao	d7073a2b3b	MFC	2011-06-03 17:09:15 +00:00
Alexander Motin	0330cb3bf7	Update disk's stripesize and stripeoffset parameters on provider open. They are media-dependent and may change in run-time, same as sectorsize and/or mediasize. SCSI devices return physical sector size and offset via READ CAPACITY(16) command and so can not report it until media inserted or at least until probe sequence completed. UNMAP support is also reported there.	2011-06-03 13:49:18 +00:00
Andrey V. Elsukov	38c64884ff	Add diagnostic message about not aligned partitions. Idea from: ivoras	2011-06-03 06:58:24 +00:00
Attilio Rao	3bf1ec3a9a	MFC	2011-06-02 14:09:30 +00:00
Andrey V. Elsukov	d15033b3f8	Do not hide stripeoffset from libgeom(3), it may be useful even when stripesize is zero. MFC after: 1 week	2011-06-02 12:49:45 +00:00
Attilio Rao	9cb46334ee	MFC	2011-05-27 16:09:10 +00:00
Andrey V. Elsukov	9854b4eeee	Some partitioning tools may have a different opinion about disk geometry and partitions may start from within the first track. If we find such partitions, then do not reserve space of the first track, only first sector.	2011-05-27 06:37:42 +00:00
Attilio Rao	7fcdc9a26f	MFC	2011-05-26 17:38:00 +00:00
Andrey V. Elsukov	ceef8f2477	Prevent non-aligned reading from provider while tasting. Reject providers with unsupported sectorsize. Reported by: Joerg Wunsch MFC after: 1 week	2011-05-25 11:14:26 +00:00
Andrey V. Elsukov	6fd1e2e013	Do not truncate available disk space to the closest track boundary.	2011-05-25 09:45:13 +00:00
Andrey V. Elsukov	23a3490034	Do not truncate available disk space to the closest track boundary.	2011-05-25 09:38:12 +00:00
Andrey V. Elsukov	db48d4a92e	Do not truncate available disk space to the closest track boundary.	2011-05-25 09:32:19 +00:00
Andrey V. Elsukov	49d12fd5be	Remove unused variable. MFC after: 1 week	2011-05-24 06:46:07 +00:00
Andrey V. Elsukov	e471361279	Remove unused variable. MFC after: 1 week	2011-05-24 06:44:16 +00:00
Attilio Rao	3ac3f6002b	MFC	2011-05-23 23:58:02 +00:00
Pawel Jakub Dawidek	204a4e196a	Recognize BIO_FLUSH requests and pass them to userland. MFC after: 1 week	2011-05-23 21:00:37 +00:00
Attilio Rao	7e7a34e520	MFC	2011-05-16 16:34:03 +00:00
Andrey V. Elsukov	d0c8ecb812	Make diagnostic messages more specific. With bootverbose print out all inconsistencies of integrity in the partition table, not first found only. Requested by: kib	2011-05-16 15:59:50 +00:00
Andrey V. Elsukov	b6c4978f6f	Add diagnostic messages for integrity checks.	2011-05-16 12:00:32 +00:00
Andrey V. Elsukov	6e81b75a3c	Add a sysctl kern.geom.part.check_integrity for those who has corrupt partition tables and lost an ability to boot after r221788. Also unhide an error message from bootverbose, this would help to easier determine the problem.	2011-05-15 20:03:54 +00:00
Attilio Rao	447274a88b	MFC	2011-05-15 15:47:16 +00:00
Mikolaj Golub	76cc7f6dd6	Fix a memory leak possible in g_eli_key_allocate() if the key with the same keyno is added while we aren't holding the lock. Approved by: pjd (mentor) MFC after: 1 week	2011-05-15 12:39:30 +00:00
Attilio Rao	ef607a6aa3	MFC	2011-05-12 14:01:40 +00:00
Andrew Thompson	b2901e999b	Move the three geom kprocs as threads under a single pid. Reviewed by: julian	2011-05-11 21:47:30 +00:00
Andrey V. Elsukov	c63e8fe201	Add basic metadata integrity check. In case when partition table was probed and read successfully, but it contains invalid values (e.g. overlapped partitions, offset or size is out of bounds), then table will be rejected. MFC after: 1 month	2011-05-11 19:59:43 +00:00
Attilio Rao	521bd6b433	MFC	2011-05-08 14:56:02 +00:00
Andrey V. Elsukov	f30b6bcb60	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:28:13 +00:00
Andrey V. Elsukov	284a82d0bb	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:20:30 +00:00
Andrey V. Elsukov	6017ae3fdd	Limit number of sectors that can be addressed. Reject table if blkcount from metadata is greater than provider.	2011-05-08 12:16:39 +00:00
Andrey V. Elsukov	2920db1713	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:11:16 +00:00
Andrey V. Elsukov	4675b2b65f	Replace UINT_MAX to UINT32_MAX. Pointed out by: kib MFC after: 1 week	2011-05-08 11:42:51 +00:00
Andrey V. Elsukov	ab0ffb4c88	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 11:20:27 +00:00
Andrey V. Elsukov	cfbdf6c3c5	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 11:16:17 +00:00
Pawel Jakub Dawidek	a1f4a8c447	Export GELI class version via sysctl kern.geom.eli.version. MFC after: 1 week	2011-05-08 09:29:21 +00:00
Pawel Jakub Dawidek	731adc8682	Version 6 is compatible with version 5 when it comes to control commands. MFC after: 1 week	2011-05-08 09:25:54 +00:00
Pawel Jakub Dawidek	964d172cbe	Detect and handle metadata of version 6. MFC after: 1 week	2011-05-08 09:25:16 +00:00
Pawel Jakub Dawidek	ad0a523639	When support for multiple encryption keys was committed, GELI integrity mode was not updated to pass CRD_F_KEY_EXPLICIT flag to opencrypto. This resulted in always using first key. We need to support providers created with this bug, so set special G_ELI_FLAG_FIRST_KEY flag for GELI provider in integrity mode with version smaller than 6 and pass the CRD_F_KEY_EXPLICIT flag to opencrypto only if G_ELI_FLAG_FIRST_KEY doesn't exist. Reported by: Anton Yuzhaninov <citrin@citrin.ru> MFC after: 1 week	2011-05-08 09:17:56 +00:00
Pawel Jakub Dawidek	9d644a4032	Remove prototype for a function that no longer exists. MFC after: 1 week	2011-05-08 09:11:04 +00:00
Pawel Jakub Dawidek	937959f0a7	Drop proper key. MFC after: 1 week	2011-05-08 09:09:49 +00:00
Pawel Jakub Dawidek	9104c920b4	Add magic field to the g_eli_key structure to detect if we are really operating on proper structures. MFC after: 1 week	2011-05-08 09:08:50 +00:00
Attilio Rao	aa8b9e0706	MFC	2011-05-06 22:45:33 +00:00
Adrian Chadd	c60fd25d34	Updates to geom_map from the author. The major update here is to support 64 bit size/offsets. There's also style related changes. Submitted by: ray@dlink.ua	2011-05-05 14:43:09 +00:00
Attilio Rao	71a19bdc64	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t on the other side is implemented as an array of longs, and easily extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, needs to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
Andrey V. Elsukov	9a7defbda0	Remove unneeded code. MFC after: 1 week	2011-05-04 18:41:26 +00:00
Andrey V. Elsukov	eb8e9abe72	Remove unneeded code. MFC after: 1 week	2011-05-04 18:26:45 +00:00
Andrey V. Elsukov	ceb1c69a84	Remove unneeded code. MFC after: 1 week	2011-05-04 18:17:21 +00:00
Andrey V. Elsukov	2fbefe4829	Removed KASSERT, g_new_providerf() can not fail. MFC after: 1 week	2011-05-04 18:06:40 +00:00
Andrey V. Elsukov	c211af0352	Remove "for a moment" assignment. struct g_geom zeroed when allocated. MFC after: 1 week	2011-05-04 17:56:53 +00:00
Andrey V. Elsukov	e62dffbf5d	Remove unneeded checks, g_new_xxx functions can not fail. MFC after: 1 week	2011-05-04 17:37:37 +00:00
Andrey V. Elsukov	370efd743a	When checking existence of providers skip those which are orphaned. PR: kern/132273 MFC after: 2 week	2011-05-04 12:59:11 +00:00
Alexander Motin	bd5c368604	Use make_dev_alias_p() added in r221397 to create alias dev entry. It removes panic in case if alias name is already busy for some reason.	2011-05-03 19:12:42 +00:00
Alexander Motin	90f2be2430	Implement relaxed comparison for hardcoded provider names to make it ignore adX/adaY difference in both directions to simplify migration to the CAM-based ATA or back.	2011-04-27 00:10:26 +00:00
Alexander Motin	0d307e0905	- Add shim to simplify migration to the CAM-based ATA. For each new adaX device in /dev/ create symbolic link with adY name, trying to mimic old ATA numbering. Imitation is not complete, but should be enough in most cases to mount file systems without touching /etc/fstab. - To know what behavior to mimic, restore ATA_STATIC_ID option in cases where it was present before. - Add some more details to UPDATING.	2011-04-26 17:01:49 +00:00
Pawel Jakub Dawidek	16a174b5c5	One key is expected from providers smaller than or equal to (2^20)*sectorsize bytes. Remove bogus assertion and while here remove another too obvious assertion. Reported by: Fabian Keil <freebsd-listen@fabiankeil.de> MFC after: 2 weeks	2011-04-24 10:41:13 +00:00
Pawel Jakub Dawidek	5bd8adc750	If number of keys for the given provider doesn't exceed the limit, allocate all of them at attach time. This allows to avoid moving keys around in the most-recently-used queue and needs no mutex synchronization nor refcounting. MFC after: 2 weeks	2011-04-21 13:35:20 +00:00
Pawel Jakub Dawidek	1e09ff3dc3	Instead of allocating memory for all the keys at device attach, create reasonably large cache for the keys that is filled when needed. The previous version was problematic for very large providers (hundreds of terabytes or several petabytes). Every terabyte of data needs around 256kB for keys. Make the default cache limit big enough to fit all the keys needed for 4TB providers, which will eat at most 1MB of memory. MFC after: 2 weeks	2011-04-21 13:31:43 +00:00
Alexander Motin	fe51d6c1d1	Reduce geom_raid log verbosity.	2011-04-18 16:15:59 +00:00
Gavin Atkinson	47bae5fd09	Remove an incorrect be16toh() that prevented geom_part_apm from working on little-endian machines. Reviewed by: marcel MFC after: 2 weeks	2011-04-15 12:32:52 +00:00
Adrian Chadd	27afdbaa51	Introduce geom_map, a GEOM provider designed for use by embedded flash stores. Some devices - notably those with uboot - don't have an explicit partition table (eg like Redboot's FIS.) geom_map thus provides an easy way to export the hard-coded flash layout as geom providers for use by filesystems and other tools. It also includes a "search" function which allows for dynamic creation of partition layouts where the device only has a single hard-coded partition. For example, if there is a "kernel+rootfs" partition, a single image can be created which appends the rootfs after the kernel with an appropriate search string. geom_map can be told to search for said search string and create a partition beginning after it. Submitted by: Aleksandr Rybalko <ray@dlink.ua>	2011-04-12 08:10:25 +00:00
Mikolaj Golub	90574b0a79	In g_eli_read_done() and g_eli_write_done(), for a bio with bio_children > 1, g_destroy_bio() is never called and the bio leaks. Fix this by calling g_destroy_bio() earlier, before the check. Submitted by: Victor Balada Diaz <victor@bsdes.net> (initial version) Approved by: pjd (mentor) MFC after: 1 week	2011-04-03 17:38:12 +00:00
Pawel Jakub Dawidek	63a6c5c12b	GEOM has an internal mechanism to deal with ENOMEM errors returned via g_io_deliver(). In such case it increases 'pace' counter on each ENOMEM and reschedules the request. The 'pace' counter is decreased for each request going down, but until 'pace' is greater than zero, GEOM will handle at most 10 requests per second. For GEOM GATE users that are proxy to local GEOM providers (like ggatel(8) and HAST) we can end up with almost permanent slow down of GEOM down queue. This is because once we reach GEOM GATE queue limit, we return ENOMEM to the GEOM. This means that we have, eg. 1024 I/O requests in the GEOM GATE queue. To make room in the queue and stop returning ENOMEM we need to process the requests of course, but those requests are handled by userland daemons that handle them by reading/writing also from/to local GEOM providers. For example with HAST, a new request comes to /dev/hast/data, which is GEOM GATE provider. GEOM GATE passes the request to hastd(8) and hastd(8) reads/writes from/to /dev/da0. Once we reach GEOM GATE queue limit, to free up a slot in GEOM GATE queue, hastd(8) has to read/write from/to /dev/da0, but this request will also be very slow, because GEOM now slows down all the requests. We end up with full queue that we can unload at the speed of 10 requests per second. This simply looks like a deadlock. Fix it by allowing userland daemons that work with both GEOM GATE and local GEOM providers to specify unlimited queue size, so GEOM GATE will never return ENOMEM to the GEOM. MFC after: 1 week	2011-04-02 06:56:06 +00:00
Alexander Motin	14e2cd0a00	Bunch of small bugfixes and cleanups. Found with: Clang Static Analyzer	2011-03-31 16:19:53 +00:00
Alexander Motin	636076752a	Bunch of small bugfixes and cleanups. Found with: Coverity Prevent(tm) CID: 9656, 9658, 9693, 9705, 9706, 9707, 9808, 9809, 9810, 9711, 9712, 9713, 9714	2011-03-31 16:14:35 +00:00
Andrey V. Elsukov	53ff3d1e9c	Remove unneeded checks, g_new_xxx functions can not return NULL. Reviewed by: pjd MFC after: 1 week	2011-03-31 06:30:59 +00:00
Mikolaj Golub	bd119384c7	Increase debug level on g_gate device destruction and add message on device creation. Suggested by: danger Approved by: pjd (mentor) MFC after: 3 days	2011-03-30 21:40:14 +00:00
Mikolaj Golub	baf63f65ae	In g_gate_create() there is a window between when g_gate_softc is registered in g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for provider name collision the crash is possible. Fix this by adding sc_name field to struct g_gate_softc. In g_gate_create() when g_gate_softc is created but sc_provider is still not sc_name points to provider name stored in the local array. Approved by: pjd (mentor) Reported by: Freddie Cash <fjwcash@gmail.com> MFC after: 1 week	2011-03-27 19:56:55 +00:00
Alexander Motin	89b172238a	MFgraid/head: Add new RAID GEOM class, that is going to replace ataraid(4) in supporting various BIOS-based software RAIDs. Unlike ataraid(4) this implementation does not depend on legacy ata(4) subsystem and can be used with any disk drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4) with `options ATA_CAM`). To make code more readable and extensible, this implementation follows modular design, including core part and two sets of modules, implementing support for different metadata formats and RAID levels. Support for such popular metadata formats is now implemented: Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage. Such RAID levels are now supported: RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT. For all of these RAID levels and metadata formats this class supports full cycle of volume operations: reading, writing, creation, deletion, disk removal and insertion, rebuilding, dirty shutdown detection and resynchronization, bad sector recovery, faulty disks tracking, hot-spare disks. For Intel and Promise formats there is support for multiple volumes per disk set. See the graid(8) manual page for additional details. Co-authored by: imp Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.	2011-03-24 21:31:32 +00:00
Alexander Motin	c6d4ed3a32	MFgraid/head r218212, r218257: Introduce new type of BIO_GETATTR -- GEOM::setstate, used to inform lower GEOM about state of its providers from the point of upper layers. Make geom_disk use led(4) subsystem to illuminate states in such fashion: FAILED - "1" (on), REBUILD - "f5" (slow blink), RESYNC - "f1" (fast blink), ACTIVE - "0" (off). LED name should be set for each disk via kern.geom.disk.%s.led sysctl. Later disk API could be extended to allow disk driver to report this info in custom way via its own facilities.	2011-03-24 19:23:42 +00:00
Alexander Motin	06f4c96d39	MFgraid/head r217827: Change BIO_GETATTR("GEOM::kerneldump") API to make set_dumper() called by consumer (geom_dev) instead of provider (geom_disk). This allows any geom to insert its code into the dump call chain, implementing more sophisticated functionality than just disk partitioning.	2011-03-24 08:37:48 +00:00
Maxim Sobolev	20cc2dc42e	Some linux distros put mount point into the ext2fs labels, such as '/', or '/boot', which confuses the devfs code and can cause userland programs to fail reading /dev/ext2fs directory with weird error code, such as any program that uses pwlib. Strip any leading slashes before feeding the label to the geom_label code. Sponsored by: Sippy Software, Inc. MFC after: 1 week	2011-03-08 17:00:31 +00:00
Nathan Whitehorn	65cb6238bd	Add the disk ident and a human-meaningful description (here, the disk model string) to the geom_disk config XML so that they are easily accessible from userland. MFC after: 1 week	2011-02-26 14:58:54 +00:00
Alexander Leidinger	cb08c2cc83	Add some FEATURE macros for various GEOM classes. No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: silence on geom@ during 2 weeks X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:24:35 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Yoshihiro Takahashi	9f0f6d5fd7	Add support to set a slice name.	2011-02-19 11:09:38 +00:00
Luigi Rizzo	67c1af9d00	Correct a subtle bug in the 'gsched_rr' disk scheduler. The algorithm is supposed to work as follows: in order to prevent starvation, when a new client starts being served we record the start time and reset the counter of bytes served. We then switch to a new client after a certain amount of time or bytes, even if the current one still has pending requests. To avoid charging a new client the time of the first seek, we start counting time when the first request is served. Unfortunately a bug in the previous version of the code failed to set the start time in certain cases, resulting in some processes exceeding their timeslice. The fix (in this patch) is trivial, though it took a while to find out and replicate the bug. Thanks to Tommaso Caprai for investigating and fixing the problem. Submitted by: Tommaso Caprai MFC after: 1 week	2011-02-14 08:09:02 +00:00
Marcel Moolenaar	1e189c0839	Use the preload_fetch_addr() and preload_fetch_size() convenience functions to obtain the address and size of the preloaded key files. Sponsored by: Juniper Networks.	2011-02-13 19:34:48 +00:00
Yoshihiro Takahashi	5d627bb558	Add support to write boot menu.	2011-02-11 13:18:00 +00:00
Andrey V. Elsukov	88007f6102	Add new user-friendly aliases for partition types for the MBR and EBR schemes: fat32, ebr, linux-data, linux-raid, linux-swap and linux-lvm. Add bios-boot GUID and alias for the GPT scheme. It is used by the GRUB 2 loader. Also sort the definitions of types in diskmbr.h and in g_part.c. PR: bin/120990, kern/147664 MFC after: 2 weeks	2011-01-28 11:13:01 +00:00
Andrey V. Elsukov	1313160649	While inspecting the disklabel check that start offset of partition is within provider's bounds. If not then reject this disklabel. Mark bbarea as NULL so that it is not freed again in destroy method. MFC after: 1 week	2011-01-27 08:02:26 +00:00
Matthew D Fleming	73d6f8516d	Remove the CTLFLAG_NOLOCK as it seems to be both unused and unfunctional. Wiring the user buffer has only been done explicitly since r101422. Mark the kern.disks sysctl as MPSAFE since it is and it seems to have been mis-using the NOLOCK flag. Partially break the KPI (but not the KBI) for the sysctl_req 'lock' field since this member should be private and the "REQ_LOCKED" state seems meaningless now.	2011-01-26 22:48:09 +00:00
Konstantin Belousov	0ea2e01412	Treat async buffer writes from the gjournal switcher thread the same as from syncer. We shall not sleep on running buffer space when suspending. Reproduced and tested by: pho PR: kern/154228 MFC after: 1 week	2011-01-26 10:34:21 +00:00
Andrey V. Elsukov	799eac8c3d	Limit maximum number of GPT entries to 4k. It is the most realistic value and can prevent kernel memory exhaustion when a big value is specified from command line. Split reading and writing operations into several iterations so as not to trigger KASSERT when data length is greater than MAXPHYS. PR: kern/144962, kern/147851 MFC after: 2 weeks	2011-01-18 09:52:53 +00:00
Matthew D Fleming	0c2b0e03f7	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the geom piece.	2011-01-12 19:54:07 +00:00
Andrey V. Elsukov	95959703e1	Sector size can not be greater than MAXPHYS. Since GRAID3 calculates sector size from user-specified block size, report to user about big blocksize. PR: kern/147851 MFC after: 1 week	2011-01-12 13:55:01 +00:00
Andrey V. Elsukov	e76dc5129a	Sector size can not be greater than MAXPHYS. MFC after: 1 week	2011-01-12 12:26:10 +00:00
Andrey V. Elsukov	eaaef50811	Remove redundant check. MFC after: 1 week	2011-01-11 13:22:20 +00:00
Andrey V. Elsukov	f2b3e9e870	Round GNOP provider's mediasize to its sectorsize. This prevents KASSERT in g_io_request when geom classes are tasting. PR: kern/147852 MFC after: 1 week	2011-01-11 11:42:22 +00:00
Matthew D Fleming	ed7beddc48	Fix a memory overflow where the input length to g_gpt_utf8_to_utf16() was specified incorrectly, causing the bzero to run past the end of a malloc(9)'d object. Submitted by: Eric Youngblut < eyoungblut AT isilon DOT com > MFC after: 3 days	2011-01-07 16:46:20 +00:00
Nathan Whitehorn	e76b061420	Add an entry to the gpart XML to determine if the geom has pending changes that need to be committed (or undone). MFC after: 2 weeks	2011-01-06 03:36:04 +00:00
Konstantin Belousov	23b70c1ae2	Finish r210923, 210926. Mark some devices as eternal. MFC after: 2 weeks	2011-01-04 10:59:38 +00:00
Konstantin Belousov	d91e813c7b	Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4). Non-zero value of attribute means that device supports BIO_DELETE. Suggested and reviewed by: pjd Tested by: pho MFC after: 1 week	2010-12-29 12:11:07 +00:00
Andrey V. Elsukov	f25481193e	Allow destroying EBR in COMPAT (default) mode. MFC after: 2 week	2010-12-28 08:42:12 +00:00
Andrey V. Elsukov	d3507dff37	Make EBR probe method less strict to be able to detect EBRs with small non-fatal inconsistencies. EBR may contain boot loader and sometimes it just has some garbage data. Now this does not prevent FreeBSD from using extended partitions. But since we do not support bootcode for EBR we mark tables which have non-empty boot area as corrupt. This makes them read-only so we cannot damage this data. PR: kern/141235 MFC after: 1 month	2010-12-28 08:36:44 +00:00
Rebecca Cran	fa5f3816c4	Don't warn if a partition appears not to be aligned on a track boundary. Modern disks use LBA and create a fake CHS geometry that doesn't have any relation to the on-disk layout of data.	2010-12-07 20:46:11 +00:00
Ivan Voras	e5c723f123	Add a note about the magic number 20. Actually, 22.75 entries fit in a 512 byte sector but when choosing magic numbers, 20 looks nicer. Discussed with: marcel	2010-12-02 19:47:27 +00:00
Jaakko Heinonen	e5a2338118	- Report an error when a label with invalid name is attempted to be created with glabel(8). - Fix a typo in an error message. - Fix comment typos. Approved by: pjd	2010-12-01 19:24:07 +00:00
Jaakko Heinonen	f7842e00f5	Use g_eventlock to protect against losing wakeups in the g_event process and replace tsleep(9) with msleep(9) which doesn't use a timeout. The previously used timeout caused the event process to wake up ten times per second on an idle system. one_event() is now called with the topology lock held and it returns with both the topology and event locks held when there are no more events in the queue. Reported by: mav, Marius Nünnerich Reviewed by: freebsd-geom	2010-11-22 16:47:53 +00:00
Ed Schouten	eb4c31fd41	Add support for asterisk characters when filling in the GELI password during boot. Change the last argument of gets() to indicate a visibility flag and add definitions for the numerical constants. Except for the value 2, gets() will behave exactly the same, so existing consumers shouldn't break. We only use it in two places, though. Submitted by: lme (older version)	2010-11-14 14:12:43 +00:00
Andrey V. Elsukov	55514bdfc0	Fix regression introduced in r215088: gpart(8) reports "arg0 'provider': Invalid argument" after creating new partition table. Move code for search of existing geom into g_part_find_geom function and use this function instead of g_part_parm_geom in g_part_ctl_create. Approved by: kib (mentor)	2010-11-11 12:13:41 +00:00
Andrey V. Elsukov	7085c3bc98	In r212554 name of G_PART_PARM_GEOM and G_PART_PARM_PROVIDER ctlreq parameters was changed to "arg0". Fix the last place where it is used. Approved by: kib (mentor)	2010-11-10 14:38:51 +00:00
Jaakko Heinonen	9d142a6ee6	Extend the g_eventlock mutex coverage in one_event() to include setting of the EV_DONE flag and use the mutex to protect against losing wakeups in g_waitfor_event(). Reported by: davidxu Tested by: davidxu Discussed on: freebsd-current	2010-11-03 16:19:35 +00:00
Andrey V. Elsukov	e7926a3703	Reimplemented "gpart destroy -F". Now it does all work in kernel. This was needed for recover implementation. Implement the recover command for GPT. Now GPT will be marked as corrupt when any of three types of corruption is detected: 1. Damaged primary GPT header or table 2. Damaged secondary GPT header or table 3. Secondary header is not located in the last LBA Marked GPT becomes read-only. Any changes with corrupt table are prohibited. Only "destroy" and "recover" commands are allowed. Discussed with: geom@ (mostly silence) Tested by: Ilya A. Arhipov Approved by: mav (mentor) MFC after: 2 weeks	2010-10-25 16:23:35 +00:00
Pawel Jakub Dawidek	0d2f5a4eaa	- Improve error messages, so instead of 'Not fully done', the user will get information that device is already suspended or that device is using one-time key and suspend is not supported. - 'geli suspend -a' silently skips devices that use one-time key, this is fine, but because we log which devices were suspended on the console, log also which devices were skipped.	2010-10-22 22:58:00 +00:00
Pawel Jakub Dawidek	2f2d7830b5	Close a race between checking if device is already suspended and suspending it.	2010-10-22 22:54:26 +00:00
Pawel Jakub Dawidek	d8d61ef8fc	Add State tag, so 'geli status' will report active/suspended status, eg: # geli status Name Status Components da0.eli SUSPENDED da0 da1.eli ACTIVE da1	2010-10-22 22:45:26 +00:00
Pawel Jakub Dawidek	4f294e1289	Encryption keys array might be NULL if device is suspended. Check for this, so we don't panic when we detach suspended device.	2010-10-22 22:44:09 +00:00
Pawel Jakub Dawidek	1d0214411e	Move sc_akeyctx and sc_ivctx initialization to the g_eli_mkey_propagate() function which eliminates code duplication and will ensure proper order of operation.	2010-10-22 22:13:11 +00:00
Pawel Jakub Dawidek	3ac01bc2ae	Free opencrypto sessions on suspend, as they also might keep encryption keys.	2010-10-21 19:44:28 +00:00
Pawel Jakub Dawidek	738ffa9780	Fix a bug introduced in r213067 where we use authentication key before initializing it.	2010-10-21 12:58:26 +00:00
Pawel Jakub Dawidek	5ad4a7c74a	Bring in geli suspend/resume functionality (finally). Before this change if you wanted to suspend your laptop and be sure that your encryption keys are safe, you had to stop all processes that use file system stored on encrypted device, unmount the file system and detach geli provider. This isn't very handy. If you are a lucky user of a laptop where suspend/resume actually works with FreeBSD (I'm not!) you most likely want to suspend your laptop, because you don't want to start everything over again when you turn your laptop back on. And this is where geli suspend/resume steps in. When you execute: # geli suspend -a geli will wait for all in-flight I/O requests, suspend new I/O requests, remove all geli sensitive data from the kernel memory (like encryption keys) and will wait for either 'geli resume' or 'geli detach'. Now with no keys in memory you can suspend your laptop without stopping any processes or unmounting any file systems. When you resume your laptop you have to resume geli devices using 'geli resume' command. You need to provide your passphrase, etc. again so the keys can be restored and suspended I/O requests released. Of course you need to remember that 'geli suspend' won't clear file system cache and other places where data from your geli-encrypted file system might be present. But to get rid of those, stopping processes and unmounting file system won't help either - you have to turn your laptop off. Be warned. Also note that suspending a geli device which contains the file system with the geli utility (or anything used by 'geli resume') is not a very good idea, as you won't be able to resume it - when you execute geli(8), the kernel will try to read it and this read I/O request will be suspended.	2010-10-20 20:50:55 +00:00
Pawel Jakub Dawidek	056638c469	- Add missing comments. - Make a comment consistent with others.	2010-10-20 20:01:45 +00:00
Jaakko Heinonen	bc2589f5b7	Use make_dev_p(9) with the MAKEDEV_CHECKNAME flag instead of make_dev(9) and print a diagnostic if the call fails. This avoids a panic when a device with an invalid name is attempted to be registered. For example the label class gets device names from untrusted input. Reviewed by: freebsd-geom	2010-10-19 16:48:49 +00:00
Rui Paulo	42a783c16a	The canonical way to print __func__ when using KASSERT() is to write ("%s", __func__). This avoids clang's -Wformat-string warnings.	2010-10-13 11:35:59 +00:00
Andrey V. Elsukov	21bf062e7e	Replace strlen(_PATH_DEV) with sizeof(_PATH_DEV) - 1. Suggested by: kib Approved by: kib (mentor) MFC after: 5 days	2010-10-09 20:20:27 +00:00
Ulf Lilleengen	de02b15928	- Check flag with the bitwise operator, not the logical operator. Submitted by: arundel MFC after: 1 week	2010-10-01 06:12:13 +00:00
Andrey V. Elsukov	b1da166ef1	Some schemes can allocate memory for internal purposes but when GEOM does withering this memory isn't freed. Add G_PART_DESTROY call to g_part_wither. Also add missing g_free() call to G_PART_READ method for MBR and PC98 schemes. Submitted by: jh (previous version) Reviewed by: pjd Approved by: kib (mentor)	2010-09-25 18:27:29 +00:00
Pawel Jakub Dawidek	f95168e08d	Change g_eli_debug to int, so one can turn off any GELI output by setting kern.geom.eli.debug sysctl to -1. MFC after: 2 weeks	2010-09-25 10:32:04 +00:00
Pawel Jakub Dawidek	350e8df8de	Ignore errors from BIO_FLUSH. It might confuse users that provider wasn't really killed. What we really care about are write errors only. MFC after: 2 weeks	2010-09-25 10:31:05 +00:00
Pawel Jakub Dawidek	cec283baf4	Allow to configure GPT attributes. It shouldn't be allowed to set bootfailed attribute (it should be allowed only to unset it), but for test purposes it might be useful, so the current code allows it. Reviewed by: arch@ (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>) MFC after: 2 weeks	2010-09-24 19:33:47 +00:00
Pawel Jakub Dawidek	9839c97b4d	Update copyright years. MFC after: 1 week	2010-09-23 12:02:08 +00:00
Pawel Jakub Dawidek	9a5a1d1e1e	Add support for AES-XTS. This will be the default now. MFC after: 1 week	2010-09-23 11:58:36 +00:00
Pawel Jakub Dawidek	c6a26d4c88	Implement switching of data encryption key every 2^20 blocks. This ensures the same encryption key won't be used for more than 2^20 blocks (sectors). This will be the default now. MFC after: 1 week	2010-09-23 11:49:47 +00:00
Pawel Jakub Dawidek	1f0fb66f30	Make the code similar to the code in g_eli_integrity.c. MFC after: 1 week	2010-09-23 11:23:10 +00:00
Pawel Jakub Dawidek	b35bfe7e10	Define default overwrite count, so that userland can use it. MFC after: 1 week	2010-09-23 11:19:48 +00:00
Pawel Jakub Dawidek	5e6dce4bf0	When trashing metadata, flush after each write. MFC after: 1 week	2010-09-23 10:43:37 +00:00
Brian Somers	0f81f3046d	Support attaching version 4 metadata Reviewed by: pjd	2010-09-19 10:45:53 +00:00
Alexander Motin	659f684ea0	Add support for dumping kernel to gconcat. Dumping goes to the component, where dump partition begins.	2010-09-16 17:24:25 +00:00
Pawel Jakub Dawidek	2738b715ea	Make the message printed when setting or unsetting an attribute less confusing. Before: ada0 has <attrib> set After: <attrib> set on ada0 MFC after: 2 weeks	2010-09-15 21:15:00 +00:00
Pawel Jakub Dawidek	2f4e9a099b	Make the message that informs about bootcode being written to disk less confusing. Note there is still no information about 'partcode' being written to disk (gpart bootcode -p <partcode> <disk>). Maybe in the future all the messages printed by gpart(8) on success could be hidden under -v? PR: bin/150239 Reported by: Roddi <roddi@me.com> Submitted by: arundel MFC after: 2 weeks	2010-09-15 20:59:13 +00:00
Pawel Jakub Dawidek	8107ecf892	- Change all places where G_TYPE_ASCNUM is used to G_TYPE_NUMBER. It turns out the new type wasn't really needed. - Reorganize code a little bit.	2010-09-14 16:21:13 +00:00
Pawel Jakub Dawidek	b312136354	Simplify the code a bit.	2010-09-14 11:42:07 +00:00
Pawel Jakub Dawidek	946e2f3595	- Remove gc_argname field. It was introduced for gpart(8), but if I understand everything correctly, we don't really need it. - Provide default numeric value as strings. This allows to simplify a lot of code. - Bump version number.	2010-09-13 13:48:18 +00:00
Pawel Jakub Dawidek	a478ea7490	- Allow to specify value as const pointers. - Make optional string values always an empty string.	2010-09-13 08:56:07 +00:00
Justin T. Gibbs	f03f7a0ca3	Correct bioq_disksort so that bioq_insert_tail() offers barrier semantics. Add the BIO_ORDERED flag for struct bio and update bio clients to use it. The barrier semantics of bioq_insert_tail() were broken in two ways: o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio. o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice. sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail(). o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active. o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to its next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows. o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction. sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio. sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set. sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED. Sponsored by: Spectra Logic Corporation MFC after: 1 month	2010-09-02 19:40:28 +00:00
Pawel Jakub Dawidek	efb46508ce	Correct offset conversion to little endian. It was implemented in version 2, but because of a bug it was a no-op, so we were still using offsets in native byte order for the host. Do it properly this time, bump version to 4 and set the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4. MFC after: 2 weeks	2010-08-28 08:30:20 +00:00
Alexander Motin	3d7cfb15f5	Remove bintime_cmp() function, unused since r200086. MFC after: 1 week	2010-08-18 15:38:10 +00:00
Andrey V. Elsukov	d02dc4cd41	Check that gsp is not NULL before access. It can be NULL for some cases. Approved by: kib (mentor) MFC after: 1 week	2010-08-03 11:21:17 +00:00
Andrey V. Elsukov	a45f4c6e2c	Check that table is not NULL before access, it can be NULL for some cases. Approved by: mav (mentor) MFC after: 2 weeks	2010-08-03 09:10:48 +00:00
Andrey V. Elsukov	a80f05bb73	Forward ioctl requests to original geom. PR: 148540 Silence from: luigi Reviewed by: pjd Approved by: mav (mentor) MFC after: 2 weeks	2010-08-02 10:30:49 +00:00
Andrey V. Elsukov	b6d4028166	Release access for consumers that are opened, but will be destroyed indirectly by orphan method. PR: 148688 Silence from: marcel Approved by: mav (mentor) MFC after: 2 weeks	2010-08-02 10:26:15 +00:00
Alexander Motin	8edcf69406	Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to GEOM. This information is needed for proper soft-RAID's on-disk metadata reading and writing.	2010-07-25 15:43:52 +00:00
Andrey V. Elsukov	733a9e2783	Prevent access after free to table entry in case when user deletes partition that is not yet created (changes are not committed to disk). PR: 148687 Approved by: mav (mentor) MFC after: 7 days	2010-07-23 06:30:01 +00:00
Ruslan Ermilov	cf1457e4fd	Fixed cache size decoding read from a label. PR: kern/144732 Submitted by: Eugene Grosbein MFC after: 3 days	2010-07-14 08:22:00 +00:00
Rui Paulo	c6b2b6fce6	Add NTFS partition type to GEOM_MBR.	2010-06-26 13:20:40 +00:00
Pawel Jakub Dawidek	2aa15ffdab	'unit' can be negative, so use signed type for it. Found by: Coverity Prevent CID: 3731 MFC after: 3 days	2010-06-14 21:58:55 +00:00
Pawel Jakub Dawidek	15725379d0	BIO_DELETE contains range we want to delete and doesn't provide any useful data, so there is no need to copy it to userland. MFC after: 3 days	2010-06-14 21:56:24 +00:00
Andriy Gapon	1bdfff2252	fix a few cases where a string is passed via format argument instead of via %s Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed to drop an intermediate buffer. Found by: clang MFC after: 2 week	2010-06-11 19:27:21 +00:00
Edward Tomasz Napierala	7ce513a52a	Untangle g_print_bio(), silencing Coverity. Found with: Coverity Prevent CID: 3566, 3567	2010-06-10 17:49:36 +00:00
Matt Jacob	59ccfe8176	Try and narrow the gap in which you act on an event that has been canceled. Obtained from: Jaakko Heinonen MFC after: 1 month	2010-06-08 22:40:02 +00:00
Edward Tomasz Napierala	c01eb2f36b	Make sure not to pass NULL to g_orphan_provider(). Found with: Coverity Prevent CID: 3411	2010-06-05 08:00:52 +00:00
Marius Strobl	36066952e5	Don't leak memory on destruction. Reviewed by: marcel MFC after: 3 days	2010-06-02 17:17:11 +00:00
Andriy Gapon	56b3acd001	g_label: fix possible NULL pointer dereference in case glabel debug level is >= 1 and gp->provider list is empty for some reason Found by: clang static analyzer MFC after: 4 days	2010-05-31 09:10:39 +00:00
Marius Strobl	785c3f7ea4	Fix some whitespace nits.	2010-05-24 17:33:02 +00:00
Nathan Whitehorn	0532c3a5a5	Teach gpart about bootcode on APM.	2010-05-16 22:21:33 +00:00
Matt Jacob	87e7f7be89	Yet another potential dereference of a dead provider. Sponsored by: Panasas MFC after: 1 week	2010-05-14 21:27:39 +00:00
Matt Jacob	1371a457d9	Make sure to check that the active provider pointer points to something before dereferencing the pointer. Sponsored by: Panasas MFC after: 1 week	2010-05-14 16:56:18 +00:00
Jaakko Heinonen	3535526b15	- Don't return EAGAIN from gv_unload(). It was used to work around the deadlock fixed in r207671. - Wait for worker process to exit at class unload. The worker process was not guaranteed to exit before the linker unloaded the module. - Use 0 as the worker process exit status instead of ENXIO and style the NOTREACHED comment. Reviewed by: lulf X-MFC after: r207671	2010-05-10 19:12:23 +00:00
Jaakko Heinonen	5a279fc5fc	In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case. EBUSY was probably used as a workaround for the deadlock fixed in r207671. Approved by: pjd X-MFC after: r207671	2010-05-10 19:08:53 +00:00
Ulf Lilleengen	42a9ad6697	- Remove obsolete flags. MFC after: 1 week	2010-05-08 16:19:17 +00:00
Jaakko Heinonen	9061251f9a	Fix deadlock between GEOM class unloading and withering. Withering can't proceed while g_unload_class() blocks the event thread. Fix this by not running g_unload_class() as a GEOM event and dropping the topology lock when withering needs to proceed. PR: kern/139847 Silence on: freebsd-geom	2010-05-05 18:53:24 +00:00
Marcel Moolenaar	c74f160cb0	Re-calculate a geometry when reprobing as well. PR: kern/145452 Reported by: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-25 01:56:39 +00:00
Marcel Moolenaar	6f702278e6	Fix undo for schemes that have internal partitions. Internal partitions do not constitute user-visible or active partitions and as such should not prevent undoing pending operations. While here, initialize the last usable sector for the placeholder geom based on the null scheme, created to allow undoing the destruction of a scheme. This gives consistent output with "gpart show". Based on a patch from: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-25 00:54:11 +00:00
Marcel Moolenaar	3f71c319f4	Implement the resize verb and add support for resizing partitions for all schemes but EBR. Quality work by Andrey! Submitted by: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-23 03:11:39 +00:00
Jaakko Heinonen	002d1d1c38	Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't assert that the topology lock is held when g_valid_obj() is called from debugger. MFC after: 1 week	2010-04-19 20:07:35 +00:00
Pawel Jakub Dawidek	31c4cef715	Use lower priority for GELI worker threads. This improves system responsiveness under heavy GELI load. MFC after: 3 days	2010-04-15 16:34:06 +00:00
Andriy Gapon	2a842317eb	g_io_check: respond to zero pp->mediasize with ENXIO Previously this condition was reported with EIO by bio_offset > mediasize check. Perhaps that check should be extended to bio_offset+bio_length > mediasize. MFC after: 1 week	2010-04-15 08:39:56 +00:00
Luigi Rizzo	83f8218814	fix copyright format, as requested by Joel Dahl	2010-04-13 09:56:17 +00:00
Luigi Rizzo	c36cf6fbbc	make code compile with KTR	2010-04-13 09:53:08 +00:00
Luigi Rizzo	1831a90ac5	Bring in geom_sched, support for scheduling disk I/O requests in a device independent manner. Also include an example anticipatory scheduler, gsched_rr, which gives very nice performance improvements in presence of competing random access patterns. This is joint work with Fabio Checconi, developed last year and presented at BSDCan 2009. You can find details in the README file or at http://info.iet.unipi.it/~luigi/geom_sched/	2010-04-12 16:37:45 +00:00
Andriy Gapon	8f128ff559	g_vfs_open: allow only one mount per device vnode In other words, deny multiple read-only mounts of the same device. Shared read-only mounts should theoretically be possible, but, unfortunately, can not be implemented correctly using current buffer cache code/interface and results in an eventual system crash. Also, using nullfs seems to be a more efficient way to achieve the same goal. This gets us back to where we were before GEOM and where other BSDs are. Submitted by: pjd (idea for checking for shared mounting) Discussed with: phk, pjd Silence from: fs@, geom@ MFC after: 2 weeks	2010-04-03 08:53:53 +00:00
Andriy Gapon	1b4bc5f851	bo_bsize: revert r205860 and take an alternative approach in getblk In r205860 I missed the fact that there is code that strongly assumes that devvp bo_bsize is equal to underlying provider's sectorsize. In those places it is hard to obtain the sectorsize in an alternative way if devvp bo_bsize is set to something else. So, I am reverting bo_bsize assignment in g_vfs_open. Instead, in getblk I use DEV_BSIZE block size for b_offset calculation if vp is a disk vp as reported by vn_isdisk. This should coincide with vp being a devvp. Reported by: Mykola Dzham <i@levsha.me> Tested by: Mykola Dzham <i@levsha.me> Pointyhat to: avg MFC after: 2 weeks X-ToDo: convert bread(devvp) in all fs to use bo_bsize-d blocks	2010-04-02 15:12:31 +00:00
Andriy Gapon	0c04f06072	g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE Because of how breadn -> bufstrategy -> g_vfs_strategy are currently implemented, bread on devvp always expects DEV_BSIZE block size. Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media properties or filesystem implementation details. Reviewed by: mckusick MFC after: 2 weeks	2010-03-29 20:34:25 +00:00
Matt Jacob	2b4969ff9e	Change how multipath labels are created and managed. This makes it easier to support various storage boxes which really aren't active-active. We only write the label on the first provider. For all other providers we just "add" the disk. This also allows for an "add" verb. A usage implication is that you should specify the currently active storage path as the first provider. Note that this does not add RDAC-like functionality, but better allows for autovolumefailover configurations (additional checkins elsewhere will support this). Sponsored by: Panasas MFC after: 1 month	2010-03-29 18:04:06 +00:00
Alexander Motin	a5be8eb530	Do not fetch precise time of request start when stats collection disabled. Reviewed by: pjd, phk	2010-03-24 18:04:25 +00:00
Matt Jacob	b5dce617d8	Add 'rotate' and 'getactive' verbs to provide some control and information about what the currently active path is. Sponsored by: Panasas MFC after: 1 month	2010-03-21 15:02:47 +00:00
Jaakko Heinonen	a41aa4a789	Escape characters unsafe for XML output in GEOM class, instance and provider names. - Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced with '?'. Those characters are disallowed in XML. - '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are replaced with XML numeric character reference. If the kern.geom.confxml sysctl provides invalid XML, libgeom geom_xml2tree() fails and utilities using it do not work. Unsafe characters are common in msdosfs and cd9660 labels. PR: kern/104389 Submitted by: Doug Steinwand (original version) Reviewed by: pjd Discussed on: freebsd-geom MFC after: 3 weeks	2010-03-20 16:16:13 +00:00
Pawel Jakub Dawidek	b0990a1dae	Simplify loops.	2010-03-18 13:11:43 +00:00
Ulf Lilleengen	77d2a01ea8	- Set missing flag when initiating a plex rebuild with the rebuildparity command. - Check if plex is already syncing or rebuilding before initiating a parity rebuild or check.	2010-03-08 21:16:28 +00:00
Pawel Jakub Dawidek	32115b105a	Please welcome HAST - Highly Available Storage. HAST allows to transparently store data on two physically separated machines connected over the TCP/IP network. HAST works in Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total. HAST operates on block level - it provides disk-like devices in /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There is no difference between using HAST-provided device and raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD. For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST. Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV	2010-02-18 23:16:19 +00:00
Pawel Jakub Dawidek	12f35a615a	- Style fixes. - Prefer strlcpy() over strncpy().	2010-02-18 22:29:35 +00:00
Pawel Jakub Dawidek	f24bf7522d	Correct comment.	2010-02-18 22:28:12 +00:00
Pawel Jakub Dawidek	e5131ab452	Log attach just like we log detach.	2010-02-18 22:27:38 +00:00
Oleksandr Tymoshenko	45a7687f90	- Give geom_redboot taste of flash/spi. Now there is another provider of redboot partitions. This patch was missed during merge from projects/mips.	2010-02-03 01:12:19 +00:00
Xin LI	38907b4cc7	Prevent NULL dereference by checking return value of gctl_get_asciiparam. MFC after: 2 weeks	2010-02-02 22:25:22 +00:00
Marcel Moolenaar	cd18ad8347	Export the UUID of the partition in the XML. The partition UUID is used by EFI's device path to identify a partition. In order for FreeBSD to add EFI boot options, proper device paths need to be constructed.	2010-01-30 23:13:19 +00:00
Ivan Voras	49e232f2c9	Go through with write_metadata() non-error-handling and make it return "void". This is mostly to avoid dead variable assignment warning by LLVM. No functional change. Pointed out by: trasz Approved by: gnn (mentor)	2010-01-25 20:51:40 +00:00
Edward Tomasz Napierala	fdf64c5752	Remove unneeded variables. Found with: clang	2010-01-25 17:00:21 +00:00
Edward Tomasz Napierala	1373012510	Remove pointless assignment. Found with: clang	2010-01-25 16:58:58 +00:00
Edward Tomasz Napierala	dc9098605e	Remove some pointless variable assignments. Found with: clang	2010-01-25 16:55:30 +00:00
Edward Tomasz Napierala	0a36cb97a8	Remove unused variable. Found with: clang	2010-01-25 16:10:22 +00:00
Xin LI	35daa28f30	Expose stripe offset and stripe size through libgeom and geom(8) userland utilities. Reviewed by: pjd, mav (earlier version)	2010-01-17 06:20:30 +00:00
Edward Tomasz Napierala	b3f9d8c804	Add gmountver, disk mount verification GEOM class. Note that due to e.g. write throttling ('wdrain'), it can stall all the disk I/O instead of just the device it's configured for. Using it for removable media is therefore not a good idea. Reviewed by: pjd (earlier version)	2010-01-16 09:52:49 +00:00
Alexander Motin	0c8fd0c8ac	Change the way in which zero stripesize is handled. Instead of reporting zero stripeoffset in such case (as if device has no stripes), report offset from the beginning of the media (as if device has single infinite stripe). This gives partitioning tools the information required to guess better partition alignment in case the hardware doesn't report its stripe size. For example, it should give disklabel info about odd offset made by fdisk.	2010-01-06 13:14:37 +00:00
Alexander Motin	8de5811320	Move wakeup() out of mutex to reduce contention.	2010-01-05 10:52:21 +00:00
Alexander Motin	86de0ca52c	Move wakeup() out of mutex to reduce contention.	2010-01-05 10:30:56 +00:00
Alexander Motin	06b215fd3a	Slightly optimize XOR calculation.	2010-01-05 02:06:05 +00:00
Marcel Moolenaar	665bb830e2	Properly return the UUID represented by the alias. PR: 142174 Submitted by: Przemyslaw Laczynski <torindel@gmail.com> Pointy hat to: rpaulo	2010-01-02 01:02:59 +00:00
Alexander Motin	0d883b11e3	Call wakeup() only for the first request on the queue.	2009-12-30 17:23:27 +00:00
Antoine Brodin	13e403fdea	(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month	2009-12-28 22:56:30 +00:00
Alexander Motin	1c80ec0a6b	Add BIO_DELETE support to ada(4): - For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by ACS-2 specification working draft. - For CompactFlash use CFA ERASE command, same as ad(4) does. With this patch, `newfs -E /dev/ada1` was able to restore write speed of my heavily worn OCZ Vertex SSD (firmware 1.4) up to the initial level for the most part of its capacity. Previous 1.3 firmware, even reporting TRIM capability bit set, was not working, reporting ABORT error for every DSM command. I have no idea whether it is normal, but for some reason it takes 200ms to handle any TRIM command on this drive, that was making delete extremely slow. But TRIM command is able to accept long list of LBAs and the length of that list doesn't seem to affect its execution time. Implemented request clustering algorithm allowed me to raise delete rate up to reasonable numbers, when many parallel DELETE requests running.	2009-12-28 20:08:01 +00:00
Alexander Motin	5f9b1143ac	Make geom_concat to passthrough stripe parameters of the first component, hoping that rest will fit.	2009-12-24 14:32:21 +00:00
Alexander Motin	113d8e5046	As soon as geom_raid3 reports its own stripe as sector size, report largest underlying provider's stripe, multiplied by number of data disks in array, due to transformation done, as array stripe.	2009-12-24 13:38:02 +00:00
Alexander Motin	92f60381d9	As soon as mirror has no own stripes, report largest stripe of underlying components, hoping others fit, if they are not equal.	2009-12-24 12:17:22 +00:00
Alexander Motin	8b30323843	Add two disk ioctls, giving user-level tools information about disk/array stripe (optimal access block) size and offset.	2009-12-24 11:05:23 +00:00
Alexander Motin	f00919d2fc	Make geom_stripe report its stripe size to upper layers.	2009-12-24 10:43:44 +00:00
Alexander Motin	d4060fa67d	Make graid3 fallback to malloc() when component request size is bigger than maximal prepared UMA zone size. This fixes crash with MAXPHYS > 128K.	2009-12-21 23:31:03 +00:00
Rui Paulo	33f7a4124d	Add Microsoft and NetBSD partition types handling.	2009-12-14 20:26:27 +00:00
Rui Paulo	f13174303d	Simplify partition type parsing by using a data-oriented model. While there add more Apple and Linux partition types.	2009-12-14 20:04:06 +00:00
Alexander Motin	891852cc12	Change 'load' balancing mode algorithm: - Instead of measuring last request execution time for each drive and choosing one with smallest time, use averaged number of requests, running on each drive. This information is more accurate and timely. It allows to distribute load between drives in more even and predictable way. - For each drive track offset of the last submitted request. If new request offset matches previous one or close for some drive, prefer that drive. It allows to significantly speedup simultaneous sequential reads. PR: kern/113885 Reviewed by: sobomax	2009-12-03 21:47:51 +00:00
Edward Tomasz Napierala	3ce9ca8947	Provide a set of sysctls and tunables to disable device node creation for specific "kinds" of disk labels - for example, GPT UUIDs. Reason for this is that sometimes, other GEOM classes attach to these device nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX instead of /dev/ada0p2, which is annoying. Reviewed by: pjd (earlier version) MFC after: 1 month	2009-11-28 11:57:43 +00:00
Rui Paulo	f9d551f7df	Add a missing check for Apple HFS partitions. MFC after: 1 week	2009-11-12 19:30:49 +00:00
Robert Noland	a59a131093	We need to allocate space for the header in the create path also. This fixes a null pointer dereference with "gpart create -s GPT" after the previous commit. Reported by: Yuri Pankov Pointyhat to: me MFC after: 1 week	2009-11-12 16:28:39 +00:00
Robert Noland	1c2dee3cc9	Fix handling of GPT headers when size is > 92 bytes. It is valid for an on-disk GPT header to report a header size which is greater than 92 bytes. Previously, we would read in the sector and copy only the 92 bytes that we know how to deal with before calculating the checksum for comparison. This meant that when we did the checksum, we overshot the buffer and took in random memory, so the checksum would fail. We now determine the size of the header and allocate enough space to preserve the entire on-disk contents. This allows us to correctly calculate the checksum and be able to modify and write the header back to the disk, while preserving data that we might not understand. Reported by: Kris Weston Approved by: marcel@ MFC after: 2 weeks	2009-11-07 17:29:03 +00:00
Robert Noland	e80d42dda2	Set the active flag in the PMBR when we install bootcode on a GPT partitioned disk. Some BIOS require this to be set before they will boot the device. Approved by: marcel MFC after: 2 weeks	2009-10-14 19:24:01 +00:00
Pawel Jakub Dawidek	f8727e71d7	If provider is open for writing when we taste it, skip it for classes that depend on on-disk metadata. This way we won't attach to providers that are used by other classes. For example we don't want to configure partitions on da0 if it is part of gmirror, what we really want is partitions on mirror/foo. During regular work it works like this: if provider is open for writing a class receives the spoiled event from GEOM and detaches, once provider is closed the taste event is sent again and class can rediscover its metadata if it is still there. This doesn't work that way when new class arrives, because GEOM gives all existing providers for it to taste, also those open for writing. Classes have to decide on their own if they want to deal with such providers (eg. geom_dev) or not (classes modified by this commit). Reported by: des, Oliver Lehmann <lehmann@ans-netz.de> Tested by: des, Oliver Lehmann <lehmann@ans-netz.de> Discussed with: phk, marcel Reviewed by: marcel MFC after: 3 days	2009-10-09 09:42:22 +00:00
Ulf Lilleengen	a8a3cd7d9d	- Improve error message consistency and wording.	2009-10-05 08:44:31 +00:00
Marcel Moolenaar	b61808630d	The first 96 bytes may not be zeroes. It can contain trivial boot code that merely emits an error and waits for a key press before rebooting. The error being that extended partitions are not bootable. The origin is presumed to be Windows 2000; Windows XP does not do this... For now, ignore the first 96 bytes when checking that the EBR is (for the most part) all zeroes. Tested by: Mario Lobo <mlobo@digiart.art.br> MFC after: 1 week	2009-09-28 23:52:47 +00:00
Marcel Moolenaar	87f4470620	Don't create more partitions than can fit in the table by checking that the index is within bounds.	2009-09-24 06:00:49 +00:00
Edward Tomasz Napierala	bb3fd7ff4f	Remove unused variable.	2009-09-08 17:20:17 +00:00
Alexander Motin	18e42503ed	Do not check proper request alignment here in geom_dev in production. It will be checked any way later by g_io_check() in g_io_schedule_down(). It is only needed here to not trigger panic from additional check, when INVARIANTS enabled. So cover it with #ifdef INVARIANTS. It saves two 64bit divisions per request.	2009-09-08 05:46:38 +00:00
Alexander Motin	7fc019af65	MFp4: Remove msleep() timeout from g_io_schedule_up/down(). It works fine without it, saving few percents of CPU on high request rates without need to rearm callout twice per request.	2009-09-06 19:33:13 +00:00
Pawel Jakub Dawidek	b740e905a4	Add support for changing providers priority. Submitted by: Mel Flynn	2009-09-06 06:52:06 +00:00
Alexander Motin	af582ea7af	Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS instead. It is NULL change for GENERIC kernel, but allows 'fast' mode to work on systems with increased MAXPHYS.	2009-09-04 19:20:46 +00:00
Pawel Jakub Dawidek	e93f5e4d25	Simplify g_disk_ident_adjust() function and allow any printable character in serial number. Discussed with: trasz Obtained from: Wheel Sp. z o.o. (http://www.wheel.pl)	2009-09-04 09:39:06 +00:00
Pawel Jakub Dawidek	07a93e6b3c	There's no need for checking result of M_WAITOK allocation.	2009-08-27 08:40:51 +00:00
Pawel Jakub Dawidek	c16ce31b31	Fix an obvious topology lock leak. MFC after: 3 days	2009-08-27 08:28:34 +00:00
Marcel Moolenaar	8530137252	The start of the EFI GPT partition in the PMBR can always be represented by CHS addressing. Don't define these fields as 0xff, but rather define them correctly. This prevents boot problems on PCs where GPT is being used. PR: 115406 Submitted by: Kent Hauser <kent@khauser.net> Approved by: re (kib)	2009-08-17 16:16:46 +00:00
Ulf Lilleengen	b79cac0f92	- Fix the issue with read access count modification on RAID-5 plexes properly. If the access counts were not increased and decreased in equal numbers by gvinum consumers, the read access count would be inconsistent with the write access count. Instead, modify the read access count with the write access count directly to prevent any inconsistencies. Approved by: re (kib)	2009-07-18 11:12:48 +00:00
Marcel Moolenaar	f43b57e32a	Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c is invalid because the ioctl happens without prior open. The ioctl got introduced to provide backward compatibility for extended partitions, but it ended up not being used because it didn't work as expected. Since there are no consumers of the ioctl and the implementation is broken, the best fix is to remove the code entirely. Spotted by: phk Approved by: re (kensmith)	2009-07-08 05:56:14 +00:00
Edward Tomasz Napierala	8edfe76ab5	Fix a panic which (reportedly) can happen when unmounting a filesystem with I/O requests in flight on kernels compiled with "options INVARIANTS". Also, make it obvious it's not right to call g_valid_obj() (and macros using it, e.g. G_VALID_CONSUMER()) without topology lock held. Approved by: re (kib) Reported by: pho	2009-07-01 20:16:29 +00:00
Edward Tomasz Napierala	fb231f3627	Make gjournal work with kernel compiled with "options DIAGNOSTIC". Previously, it would panic immediately. Reviewed by: pjd Approved by: re (kib)	2009-06-30 14:34:06 +00:00
Ulf Lilleengen	ac2a008e69	- Apply the same naming rules of LVM names as done in the LVM code itself. PR: kern/135874	2009-06-24 22:09:30 +00:00
John Hay	65a4957806	Do not stop the loop when an empty or deleted directory entry is found. Rather just skip over it.	2009-06-24 06:42:13 +00:00
Ivan Voras	63f4d880e0	Fix tabs, slightly improve comments. Approved by: gnn (mentor) (original) Noticed by: stas	2009-06-18 11:12:11 +00:00
Ivan Voras	452f657cb9	Add support for labels derived from GPT metadata. Approved by: gnn (mentor) Reviewed by: pjd PR: 128398 Submitted by: Marius Nuennerich < marius at nuenneri.ch >	2009-06-13 00:27:03 +00:00
Luigi Rizzo	6231f75bcf	As discussed in the devsummit, introduce two fields in the struct bio to store classification information, and a hook for classifier functions that can be called by g_io_request(). This code is from Fabio Checconi as part of his GSOC work.	2009-06-11 09:55:26 +00:00
Pawel Jakub Dawidek	cb9b72ce4a	Simplify.	2009-06-05 23:35:43 +00:00
Doug Barton	8b3bfb0509	Crank the debug level necessary to display the "Label foo is removed" and "Label for provider ..." messages up from 0 to 1.	2009-05-30 22:31:52 +00:00
Jamie Gritton	76ca6f88da	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
Ulf Lilleengen	4147dd02cd	- Unbreak 64 bit platforms by casting off_t to intmax.	2009-05-26 14:15:06 +00:00
Ulf Lilleengen	6d66da20b7	- Fix wrong print on BIO_DONE. - Use db_printf instead of printf. While here, apply this to other ddb commands as well. Pointed out by: pjd	2009-05-26 10:03:44 +00:00
Ulf Lilleengen	bf7d2c1797	- Add 'show bio' DDB command. MFC after: 3 weeks	2009-05-26 07:29:17 +00:00
Edward Tomasz Napierala	916cd41c47	Check return value of gctl_get_asciiparam(). Found with: Coverity Prevent(tm) CID: 1118	2009-05-12 16:59:50 +00:00
Attilio Rao	dfd233edd5	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavily changed by this commit so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
Ulf Lilleengen	d8d015cddc	- Split up the BIO queue into a queue for new and one for completed requests. This is necessary for two reasons: 1) In order to avoid collisions with the use of a BIOs flags set by a consumer or a provider 2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags were available, so the consumer flags of a BIO had to be misused in order to support enough flags. The new queue makes it possible to recycle the GV_BIO_DONE flag into GV_BIO_GROW. As a consequence, gvinum will now work with any other GEOM class under it or on top of it. - Use bio_pflags for storing internal flags on downgoing BIOs, as the requests appear to come from a consumer of a gvinum volume. Use bio_cflags only for cloned BIOs. - Move gv_post_bio to be used internally for maintenance requests. - Remove some cases where flags were set without need. PR: kern/133604	2009-05-06 19:34:32 +00:00
Ulf Lilleengen	41944888fe	- Fix a case where a RAID5 volume would think that it is supposed to grow a new subdisk after a parity rebuild.	2009-05-06 19:18:19 +00:00
Ulf Lilleengen	11c4adc49e	- Check if any plexes are doing internal maintenance before removing them.	2009-05-06 19:06:28 +00:00
Ulf Lilleengen	5a0fa8531c	- Add forgotten KASSERT.	2009-05-06 18:37:32 +00:00
Ulf Lilleengen	1d8dfc60f4	- Fix a bug where the bio_data field of the wrong BIO is freed if an error occurs when doing a RAID5 request.	2009-05-06 18:27:28 +00:00
Ulf Lilleengen	451b95f489	- GV_BIO_RETRY is not used, and it is actually impossible with more than 8 values for bio_cflags/bio_pflags.	2009-05-06 18:24:56 +00:00
Ulf Lilleengen	040272465d	- Split the queue mutex into one for the event queue and one for the BIO queue, as they do not really relate and to prepare for an additional queue to be covered by the BIO queue mutex. - Implement wrappers for fetching the next element from the event queue as well as for putting a new element into the BIO queue.	2009-05-06 18:21:48 +00:00
Ulf Lilleengen	ad75dd77e0	- Make the gvinum softc invisible to userland, as it is not needed.	2009-05-04 17:30:20 +00:00
Ulf Lilleengen	697ab8be86	- Remove assertion of topology lock remaining from 7.x gvinum. It is not needed, as the renaming only changes internal gvinum names and will not alter the geom topology. - The topology lock was not held when calling g_wither_geom after renaming.	2009-04-18 16:36:27 +00:00
Marcel Moolenaar	cce94b6583	Precision '*' expects an int and strlen() returns a size_t. Compensate.	2009-04-16 05:52:47 +00:00
Marcel Moolenaar	6ad9a99f21	Add a compat option to the EBR scheme that controls the naming of the partitions (GEOM_PART_EBR_COMPAT). When compatibility is enabled, changes to the partitioning are disallowed. Remove the device name aliasing added previously to provide backward compatibility, but which in practice doesn't give us anything. Enable compatibility on amd64 and i386.	2009-04-15 22:38:22 +00:00
Ulf Lilleengen	1de45ea74d	- Move out allocation part of different gvinum objects into its own routine and make use of it in the gvinum userland code.	2009-04-10 08:50:14 +00:00
Andrew Thompson	853a10a581	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00

2114 Commits