freebsd-dev

Author	SHA1	Message	Date
Alexander Motin	1e68fe9c33	Avoid unneeded malloc/memcpy/free if there is no metadata on disk. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com> MFC after: 2 weeks	2014-12-05 10:23:18 +00:00
Alexander Motin	26f0f92fa2	Decode some binary fields of Intel metadata. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com> MFC after: 2 weeks	2014-12-04 15:54:45 +00:00
Davide Italiano	2be111bf7d	Follow up to r225617. In order to maximize the re-usability of kernel code in userland, rename in-kernel getenv()/setenv() to kern_getenv()/kern_setenv(). This fixes a namespace collision with libc symbols. Submitted by: kmacy Tested by: make universe	2014-10-16 18:04:43 +00:00
Hans Petter Selasky	af3b2549c4	Pull in r267961 and r267973 again. Fix for issues reported will follow.	2014-06-28 03:56:17 +00:00
Glen Barber	37a107a407	Revert r267961, r267973: These changes prevent sysctl(8) from returning proper output, such as: 1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory	2014-06-27 22:05:21 +00:00
Hans Petter Selasky	3da1cf1e88	Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types, both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function, allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating children of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, since some device drivers use it. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading kludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel. Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, since malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change. MFC after: 2 weeks Sponsored by: Mellanox Technologies	2014-06-27 16:33:43 +00:00
Alexander Motin	dea1e22600	Reduce number of opens by GEOM RAID during provider taste. Instead of opening/closing provider by each of metadata classes, do it only once in core code. Since for SCSI disks open/close means sending some SCSI commands to the device, this change reduces taste time. MFC after: 2 weeks Sponsored by: iXsystems, Inc.	2014-04-28 15:03:52 +00:00
Alexander Motin	1229e83d2b	Fix wrong sizes used to access PD_Type and PD_State DDF metadata fields. This caused incorrect behavior of arrays with big-endian DDF metadata. Little-endian (like used by Adaptec controllers) should not be harmed. The added workaround should be enough to manage compatibility. MFC after: 2 weeks	2014-04-10 16:00:33 +00:00
Eitan Adler	7a22215c53	Fix undefined behavior: (1 << 31) is not defined as 1 is an int and this shifts into the sign bit. Instead use (1U << 31) which gets the expected result. This fix is not ideal as it assumes a 32 bit int, but does fix the issue for most cases. A similar change was made in OpenBSD. Discussed with: -arch, rdivacky Reviewed by: cperciva	2013-11-30 22:17:27 +00:00
Alexander Motin	40ea77a036	Merge GEOM direct dispatch changes from the projects/camlock branch. When safety requirements are met, it allows to avoid passing I/O requests to GEOM g_up/g_down thread, executing them directly in the caller context. That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid several context switches per I/O. The defined now safety requirements are: - caller should not hold any locks and should be reenterable; - callee should not depend on GEOM dual-threaded concurrency semantics; - on the way down, if request is unmapped while callee doesn't support it, the context should be sleepable; - kernel thread stack usage should be below 50%. To keep compatibility with GEOM classes not meeting above requirements new provider and consumer flags added: - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request); - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done); - G_PF_DIRECT_SEND -- provider code meets caller requirements (done); - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request). Capable GEOM class can set them, allowing direct dispatch in cases where it is safe. If any of requirements are not met, request is queued to g_up or g_down thread same as before. Such GEOM classes were reviewed and updated to support direct dispatch: CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE, VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL, MAP, FLASHMAP, etc). To declare direct completion capability disk(9) KPI got new flag equivalent to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk drivers got it set now thanks to earlier CAM locking work. This change more than twice increases peak block storage performance on systems with many CPUs, together with earlier CAM locking changes reaching more than 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to 256 user-level threads). Sponsored by: iXsystems, Inc. MFC after: 2 months	2013-10-22 08:22:19 +00:00
Alexander Motin	b43560ab19	MFprojects/camlock r256445: Add unmapped I/O support to GEOM RAID.	2013-10-16 09:33:23 +00:00
Alexander Motin	0f0b2fd889	Return error when opening read-only volumes (like RAID4/5/...) for writing. Previously opens succeeded, but actual write operations returned errors. Requested by: peter MFC after: 2 weeks	2013-08-13 07:56:40 +00:00
Alexander Motin	db8645f05e	Oops, wrong constant at r254269.	2013-08-13 06:25:34 +00:00
Alexander Motin	e70b565ba4	Fix reasonable but safe Clang warnings.	2013-08-13 06:21:36 +00:00
Alexander Motin	8531bb3f0c	Introduce 3 seconds timeout on `graid stop` command (mostly with -f flag). Since completion waiting goes in g_event thread, it may cause GEOM deadlock if consumer on top (for example, ZFS) uses g_event thread for closing.	2013-07-27 15:02:19 +00:00
Alexander Motin	57eed4a86f	Fix vdc->Secondary_Element_Count metadata field access from 16 to 8 bit. In some cases it could cause kernel panic during failed drive replacement. Reported by: trasz MFC after: 1 week	2013-05-20 00:33:54 +00:00
Alexander Motin	bcb6ad36f2	Return "descr" field alike to "Intel RAID1 volume" for GEOM RAID to make it look better in bsdinstall.	2013-04-27 06:57:39 +00:00
Alexander Motin	a93c0ed463	Remove extra bio_data and bio_length copying to child request after calling g_clone_bio(), that already copied them.	2013-03-26 05:42:12 +00:00
Sean Bruno	bd9fba0cfe	Add legacy support to geom raid to create a /dev/arX device for support of upgrading older machines using ataraid(4) to newer releases. This optional parameter is controlled via kern.geom.raid.legacy_aliases and will create a /dev/ar0 device that will point at /dev/raid/r0 for example. Tested on Dell SC 1425 DDF-1 format software raid controllers installing from stable/7 and upgrading to stable/9 without having to adjust /etc/fstab Reviewed by: mav Obtained from: Yahoo! MFC after: 2 Weeks	2013-03-08 20:07:32 +00:00
Alexander Motin	34d3281c57	Fix panic when Secondary_Element_Count == 1 and Secondary_Element_Seq is not set (255). Reported by: sbruno MFC after: 1 week	2013-03-07 18:55:37 +00:00
Alexander Motin	c3ec009a97	- Fix rebuild position broken at r245522. - Identify one more metadata field.	2013-01-17 03:27:08 +00:00
Alexander Motin	821a0f639e	For Promise/AMD metadata add support for disks with capacity above 2TiB and for volumes with sector size above 512 bytes.	2013-01-17 00:50:25 +00:00
Alexander Motin	ed8180e665	Recalculate volume size only for real CONCATs. For SINGLE trust volume size given by metadata, as it should be correct and in some cases can be smaller than subdisk size.	2013-01-17 00:09:50 +00:00
Alexander Motin	4c10c25e33	Keep value of orig_config_id metadata field. Windows driver writes there previous value of config_id when it is changed in some cases. I guess it may be used to avoid some split-brain conditions.	2013-01-14 20:31:45 +00:00
Alexander Motin	eb84fc957c	Small cosmetic tuning of the IRRT status constants.	2013-01-14 16:38:43 +00:00
Alexander Motin	511c69d9ce	Print some more metadata fields.	2013-01-14 13:06:35 +00:00
Alexander Motin	898a4b74f4	Windows driver writes relative volume IDs to metadata field. Use that value as a hint for raid/rX device number to make it persistent across reboots.	2013-01-14 00:38:51 +00:00
Alexander Motin	f9462b9bbe	- Add checks for Intel metadata version and attributes. Ignore disks with unsupported metadata types like Intel Smart Response to not corrupt them. - Improve setting of these things during metadata writing to protect from incapable BIOS'es and other implementations.	2013-01-13 23:00:40 +00:00
Alexander Motin	b99586c25f	Improve support for disabled disks. If disabled disk disconnected and then reconnected back, leave it as disconnected. If new disk inserted instead of disabled, rebuild it and leave as enabled.	2013-01-13 14:30:37 +00:00
Alexander Motin	865aea63c3	Windows handles INIT and VERIFY as array-wide and it doesn't specify which disks should be rebuilt. Our rebuild code is at the same time disk-centric. To handle this situation properly check all disks for RBLD flags, and if no disk specified try rebuild/resync all of them except newly inserted.	2013-01-12 21:51:49 +00:00
Alexander Motin	4c95a24141	Implement migration from single disk to RAID1/IRRT for Intel metadata. Windows driver uses such migration when it creates new arrays. While GEOM RAID has no mechanism to implement migration in general case, this specific case still can be handled easily via degraded RAID1 creation followed by regular rebuild.	2013-01-12 18:25:48 +00:00
Alexander Motin	26c538bc0b	Add basic support for Intel Rapid Recover Technology (Intel RRT). It is similar to RAID1, but with dedicated master and recovery disks and manual control over synchronization. It allows using the recovery disk as a snapshot of the master disk from the time of the last sync. This implementation is not functionally complete compared to Windows, but it is better than silent conversion to RAID1 on first boot.	2013-01-12 09:35:44 +00:00
Alexander Motin	650e245ebf	Minor addition to r242323: Alike to BIO_WRITE, report success if at least one subdisk succeeded with BIO_DELETE. But unlike BIO_WRITE don't fail disk on BIO_DELETE error. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 21:08:06 +00:00
Alexander Motin	609a74746a	Add basic BIO_DELETE support to GEOM RAID class for all RAID levels. If at least one subdisk in the volume supports it, BIO_DELETE requests will be propagated down. Unfortunately, for RAID levels with redundancy unmapped blocks will be mapped back during first rebuild/resync process. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-29 18:04:38 +00:00
Alexander Motin	a479c51be3	Make GEOM RAID more aggressive in marking volumes as clean on shutdown and move that action from shutdown_pre_sync to shutdown_post_sync stage to avoid extra flapping. ZFS tends to not close devices on shutdown, that doesn't allow GEOM RAID to shutdown gracefully. To handle that, mark volume as clean just when shutdown time comes and there are no active writes. MFC after: 2 weeks	2012-10-29 14:18:54 +00:00
Alexander Motin	c6f0cd57e3	NULL-ify last previously used pointer instead of last possible pointer. This should be only a cosmetic change. Found by: Clang Static Analyzer	2012-10-10 20:41:37 +00:00
Alexander Motin	6871a543f9	Make graid command line a bit more friendly by allowing volume name or provider name to be specified instead of geom name (first argument in all subcommands except label). In most cases there is only one array used anyway, so it is not really useful to make user type ugly geom names like Intel-f0bdf223 or SiI-732c2b9448cf. Though they can be used in some cases. Sponsored by: iXsystems, Inc. MFC after: 1 month	2012-10-07 19:30:16 +00:00
Alexander Motin	c89d2fbe18	Add global and per-module sysctls/tunables to enable/disable metadata taste. That should help to handle some cases when disk has some RAID metadata that should be ignored, especially during boot. MFC after: 3 days	2012-09-13 13:27:09 +00:00
Alexander Motin	d9d6849693	Add missing FAILED event to g_raid_subdisk_event2str() to print it properly in debug messages. Submitted by: Dmitry Luhtionov <dmitryluhtionov@gmail.com>	2012-08-10 13:36:33 +00:00
Alexander Motin	ef844ef76f	- Prevent error status leak if write to some of the RAID1/1E volume disks failed while write to some other succeeded. Instead mark disk as failed. - Make RAID1E less aggressive in failing disks to avoid volume breakage. MFC after: 2 weeks	2012-05-11 13:20:17 +00:00
Alexander Motin	14f9f25ba0	Remove some hardcoded constants from code.	2012-05-06 16:41:27 +00:00
Alexander Motin	eb3b1cd0de	Plug small memory leaks.	2012-05-06 12:55:20 +00:00
Alexander Motin	8f12ca2ee1	Add support for RAID5R. Slightly improve support for RAIDMDF.	2012-05-06 11:32:36 +00:00
Alexander Motin	86b0366909	Fix bug causing memory corruption and panics with big-endian metadata.	2012-05-04 08:59:19 +00:00
Alexander Motin	4b97ff6137	Implement read-only support for volumes in optimal state (without using redundancy) for the following RAID levels: RAID4/5E/5EE/6/MDF.	2012-05-04 07:32:57 +00:00
Alexander Motin	8df8e26adc	Add optional -o argument to the `graid label` to specify some metadata format options. Use it for specifying byte order for the DDF metadata: big-endian defined by specification and little-endian used by Adaptec.	2012-05-03 05:32:56 +00:00
Alexander Motin	d525d87560	Improve spare disks support. Unluckily, for some reason Adaptec 1430SA RAID BIOS doesn't want to understand spare disks created by graid. But at least spares created by BIOS are working fine now.	2012-05-01 18:00:31 +00:00
Alexander Motin	2b9c925ff0	Implement volume deletion if disk has more than one partition.	2012-05-01 09:21:21 +00:00
Alexander Motin	47e980965c	Improve DDF metadata writing.	2012-05-01 08:19:29 +00:00
Alexander Motin	00f32ecbd0	Add to the GEOM RAID class a module supporting the DDF metadata format, as defined by the SNIA Common RAID Disk Data Format Specification v2.0. Supports multiple volumes per array and multiple partitions per disk. Supports standard big-endian and Adaptec's little-endian byte ordering. Supports all single-layer RAID levels. Dual-layer RAID levels except RAID10 are not supported now because of GEOM RAID design limitations. Some work is still to be done, but the present code already manages basic interoperation with RAID BIOS of the Adaptec 1430SA SATA RAID controller. MFC after: 1 month Sponsored by: iXsystems, Inc.	2012-04-30 17:53:02 +00:00

64 Commits