freebsd-skq

Author	SHA1	Message	Date
attilio	d57a3c7c06	MFC	2011-05-16 16:34:03 +00:00
ae	0e1ff53f69	Make diagnostic messages more specific. With bootverbose print out all inconsistencies of integrity in the partition table, not first found only. Requested by: kib	2011-05-16 15:59:50 +00:00
ae	40e0f433dd	Add diagnostic messages for integrity checks.	2011-05-16 12:00:32 +00:00
ae	4675dfde1b	Add a sysctl kern.geom.part.check_integrity for those who have corrupt partition tables and lost the ability to boot after r221788. Also unhide an error message from bootverbose; this makes it easier to determine the problem.	2011-05-15 20:03:54 +00:00
attilio	d7d74971f1	MFC	2011-05-15 15:47:16 +00:00
trociny	f698adac68	Fix a memory leak possible in g_eli_key_allocate() if the key with the same keyno is added while we aren't holding the lock. Approved by: pjd (mentor) MFC after: 1 week	2011-05-15 12:39:30 +00:00
attilio	99e65551b9	MFC	2011-05-12 14:01:40 +00:00
thompsa	34effb6be7	Move the three geom kprocs as threads under a single pid. Reviewed by: julian	2011-05-11 21:47:30 +00:00
ae	d529455d7d	Add basic metadata integrity check. When a partition table was probed and read successfully but contains invalid values (e.g. overlapping partitions, or an offset or size that is out of bounds), the table will be rejected. MFC after: 1 month	2011-05-11 19:59:43 +00:00
attilio	7ff10cb598	MFC	2011-05-08 14:56:02 +00:00
ae	c286c25c24	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:28:13 +00:00
ae	1dcde07c6f	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:20:30 +00:00
ae	37c4b4161f	Limit number of sectors that can be addressed. Reject table if blkcount from metadata is greater than provider.	2011-05-08 12:16:39 +00:00
ae	a0e3009476	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 12:11:16 +00:00
ae	45e0589617	Replace UINT_MAX to UINT32_MAX. Pointed out by: kib MFC after: 1 week	2011-05-08 11:42:51 +00:00
ae	fb474a94c3	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 11:20:27 +00:00
ae	4e0c05db9b	Limit number of sectors that can be addressed. MFC after: 1 week	2011-05-08 11:16:17 +00:00
pjd	2b4c44d1c2	Export GELI class version via sysctl kern.geom.eli.version. MFC after: 1 week	2011-05-08 09:29:21 +00:00
pjd	ca83b3035d	Version 6 is compatible with version 5 when it comes to control commands. MFC after: 1 week	2011-05-08 09:25:54 +00:00
pjd	b2b06929c8	Detect and handle metadata of version 6. MFC after: 1 week	2011-05-08 09:25:16 +00:00
pjd	f181092612	When support for multiple encryption keys was committed, GELI integrity mode was not updated to pass CRD_F_KEY_EXPLICIT flag to opencrypto. This resulted in always using first key. We need to support providers created with this bug, so set special G_ELI_FLAG_FIRST_KEY flag for GELI provider in integrity mode with version smaller than 6 and pass the CRD_F_KEY_EXPLICIT flag to opencrypto only if G_ELI_FLAG_FIRST_KEY doesn't exist. Reported by: Anton Yuzhaninov <citrin@citrin.ru> MFC after: 1 week	2011-05-08 09:17:56 +00:00
pjd	b5e32df2e0	Remove prototype for a function that no longer exists. MFC after: 1 week	2011-05-08 09:11:04 +00:00
pjd	db388d2134	Drop proper key. MFC after: 1 week	2011-05-08 09:09:49 +00:00
pjd	bd8265bce9	Add magic field to the g_eli_key structure to detect if we are really operating on proper structures. MFC after: 1 week	2011-05-08 09:08:50 +00:00
attilio	a0b51ba62f	MFC	2011-05-06 22:45:33 +00:00
adrian	8b8191cbec	Updates to geom_map from the author. The major update here is to support 64 bit size/offsets. There's also style related changes. Submitted by: ray@dlink.ua	2011-05-05 14:43:09 +00:00
attilio	fe4de567b5	Commit the support for removing cpumask_t and replacing it directly with cpuset_t objects. That is going to offer the underlying support for a simple bump of MAXCPU and then support for number of cpus > 32 (as it is today). Right now, cpumask_t is an int, 32 bits on all our supported architectures. cpuset_t on the other side is implemented as an array of longs, and easily extendible by definition. The architectures touched by this commit are the following: - amd64 - i386 - pc98 - arm - ia64 - XEN while the others are still missing. Userland is believed to be fully converted with the changes contained here. Some technical notes: - This commit may be considered an ABI nop for all the architectures different from amd64 and ia64 (and sparc64 in the future) - per-cpu members, which are now converted to cpuset_t, need to be accessed avoiding migration, because the size of cpuset_t should be considered unknown - size of cpuset_t objects is different from kernel and userland (this is primarily done in order to leave some more space in userland to cope with KBI extensions). If you need to access kernel cpuset_t from the userland please refer to example in this patch on how to do that correctly (kgdb may be a good source, for example). - Support for other architectures is going to be added soon - Only MAXCPU for amd64 is bumped now The patch has been tested by sbruno and Nicholas Esborn on opteron 4 x 12 pack CPUs. More testing on big SMP is expected to come soon. pluknet tested the patch with his 8-ways on both amd64 and i386. Tested by: pluknet, sbruno, gianni, Nicholas Esborn Reviewed by: jeff, jhb, sbruno	2011-05-05 14:39:14 +00:00
ae	0bf1fc417f	Remove unneeded code. MFC after: 1 week	2011-05-04 18:41:26 +00:00
ae	f99e745c73	Remove unneeded code. MFC after: 1 week	2011-05-04 18:26:45 +00:00
ae	a7dee41b12	Remove unneeded code. MFC after: 1 week	2011-05-04 18:17:21 +00:00
ae	548ffad229	Removed KASSERT, g_new_providerf() can not fail. MFC after: 1 week	2011-05-04 18:06:40 +00:00
ae	fc73075ad0	Remove "for a moment" assignment. struct g_geom zeroed when allocated. MFC after: 1 week	2011-05-04 17:56:53 +00:00
ae	f7ea6f62c3	Remove unneeded checks, g_new_xxx functions can not fail. MFC after: 1 week	2011-05-04 17:37:37 +00:00
ae	2fbd405c08	When checking existence of providers skip those which are orphaned. PR: kern/132273 MFC after: 2 weeks	2011-05-04 12:59:11 +00:00
mav	19fa03535a	Use make_dev_alias_p() added in r221397 to create alias dev entry. It removes panic in case if alias name is already busy for some reason.	2011-05-03 19:12:42 +00:00
mav	79493720f3	Implement relaxed comparison for hardcoded provider names so that it ignores the adX/adaY difference in both directions, to simplify migration to the CAM-based ATA or back.	2011-04-27 00:10:26 +00:00
mav	519a30551e	- Add shim to simplify migration to the CAM-based ATA. For each new adaX device in /dev/ create symbolic link with adY name, trying to mimic old ATA numbering. Imitation is not complete, but should be enough in most cases to mount file systems without touching /etc/fstab. - To know what behavior to mimic, restore ATA_STATIC_ID option in cases where it was present before. - Add some more details to UPDATING.	2011-04-26 17:01:49 +00:00
pjd	96e417f741	One key is expected from providers smaller than or equal to (2^20)*sectorsize bytes. Remove bogus assertion and while here remove another too obvious assertion. Reported by: Fabian Keil <freebsd-listen@fabiankeil.de> MFC after: 2 weeks	2011-04-24 10:41:13 +00:00
pjd	4e8487e9df	If number of keys for the given provider doesn't exceed the limit, allocate all of them at attach time. This allows to avoid moving keys around in the most-recently-used queue and needs no mutex synchronization nor refcounting. MFC after: 2 weeks	2011-04-21 13:35:20 +00:00
pjd	7e657fb243	Instead of allocating memory for all the keys at device attach, create reasonably large cache for the keys that is filled when needed. The previous version was problematic for very large providers (hundreds of terabytes or several petabytes). Every terabyte of data needs around 256kB for keys. Make the default cache limit big enough to fit all the keys needed for 4TB providers, which will eat at most 1MB of memory. MFC after: 2 weeks	2011-04-21 13:31:43 +00:00
mav	0bbb5b8e1a	Reduce geom_raid log verbosity.	2011-04-18 16:15:59 +00:00
gavin	14b170ebfa	Remove an incorrect be16toh() that prevented geom_part_apm from working on little-endian machines. Reviewed by: marcel MFC after: 2 weeks	2011-04-15 12:32:52 +00:00
adrian	d7acba8fca	Introduce geom_map, a GEOM provider designed for use by embedded flash stores. Some devices - notably those with uboot - don't have an explicit partition table (eg like Redboot's FIS.) geom_map thus provides an easy way to export the hard-coded flash layout as geom providers for use by filesystems and other tools. It also includes a "search" function which allows for dynamic creation of partition layouts where the device only has a single hard-coded partition. For example, if there is a "kernel+rootfs" partition, a single image can be created which appends the rootfs after the kernel with an appropriate search string. geom_map can be told to search for said search string and create a partition beginning after it. Submitted by: Aleksandr Rybalko <ray@dlink.ua>	2011-04-12 08:10:25 +00:00
trociny	e95ea38956	In g_eli_read_done() and g_eli_write_done(), for a bio with bio_children > 1, g_destroy_bio() is never called and the bio leaks. Fix this by calling g_destroy_bio() earlier, before the check. Submitted by: Victor Balada Diaz <victor@bsdes.net> (initial version) Approved by: pjd (mentor) MFC after: 1 week	2011-04-03 17:38:12 +00:00
pjd	6fa8fbd029	GEOM has an internal mechanism to deal with ENOMEM errors returned via g_io_deliver(). In such case it increases 'pace' counter on each ENOMEM and reschedules the request. The 'pace' counter is decreased for each request going down, but until 'pace' is greater than zero, GEOM will handle at most 10 requests per second. For GEOM GATE users that are proxy to local GEOM providers (like ggatel(8) and HAST) we can end up with almost permanent slow down of GEOM down queue. This is because once we reach GEOM GATE queue limit, we return ENOMEM to the GEOM. This means that we have, eg. 1024 I/O requests in the GEOM GATE queue. To make room in the queue and stop returning ENOMEM we need to proceed the requests of course, but those requests are handled by userland daemons that handle them by reading/writing also from/to local GEOM providers. For example with HAST, a new requests comes to /dev/hast/data, which is GEOM GATE provider. GEOM GATE passes the request to hastd(8) and hastd(8) reads/writes from/to /dev/da0. Once we reach GEOM GATE queue limit, to free up a slot in GEOM GATE queue, hastd(8) has to read/write from/to /dev/da0, but this request will also be very slow, because GEOM now slows down all the requests. We end up with full queue that we can unload at the speed of 10 requests per second. This simply looks like a deadlock. Fix it by allowing userland daemons that work with both GEOM GATE and local GEOM providers to specify unlimited queue size, so GEOM GATE will never return ENOMEM to the GEOM. MFC after: 1 week	2011-04-02 06:56:06 +00:00
mav	f19e4d3eda	Bunch of small bugfixes and cleanups. Found with: Clang Static Analyzer	2011-03-31 16:19:53 +00:00
mav	8fca35a71a	Bunch of small bugfixes and cleanups. Found with: Coverity Prevent(tm) CID: 9656, 9658, 9693, 9705, 9706, 9707, 9808, 9809, 9810, 9711, 9712, 9713, 9714	2011-03-31 16:14:35 +00:00
ae	fa6777f630	Remove unneeded checks, g_new_xxx functions can not return NULL. Reviewed by: pjd MFC after: 1 week	2011-03-31 06:30:59 +00:00
trociny	0d88893312	Increase debug level on g_gate device destruction and add message on device creation. Suggested by: danger Approved by: pjd (mentor) MFC after: 3 days	2011-03-30 21:40:14 +00:00
trociny	42e994cbec	In g_gate_create() there is a window between when g_gate_softc is registered in g_gate_units array and when its sc_provider field is filled. If during this period g_gate_units is accessed by another thread that is checking for provider name collision the crash is possible. Fix this by adding sc_name field to struct g_gate_softc. In g_gate_create() when g_gate_softc is created but sc_provider is still not sc_name points to provider name stored in the local array. Approved by: pjd (mentor) Reported by: Freddie Cash <fjwcash@gmail.com> MFC after: 1 week	2011-03-27 19:56:55 +00:00
mav	8dab5b0501	MFgraid/head: Add new RAID GEOM class, that is going to replace ataraid(4) in supporting various BIOS-based software RAIDs. Unlike ataraid(4) this implementation does not depend on legacy ata(4) subsystem and can be used with any disk drivers, including new CAM-based ones (ahci(4), siis(4), mvs(4), ata(4) with `options ATA_CAM`). To make code more readable and extensible, this implementation follows modular design, including core part and two sets of modules, implementing support for different metadata formats and RAID levels. The following popular metadata formats are now supported: Intel, JMicron, NVIDIA, Promise (also used by AMD/ATI) and SiliconImage. The following RAID levels are now supported: RAID0, RAID1, RAID1E, RAID10, SINGLE, CONCAT. For all of these RAID levels and metadata formats this class supports the full cycle of volume operations: reading, writing, creation, deletion, disk removal and insertion, rebuilding, dirty shutdown detection and resynchronization, bad sector recovery, faulty disks tracking, hot-spare disks. For Intel and Promise formats there is support for multiple volumes per disk set. See the graid(8) manual page for additional details. Co-authored by: imp Sponsored by: Cisco Systems, Inc. and iXsystems, Inc.	2011-03-24 21:31:32 +00:00
mav	ba27262ba1	MFgraid/head r218212, r218257: Introduce new type of BIO_GETATTR -- GEOM::setstate, used to inform lower GEOM about state of its providers from the point of upper layers. Make geom_disk use led(4) subsystem to illuminate states in such fashion: FAILED - "1" (on), REBUILD - "f5" (slow blink), RESYNC - "f1" (fast blink), ACTIVE - "0" (off). LED name should be set for each disk via kern.geom.disk.%s.led sysctl. Later disk API could be extended to allow disk driver to report this info in custom way via its own facilities.	2011-03-24 19:23:42 +00:00
mav	ad433c09b3	MFgraid/head r217827: Change BIO_GETATTR("GEOM::kerneldump") API to make set_dumper() called by consumer (geom_dev) instead of provider (geom_disk). This allows any geom to insert its code into the dump call chain, implementing more sophisticated functionality than just disk partitioning.	2011-03-24 08:37:48 +00:00
sobomax	e356b9f82b	Some linux distros put mount point into the ext2fs labels, such as '/', or '/boot', which confuses the devfs code and can cause userland programs to fail reading /dev/ext2fs directory with weird error code, such as any program that uses pwlib. Strip any leading slashes before feeding the label to the geom_label code. Sponsored by: Sippy Software, Inc. MFC after: 1 week	2011-03-08 17:00:31 +00:00
nwhitehorn	3fa7ecd613	Add the disk ident and a human-meaningful description (here, the disk model string) to the geom_disk config XML so that they are easily accessible from userland. MFC after: 1 week	2011-02-26 14:58:54 +00:00
netchild	6bf702a55b	Add some FEATURE macros for various GEOM classes. No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: silence on geom@ during 2 weeks X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:24:35 +00:00
brucec	6d9b42b486	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
nyan	9d394a2762	Add support to set a slice name.	2011-02-19 11:09:38 +00:00
luigi	301c433ed1	Correct a subtle bug in the 'gsched_rr' disk scheduler. The algorithm is supposed to work as follows: in order to prevent starvation, when a new client starts being served we record the start time and reset the counter of bytes served. We then switch to a new client after a certain amount of time or bytes, even if the current one still has pending requests. To avoid charging a new client the time of the first seek, we start counting time when the first request is served. Unfortunately a bug in the previous version of the code failed to set the start time in certain cases, resulting in some processes exceeding their timeslice. The fix (in this patch) is trivial, though it took a while to find out and replicate the bug. Thanks to Tommaso Caprai for investigating and fixing the problem. Submitted by: Tommaso Caprai MFC after: 1 week	2011-02-14 08:09:02 +00:00
marcel	31f3b74d23	Use the preload_fetch_addr() and preload_fetch_size() convenience functions to obtain the address and size of the preloaded key files. Sponsored by: Juniper Networks.	2011-02-13 19:34:48 +00:00
nyan	a893333629	Add support to write boot menu.	2011-02-11 13:18:00 +00:00
ae	a42db43990	Add new user-friendly aliases for partition types for the MBR and EBR schemes: fat32, ebr, linux-data, linux-raid, linux-swap and linux-lvm. Add bios-boot GUID and alias for the GPT scheme. It is used by the GRUB 2 loader. Also sort the type definitions in diskmbr.h and in g_part.c. PR: bin/120990, kern/147664 MFC after: 2 weeks	2011-01-28 11:13:01 +00:00
ae	258630083a	While inspecting the disklabel check that start offset of partition is within provider's bounds. If not then reject this disklabel. Set bbarea to NULL so that it is not freed again in the destroy method. MFC after: 1 week	2011-01-27 08:02:26 +00:00
mdf	886222db75	Remove the CTLFLAG_NOLOCK as it seems to be both unused and unfunctional. Wiring the user buffer has only been done explicitly since r101422. Mark the kern.disks sysctl as MPSAFE since it is and it seems to have been mis-using the NOLOCK flag. Partially break the KPI (but not the KBI) for the sysctl_req 'lock' field since this member should be private and the "REQ_LOCKED" state seems meaningless now.	2011-01-26 22:48:09 +00:00
kib	3dbd972169	Treat async buffer writes from the gjournal switcher thread the same as from syncer. We shall not sleep on running buffer space when suspending. Reproduced and tested by: pho PR: kern/154228 MFC after: 1 week	2011-01-26 10:34:21 +00:00
ae	ca1b961657	Limit maximum number of GPT entries to 4k. It is the most realistic value and can prevent kernel memory exhaustion when a big value is specified on the command line. Split reading and writing operations into several iterations so as not to trigger a KASSERT when the data length is greater than MAXPHYS. PR: kern/144962, kern/147851 MFC after: 2 weeks	2011-01-18 09:52:53 +00:00
mdf	a47f6d552c	sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly. Commit the geom piece.	2011-01-12 19:54:07 +00:00
ae	f8ef0d32cf	Sector size can not be greater than MAXPHYS. Since GRAID3 calculates sector size from user-specified block size, report to user about big blocksize. PR: kern/147851 MFC after: 1 week	2011-01-12 13:55:01 +00:00
ae	e2bf490883	Sector size can not be greater than MAXPHYS. MFC after: 1 week	2011-01-12 12:26:10 +00:00
ae	73f08c7ac5	Remove redundant check. MFC after: 1 week	2011-01-11 13:22:20 +00:00
ae	0827d66d68	Round GNOP provider's mediasize to its sectorsize. This prevents KASSERT in g_io_request when geom classes doing tasting. PR: kern/147852 MFC after: 1 week	2011-01-11 11:42:22 +00:00
mdf	be1c25c6aa	Fix a memory overflow where the input length to g_gpt_utf8_to_utf16() was specified incorrectly, causing the bzero to run past the end of a malloc(9)'d object. Submitted by: Eric Youngblut < eyoungblut AT isilon DOT com > MFC after: 3 days	2011-01-07 16:46:20 +00:00
nwhitehorn	f934c98bcf	Add an entry to the gpart XML to determine if the geom has pending changes that need to be committed (or undone). MFC after: 2 weeks	2011-01-06 03:36:04 +00:00
kib	a6922e1e8c	Finish r210923, 210926. Mark some devices as eternal. MFC after: 2 weeks	2011-01-04 10:59:38 +00:00
kib	cbd7f9d931	Add reporting of GEOM::candelete BIO_GETATTR for md(4) and geom_disk(4). Non-zero value of attribute means that device supports BIO_DELETE. Suggested and reviewed by: pjd Tested by: pho MFC after: 1 week	2010-12-29 12:11:07 +00:00
ae	8db1129593	Allow destroying EBR in COMPAT (default) mode. MFC after: 2 weeks	2010-12-28 08:42:12 +00:00
ae	43a50d46ef	Make the EBR probe method less strict so that it can detect EBRs with small non-fatal inconsistencies. An EBR may contain a boot loader and sometimes it just has some garbage data. Now this does not prevent FreeBSD from using extended partitions. But since we do not support bootcode for EBR we mark tables which have a non-empty boot area as corrupt. This makes them read-only, so we can not damage this data. PR: kern/141235 MFC after: 1 month	2010-12-28 08:36:44 +00:00
brucec	3ced539f50	Don't warn if a partition appears not to be aligned on a track boundary. Modern disks use LBA and create a fake CHS geometry that doesn't have any relation to the on-disk layout of data.	2010-12-07 20:46:11 +00:00
ivoras	904cc72f19	Add a note about the magic number 20. Actually, 22.75 entries fit in a 512 byte sector but when choosing magic numbers, 20 looks nicer. Discussed with: marcel	2010-12-02 19:47:27 +00:00
jh	0a3bc2b1b9	- Report an error when a label with invalid name is attempted to be created with glabel(8). - Fix a typo in an error message. - Fix comment typos. Approved by: pjd	2010-12-01 19:24:07 +00:00
jh	9c043d5908	Use g_eventlock to protect against losing wakeups in the g_event process and replace tsleep(9) with msleep(9) which doesn't use a timeout. The previously used timeout caused the event process to wake up ten times per second on an idle system. one_event() is now called with the topology lock held and it returns with both the topology and event locks held when there are no more events in the queue. Reported by: mav, Marius Nünnerich Reviewed by: freebsd-geom	2010-11-22 16:47:53 +00:00
ed	4ee939c936	Add support for asterisk characters when filling in the GELI password during boot. Change the last argument of gets() to indicate a visibility flag and add definitions for the numerical constants. Except for the value 2, gets() will behave exactly the same, so existing consumers shouldn't break. We only use it in two places, though. Submitted by: lme (older version)	2010-11-14 14:12:43 +00:00
ae	54af98ea87	Fix regression introduced in r215088: gpart(8) reports "arg0 'provider': Invalid argument" after creating new partition table. Move code for search of existing geom into g_part_find_geom function and use this function instead of g_part_parm_geom in g_part_ctl_create. Approved by: kib (mentor)	2010-11-11 12:13:41 +00:00
ae	802c192516	In r212554 name of G_PART_PARM_GEOM and G_PART_PARM_PROVIDER ctlreq parameters was changed to "arg0". Fix the last place where it is used. Approved by: kib (mentor)	2010-11-10 14:38:51 +00:00
jh	6604019e26	Extend the g_eventlock mutex coverage in one_event() to include setting of the EV_DONE flag and use the mutex to protect against losing wakeups in g_waitfor_event(). Reported by: davidxu Tested by: davidxu Discussed on: freebsd-current	2010-11-03 16:19:35 +00:00
ae	f2e3b4bcd6	Reimplemented "gpart destroy -F". Now it does all work in kernel. This was needed for recover implementation. Implement the recover command for GPT. Now GPT will marked as corrupt when any of three types of corruption will be detected: 1. Damaged primary GPT header or table 2. Damaged secondary GPT header or table 3. Secondary header is not located in the last LBA Marked GPT becomes read-only. Any changes with corrupt table are prohibited. Only "destroy" and "recover" commands are allowed. Discussed with: geom@ (mostly silence) Tested by: Ilya A. Arhipov Approved by: mav (mentor) MFC after: 2 weeks	2010-10-25 16:23:35 +00:00
pjd	0e4c810277	- Improve error messages, so instead of 'Not fully done', the user will get information that device is already suspended or that device is using one-time key and suspend is not supported. - 'geli suspend -a' silently skips devices that use one-time key, this is fine, but because we log which device were suspended on the console, log also which devices were skipped.	2010-10-22 22:58:00 +00:00
pjd	a4a3ac19b4	Close a race between checking if device is already suspended and suspending it.	2010-10-22 22:54:26 +00:00
pjd	c24b1dbd26	Add State tag, so 'geli status' will report active/suspended status, eg: # geli status Name Status Components da0.eli SUSPENDED da0 da1.eli ACTIVE da1	2010-10-22 22:45:26 +00:00
pjd	8ba9fc913b	Encryption keys array might be NULL if device is suspended. Check for this, so we don't panic when we detach suspended device.	2010-10-22 22:44:09 +00:00
pjd	b022d95473	Move sc_akeyctx and sc_ivctx initialization to the g_eli_mkey_propagate() function which eliminates code duplication and will ensure proper order of operation.	2010-10-22 22:13:11 +00:00
pjd	94a920a001	Free opencrypto sessions on suspend, as they also might keep encryption keys.	2010-10-21 19:44:28 +00:00
pjd	5a22d5e587	Fix a bug introduced in r213067 where we use authentication key before initializing it.	2010-10-21 12:58:26 +00:00
pjd	d5e7511690	Bring in geli suspend/resume functionality (finally). Before this change if you wanted to suspend your laptop and be sure that your encryption keys are safe, you had to stop all processes that use file system stored on encrypted device, unmount the file system and detach geli provider. This isn't very handy. If you are a lucky user of a laptop where suspend/resume actually works with FreeBSD (I'm not!) you most likely want to suspend your laptop, because you don't want to start everything over again when you turn your laptop back on. And this is where geli suspend/resume steps in. When you execute: # geli suspend -a geli will wait for all in-flight I/O requests, suspend new I/O requests, remove all geli sensitive data from the kernel memory (like encryption keys) and will wait for either 'geli resume' or 'geli detach'. Now with no keys in memory you can suspend your laptop without stopping any processes or unmounting any file systems. When you resume your laptop you have to resume geli devices using 'geli resume' command. You need to provide your passphrase, etc. again so the keys can be restored and suspended I/O requests released. Of course you need to remember that 'geli suspend' won't clear file system cache and other places where data from your geli-encrypted file system might be present. But to get rid of those stopping processes and unmounting file system won't help either - you have to turn your laptop off. Be warned. Also note, that suspending geli device which contains file system with geli utility (or anything used by 'geli resume') is not very good idea, as you won't be able to resume it - when you execute geli(8), the kernel will try to read it and this read I/O request will be suspended.	2010-10-20 20:50:55 +00:00
pjd	75395aabbc	- Add missing comments. - Make a comment consistent with others.	2010-10-20 20:01:45 +00:00
jh	e0ef538943	Use make_dev_p(9) with the MAKEDEV_CHECKNAME flag instead of make_dev(9) and print a diagnostic if the call fails. This avoids a panic when a device with an invalid name is attempted to be registered. For example the label class gets device names from untrusted input. Reviewed by: freebsd-geom	2010-10-19 16:48:49 +00:00
rpaulo	7bca860ea7	The canonical way to print __func__ when using KASSERT() is to write ("%s", __func__). This avoids clang's -Wformat-string warnings.	2010-10-13 11:35:59 +00:00
ae	ab9dd3ef58	Replace strlen(_PATH_DEV) with sizeof(_PATH_DEV) - 1. Suggested by: kib Approved by: kib (mentor) MFC after: 5 days	2010-10-09 20:20:27 +00:00
lulf	57b68bbf11	- Check flag with the bitwise operator, not the logical operator. Submitted by: arundel MFC after: 1 week	2010-10-01 06:12:13 +00:00
ae	0afdf593c0	Some schemes can allocate memory for internal purposes, but when GEOM does withering this memory isn't freed. Add G_PART_DESTROY call to g_part_wither. Also add missed g_free() call to G_PART_READ method for MBR and PC98 schemes. Submitted by: jh (previous version) Reviewed by: pjd Approved by: kib (mentor)	2010-09-25 18:27:29 +00:00
pjd	6b9ec43f8f	Change g_eli_debug to int, so one can turn off any GELI output by setting kern.geom.eli.debug sysctl to -1. MFC after: 2 weeks	2010-09-25 10:32:04 +00:00
pjd	60321cfa67	Ignore errors from BIO_FLUSH. It might confuse users that provider wasn't really killed. What we really care about are write errors only. MFC after: 2 weeks	2010-09-25 10:31:05 +00:00
pjd	39f36627d9	Allow to configure GPT attributes. It shouldn't be allowed to set bootfailed attribute (it should be allowed only to unset it), but for test purposes it might be useful, so the current code allows it. Reviewed by: arch@ (Message-ID: <20100917234542.GE1902@garage.freebsd.pl>) MFC after: 2 weeks	2010-09-24 19:33:47 +00:00
pjd	3ff79b30f5	Update copyright years. MFC after: 1 week	2010-09-23 12:02:08 +00:00
pjd	32404b1197	Add support for AES-XTS. This will be the default now. MFC after: 1 week	2010-09-23 11:58:36 +00:00
pjd	ed0ad07f3d	Implement switching of data encryption key every 2^20 blocks. This ensures the same encryption key won't be used for more than 2^20 blocks (sectors). This will be the default now. MFC after: 1 week	2010-09-23 11:49:47 +00:00
pjd	8c781f88d0	Make the code similar to the code in g_eli_integrity.c. MFC after: 1 week	2010-09-23 11:23:10 +00:00
pjd	72f4299778	Define default overwrite count, so that userland can use it. MFC after: 1 week	2010-09-23 11:19:48 +00:00
pjd	9a528e9595	When trashing metadata, flush after each write. MFC after: 1 week	2010-09-23 10:43:37 +00:00
brian	52f645c18b	Support attaching version 4 metadata Reviewed by: pjd	2010-09-19 10:45:53 +00:00
mav	0bae586ec2	Add support for dumping kernel to gconcat. Dumping goes to the component, where dump partition begins.	2010-09-16 17:24:25 +00:00
pjd	d7756299d9	Make the message printed when setting or unsetting an attribute less confusing. Before: ada0 has <attrib> set After: <attrib> set on ada0 MFC after: 2 weeks	2010-09-15 21:15:00 +00:00
pjd	0c5da4a1c0	Make the message that informs about bootcode being written to disk less confusing. Note there is still no information about 'partcode' being written to disk (gpart bootcode -p <partcode> <disk>). Maybe in the future all the messages printed by gpart(8) on success could be hidden under -v? PR: bin/150239 Reported by: Roddi <roddi@me.com> Submitted by: arundel MFC after: 2 weeks	2010-09-15 20:59:13 +00:00
pjd	e87685cef9	- Change all places where G_TYPE_ASCNUM is used to G_TYPE_NUMBER. It turns out the new type wasn't really needed. - Reorganize code a little bit.	2010-09-14 16:21:13 +00:00
pjd	65239d84e5	Simplify the code a bit.	2010-09-14 11:42:07 +00:00
pjd	3d8ce965d3	- Remove gc_argname field. It was introduced for gpart(8), but if I understand everything correctly, we don't really need it. - Provide default numeric value as strings. This allows to simplify a lot of code. - Bump version number.	2010-09-13 13:48:18 +00:00
pjd	6f96b7c228	- Allow to specify value as const pointers. - Make optional string values always an empty string.	2010-09-13 08:56:07 +00:00
gibbs	6833acab2d	Correct bioq_disksort so that bioq_insert_tail() offers barrier semantic. Add the BIO_ORDERED flag for struct bio and update bio clients to use it. The barrier semantics of bioq_insert_tail() were broken in two ways: o In bioq_disksort(), an added bio could be inserted at the head of the queue, even when a barrier was present, if the sort key for the new entry was less than that of the last queued barrier bio. o The last_offset used to generate the sort key for newly queued bios did not stay at the position of the barrier until either the barrier was de-queued, or a new barrier (which updates last_offset) was queued. When a barrier is in effect, we know that the disk will pass through the barrier position just before the "blocked bios" are released, so using the barrier's offset for last_offset is the optimal choice. sys/geom/sched/subr_disk.c: sys/kern/subr_disk.c: o Update last_offset in bioq_insert_tail(). o Only update last_offset in bioq_remove() if the removed bio is at the head of the queue (typically due to a call via bioq_takefirst()) and no barrier is active. o In bioq_disksort(), if we have a barrier (insert_point is non-NULL), set prev to the barrier and cur to it's next element. Now that last_offset is kept at the barrier position, this change isn't strictly necessary, but since we have to take a decision branch anyway, it does avoid one, no-op, loop iteration in the while loop that immediately follows. o In bioq_disksort(), bypass the normal sort for bios with the BIO_ORDERED attribute and instead insert them into the queue with bioq_insert_tail(). bioq_insert_tail() not only gives the desired command order during insertion, but also provides barrier semantics so that commands disksorted in the future cannot pass the just enqueued transaction. sys/sys/bio.h: Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio. sys/cam/ata/ata_da.c: sys/cam/scsi/scsi_da.c Use an ordered command for SCSI/ATA-NCQ commands issued in response to bios with the BIO_ORDERED flag set. sys/cam/scsi/scsi_da.c Use an ordered tag when issuing a synchronize cache command. Wrap some lines to 80 columns. sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c sys/geom/geom_io.c Mark bios with the BIO_FLUSH command as BIO_ORDERED. Sponsored by: Spectra Logic Corporation MFC after: 1 month	2010-09-02 19:40:28 +00:00
pjd	f3ed6934be	Correct offset conversion to little endian. It was implemented in version 2, but because of a bug it was a no-op, so we were still using offsets in native byte order for the host. Do it properly this time, bump version to 4 and set the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4. MFC after: 2 weeks	2010-08-28 08:30:20 +00:00
mav	95a3f24d16	Remove bintime_cmp() function, unused since r200086. MFC after: 1 week	2010-08-18 15:38:10 +00:00
ae	195997ebf0	Check that gsp is not NULL before access. It can be NULL for some cases. Approved by: kib (mentor) MFC after: 1 week	2010-08-03 11:21:17 +00:00
ae	91e7795d2b	Check that table is not NULL before access, it can be NULL for some cases. Approved by: mav (mentor) MFC after: 2 weeks	2010-08-03 09:10:48 +00:00
ae	7ff57057ce	Forward ioctl requests to original geom. PR: 148540 Silence from: luigi Reviewed by: pjd Approved by: mav (mentor) MFC after: 2 weeks	2010-08-02 10:30:49 +00:00
ae	e9c305133c	Release access for consumers that are opened, but will be destroyed indirectly by orphan method. PR: 148688 Silence from: marcel Approved by: mav (mentor) MFC after: 2 weeks	2010-08-02 10:26:15 +00:00
mav	68b26f6649	Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to GEOM. This information needed for proper soft-RAID's on-disk metadata reading and writing.	2010-07-25 15:43:52 +00:00
ae	6c308fec85	Prevent access after free to table entry in case when user deletes partition that not yet created (changes doesn't committed to disk). PR: 148687 Approved by: mav (mentor) MFC after: 7 days	2010-07-23 06:30:01 +00:00
ru	dcda8994d5	Fixed cache size decoding read from a label. PR: kern/144732 Submitted by: Eugene Grosbein MFC after: 3 days	2010-07-14 08:22:00 +00:00
rpaulo	ba91cecfe0	Add NTFS partition type to GEOM_MBR.	2010-06-26 13:20:40 +00:00
pjd	6016b9e7a6	'unit' can be negative, so use signed type for it. Found by: Coverity Prevent CID: 3731 MFC after: 3 days	2010-06-14 21:58:55 +00:00
pjd	9910a8f3bf	BIO_DELETE contains range we want to delete and doesn't provide any useful data, so there is no need to copy it to userland. MFC after: 3 days	2010-06-14 21:56:24 +00:00
avg	324886002f	fix a few cases where a string is passed via format argument instead of via %s Most of the cases looked harmless, but this is done for the sake of correctness. In one case it even allowed to drop an intermediate buffer. Found by: clang MFC after: 2 week	2010-06-11 19:27:21 +00:00
trasz	e35649401c	Untangle g_print_bio(), silencing Coverity. Found with: Coverity Prevent CID: 3566, 3567	2010-06-10 17:49:36 +00:00
mjacob	d16c7840fb	Try and narrow the gap in which you act on an event that has been canceled. Obtained from: Jaako Heinonen MFC after: 1 month	2010-06-08 22:40:02 +00:00
trasz	b0cd47b602	Make sure not to pass NULL to g_orphan_provider(). Found with: Coverity Prevent CID: 3411	2010-06-05 08:00:52 +00:00
marius	f05cc30a98	Don't leak memory on destruction. Reviewed by: marcel MFC after: 3 days	2010-06-02 17:17:11 +00:00
avg	4abc4c484d	g_label: fix possible NULL pointer dereference in case glabel debug level is >= 1 and gp->provider list is empty for some reason Found by: clang static analyzer MFC after: 4 days	2010-05-31 09:10:39 +00:00
marius	e326b6b301	Fix some whitespace nits.	2010-05-24 17:33:02 +00:00
nwhitehorn	9090794d6c	Teach gpart about bootcode on APM.	2010-05-16 22:21:33 +00:00
mjacob	5fcf18695d	Yet another potential dereference of a dead provider. Sponsored by: Panasas MFC after: 1 week	2010-05-14 21:27:39 +00:00
mjacob	9d307c8df9	Make sure to check that the active provider pointer points to something before dereferencing the pointer. Sponsored by: Pansas MFC after: 1 week	2010-05-14 16:56:18 +00:00
jh	e34ccbdd57	- Don't return EAGAIN from gv_unload(). It was used to work around the deadlock fixed in r207671. - Wait for worker process to exit at class unload. The worker process was not guaranteed to exit before the linker unloaded the module. - Use 0 as the worker process exit status instead of ENXIO and style the NOTREACHED comment. Reviewed by: lulf X-MFC after: r207671	2010-05-10 19:12:23 +00:00
jh	a27d539c9a	In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case. EBUSY was probably used as a workaround for the deadlock fixed in r207671. Approved by: pjd X-MFC after: r207671	2010-05-10 19:08:53 +00:00
lulf	b6c4219b41	- Remove obsolete flags. MFC after: 1 week	2010-05-08 16:19:17 +00:00
jh	70139e5fa6	Fix deadlock between GEOM class unloading and withering. Withering can't proceed while g_unload_class() blocks the event thread. Fix this by not running g_unload_class() as a GEOM event and dropping the topology lock when withering needs to proceed. PR: kern/139847 Silence on: freebsd-geom	2010-05-05 18:53:24 +00:00
marcel	d69a8fd3dc	Re-calculate a geometry when reprobing as well. PR: kern/145452 Reported by: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-25 01:56:39 +00:00
marcel	365394e3ab	Fix undo for schemes that have internal partitions. Internal partitions do not constitute user-visible or active partitions and as such should not prevent undoing pending operations. While here, initialize the last usable sector for the placeholder geom based on the null scheme, created to allow undoing the destruction of a scheme. This gives consistent output with "gpart show". Based on a patch from: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-25 00:54:11 +00:00
marcel	be854af5d1	Implement the resize verb and add support for resizing partitions for all schemes but EBR. Quality work by Andrey! Submitted by: "Andrey V. Elsukov" <bu7cher@yandex.ru>	2010-04-23 03:11:39 +00:00
jh	b6e7ef0d61	Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't assert that the topology lock is held when g_valid_obj() is called from debugger. MFC after: 1 week	2010-04-19 20:07:35 +00:00
pjd	1e56ad0204	Use lower priority for GELI worker threads. This improves system responsiveness under heavy GELI load. MFC after: 3 days	2010-04-15 16:34:06 +00:00
avg	d77d0a6e95	g_io_check: respond to zero pp->mediasize with ENXIO Previsouly this condition was reported with EIO by bio_offset > mediasize check. Perhaps that check should be extended to bio_offset+bio_length > mediasize. MFC after: 1 week	2010-04-15 08:39:56 +00:00
luigi	05b4eb119e	fix copyright format, as requested by Joel Dahl	2010-04-13 09:56:17 +00:00
luigi	014f1e6b2e	make code compile with KTR	2010-04-13 09:53:08 +00:00
luigi	fa43d14d2c	Bring in geom_sched, support for scheduling disk I/O requests in a device independent manner. Also include an example anticipatory scheduler, gsched_rr, which gives very nice performance improvements in presence of competing random access patterns. This is joint work with Fabio Checconi, developed last year and presented at BSDCan 2009. You can find details in the README file or at http://info.iet.unipi.it/~luigi/geom_sched/	2010-04-12 16:37:45 +00:00
avg	5b3e4a4ae8	g_vfs_open: allow only one mount per device vnode In other words, deny multiple read-only mounts of the same device. Shared read-only mounts should theoretically be possible, but, unfortunately, can not be implemented correctly using current buffer cache code/interface and results in an eventual system crash. Also, using nullfs seems to be a more efficient way to achieve the same goal. This gets us back to where we were before GEOM and where other BSDs are. Submitted by: pjd (idea for checking for shared mounting) Discussed with: phk, pjd Silence from: fs@, geom@ MFC after: 2 weeks	2010-04-03 08:53:53 +00:00
avg	ad244906c7	bo_bsize: revert r205860 and take an alternative approch in getblk In r205860 I missed the fact that there is code that strongly assumes that devvp bo_bsize is equal to underlying provider's sectorsize. In those places it is hard to obtain the sectorsize in an alternative way if devvp bo_bsize is set to something else. So, I am reverting bo_bsize assigment in g_vfs_open. Instead, in getblk I use DEV_BSIZE block size for b_offset calculation if vp is a disk vp as reported by vn_isdisk. This should coinside with vp being a devvp. Reported by: Mykola Dzham <i@levsha.me> Tested by: Mykola Dzham <i@levsha.me> Pointyhat to: avg MFC after: 2 weeks X-ToDo: convert bread(devvp) in all fs to use bo_bsize-d blocks	2010-04-02 15:12:31 +00:00
avg	b45e2c09f5	g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE Because of how breadn -> bufstrategy -> g_vfs_strategy are currently implemented, bread on devvp always expects DEV_BSIZE block size. Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media properties or filesystem implementation details. Reviewed by: mckusick MFC after: 2 weeks	2010-03-29 20:34:25 +00:00
mjacob	93d13fc5c9	Change how multipath labels are created and managed. This makes it easier to support various storage boxes which really aren't active-active. We only write the label on the first provider. For all other providers we just "add" the disk. This also allows for an "add" verb. A usage implication is that you should specificy the currently active storage path as the first provider. Note that this does not add RDAC-like functionality, but better allows for autovolumefailover configurations (additional checkins elsewhere will support this). Sponsored by: Panasas MFC after: 1 month	2010-03-29 18:04:06 +00:00
mav	b64dc964ac	Do not fetch precise time of request start when stats collection disabled. Reviewed by: pjd, phk	2010-03-24 18:04:25 +00:00
mjacob	48555863a3	Add 'rotate' and 'getactive' verbs to provide some control and information about what the currently active path is. Sponsored by: Panasas MFC after: 1 month	2010-03-21 15:02:47 +00:00
jh	8dba506a84	Escape characters unsafe for XML output in GEOM class, instance and provider names. - Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced with '?'. Those characters are disallowed in XML. - '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are replaced with XML numeric character reference. If the kern.geom.confxml sysctl provides invalid XML, libgeom geom_xml2tree() fails and utilities using it do not work. Unsafe characters are common in msdosfs and cd9660 labels. PR: kern/104389 Submitted by: Doug Steinwand (original version) Reviewed by: pjd Discussed on: freebsd-geom MFC after: 3 weeks	2010-03-20 16:16:13 +00:00
pjd	cf2b4e1396	Simplify loops.	2010-03-18 13:11:43 +00:00
lulf	db22efb244	- Set missing flag when initiating a plex rebuild with the rebuildparity command. - Check if plex is already syncing or rebuilding before initiating a parity rebuild or check.	2010-03-08 21:16:28 +00:00
pjd	1c1e2e8b71	Please welcome HAST - Highly Avalable Storage. HAST allows to transparently store data on two physically separated machines connected over the TCP/IP network. HAST works in Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total. HAST operates on block level - it provides disk-like devices in /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There in no difference between using HAST-provided device and raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD. For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST. Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV	2010-02-18 23:16:19 +00:00
pjd	fda388d7b1	- Style fixes. - Prefer strlcpy() over strncpy().	2010-02-18 22:29:35 +00:00
pjd	94d64f4ec3	Correct comment.	2010-02-18 22:28:12 +00:00
pjd	bcd34167f7	Log attach just like we log detach.	2010-02-18 22:27:38 +00:00
gonzo	58b846696e	- Give geom_redboot taste of flash/spi. Now there is another provider of redboot partitions. This patch was missed during merge from projects/mips.	2010-02-03 01:12:19 +00:00
delphij	33ff87b33e	Prevent NULL deference by checking return value of gctl_get_asciiparam. MFC after: 2 weeks	2010-02-02 22:25:22 +00:00
marcel	68c0faaca6	Export the UUID of the partition in the XML. The partition UUID is used by EFI's device path to identify a partition. In order for FreeBSD to add EFI boot options, proper device paths need to be constructed.	2010-01-30 23:13:19 +00:00
ivoras	bbd4c1e2b3	Go through with write_metadata() non-error-handling and make it return "void". This is mostly to avoid dead variable assignment warning by LLVM. No functional change. Pointed out by: trasz Approved by: gnn (mentor)	2010-01-25 20:51:40 +00:00
trasz	7133fbf3b2	Remove unneeded variables. Found with: clang	2010-01-25 17:00:21 +00:00
trasz	f2a5da9c05	Remove pointless assignment. Found with: clang	2010-01-25 16:58:58 +00:00
trasz	ca36390aa1	Remove some pointless variable assignments. Found with: clang	2010-01-25 16:55:30 +00:00
trasz	298fb23cab	Remove unused variable. Found with: clang	2010-01-25 16:10:22 +00:00
delphij	16c4a5ec20	Expose stripe offset and stripe size through libgeom and geom(8) userland utilities. Reviewed by: pjd, mav (earlier version)	2010-01-17 06:20:30 +00:00
trasz	ba210e8afe	Add gmountver, disk mount verification GEOM class. Note that due to e.g. write throttling ('wdrain'), it can stall all the disk I/O instead of just the device it's configured for. Using it for removable media is therefore not a good idea. Reviewed by: pjd (earlier version)	2010-01-16 09:52:49 +00:00
mav	aa7d598791	Change the way in which zero stripesize is handled. Instead of reporting zero stripeoffset in such case (as if device has no stripes), report offset from the beginning of the media (as if device has single infinite stripe). This gives partitioning tools information, required to guess better partition alignment, in case if hardware doesn't report it's stripe size. For example, it should give disklabel info about odd offset made by fdisk.	2010-01-06 13:14:37 +00:00
mav	a615da72de	Move wakeup() out of mutex to reduce contention.	2010-01-05 10:52:21 +00:00
mav	1073c59bbd	Move wakeup() out of mutex to reduce contention.	2010-01-05 10:30:56 +00:00
mav	6c3ad0385c	Slightly optimize XOR calculation.	2010-01-05 02:06:05 +00:00
marcel	5390d15d43	Properly return the UUID represented by the alias. PR: 142174 Submitted by: Przemyslaw Laczynski <torindel@gmail.com> Pointy hat to: rpaulo	2010-01-02 01:02:59 +00:00
mav	b5e1bf6b39	Call wakeup() only for the first request on the queue.	2009-12-30 17:23:27 +00:00
antoine	bfd388c026	(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument. Fix some wrong usages. Note: this does not affect generated binaries as this argument is not used. PR: 137213 Submitted by: Eygene Ryabinkin (initial version) MFC after: 1 month	2009-12-28 22:56:30 +00:00
mav	f60568ce2e	Add BIO_DELETE support to ada(4): - For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by ACS-2 specification working draft. - For CompactFlash use CFA ERASE command, same as ad(4) does. With this patch, `newfs -E /dev/ada1` was able to restore write speed of my heavily weared OCZ Vertex SSD (firmware 1.4) up to the initial level for the most part of it's capacity. Previous 1.3 firmware, even reportiong TRIM capabilty bit set, was not working, reporting ABORT error for every DSM command. I have no idea whether it is normal, but for some reason it takes 200ms to handle any TRIM command on this drive, that was making delete extremely slow. But TRIM command is able to accept long list of LBAs and the length of that list seems doesn't affect it's execution time. Implemented request clusting algorithm allowed me to rise delete rate up to reasonable numbers, when many parallel DELETE requests running.	2009-12-28 20:08:01 +00:00
mav	ab355c0d66	Make geom_concat to passthrough stripe parameters of the first component, hoping that rest will fit.	2009-12-24 14:32:21 +00:00
mav	cf5b790219	As soon as geom_raid3 reports it's own stripe as sector size, report largest underlying provider's stripe, multiplied by number of data disks in array, due to transformation done, as array stripe.	2009-12-24 13:38:02 +00:00
mav	e6038335e8	As soon as mirror has no own stripes, report largest stripe of unrerlying components, hoping others fit, if they are not equal.	2009-12-24 12:17:22 +00:00
mav	f90d832436	Add two disk ioctls, giving user-level tools information about disk/array stripe (optimal access block) size and offset.	2009-12-24 11:05:23 +00:00
mav	5908bf1b9f	Make geom_stripe report it's stripe size to upper layers.	2009-12-24 10:43:44 +00:00
mav	3edd147826	Make graid3 fallback to malloc() when component request size is bigger then maximal prepared UMA zone size. This fixes crash with MAXPHYS > 128K.	2009-12-21 23:31:03 +00:00
rpaulo	75979e374b	Add Microsoft and NetBSD partition types handling.	2009-12-14 20:26:27 +00:00
rpaulo	6d6ee0c2dc	Simplify partition type parsing by using a data-oriented model. While there add more Apple and Linux partition types.	2009-12-14 20:04:06 +00:00
mav	f018f4f599	Change 'load' balancing mode algorithm: - Instead of measuring last request execution time for each drive and choosing one with smallest time, use averaged number of requests, running on each drive. This information is more accurate and timely. It allows to distribute load between drives in more even and predictable way. - For each drive track offset of the last submitted request. If new request offset matches previous one or close for some drive, prefer that drive. It allows to significantly speedup simultaneous sequential reads. PR: kern/113885 Reviewed by: sobomax	2009-12-03 21:47:51 +00:00
trasz	59762a5f5b	Provide a set of sysctls and tunables to disable device node creation for specific "kinds" of disk labels - for example, GPT UUIDs. Reason for this is that sometimes, other GEOM classes attach to these device nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX instead of /dev/ada0p2, which is annoying. Reviewed by: pjd (earlier version) MFC after: 1 month	2009-11-28 11:57:43 +00:00
rpaulo	e51ec1f083	Add a missing check for Apple HFS partitions. MFC after: 1 week	2009-11-12 19:30:49 +00:00
rnoland	ab54ccdf3e	We need to allocate space for the header in the create path also. This fixes a null pointer dereference with "gpart create -s GPT" after the previous commit. Reported by: Yuri Pankov Pointyhat to: me MFC after: 1 week	2009-11-12 16:28:39 +00:00
rnoland	ebaba37e4e	Fix handling of GPT headers when size is > 92 bytes. It is valid for an on-disk GPT header to report a header size which is greater than 92 bytes. Previously, we would read in the sector and copy only the 92 bytes that we know how to deal with before calculating the checksum for comparison. This meant that when we did the checksum, we overshot the buffer and took in random memory, so the checksum would fail. We now determine the size of the header and allocate enough space to preserve the entire on-disk contents. This allows us to be correctly calculate the checksum and be able to modify and write the header back to the disk, while preserving data that we might not understand. Reported by: Kris Weston Approved by: marcel@ MFC after: 2 weeks	2009-11-07 17:29:03 +00:00
rnoland	8dda941da3	Set the active flag in the PMBR when we install bootcode on a GPT partitioned disk. Some BIOS require this to be set before they will boot the device. Approved by: marcel MFC after: 2 weeks	2009-10-14 19:24:01 +00:00
pjd	f5413acb70	If provider is open for writing when we taste it, skip it for classes that depend on on-disk metadata. This was we won't attach to providers that are used by other classes. For example we don't want to configure partitions on da0 if it is part of gmirror, what we really want is partitions on mirror/foo. During regular work it works like this: if provider is open for writing a class receives the spoiled event from GEOM and detaches, once provider is closed the taste event is send again and class can rediscover its metadata if it is still there. This doesn't work that way when new class arrives, because GEOM gives all existing providers for it to taste, also those open for writing. Classes have to decided on their own if they want to deal with such providers (eg. geom_dev) or not (classes modified by this commit). Reported by: des, Oliver Lehmann <lehmann@ans-netz.de> Tested by: des, Oliver Lehmann <lehmann@ans-netz.de> Discussed with: phk, marcel Reviewed by: marcel MFC after: 3 days	2009-10-09 09:42:22 +00:00
lulf	b3a1193d86	- Improve error message consistency and wording.	2009-10-05 08:44:31 +00:00
marcel	063c1246e2	The first 96 bytes may not be zeroes. It can contain trivial boot code that merely emits an error and waits for a key press before rebooting. The error being that extended partitions are not bootable. The origin is presumed to be Windows 2000; Windows XP does not do this... For now, ignore the first 96 bytes when checking that the EBR is (for the most part) all zeroes. Tested by: Mario Lobo <mlobo@digiart.art.br> MFC after: 1 week	2009-09-28 23:52:47 +00:00
marcel	e9f89a2ebc	Don't create more partitions than can fit in the table by checking that the index is within bounds.	2009-09-24 06:00:49 +00:00
trasz	a0489928cd	Remove unused variable.	2009-09-08 17:20:17 +00:00
mav	676cac231c	Do not check proper request alignment here in geom_dev in production. It will be checked any way later by g_io_check() in g_io_schedule_down(). It is only needed here to not trigger panic from additional check, when INVARIANTS enabled. So cover it with #ifdef INVARIANTS. It saves two 64bit divisions per request.	2009-09-08 05:46:38 +00:00
mav	9980b82769	MFp4: Remove msleep() timeout from g_io_schedule_up/down(). It works fine without it, saving few percents of CPU on high request rates without need to rearm callout twice per request.	2009-09-06 19:33:13 +00:00
pjd	e816f77286	Add support for changing providers priority. Submitted by: Mel Flynn	2009-09-06 06:52:06 +00:00
mav	ff8e63fcdf	Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS instead. It is NULL change for GENERIC kernel, but allows 'fast' mode to work on systems with increased MAXPHYS.	2009-09-04 19:20:46 +00:00
pjd	39353b796a	Simplify g_disk_ident_adjust() function and allow any printable character in serial number. Discussed with: trasz Obtained from: Wheel Sp. z o.o. (http://www.wheel.pl)	2009-09-04 09:39:06 +00:00
pjd	728f7b16d7	There's no need for checking result of M_WAITOK allocation.	2009-08-27 08:40:51 +00:00
pjd	d461caca3f	Fix an obvious topology lock leak. MFC after: 3 days	2009-08-27 08:28:34 +00:00
marcel	61d6cdbea1	The start of the EFI GPT partition in the PMBR can always be represented by CHS addressing. Don't define these fields as 0xff, but rather define them correctly. This prevents boot problems on PCs where GPT is being used. PR: 115406 Submitted by: Kent Hauser <kent@khauser.net> Approved by: re (kib)	2009-08-17 16:16:46 +00:00
lulf	08a0ba34aa	- Fix the issue with read access count modification on RAID-5 plexes properly. If the access counts were not increased and decreased in equal numbers by gvinum consumers, the read access count would be inconsistent with the write access count. Instead, modify the read access count with the write access count directly to prevent any inconsistencies. Approved by: re (kib)	2009-07-18 11:12:48 +00:00
marcel	8fa709769a	Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c is invalid because the ioctl happens without prior open. The ioctl got introduced to provide backward compatibility for extended partitions, but it ended up not being used because it didn't work as expected. Since there are no consumers of the ioctl and the implementation is broken, the best fix is to remove the code entirely. Spotted by: phk Approved by: re (kensmith)	2009-07-08 05:56:14 +00:00
trasz	bde8460ebe	Fix a panic which (reportedly) can happen when unmounting a filesystem with I/O requests in flight on kernels compiled with "options INVARIANTS". Also, make it obvious it's not right to call g_valid_obj() (and macros using it, e.g. G_VALID_CONSUMER()) without topology lock held. Approved by: re (kib) Reported by: pho	2009-07-01 20:16:29 +00:00
trasz	669f7d9712	Make gjournal work with kernel compiled with "options DIAGNOSTIC". Previously, it would panic immediately. Reviewed by: pjd Approved by: re (kib)	2009-06-30 14:34:06 +00:00
lulf	380cb58248	- Apply the same naming rules of LVM names as done in the LVM code itself. PR: kern/135874	2009-06-24 22:09:30 +00:00
jhay	4067f2d3af	Do not stop the loop when an empty or deleted directory entry is found. Rather just skip over it.	2009-06-24 06:42:13 +00:00
ivoras	23d60df09c	Fix tabs, slightly improve comments. Approved by: gnn (mentor) (original) Noticed by: stas	2009-06-18 11:12:11 +00:00
ivoras	79583448b4	Add support for labels derived from GPT metadata. Approved by: gnn (mentor) Reviewed by: pjd PR: 128398 Submitted by: Marius Nuennerich < marius at nuenneri.ch >	2009-06-13 00:27:03 +00:00
luigi	39676e4ab4	As discussed in the devsummit, introduce two fields in the struct bio to store classification information, and a hook for classifier functions that can be called by g_io_request(). This code is from Fabio Checconi as part of his GSOC work.	2009-06-11 09:55:26 +00:00
pjd	6796cdb325	Simplify.	2009-06-05 23:35:43 +00:00
dougb	e2703a8f9f	Crank the debug level necessary to display the "Label foo is removed" and "Label for provider ..." messages up from 0 to 1.	2009-05-30 22:31:52 +00:00
jamie	572db1408a	Place hostnames and similar information fully under the prison system. The system hostname is now stored in prison0, and the global variable "hostname" has been removed, as has the hostname_mtx mutex. Jails may have their own host information, or they may inherit it from the parent/system. The proper way to read the hostname is via getcredhostname(), which will copy either the hostname associated with the passed cred, or the system hostname if you pass NULL. The system hostname can still be accessed directly (and without locking) at prison0.pr_host, but that should be avoided where possible. The "similar information" referred to is domainname, hostid, and hostuuid, which have also become prison parameters and had their associated global variables removed. Approved by: bz (mentor)	2009-05-29 21:27:12 +00:00
lulf	8765020747	- Unbreak 64 bit platforms by casting off_t to intmax.	2009-05-26 14:15:06 +00:00
lulf	66e14dfc33	- Fix wrong print on BIO_DONE. - Use db_printf instead of printf. While here, apply this to other ddb commands as well. Pointed out by: pjd	2009-05-26 10:03:44 +00:00
lulf	6ffe643641	- Add 'show bio' DDB command. MFC after: 3 weeks	2009-05-26 07:29:17 +00:00
trasz	953ec5bcb5	Check return value of gctl_get_asciiparam(). Found with: Coverity Prevent(tm) CID: 1118	2009-05-12 16:59:50 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavilly changed by this commit so thirdy parts modules needs to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
lulf	0ece818a7b	- Split up the BIO queue into a queue for new and one for completed requests. This is necessary for two reasons: 1) In order to avoid collisions with the use of a BIOs flags set by a consumer or a provider 2) Because GV_BIO_DONE was used to mark a BIO as done, not enough flags was available, so the consumer flags of a BIO had to be misused in order to support enough flags. The new queue makes it possible to recycle the GV_BIO_DONE flag into GV_BIO_GROW. As a consequence, gvinum will now work with any other GEOM class under it or on top of it. - Use bio_pflags for storing internal flags on downgoing BIOs, as the requests appear to come from a consumer of a gvinum volume. Use bio_cflags only for cloned BIOs. - Move gv_post_bio to be used internally for maintenance requests. - Remove some cases where flags where set without need. PR: kern/133604	2009-05-06 19:34:32 +00:00
lulf	4fbf65a82b	- Fix a case where a RAID5 volume would think that it is supposed to grow a new subdisk after a parity rebuild.	2009-05-06 19:18:19 +00:00
lulf	bb7cc8f64c	- Check if any plexes are doing internal maintenance before removing them.	2009-05-06 19:06:28 +00:00
lulf	c2e31af67e	- Add forgotten KASSERT.	2009-05-06 18:37:32 +00:00
lulf	9192c52290	- Fix a bug where the bio_data field of the wrong BIO is freed if an error occurs when doing a RAID5 request.	2009-05-06 18:27:28 +00:00
lulf	5ef86e69f2	- GV_BIO_RETRY is not used, and it is actually impossible with more than 8 values for bio_cflags/bio_pflags.	2009-05-06 18:24:56 +00:00
lulf	2857e7c796	- Split the queue mutex into one for the event queue and one for the BIO queue, as they do not really relate and to prepare for an additional queue to be covered by the BIO queue mutex. - Implement wrappers for fetching the next element from the event queue as well as for putting a new element into the BIO queue.	2009-05-06 18:21:48 +00:00
lulf	5a4e3bbf09	- Make the gvinum softc invisible to userland, as it is not needed.	2009-05-04 17:30:20 +00:00
lulf	c18c28353d	- Remove assertion of topology lock remaining from 7.x gvinum. It is not needed, as the renaming only changes internal gvinum names and will not alter the geom topology. - The topology lock was not held when calling g_wither_geom after renaming.	2009-04-18 16:36:27 +00:00
marcel	07d1bef868	Precision '*' expects an int and strlen() returns a size_t. Compensate.	2009-04-16 05:52:47 +00:00
marcel	cf8f14f029	Add a compat option to the EBR scheme that controls the naming of the partitions (GEOM_PART_EBR_COMPAT). When compatibility is enabled, changes to the partitioning are disallowed. Remove the device name aliasing added previously to provide backward compatibility, but which in practice doesn't give us anything. Enable compatibility on amd64 and i386.	2009-04-15 22:38:22 +00:00
lulf	62687638df	- Move out allocation part of different gvinum objects into its own routine and make use of it in the gvinum userland code.	2009-04-10 08:50:14 +00:00
thompsa	39714cb212	Revert r190676,190677 The geom and CAM changes for root_hold are the wrong solution for USB design quirks. Requested by: scottl	2009-04-10 04:08:34 +00:00
marcel	b7c761733f	Don't use hexadecimal in the EBR partition names, because 'a'..'f' are more commonly known as BSD partition names. Discussed with: ivoras@	2009-04-08 16:18:16 +00:00
thompsa	2d53d4304d	Add interleaving root hold tokens from the CAM probe to disk_create and geom provider tasting. This is needed for disk attachments that happen after threads are running in the boot process. Tested by: rnoland	2009-04-03 19:49:33 +00:00
thompsa	fe5458f665	Add a how argument to root_mount_hold() so it can be passed NOWAIT and be called in situations where sleeping isnt allowed.	2009-04-03 19:46:12 +00:00
marcel	16a8b4ee94	The 9 bytes immediately prior to the partition table can contain signatures or disk serial numbers. Don't assume those to be zero in all cases. This fixes a false negative. Tested by: avatar@mmlab.cse.yzu.edu.tw	2009-04-03 05:54:49 +00:00
marcel	80c869f5c8	Sharpen the saw: o PC98 uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger. The 32-bit block numbers are implicit (16-bit cylinder * 8-bit head * 8-bit sector).	2009-03-30 01:03:58 +00:00
marcel	658fba6dac	Sharpen the saw: o MBR uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger.	2009-03-30 00:53:46 +00:00
marcel	5f1a50b47c	Sharpen the saw: o EBR uses 32-bit block numbers. Limit the scheme to 2^32-1 blocks when the media is larger. o Calculate the number of entries based on the rounded media size, rather than the raw media size.	2009-03-30 00:48:42 +00:00
marcel	bfa8e2dd6f	Sharpen the saw: o Don't create a GPT scheme underneath another scheme when the probe doesn't allow it.	2009-03-30 00:33:43 +00:00
lulf	4b994f250d	- Add files that should have been added in r190507.	2009-03-28 21:06:59 +00:00

1805 Commits