1438 Commits

Author SHA1 Message Date
pjd
e6687a7b4d MFC r209262:
r209186:

BIO_DELETE contains range we want to delete and doesn't provide any
useful data, so there is no need to copy it to userland.

r209187:

'unit' can be negative, so use signed type for it.

Found by:	Coverity Prevent
CID:		3731

Approved by:	re (kensmith)
2010-06-23 23:03:25 +00:00
marius
926697037a MFC: r208746
Don't leak memory on destruction.

Reviewed by:	marcel
Approved by:	re (kib)
2010-06-11 21:54:04 +00:00
ae
3d400e78a3 MFC r197608:
The first 96 bytes may not be zeroes. It can contain trivial boot
code that merely emits an error and waits for a key press before
rebooting. The error being that extended partitions are not
bootable. The origin is presumed to be Windows 2000; Windows XP
does not do this...

For now, ignore the first 96 bytes when checking that the EBR is
(for the most part) all zeroes.

Tested by:	Mario Lobo <mlobo at digiart.art.br>
		Dieter <dieterbsd at engineer.com>
PR:		kern/141235
Reviewed by:	marcel
Approved by:	kib (mentor)
Approved by:	re (bz)
2010-06-07 20:31:55 +00:00
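The relaxed EBR check described in r197608 above can be sketched roughly as follows. This is a minimal illustration, not the g_part_ebr code: the helper name `ebr_looks_empty` is hypothetical, and the partition table offset of 446 bytes is the conventional MBR/EBR layout, assumed here.

```c
#include <stddef.h>
#include <stdint.h>

#define EBR_TABLE_OFFSET	446	/* assumed: conventional partition table offset */
#define EBR_SKIP_BYTES		96	/* leading bytes that may hold trivial boot code */

/*
 * Hypothetical helper: return 1 if every byte between the skipped
 * prefix and the partition table is zero, 0 otherwise.
 */
int
ebr_looks_empty(const uint8_t *sector)
{
	size_t i;

	for (i = EBR_SKIP_BYTES; i < EBR_TABLE_OFFSET; i++) {
		if (sector[i] != 0)
			return (0);
	}
	return (1);
}
```

With this shape, a sector whose first 96 bytes hold boot code is still accepted, while any non-zero byte from offset 96 up to the table is rejected.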
ae
f657fb5023 MFC r207181:
Re-calculate a geometry when reprobing as well.

PR:		kern/145452
Reviewed by:	marcel
Approved by:	kib (mentor)
Approved by:	re (bz)
2010-06-07 10:22:22 +00:00
avg
0aea3d2994 MFC r201374: g_part_gpt: Properly return the UUID represented by the alias
PR:		kern/142174
Approved by:	re (kib)
Approved by:	marcel
2010-05-31 20:17:37 +00:00
nwhitehorn
56466102a6 MFC r200534,200535:
Simplify partition type parsing by using a data-oriented model.
While there add more Apple and Linux partition types.

This unbreaks the build after r208341.

Reported by:	many
Pointy hat to:	me
2010-05-23 15:30:32 +00:00
nwhitehorn
507f598a31 MFC r200557,208173:
Teach gpart about bootcode on APM.
2010-05-23 02:40:04 +00:00
jh
d882f11f9e MFC r206859:
Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't
assert that the topology lock is held when g_valid_obj() is called from
debugger.
2010-04-26 16:20:18 +00:00
mjacob
df94da6bff This is an MFC of 205847, 204071 and 196580
------
Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.

We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.

A usage implication is that you should specify the currently active
storage path as the first provider.

Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).

------------------------------------------------------------------------

- Style fixes.
- Prefer strlcpy() over strncpy().

------------------------------------------------------------------------

There's no need for checking result of M_WAITOK allocation.
2010-04-23 16:49:18 +00:00
mjacob
782b9b4388 This is an MFC of 205412.
Add 'rotate' and 'getactive' verbs to provide some control and information
about what the currently active path is.
2010-04-23 16:26:10 +00:00
avg
d5cc2b0161 MFC r206650: g_io_check: respond to zero pp->mediasize with ENXIO
2010-04-22 12:24:59 +00:00
luigi
0241e13792 MFC r206551 (forgotten in previous commit): fix builds with ktr
2010-04-20 21:33:14 +00:00
luigi
be47c154d1 MFC geom_sched code, a geom-based disk scheduling framework.
2010-04-20 15:23:12 +00:00
pjd
578c149cf8 MFC r206665:
Use lower priority for GELI worker threads. This improves system
responsiveness under heavy GELI load.
2010-04-18 21:26:59 +00:00
pjd
61b0f125e6 MFC r204076,r204077,r204083,r205279:
r204076:

Please welcome HAST - Highly Available Storage.

HAST allows data to be stored transparently on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.

HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There is no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.

For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.

Sponsored by:	FreeBSD Foundation
Sponsored by:	OMCnet Internet Service GmbH
Sponsored by:	TransIP BV

r204077:

Remove some lines left over by accident.

r204083:

Add missing KEYWORD line.

Pointed out by:	dougb

r205279 sys:

Simplify loops.
2010-04-18 21:14:49 +00:00
avg
6b8d7e1071 MFC r206130: g_vfs_open: allow only one mount per device vnode
2010-04-17 11:57:41 +00:00
jh
30afb9c340 MFC r205385:
Escape characters unsafe for XML output in GEOM class, instance and
provider names.

- Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced
  with '?'. Those characters are disallowed in XML.
- '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are
  replaced with XML numeric character reference.

If the kern.geom.confxml sysctl provides invalid XML, libgeom
geom_xml2tree() fails and utilities using it do not work. Unsafe
characters are common in msdosfs and cd9660 labels.

PR:		kern/104389
2010-04-10 14:28:58 +00:00
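The escaping rules listed in r205385 above can be illustrated with a small userland sketch. The function name `xml_escape`, the fixed-size output buffer, and the lowercase `&#x..;` form of the numeric character reference are assumptions made for this example; the kernel code may format its output differently.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy 'in' to 'out', escaping characters that are
 * unsafe for XML. Returns the number of bytes written, not counting the
 * terminating NUL. Stops early rather than emit a partial escape.
 */
size_t
xml_escape(const char *in, char *out, size_t outsz)
{
	size_t used = 0;
	char tmp[8];
	int n;

	if (outsz == 0)
		return (0);
	for (; *in != '\0'; in++) {
		unsigned char c = (unsigned char)*in;

		if (c == '&' || c == '<' || c == '>' || c == '\'' ||
		    c == '"' || c >= 0x7f) {
			/* Replace with a numeric character reference. */
			n = snprintf(tmp, sizeof(tmp), "&#x%02x;", c);
		} else if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
			/* Control characters disallowed in XML become '?'. */
			n = snprintf(tmp, sizeof(tmp), "?");
		} else {
			n = snprintf(tmp, sizeof(tmp), "%c", c);
		}
		if (used + (size_t)n >= outsz)
			break;
		memcpy(out + used, tmp, (size_t)n);
		used += (size_t)n;
	}
	out[used] = '\0';
	return (used);
}
```

For instance, a label such as `a&b` would come out as `a&#x26;b`, and an embedded 0x01 byte would come out as `?`.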
trasz
031d8a9aee MFC r199875:
Provide a set of sysctls and tunables to disable device node creation
for specific "kinds" of disk labels - for example, GPT UUIDs.  Reason
for this is that sometimes, other GEOM classes attach to these device
nodes instead of the proper ones - e.g. they attach to /dev/gptid/XXX
instead of /dev/ada0p2, which is annoying.

Reviewed by:	pjd (earlier version)
2010-03-27 18:04:33 +00:00
delphij
7ccc467390 MFC r203408:
Prevent NULL dereference by checking return value of
gctl_get_asciiparam.
2010-02-16 06:34:44 +00:00
mav
ad134769ed MFC r201566, r201567:
Move wakeup() out of mutex to reduce contention.
2010-02-05 11:56:12 +00:00
mav
cc498fbc6e MFC r201545:
Slightly optimize XOR calculation.
2010-02-05 11:53:41 +00:00
mav
c2d31df45f MFC r201264:
Call wakeup() only for the first request on the queue.
2010-02-05 11:52:28 +00:00
antoine
ade3e62942 MFC r201145 to stable/8:
(S)LIST_HEAD_INITIALIZER takes a (S)LIST_HEAD as an argument.
  Fix some wrong usages.
  Note: this does not affect generated binaries as this argument is not used.

  PR:		137213
  Submitted by:	Eygene Ryabinkin (initial version)
2010-01-30 12:11:21 +00:00
mav
a44d57eec1 MFC r201139:
Add BIO_DELETE support to ada(4):
- For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by
ACS-2 specification working draft.
- For CompactFlash use CFA ERASE command, same as ad(4) does.

With this patch, `newfs -E /dev/ada1` was able to restore the write speed of
my heavily worn OCZ Vertex SSD (firmware 1.4) to its initial level
for most of its capacity.

I have no idea whether it is normal, but for some reason it takes 200ms
to handle any TRIM command on this drive, which made deletes extremely
slow. However, a TRIM command can accept a long list of LBAs, and the length of
that list does not seem to affect its execution time. The implemented request
clustering algorithm allowed me to raise the delete rate to reasonable numbers
when many parallel DELETE requests are running.
2010-01-19 12:58:29 +00:00
mav
ab38c9b08c MFC r201645:
Change the way in which zero stripesize is handled. Instead of reporting
zero stripeoffset in such case (as if device has no stripes), report offset
from the beginning of the media (as if device has single infinite stripe).

This gives partitioning tools the information required to guess a better
partition alignment in case the hardware doesn't report its stripe size.
For example, it should give disklabel info about the odd offset made by fdisk.
2010-01-15 23:56:19 +00:00
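One way to picture the zero-stripesize handling from r201645 above is the following sketch. The helper name and parameters are hypothetical, and the arithmetic is an assumption about how a child provider (for example a partition) could derive its stripe offset from its parent; it is not taken from the commit itself.

```c
#include <stdint.h>

/*
 * Hypothetical helper: stripe offset reported by a child provider that
 * starts 'start' bytes into its parent. With a zero parent stripe size
 * the offset from the beginning of the media is reported unchanged, as
 * if the device had a single infinite stripe.
 */
uint64_t
child_stripe_offset(uint64_t parent_stripesize, uint64_t parent_stripeoffset,
    uint64_t start)
{
	uint64_t off = parent_stripeoffset + start;

	if (parent_stripesize > 0)
		off %= parent_stripesize;
	return (off);
}
```

Under these assumptions, a slice starting at sector 63 of a disk that reports no stripe size would advertise an offset of 32256 bytes rather than zero, which is the kind of hint an alignment-aware tool can act on.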
mav
41ffc478e5 MFC r200934:
Add two disk ioctls, giving user-level tools information about disk/array
stripe (optimal access block) size and offset.
2010-01-05 13:51:23 +00:00
mav
536d45b203 MFC r200942:
Make geom_concat pass through the stripe parameters of the first component,
hoping that the rest will fit.
2010-01-05 13:50:14 +00:00
mav
3deed09e22 MFC r200940:
Since geom_raid3 reports its own stripe as the sector size, report the largest
underlying provider's stripe, multiplied by the number of data disks in the
array (due to the transformation done), as the array stripe.
2010-01-05 13:49:18 +00:00
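The array stripe calculation described in r200940 above amounts to a maximum followed by a multiplication. A minimal sketch, with a hypothetical function name and with the number of data disks passed in by the caller rather than derived from the array layout:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest component stripe size multiplied by the
 * number of data disks in the array.
 */
uint64_t
raid3_stripe_size(const uint64_t *component_stripes, size_t ncomponents,
    unsigned ndata)
{
	uint64_t largest = 0;
	size_t i;

	for (i = 0; i < ncomponents; i++) {
		if (component_stripes[i] > largest)
			largest = component_stripes[i];
	}
	return (largest * ndata);
}
```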
mav
0f3f0f89b5 MFC r200935:
Since a mirror has no stripes of its own, report the largest stripe of the
underlying components, hoping the others fit if they are not equal.
2010-01-05 13:47:55 +00:00
mav
867e021455 MFC r200933:
Make geom_stripe report its stripe size to upper layers.
2010-01-05 13:46:39 +00:00
mav
08db162dbd MFC r200821:
Make graid3 fall back to malloc() when the component request size is bigger
than the maximum prepared UMA zone size. This fixes a crash with MAXPHYS > 128K.
2009-12-29 21:23:18 +00:00
mav
f6390dafb6 MFC r200086:
Change 'load' balancing mode algorithm:
- Instead of measuring the last request execution time for each drive and
choosing the one with the smallest time, use the averaged number of requests
running on each drive. This information is more accurate and timely. It allows
load to be distributed between drives in a more even and predictable way.
- For each drive, track the offset of the last submitted request. If a new
request's offset matches the previous one, or is close to it, for some drive,
prefer that drive. This significantly speeds up simultaneous sequential reads.

PR:             kern/113885
2009-12-08 23:23:45 +00:00
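The two ideas in the 'load' balancing change from r200086 above (averaged load per drive, plus a preference for drives whose last request ended at or near the new offset) can be combined in a scoring sketch like the one below. Everything here is illustrative: the structure, the function name, the 128K "near" window, and the penalty weights are assumptions, not values from the commit.

```c
#include <stddef.h>
#include <stdint.h>

#define NEAR_WINDOW	(128u * 1024u)	/* assumed: what counts as "close" */
#define NEAR_PENALTY	1u		/* assumed weight */
#define FAR_PENALTY	2u		/* assumed weight */

struct disk_state {
	uint64_t last_offset;	/* offset of the last submitted request */
	unsigned load;		/* averaged number of running requests */
};

/*
 * Hypothetical helper: pick the drive with the lowest score, where the
 * score is the drive's load plus a penalty that grows with the distance
 * between the new offset and the drive's last offset. Ties go to the
 * lowest index.
 */
size_t
pick_disk(const struct disk_state *disks, size_t ndisks, uint64_t offset)
{
	size_t i, best = 0;
	unsigned score, best_score = 0;
	uint64_t dist;

	for (i = 0; i < ndisks; i++) {
		dist = offset > disks[i].last_offset ?
		    offset - disks[i].last_offset :
		    disks[i].last_offset - offset;
		score = disks[i].load;
		if (dist == 0)
			;			/* exact match: no penalty */
		else if (dist <= NEAR_WINDOW)
			score += NEAR_PENALTY;
		else
			score += FAR_PENALTY;
		if (i == 0 || score < best_score) {
			best = i;
			best_score = score;
		}
	}
	return (best);
}
```

A penalty-based score means a busy drive can still lose to an idle one even when its offset matches, which keeps sequential streams together without starving the other drives.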
mav
247973f9df MFC r196879:
Add support for changing providers priority.
2009-12-08 23:15:48 +00:00
rnoland
7e065181c8 MFC r199017,199228
Fix handling of GPT headers when size is > 92 bytes.

This should address reading GPT headers written by opensolaris.
2009-11-21 14:53:08 +00:00
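The commit message for r199017,199228 above does not spell out the fix, so the following is only a sketch of how a reader might validate a GPT header whose size field exceeds 92 bytes. It assumes the standard GPT header layout (header size at byte offset 12, header CRC32 at offset 16, CRC computed over the full header size with the CRC field zeroed). The function names and the 4096-byte scratch buffer are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GPT_MIN_HDR_SIZE	92
#define GPT_SIZE_OFF		12	/* assumed: header size field offset */
#define GPT_CRC_OFF		16	/* assumed: header CRC32 field offset */

static uint32_t
le32(const uint8_t *p)
{
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). */
uint32_t
crc32_ieee(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (k = 0; k < 8; k++)
			crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
	}
	return (~crc);
}

/*
 * Hypothetical helper: return 1 if the header size is sane and the CRC
 * over the whole declared header (not just 92 bytes) matches.
 */
int
gpt_header_crc_ok(const uint8_t *sector, size_t sectorsize)
{
	uint8_t copy[4096];
	uint32_t hdr_size, stored;

	if (sectorsize > sizeof(copy) || sectorsize < GPT_MIN_HDR_SIZE)
		return (0);
	hdr_size = le32(sector + GPT_SIZE_OFF);
	if (hdr_size < GPT_MIN_HDR_SIZE || hdr_size > sectorsize)
		return (0);
	stored = le32(sector + GPT_CRC_OFF);
	memcpy(copy, sector, hdr_size);
	memset(copy + GPT_CRC_OFF, 0, 4);	/* CRC field counts as zero */
	return (crc32_ieee(copy, hdr_size) == stored);
}
```

The point of the sketch is that the CRC length comes from the header's own size field, so a header written with, say, a 96-byte size still verifies.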
rpaulo
b3260f0a08 MFC 199232:
Add a missing check for Apple HFS partitions.
2009-11-19 15:28:08 +00:00
mav
f3174e0a6d MFC r196964:
Do not check proper request alignment here in geom_dev in production.
It will be checked anyway later by g_io_check() in g_io_schedule_down().
The check is only needed here to avoid triggering a panic from an additional
check when INVARIANTS is enabled, so cover it with #ifdef INVARIANTS. This
saves two 64-bit divisions per request.
2009-11-17 21:45:28 +00:00
mav
bc2242265c MFC r196904:
Remove msleep() timeout from g_io_schedule_up/down(). It works fine
without it, saving a few percent of CPU at high request rates without
the need to rearm the callout twice per request.
2009-11-17 21:43:42 +00:00
mav
e2098ce810 MFC r196837:
Remove artificial MAX_IO_SIZE constant, equal to DFLTPHYS * 2. Use MAXPHYS
instead. It is a null change for the GENERIC kernel, but allows 'fast' mode to
work on systems with increased MAXPHYS.
2009-11-17 21:42:11 +00:00
rnoland
df51b2c6ff MFC r198097
Set the active flag in the PMBR when we install bootcode on a GPT
partitioned disk.  Some BIOSes require this to be set before they will
boot the device.
2009-10-30 15:45:00 +00:00
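Setting the active flag as described in r198097 above comes down to one byte in the first partition entry of the PMBR. The sketch below assumes the conventional MBR layout (table at offset 446, boot indicator in the first byte of an entry, type in the fifth, 0xEE as the protective type); the helper name is hypothetical.

```c
#include <stdint.h>

#define MBR_TABLE_OFFSET	446	/* assumed: first partition entry */
#define MBR_TYPE_OFFSET		4	/* assumed: type byte within an entry */
#define MBR_TYPE_PMBR		0xee
#define MBR_FLAG_ACTIVE		0x80

/*
 * Hypothetical helper: mark the protective entry active. Returns 0 on
 * success, -1 if the first entry is not a protective (0xEE) entry.
 */
int
pmbr_set_active(uint8_t *mbr)
{
	uint8_t *entry = mbr + MBR_TABLE_OFFSET;

	if (entry[MBR_TYPE_OFFSET] != MBR_TYPE_PMBR)
		return (-1);
	entry[0] = MBR_FLAG_ACTIVE;
	return (0);
}
```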
pjd
0aef4906b9 MFC r197898:
If provider is open for writing when we taste it, skip it for classes that
depend on on-disk metadata. This way we won't attach to providers that are used
by other classes. For example we don't want to configure partitions on da0 if
it is part of gmirror, what we really want is partitions on mirror/foo.

During regular work it works like this: if a provider is open for writing, a class
receives the spoiled event from GEOM and detaches; once the provider is closed, the
taste event is sent again and the class can rediscover its metadata if it is still
there.  It doesn't work that way when a new class arrives, because GEOM gives
it all existing providers to taste, including those open for writing. Classes
have to decide on their own if they want to deal with such providers (e.g.
geom_dev) or not (classes modified by this commit).

Reported by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Tested by:	des, Oliver Lehmann <lehmann@ans-netz.de>
Discussed with:	phk, marcel
Reviewed by:	marcel
Approved by:	re (kib)
2009-10-12 21:08:06 +00:00
marcel
f59c986f5d MFC rev 197449:
Don't create more partitions than can fit in the table by checking
that the index is within bounds.

Approved by:	re (kib)
2009-09-25 17:48:30 +00:00
pjd
c105cbb514 MFC r196822, r196823, r196824:
Remove 'ad:' prefix from disk serial number. We don't want serial number
to change when we reconnect the disk in a way that it is accessible through
CAM for example.

Discussed with:	trasz

Simplify g_disk_ident_adjust() function and allow any printable character
in serial number.

Discussed with:	trasz
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)

Make serial numbers of daX disks visible by GEOM.

No objections from:	scottl
Obtained from:	Wheel Sp. z o.o. (http://www.wheel.pl)

Approved by:	re (kib)
2009-09-15 11:23:59 +00:00
pjd
87906d2f78 MFC r196579:
Fix an obvious topology lock leak.

Approved by:	re (kib)
2009-09-07 16:25:09 +00:00
marcel
1358a95135 MFC rev 196333:
The start of the EFI GPT partition in the PMBR can always be represented
by CHS addressing. Don't define these fields as 0xff, but rather define
them correctly. This prevents boot problems on PCs where GPT is being
used.

PR:             115406
Submitted by:   Kent Hauser <kent@khauser.net>
Approved by:    re (kib)
2009-08-17 16:24:50 +00:00
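The claim in rev 196333 above, that the start of the GPT partition in the PMBR is always representable in CHS, can be checked with the usual LBA-to-CHS conversion. The sketch below uses the conventional packed 3-byte CHS encoding; the function name and the geometry parameters are assumptions for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pack an LBA into the 3-byte CHS field of an MBR
 * entry. Returns 0 on success, -1 if the geometry is invalid or the
 * address does not fit (cylinder above 1023).
 */
int
lba_to_chs(uint32_t lba, uint32_t heads, uint32_t spt, uint8_t chs[3])
{
	uint32_t cyl, head, sec;

	if (heads == 0 || heads > 255 || spt == 0 || spt > 63)
		return (-1);
	cyl = lba / (heads * spt);
	head = (lba / spt) % heads;
	sec = lba % spt + 1;		/* CHS sectors are 1-based */
	if (cyl > 1023)
		return (-1);
	chs[0] = (uint8_t)head;
	chs[1] = (uint8_t)(sec | ((cyl >> 2) & 0xc0));
	chs[2] = (uint8_t)(cyl & 0xff);
	return (0);
}
```

The protective partition starts at LBA 1, which maps to cylinder 0, head 0, sector 2 for any geometry with at least two sectors per track, so there is no need to fall back to 0xff filler bytes for the start address.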
lulf
08a0ba34aa - Fix the issue with read access count modification on RAID-5 plexes properly.
If the access counts were not increased and decreased in equal numbers by
  gvinum consumers, the read access count would be inconsistent with the write
  access count. Instead, modify the read access count with the write access
  count directly to prevent any inconsistencies.

Approved by:	re (kib)
2009-07-18 11:12:48 +00:00
marcel
8fa709769a Revert revisions 188839 and 188868. Use of the ioctl in geom_dev.c
is invalid because the ioctl happens without prior open. The ioctl
got introduced to provide backward compatibility for extended
partitions, but it ended up not being used because it didn't work
as expected. Since there are no consumers of the ioctl and the
implementation is broken, the best fix is to remove the code
entirely.

Spotted by:	phk
Approved by:	re (kensmith)
2009-07-08 05:56:14 +00:00
trasz
bde8460ebe Fix a panic which (reportedly) can happen when unmounting a filesystem
with I/O requests in flight on kernels compiled with "options INVARIANTS".
Also, make it obvious it's not right to call g_valid_obj() (and macros
using it, e.g. G_VALID_CONSUMER()) without topology lock held.

Approved by:	re (kib)
Reported by:	pho
2009-07-01 20:16:29 +00:00
trasz
669f7d9712 Make gjournal work with kernel compiled with "options DIAGNOSTIC".
Previously, it would panic immediately.

Reviewed by:	pjd
Approved by:	re (kib)
2009-06-30 14:34:06 +00:00
lulf
380cb58248 - Apply the same naming rules of LVM names as done in the LVM code itself.
PR:		kern/135874
2009-06-24 22:09:30 +00:00
jhay
4067f2d3af Do not stop the loop when an empty or deleted directory entry is found.
Rather just skip over it.
2009-06-24 06:42:13 +00:00