Commit Graph

1490 Commits

Author SHA1 Message Date
Pawel Jakub Dawidek
946e2f3595 - Remove gc_argname field. It was introduced for gpart(8), but if I
understand everything correctly, we don't really need it.
- Provide default numeric value as strings. This allows to simplify
  a lot of code.
- Bump version number.
2010-09-13 13:48:18 +00:00
Pawel Jakub Dawidek
a478ea7490 - Allow to specify value as const pointers.
- Make optional string values always an empty string.
2010-09-13 08:56:07 +00:00
Justin T. Gibbs
f03f7a0ca3 Correct bioq_disksort so that bioq_insert_tail() offers barrier semantics.
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.

The barrier semantics of bioq_insert_tail() were broken in two ways:

 o In bioq_disksort(), an added bio could be inserted at the head of
   the queue, even when a barrier was present, if the sort key for
   the new entry was less than that of the last queued barrier bio.

 o The last_offset used to generate the sort key for newly queued bios
   did not stay at the position of the barrier until either the
   barrier was de-queued, or a new barrier (which updates last_offset)
   was queued.  When a barrier is in effect, we know that the disk
   will pass through the barrier position just before the
   "blocked bios" are released, so using the barrier's offset for
   last_offset is the optimal choice.

sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
	o Update last_offset in bioq_insert_tail().

	o Only update last_offset in bioq_remove() if the removed bio is
	  at the head of the queue (typically due to a call via
	  bioq_takefirst()) and no barrier is active.

	o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
	  set prev to the barrier and cur to its next element.  Now that
	  last_offset is kept at the barrier position, this change isn't
	  strictly necessary, but since we have to take a decision branch
	  anyway, it does avoid one, no-op, loop iteration in the while
	  loop that immediately follows.

	o In bioq_disksort(), bypass the normal sort for bios with the
	  BIO_ORDERED attribute and instead insert them into the queue
	  with bioq_insert_tail().  bioq_insert_tail() not only gives
	  the desired command order during insertion, but also provides
	  barrier semantics so that commands disksorted in the future
	  cannot pass the just enqueued transaction.

sys/sys/bio.h:
	Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.

sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
	Use an ordered command for SCSI/ATA-NCQ commands issued in
	response to bios with the BIO_ORDERED flag set.

sys/cam/scsi/scsi_da.c
	Use an ordered tag when issuing a synchronize cache command.

	Wrap some lines to 80 columns.

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
	Mark bios with the BIO_FLUSH command as BIO_ORDERED.

Sponsored by:	Spectra Logic Corporation
MFC after:	1 month
2010-09-02 19:40:28 +00:00
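The barrier behaviour described in the commit above can be sketched with a toy array-backed queue. This is only an illustrative model under simplifying assumptions: the `toyq` names are hypothetical, bio lengths are ignored, and the real code in subr_disk.c works on a TAILQ of struct bio. It shows the three points from the message: a sorted insert never lands ahead of a barrier, the sort key is measured from `last_offset`, and `last_offset` stays at the barrier position while a barrier is queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOYQ_MAX 16

/* Toy model of a bio queue; none of these names are the kernel's. */
struct toyq {
	uint64_t off[TOYQ_MAX];	/* queued offsets, head at index 0 */
	size_t	 len;		/* number of queued entries */
	size_t	 barrier;	/* entries [0, barrier) may not be passed */
	uint64_t last_offset;	/* reference point for the sort key */
};

/* One-way elevator key: distance ahead of last_offset, wrapping around. */
static uint64_t
toyq_key(const struct toyq *q, uint64_t off)
{
	return (off - q->last_offset);
}

/* Append at the tail and make the new entry a barrier. */
static void
toyq_insert_tail(struct toyq *q, uint64_t off)
{
	assert(q->len < TOYQ_MAX);
	q->off[q->len++] = off;
	q->barrier = q->len;
	q->last_offset = off;
}

/* Sorted insert that never places an entry ahead of the barrier. */
static void
toyq_disksort(struct toyq *q, uint64_t off, int ordered)
{
	size_t pos, i;

	if (ordered) {
		/* Ordered requests bypass the sort and become barriers. */
		toyq_insert_tail(q, off);
		return;
	}
	assert(q->len < TOYQ_MAX);
	pos = q->barrier;
	while (pos < q->len && toyq_key(q, q->off[pos]) <= toyq_key(q, off))
		pos++;
	for (i = q->len; i > pos; i--)
		q->off[i] = q->off[i - 1];
	q->off[pos] = off;
	q->len++;
}

/* Remove the head; last_offset moves only when no barrier is active. */
static uint64_t
toyq_takefirst(struct toyq *q)
{
	uint64_t off;
	size_t i;

	assert(q->len > 0);
	off = q->off[0];
	if (q->barrier == 0)
		q->last_offset = off;
	else
		q->barrier--;
	for (i = 1; i < q->len; i++)
		q->off[i - 1] = q->off[i];
	q->len--;
	return (off);
}
```

In this model, queueing offsets 50 and 10, then a barrier at 30, then 5 and 40 leaves the queue as 10, 50, 30, 40, 5: the late arrivals are sorted relative to the barrier offset and stay behind it.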
Pawel Jakub Dawidek
efb46508ce Correct offset conversion to little endian. It was implemented in version 2,
but because of a bug it was a no-op, so we were still using offsets in native
byte order for the host. Do it properly this time, bump version to 4 and set
the G_ELI_FLAG_NATIVE_BYTE_ORDER flag when version is under 4.

MFC after:	2 weeks
2010-08-28 08:30:20 +00:00
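As a rough illustration of the byte-order fix above, the sketch below stores a 64-bit offset in little-endian order regardless of the host and treats versions under 4 as still using native byte order. The helper names and the `FIXED_OFFSET_VERSION` macro are hypothetical; only the version cutoff comes from the commit message, and the actual GELI code is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Version cutoff taken from the commit message; the macro name is made up. */
#define FIXED_OFFSET_VERSION 4

/* Metadata older than the cutoff keeps using host byte order. */
static int
uses_native_byte_order(unsigned int version)
{
	return (version < FIXED_OFFSET_VERSION);
}

/* Store a 64-bit offset least significant byte first, on any host. */
static void
offset_to_le(uint64_t offset, unsigned char out[8])
{
	size_t i;

	assert(out != NULL);
	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(offset >> (8 * i));
}
```

Shifting and masking instead of copying the integer's memory keeps the result identical on big-endian and little-endian hosts, which is the property the commit is after.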
Alexander Motin
3d7cfb15f5 Remove bintime_cmp() function, unused since r200086.
MFC after:	1 week
2010-08-18 15:38:10 +00:00
Andrey V. Elsukov
d02dc4cd41 Check that gsp is not NULL before access. It can be NULL
for some cases.

Approved by:	kib (mentor)
MFC after:	1 week
2010-08-03 11:21:17 +00:00
Andrey V. Elsukov
a45f4c6e2c Check that table is not NULL before access, it can be NULL
for some cases.

Approved by:	mav (mentor)
MFC after:	2 weeks
2010-08-03 09:10:48 +00:00
Andrey V. Elsukov
a80f05bb73 Forward ioctl requests to original geom.
PR:		148540
Silence from:	luigi
Reviewed by:	pjd
Approved by:	mav (mentor)
MFC after:	2 weeks
2010-08-02 10:30:49 +00:00
Andrey V. Elsukov
b6d4028166 Release access for consumers that are opened, but will be destroyed
indirectly by orphan method.

PR:		148688
Silence from:	marcel
Approved by:	mav (mentor)
MFC after: 	2 weeks
2010-08-02 10:26:15 +00:00
Alexander Motin
8edcf69406 Export PCI IDs of ATA/SATA controllers through CAM and ata(4) layers to
GEOM. This information is needed for properly reading and writing
soft-RAID on-disk metadata.
2010-07-25 15:43:52 +00:00
Andrey V. Elsukov
733a9e2783 Prevent access after free of a table entry when the
user deletes a partition that is not yet created (the changes are not
yet committed to disk).

PR:		148687
Approved by:	mav (mentor)
MFC after:	7 days
2010-07-23 06:30:01 +00:00
Ruslan Ermilov
cf1457e4fd Fixed cache size decoding read from a label.
PR:		kern/144732
Submitted by:	Eugene Grosbein
MFC after:	3 days
2010-07-14 08:22:00 +00:00
Rui Paulo
c6b2b6fce6 Add NTFS partition type to GEOM_MBR.
2010-06-26 13:20:40 +00:00
Pawel Jakub Dawidek
2aa15ffdab 'unit' can be negative, so use signed type for it.
Found by:	Coverity Prevent
CID:		3731
MFC after:	3 days
2010-06-14 21:58:55 +00:00
Pawel Jakub Dawidek
15725379d0 BIO_DELETE contains range we want to delete and doesn't provide any useful
data, so there is no need to copy it to userland.

MFC after:	3 days
2010-06-14 21:56:24 +00:00
Andriy Gapon
1bdfff2252 fix a few cases where a string is passed via format argument instead of
via %s

Most of the cases looked harmless, but this is done for the sake of
correctness.  In one case it even allowed to drop an intermediate buffer.

Found by:	clang
MFC after:	2 weeks
2010-06-11 19:27:21 +00:00
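The class of bug fixed above can be shown with a small userland sketch. The function name `copy_message` is hypothetical and the affected kernel call sites are not reproduced; the point is only that a string of unknown content is passed as an argument to `%s` rather than as the format itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copy msg into buf without letting its contents act as a format string. */
static int
copy_message(char *buf, size_t size, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	/* Risky form: snprintf(buf, size, msg) would interpret '%' in msg. */
	return (snprintf(buf, size, "%s", msg));
}
```

With this form a message such as "100%s done" is copied literally instead of making the formatter look for a missing argument.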
Edward Tomasz Napierala
7ce513a52a Untangle g_print_bio(), silencing Coverity.
Found with:	Coverity Prevent
CID:		3566, 3567
2010-06-10 17:49:36 +00:00
Matt Jacob
59ccfe8176 Try and narrow the gap in which you act on an event that has been canceled.
Obtained from:	Jaakko Heinonen
MFC after:	1 month
2010-06-08 22:40:02 +00:00
Edward Tomasz Napierala
c01eb2f36b Make sure not to pass NULL to g_orphan_provider().
Found with:	Coverity Prevent
CID:		3411
2010-06-05 08:00:52 +00:00
Marius Strobl
36066952e5 Don't leak memory on destruction.
Reviewed by:	marcel
MFC after:	3 days
2010-06-02 17:17:11 +00:00
Andriy Gapon
56b3acd001 g_label: fix possible NULL pointer dereference
in case glabel debug level is >= 1 and gp->provider list is empty
for some reason

Found by:	clang static analyzer
MFC after:	4 days
2010-05-31 09:10:39 +00:00
Marius Strobl
785c3f7ea4 Fix some whitespace nits.
2010-05-24 17:33:02 +00:00
Nathan Whitehorn
0532c3a5a5 Teach gpart about bootcode on APM.
2010-05-16 22:21:33 +00:00
Matt Jacob
87e7f7be89 Yet another potential dereference of a dead provider.
Sponsored by:   Panasas
MFC after:	1 week
2010-05-14 21:27:39 +00:00
Matt Jacob
1371a457d9 Make sure to check that the active provider pointer points to something before
dereferencing the pointer.

Sponsored by:   Panasas
MFC after:	1 week
2010-05-14 16:56:18 +00:00
Jaakko Heinonen
3535526b15 - Don't return EAGAIN from gv_unload(). It was used to work around the
deadlock fixed in r207671.
- Wait for worker process to exit at class unload. The worker process
  was not guaranteed to exit before the linker unloaded the module.
- Use 0 as the worker process exit status instead of ENXIO and style
  the NOTREACHED comment.

Reviewed by:	lulf
X-MFC after:	r207671
2010-05-10 19:12:23 +00:00
Jaakko Heinonen
5a279fc5fc In g_zero_destroy_geom(), return 0 instead of EBUSY in the success case.
EBUSY was probably used as a workaround for the deadlock fixed in r207671.

Approved by:	pjd
X-MFC after:	r207671
2010-05-10 19:08:53 +00:00
Ulf Lilleengen
42a9ad6697 - Remove obsolete flags.
MFC after:	1 week
2010-05-08 16:19:17 +00:00
Jaakko Heinonen
9061251f9a Fix deadlock between GEOM class unloading and withering. Withering can't
proceed while g_unload_class() blocks the event thread. Fix this by not
running g_unload_class() as a GEOM event and dropping the topology lock
when withering needs to proceed.

PR:		kern/139847
Silence on:	freebsd-geom
2010-05-05 18:53:24 +00:00
Marcel Moolenaar
c74f160cb0 Re-calculate a geometry when reprobing as well.
PR:		kern/145452
Reported by:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-25 01:56:39 +00:00
Marcel Moolenaar
6f702278e6 Fix undo for schemes that have internal partitions. Internal partitions
do not constitute user-visible or active partitions and as such should
not prevent undoing pending operations.

While here, initialize the last usable sector for the placeholder geom
based on the null scheme, created to allow undoing the destruction of
a scheme. This gives consistent output with "gpart show".

Based on a patch from:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-25 00:54:11 +00:00
Marcel Moolenaar
3f71c319f4 Implement the resize verb and add support for resizing partitions
for all schemes but EBR. Quality work by Andrey!

Submitted by:	"Andrey V. Elsukov" <bu7cher@yandex.ru>
2010-04-23 03:11:39 +00:00
Jaakko Heinonen
002d1d1c38 Fix ddb(4) "show geom addr" command when INVARIANTS is enabled. Don't
assert that the topology lock is held when g_valid_obj() is called from
debugger.

MFC after:	1 week
2010-04-19 20:07:35 +00:00
Pawel Jakub Dawidek
31c4cef715 Use lower priority for GELI worker threads. This improves system
responsiveness under heavy GELI load.

MFC after:	3 days
2010-04-15 16:34:06 +00:00
Andriy Gapon
2a842317eb g_io_check: respond to zero pp->mediasize with ENXIO
Previously this condition was reported with EIO by bio_offset > mediasize
check.
Perhaps that check should be extended to bio_offset+bio_length > mediasize.

MFC after:	1 week
2010-04-15 08:39:56 +00:00
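The check order described in the commit above might look roughly like the following userland sketch. The function name `check_request` and its parameters are hypothetical stand-ins for the bio and provider fields; the possible extension to bio_offset+bio_length that the message raises is deliberately left out, since the commit only suggests it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Return 0 if the request may proceed, or an errno value otherwise. */
static int
check_request(int64_t offset, int64_t mediasize)
{
	assert(offset >= 0 && mediasize >= 0);
	/* A zero media size is reported as ENXIO before any range check. */
	if (mediasize == 0)
		return (ENXIO);
	/* Requests starting past the end of the media still get EIO. */
	if (offset > mediasize)
		return (EIO);
	return (0);
}
```

Testing for the zero size first is what changes the error code: without it, any nonzero offset on a provider with no media would fall through to the EIO branch.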
Luigi Rizzo
83f8218814 fix copyright format, as requested by Joel Dahl
2010-04-13 09:56:17 +00:00
Luigi Rizzo
c36cf6fbbc make code compile with KTR
2010-04-13 09:53:08 +00:00
Luigi Rizzo
1831a90ac5 Bring in geom_sched, support for scheduling disk I/O requests
in a device independent manner. Also include an example anticipatory
scheduler, gsched_rr, which gives very nice performance improvements
in presence of competing random access patterns.

This is joint work with Fabio Checconi, developed last year
and presented at BSDCan 2009. You can find details in the
README file or at

http://info.iet.unipi.it/~luigi/geom_sched/
2010-04-12 16:37:45 +00:00
Andriy Gapon
8f128ff559 g_vfs_open: allow only one mount per device vnode
In other words, deny multiple read-only mounts of the same device.
Shared read-only mounts should theoretically be possible, but,
unfortunately, can not be implemented correctly using current
buffer cache code/interface and results in an eventual system crash.
Also, using nullfs seems to be a more efficient way to achieve the same
goal.

This gets us back to where we were before GEOM and where other BSDs are.

Submitted by:	pjd (idea for checking for shared mounting)
Discussed with:	phk, pjd
Silence from:	fs@, geom@
MFC after:	2 weeks
2010-04-03 08:53:53 +00:00
Andriy Gapon
1b4bc5f851 bo_bsize: revert r205860 and take an alternative approach in getblk
In r205860 I missed the fact that there is code that strongly assumes
that devvp bo_bsize is equal to underlying provider's sectorsize.
In those places it is hard to obtain the sectorsize in an alternative
way if devvp bo_bsize is set to something else.
So, I am reverting bo_bsize assignment in g_vfs_open.
Instead, in getblk I use DEV_BSIZE block size for b_offset calculation
if vp is a disk vp as reported by vn_isdisk.  This should coincide with
vp being a devvp.

Reported by:	Mykola Dzham <i@levsha.me>
Tested by:	Mykola Dzham <i@levsha.me>
Pointyhat to:	avg
MFC after:	2 weeks
X-ToDo:		convert bread(devvp) in all fs to use bo_bsize-d blocks
2010-04-02 15:12:31 +00:00
Andriy Gapon
0c04f06072 g_vfs_open: correctly set devvp.v_bufobj.bo_bsize to DEV_BSIZE
Because of how breadn -> bufstrategy -> g_vfs_strategy are currently
implemented, bread on devvp always expects DEV_BSIZE block size.
Thus, devvp bo_bsize must always be DEV_BSIZE irrespective of media
properties or filesystem implementation details.

Reviewed by:	mckusick
MFC after:	2 weeks
2010-03-29 20:34:25 +00:00
Matt Jacob
2b4969ff9e Change how multipath labels are created and managed. This makes it easier
to support various storage boxes which really aren't active-active.

We only write the label on the *first* provider. For all other providers
we just "add" the disk. This also allows for an "add" verb.

A usage implication is that you should specify the currently active
storage path as the first provider.

Note that this does not add RDAC-like functionality, but better allows for
autovolumefailover configurations (additional checkins elsewhere will support
this).

Sponsored by:	Panasas
MFC after:	1 month
2010-03-29 18:04:06 +00:00
Alexander Motin
a5be8eb530 Do not fetch precise time of request start when stats collection disabled.
Reviewed by:	pjd, phk
2010-03-24 18:04:25 +00:00
Matt Jacob
b5dce617d8 Add 'rotate' and 'getactive' verbs to provide some control and information
about what the currently active path is.

Sponsored by:	Panasas
MFC after:	1 month
2010-03-21 15:02:47 +00:00
Jaakko Heinonen
a41aa4a789 Escape characters unsafe for XML output in GEOM class, instance and
provider names.

- Characters in range 0x01-0x1f except '\t', '\n', and '\r' are replaced
  with '?'. Those characters are disallowed in XML.
- '&', '<', '>', '\'', '"' and characters in range 0x7f-0xff are
  replaced with XML numeric character reference.

If the kern.geom.confxml sysctl provides invalid XML, libgeom
geom_xml2tree() fails and utilities using it do not work. Unsafe
characters are common in msdosfs and cd9660 labels.

PR:		kern/104389
Submitted by:	Doug Steinwand (original version)
Reviewed by:	pjd
Discussed on:	freebsd-geom
MFC after:	3 weeks
2010-03-20 16:16:13 +00:00
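The two escaping rules listed in the commit above can be sketched as a standalone userland function. This is not the kernel implementation: the name `xml_escape`, the buffer-based interface, and the hexadecimal form of the numeric character reference are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write an XML-safe copy of src into dst.
 * Returns the escaped length, or (size_t)-1 if dst is too small.
 */
static size_t
xml_escape(char *dst, size_t dstsize, const char *src)
{
	size_t used = 0;

	assert(dst != NULL && src != NULL);
	if (dstsize == 0)
		return ((size_t)-1);
	for (; *src != '\0'; src++) {
		unsigned char c = (unsigned char)*src;
		char buf[8];
		int n;

		if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
			/* Control characters disallowed in XML become '?'. */
			n = snprintf(buf, sizeof(buf), "?");
		} else if (c == '&' || c == '<' || c == '>' || c == '\'' ||
		    c == '"' || c >= 0x7f) {
			/* Assumed reference format: hexadecimal "&#xNN;". */
			n = snprintf(buf, sizeof(buf), "&#x%02x;",
			    (unsigned int)c);
		} else {
			n = snprintf(buf, sizeof(buf), "%c", c);
		}
		/* Leave room for the terminating NUL. */
		if (used + (size_t)n >= dstsize)
			return ((size_t)-1);
		memcpy(dst + used, buf, (size_t)n);
		used += (size_t)n;
	}
	dst[used] = '\0';
	return (used);
}
```

For example, a label such as "a&b" would be emitted as "a&#x26;b" under these assumptions, while tabs and newlines pass through unchanged.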
Pawel Jakub Dawidek
b0990a1dae Simplify loops.
2010-03-18 13:11:43 +00:00
Ulf Lilleengen
77d2a01ea8 - Set missing flag when initiating a plex rebuild with the rebuildparity
command.
- Check if plex is already syncing or rebuilding before initiating a parity
  rebuild or check.
2010-03-08 21:16:28 +00:00
Pawel Jakub Dawidek
32115b105a Please welcome HAST - Highly Available Storage.
HAST allows data to be stored transparently on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.

HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There is no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.

For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.

Sponsored by:	FreeBSD Foundation
Sponsored by:	OMCnet Internet Service GmbH
Sponsored by:	TransIP BV
2010-02-18 23:16:19 +00:00
Pawel Jakub Dawidek
12f35a615a - Style fixes.
- Prefer strlcpy() over strncpy().
2010-02-18 22:29:35 +00:00
Pawel Jakub Dawidek
f24bf7522d Correct comment.
2010-02-18 22:28:12 +00:00