Commit Graph

71 Commits

Author SHA1 Message Date
Alexander Motin
40ea77a036 Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.

The defined now safety requirements are:
 - caller should not hold any locks and should be reenterable;
 - callee should not depend on GEOM dual-threaded concurency semantics;
 - on the way down, if request is unmapped while callee doesn't support it,
   the context should be sleepable;
 - kernel thread stack usage should be below 50%.

To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
 - G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
 - G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
 - G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
 - G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe.  If any of requirements are not met, request is queued to
g_up or g_down thread same as before.

Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).

To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION.  da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.

This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).

Sponsored by:	iXsystems, Inc.
MFC after:	2 months
2013-10-22 08:22:19 +00:00
Edward Tomasz Napierala
19e5b2d50e Make geom_label(4) resize-aware. This fixes a situation when "gpart resize"
would resize a partition, but label providers - e.g. /dev/gptid/XXX - would
stay the same size.

Reviewed by:	mav
MFC after:	1 month
Sponsored by:	FreeBSD Foundation
2013-10-18 09:14:19 +00:00
Alexander Motin
7868ec506b geom_slice.c and its consumers like GEOM_LABEL are not touching the data
unless hotspots are used.  Pass G_PF_ACCEPT_UNMAPPED flag through except
such rare cases (obsolete GEOM_SUNLABEL and GEOM_BSD).
2013-03-26 07:55:24 +00:00
Jaakko Heinonen
02c62349c9 - Don't pass geom and provider names as format strings.
- Add __printflike() attributes.
- Remove an extra argument for the g_new_geomf() call in swapongeom_ev().

Reviewed by:	pjd
2012-11-20 12:32:18 +00:00
Ed Schouten
24d1105dde Remove unneeded G_PF_CANDELETE flag.
This flag is only used by GEOM so it can be propagated to the character
device's SI_CANDELETE. Unfortunately, SI_CANDELETE seems to do nothing.
2012-08-28 19:28:31 +00:00
Alexander Motin
3631c6382f Implement media change notification for DA and CD removable media devices.
It includes three parts:
 1) Modifications to CAM to detect media media changes and report them to
disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes
Asynchronous Notification mechanism to receive events from hardware.
Active polling with TEST UNIT READY commands with 3 seconds period is used
for incapable hardware. After that both CD and DA drivers work the same way,
detecting two conditions: "NOT READY: Medium not present" after medium was
detected previously, and "UNIT ATTENTION: Not ready to ready change, medium
may have changed". First one reported to disk(9) as media removal, second
as media insert/change. To reliably receive second event new
AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by
generic error handling code in cam_periph_error().
 2) Modifications to GEOM core to handle media remove and change events.
Media removal handled by spoiling all consumers attached to the provider.
Media change event also schedules provider retaste after spoiling to probe
new media. New flag G_CF_ORPHAN was added to consumers to reflect that
consumer is in process of destruction. It allows retaste to create new
geom instance of the same class, while previous one is still dying.
 3) Modifications to some GEOM classes: DEV -- to report media change
events to devd; VFS -- to handle spoiling same as orphan to prevent
accessing replaced media. PART class already handles spoiling alike to
orphan.

Reviewed by:	silence on geom@ and scsi@
Tested by:	avg
Sponsored by:	iXsystems, Inc. / PC-BSD
MFC after:	2 months
2012-07-29 11:51:48 +00:00
Edward Tomasz Napierala
ad624005b3 Fix orphan() methods of several GEOM classes to not assume that there
is an error set on the provider.  With GEOM resizing, class can become
orphaned when it doesn't implement resize() method and the provider size
decreases.

Reviewed by:	mav
Sponsored by:	FreeBSD Foundation
2012-07-07 17:09:44 +00:00
Alexander Motin
0c8fd0c8ac Change the way in which zero stripesize is handled. Instead of reporting
zero stripeoffset in such case (as if device has no stripes), report offset
from the beginning of the media (as if device has single infinite stripe).

This gives partitioning tools information, required to guess better
partition alignment, in case if hardware doesn't report it's stripe size.
For example, it should give disklabel info about odd offset made by fdisk.
2010-01-06 13:14:37 +00:00
Dag-Erling Smørgrav
2616144e43 Add sbuf_new_auto as a shortcut for the very common case of creating a
completely dynamic sbuf.

Obtained from:	Varnish
MFC after:	2 weeks
2008-08-09 11:14:05 +00:00
Pawel Jakub Dawidek
a04c28bdd9 Handle GEOM::ident attribute by attaching 'sX' string at the end of ident
received from the underlying provider, where X is pp->index value.

OK'ed by:	phk
2007-05-05 17:52:22 +00:00
Pawel Jakub Dawidek
42461fba65 Implement BIO_FLUSH handling by simply passing it down to the components.
Sponsored by:	home.pl
2006-10-31 21:23:51 +00:00
Marcel Moolenaar
d99c155975 Add g_wither_provider() to abstract the details of destroying a
particular provider. Use this function where g_orphan_provider()
is being called so that the flags are updated correctly and
g_orphan_provider() is called only when allowed.
2006-04-10 03:55:13 +00:00
Pawel Jakub Dawidek
bdf2e45a5c Allow to use g_slice_orphan() from outside.
MFC after:	3 days
2006-02-18 11:21:17 +00:00
Craig Rodrigues
318c3a55f0 Fix so that when a slice or a partition is removed through g_slice_config(),
it is destroyed in GEOM, in addition to being removed from /dev.
Before this patch, if you applied a new MBR which deleted a slice,
the deleted slice would not be in /dev, but it would still appear
in kern.geom.conftxt and kern.geom.confxml, which would confused
the diskPartitionEditor in sysinstall.

Submitted by:   pjd
Tested by:      pjd, rodrigc
MFC after:	1 week
2005-09-14 21:38:35 +00:00
Poul-Henning Kamp
2859a695dc Stop wasting a bootverbose line on all geom slices. 2004-11-03 09:08:10 +00:00
Poul-Henning Kamp
a9654c8c58 Do not override the class provided dumpconf function. 2004-08-18 21:42:08 +00:00
Lukas Ertl
5289667c16 Check for a NULL pointer before dereferencing it. 2004-07-25 09:41:31 +00:00
Poul-Henning Kamp
8529ce7a87 We only need to check for overlaps if we increasing access counts. 2004-07-04 13:44:48 +00:00
Pawel Jakub Dawidek
916a3aa05b Free only if pointer isn't NULL. 2004-07-01 12:42:13 +00:00
Pawel Jakub Dawidek
40f798dad1 Don't force class to give a valid softc to g_slice_new(), it is not always
needed.

Approved by:	phk
2004-06-24 10:50:20 +00:00
Poul-Henning Kamp
d2bae332d6 Remove the absolute count g_access_abs() function since experience has
shown that it is not useful.

Rename the relative count g_access_rel() function to g_access(), only
the name has changed.

Change all g_access_rel() calls in our CVS tree to call g_access() instead.

Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source
code compatibility.
2004-02-12 22:42:11 +00:00
David E. O'Brien
50b1faef38 Use __FBSDID().
Approved by:	phk
2003-06-11 06:49:16 +00:00
Poul-Henning Kamp
a1a9b44569 Add missing va_end() calls.
Noticed by:	tmm
2003-06-07 10:16:53 +00:00
Poul-Henning Kamp
ce67c955ca Add a destroy_geom method to the slice "library".
If a slice class has no destroy_geom method, use this one.

This should allow all slicers to kldload.
2003-05-31 19:25:05 +00:00
Poul-Henning Kamp
15649213a6 Use a more tailored spoil routine for slices, and take advantage of
g_wither_geom() to do most of the work for us.
2003-05-02 06:29:33 +00:00
Poul-Henning Kamp
82b53b8dc8 Rename g_slice_init() to the more appropriate g_slice_alloc() and give
it a g_slice_free() partner function.
2003-05-02 05:33:27 +00:00
Poul-Henning Kamp
8cd1535a24 Rename g_call_me() to g_post_event(), and give it a flag
argument to determine if we can M_WAITOK in malloc.
2003-04-23 20:46:12 +00:00
Poul-Henning Kamp
d3a1a13766 Do not mandate that slicers have a private ->start(), they may not need
one.  KASSERT() that they have one if G_SLICE_HOT_START is used.
2003-04-22 21:19:17 +00:00
Poul-Henning Kamp
7220a9e779 Make more of the "hotspot" stuff generic:
Give the class a way to specify the necessary action for read/delete/write:
ALLOW, DENY, START or CALL.

Update geom_bsd to use this.
2003-04-19 10:14:39 +00:00
Poul-Henning Kamp
183a45f65e Create a dedicated structure for holding hotspot information rather than
using slice structures for it.
2003-04-19 10:00:22 +00:00
Poul-Henning Kamp
3924ad705e Time has run from the "run GEOM in userland" harness, and the new regression
test is built to test GEOM as running in the kernel.

This commit is basically "unifdef -D_KERNEL" to remove the mainly #include
related code to support the userland-harness.
2003-04-13 09:02:06 +00:00
Poul-Henning Kamp
0d3e96e39c Retire the "frontstuff" record keeping, it was no match for the
in-band meta-data of BSD labels and a more complex solution will be needed.
2003-04-12 08:41:26 +00:00
Poul-Henning Kamp
4eba52a2d2 Remove all references to BIO_SETATTR. We will not be using it. 2003-04-03 19:19:36 +00:00
Poul-Henning Kamp
b4b138c27f Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.
2003-03-18 08:45:25 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Poul-Henning Kamp
8028f16b1b Don't divide by zero if there is no stripewidth specified. 2003-02-11 15:23:41 +00:00
Poul-Henning Kamp
8a63edc3a1 Better names for struct disk elements: d_maxsize, d_stripeoffset
and d_stripesisze;

Introduce si_stripesize and si_stripeoffset in struct cdev so we
can make the visible to clustering code.

Add stripesize and stripeoffset to providers.

DTRT with stripesize and stripeoffset in various places in GEOM.
2003-02-11 14:57:34 +00:00
Poul-Henning Kamp
faa7b8b96e Propagate G_PF_CANDELETE to our own providers from the provider we attach to. 2003-02-11 12:36:33 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Poul-Henning Kamp
9eebd265b9 Add a check for negative offset locations and return EINVAL for them. 2002-12-17 21:31:58 +00:00
Poul-Henning Kamp
a1d5f791fa Get rid of g_slice_addslice() and use g_slice_config() instead.
Tested with:	i386 + src/tools/regression/geom
2002-12-16 23:08:48 +00:00
Poul-Henning Kamp
0f9d3dba37 Constification and some s/int/u_int/ changes. 2002-12-16 22:33:27 +00:00
Poul-Henning Kamp
821a4d01ea Don't interpret the hotspots relative to all slices on a slicer, but
relative to the parent device.
2002-12-13 21:31:13 +00:00
Poul-Henning Kamp
188321b737 Add a simplified version of the hot-spot code to enable us to protect
in-band disklabels from in-band vandalism.

Approve by:	re
2002-12-02 19:59:25 +00:00
Poul-Henning Kamp
534de7e11d Remember to update the providers idea of its size when we reconfigure
a slice child.

Approved by:	re
2002-11-20 20:12:52 +00:00
Poul-Henning Kamp
d518e53936 Add the remaning part of the new libdisk interaction.
WARNING:  This is not a published interface, it is a stopgap measure for
WARNING:  libdisk so we can get 5.0-R out of the door.

Sponsored by:	DARPA & NAI Labs
2002-10-28 22:43:54 +00:00
Poul-Henning Kamp
3d5500fc51 Reduce the GEOM verbosity under bootverbose to something more sufferable.
This is not quite the set of information I would want, but the tree where
I have the "correct" version is messed up with conflicts.

Sponsored by:	DARPA & NAI Labs.
2002-10-25 20:09:45 +00:00
Poul-Henning Kamp
3f12caa180 Now that the sectorsize and mediasize are properties of the provider,
don't take the detour over the I/O path to discover them using getattr(),
we can just pick them out directly.

Do note though, that for now they are only valid after the first open
of the underlying disk device due compatibility with the old disk_create()
API.  This will change in the future so they will always be valid.

Sponsored by:   DARPA & NAI Labs.
2002-10-20 20:28:24 +00:00
Poul-Henning Kamp
48444d6262 Make the sectorsize a property of providers so we can include it in the XML
output.

Sponsored by:	DARPA & NAI Labs
2002-10-20 19:18:07 +00:00
Poul-Henning Kamp
14ac6812b9 Use %jd instead of %lld now that we have it. 2002-10-20 18:48:12 +00:00