Commit Graph

1162 Commits

Author SHA1 Message Date
Poul-Henning Kamp
6ef8480a88 Add BO_SYNC() and add a default which uses the secret vnode pointer
and VOP_FSYNC() for now.
2005-01-11 10:43:08 +00:00
Pawel Jakub Dawidek
437566858a Increase default synchronization speed.
MFC after: 3 days
2005-01-09 14:43:39 +00:00
Warner Losh
fa521b0366 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 18:27:30 +00:00
Pawel Jakub Dawidek
ea973705b3 - Fix 'rebuild' command - it can no longer relay on retaste event
(we ignore it).
- Remove code used for handling spoil events, as spoiling is not possible
  anymore, because we keep consumers open for writing all the time.

MFC after:	4 days
2005-01-04 12:15:21 +00:00
Pawel Jakub Dawidek
da84416791 Spoiling is now not possible, because we keep consumers open for writing
all the time. Remove unused code then.

MFC after:	4 days
2005-01-04 12:11:49 +00:00
Pawel Jakub Dawidek
fd6d312082 Fix 'rebuild' command (we ignore retaste event now, so don't relay on it). 2005-01-03 19:42:37 +00:00
Pawel Jakub Dawidek
cdca9c06d9 Remove unused #include. 2005-01-03 12:53:10 +00:00
John Baldwin
63710c4d35 Stop explicitly touching td_base_pri outside of the scheduler and simply
set a thread's priority via sched_prio() when that is the desired action.
The schedulers will start managing td_base_pri internally shortly.
2004-12-30 20:29:58 +00:00
Pawel Jakub Dawidek
7f456a7d61 Remove debug code. 2004-12-28 21:52:45 +00:00
Pawel Jakub Dawidek
a245a5483c - Add genid field to the metadata which will allow to improve reliability a bit.
After this change, when component is disconnected because of an I/O error,
  it will not be connected and synchronized automatically, it will be logged
  as broken and skipped. Autosynchronization can occur, when component is
  disconnected (on orphan event) and connected again - there were no I/O
  error, so there is no need to not connected the component, but when there were
  writes while it wasn't connected, it will be synchronized.
  This fix cases, when component is disconnected because of I/O error and can be
  connected again and again.
- Bump version number.
- Implement backward compatibility mechanism. After this change when metadata in
  old version is detected, it is automatically upgraded to the new (current)
  version.
2004-12-25 19:17:47 +00:00
Pawel Jakub Dawidek
538ff5ee7a Update disk->d_genid field when increasing sc->sc_genid. 2004-12-23 21:15:15 +00:00
Pawel Jakub Dawidek
9a9f504132 - Add genid field to the metadata which will allow to improve reliability a bit.
After this change, when component is disconnected because of an I/O error,
  it will not be connected and synchronized automatically, it will be logged
  as broken and skipped. Autosynchronization can occur, when component is
  disconnected (on orphan event) and connected again - there were no I/O
  error, so there is no need to not connected the component, but when there were
  writes while it wasn't connected, it will be synchronized.
  This fix cases, when component is disconnected because of I/O error and can be
  connected again and again.
- Bump version number.
- Add version change history.
- Implement backward compatibility mechanism. After this change when metadata in
  old version is detected, it is automatically upgraded to the new (current)
  version.
2004-12-22 23:09:32 +00:00
Pawel Jakub Dawidek
4485f00081 Now, when force device destruction is done on shutdown, hide warning,
that device cannot be destroyed immediately, under debug=1.

Suggested by:	simon
2004-12-21 19:50:18 +00:00
Pawel Jakub Dawidek
d97d5ee931 Improve reliability and clean up code a bit.
For more details check src/sys/geom/mirror/g_mirror.c rev.1.47,1.48,1.49,1.50.
2004-12-21 19:30:59 +00:00
Pawel Jakub Dawidek
f663832b75 This should not be permitted, but some GEOM classes held the topology lock
while doing g_(read|write)_data() (e.g. BSD). This can cause a deadlock
in MIRROR class. Not sure if this is safe to drop the topology lock in BSD
class, so change the code in MIRROR class to avoid this deadlock.
2004-12-21 18:42:51 +00:00
Pawel Jakub Dawidek
54bab03f04 Implement g_topology_try_lock().
No objection from:	phk
2004-12-21 18:32:46 +00:00
Pawel Jakub Dawidek
dc7d54e7b3 Remove unused variables. 2004-12-19 23:55:49 +00:00
Pawel Jakub Dawidek
a2a7b44de0 - Argument 'flags' in g_mirror_destroy_consumer() function is unsed -
mark it as such.
- Before closing consumer check if it is open. It can be closed here
  when g_mirror_connect_disk() fails on g_access().
2004-12-19 23:33:59 +00:00
Pawel Jakub Dawidek
9eec299fab Some major cleanups.
Keeping consumers open when device is closed is very hard. We need to
open consumers sometimes to update metadata, etc.
Many hacks was introduced in the past to made it possible. You cannot
be sure that you can open consumer for writing always, even if you think
it should be allowed. If one of the mirror components is for example da0
and you try to open it, you can get EPERM when da0s1 is opened for reading
(because BSD class opens consumers (da0) with an extra 'e' bit set).
Waiting for the events queue to be empty may do the trick, but it makes
code much uglier (as you cannot always call g_waitidle()), it doesn't
solve all edge cases and it can introduce deadlocks if there are events
in the queue that wait for gmirror.

I removed those hacks. Now all consumers are open r1w1e1 always, even if
device is closed. Maybe it is less clean from GEOM perspective, but simpify
code a lot and make it much more reliable.
The only issue was retaste event which is sent when we close consumers
opened for writing. I ignore retaste event by not detaching consumer
immediately (so retaste event is not send to my class) and sending event
right after it to detach and destroy consumer.
2004-12-19 23:12:00 +00:00
Pawel Jakub Dawidek
c37e2f9bbf Don't quit on first failure, just skip failures. 2004-12-19 22:58:25 +00:00
Christian Brueffer
44d086bde6 Fix typo in a comment.
MFC after:	3 days
2004-12-15 12:18:41 +00:00
Pawel Jakub Dawidek
89dd8e5326 bioq_insert_head() function is already in subr_disk.c. 2004-12-13 13:02:06 +00:00
Poul-Henning Kamp
2221dbebce Pass the file->flags down to geom ioctl handlers.
Reject certain ioctls if write permission is not indicated.

Bump geom API version.

Reported by:	Ruben de Groot <mail25@bzerk.org>
2004-12-12 10:09:05 +00:00
Pawel Jakub Dawidek
53ed4e0d54 - Turn off 'fast' mode by default and increase maximum memory to consume
when this mode is used.
- Manual page update.
2004-12-09 12:26:47 +00:00
Marcel Moolenaar
9055ed836a o Don't limit GPT as a rank 2 provider. Allow it to be connected
anywhere in the DAG. This includes configurations that are not
   allowed by the EFI specification.
o  Reject a GPT partition table if it's not preceeded by a PMBR.
   There's no need to preserve the MBR partitioning anymore as GPT
   is mature and with the first bullet extending the applicability
   of GPT, it's better to be a bit more strict.
2004-12-05 06:02:21 +00:00
Pawel Jakub Dawidek
afd05d741f When initializing device, set d_softc and d_no fields for all components,
because we know it then and we need it when inserting a component which
wasn't destroyed while device was running.

Reported by:	Michael Handler <handler@grendel.net>
MFC after:	1 week
2004-12-04 21:20:59 +00:00
Warner Losh
3bc18cb767 Add observations of the Linux98 and Grub/98 boot loaders. These
observations lead me to believe that the convetion for pc98 boot
loaders is to have a jump unstruction, followed by a string, followed
by code.  The jump usually doesn't have a nop after it and usually the
string is NUL terminated, but Grub/98 breaks both of these rules.

# I looked for, but failed to find the Minux boot blocks for PC-9801 port.
2004-11-30 09:40:11 +00:00
Warner Losh
696ac86f2c Reject tasting of this provider if the sector size isn't a multiple of
512.  If I had an audio cdrom in my cd player when I booted my system,
I'd get a panic from geom because you can't read 8192 bytes from an
audio cdrom.

Remove XXX comment about IPL1 and replace it with some information
from my soon to be published web page on the pc98 disk layout.  The
IPL1 test was the result of an observation of a disk with FreeBSD's
boot0 program.  It was testing part of an area what appears to be
reserved for a boot loader name, which comes after a jump over this
area.  I don't yet know if it is required to be any specific jump
instruction, or if the destination has to be location 11. [1]

[1] FreeBSD Press No. 13, page 115, poorly translated by myself.  The
picture there shows offset 8 as the destination of the jump, but
FreeBSD's boot0 program has three padding NULs after the IPL1 name and
uses a 16-bit 'jmp' instruction.
2004-11-30 08:00:14 +00:00
Poul-Henning Kamp
d4dbba5f83 Fix a long standing bug in geom_mbr which is only now exposed by the
correct open/close behaviour of filesystems:

When an ioctl to modify the MBR arrives, we cannot take for granted that
we have the consumer open.

The symptom is that one cannot run 'boot0cfg -s2 /dev/ad0' in single-user
mode because / is the only open partition in only open r1w0e1.

If it is not, we attempt to increase the write count by one and
decrease it again afterwards.

Presumably most if not all other slices suffer from the same problem.
2004-11-28 20:57:25 +00:00
Lukas Ertl
997337fd20 Implement 'setstate' to allow setting the state of drives and subdisks
for debugging and emergency purposes.
2004-11-26 12:31:36 +00:00
Lukas Ertl
fb5885af37 Implement checkparity/rebuildparity. 2004-11-26 12:01:00 +00:00
Pawel Jakub Dawidek
a17dd95f14 - Add missing Giant drop before acquiring the topology lock.
- Move DROP_GIANT()/PICKUP_GIANT() to g_gate_ioctl().
2004-11-23 11:18:26 +00:00
Max Khon
9595dba40d Use M_ZERO to not panic in mtx_init when INVARIANTS enabled.
Submitted by:	simokawa
MFC after:	1 week
2004-11-20 13:10:04 +00:00
Lukas Ertl
fb4e65d035 Move RAID5 offset calculation into a separate function to avoid
code duplication.
2004-11-15 13:04:55 +00:00
Lukas Ertl
94175098f1 Share gv_roughlength() between kernel and userland, as we will need it
there later.
2004-11-15 12:30:59 +00:00
Pawel Jakub Dawidek
085f43afae Before trying to update metadata (so open consumer for writing), be sure
that the events queue is empty. In other case we're able to hit the race
where for example da0s1 is tasted by some other class, which means that
da0 is open with exclusive bit set, which means that we can't open da0
for writing if it is our component.

Reported by:	Attila Nagy <bra@fsn.hu> (and somebody else sometime ago,
		                          but I cannot find who it was)
2004-11-09 23:27:21 +00:00
Pawel Jakub Dawidek
b8005b9b24 Introduce g_waitidlelock() function which is simlar to g_waitidle(),
but should be called with the topology lock held and returns with the
topology lock held and empty event queue.

Approved by:	phk (sometime ago)
2004-11-09 23:20:50 +00:00
Pawel Jakub Dawidek
b36b4bfb55 Don't rely on DIRTY flag to be sure that consumer if open, because
DIRTY flag can be removed in idle process. Use consumer's acw field
instead to avoid opening consumer twice.
2004-11-09 23:15:40 +00:00
Pawel Jakub Dawidek
9c6a3f03c6 For BIO_READ check if provider is open for reading and for BIO_WRITE,
check if provider is open for writing.
This fixes panic when device is open only for writing and we send write
request.
2004-11-09 23:04:45 +00:00
Pawel Jakub Dawidek
fdc3c6ce23 Drop Giant lock before grabbing the topology lock. 2004-11-09 00:35:08 +00:00
Pawel Jakub Dawidek
463674f7e0 If device is marked as beeing destroyed, deny all access requests. 2004-11-08 20:23:53 +00:00
Pawel Jakub Dawidek
9bb09163fc Don't forget to make sure that there are no not-finished requests before
marking components as clean.

Pointed out by:	scottl
2004-11-05 17:18:39 +00:00
Pawel Jakub Dawidek
4d006a98d1 - Mark all raid3 components as clean after kern.geom.raid3.idletime seconds.
- Make kern.geom.raid3.timeout variable tunable.
2004-11-05 13:12:58 +00:00
Pawel Jakub Dawidek
9da3072cae Mark raid3 devices as clean on shutdown (after all file systems are
unmounted).

Suggested by:	scottl
2004-11-05 13:01:25 +00:00
Pawel Jakub Dawidek
79e614937e - Use ->index consumer's field to track number of in-flight requests.
- Remove unused #include.
2004-11-05 12:42:16 +00:00
Pawel Jakub Dawidek
6349471be3 Use shutdown hooks to mark mirrors as clean after all file systems are
unmounted.

Suggested by:	scottl
2004-11-05 12:35:21 +00:00
Pawel Jakub Dawidek
127cf38ee4 Remove unused #include. 2004-11-05 12:31:32 +00:00
Pawel Jakub Dawidek
14089dae44 - Add a sysctl kern.geom.mirror.idletime, so one can specify after how many
seconds of idling, DRITY flags are removed.
- If mirror is in idle state or is not open for writing, sleep without
  timeout when waiting for I/O requests.
- Don't use atomic operations, for now sysctls are protected by Giant.
- Update debugs.
2004-11-05 10:55:04 +00:00
Pawel Jakub Dawidek
2fdf5be172 MFp4:
- Fix for good (I hope) force-stopping mirrors and some filure cases
  (e.g. the last good component dies when synchronization is in progress).
  Don't use ->nstart/->nend consumer's fields, as this could be racy,
  because those fields are used in g_down/g_up, use ->index consumer's
  field instead for tracking number of not finished requests.

  Reported by:	marcel

- After 5 seconds of idle time (this should be configurable) mark all
  dirty providers as clean, so when mirror is not used in 5 seconds
  and there will be power failure, no synchronization on boot is needed.

  Idea from:	sorry, I can't find who suggested this

- When there are no ACTIVE components and no NEW components destroy whole
  mirror, not only provider.

- Fix one debug to show information about I/O request, before we change
  its command.
2004-11-05 09:05:15 +00:00
Poul-Henning Kamp
f9eeb89522 Finish cut&paste adjustments.
Spotted by:	tegge
2004-11-04 07:17:08 +00:00
Poul-Henning Kamp
e93a5ce092 Stop dumping the MBR entries under bootverbose 2004-11-03 09:08:33 +00:00
Poul-Henning Kamp
2859a695dc Stop wasting a bootverbose line on all geom slices. 2004-11-03 09:08:10 +00:00
Poul-Henning Kamp
55f499a94f Don't set si_bsize_phys, nobody cares. 2004-10-29 11:11:44 +00:00
Poul-Henning Kamp
4d13ab3da2 Add GEOM class "VFS" for filesystems and other buffer cache users
of GEOM devices.

There is nothing magic about this, it just gives a bufobj interface
to GEOM.
2004-10-29 09:56:56 +00:00
Poul-Henning Kamp
725419af56 Add g_wither_geom_close() function. 2004-10-29 09:19:03 +00:00
Poul-Henning Kamp
6afb3b1c37 Give dev_strategy() an explict cdev argument in preparation for removing
buf->b-dev.

Put a bio between the buf passed to dev_strategy() and the device driver
strategy routine in order to not clobber fields in the buf.

Assert copyright on vfs_bio.c and update copyright message to canonical
text.  There is no legal difference between John Dysons two-clause
abbreviated BSD license and the canonical text.
2004-10-29 07:16:37 +00:00
Lukas Ertl
6c39d46363 Give each plex a separate queue where held back bios are put on.
This lowers the CPU usage of the worker thread and prevents a
possible live lock on non-SMP machines.

MFC candidate.
2004-10-26 21:01:42 +00:00
Poul-Henning Kamp
8c24ef5f78 Use unit number allocation functions for GEOM minor numbers. 2004-10-25 12:28:28 +00:00
Poul-Henning Kamp
f8fe7a735c Retire si_stripesize and si_stripeoffset they will not be needed in cdev
in the future.
2004-10-25 07:40:54 +00:00
Poul-Henning Kamp
85986ce002 Don't call g_waitidle(), it happens automagically now. 2004-10-23 20:52:15 +00:00
Poul-Henning Kamp
9197ce2ee5 Add a new per-thread private flag: TDP_GEOM.
This flag gets set whenever the thread posts an event on the GEOM
event queue, and if the flag is set when the thread is prepared
to return to userland from the kernel, g_waitidle() will be called
to make sure that the posted events have completed.

This can replace an insufficient number of g_waitidle() calls in
various other places, and has the advantage of being failsafe:  Any
system call which does a VOP_OPEN()/VOP_CLOSE will now correctly
wait for any geom events it posted as part of spoils or tastes.

Assert that topology and Giant is not held in g_waitidle().
2004-10-23 20:49:17 +00:00
Poul-Henning Kamp
a11021f362 Move the prototype for g_waitidle() to a more visible place. 2004-10-23 20:22:02 +00:00
Andrew R. Reiter
f96c8ef18a - Turn KASSERT()s into warning printf()'s in the g_class_load() routine.
This removes a panic that will occur if you build with GENERIC and
  attempt to kldload a GEOM module that is already in the kernel.

Reviewed by: phk
2004-10-22 22:16:24 +00:00
Robert Watson
49dbb61dfc Add KTR_GEOM, which allows tracing of basic GEOM I/O events occuring
in the g_up and g_down threads.  Each time a bio is propelled up and
down the stack, an event is generating showing the provider, offset,
and length, as well as thread wakeup and work status information.
2004-10-21 18:35:24 +00:00
Pawel Jakub Dawidek
06697d4f59 Ehh. Introduce a hack: Wait for 3 seconds, so GEOM is able to give us
providers for tasting. Before this hack, race below is possible:
	SI_SUB_RAID (no not-fully-configured geoms, so don't block)
	GEOM tasting (now geoms are created)
	SI_SUB_MOUNT_ROOT (if root file system is placed on a mirror, it is
		possible that this mirror is not fully configured yet)
There is a lot of work to do to avoid such hacks and I need a working
solution before 5.3, sorry.

Reported by:	John Hay <jhay@icomtek.csir.co.za>
2004-10-14 07:55:29 +00:00
Pawel Jakub Dawidek
268111a210 Only allow for unloading when there are no geoms in LABEL GEOM class.
We have to use our own destroy_geom method, because default one, which
is a part of geom_slice is broken.
MT5 candidate.

PR:		kern/72467
Submitted by:	Vladimir Novoseltsev
2004-10-14 07:46:13 +00:00
Brian Feldman
6f299fa373 When loading GEOM modules, we expect the actual load process to be done
by the time that kldload(8) returns.  Satisfy that by making the GEOM
module load event -- only when the kernel is !cold -- wait until the
GEOM module init function has finished instead of returning immediately.

This is the other half of fixing md(8) (actually, "mfs" in fstab(5))
that is similar to r1.128 of src/sys/dev/md/md.c.  This bug would be
why RAM disks would often fail on boot and the first call to mdconfig(8)
would probably fail.

pjd has ideas for not requiring kldload(8) to work synchronously for
control devices that could make this obsolete.

Silence on:	-arch
2004-10-12 04:44:54 +00:00
Stephan Uphoff
f7717523a2 Trace information about a buffer while we still control it.
Reviewed by:    phk
Approved by:    sam (mentor)
2004-10-11 21:22:59 +00:00
Søren Schmidt
39e6971cba Only do the geometry translations on ad* devices, other devices seems to
have their own way of life.
Those other devices translations should be moved here as well.
2004-10-08 21:27:27 +00:00
Pawel Jakub Dawidek
7aefe57c5c Be sure to always return 0 for negative access requests.
Reported by:	Maciej Kucharz <qk@comp.waw.pl>
2004-10-07 20:13:23 +00:00
Søren Schmidt
6c35773729 Move the PC98 specific geometry "gunk" to geom_pc98.c where it belongs.
This also adds support for bigger disks on the controller I have access to,
and maybe others if I understood the adhoc methods used on those.

Those with more PC98 bigdrive controllers it is hereby invited to add/fix
support for those in geom_pc98.c and not using #ifdef PC98 all over the place.
2004-10-07 17:37:09 +00:00
Poul-Henning Kamp
276f72c550 Don't set the BIO_ONQUEUE debugging flag until we actually put the bio
onto a queue.  This made the ENOMEM handling an instant panic.
2004-10-06 20:59:59 +00:00
Pawel Jakub Dawidek
dd12956ac7 Geoms without softc are geoms which are initialized, so wait for them. 2004-10-06 18:47:15 +00:00
Pawel Jakub Dawidek
18d2addc23 Look out for geoms without softc.
Reported by:	tegge
2004-10-06 14:15:47 +00:00
Pawel Jakub Dawidek
59883b3b34 Before root file system is mounted, wait for mirrors in degraded state. 2004-10-05 11:17:08 +00:00
Lukas Ertl
4cb1b18827 Don't allow to create a drive that already exists. 2004-10-02 20:50:21 +00:00
Lukas Ertl
d9d3a74c87 Correctly skip the '/dev/' part when creating new drives and prefix
a drive's provider with '/dev/' when printing the config.

Reported by:  will@
2004-10-02 20:12:20 +00:00
Pawel Jakub Dawidek
c7e17f4bbe Unlock g_gate_list_mtx mutex when we cannot allocate unit number.
MT5 candidate.

PR:		kern/72253
Submitted by:	Ivan Voras <ivoras@fer.hr>
2004-10-02 15:03:26 +00:00
Lukas Ertl
c3aadfb9d6 Make it possible to rebuild degraded RAID5 plexes. Note that it is
currently not possible to do this while the volume is mounted.

MFC in:  1 week
2004-09-30 12:57:35 +00:00
Poul-Henning Kamp
19fa21aa50 Protect the start/end counts on consumers and providers with the up/down
mutexes.

Make it possible to also protect the disk statistics (at a minor cost in
performance) by setting bit 2 of kern.geom.collectstats.
2004-09-28 11:56:37 +00:00
Pawel Jakub Dawidek
8dd5480d29 - Set maximum request size to MAXPHYS (128kB), instead of DFLPHYS (64kB).
- Set minimum request size to sectorsize, instead of 512 bytes.

Approved by:	phk (some time ago)
2004-09-28 08:34:27 +00:00
Pawel Jakub Dawidek
604fce4f60 Just use MAXPHYS as maximum I/O request size, instead of using my own
#define for this purpose.
No functional change.
2004-09-28 07:33:37 +00:00
Pawel Jakub Dawidek
e5e7825cc3 Decrease kern.geom.raid3.timeout to 4, so it is smaller than
vfs.root.mountdelay by default.
2004-09-27 22:12:14 +00:00
Pawel Jakub Dawidek
6c25233782 Deny invalid I/O requests which comes from userland here, because later
we'll get a panic.
MT5 candidate.

Reviewed by:	phk
2004-09-27 22:10:01 +00:00
Pawel Jakub Dawidek
d2fb9c62e2 Avoid race while synchronizing components. It is very hard to bump into,
but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.

Found by:	tegge (for gmirror)
2004-09-27 20:32:35 +00:00
Pawel Jakub Dawidek
829c0864cb Minor, but very important condition fix. The current one can never be true. 2004-09-27 19:32:26 +00:00
Pawel Jakub Dawidek
cf41526bdc Decrease kern.geom.mirror.timeout to 4, so it is smaller than
vfs.root.mountdelay by default.
2004-09-27 13:47:37 +00:00
Pawel Jakub Dawidek
0217ba9893 Forgot to commit addition of ds_resync field. 2004-09-26 20:42:35 +00:00
Pawel Jakub Dawidek
e8adbe4499 Avoid race while synchronizing components. It is very hard to bump into,
but it is possible:
1. Read data from good component for synchronization.
2. Write data to the same area.
3. Write synchronization data, which are now stale.

Found by:	tegge
2004-09-26 20:41:07 +00:00
Pawel Jakub Dawidek
31522023f9 Simplify code a bit. 2004-09-26 20:30:15 +00:00
Poul-Henning Kamp
a7830346e2 Assert topology is held in g_dev_getprovider().
Don't call devsw().  It is not necessary, and we do not need to hold dev_lock
to compare the devsw pointer to our own since we do not dereference it.
2004-09-24 06:43:20 +00:00
Pawel Jakub Dawidek
201dfcf143 This is not needed anymore, it is forced in GEOM now.
Actually, it can even cause some problems, because GEOM requires sectorsize
to be more than 0 on first access, not on provider creation, so we can skip
valid providers by doing this check here.

Reported by:	Divacky Roman <xdivac02@stud.fit.vutbr.cz>
		Sven Willenberger <sven@dmv.com>
2004-09-20 17:26:25 +00:00
Max Khon
9cf3607da2 Use correct malloc type when freeing memory allocated by g_read_data.
PR:		71431
Submitted by:	daichi
2004-09-19 10:27:46 +00:00
Lukas Ertl
b916fcec4d Single concat or striped plexes don't need no special initialization
if their subdisks are all available, so let them be brought up.
2004-09-18 18:03:20 +00:00
Lukas Ertl
67e3ab6ee5 Re-vamp how I/O is handled in volumes and plexes.
Analogous to the drive level, give each volume and plex a worker thread
that picks up and processes incoming and completed BIOs.

This should fix the data corruption issues that have come up a few
weeks ago and improve performance, especially of RAID5 plexes.

The volume level needs a little work, though.
2004-09-18 13:44:43 +00:00
Max Khon
b3f05a2e9e g_nop_create: destroy newly created provider in case of errors. 2004-09-16 15:28:48 +00:00
Lukas Ertl
12653dec9d Give the DRIVE geom a worker thread that picks up incoming bios,
sends them down, and takes care of the finished bios.  This makes it
easier to handle I/O errors at drive level.
2004-09-13 21:01:36 +00:00
Lukas Ertl
fce2deb197 Rename gv_kill_thread() to gv_kill_plex_thread(), since there are more
threads to come.
2004-09-13 17:44:47 +00:00
Lukas Ertl
a0781b98f3 Save the config back to disk when a drive goes down. 2004-09-13 17:33:52 +00:00
Lukas Ertl
ea29a30466 Read a whole sector instead of GV_HDR_LEN, since a sector might be
bigger (i.e. on CD-ROMs).
2004-09-13 17:27:58 +00:00
Pawel Jakub Dawidek
7e8ca741ca Make kern.geom.debugflags sysctl tunable from /boot/loader.conf.
It will help to debug problems when booting.

Approved by:	phk
2004-09-13 14:58:27 +00:00
Poul-Henning Kamp
4090065137 Fix a problem that shows up if less than the full complement of
lock sectors are defined ("number_of_keys" argument to gbde init being
less than 4 in the default compile).
2004-09-11 17:58:53 +00:00
Poul-Henning Kamp
cbca0b53e5 Respect that G_BDE_MAXKEYS is a compile time variable. 2004-09-11 17:57:51 +00:00
Max Khon
51eb0765c6 Do not compile in zlib.c. Add a dependency on module instead. 2004-09-08 17:27:31 +00:00
Pawel Jakub Dawidek
f7b4d339ac Show current status of mirror device directly.
Suggested by:	Krzysztof Ciep³ucha <kris@home.pl>
2004-09-08 16:37:22 +00:00
Poul-Henning Kamp
5ae652c0ed For removable devices without media we set a zero mediasize but a non-zero
sectorsize in order to avoid a lot of checks around various divisions etc.

Enforce the sectorsize being > 0 with a KASSERT on successful open.

Fix scsi_cd.c to return 2k sectors when no media inserted.
2004-09-05 21:15:58 +00:00
Pawel Jakub Dawidek
6d7b8aecd3 Allow to configure debug level from /boot/loader.conf. 2004-08-30 18:50:06 +00:00
Poul-Henning Kamp
dcbd0fe5aa Add more KASSERTS and checks. 2004-08-30 09:33:06 +00:00
Pawel Jakub Dawidek
45d5e85a40 GCC, ehh. 2004-08-29 14:29:30 +00:00
Pawel Jakub Dawidek
c0d68b6ef2 Use sc->sc_mediasize instead of sc->sc_provider->mediasize which contains
exactly the same value, but is shorter.
2004-08-28 02:35:43 +00:00
Pawel Jakub Dawidek
08249e9e6e Warn the user if we are not going to use whole provider space.
Requested by:	Michael Handler <handler@grendel.net>
2004-08-28 02:34:10 +00:00
Pawel Jakub Dawidek
16ebaa0793 Don't allow to insert providers, which are too small.
Reported by:	Michael Handler <handler@grendel.net>
2004-08-28 02:02:48 +00:00
Lukas Ertl
5bad268cdc Move config_new_drive() to the correct place and rename it to
gv_config_new_drive().
2004-08-27 21:32:18 +00:00
Poul-Henning Kamp
a2033c9615 Introduce g_alloc_bio() as a waiting variant of g_new_bio().
Use in places where we can sleep and where we previously failed to check
for a NULL pointer.

MT5 candidate.
2004-08-27 14:43:11 +00:00
Lukas Ertl
4328802ce9 When attaching a consumer from a volume to a plex, check if the
volume already has a plex attached and adjust the access counts
of the new consumer accordingly.
2004-08-26 21:04:41 +00:00
Pawel Jakub Dawidek
29c78ab315 Skip providers with not defined sector size.
Reported by:	kuriyama
2004-08-26 12:42:47 +00:00
Pawel Jakub Dawidek
4cf67afe37 Log verification errors at level 1. 2004-08-25 19:18:07 +00:00
Pawel Jakub Dawidek
f0c8658d4e Dump disk number. 2004-08-25 12:14:44 +00:00
Pawel Jakub Dawidek
c8b906bcbe Allow to set kern.geom.mirror.timeout from /boot/loader.conf. 2004-08-23 20:42:34 +00:00
Lukas Ertl
a3423d4c6f Compare the addresses of two RAID5 work packets directly instead
of the addresses of their related bios when locking one out, since
they could share a bio and this could lead to parity corruption.
2004-08-23 17:50:18 +00:00
Lukas Ertl
c4bdc6fc32 Implement the possibility to remove drives. 2004-08-22 17:07:55 +00:00
Pawel Jakub Dawidek
dba915cfee Implementation of 'verify reading' algorithm, which uses parity data for
verification of regular data when device is in complete state.
On verification error, EIO error is returned for the bio and sysctl
kern.geom.raid3.stat.parity_mismatch is increased.

Suggested by:	phk
2004-08-22 16:21:12 +00:00
Lukas Ertl
45d0fdcda9 Add forgotten format specifier in a KASSERT and shut up the compiler.
Submitted by: Gavin Atkinson <gavin.atkinson@ury.york.ac.uk>
2004-08-22 13:34:24 +00:00
Pawel Jakub Dawidek
d12bd83e9b Add version history. 2004-08-21 21:15:03 +00:00
Pawel Jakub Dawidek
f5a2f7feac Implement new reading algorithm, which will use parity component for reading
as well, even if device is in complete state.
I observe 40% of speed-up with this option for random read operations,
but slowdown for sequential reads.
Basically, without this option reading from a RAID3 device built from 5
components (c0-c4) looks like this:

	Request no.	Used components
	1		c0+c1+c2+c3
	2		c0+c1+c2+c3
	3		c0+c1+c2+c3

With the new feature:

	Request no.	Used components
	1		c0+c1+c2+c3
	2		(c1^c2^c3^c4)+c1+c2+c3
	3		c0+(c0^c2^c3^c4)+c2+c3
	4		c0+c1+(c0^c1^c3^c4)+c3
	5		c0+c1+c2+(c0^c1^c2^c4)
	6		c0+c1+c2+c3
	[...]
2004-08-21 18:11:46 +00:00
Lukas Ertl
83bfcb1092 A volume can be up if it has a degraded RAID5 plex. 2004-08-19 12:03:27 +00:00
Pawel Jakub Dawidek
d86bc96cab We really don't want to receive spoil event for synchroniztion consumers. 2004-08-18 23:33:37 +00:00
Poul-Henning Kamp
a9654c8c58 Do not override the class provided dumpconf function. 2004-08-18 21:42:08 +00:00
Lukas Ertl
9a8bd51965 Pretty print some informational messages. 2004-08-18 20:43:56 +00:00
Lukas Ertl
d30f29867e Fix a stupid bug in the drive taste function: when checking if a
drive is known to the configuration check also if it already has a geom.
Without this check several needless geoms are created and valid
configuration data was overwritten.

This change obsoletes the need for a separate geom to taste an
offered provider and the consumer doesn't need to be opened with the
exclusive bit set.
2004-08-18 20:34:45 +00:00
Pawel Jakub Dawidek
b25aec32ff NOP class doesn't operate on metadata, so the spoil event can be safely
ignored.
2004-08-18 16:58:42 +00:00
Pawel Jakub Dawidek
28b31df727 Dump device status on 'list' command. 2004-08-18 16:46:51 +00:00
Pawel Jakub Dawidek
f1ad62a4d8 Bump synchronization ID if we are sure, that we have ACTIVE components. 2004-08-18 07:28:48 +00:00
David E. O'Brien
fa6a78376f Minor style.9 cleanup. 2004-08-16 10:33:35 +00:00
Pawel Jakub Dawidek
809a9dc601 Decrease debug level to 0. 2004-08-16 08:33:04 +00:00
Pawel Jakub Dawidek
5e6db16cd6 Fix warning. 2004-08-16 08:21:31 +00:00
Pawel Jakub Dawidek
2d1661a5b6 Introduce GEOM RAID3 class, i.e. kernel module, which implements RAID3
transformation and graid3(8) userland utility, which can be used for
configuration. No manual page yet, sorry.

Hardware provided by:	Daniel Seuffert
2004-08-16 06:23:14 +00:00
Pawel Jakub Dawidek
f62d59df32 Avoid code duplication by introducing g_mirror_write_metadata() function,
which is used now by g_mirror_clear_metadata() function and
g_mirror_update_metadata() function.
2004-08-15 13:58:29 +00:00
Lukas Ertl
71fd4f60da Make informational output look less like an accident. 2004-08-14 09:56:17 +00:00
Max Khon
75261008d7 Add geom_uzip -- geom class that implements read-only compressed disks.
Currently supports cloop V2.0 disk compression format.
May support more formats in future.
2004-08-13 09:40:58 +00:00
Pawel Jakub Dawidek
887c9fd564 MFp4: Simplify code a bit:
- Remove kern.geom.mirror.sync_block_size sysctl. It is quite obvious that we
  want to use the biggest size possible.
- Do not use UMA zone for sync data allocations. There could be only one
  synchronization request per synchronized disk at a time, so allocate memory
  for one request on whole synchronization process related to one disk.

Tested by synchronizing one component (out of three) and by synchronizing
two components (out of three) in parallel.
2004-08-11 23:41:53 +00:00
Pawel Jakub Dawidek
445a4b68f2 Actually, HARDCODED flag isn't stored in metadata, so don't bother
dumping it.
2004-08-11 22:16:42 +00:00
Pawel Jakub Dawidek
2def749bb1 - Fix typo.
- Dump HARDCODED flag.
2004-08-11 22:12:44 +00:00
Pawel Jakub Dawidek
a5ef629f10 Increase default kern.geom.stripe.maxmem to 50 elements. 2004-08-11 12:57:17 +00:00
Pawel Jakub Dawidek
1b949c05a3 When sending request once again because of ENOMEM, reset bio_children
and bio_inbed fields to 0. Without this change we can end up with
I/O leakage in some rare situations.
I tested this change by putting failure probability mechanism simlar
to this used in NOP class into g_clone_bio(9) function, so it was
able to return NULL with the given probability.

Discussed with:	phk
2004-08-11 12:04:35 +00:00
Pawel Jakub Dawidek
6d8fb92d78 Try harder to not panic on 'stop -f'.
After the commit, this command should be really safe to use.
2004-08-11 11:10:46 +00:00
Lukas Ertl
92f49a969d If we kill the worklist thread of a RAID5 plex we can destroy
the worklist mutex at the same time, so move the mtx_destroy() call
to gv_kill_thread().
2004-08-10 20:51:48 +00:00
Lukas Ertl
ecffb8e64b Lock the topology before calling gv_parse_config, not afterwards. 2004-08-10 20:15:12 +00:00
Pawel Jakub Dawidek
6b2b3e8745 - Recognize HARDCODED flag when dumping consumer configuration.
- Improve code readabilty a bit.
2004-08-10 19:53:31 +00:00
Pawel Jakub Dawidek
c38d2f4eca Forgot to commit those: introduce hardcoded provider functionality,
which allow to store provider's name in the metadata and avoid
problems when few providers share the same last sector.
2004-08-10 19:52:12 +00:00
Pawel Jakub Dawidek
4ffa3fef69 Fix one of the lastest commit. This bio_caller1 should also be changed to
bio_driver1 (as all the rest).
This introduced a small memory leak, but it wasn't really critical,
because maximum memory for g_stripe_zone is always set, so after few
requests gstripe was working in "economic" mode.
2004-08-10 19:07:55 +00:00
Pawel Jakub Dawidek
6c74f5177c - Introduce option for hardcoding providers' names into metadata.
It allows to fix problems when last provider's sector is shared between few
  providers.
- Bump version number for CONCAT and STRIPE and add code for backward
  compatibility.
- Do not bump version number of MIRROR, as it wasn't officially introduced yet.
  Even if someone started to play with it, there is no big deal, because
  wrong MD5 sum of metadata will deny those providers.
- Update manual pages.
- Add version history to g_(stripe|concat).h files.
2004-08-09 11:29:42 +00:00
Pawel Jakub Dawidek
7e72a70863 Do not use g_wither_geom(9). I doesn't work in the way which is expected
here anymore (after g_wither_washer() was introduced), i.e. geom and consumer
will not be immediately destroyed if possible.
2004-08-09 11:14:25 +00:00
Poul-Henning Kamp
157b106eae Too many versions.
Spotted by:	pjd
2004-08-09 06:04:00 +00:00
Poul-Henning Kamp
07f076fe7a OK, now check geom class version numbers. 2004-08-08 08:34:46 +00:00
Poul-Henning Kamp
5721c9c76a Tag all geom classes in the tree with a version number. 2004-08-08 07:57:53 +00:00
Poul-Henning Kamp
e232f70a75 OOps, that check was a bit premature. Allow zero versions as well. 2004-08-08 07:30:47 +00:00
Poul-Henning Kamp
650ee351b3 Use default method initialization on geoms. 2004-08-08 06:49:07 +00:00
Poul-Henning Kamp
dd66958e28 Give classes a version number and refuse to touch classes which are not
understood.  This makes room for additional binary compatibility in the
future.

Put fields in the class for the geom's methods and initialize the methods
of a new geom from these fields.  This saves some code in all classes.
2004-08-08 06:46:27 +00:00
Pawel Jakub Dawidek
cea363682f Add and document kern.geom.stripe.fast_failed sysctl, which shows how
many times "fast" mode failed.
2004-08-06 10:19:34 +00:00
Pawel Jakub Dawidek
ec70430134 Fields bio_caller[12] should be used by the consumer and fields
bio_driver[12] should be used by the provider!
2004-08-06 10:07:03 +00:00
Pawel Jakub Dawidek
37abacd4ff Fix I/O leakage. We're cloning bios in g_stripe_start_fast(), but when
something goes wrong while running in "fast" mode, we free all bios and
falling back to "economic" mode. Freeing bios, doesn't mean decrease
bio_children, so bio_inbed couldn't be equal to bio_children and request
was never finished.
Decrease bio_children manually when destroying bios.

Reported by:	Sam Lawrance <boris@brooknet.com.au>, simon
2004-08-06 09:55:40 +00:00
Pawel Jakub Dawidek
db332970e7 Don't use 'bp' after its destruction! 2004-08-05 14:07:21 +00:00
Pawel Jakub Dawidek
a4fa09ec93 Simplify a bit - we could use 'sc' here as it was initialized properly. 2004-08-05 13:22:17 +00:00
Pawel Jakub Dawidek
51385a3c00 - Add two fields to bio structure: 'bio_cflags' which can be used by
consumer and 'bio_pflags' which can be used by provider.
- Remove BIO_FLAG1 and BIO_FLAG2 flags. From now on new fields should be
  used for internal flags.
- Update g_bio(9) manual page.
- Update some comments.
- Update GEOM_MIRROR, which was the only one using BIO_FLAGs.

Idea from:	phk
Reviewed by:	phk
2004-08-04 21:35:05 +00:00
Pawel Jakub Dawidek
fe7c3780c8 - Add "prefer" balance algorithm. When used, only disk with the biggest
priority will be used for reading.
- Bump version number.
2004-08-04 12:09:53 +00:00
Pawel Jakub Dawidek
e1efe7edcd MFp4: We don't really need g_mirror_free_disk() function. 2004-08-04 10:02:06 +00:00
Pawel Jakub Dawidek
7d01f1820a Fix comment. 2004-08-03 15:41:33 +00:00
Pawel Jakub Dawidek
969ff54d70 - Fix unloading by the same way it is done in my other classes:
set gp->softc to NULL and return ENXIO when it is NULL, so GEOM
  will not panic or hang, but unload one device on every 'unload'.
  This make 'unload' command usable, but it have to be executed
  <number of devices> + 1 times.
- Made use of 'pp' variable.
2004-08-02 00:37:40 +00:00
Pawel Jakub Dawidek
6b1c71eef8 Typo. 2004-08-01 20:41:58 +00:00
Pawel Jakub Dawidek
4084e6aad2 - Launch main provider when there are no more disks in NEW state.
- Log syncid bump at debug level 1.
2004-08-01 09:01:50 +00:00
Pawel Jakub Dawidek
959521ff1f If there are no valid components after the timeout, just destroy device.
There is probably nothing to wait for.
2004-07-31 22:10:51 +00:00
Lukas Ertl
4b017d0d93 Propagate size changes upwards. 2004-07-31 21:34:21 +00:00
Pawel Jakub Dawidek
4b2e596e38 Handle spoil event in dedicated function: g_mirror_spoiled().
The different between the new function and g_mirror_orphan() (which was
used previously) is that syncid is bumped immediately, instead of on
first write, because when consumer was spoiled, it means, that its
provider was opened for writing, so we can't trust that its data
will be valid when it will be connected again.
2004-07-31 21:08:17 +00:00
Pawel Jakub Dawidek
484405eb8d Remove unused field. 2004-07-31 13:03:38 +00:00
Pawel Jakub Dawidek
2d53f42307 Destroy synchronization geom immediately. This should fix unloading without
stopping all mirrors.
2004-07-31 11:22:03 +00:00
Pawel Jakub Dawidek
c9d3021a92 Allow slice creation on providers from MIRROR class.
This should allow mounting root file system from a mirror.
2004-07-31 01:17:20 +00:00
Pawel Jakub Dawidek
55d6eb9fef Add '-p' option for 'insert' command which allows to specify priority
of the new component.
Version number wasn't bumped (it should be), because I think there are
no geom_mirror users yet.
2004-07-31 00:54:44 +00:00
Pawel Jakub Dawidek
ff9160f5f3 - Check if 'slice' argument was given.
- Check if disk isn't already the mirror component.
2004-07-31 00:51:33 +00:00
Pawel Jakub Dawidek
f251dfbf5d Dump correct field. 2004-07-31 00:37:14 +00:00
Lukas Ertl
663e5a3311 Set the access counts of a subdisk correctly when attaching it
to a plex that already has subdisks.
2004-07-30 23:40:38 +00:00
Pawel Jakub Dawidek
fa4a1febf7 Add GEOM_MIRROR class which provide RAID1 functionality and has many useful
features. The gmirror(8) utility should be used for control of this class.
There is no manual page yet, but I'm working on it with keramida@.

Many useful tests provided by:	simon (thank you!)
Some ideas from:		scottl, simon, phk
2004-07-30 23:13:45 +00:00
Pawel Jakub Dawidek
63ead4f2d3 Nuke geom_mirror class. New geom_mirror class is in the way.
Approved by:	phk
2004-07-30 19:59:36 +00:00
Pawel Jakub Dawidek
f49b0080d0 Allow to create slices on providers from class LABEL and class NOP.
This is really ugly way to do this, but there is no other way for now.
It allows to mount root file system from providers which belong to
those classes.

Approved by:	phk
2004-07-30 19:55:12 +00:00
Pawel Jakub Dawidek
d5c96d389e - Add '-S' option, which allow to specify sector size for transparent
provider.
- Bump version number.

This allows for a quite interesting trick. One can setup a stripe with
stripe size of 512 bytes and create transparent provider on top of it
with sector size equal to <ndisks> * 512. The result will be something
like RAID3 without parity disk (every access will touch all disks).
2004-07-30 08:19:22 +00:00
Lukas Ertl
07c424cdaf Shut up the compiler and temporarily '#if 0' gv_destroy_geom(),
until we need it again.
2004-07-29 11:32:09 +00:00
Pawel Jakub Dawidek
1d723f1d51 Improve geom(8)'s 'list' command to show geoms and their providers and
consumers. Teach STRIPE, CONCAT and NOP classes about this improvement.
2004-07-26 17:14:47 +00:00
Pawel Jakub Dawidek
889c5dc22b Change naming scheme from /dev/<name>.stripe to /dev/stripe/<name>. 2004-07-26 16:10:27 +00:00
Pawel Jakub Dawidek
ba385d0091 Change naming scheme from /dev/<name>.concat to /dev/concat/<name>. 2004-07-26 16:08:32 +00:00
Pawel Jakub Dawidek
2017a9d3e2 M_WAITOK is ok here, while I'm using M_WAITOK later in this function. 2004-07-26 15:41:28 +00:00
Pawel Jakub Dawidek
75cc259de8 M_WAITOK is ok here, while I'm using M_WAITOK later in this function. 2004-07-26 15:35:04 +00:00
Lukas Ertl
d8a720f9ec Save the vinum config back to disk after syncing two plexes. 2004-07-26 07:30:21 +00:00
Lukas Ertl
3242376072 There's a chance that the VINUMDRIVE class tastes before the
VINUM class, so let the VINUMDRIVE class parse the on-disk
configuration, too.
2004-07-25 23:01:09 +00:00
Lukas Ertl
5289667c16 Check for a NULL pointer before dereferencing it. 2004-07-25 09:41:31 +00:00
Lukas Ertl
c291a77678 Use a temporary geom when tasting vinumdrives and lock the 'real'
vinumdrive geom with an exclusive bit.  This should fix the problem
when underlying partitions overlap (i.e. the 'a' partition is at
the same offset as the 'c' partition).

Ideas borrowed from pjd@, quite a bit of testing by
Matthias Schuendehuette <msch@snafu.de>.
2004-07-24 22:26:40 +00:00
Lukas Ertl
6db345b90e Disable kldunloading of geom_vinum temporarily until I figured out
how to do it correctly.
2004-07-24 19:04:24 +00:00
Pawel Jakub Dawidek
e370e911b2 MFp4: Add two options for gnop(8)'s 'create' command:
-o offset - specifies where to start on the original provider
	-s size - specifies size of the transparent provider
2004-07-19 07:52:56 +00:00
Pawel Jakub Dawidek
726cb09fbc Fix copy&paste bug. 2004-07-18 16:51:58 +00:00
Pawel Jakub Dawidek
be7695cf65 Fix exclusive-bit leakage. 2004-07-18 06:54:29 +00:00
Poul-Henning Kamp
3e019deaed Do a pass over all modules in the kernel and make them return EOPNOTSUPP
for unknown events.

A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
2004-07-15 08:26:07 +00:00
Pawel Jakub Dawidek
f3b3b0ce27 Remove unused macro. 2004-07-13 12:01:29 +00:00
Pawel Jakub Dawidek
45c26487a7 Decrease log level of one debug message, so there is no hole (level 2
wasn't used at all).
2004-07-13 12:01:11 +00:00
Pawel Jakub Dawidek
71df072547 Minor sysctl description fixes.
Submitted by:	simon
2004-07-13 11:23:31 +00:00
Pawel Jakub Dawidek
4c55f05b89 Implement "FAST" mode for GEOM_STRIPE class and turn it on by default.
In this mode you can setup even very small stripe size and you can be
sure that only one I/O request will be send to every disks in stripe.
It consumes some more memory, but if allocation fails, it will fall
back to "ECONOMIC" mode.

It is about 10 times faster for small stripe size than "ECONOMIC" mode
and other RAID0 implementations. It is even recommended to use this
mode and small stripe size, so our requests are always splitted.

One can still use "ECONOMIC" mode by setting kern.geom.stripe.fast to 0.
It is also possible to setup maximum memory which "FAST" mode can consume,
by setting kern.geom.stripe.maxmem from /boot/loader.conf.
2004-07-09 14:30:09 +00:00
Poul-Henning Kamp
539bec4042 Only detach consumers which are attached when we wither stuff away.
Pointed out by:	pjd
2004-07-09 14:06:17 +00:00
Poul-Henning Kamp
1b464bd889 Make withering water tight.
When we orphan/wither a provider, an attached geom+consumer could
end up being withered as a result and it may be in front of us in
the normal object scanning order so we need to do multi-pass.  On
the other hand, there may be withering stuff we can't get rid off
(yet), so we need to keep track of both the existence of withering
stuff and if there is more we can do at this time.
2004-07-08 16:17:14 +00:00
Poul-Henning Kamp
4fa0290fbf Fail normally rather than KASSERT if attempt to open a spoiled consumer. 2004-07-08 10:34:09 +00:00
Pawel Jakub Dawidek
01ab535e8d Add missing argument. 2004-07-06 17:06:54 +00:00
Pawel Jakub Dawidek
e943975c69 Properly free resources if g_access() fails. 2004-07-06 16:29:32 +00:00
Pawel Jakub Dawidek
a2e31b8b53 - Add 'stop' command, which works just like 'destroy' command, but sounds
less dangerous.
- Update manual pages and extend examples.
- Bump versions.
2004-07-05 21:16:37 +00:00
Pawel Jakub Dawidek
1b39b2f671 g_clone_bio() can fail, be ready for this.
Approved by:	le
2004-07-05 13:24:22 +00:00
Poul-Henning Kamp
8529ce7a87 We only need to check for overlaps if we increasing access counts. 2004-07-04 13:44:48 +00:00
Pawel Jakub Dawidek
e1237b285b Introduce GEOM_LABEL class.
This class is used for detecting volume labels on file systems:
UFS, MSDOSFS (FAT12, FAT16, FAT32) and ISO9660.
It also provide native labelization (there is no need for file system).

g_label_ufs.c is based on geom_vol_ffs from Gordon Tetlow.
g_label_msdos.c and g_label_iso9660.c are probably hacks, I just found
where volume labels are stored and I use those offsets here,
but with this class it should be easy to do it as it should be done by
someone who know how.
Implementing volume labels detection for other file systems also should
be trivial.

New providers are created in those directories:
/dev/ufs/ (UFS1, UFS2)
/dev/msdosfs/ (FAT12, FAT16, FAT32)
/dev/iso9660/ (ISO9660)
/dev/label/ (native labels, configured with glabel(8))

Manual page cleanups and some comments inside were submitted by
Simon L. Nielsen, who was, as always, very helpful. Thanks!
2004-07-02 19:40:36 +00:00
Pawel Jakub Dawidek
9ecdd506ea Remove unused argument for good. 2004-07-01 15:42:03 +00:00
Pawel Jakub Dawidek
916a3aa05b Free only if pointer isn't NULL. 2004-07-01 12:42:13 +00:00
Poul-Henning Kamp
11e9a67906 Fix regression in last commit. 2004-06-29 08:33:58 +00:00
Poul-Henning Kamp
52c583feb9 Make sure to kill the devstat entry for disappearing disks.
PR:	68074
Submitted by:	Hendrik Scholz <hscholz@raisdorf.net>
2004-06-27 20:53:20 +00:00
Pawel Jakub Dawidek
55336b83e0 Introduce a hack that will make geom_gate to work with read-only mounts.
Now, when trying to mount file system in read-only mode it tries to
opened a device for writting to be able to update to read-write mode
latter. Ehh.

Discussed with:	phk
2004-06-27 12:56:11 +00:00
Robert Watson
5706472c3a The g_up and g_down threads use a local 'mymutex' mutex to allow WITNESS
to warn about attempts to sleep in the I/O path.  This change pushes the
definition and use of 'mymutex' behind #ifdef WITNESS to avoid the cost
in non-debugging cases.  This results in a clear .22% performance win for
512 byte and 1k I/O tests on my SMP test box.  Not much, but every bit
counts.
2004-06-26 23:27:42 +00:00
Lukas Ertl
865897c9f1 Mark a plex as 'newborn' when it is created. This is used to indicate
that new RAID5 plexes need to be initialized first.
2004-06-25 18:04:33 +00:00
Pawel Jakub Dawidek
40f798dad1 Don't force class to give a valid softc to g_slice_new(), it is not always
needed.

Approved by:	phk
2004-06-24 10:50:20 +00:00
Christian S.J. Peron
d7af790b0d Currently, if the drives specified for volume creation are
not active GEOM providers, it will result in a kernel panic.

If the GEOM provider or disk goes away before the volume
configuration data gets written to the disk, it will result
in another kernel panic.

o Make sure that the drives specified for volume creation
  are active GEOM providers.

o When writing out volume configuration data to associated drives,
  make sure that the GEOM provider is active, otherwise continue
  to the next drive in the volume.

Approved by:	le, bmilekic (mentor)
2004-06-24 02:40:34 +00:00
Lukas Ertl
3a1e11b485 Add a function to clean up RAID5 packets and use it when I/O has
finished or when building the complete packet fails.
2004-06-23 23:52:55 +00:00
Lukas Ertl
c3dba6d0e0 Remove two debugging printfs that are currently rather disturbing
than helpful.
2004-06-23 22:32:01 +00:00
Lukas Ertl
b950fbe67b Accept "sd len 0" and auto-size the subdisk correctly.
Spotted by: csjp
2004-06-23 21:15:55 +00:00
Lukas Ertl
1b699be2e1 No need to free the softc, because it wasn't allocated. 2004-06-22 18:13:43 +00:00
Lukas Ertl
291cb0ac69 Don't sleep in the g_down path. More error checks to come. 2004-06-22 14:54:31 +00:00
Poul-Henning Kamp
71c911b18c Kill g_access_rel() already now before we send it down 5-stable 2004-06-21 20:31:49 +00:00
Pawel Jakub Dawidek
47f44cb708 Don't hold topology lock while calling g_gate_release().
Found by:	KASSERT()
2004-06-21 09:12:08 +00:00
Poul-Henning Kamp
fc6c63b477 Duplicate the securelevel check from spec_vnops.c here. 2004-06-19 09:00:53 +00:00
Lukas Ertl
7f72de2d55 Clean up allocated ressources when destroying the main vinum geom. 2004-06-18 19:53:33 +00:00
Poul-Henning Kamp
b90c855961 Reduce the thaumaturgical level of root filesystem mounts: Instead of using
an otherwise redundant clone routine in geom_disk.c, mount a temporary
DEVFS and do a proper lookup.

Submitted by:	thomas
2004-06-17 21:24:13 +00:00
Poul-Henning Kamp
f3732fd15b Second half of the dev_t cleanup.
The big lines are:
	NODEV -> NULL
	NOUDEV -> NODEV
	udev_t -> dev_t
	udev2dev() -> findcdev()

Various minor adjustments including handling of userland access to kernel
space struct cdev etc.
2004-06-17 17:16:53 +00:00
Lukas Ertl
99b536d888 Handle dead disks in a somewhat sane way. 2004-06-16 14:41:04 +00:00
Poul-Henning Kamp
89c9c53da0 Do the dreaded s/dev_t/struct cdev */
Bump __FreeBSD_version accordingly.
2004-06-16 09:47:26 +00:00
Lukas Ertl
da8f1aa53d Fix several bugs related to subdisk drive_offset calculation. 2004-06-15 20:56:25 +00:00
Lukas Ertl
a6facf72b1 Don't free a VINUMDRIVE softc when it's orphaned or spoiled. All
allocated ressouces should be ultimately freed in gv_destroy_geom()
(when unloading the module and not earlier), but I need to look at this
more closely.
2004-06-14 17:12:32 +00:00
Lukas Ertl
1a9e260fd4 Correctly calculate subdisk offset in RAID5 plexes. 2004-06-14 17:06:55 +00:00
Lukas Ertl
73679edcc7 Add a first version of a GEOMified vinum. 2004-06-12 21:16:10 +00:00
Poul-Henning Kamp
cf4572847a Make the sysctl kern.geom.collectstats more granular.
Bit 0 controls statistics collection on GEOM providers.
Bit 1 controls statistics collection on GEOM consumers.

Default value is 1.

Prodded by:	scottl
2004-06-09 19:44:44 +00:00
Pawel Jakub Dawidek
d462f0a1ac Fix format string. 2004-06-07 13:40:40 +00:00
Pawel Jakub Dawidek
0e11f0a93b Don't allow for duplicated entries creation. 2004-06-07 13:33:09 +00:00
Joerg Wunsch
c7cfc3b129 Add SVR4-compatible VTOC-style elements to the Sun label. The
FreeBSD kernel doesn't use them but sunlabel(8) shortly will,
and both these files are used by sunlabel(8).
2004-06-01 20:18:25 +00:00
Poul-Henning Kamp
887ae9a1d2 Zap a redundant NULL 2004-05-30 18:04:06 +00:00
Pawel Jakub Dawidek
3fb17452b0 Dump some more informations:
- device state
	- list of used providers
	- total number of disks
	- number of disks online

Prodded by:	Alex Deiter <tiamat@komi.mts.ru>
2004-05-26 11:36:27 +00:00
Pawel Jakub Dawidek
02692c510d - Change command name from 'config' to 'configure'.
- Bump version number.
2004-05-21 15:23:48 +00:00
Pawel Jakub Dawidek
02637cdcb1 - Teach CONCAT class how to talk with geom(8).
- Remove provider if any disk was lost.
- Dump CONCAT version.

Supported by:	Wheel - Open Technologies - http://www.wheel.pl
2004-05-20 10:40:18 +00:00
Pawel Jakub Dawidek
b09121f9e1 Introduce STRIPE GEOM class. It implements RAID0 transformation and it
is intend to be fast. Just like CONCAT class it provides manual and
auto configuration methods.

Supported by:	Wheel - Open Technologies - http://www.wheel.pl
2004-05-20 10:20:49 +00:00
Pawel Jakub Dawidek
89aaffec5c Introduce NOP GEOM class. This is totally transparent GEOM class, but
it is very useful for tests. One is able to destroy its provider
forcibly if wants to test how other class handle such events.
One is also able to specify failure probability to check how other
classes handle I/O errors.

Supported by:	Wheel - Open Technologies - http://www.wheel.pl
2004-05-20 10:15:53 +00:00
Søren Schmidt
bbf15239ed Dont try to finish devstat's if the disk pointer is NULL, this can happen
when a disk has been destroyed but still has outstanding bio's.

Reviewed by:	phk
2004-05-11 13:17:40 +00:00
Pawel Jakub Dawidek
053271038e Close some small wakeup<->msleep races. 2004-05-05 12:30:41 +00:00
Pawel Jakub Dawidek
c2496c87c1 Fix compilation on 64-bit architectures.
Noticed by:	Tinderbox
2004-05-04 07:45:39 +00:00
Pawel Jakub Dawidek
b62093b274 Turn off debugging by default. 2004-05-03 21:11:54 +00:00
Pawel Jakub Dawidek
37c9eaae29 Prefer signed type over unsigned to be able to assert negative
reference count.
2004-05-03 21:02:02 +00:00
Pawel Jakub Dawidek
4d1e1bf3f5 - Hold g_gate_list_mtx lock while generating/checking unit number.
Found by:	mtx_assert() g_gate.c:273
- Set command before returning to userland with ENOMEM error value.
	Found by:	assert() ggatel.c:108
2004-05-03 18:06:24 +00:00
Pawel Jakub Dawidek
0d785336d1 Make it compile on 64-bit architectures.
The biggest issue was that 16-bit atomic operations aren't supported
on all architectures.
2004-05-02 17:57:49 +00:00
Pawel Jakub Dawidek
fe27e77251 Kernel bits of GEOM Gate. 2004-04-30 16:08:12 +00:00
Marcel Moolenaar
1b19d4ae61 Allow disks with a GPT to be used on big-endian machines. The GPT is
little-endian by definition and needs byte-swap operations for any
multi-byte field. While here fix indentation.
2004-04-30 05:05:39 +00:00
Pawel Jakub Dawidek
f1f163e9cb - Don't check if 'gp' is non-NULL, it always is and GEOM wants to
dump geom configuration when 'pp' and 'cp' are NULL.
- Use tabs instead of spaces.
2004-04-20 17:07:55 +00:00
Pawel Jakub Dawidek
46aeebec57 Calculate bio_completed properly or die!
Approved by:	phk
2004-04-04 20:37:28 +00:00
Peter Grehan
ae33b79b67 Move the name attribute to the end of the conftxt line to simplify
libdisk parsing (the name may be empty, or contain spaces).

Submitted by:  Suleiman Souhlal <refugee@segfaulted.com>
2004-04-01 01:33:37 +00:00
Pawel Jakub Dawidek
950124e354 Move "is consumer attached?" check before G_VALID_PROVIDER() check,
because if consumer is not attached, its provider never will be valid,
so we never reach this check.

Approved by:	phk
2004-03-18 07:17:10 +00:00
Poul-Henning Kamp
e26bafdc25 Be more insistent on destroying geoms at unload time. Still not perfect,
but it will do (better) for now.

KASSERT that to have providers a class must have an access method.

Tag the new_provider event with the geom as well.
2004-03-11 08:16:23 +00:00
Poul-Henning Kamp
3d1d5bc3c3 Rearrange some of the GEOM debugging tools to be more structured.
Retire g_sanity() and corresponding debugflag (0x8)

  Retire g_{stall,release}_events().

  Under #ifdef DIAGNOSTIC:

    Make g_valid_obj() an official function and have it return an an
    non-zero integer which indicates the kind of object when found.

    Implement G_VALID_{CLASS,GEOM,CONSUMER,PROVIDER}() macros based
    on g_valid_obj().

    Sprinkle calls to these macros liberally over the infrastructure.

    Always check that we do not free a live object.
2004-03-10 08:49:08 +00:00
Pawel Jakub Dawidek
48fbd94b4e - Don't take sectorsize from first disk. Calculate it by finding
least common multiple of all disks sector sizes.
  This will allow to safely concatenate disks with different sector sizes.
- Mark unused function arguments.
- Other minor cleanups.
2004-03-09 11:18:53 +00:00
Pawel Jakub Dawidek
810914da53 Print a space character between string given as a macro argument and
bio description.
2004-03-09 11:00:24 +00:00
Poul-Henning Kamp
b07ef6c2db Don't panic on providers already withered when we wither a geom. 2004-03-07 17:33:15 +00:00
John Baldwin
6074439965 kthread_exit() no longer requires Giant, so don't force callers to acquire
Giant just to call kthread_exit().

Requested by:	many
2004-03-05 22:42:17 +00:00
Pawel Jakub Dawidek
32d7144dbc Correct year in copyrights. 2004-03-04 10:22:42 +00:00
Pawel Jakub Dawidek
a88ae49f98 - Remove d_valid field, we can use d_consumer field to check if disk
is valid.
- Use SYSCTL_DECL() instead of using own, ugly extern.
2004-03-03 22:29:24 +00:00
Pawel Jakub Dawidek
db33b1c4d0 Removed unused fields. 2004-03-01 17:33:11 +00:00
Pawel Jakub Dawidek
03816084de We don't need d_length field. 2004-03-01 17:32:48 +00:00
Pawel Jakub Dawidek
0e2ff2832c Even if we're sure that we can't be orphaned here, we have to define
orphan field - we're enforcing it in GEOM. This will reach KASSERT
in INVARIANTS case.

Add missing space.

Approved by:	scottl (mentor)
2004-02-27 15:34:21 +00:00
Pawel Jakub Dawidek
0787ce83b2 Remove unused field.
Approved by:	scottl (mentor)
2004-02-27 15:32:49 +00:00
Poul-Henning Kamp
dc08ffec87 Device megapatch 4/6:
Introduce d_version field in struct cdevsw, this must always be
initialized to D_VERSION.

Flip sense of D_NOGIANT flag to D_NEEDGIANT, this involves removing
four D_NOGIANT flags and adding 145 D_NEEDGIANT flags.
2004-02-21 21:10:55 +00:00
Pawel Jakub Dawidek
19d16e2fee Introduce CONCAT GEOM class for disk concatenation.
It allows manual and automatic (based on on-disk metadata) concatenation.

Reviewed by:	phk, scottl
Approved by:	scottl (mentor)
2004-02-19 15:19:49 +00:00
Poul-Henning Kamp
0b7ed341e1 Change the disk(9) API in order to make device removal more robust.
Previously the "struct disk" were owned by the device driver and this
gave us problems when the device disappared and the users of that device
were not immediately disappearing.

Now the struct disk is allocate with a new call, disk_alloc() and owned
by geom_disk and just abandonned by the device driver when disk_create()
is called.

Unfortunately, this results in a ton of "s/\./->/" changes to device
drivers.

Since I'm doing the sweep anyway, a couple of other API improvements
have been carried out at the same time:

The Giant awareness flag has been flipped from DISKFLAG_NOGIANT to
DISKFLAG_NEEDSGIANT

A version number have been added to disk_create() so that we can detect,
report and ignore binary drivers with old ABI in the future.

Manual page update to follow shortly.
2004-02-18 21:36:53 +00:00
Poul-Henning Kamp
281591449a Do not check error code from closing ->access() calls, we know they succeed. 2004-02-14 17:59:44 +00:00
Poul-Henning Kamp
bfc37a5112 Add a KASSERT which checks that a class never fails a closing ->access()
call.
2004-02-14 17:58:57 +00:00
Poul-Henning Kamp
d2bae332d6 Remove the absolute count g_access_abs() function since experience has
shown that it is not useful.

Rename the relative count g_access_rel() function to g_access(), only
the name has changed.

Change all g_access_rel() calls in our CVS tree to call g_access() instead.

Add an #ifndef BURN_BRIDGES #define of g_access_rel() for source
code compatibility.
2004-02-12 22:42:11 +00:00
Poul-Henning Kamp
f865123ec4 Give both consumers and providers a {void *private, u_int index} which
the implementing class can use to hang internal info from.
2004-02-12 20:32:11 +00:00
Pawel Jakub Dawidek
72e330954e Added g_print_bio() function to print informations about given bio.
Approved by:	phk, scottl (mentor)
2004-02-11 18:21:32 +00:00
Pawel Jakub Dawidek
18e88d825c Now we have g_topology_assert_not(), so use it to detect deadlocks.
Approved by:	phk, scottl (mentor)
2004-02-10 15:55:17 +00:00
Pawel Jakub Dawidek
692498b0cd Added macro which will be used to assert, that the topology lock is not held.
Approved by:	phk, scottl (mentor)
2004-02-10 15:53:28 +00:00
Poul-Henning Kamp
99cf2f941c don't call sbuf_clear() right after sbuf_new(), it is not necessary. 2004-02-10 10:54:19 +00:00
Poul-Henning Kamp
df3df337b8 Polish the work/state engine in preparation for HW-crypto support. 2004-02-08 10:19:18 +00:00
Poul-Henning Kamp
d091e630f1 Add a missing error case return.
Problem reported by:	Flemming Jacobsen <fj@batmule.dk>
2004-02-08 09:39:02 +00:00
Poul-Henning Kamp
3aa5a3ad90 We don't need to hold Giant to create the worker kthread. 2004-02-07 23:01:17 +00:00
Pawel Jakub Dawidek
12047230cd Allow decreasing access count even if there is no disk anymore.
This will allow closing disks that were removed while opened.

Approved by:	phk, scottl (mentor)
2004-02-06 23:10:49 +00:00
Lukas Ertl
5b2f81ec4b Fix memory leak.
PR:            kern/58634
Submitted by:  le
Approved by:   phk
2004-02-06 22:51:04 +00:00
Poul-Henning Kamp
0ed4f6a180 Allow a GEOM class to unload if it has no geoms or a method function to
get rid of them.

Prodded by:	pjd
2004-02-02 19:49:41 +00:00
Pawel Jakub Dawidek
cff2ddaeb2 - Use proper names in KASSERTs.
- Typos.

Approved by:	phk, scottl (mentor)
2004-02-02 17:50:09 +00:00
Poul-Henning Kamp
abc2e0fd56 Check error return from g_clone_bio(). (netchild@)
Rearrange code to avoid duplication (phk@)

Submitted by:	netchild@
2004-02-02 13:36:06 +00:00
Poul-Henning Kamp
793ffa8e55 Don't mingle malloc/g_event flags.
Spotted by:	pjd@
2004-02-02 10:58:07 +00:00
Poul-Henning Kamp
5fcf4e4398 Bring back the geom_bioqueues, they _are_ a good idea.
ATA will uses these RSN.
2004-01-28 08:39:18 +00:00
Poul-Henning Kamp
57ab2e0468 Make sure to keep track of canceled events.
Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2004-01-23 21:09:38 +00:00
Poul-Henning Kamp
799426f877 Add KASSERTS.
Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2004-01-23 21:02:49 +00:00
Poul-Henning Kamp
f5b3481451 Plug an insignificant memoryleak.
Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2004-01-23 20:40:25 +00:00
Poul-Henning Kamp
752e0f0196 Add missing newline in printf.
Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2004-01-23 20:36:21 +00:00
Poul-Henning Kamp
bbf53bc053 Remove the MD5_KEY debugging tool 2004-01-23 11:47:06 +00:00