387 Commits

Author SHA1 Message Date
Greg Lehey
2e387f1bd3 vinumstart: If a write request is for a RAID-[45] plex or a volume
with more than one plex, the data will be accessed
            multiple times.  During this time, userland code could
            potentially modify the buffer, thus causing data
            corruption.  In the case of a multi-plexed volume this
            might be cosmetic, but in the case of a RAID-[45] plex it
            can cause severe data corruption which only becomes
            evident after a drive failure.  Avoid this situation by
            making a copy of the data buffer before using it.

	    Note that this solution does not guarantee any particular
	    content of the buffer, just that it remains unchanged for
	    the duration of the request.

Suggested by:	alfred
2001-05-22 02:36:47 +00:00
Greg Lehey
5be7546b83 tokenize: Take third parameter specifying the maximum number of
parameters to return.  This code is used both in userland and in the
kernel.
2001-05-22 02:35:57 +00:00
Greg Lehey
24ad5cd24b Cosmetics: wrap long lines to be < 80 characters. 2001-05-22 02:35:19 +00:00
Greg Lehey
c4d4f4147d Add a new debug flag, DEBUG_LOCKREQS, which logs only lock requests.
Use this instead of DEBUG_LASTREQS to decide whether to log lock
requests.

MFS:

vinumlock: Catch a potential race condition where one process is
           waiting for a lock, and between the time it is woken and
           it retries the lock, another process gets it and places it
           in the first entry in the table.

           This problem has not been observed, but it's possible, and
           it's easy enough to fix.

Submitted by:   tegge

vinumunlock: Catch a real bug capable of hanging a system.  When
             releasing a lock, vinumunlock() called wakeup_one.  This
             caused wakeups to sometimes get lost.  After due
             consideration, we think that this is due to the fact that
             you can't guarantee that some other process is also
             waiting on the same address.  This makes wakeup_one a
             very dangerous function to use.
2001-05-22 02:34:30 +00:00
Greg Lehey
cf65a3dc63 Change ioctls to use the expurgated userland version of the Vinum
structures.
2001-05-22 02:33:32 +00:00
Greg Lehey
acac8659d7 format_config: Replace long format lines.
Requested by:  bde

Add retryerrors keyword.

vinum_scandisk: Print a different message if an inadvertent start
command did not find any additional drives.  The previous message "no
drives found" confused and worried many people.

MFS:

vinum_open: Recognize Mylex devices as storage devices.
2001-05-22 02:32:22 +00:00
Greg Lehey
ab15c118bf complete_rqe:
In case of error, check the VF_RETRYERRORS flag in the subdisk and
  don't take the subdisk down if it's set, just retry the I/O.

  Requested by:	peter

  If the buffer has been copied (XFR_COPYBUF), release the copied
  buffer when the I/O completes.

  Suggested by:	alfred
2001-05-22 02:31:08 +00:00
Greg Lehey
0e414969e0 Remove unnecessary declarations of userland functions.
Desired by:	   bde

This commit is the first of a general cleanup of the header files..
It won't be enough to make bde happy.

Move debug definitions from vinumhdr.h.
2001-05-22 02:30:44 +00:00
Greg Lehey
177bb9657f config_sd: Add code to recognize "retryerrors" keyword.
config_plex: Don't create the device until we're finished.

parse_config: check for corrupted configuration, thus avoiding a
potential panic.

remove_sd_entry: Restore structure.
2001-05-22 02:29:54 +00:00
Greg Lehey
7e18e4ffbc free_vinum: Change some explicit struct member references to the SD,
PLEX and VOL.
2001-05-22 02:29:15 +00:00
Greg Lehey
2c4a6d9016 Add xferinfo flag bit for copied buffers.
Create a new struct rangelockinfo.  In revision 1.21 of vinumlock.c,
the plex info was removed from struct rangelock, since it wasn't
needed there.  It *is* needed for trace information, however, so use
struct rangelockinfo for that.
2001-05-22 02:28:55 +00:00
Greg Lehey
1739a10826 New file containing definitions for separate views of objects for
userland and kernel.
2001-05-22 01:41:12 +00:00
Poul-Henning Kamp
f83880518b Send the remains (such as I have located) of "block major numbers" to
the bit-bucket.
2001-03-26 12:41:29 +00:00
Alfred Perlstein
ea846c773d devfs convertion used VINUMRMINOR incorrectly (passing args in
backwards order)

Submitted by: Bernd Walter <ticso@mail.cicely.de>
2001-03-22 23:21:49 +00:00
Peter Wemm
e793f3a0e0 By convention, the moduledata is static unless there is a reason for it
to not be.
2001-03-13 05:55:43 +00:00
Alfred Perlstein
729d4f1db0 Fix vinum for both devfs and non-devfs systems.
userland tool:

  Use the vfs.devfs.generation sysctl to test for devfs presense
  (thanks phk!) when devfs is active it will not try to create the
  device nodes in /dev and therefore will not complain about the
  failure to do so.

  Revert the change in the #define for VINUM_DIR in the kernel
  header so that vinum can find its device nodes.

  Replace perror() with vinum_perror() to print file/line when
  DEVBUG is defined (not defined by default).

kernel:

  Don't use the #define names for the "superdev" creation since
  they will be prepended by "/dev/" (based on VINUM_DIR), instead
  use string constants.

  Create both debug and non-debug "superdev" nodes in the devfs.

Problem noticed and fix tested by: Martin Blapp <mblapp@fuchur.lan.attic.ch>
2001-02-20 22:07:36 +00:00
Alfred Perlstein
1a1f84bdd7 forced commit to note that the last delta also reordered some code in
remove_sd_entry() to:

  Simplify (hopefully) it by moving all error returns closer to
  the beginning of the function.

  Return an error when "Error removing subdisk %s: not found in
  plex %s\n" would have been reported, as I doubt that we are "OK"
  after printing that error message.
2001-02-20 12:14:01 +00:00
Alfred Perlstein
d9ca06d2a3 Take a shot at making vinum devfs aware.
Adding make_dev() and destroy_dev() calls in (hopefully) the right
places.

This is done by calling make_dev() in each object constructor and
caching the dev_t's returned from make_dev() in each struct
'subdisk'(sd), 'plex' and 'volume' such that the 'object'_free()
functioncs can call destroy dev.

This change makes a subset of the old /dev/vinum appear under devfs.

Enough nodes appear such that I'm able to mount my striped volume.

There may be more work needed to get vinum configuration working
properly.
2001-02-20 11:37:04 +00:00
Bosko Milekic
9ed346bab0 Change and clean the mutex lock interface.
mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)
2001-02-09 06:11:45 +00:00
Greg Lehey
2a1735da45 Allocate lock table and mutex not only for parity plexes, but also for
striped plexes.  This prevents various panics introduced in the last
rewrite of the locking code.

Suffered by:   "Niels Chr. Bank-Pedersen" <ncbp@bank-pedersen.dk>
2001-02-02 07:14:13 +00:00
John Baldwin
e7d904a13a - Proc locking around the vinumdaemon dinking with its flags.
- P_INMEM -> PS_INMEM.
2001-01-24 10:28:19 +00:00
Jake Burkholder
a448b62ac9 Make intr_nesting_level per-process, rather than per-cpu. Setup
interrupt threads to run with it always >= 1, so that malloc can
detect M_WAITOK from "interrupt" context.  This is also necessary
in order to context switch from sched_ithd() directly.

Reviewed By:	peter
2001-01-21 19:25:07 +00:00
Greg Lehey
9cd8b5cf0d Correct check for partition c. Previously the check was for drive 2,
which did not exactly have the desired result.

Submitted by:	Akira Watanabe <akira@myaw.ei.meisei-u.ac.jp>
2001-01-20 03:46:19 +00:00
Greg Lehey
940b9a4808 struct rangelock: Remove the field 'plex' from the entry. Range locks
are accessed only via the plex, so there's never any confusion as to
the plex number.  This value was, as a result, unused.
2001-01-14 06:34:57 +00:00
Greg Lehey
96f2d202a1 format_config: If a subdisk loses its drive (due to a bug which has
not yet been caught), don't save the config with a null drive
   	name (which causes the drive to be renamed "plex" on the next
   	start), put in the text "*invalid*" instead.

	This is damage control, not a fix.

Experienced by:	peter

Break some long format strings so that they fit in style(9)-sized
lines.

Remove some "outdentation".
2001-01-14 06:33:10 +00:00
Greg Lehey
a08d6f1a2c config_plex: Check that we have specified a plex organization.
Tripped over by:	"Jeroen C. van Gelderen" <jeroen@vangelderen.org>
2001-01-14 06:29:56 +00:00
Greg Lehey
891b3d045f Reinstate 1.19.
Prodded by: iedowse
2001-01-10 21:42:06 +00:00
Greg Lehey
a208c55ad3 Part of rewrite of RAID-[45] locking code:
Rename INITIAL_LOCKS to PLEX_LOCKS, since it now stays a constant.

struct plex:
  Add a mutex lockmtx.
  Remove alloclocks.
2001-01-10 05:08:30 +00:00
Greg Lehey
91c6496f6e vinumstart: Don't check for B_DONE on return from bre(), it doesn't
happen any more.

abortrequest: don't bufdone the user bp on error, let vinumstart() do
it.

Based on analysis by:	tegge
2001-01-10 05:07:52 +00:00
Greg Lehey
6e763eb38a bre5: don't bufdone the user bp on error, let vinumstart() do it.
Based on analysis by:	tegge
2001-01-10 05:07:14 +00:00
Greg Lehey
83d4664f33 Remove obsolete functions [un]lockplex and [un]lockvol.
Rewrite lockrange and unlockrange.  The lock table is now a fixed
size, so there is no possibility for race conditions when expanding.
The current size (256 locked ranges) should be large enough that it
makes no sense to expand it.  To do expansion right would require
quiescing the plex (requiring at least 256 I/O completions), and the
performance implications are horrendous.

Add a mutex per plex for accessing the lock table.

Based on analysis by:	tegge
2001-01-10 05:06:37 +00:00
Greg Lehey
6694e200e5 Get definition of Malloc right when not using VINUMDEBUG
Pointed out by:	tegge
2001-01-10 05:05:46 +00:00
Greg Lehey
8684adf5cd open_drive: Refuse to open partition c of a disk device.
This should eliminate one case of foot shooting .

vinum_scandisk: If a drive in the partition table is downed, free it.
This duplicates code for the compatibility partition, which for some
reason was omitted here.
2001-01-10 05:02:44 +00:00
Greg Lehey
5f1a173c01 config_plex: Initialize mutex for parity plexes.
remove_plex_entry: Destroy mutex for parity plexes.

Part of rewrite of RAID-[45] locking code.
2001-01-10 05:01:12 +00:00
Jake Burkholder
ef73ae4b0c Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables
other then curproc.
2001-01-10 04:43:51 +00:00
Dag-Erling Smørgrav
a8513d6f69 Re-commit revision 1.32, which grog incorrectly backed out in revision 1.33. 2000-12-20 11:17:09 +00:00
Greg Lehey
7509f91aba revive_block: Don't go beyond the end of the stripe when reviving
striped plexes.

Submitted by:   des

Don't lock buffers before calls to sdio, sdio does it by itself.

Submitted by:	tegge

parityops:	Use correct casts when returning error information.
2000-12-20 05:18:58 +00:00
Greg Lehey
06694e9333 build_rq_buffer: Note which buffer headers we lock.
sdio: Unlock the buffer if we fail.

Submitted by:	tegge
2000-12-20 05:18:09 +00:00
Greg Lehey
e8c0649946 Rearrange #includes to make more sense. This is still not the reform
that bde is waiting to see, but at least it works.
2000-12-20 05:17:29 +00:00
Greg Lehey
3daf911efe Rename detached plexes and subdisks correctly (off by one error)
Submitted by:	Terry Glanfield <Terry.Glanfield@program-products.co.uk>
2000-12-20 05:16:46 +00:00
Greg Lehey
ed4963f8e1 open_drive: Add support for more than 32 devices of a particular kind.
Requested by:	Bernd Walter <ticso@cicely8.cicely.de>
		Cor Bosman <cor@xs4all.net>
		Kai Storbeck <kai@xs4all.net>
		Joe Greco <jgreco@ns.sol.net>

	     Add support for Compaq SMART-2 RAID (idad) as storage
	     device for Vinum subdisks.

Reported by:	Aaron Hill <hillaa@hotmail.com>
2000-12-20 05:15:50 +00:00
Greg Lehey
e872ce018e give_plex_to_volume: Recalculate volume size after attaching.
Cosmetics.
2000-12-20 05:13:26 +00:00
Greg Lehey
4fd45755c5 Add flag XFR_BUFLOCKED to identify buffers which have been locked.
Part of fix to ensure that we unlock buffers we lock.

In principle submitted by: tegge
2000-12-20 05:10:08 +00:00
Greg Lehey
e19bc84f20 Don't include system-specific header files for userland program.
Discovered by default by: alfred
2000-11-28 06:38:53 +00:00
Dag-Erling Smørgrav
cd085280b5 Make sure we don't cross stripe boundaries when reviving striped plexes.
This makes crash recovery work for stripe sizes that are not multiples of
DEFAULT_REVIVE_BLOCKSIZE (currently 64 kB).
While we're here, fix a few cosmetic nits.

Reviewed by:	grog
Sponsored by:	Enitel ASA (http://www.enitel.no/)
2000-11-17 23:40:01 +00:00
Poul-Henning Kamp
416282b562 Get rid of the last traces of ACTUALLY_LKM_NOT_KERNEL 2000-10-23 08:35:41 +00:00
Bruce Evans
3d7613a6e2 Avoid impending world breakage.
1. Don't include <sys/conf.h> in userland.  It is not used, and including it
   without including its prerequisite <sys/time.h> should have broken the
   world.
2. Don't include <sys/mount.h>.  It is not used, except in -current it
   bogusly includes <sys/stat.h> which bogusly includes <sys/time.h> and
   thus accidentally provides the prerequisite in (1).
3. Cleaned up nearby include messes.

Not approved by despite 5 weeks notice: MAINTAINER
2000-10-13 13:43:37 +00:00
Greg Lehey
7a598eabdd open_drive:
Add support for AMD RAID controllers as "disks".

Requested-by:  Marius Bendiksen <mbendiks@eunet.no>

  Remove potential panic when attempting to open non-existent drivers.

init_drive: Return error codes correctly.  Previously it would
            occasionally return 0.  The error was redetected
            elsewhere, but this was causing a number of confusing
            error messages.
2000-08-16 04:31:37 +00:00
Greg Lehey
6b45806f44 start_object: Set the revive length correctly. 2000-06-07 03:34:18 +00:00
Greg Lehey
fe8b826551 revive_block:
Fix several instances of breakage in RAID-5 revive code.

   Tidy up code.

parityops:
   Don't attempt to do anything if the plex is degraded or worse.

parityrebuild:
   Add comments.
   Perform transfers in correct length.
2000-06-07 03:33:09 +00:00