the -v option, though it's not clear that it won't bite us elsewhere.
Forgotten by: phk
Implement setreadpol() function for the VINUM_READPOL ioctl.
Submitted by: Allan Saddi <allan@saddi.com>
with it.
Finally implement read policies. The previous "implementation" didn't
work because it referred to plexes which were almost invariably when
referred to. Instead, deprecate the "prefer" keyword for volumes
(though it's still there for the moment) and add a keyword "preferred"
to the plex definition. The relationship is like this:
Old:
vol foo ... prefer foo.p3
New:
plex foo.p3 volume foo preferred
print_config: Print "preferred" where appropriate.
No longer print "prefer" on volume config entries.
work because it referred to plexes which were almost invariably when
referred to. Instead, deprecate the "prefer" keyword for volumes
(though it's still there for the moment) and add a keyword "preferred"
to the plex definition. The relationship is like this:
Old:
vol foo ... prefer foo.p3
New:
plex foo.p3 volume foo preferred
give_plex_to_volume: set preferred plex if specified on plex
definition entry. This involves adding a parameter to the function to
specify the preferred plex.
config_plex: Implement preferred keyword.
Add ioctl VINUM_READCONFIG which implements both the "read" and
"start" commands in vinum(8). Aim for marginally better error
messages when something goes wrong.
just return it. Don't try to reinitialize it. This should fix a
number of inconsistencies that some people encountered with "vinum
start".
PR: 30588
PR: 43475
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
Rewrite minor number decoding. Now we have only three types of
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
Correct formats for some error messages. Don't cast the value to
match the format.
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
Tidy up comments.
Check for null rqgs. This continue to be reported, though I can't
work out why.
Correct formats for some error messages. Don't cast the value to
match the format.
Use microtime, not getmicrotime, for timing debug entries.
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
As a result of the minor number changes, split out the superdevice
handling into a separate function, vinum_super_ioctl. This was most
of the code of vinumioctl.
attachobject: Improve error checking.
init_drive: Rephrase error message text.
Remove dead code (inside #if 0).
Change name of find_drive_by_dev to the more descriptive
find_drive_by_name.
Tidy up comments.
get_emppty_drive: Fix a day one bug with strcpy parameters.
Change name of find_drive_by_dev to the more descriptive
find_drive_by_name.
Rewrite minor number decoding. Now we have only three types of
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
object: subdisks, plexes and volumes. The encoding for plexes and
subdisks no longer reflects the object to which they belong. The
super devices are high-order volume numbers. This gives vastly more
potential volumes (4 million instead of 256).
Remove an unnecessary goto.
vinumopen: Return EINVAL, not ENXIO, on an attempt to open a
referenced plex.
or sched_lock are sufficient to test this flag.
XXX: vinum should really be using a kernel process via kthread_create()
instead of this hack. I'm not even sure PS_INMEM can be clear at this
point anyways.
in geom_disk.c.
As a side effect this makes a lot of #include <sys/devicestat.h>
lines not needed and some biofinish() calls can be reduced to
biodone() again.
branches:
Initialize struct cdevsw using C99 sparse initializtion and remove
all initializations to default values.
This patch is automatically generated and has been tested by compiling
LINT with all the fields in struct cdevsw in reverse order on alpha,
sparc64 and i386.
Approved by: re(scottl)
- Remove the buftimelock mutex and acquire the buf's interlock to protect
these fields instead.
- Hold the vnode interlock while locking bufs on the clean/dirty queues.
This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another
BUF_LOCK with a LK_TIMEFAIL to a single lock.
Reviewed by: arch, mckusick
similar patch has been in 4.x for a while, but is more hacky there.)
For this to work, vinum has to be loaded early (e. g. from
boot/loader), for obvious reasons. If the kernel env variable
(aka. loader variable) "vinum.autostart" is set, vinum then asks the
sysctl kern.disks for all available disks in the system, and scans
them for possible vinum headers.
For statically compiled kernels, this behaviour can be obtained even
without boot/loader by using "options VINUM_AUTOSTART" (though this is
not the recommended way).
Alternatively, the 4.x way to specify "vinum.drives" is also supported.
No further hacks (like the 4.x "vinum.root" variable) are needed,
since in 5.x, mountroot() asks back at the drivers to have them
resolve the name of the root FS into a dev_t (using the dev_clone
eventhandler).
(The MFC reminder below is for a partial MFC for vinum.autostart, the
rest is already there in 4.x.)
Timed out on: grog
MFC after: 2 weeks
could already exist, and this triggers a booby trap panic in make_dev.
remove_plex_entry: Don't remove the stripe mutex here, it gets done in
free_plex.
Approved by: re (rwatson)
is not long long on all archs. (They happen to be long's on 64-bit arch's
and gcc considers that significant enough to warn about it.) These should
probably be uintmax_t but I didn't feel like adding all the extra includes.
(1) Use namei() and devfs to discover devices rather than a hard-coded
MAKEDEV implementation. Once rootfs is in place, this will allow
Vinum to be used for the root file system partition.
(2) Pass FREAD to device opens so that GEOM will return sector size
rather than an error on attempts to read label data.
(3) Avoid clobbering return values from close_drive() and masking this
failure, resulting in a later divide by zero due to not having
updated the Vinum-cached sector size.
(4) Ignore failures from DIOCWLABEL as that appears not to be required
in the GEOM environment.
We've done testing in simple Vinum environments, but those with more
complex environments might want to give this a spin in DP2 and make
sure everything is up to speed.
Fixes in collaboration with: iedowse
Reviewed by: grog
before freeing so that WITNESS doesn't dereference mutex data pointers
and page fault. It's now possible to unload vinum.ko with a GENERIC
kernel on 5.0-CURRENT without panic.
Debugged/fixed with the aid of: jake, grog
instead of %llx when %j is available).
Changed nearby output formats from %x to %#x so that it is obvious that the
numbers are in hex (vinum mostly uses 0x%x elsewhere).
Didn't fix nearby format printf errors (long lines).
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
DIOCGMEDIASIZE instead.
The partition type check has been XXX'ed out since we cannot expect
that a BSD disklabel with a type field be available on all platforms.
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
apply to this file. The correct message is:
throw_rude_remark: Make sure we're holding the config lock before
proceeding. There's no reason to assume that this
has ever happened, but the alternative might be a
double fault.
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
In file included from ../../../dev/vinum/vinumhdr.h:77,
from ../../../dev/vinum/vinum.c:44:
../../../dev/vinum/vinumext.h:165: warning: redundant redeclaration of `setjmp' in same scope
../../../sys/systm.h:96: warning: previous declaration of `setjmp'
../../../dev/vinum/vinummemory.c:44: warning: redundant redeclaration of `longjmp' in same scope
../../../sys/systm.h:97: warning: previous declaration of `longjmp'
vinumhdr.h:80: warning: redundant redeclaration of `vinum_cdevsw'
vinumext.h:239: warning: previous declaration of `vinum_cdevsw'
in each of the following files:
vinum.c, vinumconfig.c, vinumdaemon.c, vinuminterrupt.c, vinumio.c,
vinumioctl.c, vinumlock.c, vinummemory.c, vinumraid5.c, vinumrequest.c,
vinumrevive.c, vinumstate.c, vinumutil.c
requiring fewer header files for userland programs.
Remove the gross debug device/non-debug device hack used to recognize
whether the kernel module was in sync with the userland module.
compiled with debug support. This can be used by userland programs to
recognize which ioctls the module supports.
As a result, remove the gross debug device/non-debug device hack used
to recognize whether the kernel module was in sync with the userland
module.
Replace explicit references to major/minor numbers of vinum
superdevice with the VINUM_SUPERDEV macro written for that purpose.
gets incremented every time the kernel-userland interface changes.
This enables vinum(8) to check for the correct kernel version and to
produce a useful message if it doesn't match.
Requested by: Too many to count.
Move the definitions of struct drive, sd, plex and volume to
vinumobj.h.
Add a new debug flag, DEBUG_LOCKREQS, which logs only lock requests.
with more than one plex, the data will be accessed
multiple times. During this time, userland code could
potentially modify the buffer, thus causing data
corruption. In the case of a multi-plexed volume this
might be cosmetic, but in the case of a RAID-[45] plex it
can cause severe data corruption which only becomes
evident after a drive failure. Avoid this situation by
making a copy of the data buffer before using it.
Note that this solution does not guarantee any particular
content of the buffer, just that it remains unchanged for
the duration of the request.
Suggested by: alfred
Use this instead of DEBUG_LASTREQS to decide whether to log lock
requests.
MFS:
vinumlock: Catch a potential race condition where one process is
waiting for a lock, and between the time it is woken and
it retries the lock, another process gets it and places it
in the first entry in the table.
This problem has not been observed, but it's possible, and
it's easy enough to fix.
Submitted by: tegge
vinumunlock: Catch a real bug capable of hanging a system. When
releasing a lock, vinumunlock() called wakeup_one. This
caused wakeups to sometimes get lost. After due
consideration, we think that this is due to the fact that
you can't guarantee that some other process is also
waiting on the same address. This makes wakeup_one a
very dangerous function to use.
Requested by: bde
Add retryerrors keyword.
vinum_scandisk: Print a different message if an inadvertent start
command did not find any additional drives. The previous message "no
drives found" confused and worried many people.
MFS:
vinum_open: Recognize Mylex devices as storage devices.
In case of error, check the VF_RETRYERRORS flag in the subdisk and
don't take the subdisk down if it's set, just retry the I/O.
Requested by: peter
If the buffer has been copied (XFR_COPYBUF), release the copied
buffer when the I/O completes.
Suggested by: alfred