.\" Hey, Emacs, edit this file in -*- nroff-fill -*- mode .\"- .\" Copyright (c) 1997, 1998 .\" Nan Yang Computer Services Limited. All rights reserved. .\" .\" This software is distributed under the so-called ``Berkeley .\" License'': .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" 3. All advertising materials mentioning features or use of this software .\" must display the following acknowledgement: .\" This product includes software developed by Nan Yang Computer .\" Services Limited. .\" 4. Neither the name of the Company nor the names of its contributors .\" may be used to endorse or promote products derived from this software .\" without specific prior written permission. .\" .\" This software is provided ``as is'', and any express or implied .\" warranties, including, but not limited to, the implied warranties of .\" merchantability and fitness for a particular purpose are disclaimed. .\" In no event shall the company or contributors be liable for any .\" direct, indirect, incidental, special, exemplary, or consequential .\" damages (including, but not limited to, procurement of substitute .\" goods or services; loss of use, data, or profits; or business .\" interruption) however caused and on any theory of liability, whether .\" in contract, strict liability, or tort (including negligence or .\" otherwise) arising in any way out of the use of this software, even if .\" advised of the possibility of such damage. 
.\"
.\" $Id: vinum.4,v 1.11 1999/02/05 00:34:21 grog Exp $
.\"
.Dd 22 July 1998
.Dt vinum 4
.Os FreeBSD
.Sh NAME
.Nm vinum
.Nd Logical Volume Manager
.Sh SYNOPSIS
.Cd "kldload vinum"
.Cd "kldload Vinum"
.Sh DESCRIPTION
.Nm
is a logical volume manager inspired by, but not derived from, the Veritas
Volume Manager.
It provides the following features:
.Bl -bullet
.It
It provides device-independent logical disks, called \fIvolumes\fP.
Volumes are not restricted to the size of any disk on the system.
.It
The volumes consist of one or more \fIplexes\fP, each of which contains the
entire address space of a volume.
This represents an implementation of RAID-1 (mirroring).
Multiple plexes can also be used for:
.\" XXX What about sparse plexes?  Do we want them?
.if t .sp
.Bl -bullet
.It
Increased read throughput.
.Nm
will read data from the least active disk, so if a volume has plexes on
multiple disks, more data can be read in parallel.
.Nm
reads data from only one plex, but it writes data to all plexes.
.It
Increased reliability.
By storing plexes on different disks, data will remain available even if one
of the plexes becomes unavailable.
In comparison with a RAID-5 plex (see below), using multiple plexes requires
more storage space, but gives better performance, particularly in the case of
a drive failure.
.It
Additional plexes can be used for on-line data reorganization.
By attaching an additional plex and subsequently detaching one of the older
plexes, data can be moved on-line without compromising access.
.It
An additional plex can be used to obtain a consistent dump of a file system.
By attaching an additional plex and detaching it at a specific time, the
detached plex becomes an accurate snapshot of the file system at the time of
detachment.
.\" Make sure to flush!
.El
.It
Each plex consists of one or more logical disk slices, called
\fIsubdisks\fP.
A subdisk is defined as a contiguous block of physical disk storage.
A plex may consist of any reasonable number of subdisks (in other words, the
real limit is not the number, but other factors, such as memory and
performance, associated with maintaining a large number of subdisks).
.It
A number of mappings between subdisks and plexes are available:
.Bl -bullet
.It
\fIConcatenated plexes\fP\| consist of one or more subdisks, each of which
is mapped to a contiguous part of the plex address space.
.It
\fIStriped plexes\fP\| consist of two or more subdisks of equal size.
The file address space is mapped in \fIstripes\fP, integral fractions of the
subdisk size.
Consecutive plex address space is mapped to stripes in each subdisk in
.if n turn.
.if t \{\
turn:
.PS
move right 2i
down
SD0: box
SD1: box
SD2: box
"plex 0" at SD0.n+(0,.2)
"subdisk 0" rjust at SD0.w-(.2,0)
"subdisk 1" rjust at SD1.w-(.2,0)
"subdisk 2" rjust at SD2.w-(.2,0)
.PE
.\}
The subdisks of a striped plex must all be the same size.
.It
\fIRAID-5 plexes\fP\| require at least three equal-sized subdisks.
They resemble striped plexes, except that in each stripe, one subdisk stores
parity information.
This subdisk changes in each stripe: in the first stripe, it is the first
subdisk, in the second it is the second subdisk, etc.
In the event of a single disk failure,
.Nm
will recover the data based on the information stored on the remaining
subdisks.
This mapping is particularly suited to read-intensive access.
The subdisks of a RAID-5 plex must all be the same size.
.\" Make sure to flush!
.El
.It
.Nm Drives
are the lowest level of the storage hierarchy.
They represent disk special devices.
.It
.Nm
offers automatic startup.
Unlike UNIX file systems,
.Nm
volumes contain all the configuration information needed to ensure that they
are started correctly when the subsystem is enabled.
This is also a significant advantage over the Veritas\(tm File System.
This feature concerns only the presence of the volumes.
It does not mean that the volumes will be mounted automatically, since the
standard startup procedures with
.Pa /etc/fstab
perform this function.
.El
.Sh KERNEL CONFIGURATION
.Nm
is currently supplied as a kernel loadable module (kld), and does not require
configuration.
As with other klds, it is absolutely necessary to match the kld to the version
of the operating system.
Failure to do so will cause
.Nm
to issue an error message and terminate.
.Pp
.Nm
is currently available in two versions: a freely available version which does
not contain RAID-5 functionality, and a full version including RAID-5
functionality, which is available from Cybernet Systems Inc.
(http://www.cybernet.com).
.Sh RUNNING VINUM
Normally, you start a configured version of
.Nm
at boot time.
Set the variable
.Ar vinum_drives
in
.Ar /etc/rc.conf
to indicate the slices on which
.Nm
drives are located.
For example, if you have
.Nm
drives on
.Ar /dev/da1h ,
.Ar /dev/da2h ,
.Ar /dev/da3h ,
.Ar /dev/da4h
and
.Ar /dev/da5h ,
you would set the variable to:
.Bd -literal
vinum_drives="/dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5"
.Ed
.Sh VINUM INSTALLATION
The freely available version of the
.Nm
kld is called
.Pa /modules/vinum.ko ,
and the RAID-5 version is
.Pa /modules/Vinum.o .
To load the module:
.Pp
.Bd -unfilled -offset indent
# kldload vinum
.Ed
.Pp
.Xr vinum 8
also automatically loads the kld module if it is not yet loaded.
.Pp
After loading
.Nm vinum ,
it must be configured.
In an existing installation, the following command reads the configuration
from an existing set of disks:
.Bd -unfilled -offset indent
# vinum read /dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5 /dev/da6
.Ed
.sp
This command must specify all of the devices used by
.Nm vinum .
.Xr vinum 8
reads the configuration from the device with the newest configuration file,
then updates it if necessary with additional information from successively
older configurations.
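.Pp
As a rough illustration, a startup script could derive the
.Cm read
command from
.Ar vinum_drives
along the following lines.
This is a sketch only, not the text of
.Pa /etc/rc ;
the helper name
.Li vinum_read_command
is hypothetical, and the command is printed rather than executed:

```shell
#!/bin/sh
# Sketch only: print the "vinum read" command for a list of drives instead of
# running it, so that it can be tried on a system without vinum.
# vinum_read_command is a hypothetical helper name, not part of vinum(8).
vinum_read_command() {
    # $1: space-separated drive list, in the format used for vinum_drives
    if [ -z "$1" ]; then
        echo "no vinum drives configured" >&2
        return 1
    fi
    echo "vinum read $1"
}

# Example value in the style used for /etc/rc.conf.
vinum_drives="/dev/da1 /dev/da2 /dev/da3 /dev/da4 /dev/da5"
vinum_read_command "$vinum_drives"
```

A real startup script would run the command instead of printing it, and would load the kld first if necessary.
.Pp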
These commands are normally embedded in the startup file
.Pa /etc/rc .
.Pp
See
.Xr vinum 8
for information on how to create a
.Nm
configuration.
.Pp
To unload the kld, first find the
.Ar Id
field in
.Pa kldstat :
.Bd -unfilled -offset indent
# kldstat
Id Refs Address    Size     Name
 1    2 0xf0100000 1c7de8   kernel
 2    1 0xf0f5b000 b0000    Vinum.ko
.Ed
.Pp
To unload the module, use
.Pa kldunload :
.Bd -unfilled -offset indent
# kldunload -n Vinum
.Ed
.Pp
The kld can only be unloaded when idle, in other words when no volumes are
mounted and no other instances of the
.Nm
program are active.
Unloading the kld does not harm the data in the volumes.
.Ss CONFIGURING AND STARTING OBJECTS
Use the
.Xr vinum 8
utility to configure and start
.Nm
objects.
.Sh IOCTL CALLS
.Pa ioctl
calls are intended for the use of the
.Nm
configuration program only.
They are described in the header file
.Pa /sys/sys/vinumio.h .
.Ss DISK LABELS
Conventional disk special devices have a
.Em disk label
in the second sector of the device.
See
.Xr disklabel 5
for more details.
This disk label describes the layout of the partitions within the device.
.Nm
does not subdivide volumes, so volumes do not contain a physical disk label.
For convenience,
.Nm
implements the ioctl calls DIOCGDINFO (get disk label), DIOCGPART (get
partition information), DIOCWDINFO (write partition information) and
DIOCSDINFO (set partition information).
DIOCGDINFO and DIOCGPART refer to an internal representation of the disk label
which is not present on the volume.
As a result, the
.Fl r
option of
.Xr disklabel 8 ,
which reads the ``raw disk'', will fail.
.Pp
In general,
.Xr disklabel 8
serves no useful purpose on a vinum volume.
If you run it, it will show you three partitions, a, b and c, all the same
except for the fstype, for example:
.Bd -unfilled -offset indent
3 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:     2048        0    4.2BSD     1024  8192     0   # (Cyl.    0 - 0)
  b:     2048        0      swap                        # (Cyl.    0 - 0)
  c:     2048        0    unused        0     0         # (Cyl.    0 - 0)
.Ed
.Pp
.Nm
ignores the DIOCWDINFO and DIOCSDINFO ioctls, since there is nothing to
change.
As a result, any attempt to modify the disk label will be silently ignored.
.Sh MAKING FILE SYSTEMS
Since
.Nm
volumes do not contain partitions, the names do not need to conform to the
standard rules for naming disk partitions.
For a physical disk partition, the last letter of the device name specifies
the partition identifier (a to h).
.Nm
volumes need not conform to this convention, but if they do not,
.Nm newfs
will complain that it cannot determine the partition.
To solve this problem, use the
.Fl v
flag to
.Nm newfs .
.Sh OBJECT NAMING
.Nm
assigns default names to plexes and subdisks, although they may be overridden.
We do not recommend overriding the default names.
Experience with the
.if t Veritas\(tm
.if n Veritas(tm)
volume manager, which allows arbitrary naming of objects, has shown that this
flexibility does not bring a significant advantage, and it can cause
confusion.
.sp
Names may contain any non-blank character, but it is recommended to restrict
them to letters, digits and the underscore character.
The names of volumes, plexes and subdisks may be up to 64 characters long, and
the names of drives may be up to 32 characters long.
When choosing volume and plex names, bear in mind that automatically generated
plex and subdisk names are longer than the name from which they are derived.
.Bl -bullet
.It
When
.Xr vinum 8
creates or deletes objects, it creates a directory
.Pa /dev/vinum ,
in which it makes device entries for each volume it finds.
It also creates subdirectories,
.Pa /dev/vinum/plex ,
.Pa /dev/vinum/sd
and
.Pa /dev/vinum/rsd ,
in which it stores device entries for the plexes and subdisks.
.Pa /dev/vinum/sd
contains block device entries, while
.Pa /dev/vinum/rsd
contains character device entries.
In addition, it creates two more directories,
.Pa /dev/vinum/vol
and
.Pa /dev/vinum/drive ,
in which it stores hierarchical information for volumes and drives.
.It
In addition,
.Nm
creates two super-devices,
.Pa /dev/vinum/control
and
.Pa /dev/vinum/controld .
These are used by
.Xr vinum 8
and the
.Nm
daemon respectively.
.It
Unlike
.Nm UNIX
drives,
.Nm
volumes are not subdivided into partitions, and thus do not contain a disk
label.
Unfortunately, this confuses a number of utilities, notably
.Nm newfs ,
which normally tries to interpret the last letter of a
.Nm
volume name as a partition identifier.
If you use a volume name which does not end in the letters
.Ar a
to
.Ar c ,
you must use the
.Fl v
flag to
.Nm newfs
in order to tell it to ignore this convention.
.\"
.It
Plexes do not need to be assigned explicit names.
By default, a plex name is the name of the volume followed by the letters
\f(CW.p\fR and the number of the plex.
For example, the plexes of volume
.Ar vol3
are called
.Ar vol3.p0 ,
.Ar vol3.p1
and so on.
These names can be overridden, but it is not recommended.
.br
.It
Like plexes, subdisks are assigned names automatically, and explicit naming is
discouraged.
A subdisk name is the name of the plex followed by the letters \f(CW.s\fR and
a number identifying the subdisk.
For example, the subdisks of plex
.Ar vol3.p0
are called
.Ar vol3.p0.s0 ,
.Ar vol3.p0.s1
and so on.
.br
.It
By contrast,
.Nm
drives must be named.
This makes it possible to move a drive to a different location and still
recognize it automatically.
Drive names may be up to 32 characters long.
.El
.Pp
EXAMPLE
.Pp
Assume the
.Nm
objects described in the section CONFIGURATION FILE in
.Xr vinum 8 .
The directory
.Ar /dev/vinum
looks like:
.Bd -unfilled -offset indent
# ls -lR /dev/vinum/ /dev/rvinum
total 5
brwxr-xr-- 1 root wheel 25, 2 Mar 30 16:08 concat
brwx------ 1 root wheel 25, 0x40000000 Mar 30 16:08 control
brwx------ 1 root wheel 25, 0x40000001 Mar 30 16:08 controld
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 drive
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 plex
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 rvol
drwxrwxrwx 2 root wheel 512 Mar 30 16:08 sd
brwxr-xr-- 1 root wheel 25, 3 Mar 30 16:08 strcon
brwxr-xr-- 1 root wheel 25, 1 Mar 30 16:08 stripe
brwxr-xr-- 1 root wheel 25, 0 Mar 30 16:08 tinyvol
drwxrwxrwx 7 root wheel 512 Mar 30 16:08 vol
brwxr-xr-- 1 root wheel 25, 4 Mar 30 16:08 vol5

/dev/vinum/drive:
total 0
brw-r----- 1 root operator 4, 15 Oct 21 16:51 drive2
brw-r----- 1 root operator 4, 31 Oct 21 16:51 drive4

/dev/vinum/plex:
total 0
brwxr-xr-- 1 root wheel 25, 0x10000002 Mar 30 16:08 concat.p0
brwxr-xr-- 1 root wheel 25, 0x10010002 Mar 30 16:08 concat.p1
brwxr-xr-- 1 root wheel 25, 0x10000003 Mar 30 16:08 strcon.p0
brwxr-xr-- 1 root wheel 25, 0x10010003 Mar 30 16:08 strcon.p1
brwxr-xr-- 1 root wheel 25, 0x10000001 Mar 30 16:08 stripe.p0
brwxr-xr-- 1 root wheel 25, 0x10000000 Mar 30 16:08 tinyvol.p0
brwxr-xr-- 1 root wheel 25, 0x10000004 Mar 30 16:08 vol5.p0
brwxr-xr-- 1 root wheel 25, 0x10010004 Mar 30 16:08 vol5.p1

/dev/vinum/rvol:
total 0
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 concat
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 strcon
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 stripe
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 tinyvol
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 vol5

/dev/vinum/sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000002 Mar 30 16:08 concat.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100002 Mar 30 16:08 concat.p0.s1
brwxr-xr-- 1 root wheel 25, 0x20010002 Mar 30 16:08 concat.p1.s0
brwxr-xr-- 1 root wheel 25, 0x20000003 Mar 30 16:08 strcon.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100003 Mar 30 16:08 strcon.p0.s1
brwxr-xr-- 1 root wheel 25, 0x20010003 Mar 30 16:08 strcon.p1.s0
brwxr-xr-- 1 root wheel 25, 0x20110003 Mar 30 16:08 strcon.p1.s1
brwxr-xr-- 1 root wheel 25, 0x20000001 Mar 30 16:08 stripe.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100001 Mar 30 16:08 stripe.p0.s1
brwxr-xr-- 1 root wheel 25, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100000 Mar 30 16:08 tinyvol.p0.s1
brwxr-xr-- 1 root wheel 25, 0x20000004 Mar 30 16:08 vol5.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100004 Mar 30 16:08 vol5.p0.s1
brwxr-xr-- 1 root wheel 25, 0x20010004 Mar 30 16:08 vol5.p1.s0
brwxr-xr-- 1 root wheel 25, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/vinum/vol:
total 5
brwxr-xr-- 1 root wheel 25, 2 Mar 30 16:08 concat
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 concat.plex
brwxr-xr-- 1 root wheel 25, 3 Mar 30 16:08 strcon
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 strcon.plex
brwxr-xr-- 1 root wheel 25, 1 Mar 30 16:08 stripe
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 stripe.plex
brwxr-xr-- 1 root wheel 25, 0 Mar 30 16:08 tinyvol
drwxr-xr-x 3 root wheel 512 Mar 30 16:08 tinyvol.plex
brwxr-xr-- 1 root wheel 25, 4 Mar 30 16:08 vol5
drwxr-xr-x 4 root wheel 512 Mar 30 16:08 vol5.plex

/dev/vinum/vol/concat.plex:
total 2
brwxr-xr-- 1 root wheel 25, 0x10000002 Mar 30 16:08 concat.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p0.sd
brwxr-xr-- 1 root wheel 25, 0x10010002 Mar 30 16:08 concat.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 concat.p1.sd

/dev/vinum/vol/concat.plex/concat.p0.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000002 Mar 30 16:08 concat.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100002 Mar 30 16:08 concat.p0.s1

/dev/vinum/vol/concat.plex/concat.p1.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20010002 Mar 30 16:08 concat.p1.s0

/dev/vinum/vol/strcon.plex:
total 2
brwxr-xr-- 1 root wheel 25, 0x10000003 Mar 30 16:08 strcon.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p0.sd
brwxr-xr-- 1 root wheel 25, 0x10010003 Mar 30 16:08 strcon.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 strcon.p1.sd
/dev/vinum/vol/strcon.plex/strcon.p0.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000003 Mar 30 16:08 strcon.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100003 Mar 30 16:08 strcon.p0.s1

/dev/vinum/vol/strcon.plex/strcon.p1.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20010003 Mar 30 16:08 strcon.p1.s0
brwxr-xr-- 1 root wheel 25, 0x20110003 Mar 30 16:08 strcon.p1.s1

/dev/vinum/vol/stripe.plex:
total 1
brwxr-xr-- 1 root wheel 25, 0x10000001 Mar 30 16:08 stripe.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 stripe.p0.sd

/dev/vinum/vol/stripe.plex/stripe.p0.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000001 Mar 30 16:08 stripe.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100001 Mar 30 16:08 stripe.p0.s1

/dev/vinum/vol/tinyvol.plex:
total 1
brwxr-xr-- 1 root wheel 25, 0x10000000 Mar 30 16:08 tinyvol.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 tinyvol.p0.sd

/dev/vinum/vol/tinyvol.plex/tinyvol.p0.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000000 Mar 30 16:08 tinyvol.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100000 Mar 30 16:08 tinyvol.p0.s1

/dev/vinum/vol/vol5.plex:
total 2
brwxr-xr-- 1 root wheel 25, 0x10000004 Mar 30 16:08 vol5.p0
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p0.sd
brwxr-xr-- 1 root wheel 25, 0x10010004 Mar 30 16:08 vol5.p1
drwxr-xr-x 2 root wheel 512 Mar 30 16:08 vol5.p1.sd

/dev/vinum/vol/vol5.plex/vol5.p0.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20000004 Mar 30 16:08 vol5.p0.s0
brwxr-xr-- 1 root wheel 25, 0x20100004 Mar 30 16:08 vol5.p0.s1

/dev/vinum/vol/vol5.plex/vol5.p1.sd:
total 0
brwxr-xr-- 1 root wheel 25, 0x20010004 Mar 30 16:08 vol5.p1.s0
brwxr-xr-- 1 root wheel 25, 0x20110004 Mar 30 16:08 vol5.p1.s1

/dev/rvinum:
crwxr-xr-- 1 root wheel 91, 2 Mar 30 16:08 rconcat
crwxr-xr-- 1 root wheel 91, 3 Mar 30 16:08 rstrcon
crwxr-xr-- 1 root wheel 91, 1 Mar 30 16:08 rstripe
crwxr-xr-- 1 root wheel 91, 0 Mar 30 16:08 rtinyvol
crwxr-xr-- 1 root wheel 91, 4 Mar 30 16:08 rvol5
.Ed
.Pp
In the case of unattached plexes and subdisks, the naming is reversed.
Subdisks are named after the disk on which they are located, and plexes are
named after the subdisk.
.\" XXX
.Nm This mapping is still to be determined.
.Ss OBJECT STATES
.Pp
Each
.Nm
object has a \fIstate\fR associated with it.
.Nm
uses this state to determine the handling of the object.
.Pp
.Ss VOLUME STATES
Volumes may have the following states:
.sp
.Bl -hang -width 14n
.It volume_down
The volume is completely inaccessible.
.It volume_up
The volume is up and at least partially functional.
Some plexes may be unavailable.
.El
.Ss "PLEX STATES"
Plexes may have the following states:
.sp
.ne 1i
.Bl -hang -width 14n
.It faulty
A plex which has gone completely down because of I/O errors.
.It down
A plex which has been taken down by the administrator.
.It initializing
A plex which is being initialized.
.sp
The remaining states represent plexes which are at least partially up.
.It corrupt
A plex entry which is at least partially up.
Not all subdisks are available, and an inconsistency has occurred.
If no other plex is uncorrupted, the volume is no longer consistent.
.It degraded
A RAID-5 plex entry which is accessible, but one subdisk is down, requiring
recovery for many I/O requests.
.It flaky
A plex which is really up, but which has a reborn subdisk which we don't
completely trust, and which we don't want to read if we can avoid it.
.It up
A plex entry which is completely up.
All subdisks are up.
.El
.sp 2v
.Ss "SUBDISK STATES"
Subdisks can have the following states:
.sp
.ne 1i
.Bl -hang -width 14n
.It empty
A subdisk entry which has been created completely.
All fields are correct, and the disk has been updated, but there is no data on
the disk.
.It initializing
A subdisk entry which has been created completely and which is currently being
initialized.
.sp
The following states represent invalid data.
.It obsolete
A subdisk entry which has been created completely.
All fields are correct, the config on disk has been updated, and the data was
valid, but since then the drive has been taken down, and as a result updates
have been missed.
.It stale
A subdisk entry which has been created completely.
All fields are correct, the disk has been updated, and the data was valid, but
since then the drive has crashed and updates have been lost.
.sp
The following states represent valid, inaccessible data.
.It crashed
A subdisk entry which has been created completely.
All fields are correct, the disk has been updated, and the data was valid, but
since then the drive has gone down.
No attempt has been made to write to the subdisk since the crash, so the data
is valid.
.It down
A subdisk entry which was up, which contained valid data, and which was taken
down by the administrator.
The data is valid.
.It reviving
The subdisk is currently in the process of being revived.
We can write but not read.
.sp
The following states represent accessible subdisks with valid data.
.It reborn
A subdisk entry which has been created completely.
All fields are correct, the disk has been updated, and the data was valid, but
since then the drive has gone down and up again.
No updates were lost, but it is possible that the subdisk has been damaged.
We won't read from this subdisk if we have a choice.
If this is the only subdisk which covers this address space in the plex, we
set its state to up under these circumstances, so this status implies that
there is another subdisk to fulfil the request.
.It up
A subdisk entry which has been created completely.
All fields are correct, the disk has been updated, and the data is valid.
.El
.sp 2v
.Ss "DRIVE STATES"
Drives can have the following states:
.sp
.ne 1i
.Bl -hang -width 14n
.It referenced
At least one subdisk refers to the drive, but it is not currently accessible
to the system.
.It down
The drive is not accessible.
.It up
The drive is up and running.
.El
.sp 2v
.Sh BUGS AND OMISSIONS
.Bl -enum
.It
.Nm
is a new product.
Many bugs can be expected.
The configuration mechanism is not yet fully functional.
If you have difficulties, please look at http://www.lemis.com/vinum_beta.html
and http://www.lemis.com/vinum_debugging.html before reporting problems.
.It
It is possible to configure
.Nm
statically, but it has never been tested in this form.
Don't even bother to report the problem if you have trouble with a static
.Nm
pseudo-device, unless you can also repeat it with the kld module.
.It
It is necessary to initialize RAID-5 plexes.
Failure to do so will not impede normal operation, but it will cause complete
corruption if one of the disks should fail.
I don't know any good way to enforce this initialization (or the even slower
alternative of rebuilding the parity blocks).
If anybody has a good idea, I'd be grateful for input.
.It
Detection of differences between the version of the kernel and the kld is not
yet implemented.
.El
.Sh DEBUGGING PROBLEMS WITH VINUM
.Pp
Solving problems with
.Nm
can be a difficult affair.
This section suggests some approaches.
.Ss Configuration problems
.Pp
It is relatively easy (too easy) to run into problems with the
.Nm
configuration.
If you do, the first thing you should do is stop configuration updates:
.if t .ps -3
.if t .vs -3
.Bd -literal
# vinum setdaemon 4
.Ed
.if t .vs
.if t .ps
.Pp
This will stop updates and any further corruption of the on-disk
configuration.
.Pp
Next, look at the on-disk configuration, using a Bourne-style shell:
.if t .ps -3
.if t .vs -3
.Bd -literal
# rm -f log
# for i in /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h; do
    (dd if=$i skip=8 count=6|tr -d '\e000-\e011\e200-\e377'; echo) >> log
  done
.Ed
.if t .vs
.if t .ps
.Pp
The names of the devices are the names of all
.Nm
slices.
The file
.Pa log
should then contain something like this:
.if t .ps -3
.if t .vs -3
.Bd -literal
IN VINOpanic.lemis.comdrive1}6E7~^K6T^Yfoovolume obj state up
volume src state up
volume raid state down
volume r state down
volume foo state up
plex name obj.p0 state corrupt org concat vol obj
plex name obj.p1 state corrupt org striped 128b vol obj
plex name src.p0 state corrupt org striped 128b vol src
plex name src.p1 state up org concat vol src
plex name raid.p0 state faulty org disorg vol raid
plex name r.p0 state faulty org disorg vol r
plex name foo.p0 state up org concat vol foo
plex name foo.p1 state faulty org concat vol foo
sd name obj.p0.s0 drive drive2 plex obj.p0 state reborn len 409600b driveoffset 265b plexoffset 0b
sd name obj.p0.s1 drive drive4 plex obj.p0 state up len 409600b driveoffset 265b plexoffset 409600b
sd name obj.p1.s0 drive drive1 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 0b
sd name obj.p1.s1 drive drive2 plex obj.p1 state reborn len 204800b driveoffset 409865b plexoffset 128b
sd name obj.p1.s2 drive drive3 plex obj.p1 state up len 204800b driveoffset 265b plexoffset 256b
sd name obj.p1.s3 drive drive4 plex obj.p1 state up len 204800b driveoffset 409865b plexoffset 384b
.Ed
.if t .vs
.if t .ps
.Pp
The first line contains the
.Nm
label and must start with the text
.Li IN VINO .
It also contains the name of the system.
The exact definition is contained in
.Pa /usr/src/sys/dev/vinum/vinumvar.h .
The saved configuration begins in the middle of the line, at the text
.Li volume obj state up ,
and is stored starting in sector 9 of the disk.
The rest of the output shows the remainder of the on-disk configuration.
It may be necessary to increase the
.Ar count
argument of
.Cm dd
in order to see the complete configuration.
.Pp
The configuration on all disks should be the same.
If this is not the case, please report the problem with the exact contents of
the file
.Pa log .
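.Pp
One way to check this is to save the output for each slice in a separate file
instead of appending everything to
.Pa log ,
and then to compare the files.
The following is a minimal sketch, not part of
.Nm ;
the function name
.Li compare_configs
and the file names are hypothetical.
It skips the first line of each file, on the assumption that the label there
may differ from drive to drive, so the entry that shares the first line with
the label is not compared:

```shell
#!/bin/sh
# Sketch only: check that configuration dumps taken from each drive match.
# compare_configs is a hypothetical helper; it is not provided by vinum(8).
compare_configs() {
    # $1 is the reference dump; every other argument is compared against it.
    reference=$1
    shift
    status=0
    for dump in "$@"; do
        # Compare from line 2 onwards; line 1 is assumed to hold the label.
        if [ "$(tail -n +2 "$reference")" != "$(tail -n +2 "$dump")" ]; then
            echo "$dump differs from $reference"
            status=1
        fi
    done
    if [ "$status" -eq 0 ]; then
        echo "all configurations match"
    fi
    return "$status"
}

# Usage, assuming one dump per slice was saved under these illustrative names:
#   compare_configs log.da0 log.da1 log.da2 log.da3
```

A non-zero return value means that at least one dump differs from the first one given.
.Pp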
There is probably little that can be done to recover the on-disk
configuration, but if you keep a copy of the files used to create the objects,
you should be able to re-create them.
The
.Cm create
command does not change the subdisk data, so this will not cause data
corruption.
You may need to use the
.Cm resetconfig
command if you have this kind of trouble.
.Ss Kernel Panics
.Pp
In order to analyse a panic which you suspect comes from
.Nm ,
you will need to build a debug kernel.
See the online handbook for more details of how to do this.
Be sure to include the
.Nm ddb
debugger.
To do this, put the following lines in your kernel configuration file:
.Bd -literal
options DDB
options BREAK_TO_DEBUGGER
.Ed
.Pp
You will need some additional steps to get symbolic information for the
.Nm
kernel loadable module:
.Bl -enum
.It
If possible, make a copy of or a link to the debug kernel at
.Pa /var/crash/kernel.gdb ,
since the
.Cm gdb
initialization file looks for it in this location.
.It
Make sure that you build the
.Nm
module with debugging information.
This is the normal situation with the standard
.Pa Makefile .
.It
After starting
.Nm ,
issue the following commands:
.if t .ps -3
.if t .vs -3
.Bd -literal
echo add-symbol-file /modules/vinum.ko \e
   0x`objdump --section-headers /modules/vinum.ko \e
      | grep ' .text' \e
      | awk '{print $4}'`\+`kldstat \e
      | grep vinum | awk '{print $3}'`
.Ed
.if t .vs
.if t .ps
.Pp
It's easiest to store this in a file, make it executable, and run it.
The output will be something like:
.if t .ps -3
.if t .vs -3
.Bd -literal
add-symbol-file /modules/vinum.ko 0x00005e24+0xf0f4e000
.Ed
.if t .vs
.if t .ps
.It
Copy the file
.Pa /usr/src/sys/modules/vinum/.gdbinit.crash
to the directory in which you will be performing the analysis, typically
.Pa /var/crash ,
and call it
.Pa .gdbinit .
.It
If the version of
.Nm
in
.Pa /modules
does not contain symbols, you will not get an error message, but the stack
trace will not show the symbols.
Check the module before starting
.Nm gdb :
.Bd -literal
$ file /modules/vinum.ko
/modules/vinum.ko: ELF 32-bit LSB shared object, Intel 80386,
  version 1 (FreeBSD), not stripped
.Ed
.Pp
If the output shows that
.Pa /modules/vinum.ko
is stripped, you will have to find a version which is not.
Usually this will be either in
.Pa /usr/obj/sys/modules/vinum/vinum.ko
(if you have built
.Nm
with a
.Ar make world )
or
.Pa /usr/src/sys/modules/vinum/vinum.ko
(if you have built
.Nm
in this directory).
.It
If you have not named your debug kernel
.Pa /var/crash/kernel.gdb ,
edit
.Pa .gdbinit
to indicate the correct location.
.Pp
If you are remote debugging via a serial connection, copy the file
.Pa /usr/src/sys/modules/vinum/.gdbinit.crash
as
.Pa .gdbinit
to the directory in which you perform the debugging, and start it with
.Bd -literal -offset indent
gdb -k
.Ed
.Pp
.Cm gdb
will automatically establish the connection; the remote machine must already
be in
.Nm gdb .
This
.Pa .gdbinit
file expects the serial connection to run at 38400 bits per second; if you run
at a different speed, edit the file accordingly (look for the
.Ar remotebaud
specification).
.It
Either take a dump or use
.Cm gdb
to analyse the problem.
Enter the output of the shell script shown above.
The following example shows a remote debugging session using the
.Ar debug
command of
.Xr vinum 8 :
.if t .ps -3
.if t .vs -3
.Bd -literal
(kgdb) add-symbol-file /usr/src/sys/modules/vinum/vinum.ko 0x00005e24+0xf0f4e000
add symbol table from file "/usr/src/sys/modules/vinum/vinum.ko" at
text_addr = 0xf0f53e24?
(y or n) y
(kgdb) bt
#0  Debugger (msg=0xf0f661ac "vinum debug") at ../../i386/i386/db_interface.c:318
#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
    flag=0x3, p=0xf688e6c0) at /usr/src/sys/modules/vinum/../../dev/vinum/vinumioctl.c:109
#2  0xf01833b7 in spec_ioctl (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:424
#3  0xf0182cc9 in spec_vnoperate (ap=0xf6923e0c) at ../../miscfs/specfs/spec_vnops.c:129
#4  0xf01eb3c1 in ufs_vnoperatespec (ap=0xf6923e0c) at ../../ufs/ufs/ufs_vnops.c:2312
#5  0xf017dbb1 in vn_ioctl (fp=0xf1007ec0, com=0xc008464b, data=0xf6923ed0 "",
    p=0xf688e6c0) at vnode_if.h:395
#6  0xf015dce0 in ioctl (p=0xf688e6c0, uap=0xf6923f84) at ../../kern/sys_generic.c:473
#7  0xf0214c0b in syscall (frame={tf_es = 0x27, tf_ds = 0x27, tf_edi = 0xefbfcff8,
      tf_esi = 0x1, tf_ebp = 0xefbfcf90, tf_isp = 0xf6923fd4, tf_ebx = 0x2,
      tf_edx = 0x804b614, tf_ecx = 0x8085d10, tf_eax = 0x36, tf_trapno = 0x7,
      tf_err = 0x2, tf_eip = 0x8060a34, tf_cs = 0x1f, tf_eflags = 0x286,
      tf_esp = 0xefbfcf78, tf_ss = 0x27}) at ../../i386/i386/trap.c:1100
#8  0xf020a1fc in Xint0x80_syscall ()
#9  0x804832d in ?? ()
#10 0x80482ad in ?? ()
#11 0x80480e9 in ?? ()
(kgdb) f 1
#1  0xf0f60a7c in vinumioctl (dev=0x40001900, cmd=0xc008464b, data=0xf6923ed0 "",
    flag=0x3, p=0xf688e6c0) at /usr/src/sys/modules/vinum/../../dev/vinum/vinumioctl.c:109
Source file is more recent than executable.
109         Debugger ("vinum debug");
.Ed
.if t .vs
.if t .ps
.Pp
When entering from the debugger, it's important that the source of frame 1
(listed at the bottom of the example) contains the text
.if t .ps -3
.if t .vs -3
.Bd -literal
Debugger ("vinum debug");
.Ed
.if t .vs
.if t .ps
.Pp
This is an indication that the address specifications are correct.
.El
.Pp
For an initial investigation, the most important information is the output of
the
.Cm bt
(backtrace) command above.
.Sh AUTHOR
Greg Lehey.
.Sh HISTORY
.Nm vinum
first appeared in FreeBSD 3.0.
.Sh SEE ALSO
.Xr vinum 8 ,
.Xr disklabel 5 ,
.Xr disklabel 8 .