.\" $NetBSD: raid.4,v 1.16 2000/11/02 03:34:08 oster Exp $
.\"
.\" Copyright (c) 1998 The NetBSD Foundation, Inc.
.\" All rights reserved.
.\"
.\" This code is derived from software contributed to The NetBSD Foundation
.\" by Greg Oster
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgement:
.\" This product includes software developed by the NetBSD
.\" Foundation, Inc. and its contributors.
.\" 4. Neither the name of The NetBSD Foundation nor the names of its
.\" contributors may be used to endorse or promote products derived
.\" from this software without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE NETBSD FOUNDATION, INC. AND CONTRIBUTORS
.\" ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR CONTRIBUTORS
.\" BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.\"
.\" Copyright (c) 1995 Carnegie-Mellon University.
.\" All rights reserved.
.\"
.\" Author: Mark Holland
.\"
.\" Permission to use, copy, modify and distribute this software and
.\" its documentation is hereby granted, provided that both the copyright
.\" notice and this permission notice appear in all copies of the
.\" software, derivative works or modified versions, and any portions
.\" thereof, and that both notices appear in supporting documentation.
.\"
.\" CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
.\" CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
.\" FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
.\"
.\" Carnegie Mellon requests users of this software to return to
.\"
.\" Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU
.\" School of Computer Science
.\" Carnegie Mellon University
.\" Pittsburgh PA 15213-3890
.\"
.\" any improvements or extensions that they make and grant Carnegie the
.\" rights to redistribute these changes.
.\"
.\" $FreeBSD$
.\"
.Dd October 20, 2002
.Dt RAID 4
.Os
.Sh NAME
.Nm raid
.Nd RAIDframe disk driver
.Sh SYNOPSIS
.Cd "device raidframe"
.Sh DESCRIPTION
The
.Nm
driver provides RAID 0, 1, 4, and 5 (and more!) capabilities to
.Fx .
This
document assumes that the reader has at least some familiarity with
RAID concepts.
The reader is also assumed to know how to configure
disks and add devices into kernels, how to generate kernels, and how
to partition disks.
.Pp
RAIDframe provides a number of different RAID levels including:
.Bl -item
.It
RAID 0 provides simple data striping across the components.
.It
RAID 1 provides mirroring.
.It
RAID 4 provides data striping across the components, with parity
stored on a dedicated drive (in this case, the last component).
.It
RAID 5 provides data striping across the components, with parity
distributed across all the components.
.El
.Pp
RAIDframe supports a wide variety of other RAID levels,
including Even-Odd parity, RAID level 5 with rotated sparing, Chained
declustering, and Interleaved declustering.
The reader is referred
to the RAIDframe documentation mentioned in the
.Sx HISTORY
section for more detail on these various RAID configurations.
.Pp
Depending on the RAID level configured, the device driver can
tolerate the failure of component drives.
The number of failures tolerated depends on the level selected.
If the driver is able
to handle drive failures, and a drive does fail, then the RAID set
operates in
.Dq "degraded mode" .
In this mode, all missing data must be
reconstructed from the data and parity present on the other
components.
This results in much slower data accesses, but
does mean that a failure need not bring the system to a complete halt.
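.Pp
As a worked illustration of such reconstruction, assume the bytewise
exclusive-OR parity commonly used by RAID 4 and RAID 5, and a stripe
holding three data bytes (the values are arbitrary and chosen only for
this example):
.Bd -literal -offset indent
D0 = 0x0f, D1 = 0x35, D2 = 0xa2
P  = D0 xor D1 xor D2 = 0x98
with D1 lost:
D1 = D0 xor D2 xor P  = 0x35
.Ed
.Pp
Under this assumption, every read of the missing component requires
reading all of the surviving components in the stripe, which is why
degraded-mode accesses are slower.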
.Pp
The RAID driver supports and enforces the use of
.Dq "component labels" .
A
.Dq "component label"
contains important information about the component, including a
user-specified serial number, the row and column of that component in
the RAID set, and whether the data (and parity) on the component is
.Dq clean .
If the driver determines that the labels are very inconsistent with
respect to each other (e.g., two or more serial numbers do not match)
or that the component label is not consistent with its assigned place
in the set (e.g., the component label claims the component should be
the 3rd one in a 6-disk set, but the RAID set has it as the 3rd component
in a 5-disk set) then the device will fail to configure.
If the
driver determines that exactly one component label seems to be
incorrect, and the RAID set is being configured as a set that supports
a single failure, then the RAID set will be allowed to configure, but
the incorrectly labeled component will be marked as
.Dq failed ,
and the RAID set will begin operation in degraded mode.
If all of the components are consistent among themselves, the RAID set
will configure normally.
.Pp
Component labels are also used to support the auto-detection and
auto-configuration of RAID sets.
A RAID set can be flagged as
auto-configurable, in which case it will be configured automatically
during the kernel boot process.
RAID file systems which are
automatically configured are also eligible to be the root file system.
There is currently only limited support (alpha and pmax architectures)
for booting a kernel directly from a RAID 1 set, and no support for
booting from any other RAID sets.
To use a RAID set as the root
file system, a kernel is usually obtained from a small non-RAID
partition, after which any auto-configuring RAID set can be used for the
root file system.
See
.Xr raidctl 8
for more information on auto-configuration of RAID sets.
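.Pp
As a minimal sketch, assuming an already configured RAID set named
.Pa raid0
(a hypothetical unit) and the
.Fl A
option as described in
.Xr raidctl 8 ,
auto-configuration might be enabled as follows:
.Bd -literal -offset indent
# mark raid0 as auto-configurable at boot
raidctl -A yes raid0
# alternatively, also make raid0 eligible to be the root file system
raidctl -A root raid0
.Ed
.Pp
The exact syntax should be checked against the installed
.Xr raidctl 8 .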
.Pp
The driver supports
.Dq "hot spares" ,
disks which are on-line, but are not
actively used in an existing file system.
Should a disk fail, the
driver is capable of reconstructing the failed disk onto a hot spare
or back onto a replacement drive.
If the components are hot swappable, the failed disk can then be
removed, a new disk put in its place, and a copyback operation
performed.
The copyback operation, as its name indicates, will copy
the reconstructed data from the hot spare to the previously failed
(and now replaced) disk.
Hot spares can also be hot-added using
.Xr raidctl 8 .
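.Pp
The sequence above might look as follows.
This is only a sketch: the names
.Pa raid0 ,
.Pa /dev/da3e ,
and
.Pa /dev/da1e
are hypothetical, and the
.Fl a ,
.Fl F ,
and
.Fl B
options are assumed to behave as documented in
.Xr raidctl 8 :
.Bd -literal -offset indent
# hot-add /dev/da3e as a spare for raid0
raidctl -a /dev/da3e raid0
# fail /dev/da1e and reconstruct its contents onto the spare
raidctl -F /dev/da1e raid0
# after physically replacing the failed disk, copy the data back
raidctl -B raid0
.Ed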
.Pp
If a component cannot be detected when the RAID device is configured,
that component will simply be marked as
.Dq failed .
.Pp
The userland utility for doing all
.Nm
configuration and other operations
is
.Xr raidctl 8 .
Most importantly,
.Xr raidctl 8
must be used with the
.Fl i
option to initialize all RAID sets.
In particular, this initialization includes re-building the parity data.
This rebuilding
of parity data is also required either a) when a new RAID device is
brought up for the first time or b) after an unclean shutdown of a
RAID device.
By using the
.Fl P
option to
.Xr raidctl 8 ,
and performing this on-demand recomputation of all parity
before doing an
.Xr fsck 8
or a
.Xr newfs 8 ,
file system integrity and parity integrity can be ensured.
It bears
repeating that parity recomputation is
.Em required
before any file systems are created or used on the RAID device.
If the
parity is not correct, then missing data cannot be correctly recovered.
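.Pp
For illustration only, assuming a RAID set named
.Pa raid0
(a hypothetical unit), the ordering described above could be sketched as:
.Bd -literal -offset indent
# first use of a new RAID set: initialize (rewrite) all parity
raidctl -i raid0
# after an unclean shutdown: check parity and rewrite it if needed
raidctl -P raid0
# only after one of the above completes, run fsck(8) or newfs(8)
.Ed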
.Pp
RAID levels may be combined in a hierarchical fashion.
For example, a RAID 0
device can be constructed out of a number of RAID 5 devices (which, in turn,
may be constructed out of the physical disks, or of other RAID devices).
.Pp
It is important that drives be hard-coded at their respective
addresses (i.e., not left free-floating, where a drive with SCSI ID of
4 can end up as
.Pa /dev/da0c )
for well-behaved functioning of the RAID
device.
This is true for all types of drives, including IDE, SCSI,
etc.
For IDE drives, use the
.Cd "options ATA_STATIC_ID"
in your kernel config file.
For SCSI, you should
.Dq "wire down"
the devices according to their ID.
See
.Xr cam 4
for examples of this.
The rationale for fixing the device addresses
is as follows: consider a system with three SCSI drives at SCSI IDs
4, 5, and 6, which map to components
.Pa /dev/da0e , /dev/da1e ,
and
.Pa /dev/da2e
of a RAID 5 set.
If the drive with SCSI ID 5 fails, and the
system reboots, the old
.Pa /dev/da2e
will show up as
.Pa /dev/da1e .
The RAID
driver is able to detect that component positions have changed, and
will not allow normal configuration.
If the device addresses are hard-coded,
however, the RAID driver would detect that the middle component
is unavailable, and bring the RAID 5 set up in degraded mode.
Note
that the auto-detection and auto-configuration code does not care
about where the components live.
The auto-configuration code will
correctly configure a device even after any number of the components
have been re-arranged.
.Pp
The first step to using the
.Nm
driver is to ensure that it is suitably configured in the kernel.
This is done by adding the
.Pp
.D1 Cd "device raidframe"
.Pp
line to the kernel configuration file.
No count argument is required as the
driver will automatically create and configure new device units as needed.
To turn on component auto-detection and auto-configuration of RAID
sets, simply add
.Pp
.D1 Cd "options RAID_AUTOCONFIG"
.Pp
to the kernel configuration file.
.Pp
All component partitions must be of the type
.Dv FS_BSDFFS
(e.g.,
.Cm 4.2BSD )
or
.Dv FS_RAID .
The use of the latter is strongly encouraged, and is required if
auto-configuration of the RAID set is desired.
Since RAIDframe leaves
room for disk labels, RAID components can be simply raw disks, or
partitions which use an entire disk.
.Pp
A more detailed treatment of actually using a
.Nm
device is found in
.Xr raidctl 8 .
It is highly recommended that the steps to reconstruct, copyback, and
re-compute parity are well understood by the system administrator(s)
.Em before
a component failure.
Doing the wrong thing when a component fails may
result in data loss.
.Sh WARNINGS
Certain RAID levels (1, 4, 5, 6, and others) can protect against some
data loss due to component failure.
However, the loss of two
components of a RAID 4 or 5 system, or the loss of a single component
of a RAID 0 system, will result in all file systems on that RAID
device being lost.
RAID is
.Em NOT
a substitute for good backup practices.
.Pp
Recomputation of parity
.Em MUST
be performed whenever there is a chance that it may have been
compromised.
This includes after system crashes, or before a RAID
device has been used for the first time.
Failure to keep parity
correct will be catastrophic should a component ever fail \[em] it is
better to use RAID 0 and get the additional space and speed, than it
is to use parity, but not keep the parity correct.
At least with RAID 0 there is no perception of increased data security.
.Sh FILES
.Bl -tag -width ".Pa /dev/raid*" -compact
.It Pa /dev/raid*
The
.Nm
device special files.
.El
.Sh SEE ALSO
.Xr config 8 ,
.Xr fsck 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr raidctl 8
.Sh HISTORY
The
.Nm
driver in
.Fx
is a port of RAIDframe, a framework for rapid prototyping of RAID
structures developed by the folks at the Parallel Data Laboratory at
Carnegie Mellon University (CMU).
RAIDframe, as originally distributed
by CMU, provides a RAID simulator for a number of different
architectures, and a user-level device driver and a kernel device
driver for Digital
.Ux .
The
.Nm
driver is a kernelized version of RAIDframe v1.1, based on the
.Nx
port of RAIDframe by
.An "Greg Oster" .
.Pp
A more complete description of the internals and functionality of
RAIDframe is found in the paper
.%T "RAIDframe: A Rapid Prototyping Tool for RAID Systems" ,
by
.An "William V. Courtright II" , "Garth Gibson" , "Mark Holland" ,
.An "LeAnn Neal Reilly" ,
and
.An "Jim Zelenka" ,
and published by the
Parallel Data Laboratory of Carnegie Mellon University.
The
.Nm
driver first appeared in
.Fx 4.4 .
.Sh COPYRIGHT
.Bd -unfilled
The RAIDframe Copyright is as follows:
Copyright (c) 1994-1996 Carnegie-Mellon University.
All rights reserved.
Permission to use, copy, modify and distribute this software and
its documentation is hereby granted, provided that both the copyright
notice and this permission notice appear in all copies of the
software, derivative works or modified versions, and any portions
thereof, and that both notices appear in supporting documentation.
CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS"
CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND
FOR ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE.
Carnegie Mellon requests users of this software to return to
Software Distribution Coordinator or Software.Distribution@CS.CMU.EDU
School of Computer Science
Carnegie Mellon University
Pittsburgh PA 15213-3890
any improvements or extensions that they make and grant Carnegie the
rights to redistribute these changes.
.Ed