asomers 442baa5184 zfsd(8), the ZFS fault management daemon
Add zfsd, which deals with hard drive faults in ZFS pools. It manages
hotspares and replacements in drive slots that publish physical paths.

cddl/usr.sbin/zfsd
	Add zfsd(8) and its unit tests

cddl/usr.sbin/Makefile
	Add zfsd to the build

lib/libdevdctl
	A C++ library that helps devd clients process events

lib/Makefile
share/mk/bsd.libnames.mk
share/mk/src.libnames.mk
	Add libdevdctl to the build. It's a private library, unusable by
	out-of-tree software.

etc/defaults/rc.conf
	By default, set zfsd_enable to NO

etc/mtree/BSD.include.dist
	Add a directory for libdevdctl's include files

etc/mtree/BSD.tests.dist
	Add a directory for zfsd's unit tests

etc/mtree/BSD.var.dist
	Add /var/db/zfsd/cases, where zfsd stores case files while it's shut
	down.

etc/rc.d/Makefile
etc/rc.d/zfsd
	Add zfsd's rc script

sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
	Fix the resource.fs.zfs.statechange message. It had a number of
	problems:

	It was only being emitted on a transition to the HEALTHY state.
	That made it impossible for zfsd to take actions based on drives
	getting sicker.

	It compared the new state to vdev_prevstate, which is the state that
	the vdev had the last time it was opened.  That doesn't make sense,
	because a vdev can change state multiple times without being
	reopened.

	vdev_set_state contains logic that will change the device's new
	state based on various conditions.  However, the statechange event
	was being posted _before_ that logic took effect.  Now it's being
	posted after.
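
	As a rough illustration of the intended ordering, here is a
	standalone C++ sketch.  It is not the kernel code: SetState, Vdev,
	adjust and post are hypothetical names, and posting only when the
	final state differs from the state at the time of the call is an
	assumption drawn from the description above.

```cpp
#include <functional>

// Hypothetical, simplified model of a vdev; not the kernel's vdev_t.
enum class VdevState { Offline, Faulted, Degraded, Healthy };

struct Vdev {
	VdevState state;
};

// "adjust" stands in for the logic that may override the requested
// state, and "post" stands in for emitting the statechange event.
void
SetState(Vdev &vd, VdevState requested,
    const std::function<VdevState(VdevState)> &adjust,
    const std::function<void(VdevState)> &post)
{
	// Remember the state at the time of the call, not the state the
	// vdev had when it was last opened.
	const VdevState old = vd.state;

	// Let the adjustment logic settle on the final state first.
	vd.state = adjust(requested);

	// Post only afterwards, and for any change, not just Healthy.
	if (vd.state != old)
		post(vd.state);
}
```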

Submitted by:	gibbs, asomers, mav, allanjude
Reviewed by:	mav, delphij
Relnotes:	yes
Sponsored by:	Spectra Logic Corp, iX Systems
Differential Revision:	https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00

.\"-
.\" Copyright (c) 2016 Allan Jude
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd May 26, 2016
.Dt ZFSD 8
.Os
.Sh NAME
.Nm zfsd
.Nd ZFS fault management daemon
.Sh SYNOPSIS
.Nm
.Op Fl d
.Sh DESCRIPTION
.Nm
attempts to resolve ZFS faults that the kernel can't resolve by itself.
It listens to
.Xr devctl 4
events, which are how the kernel notifies userland of events such as I/O
errors and disk removals.
.Nm
attempts to resolve these faults by activating or deactivating hot spares
and onlining offline vdevs.
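.Pp
As an illustration only, the sketch below splits one notification line
into key/value pairs.
It assumes that a notification is a single line starting with
.Ql !\&
followed by space-separated
.Ar key Ns = Ns Ar value
tokens.
The function name
.Fn ParseEvent
and the sample fields are hypothetical, and
.Nm
itself may process events differently.
.Bd -literal
```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of zfsd or libdevdctl.
// Splits a notification such as
//   !system=ZFS subsystem=ZFS type=resource.fs.zfs.statechange
// into a key/value map.  Returns an empty map if the line does not
// start with '!'.
std::map<std::string, std::string>
ParseEvent(const std::string &line)
{
	std::map<std::string, std::string> fields;

	if (line.empty() || line[0] != '!')
		return (fields);

	std::istringstream tokens(line.substr(1));
	std::string token;
	while (tokens >> token) {
		std::string::size_type eq = token.find('=');
		if (eq == std::string::npos || eq == 0)
			continue;	// Skip tokens that are not key=value.
		fields[token.substr(0, eq)] = token.substr(eq + 1);
	}
	return (fields);
}
```
.Ed
.Pp
Quoted values that contain spaces are not handled by this sketch.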
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl d
Run in the foreground instead of daemonizing.
.El
.Pp
System administrators never interact with
.Nm
directly.
Instead, they control its behavior indirectly through zpool configuration.
There are two ways to influence
.Nm :
assigning hotspares and setting pool properties.
Currently, only the
.Em autoreplace
property has any effect.
See
.Xr zpool 8
for details.
.Pp
.Nm
will attempt to resolve the following types of fault:
.Bl -tag -width a
.It Device removal
When a leaf vdev disappears,
.Nm
will activate any available hotspare.
.It Device arrival
When a new GEOM device appears,
.Nm
will attempt to read its ZFS label, if any.
If it matches a previously removed vdev on an active pool,
.Nm
will online it.
Once resilvering completes, any active hotspare will detach automatically.
.Pp
If the new device has no ZFS label but its physical path matches the
physical path of a previously removed vdev on an active pool, and that
pool has the autoreplace property set, then
.Nm
will replace the missing vdev with the newly arrived device.
Once resilvering completes, any active hotspare will detach automatically.
.It vdev degrade or fault events
If a vdev becomes degraded or faulted,
.Nm
will activate any available hotspare.
.It I/O errors
If a leaf vdev generates more than 50 I/O errors in a 60-second period, then
.Nm
will mark that vdev as
.Em FAULTED .
.Xr zfs 4
will no longer issue any I/Os to it.
.Nm
will activate a hotspare if one is available.
.It Checksum errors
If a leaf vdev generates more than 50 checksum errors in a 60-second
period, then
.Nm
will mark that vdev as
.Em DEGRADED .
.Xr zfs 4
will still use it, but
.Nm
will activate a spare anyway.
.It Spare addition
If the system administrator adds a hotspare to a pool that is already degraded,
.Nm
will activate the spare.
.It Resilver complete
.Nm
will detach any hotspare once a permanent replacement finishes resilvering.
.It Physical path change
If the physical path of an existing disk changes,
.Nm
will attempt to replace any missing disk with the same physical path,
if its pool's autoreplace property is set.
.El
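.Pp
The error thresholds above can be pictured as a sliding time window.
The following is a minimal sketch under that assumption; the class
.Vt ErrorWindow
and its members are hypothetical, and
.Nm
may count errors differently.
.Bd -literal
```cpp
#include <cstddef>
#include <deque>

// Hypothetical sketch; class and member names are not zfsd's.
class ErrorWindow {
public:
	ErrorWindow(std::size_t threshold, long windowSeconds)
	    : m_threshold(threshold), m_window(windowSeconds) {}

	// Record one error at time "now" (in seconds).  Returns true
	// once more than m_threshold errors fall within the trailing
	// window of m_window seconds.
	bool Record(long now)
	{
		m_times.push_back(now);
		// Drop errors that have aged out of the window.
		while (!m_times.empty() &&
		    now - m_times.front() >= m_window)
			m_times.pop_front();
		return (m_times.size() > m_threshold);
	}

private:
	std::size_t m_threshold;
	long m_window;
	std::deque<long> m_times;
};
```
.Ed
.Pp
In this model, each leaf vdev would keep one
.Li ErrorWindow(50, 60)
for I/O errors, leading to
.Em FAULTED ,
and another for checksum errors, leading to
.Em DEGRADED .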
.Pp
.Nm
will log interesting events and its actions to syslog with facility
.Em daemon
and identity
.Em zfsd .
.Sh FILES
.Bl -tag -width a -compact
.It Pa /var/db/zfsd/cases
When
.Nm
exits, it serializes any unresolved case files here
and reads them back in the next time it starts.
.El
.Sh SEE ALSO
.Xr devctl 4 ,
.Xr zfs 4 ,
.Xr zpool 8
.Sh HISTORY
.Nm
first appeared in
.Fx 11.0 .
.Sh AUTHORS
.Nm
was originally written by
.An Justin Gibbs Aq Mt gibbs@FreeBSD.org
and
.An Alan Somers Aq Mt asomers@FreeBSD.org .
.Sh TODO
In the future,
.Nm
should be able to resume a pool that became suspended due to device
removals, if enough missing devices have returned.