freebsd-dev/cddl/usr.sbin/zfsd/zfsd.8

155 lines
4.5 KiB
Groff
Raw Normal View History

zfsd(8), the ZFS fault management daemon Add zfsd, which deals with hard drive faults in ZFS pools. It manages hotspares and replements in drive slots that publish physical paths. cddl/usr.sbin/zfsd Add zfsd(8) and its unit tests cddl/usr.sbin/Makefile Add zfsd to the build lib/libdevdctl A C++ library that helps devd clients process events lib/Makefile share/mk/bsd.libnames.mk share/mk/src.libnames.mk Add libdevdctl to the build. It's a private library, unusable by out-of-tree software. etc/defaults/rc.conf By default, set zfsd_enable to NO etc/mtree/BSD.include.dist Add a directory for libdevdctl's include files etc/mtree/BSD.tests.dist Add a directory for zfsd's unit tests etc/mtree/BSD.var.dist Add /var/db/zfsd/cases, where zfsd stores case files while it's shut down. etc/rc.d/Makefile etc/rc.d/zfsd Add zfsd's rc script sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Fix the resource.fs.zfs.statechange message. It had a number of problems: It was only being emitted on a transition to the HEALTHY state. That made it impossible for zfsd to take actions based on drives getting sicker. It compared the new state to vdev_prevstate, which is the state that the vdev had the last time it was opened. That doesn't make sense, because a vdev can change state multiple times without being reopened. vdev_set_state contains logic that will change the device's new state based on various conditions. However, the statechange event was being posted _before_ that logic took effect. Now it's being posted after. Submitted by: gibbs, asomers, mav, allanjude Reviewed by: mav, delphij Relnotes: yes Sponsored by: Spectra Logic Corp, iX Systems Differential Revision: https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00
.\"-
.\" Copyright (c) 2016 Allan Jude
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd April 18, 2020
zfsd(8), the ZFS fault management daemon Add zfsd, which deals with hard drive faults in ZFS pools. It manages hotspares and replements in drive slots that publish physical paths. cddl/usr.sbin/zfsd Add zfsd(8) and its unit tests cddl/usr.sbin/Makefile Add zfsd to the build lib/libdevdctl A C++ library that helps devd clients process events lib/Makefile share/mk/bsd.libnames.mk share/mk/src.libnames.mk Add libdevdctl to the build. It's a private library, unusable by out-of-tree software. etc/defaults/rc.conf By default, set zfsd_enable to NO etc/mtree/BSD.include.dist Add a directory for libdevdctl's include files etc/mtree/BSD.tests.dist Add a directory for zfsd's unit tests etc/mtree/BSD.var.dist Add /var/db/zfsd/cases, where zfsd stores case files while it's shut down. etc/rc.d/Makefile etc/rc.d/zfsd Add zfsd's rc script sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Fix the resource.fs.zfs.statechange message. It had a number of problems: It was only being emitted on a transition to the HEALTHY state. That made it impossible for zfsd to take actions based on drives getting sicker. It compared the new state to vdev_prevstate, which is the state that the vdev had the last time it was opened. That doesn't make sense, because a vdev can change state multiple times without being reopened. vdev_set_state contains logic that will change the device's new state based on various conditions. However, the statechange event was being posted _before_ that logic took effect. Now it's being posted after. Submitted by: gibbs, asomers, mav, allanjude Reviewed by: mav, delphij Relnotes: yes Sponsored by: Spectra Logic Corp, iX Systems Differential Revision: https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00
.Dt ZFSD 8
.Os
.Sh NAME
.Nm zfsd
.Nd ZFS fault management daemon
.Sh SYNOPSIS
.Nm
.Op Fl d
.Sh DESCRIPTION
.Nm
attempts to resolve ZFS faults that the kernel can't resolve by itself.
It listens to
.Xr devctl 4
events, which are how the kernel notifies userland of events such as I/O
errors and disk removals.
.Nm
attempts to resolve these faults by activating or deactivating hot spares
and onlining offline vdevs.
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl d
Run in the foreground instead of daemonizing.
.El
.Pp
System administrators never interact with
.Nm
directly.
Instead, they control its behavior indirectly through zpool configuration.
There are two ways to influence
.Nm :
assigning hotspares and setting pool properties.
Currently, only the
.Em autoreplace
property has any effect.
See
.Xr zpool 8
for details.
.Pp
.Nm
will attempt to resolve the following types of fault:
.Bl -tag -width a
.It device removal
When a leaf vdev disappears,
.Nm
will activate any available hotspare.
.It device arrival
When a new GEOM device appears,
.Nm
will attempt to read its ZFS label, if any.
If it matches a previously removed vdev on an active pool,
.Nm
will online it.
Once resilvering completes, any active hotspare will detach automatically.
.Pp
If the new device has no ZFS label but its physical path matches the
physical path of a previously removed vdev on an active pool, and that
pool has the autoreplace property set, then
.Nm
will replace the missing vdev with the newly arrived device.
Once resilvering completes, any active hotspare will detach automatically.
.It vdev degrade or fault events
If a vdev becomes degraded or faulted,
.Nm
will activate any available hotspare.
.It I/O errors
If a leaf vdev generates more than 50 I/O errors in a 60 second period, then
.Nm
will mark that vdev as
.Em FAULTED .
ZFS will no longer issue any I/Os to it.
zfsd(8), the ZFS fault management daemon Add zfsd, which deals with hard drive faults in ZFS pools. It manages hotspares and replements in drive slots that publish physical paths. cddl/usr.sbin/zfsd Add zfsd(8) and its unit tests cddl/usr.sbin/Makefile Add zfsd to the build lib/libdevdctl A C++ library that helps devd clients process events lib/Makefile share/mk/bsd.libnames.mk share/mk/src.libnames.mk Add libdevdctl to the build. It's a private library, unusable by out-of-tree software. etc/defaults/rc.conf By default, set zfsd_enable to NO etc/mtree/BSD.include.dist Add a directory for libdevdctl's include files etc/mtree/BSD.tests.dist Add a directory for zfsd's unit tests etc/mtree/BSD.var.dist Add /var/db/zfsd/cases, where zfsd stores case files while it's shut down. etc/rc.d/Makefile etc/rc.d/zfsd Add zfsd's rc script sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Fix the resource.fs.zfs.statechange message. It had a number of problems: It was only being emitted on a transition to the HEALTHY state. That made it impossible for zfsd to take actions based on drives getting sicker. It compared the new state to vdev_prevstate, which is the state that the vdev had the last time it was opened. That doesn't make sense, because a vdev can change state multiple times without being reopened. vdev_set_state contains logic that will change the device's new state based on various conditions. However, the statechange event was being posted _before_ that logic took effect. Now it's being posted after. Submitted by: gibbs, asomers, mav, allanjude Reviewed by: mav, delphij Relnotes: yes Sponsored by: Spectra Logic Corp, iX Systems Differential Revision: https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00
.Nm
will activate a hotspare if one is available.
.It Checksum errors
If a leaf vdev generates more than 50 checksum errors in a 60 second
period, then
.Nm
will mark that vdev as
.Em DEGRADED .
ZFS will still use it, but zfsd will activate a spare anyway.
zfsd(8), the ZFS fault management daemon Add zfsd, which deals with hard drive faults in ZFS pools. It manages hotspares and replements in drive slots that publish physical paths. cddl/usr.sbin/zfsd Add zfsd(8) and its unit tests cddl/usr.sbin/Makefile Add zfsd to the build lib/libdevdctl A C++ library that helps devd clients process events lib/Makefile share/mk/bsd.libnames.mk share/mk/src.libnames.mk Add libdevdctl to the build. It's a private library, unusable by out-of-tree software. etc/defaults/rc.conf By default, set zfsd_enable to NO etc/mtree/BSD.include.dist Add a directory for libdevdctl's include files etc/mtree/BSD.tests.dist Add a directory for zfsd's unit tests etc/mtree/BSD.var.dist Add /var/db/zfsd/cases, where zfsd stores case files while it's shut down. etc/rc.d/Makefile etc/rc.d/zfsd Add zfsd's rc script sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c Fix the resource.fs.zfs.statechange message. It had a number of problems: It was only being emitted on a transition to the HEALTHY state. That made it impossible for zfsd to take actions based on drives getting sicker. It compared the new state to vdev_prevstate, which is the state that the vdev had the last time it was opened. That doesn't make sense, because a vdev can change state multiple times without being reopened. vdev_set_state contains logic that will change the device's new state based on various conditions. However, the statechange event was being posted _before_ that logic took effect. Now it's being posted after. Submitted by: gibbs, asomers, mav, allanjude Reviewed by: mav, delphij Relnotes: yes Sponsored by: Spectra Logic Corp, iX Systems Differential Revision: https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00
.It Spare addition
If the system administrator adds a hotspare to a pool that is already degraded,
.Nm
will activate the spare.
.It Resilver complete
.Nm
will detach any hotspare once a permanent replacement finishes resilvering.
.It Physical path change
If the physical path of an existing disk changes,
.Nm
will attempt to replace any missing disk with the same physical path,
if its pool's autoreplace property is set.
.El
.Pp
.Nm
will log interesting events and its actions to syslog with facility
.Em daemon
and identity
.Op zfsd .
.El
.Sh FILES
.Bl -tag -width a -compact
.It Pa /var/db/zfsd/cases
When
.Nm
exits, it serializes any unresolved casefiles here,
then reads them back in when next it starts up.
.El
.Sh SEE ALSO
.Xr devctl 4 ,
.Xr zpool 8
.Sh HISTORY
.Nm
first appeared in
.Fx 11.0 .
.Sh AUTHORS
.Nm
was originally written by
.An Justin Gibbs Aq Mt gibbs@FreeBSD.org
and
.An Alan Somers Aq Mt asomers@FreeBSD.org
.Sh TODO
In the future,
.Nm
should be able to resume a pool that became suspended due to device
removals, if enough missing devices have returned.