zfsd(8), the ZFS fault management daemon
Add zfsd, which deals with hard drive faults in ZFS pools. It manages
hotspares and replacements in drive slots that publish physical paths.
cddl/usr.sbin/zfsd
Add zfsd(8) and its unit tests
cddl/usr.sbin/Makefile
Add zfsd to the build
lib/libdevdctl
A C++ library that helps devd clients process events
lib/Makefile
share/mk/bsd.libnames.mk
share/mk/src.libnames.mk
Add libdevdctl to the build. It's a private library, unusable by
out-of-tree software.
etc/defaults/rc.conf
By default, set zfsd_enable to NO
etc/mtree/BSD.include.dist
Add a directory for libdevdctl's include files
etc/mtree/BSD.tests.dist
Add a directory for zfsd's unit tests
etc/mtree/BSD.var.dist
Add /var/db/zfsd/cases, where zfsd stores case files while it's shut
down.
etc/rc.d/Makefile
etc/rc.d/zfsd
Add zfsd's rc script
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c
Fix the resource.fs.zfs.statechange message. It had a number of
problems:
It was only being emitted on a transition to the HEALTHY state.
That made it impossible for zfsd to take actions based on drives
getting sicker.
It compared the new state to vdev_prevstate, which is the state that
the vdev had the last time it was opened. That doesn't make sense,
because a vdev can change state multiple times without being
reopened.
vdev_set_state contains logic that will change the device's new
state based on various conditions. However, the statechange event
was being posted _before_ that logic took effect. Now it's being
posted after.
Submitted by: gibbs, asomers, mav, allanjude
Reviewed by: mav, delphij
Relnotes: yes
Sponsored by: Spectra Logic Corp, iX Systems
Differential Revision: https://reviews.freebsd.org/D6564
2016-05-28 17:43:40 +00:00
/*-
 * Copyright (c) 2011, 2012, 2013 Spectra Logic Corporation
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions, and the following disclaimer,
 *    without modification.
 * 2. Redistributions in binary form must reproduce at minimum a disclaimer
 *    substantially similar to the "NO WARRANTY" disclaimer below
 *    ("Disclaimer") and any redistribution must be conditioned upon
 *    including a substantially similar Disclaimer requirement for further
 *    binary redistribution.
 *
 * NO WARRANTY
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR
 * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 * HOLDERS OR CONTRIBUTORS BE LIABLE FOR SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
 * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 * POSSIBILITY OF SUCH DAMAGES.
 *
 * Authors: Justin T. Gibbs     (Spectra Logic Corporation)
 *
 * $FreeBSD$
 */

/**
 * \file callout.cc
 *
 * \brief Implementation of the Callout class - multi-client
 *        timer services built on top of the POSIX interval timer.
 */

Merge OpenZFS support in to HEAD.
The primary benefit is maintaining a completely shared
code base with the community allowing FreeBSD to receive
new features sooner and with less effort.
I would advise against doing 'zpool upgrade'
or creating indispensable pools using new
features until this change has had a month+
to soak.
Work on merging FreeBSD support in to what was
at the time "ZFS on Linux" began in August 2018.
I first publicly proposed transitioning FreeBSD
to (new) OpenZFS on December 18th, 2018. FreeBSD
support in OpenZFS was finally completed in December
2019. A CFT for downstreaming OpenZFS support in
to FreeBSD was first issued on July 8th. All issues
that were reported have been addressed or, for
a couple of less critical matters, have
pull requests in progress with OpenZFS. iXsystems
has tested and dogfooded extensively internally.
The TrueNAS 12 release is based on OpenZFS with
some additional features that have not yet made
it upstream.
Improvements include:
project quotas, encrypted datasets,
allocation classes, vectorized raidz,
vectorized checksums, various command line
improvements, zstd compression.
Thanks to those who have helped along the way:
Ryan Moeller, Allan Jude, Zack Welch, and many
others.
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D25872
2020-08-25 02:21:27 +00:00
#include <sys/byteorder.h>
#include <sys/time.h>

#include <signal.h>
#include <syslog.h>

#include <climits>
#include <list>
#include <map>
#include <string>

#include <devdctl/guid.h>
#include <devdctl/event.h>
#include <devdctl/event_factory.h>
#include <devdctl/consumer.h>
#include <devdctl/exception.h>

#include "callout.h"
#include "vdev_iterator.h"
#include "zfsd.h"
#include "zfsd_exception.h"

std::list<Callout *> Callout::s_activeCallouts;
bool		     Callout::s_alarmFired(false);

void
Callout::Init()
{
	signal(SIGALRM, Callout::AlarmSignalHandler);
}

bool
Callout::Stop()
{
	if (!IsPending())
		return (false);

	for (std::list<Callout *>::iterator it(s_activeCallouts.begin());
	     it != s_activeCallouts.end(); it++) {
		if (*it != this)
			continue;

		it = s_activeCallouts.erase(it);
		if (it != s_activeCallouts.end()) {

			/*
			 * Maintain correct interval for the
			 * callouts that follow the just removed
			 * entry.
			 */
			timeradd(&(*it)->m_interval, &m_interval,
				 &(*it)->m_interval);
		}
		break;
	}
	m_pending = false;
	return (true);
}

bool
Callout::Reset(const timeval &interval, CalloutFunc_t *func, void *arg)
{
	bool cancelled(false);

	if (!timerisset(&interval))
		throw ZfsdException("Callout::Reset: interval of 0");

	cancelled = Stop();

	m_interval = interval;
	m_func     = func;
	m_arg      = arg;
	m_pending  = true;

	std::list<Callout *>::iterator it(s_activeCallouts.begin());
	for (; it != s_activeCallouts.end(); it++) {

		if (timercmp(&(*it)->m_interval, &m_interval, <=)) {
			/*
			 * Decrease our interval by those that come
			 * before us.
			 */
			timersub(&m_interval, &(*it)->m_interval, &m_interval);
		} else {
			/*
			 * Account for the time between the newly
			 * inserted event and those that follow.
			 */
			timersub(&(*it)->m_interval, &m_interval,
				 &(*it)->m_interval);
			break;
		}
	}
	s_activeCallouts.insert(it, this);

	if (s_activeCallouts.front() == this) {
		itimerval timerval = { {0, 0}, m_interval };

		setitimer(ITIMER_REAL, &timerval, NULL);
	}

	return (cancelled);
}

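The insertion loop in Callout::Reset() keeps s_activeCallouts as a delta list: each entry stores only the time between its own expiry and that of the entry before it. The following is a minimal standalone sketch of that idea, not the zfsd implementation. It assumes plain millisecond counts instead of timeval, and DeltaEntry, DeltaInsert, and DeltaTimeUntil are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for Callout; only the delta interval matters here.
struct DeltaEntry {
	int  id;	// illustrative label, not part of zfsd
	long delta;	// milliseconds after the previous entry expires
};

// Insert an entry that is due `interval` ms from now into a delta list.
void
DeltaInsert(std::list<DeltaEntry> &entries, int id, long interval)
{
	assert(interval > 0);

	std::list<DeltaEntry>::iterator it(entries.begin());
	for (; it != entries.end(); ++it) {
		if (it->delta <= interval) {
			// This entry expires first; subtract its share.
			interval -= it->delta;
		} else {
			// The new entry expires first; shrink the follower.
			it->delta -= interval;
			break;
		}
	}
	entries.insert(it, DeltaEntry{id, interval});
}

// Time until entry `id` expires: the sum of the deltas up to and
// including it.  Returns -1 if no such entry exists.
long
DeltaTimeUntil(const std::list<DeltaEntry> &entries, int id)
{
	long total(0);

	for (const DeltaEntry &e : entries) {
		total += e.delta;
		if (e.id == id)
			return (total);
	}
	return (-1);
}
```

Inserting entries due in 50, 20, and 80 ms in that order stores the deltas 20, 30, and 30. A single timer armed for the head entry is then enough to drive the whole list, and an entry with a delta of zero expires together with its predecessor.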
void
Callout::AlarmSignalHandler(int)
{
	s_alarmFired = true;
	ZfsDaemon::WakeEventLoop();
}

void
Callout::ExpireCallouts()
{
	if (!s_alarmFired)
		return;

	s_alarmFired = false;
	if (s_activeCallouts.empty()) {
		/* Callout removal/SIGALRM race was lost. */
		return;
	}

	/*
	 * Expire the first callout (the one we used to set the
	 * interval timer) as well as any callouts following that
	 * expire at the same time (have a zero interval from
	 * the callout before it).
	 */
	do {
		Callout *cur(s_activeCallouts.front());
		s_activeCallouts.pop_front();
		cur->m_pending = false;
		cur->m_func(cur->m_arg);
	} while (!s_activeCallouts.empty()
	      && timerisset(&s_activeCallouts.front()->m_interval) == 0);

	if (!s_activeCallouts.empty()) {
		Callout *next(s_activeCallouts.front());
		itimerval timerval = { { 0, 0 }, next->m_interval };

		setitimer(ITIMER_REAL, &timerval, NULL);
	}
}

timeval
Callout::TimeRemaining() const
{
	/*
	 * Outline: Add the m_interval for each callout in s_activeCallouts
	 * ahead of this, except for the first callout.  Add to that the result
	 * of getitimer (That's because the first callout stores its original
	 * interval setting while the timer is ticking).
	 */
	itimerval timervalToAlarm;
	timeval timeToExpiry;
	std::list<Callout *>::iterator it;

	if (!IsPending()) {
		timeToExpiry.tv_sec = INT_MAX;
		timeToExpiry.tv_usec = 999999;	/*maximum normalized value*/
		return (timeToExpiry);
	}

	timerclear(&timeToExpiry);
	getitimer(ITIMER_REAL, &timervalToAlarm);
	timeval& timeToAlarm = timervalToAlarm.it_value;
	timeradd(&timeToExpiry, &timeToAlarm, &timeToExpiry);

	/*
	 * If this is the first callout, the timer value alone is the
	 * answer.  Without this check the loop below would never match
	 * "this" and would add the interval of every later callout.
	 */
	if (s_activeCallouts.front() == this)
		return (timeToExpiry);

	it = s_activeCallouts.begin();
	it++;	/*skip the first callout in the list*/
	for (; it != s_activeCallouts.end(); it++) {
		timeradd(&timeToExpiry, &(*it)->m_interval, &timeToExpiry);
		if ((*it) == this)
			break;
	}
	return (timeToExpiry);
}
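The outline comment in Callout::TimeRemaining() can be restated as a small calculation: take the time left on the running timer, then add the stored delta of every entry after the first, up to and including the entry of interest. The sketch below is an illustration only. It assumes millisecond counts in a vector rather than timeval values in a list, and RemainingFor is a hypothetical helper, not part of zfsd.

```cpp
#include <cstddef>
#include <vector>

// Remaining time in ms for the entry at position `index`.
//   deltas[i]  - delay of entry i after entry i-1 expires
//   timerLeft  - time left on the timer armed for entry 0
// deltas[0] is deliberately ignored: while the timer is ticking it still
// holds the head's original interval, so the timer value replaces it.
long
RemainingFor(const std::vector<long> &deltas, std::size_t index,
    long timerLeft)
{
	long total(timerLeft);

	for (std::size_t i = 1; i <= index && i < deltas.size(); ++i)
		total += deltas[i];
	return (total);
}
```

With deltas of {50, 30, 0, 40} and 12 ms left on the timer, the head entry has 12 ms remaining, the second and third entries have 42 ms each, and the last has 82 ms.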