Pawel Jakub Dawidek 32115b105a Please welcome HAST - Highly Avalable Storage.
HAST allows to transparently store data on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.

HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There in no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.

For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.

Sponsored by:	FreeBSD Foundation
Sponsored by:	OMCnet Internet Service GmbH
Sponsored by:	TransIP BV
2010-02-18 23:16:19 +00:00

233 lines
6.5 KiB
Groff

.\" Copyright (c) 2010 The FreeBSD Foundation
.\" All rights reserved.
.\"
.\" This software was developed by Pawel Jakub Dawidek under sponsorship from
.\" the FreeBSD Foundation.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd February 1, 2010
.Dt HASTD 8
.Os
.Sh NAME
.Nm hastd
.Nd "Highly Available Storage daemon"
.Sh SYNOPSIS
.Nm
.Op Fl dFh
.Op Fl c Ar config
.Op Fl P Ar pidfile
.Sh DESCRIPTION
The
.Nm
daemon is responsible for managing highly available GEOM providers.
.Pp
.Nm
allows to transparently store data on two physically separated machines
connected over the TCP/IP network.
Only one machine (cluster node) can actively use storage provided by
.Nm .
This machine is called primary.
The
.Nm
daemon operates on block level, which makes it transparent for file
systems and applications.
.Pp
There is one main
.Nm
daemon which starts new worker process as soon as a role for the given
resource is changed to primary or as soon as a role for the given
resource is changed to secondary and remote (primary) node will
successfully connect to it.
Every worker process gets a new process title (see
.Xr setproctitle 3 ) ,
which describes its role and resource it controls.
The exact format is:
.Bd -literal -offset indent
hastd: <resource name> (<role>)
.Ed
.Pp
When (and only when)
.Nm
operates in primary role for the given resource, corresponding
.Pa /dev/hast/<name>
disk-like device (GEOM provider) is created.
File systems and applications can use this provider to send I/O
requests to.
Every write, delete and flush operation
.Dv ( BIO_WRITE , BIO_DELETE , BIO_FLUSH )
is send to local component and synchronously replicated
to the remote (secondary) node if it is available.
Read operations
.Dv ( BIO_READ )
are handled locally unless I/O error occurs or local version of the data
is not up-to-date yet (synchronization is in progress).
.Pp
The
.Nm
daemon uses the GEOM Gate class to receive I/O requests from the
in-kernel GEOM infrastructure.
The
.Nm geom_gate.ko
module is loaded automatically if the kernel was not compiled with the
following option:
.Bd -ragged -offset indent
.Cd "options GEOM_GATE"
.Ed
.Pp
The connection between two
.Nm
daemons is always initiated from the one running as primary to the one
running as secondary.
When primary
.Nm
is unable to connect or connection fails, it will try to re-establish
connection every few seconds.
Once connection is established, primary
.Nm
will synchronize every extent that was modified during connection outage
to the secondary
.Nm .
.Pp
It is possible that in case of connection outage between the nodes
.Nm
primary role for the given resource will be configured on both nodes.
This in turn leads to incompatible data modifications.
Such condition is called split-brain and cannot be automatically
resolved by the
.Nm
daemon as this will lead most likely to data corruption or lost of
important changes.
Even though it cannot be fixed by
.Nm
itself, it will be detected and further connection between independently
modified nodes will not be possible.
Once this situation is manually resolved by an administrator, resource
on one of the nodes can be initialized (erasing local data), which makes
connection to the remote node possible again.
Connection of freshly initialized component will trigger full resource
synchronization.
.Pp
The
.Nm
daemon itself never picks his role up automatically.
The role has to be configured with the
.Xr hastctl 8
control utility by additional software like
.Nm ucarp
or
.Nm heartbeat
that can reliably manage role separation and switch secondary node to
primary role in case of original primary failure.
.Pp
The
.Nm
daemon can be started with the following command line arguments:
.Bl -tag -width ".Fl P Ar pidfile"
.It Fl c Ar config
Specify alternative location of the configuration file.
The default location is
.Pa /etc/hast.conf .
.It Fl d
Print or log debugging information.
This option can be specified multiple times to raise the verbosity
level.
.It Fl F
Start the
.Nm
daemon in the foreground.
By default
.Nm
starts in the background.
.It Fl h
Print the
.Nm
usage message.
.It Fl P Ar pidfile
Specify alternative location of a file where main process PID will be
stored.
The default location is
.Pa /var/run/hastd.pid .
.El
.Sh EXIT STATUS
Exit status is 0 on success, or one of the values described in
.Xr sysexits 3
on failure.
.Sh EXAMPLES
Launch
.Nm
on both nodes.
Set role for resource
.Nm shared
to primary on
.Nm nodeA
and to secondary on
.Nm nodeB .
Create file system on
.Pa /dev/hast/shared
provider and mount it.
.Bd -literal -offset indent
nodeB# hastd
nodeB# hastctl role secondary shared
nodeA# hastd
nodeA# hastctl role primary shared
nodeA# newfs -U /dev/hast/shared
nodeA# mount -o noatime /dev/hast/shared /shared
.Ed
.Sh FILES
.Bl -tag -width ".Pa /var/run/hastctl" -compact
.It Pa /etc/hast.conf
The configuration file for
.Nm
and
.Xr hastctl 8 .
.It Pa /var/run/hastctl
Control socket used by the
.Xr hastctl 8
control utility to communicate with
.Nm .
.It Pa /var/run/hastd.pid
The default location of the
.Nm
PID file.
.El
.Sh SEE ALSO
.Xr sysexits 3 ,
.Xr geom 4 ,
.Xr hast.conf 5 ,
.Xr ggatec 8 ,
.Xr ggated 8 ,
.Xr ggatel 8 ,
.Xr hastctl 8 ,
.Xr mount 8 ,
.Xr newfs 8 ,
.Xr g_bio 9 .
.Sh AUTHORS
The
.Nm
was developed by
.An Pawel Jakub Dawidek Aq pjd@FreeBSD.org
under sponsorship of the FreeBSD Foundation.