0b626a289e
This way we know how to connect to secondary node when we are primary. The same variable is used by the secondary node - it only accepts connections from the address stored in 'remote' variable. In cluster configurations it is common that each node has its individual IP address and there is one addtional shared IP address which is assigned to primary node. It seems it is possible that if the shared IP address is from the same network as the individual IP address it might be choosen by the kernel as a source address for connection with the secondary node. Such connection will be rejected by secondary, as it doesn't come from primary node individual IP. Add 'source' variable that allows to specify source IP address we want to bind to before connecting to the secondary node. MFC after: 1 week
401 lines
10 KiB
Groff
401 lines
10 KiB
Groff
.\" Copyright (c) 2010 The FreeBSD Foundation
|
|
.\" Copyright (c) 2010-2011 Pawel Jakub Dawidek <pawel@dawidek.net>
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" This software was developed by Pawel Jakub Dawidek under sponsorship from
|
|
.\" the FreeBSD Foundation.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" $FreeBSD$
|
|
.\"
|
|
.Dd March 20, 2011
|
|
.Dt HAST.CONF 5
|
|
.Os
|
|
.Sh NAME
|
|
.Nm hast.conf
|
|
.Nd configuration file for the
|
|
.Xr hastd 8
|
|
daemon and the
|
|
.Xr hastctl 8
|
|
utility.
|
|
.Sh DESCRIPTION
|
|
The
|
|
.Nm
|
|
file is used by both
|
|
.Xr hastd 8
|
|
daemon
|
|
and
|
|
.Xr hastctl 8
|
|
control utility.
|
|
Configuration file is designed in a way that exactly the same file can be
|
|
(and should be) used on both HAST nodes.
|
|
Every line starting with # is treated as comment and ignored.
|
|
.Sh CONFIGURATION FILE SYNTAX
|
|
General syntax of the
|
|
.Nm
|
|
file is following:
|
|
.Bd -literal -offset indent
|
|
# Global section
|
|
control <addr>
|
|
listen <addr>
|
|
replication <mode>
|
|
checksum <algorithm>
|
|
compression <algorithm>
|
|
timeout <seconds>
|
|
exec <path>
|
|
|
|
on <node> {
|
|
# Node section
|
|
control <addr>
|
|
listen <addr>
|
|
}
|
|
|
|
on <node> {
|
|
# Node section
|
|
control <addr>
|
|
listen <addr>
|
|
}
|
|
|
|
resource <name> {
|
|
# Resource section
|
|
replication <mode>
|
|
checksum <algorithm>
|
|
compression <algorithm>
|
|
name <name>
|
|
local <path>
|
|
timeout <seconds>
|
|
exec <path>
|
|
|
|
on <node> {
|
|
# Resource-node section
|
|
name <name>
|
|
# Required
|
|
local <path>
|
|
# Required
|
|
remote <addr>
|
|
source <addr>
|
|
}
|
|
on <node> {
|
|
# Resource-node section
|
|
name <name>
|
|
# Required
|
|
local <path>
|
|
# Required
|
|
remote <addr>
|
|
source <addr>
|
|
}
|
|
}
|
|
.Ed
|
|
.Pp
|
|
Most of the various available configuration parameters are optional.
|
|
If parameter is not defined in the particular section, it will be
|
|
inherited from the parent section.
|
|
For example, if the
|
|
.Ic listen
|
|
parameter is not defined in the node section, it will be inherited from
|
|
the global section.
|
|
In case the global section does not define the
|
|
.Ic listen
|
|
parameter at all, the default value will be used.
|
|
.Sh CONFIGURATION FILE DESCRIPTION
|
|
The
|
|
.Aq node
|
|
argument can be replaced either by a full hostname as obtained by
|
|
.Xr gethostname 3 ,
|
|
only first part of the hostname, or by node's UUID as found in the
|
|
.Va kern.hostuuid
|
|
.Xr sysctl 8
|
|
variable.
|
|
.Pp
|
|
The following statements are available:
|
|
.Bl -tag -width ".Ic xxxx"
|
|
.It Ic control Aq addr
|
|
.Pp
|
|
Address for communication with
|
|
.Xr hastctl 8 .
|
|
Each of the following examples defines the same control address:
|
|
.Bd -literal -offset indent
|
|
uds:///var/run/hastctl
|
|
unix:///var/run/hastctl
|
|
/var/run/hastctl
|
|
.Ed
|
|
.Pp
|
|
The default value is
|
|
.Pa uds:///var/run/hastctl .
|
|
.It Ic listen Aq addr
|
|
.Pp
|
|
Address to listen on in form of:
|
|
.Bd -literal -offset indent
|
|
protocol://protocol-specific-address
|
|
.Ed
|
|
.Pp
|
|
Each of the following examples defines the same listen address:
|
|
.Bd -literal -offset indent
|
|
0.0.0.0
|
|
0.0.0.0:8457
|
|
tcp://0.0.0.0
|
|
tcp://0.0.0.0:8457
|
|
tcp4://0.0.0.0
|
|
tcp4://0.0.0.0:8457
|
|
.Ed
|
|
.Pp
|
|
The default value is
|
|
.Pa tcp4://0.0.0.0:8457 .
|
|
.It Ic replication Aq mode
|
|
.Pp
|
|
Replication mode should be one of the following:
|
|
.Bl -tag -width ".Ic xxxx"
|
|
.It Ic memsync
|
|
.Pp
|
|
Report the write operation as completed when local write completes and
|
|
when the remote node acknowledges the data receipt, but before it
|
|
actually stores the data.
|
|
The data on remote node will be stored directly after sending
|
|
acknowledgement.
|
|
This mode is intended to reduce latency, but still provides a very good
|
|
reliability.
|
|
The only situation where some small amount of data could be lost is when
|
|
the data is stored on primary node and sent to the secondary.
|
|
Secondary node then acknowledges data receipt and primary reports
|
|
success to an application.
|
|
However, it may happen that the secondary goes down before the received
|
|
data is really stored locally.
|
|
Before secondary node returns, primary node dies entirely.
|
|
When the secondary node comes back to life it becomes the new primary.
|
|
Unfortunately some small amount of data which was confirmed to be stored
|
|
to the application was lost.
|
|
The risk of such a situation is very small.
|
|
The
|
|
.Ic memsync
|
|
replication mode is currently not implemented.
|
|
.It Ic fullsync
|
|
.Pp
|
|
Mark the write operation as completed when local as well as remote
|
|
write completes.
|
|
This is the safest and the slowest replication mode.
|
|
The
|
|
.Ic fullsync
|
|
replication mode is the default.
|
|
.It Ic async
|
|
.Pp
|
|
The write operation is reported as complete right after the local write
|
|
completes.
|
|
This is the fastest and the most dangerous replication mode.
|
|
This mode should be used when replicating to a distant node where
|
|
latency is too high for other modes.
|
|
The
|
|
.Ic async
|
|
replication mode is currently not implemented.
|
|
.El
|
|
.It Ic checksum Aq algorithm
|
|
.Pp
|
|
Checksum algorithm should be one of the following:
|
|
.Bl -tag -width ".Ic sha256"
|
|
.It Ic none
|
|
No checksum will be calculated for the data being send over the network.
|
|
This is the default setting.
|
|
.It Ic crc32
|
|
CRC32 checksum will be calculated.
|
|
.It Ic sha256
|
|
SHA256 checksum will be calculated.
|
|
.El
|
|
.It Ic compression Aq algorithm
|
|
.Pp
|
|
Compression algorithm should be one of the following:
|
|
.Bl -tag -width ".Ic none"
|
|
.It Ic none
|
|
Data send over the network will not be compressed.
|
|
.It Ic hole
|
|
Only blocks that contain all zeros will be compressed.
|
|
This is very useful for initial synchronization where potentially many blocks
|
|
are still all zeros.
|
|
There should be no measurable performance overhead when this algorithm is being
|
|
used.
|
|
This is the default setting.
|
|
.It Ic lzf
|
|
The LZF algorithm by Marc Alexander Lehmann will be used to compress the data
|
|
send over the network.
|
|
LZF is very fast, general purpose compression algorithm.
|
|
.El
|
|
.It Ic timeout Aq seconds
|
|
.Pp
|
|
Connection timeout in seconds.
|
|
The default value is
|
|
.Va 5 .
|
|
.It Ic exec Aq path
|
|
.Pp
|
|
Execute the given program on various HAST events.
|
|
Below is the list of currently implemented events and arguments the given
|
|
program is executed with:
|
|
.Bl -tag -width ".Ic xxxx"
|
|
.It Ic "<path> role <resource> <oldrole> <newrole>"
|
|
.Pp
|
|
Executed on both primary and secondary nodes when resource role is changed.
|
|
.Pp
|
|
.It Ic "<path> connect <resource>"
|
|
.Pp
|
|
Executed on both primary and secondary nodes when connection for the given
|
|
resource between the nodes is established.
|
|
.Pp
|
|
.It Ic "<path> disconnect <resource>"
|
|
.Pp
|
|
Executed on both primary and secondary nodes when connection for the given
|
|
resource between the nodes is lost.
|
|
.Pp
|
|
.It Ic "<path> syncstart <resource>"
|
|
.Pp
|
|
Executed on primary node when synchronization process of secondary node is
|
|
started.
|
|
.Pp
|
|
.It Ic "<path> syncdone <resource>"
|
|
.Pp
|
|
Executed on primary node when synchronization process of secondary node is
|
|
completed successfully.
|
|
.Pp
|
|
.It Ic "<path> syncintr <resource>"
|
|
.Pp
|
|
Executed on primary node when synchronization process of secondary node is
|
|
interrupted, most likely due to secondary node outage or connection failure
|
|
between the nodes.
|
|
.Pp
|
|
.It Ic "<path> split-brain <resource>"
|
|
.Pp
|
|
Executed on both primary and secondary nodes when split-brain condition is
|
|
detected.
|
|
.Pp
|
|
.El
|
|
The
|
|
.Aq path
|
|
argument should contain full path to executable program.
|
|
If the given program exits with code different than
|
|
.Va 0 ,
|
|
.Nm hastd
|
|
will log it as an error.
|
|
.Pp
|
|
The
|
|
.Aq resource
|
|
argument is resource name from the configuration file.
|
|
.Pp
|
|
The
|
|
.Aq oldrole
|
|
argument is previous resource role (before the change).
|
|
It can be one of:
|
|
.Ar init ,
|
|
.Ar secondary ,
|
|
.Ar primary .
|
|
.Pp
|
|
The
|
|
.Aq newrole
|
|
argument is current resource role (after the change).
|
|
It can be one of:
|
|
.Ar init ,
|
|
.Ar secondary ,
|
|
.Ar primary .
|
|
.Pp
|
|
.It Ic name Aq name
|
|
.Pp
|
|
GEOM provider name that will appear as
|
|
.Pa /dev/hast/<name> .
|
|
If name is not defined, resource name will be used as provider name.
|
|
.It Ic local Aq path
|
|
.Pp
|
|
Path to the local component which will be used as backend provider for
|
|
the resource.
|
|
This can be either GEOM provider or regular file.
|
|
.It Ic remote Aq addr
|
|
.Pp
|
|
Address of the remote
|
|
.Nm hastd
|
|
daemon.
|
|
Format is the same as for the
|
|
.Ic listen
|
|
statement.
|
|
When operating as a primary node this address will be used to connect to
|
|
the secondary node.
|
|
When operating as a secondary node only connections from this address
|
|
will be accepted.
|
|
.Pp
|
|
A special value of
|
|
.Va none
|
|
can be used when the remote address is not yet known (eg. the other node is not
|
|
set up yet).
|
|
.It Ic source Aq addr
|
|
.Pp
|
|
Local address to bind to before connecting to the remote
|
|
.Nm hastd
|
|
daemon.
|
|
Format is the same as for the
|
|
.Ic listen
|
|
statement.
|
|
.El
|
|
.Sh FILES
|
|
.Bl -tag -width ".Pa /var/run/hastctl" -compact
|
|
.It Pa /etc/hast.conf
|
|
The default
|
|
.Nm
|
|
configuration file.
|
|
.It Pa /var/run/hastctl
|
|
Control socket used by the
|
|
.Xr hastctl 8
|
|
control utility to communicate with the
|
|
.Xr hastd 8
|
|
daemon.
|
|
.El
|
|
.Sh EXAMPLES
|
|
The example configuration file can look as follows:
|
|
.Bd -literal -offset indent
|
|
resource shared {
|
|
local /dev/da0
|
|
|
|
on hasta {
|
|
remote tcp4://10.0.0.2
|
|
}
|
|
on hastb {
|
|
remote tcp4://10.0.0.1
|
|
}
|
|
}
|
|
resource tank {
|
|
on hasta {
|
|
local /dev/mirror/tanka
|
|
source tcp4://10.0.0.1
|
|
remote tcp4://10.0.0.2
|
|
}
|
|
on hastb {
|
|
local /dev/mirror/tankb
|
|
source tcp4://10.0.0.2
|
|
remote tcp4://10.0.0.1
|
|
}
|
|
}
|
|
.Ed
|
|
.Sh SEE ALSO
|
|
.Xr gethostname 3 ,
|
|
.Xr geom 4 ,
|
|
.Xr hastctl 8 ,
|
|
.Xr hastd 8 .
|
|
.Sh AUTHORS
|
|
The
|
|
.Nm
|
|
was written by
|
|
.An Pawel Jakub Dawidek Aq pjd@FreeBSD.org
|
|
under sponsorship of the FreeBSD Foundation.
|