Commit Graph

21 Commits

Author SHA1 Message Date
Pawel Jakub Dawidek
f463896e5e Use newly added descriptors_assert() function to ensure only expected
descriptors are open.

MFC after:	1 week
2011-01-28 21:57:42 +00:00
Pawel Jakub Dawidek
da1783ea29 Close all unneeded descriptors after fork(2).
MFC after:	1 week
2011-01-28 21:52:37 +00:00
Pawel Jakub Dawidek
ce837469ba Before this change on first connect between primary and secondary we
initialize all the data. This is huge waste of time and resources if
there were no writes yet, as there is no real data to synchronize.

Optimize this by sending "virgin" argument to secondary, which gives it a hint
that synchronization is not needed.

In the common case (where noth nodes are configured at the same time) instead
of synchronizing everything, we don't synchronize at all.

MFC after:	1 week
2010-10-24 17:28:25 +00:00
Pawel Jakub Dawidek
1f39b27946 Simplify code a bit.
MFC after:	3 days
2010-10-24 15:44:23 +00:00
Pawel Jakub Dawidek
d7be7905ae Plug memory leak.
MFC after:	3 days
2010-10-24 15:42:16 +00:00
Pawel Jakub Dawidek
9dd5a6cb0f Switch to sigprocmask(2) API also in the main process and secondary process.
This way the primary process inherits signal mask from the main process,
which fixes a race where signal is delivered to the primary process before
configuring signal mask.

Reported by:	Mikolaj Golub <to.my.trociny@gmail.com>
MFC after:	3 days
2010-09-22 19:08:11 +00:00
Pawel Jakub Dawidek
8b70e6ae9c Fix possible deadlock where worker process sends an event to the main process
while the main process sends control message to the worker process, but worker
process hasn't started control thread yet, because it waits for reply from the
main process.

The fix is to start the control thread before sending any events.

Reported and fix suggested by:	Mikolaj Golub <to.my.trociny@gmail.com>
MFC after:	3 days
2010-09-22 19:03:11 +00:00
Pawel Jakub Dawidek
e43e02f1a4 Add __dead2 to functions that we know they are going to exit.
MFC after:	3 days
2010-09-20 13:23:43 +00:00
Pawel Jakub Dawidek
8ecdeae9d9 Correct error message.
Submitted by:	Mikolaj Golub <to.my.trociny@gmail.com>
MFC after:	2 weeks
2010-08-31 12:03:29 +00:00
Pawel Jakub Dawidek
5bdff860e7 Because it is very hard to make fork(2) from threaded process safe (we are
limited to async-signal safe functions in the child process), move all hooks
execution to the main (non-threaded) process.

Do it by maintaining connection (socketpair) between child and parent
and sending events from the child to parent, so it can execute the hook.

This is step in right direction for others reasons too. For example there is
one less problem to drop privs in worker processes.

MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-30 23:26:10 +00:00
Pawel Jakub Dawidek
5b41e64486 Execute hook when connection between the nodes is established or lost.
MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-30 00:31:30 +00:00
Pawel Jakub Dawidek
2be8fd75ff Execute hook when split-brain is detected.
MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-30 00:12:10 +00:00
Pawel Jakub Dawidek
ecc99c890e Allow to run hooks from the main hastd process.
MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-29 21:41:53 +00:00
Pawel Jakub Dawidek
f7fe83f9f8 Implement keepalive mechanism inside HAST protocol so we can detect secondary
node failures quickly for HAST resources that are rarely modified.

Remove XXX from a comment now that the guard thread never sleeps infinitely.

MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-27 14:26:37 +00:00
Pawel Jakub Dawidek
16bd7026a2 Add QUEUE_INSERT() and QUEUE_TAKE() macros that simplify the code a bit.
MFC after:	2 weeks
Obtained from:	Wheel Systems Sp. z o.o. http://www.wheelsystems.com
2010-08-27 14:01:28 +00:00
Pawel Jakub Dawidek
a2ef0636b4 Reset signal handlers after fork().
MFC after:	1 month
2010-08-05 18:58:00 +00:00
Pawel Jakub Dawidek
005f438bf5 - Use pjdlog_exitx() to log errors and exit instead of errx().
- Use 'unable to' (instead of 'cannot') consistently.

MFC after:	1 month
2010-08-05 18:56:24 +00:00
Pawel Jakub Dawidek
f3bd74124a Correct various log messages.
Submitted by:	Mikolaj Golub <to.my.trociny@gmail.com>
MFC after:	3 days
2010-06-14 21:46:48 +00:00
Pawel Jakub Dawidek
7a716e072e Plug memory leak.
Found by:	Coverity Prevent
CID:		7057
MFC after:	3 days
2010-06-14 21:41:22 +00:00
Pawel Jakub Dawidek
5571414ca8 Fix a problem where hastd will stuck in recv(2) after sending request to
secondary, which died between send(2) and recv(2). Do it by adding timeout
to recv(2) for primary incoming and outgoing sockets and secondary outgoing
socket.

Reported by:	Mikolaj Golub <to.my.trociny@gmail.com>
Tested by:	Mikolaj Golub <to.my.trociny@gmail.com>
MFC after:	3 days
2010-04-29 15:36:32 +00:00
Pawel Jakub Dawidek
32115b105a Please welcome HAST - Highly Avalable Storage.
HAST allows to transparently store data on two physically separated machines
connected over the TCP/IP network. HAST works in Primary-Secondary
(Master-Backup, Master-Slave) configuration, which means that only one of the
cluster nodes can be active at any given time. Only Primary node is able to
handle I/O requests to HAST-managed devices. Currently HAST is limited to two
cluster nodes in total.

HAST operates on block level - it provides disk-like devices in /dev/hast/
directory for use by file systems and/or applications. Working on block level
makes it transparent for file systems and applications. There in no difference
between using HAST-provided device and raw disk, partition, etc. All of them
are just regular GEOM providers in FreeBSD.

For more information please consult hastd(8), hastctl(8) and hast.conf(5)
manual pages, as well as http://wiki.FreeBSD.org/HAST.

Sponsored by:	FreeBSD Foundation
Sponsored by:	OMCnet Internet Service GmbH
Sponsored by:	TransIP BV
2010-02-18 23:16:19 +00:00