freebsd-dev

Author	SHA1	Message	Date
Pawel Jakub Dawidek	a61f579394	Provides three states for pjdlog_initialized, so we can also tell that this is first initialization ever. MFC after: 2 weeks	2011-03-07 10:33:52 +00:00
Pawel Jakub Dawidek	8cd3d45ad9	Allow to compress on-the-wire data using two algorithms: - HOLE - it simply turns all-zero blocks into few bytes header; it is extremely fast, so it is turned on by default; it is mostly intended to speed up initial synchronization where we expect many zeros; - LZF - very fast algorithm by Marc Alexander Lehmann, which shows very decent compression ratio and has BSD license. MFC after: 2 weeks	2011-03-06 23:09:33 +00:00
Pawel Jakub Dawidek	1fee97b01f	Allow to checksum on-the-wire data using either CRC32 or SHA256. MFC after: 2 weeks	2011-03-06 22:56:14 +00:00
Pawel Jakub Dawidek	493812ee6e	When we decide to unlink socket file, sun_path must be set. If it is set, but there is problem unlinking the file, log a warning. MFC after: 1 week	2011-02-09 08:01:10 +00:00
Pawel Jakub Dawidek	0d8d37212b	Explicitly include <sys/types.h> as suggested by getpid(2) and don't rely on <sys/un.h> including what's needed. MFC after: 1 week	2011-02-08 23:16:19 +00:00
Pawel Jakub Dawidek	f431ab182a	Unlink UNIX domain socket file only if: 1. The descriptor is the one we are listening on (not the one when we connect as a client and not the one which is created on accept(2)). 2. Descriptor was created by us (PID matches with the PID stored on bind(2)). Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2011-02-08 23:08:20 +00:00
Pawel Jakub Dawidek	e84a29b629	Now that we break the loop on fstat(2) failure we no longer need to satisfy gcc's imperfections. MFC after: 1 week	2011-02-06 14:17:08 +00:00
Pawel Jakub Dawidek	207ee3cdea	Add (void) cast before snprintf(3)s for which we are not interested in return values. MFC after: 1 week	2011-02-06 14:09:19 +00:00
Pawel Jakub Dawidek	ee3a876c18	Treat fstat(2) failure (different than EBADF) as fatal error. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2011-02-06 14:07:58 +00:00
Pawel Jakub Dawidek	18d6e1a5f6	Open syslog when logging sysconf(3) failure. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2011-02-06 14:06:37 +00:00
Pawel Jakub Dawidek	5aa85abd1d	Close more descriptors that can be open if the worker process for the given resource is already running. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2011-02-06 12:21:29 +00:00
Pawel Jakub Dawidek	32ecf62028	Setup another socketpair between parent and child, so that primary sandboxed worker can ask the main privileged process to connect on worker's behalf and then we can migrate descriptor using this socketpair to worker. This is not really needed now, but will be needed once we start to use capsicum for sandboxing. MFC after: 1 week	2011-02-03 11:39:49 +00:00
Pawel Jakub Dawidek	21e7bc5e52	Add missing locking after moving keepalive_send() to remote send thread in r214692. MFC after: 1 week	2011-02-03 11:33:32 +00:00
Pawel Jakub Dawidek	f4c96f944c	Let the caller log info about successful privilege drop. We don't want to log this in hastctl. MFC after: 1 week	2011-02-03 10:37:44 +00:00
Pawel Jakub Dawidek	01ab52c021	- Rename proto_descriptor_{send,recv}() functions to proto_connection_{send,recv} and change them to return proto_conn structure. We don't operate directly on descriptors, but on proto_conns. - Add wrap method to wrap descriptor with proto_conn. - Remove methods to send and receive descriptors and implement this functionality as additional argument to send and receive methods. MFC after: 1 week	2011-02-02 15:53:09 +00:00
Pawel Jakub Dawidek	1c1933226f	Add proto_connect_wait() to wait for connection to finish. If timeout argument to proto_connect() is -1, then the caller needs to use this new function to wait for connection. This change is in preparation for capsicum, where sandboxed worker wants to ask main process to connect on worker's behalf and pass descriptor to the worker. Because we don't want the main process to wait for the connection, it will start async connection and pass descriptor to the worker who will be responsible for waiting for the connection to finish. MFC after: 1 week	2011-02-02 15:46:28 +00:00
Pawel Jakub Dawidek	9d70b24b93	Allow to specify connection timeout by the caller. MFC after: 1 week	2011-02-02 15:42:00 +00:00
Pawel Jakub Dawidek	5ee1703532	Move protocol allocation and deallocation to separate functions. MFC after: 1 week	2011-02-02 15:23:07 +00:00
Pawel Jakub Dawidek	8dd94e231b	Be prepared that hp_client or hp_server might be NULL now. MFC after: 1 week	2011-02-02 08:24:26 +00:00
Pawel Jakub Dawidek	292c424d6e	Do not set socket send and receive buffer. It will be auto-tuned. Confirmed by: rwatson MFC after: 1 week	2011-02-01 07:58:43 +00:00
Pawel Jakub Dawidek	94486ae22d	Fix build on ia64. I found no way how to use CMSG_NXTHDR() macro on ia64 without alignment warnings. MFC after: 1 week	2011-01-31 23:46:36 +00:00
Pawel Jakub Dawidek	2c450cb873	Until I fix the build on ia64 comment out problematic lines. Those lines are part of the (for now) unused functions.	2011-01-31 23:08:26 +00:00
Pawel Jakub Dawidek	8046c499ab	Implement two new functions for sending descriptor and receiving descriptor over UNIX domain sockets and socket pairs. This is in preparation for capsicum. MFC after: 1 week	2011-01-31 18:35:17 +00:00
Pawel Jakub Dawidek	2ec483c58e	- Use pjdlog for assertions and aborts as this will log assert/abort message to syslog if we run in background. - Asserts in proto.c that method we want to call is implemented and remove dummy methods from protocols implementation that are only there to abort the program with nice message. MFC after: 1 week	2011-01-31 18:32:17 +00:00
Pawel Jakub Dawidek	05a6b8de87	Rename pjdlog_verify() to pjdlog_abort() as it better describes what the function does and mark it with __dead2. MFC after: 1 week	2011-01-31 15:52:00 +00:00
Pawel Jakub Dawidek	6d7967de8a	Drop privileges in worker processes. Accepting connections and handshaking in secondary is still done before dropping privileges. It should be implemented by only accepting connections in privileged main process and passing connection descriptors to the worker, but is not implemented yet. MFC after: 1 week	2011-01-28 22:35:46 +00:00
Pawel Jakub Dawidek	49499e981e	Implement function that drops privileges by: - chrooting to /var/empty (user hast home directory), - setting groups to 'hast' (user hast primary group), - setting real group id, effective group id and saved group id to 'hast', - setting real user id, effective user id and saved user id to 'hast'. At the end verify that those operations were successful. MFC after: 1 week	2011-01-28 22:33:47 +00:00
Pawel Jakub Dawidek	f463896e5e	Use newly added descriptors_assert() function to ensure only expected descriptors are open. MFC after: 1 week	2011-01-28 21:57:42 +00:00
Pawel Jakub Dawidek	579fd4b2ff	Add function to assert that the only descriptors we have open are the ones we expect to be open. Also assert that they point at expected type. Because openlog(3) API is unable to tell us descriptor number it is using, we have to close syslog socket, remember assert message in local buffer and if we fail on assertion, reopen syslog socket and log the message. MFC after: 1 week	2011-01-28 21:56:47 +00:00
Pawel Jakub Dawidek	da1783ea29	Close all unneeded descriptors after fork(2). MFC after: 1 week	2011-01-28 21:52:37 +00:00
Pawel Jakub Dawidek	d64c0992e4	Add comments to places where we treat errors as critical, but it is possible to handle them more gracefully. MFC after: 1 week	2011-01-28 21:51:40 +00:00
Pawel Jakub Dawidek	c3c56f8e41	Add function to close all unneeded descriptors after fork(2). MFC after: 1 week	2011-01-28 21:48:15 +00:00
Pawel Jakub Dawidek	70db96bf67	Initialize all global variables on pjdlog_init(). MFC after: 1 week	2011-01-28 21:36:01 +00:00
Pawel Jakub Dawidek	19654a238e	Remember created control connection so on fork(2) we can close it in child. Found with: procstat(1) MFC after: 1 week	2011-01-27 19:33:57 +00:00
Pawel Jakub Dawidek	c0dbce0016	Close the control socket before exiting, so it will be unlinked. MFC after: 1 week	2011-01-27 19:31:35 +00:00
Pawel Jakub Dawidek	94bf851dc1	Extend pjdlog_verify() to support the following additional macros: PJDLOG_RVERIFY() - always check expression and on false log the given message and exit. PJDLOG_RASSERT() - check expression when NDEBUG is not defined and on false log given message and exit. PJDLOG_ABORT() - log the given message and exit. MFC after: 1 week	2011-01-27 19:28:29 +00:00
Pawel Jakub Dawidek	eeb3cd677d	Add functions to initialize/finalize pjdlog. This allows to open/close log file at will. MFC after: 1 week	2011-01-27 19:24:07 +00:00
Pawel Jakub Dawidek	6ef7ddd788	Use my copyright for 2011 work. MFC after: 1 week	2011-01-27 19:18:42 +00:00
Pawel Jakub Dawidek	c62457374f	Add LOG_NDELAY flag to openlog(3) - we want descriptor to be immediately open so there are no surprises once we start chrooting or using capsicum. MFC after: 1 week	2011-01-27 19:15:25 +00:00
Pawel Jakub Dawidek	c1410d7a90	- Remove obvious NOTREACHED comment after abort() call. - Remove redundant newline at the end of the file. MFC after: 1 week	2011-01-27 19:12:44 +00:00
Pawel Jakub Dawidek	6062588f8d	Remove __dead2 from pjdlog_verify() prototype, it does return sometimes. MFC after: 1 week	2011-01-27 19:10:24 +00:00
Pawel Jakub Dawidek	115f4e5c3e	Don't open configuration file from worker process. Handle SIGHUP in the master process only and pass changes to the worker processes over control socket. This removes access to global namespace in preparation for capsicum sandboxing. MFC after: 2 weeks	2011-01-24 15:04:15 +00:00
Pawel Jakub Dawidek	79e82fe290	Add missing logs. MFC after: 1 week	2011-01-22 23:30:01 +00:00
Pawel Jakub Dawidek	eed4e65fdb	Add nv_assert() which allows to assert that the given name exists. MFC after: 1 week	2011-01-22 22:38:18 +00:00
Pawel Jakub Dawidek	09d6ae1b34	Use more consistent function name with the others (pjdlogv_prefix_set() instead of pjdlog_prefix_setv()). MFC after: 1 week	2011-01-22 22:35:08 +00:00
Pawel Jakub Dawidek	911a2aa37a	Use int16 for error. MFC after: 1 week	2011-01-22 22:33:27 +00:00
Pawel Jakub Dawidek	5ed118d861	- On primary worker reload, update hr_exec field. - Update comment. MFC after: 1 week	2011-01-22 22:31:55 +00:00
Pawel Jakub Dawidek	ac7b0b09f3	execve(2), not fork(2) resets signal handler to the default value (if it isn't ignored). Correct comment talking about that. Pointed out by: kib MFC after: 3 days	2011-01-12 16:16:54 +00:00
Pawel Jakub Dawidek	bcaa0b6789	Add a note that when custom signal handler is installed for a signal, signal action is restored to default in child after fork(2). In this case there is no need to do anything with dummy SIGCHLD handler, because after fork(2) it will be automatically reverted to SIG_IGN. Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days	2011-01-12 14:38:17 +00:00
Pawel Jakub Dawidek	9cc97e5803	Install default signal handlers before masking signals we want to handle. It is possible that the parent process ignores some of them and sigtimedwait() will never see them, even though they are masked. The most common situation for this to happen is boot process where init(8) ignores SIGHUP before starting to execute /etc/rc. This in turn caused hastd(8) to ignore SIGHUP. Reported by: trasz Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 3 days	2011-01-12 14:35:29 +00:00
Pawel Jakub Dawidek	a7130d73a6	Detect when resource is configured more than once. MFC after: 3 days	2010-12-26 19:08:41 +00:00
Pawel Jakub Dawidek	66db33a13b	When node-specific configuration is missing in resource section, provide more useful information. Instead of: hastd: remote address not configured for resource foo Print the following: No resource foo configuration for this node (acceptable node names: freefall, freefall.freebsd.org, 44333332-4c44-4e31-4a30-313920202020). MFC after: 3 days	2010-12-26 19:07:58 +00:00
Pawel Jakub Dawidek	fba1bf5a2c	The 'ret' variable is of type ssize_t and we use proper format for it (%zd), so no (bogus) cast is needed. MFC after: 3 days	2010-12-16 19:48:03 +00:00
Pawel Jakub Dawidek	cd7b7ee577	Improve problems logging. MFC after: 3 days	2010-12-16 07:30:47 +00:00
Pawel Jakub Dawidek	7208920499	Don't ignore errors from remote requests. MFC after: 3 days	2010-12-16 07:29:58 +00:00
Pawel Jakub Dawidek	347bde360a	Log the fact of launching and include protocol version number. MFC after: 3 days	2010-12-16 07:28:40 +00:00
Rebecca Cran	e267ef95d5	Don't generate input() since it's not used.	2010-11-22 14:16:22 +00:00
Pawel Jakub Dawidek	d448536ceb	Move timeout.tv_sec initialization outside the loop - sigtimedwait(2) won't modify it. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-11-15 03:07:42 +00:00
Pawel Jakub Dawidek	1dd5a4bfa2	1. Exit when we cannot create incoming connection. 2. Improve logging to inform which connection can't be created. Submitted by: [1] Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-11-15 03:05:33 +00:00
Pawel Jakub Dawidek	448efa9421	Send packets to remote node only via the send thread to avoid possible races - in this case a keepalive packet was sent from wrong thread which led to connection dropping, because of corrupted packet. Fix it by sending keepalive packets directly from the send thread. As a bonus we now send keepalive packets only when connection is idle. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-11-02 22:13:08 +00:00
Pawel Jakub Dawidek	ce837469ba	Before this change on first connect between primary and secondary we initialize all the data. This is huge waste of time and resources if there were no writes yet, as there is no real data to synchronize. Optimize this by sending "virgin" argument to secondary, which gives it a hint that synchronization is not needed. In the common case (where both nodes are configured at the same time) instead of synchronizing everything, we don't synchronize at all. MFC after: 1 week	2010-10-24 17:28:25 +00:00
Pawel Jakub Dawidek	b9ffbb0a94	Implement nv_exists() function that returns true if argument of the given name exists. MFC after: 3 days	2010-10-24 17:24:08 +00:00
Pawel Jakub Dawidek	3dea75d2a8	Move all NV defines into nv.c, they are not used externally thus there is no need to make them visible from outside. MFC after: 3 days	2010-10-24 17:22:34 +00:00
Pawel Jakub Dawidek	1f39b27946	Simplify code a bit. MFC after: 3 days	2010-10-24 15:44:23 +00:00
Pawel Jakub Dawidek	d7be7905ae	Plug memory leak. MFC after: 3 days	2010-10-24 15:42:16 +00:00
Pawel Jakub Dawidek	584a9bc3f8	Plug memory leaks. Found with: valgrind MFC after: 3 days	2010-10-24 15:41:23 +00:00
Pawel Jakub Dawidek	2964aeb34a	Load geom_gate.ko module after parsing arguments. MFC after: 3 days	2010-10-24 15:38:58 +00:00
Pawel Jakub Dawidek	6c71649c5f	Use closefrom(2) instead of close(2) in a loop. MFC after: 1 week	2010-10-20 21:10:01 +00:00
Pawel Jakub Dawidek	3f562cce40	Log correct connection when canceling half-open connection. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-10-17 15:47:27 +00:00
Pawel Jakub Dawidek	bb317aa6ea	Use one fprintf() instead of two. MFC after: 3 days	2010-10-16 22:50:12 +00:00
Pawel Jakub Dawidek	c0a124e6ce	Clear signal mask before executing a hook. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-10-16 22:48:48 +00:00
Pawel Jakub Dawidek	51c63dce86	We can't zero out ggio request, as we have some fields in there we initialize once during start-up. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-10-08 15:05:39 +00:00
Pawel Jakub Dawidek	022f07b682	We close the event socketpair early in the mainloop to prevent spamming with error messages, so when we clean up after child process, we have to check if the event socketpair is still there. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-10-08 15:02:15 +00:00
Pawel Jakub Dawidek	4e47b646bb	Clear ggate structures before using them. We don't initialize all the fields and there can be some garbage from the stack. MFC after: 1 week	2010-10-07 18:23:28 +00:00
Pawel Jakub Dawidek	783ee75392	Log error message when we fail to destroy ggate provider. MFC after: 3 days	2010-10-07 18:20:16 +00:00
Pawel Jakub Dawidek	4a88128b01	Start the guard thread first, so we can handle signals from the very beginning. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2010-10-07 18:19:02 +00:00
Pawel Jakub Dawidek	b46198a5db	Don't close local component on exit as we can hang waiting on g_waitidle. I'm unable to reproduce the race described in comment anymore and also the comment is incorrect - localfd represents local component from configuration file, eg. /dev/da0 and not HAST provider. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 1 week	2010-10-07 18:16:22 +00:00
Pawel Jakub Dawidek	428ad0a9c4	Decrease report interval to 5 seconds, as this also means we will check for signals every 5 seconds and not every 10 seconds as before. MFC after: 3 days	2010-10-04 21:44:26 +00:00
Pawel Jakub Dawidek	5f24b330df	hook_check() is now only used to report about long-running hooks, so the argument is redundant, remove it. MFC after: 3 days	2010-10-04 21:43:06 +00:00
Pawel Jakub Dawidek	41013c0b21	We can't mask ignored signal, so install dummy signal handler for SIGCHLD before masking it. This fixes bogus reports about hooks running for too long and other problems related to garbage-collecting child processes. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-10-04 21:41:18 +00:00
Pawel Jakub Dawidek	b71de2e057	Plug memory leak on fork(2) failure. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-09-26 10:39:01 +00:00
Pawel Jakub Dawidek	9dd5a6cb0f	Switch to sigprocmask(2) API also in the main process and secondary process. This way the primary process inherits signal mask from the main process, which fixes a race where signal is delivered to the primary process before configuring signal mask. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-09-22 19:08:11 +00:00
Pawel Jakub Dawidek	196abd3518	Assert that descriptor numbers are sane. MFC after: 3 days	2010-09-22 19:05:54 +00:00
Pawel Jakub Dawidek	8b70e6ae9c	Fix possible deadlock where worker process sends an event to the main process while the main process sends control message to the worker process, but worker process hasn't started control thread yet, because it waits for reply from the main process. The fix is to start the control thread before sending any events. Reported and fix suggested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-09-22 19:03:11 +00:00
Pawel Jakub Dawidek	0c24d8e2a1	Fix descriptor leaks: when child exits, we have to close control and event socket pairs. We did that only in one case out of three. MFC after: 3 days	2010-09-22 18:57:06 +00:00
Pawel Jakub Dawidek	c56cf19ebf	If we are unable to receive control message, it is most likely because the main process died. Instead of entering infinite loop, terminate. MFC after: 3 days	2010-09-22 18:39:43 +00:00
Pawel Jakub Dawidek	351b9a37a4	Sort includes. MFC after: 3 days	2010-09-22 18:38:02 +00:00
Pawel Jakub Dawidek	e43e02f1a4	Add __dead2 to functions that we know they are going to exit. MFC after: 3 days	2010-09-20 13:23:43 +00:00
Pawel Jakub Dawidek	6d19256b15	Include process PID in log messages. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks	2010-08-31 12:05:13 +00:00
Pawel Jakub Dawidek	8ecdeae9d9	Correct error message. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 2 weeks	2010-08-31 12:03:29 +00:00
Pawel Jakub Dawidek	71c895eb1f	Forgot to add event.c and event.h in r212038. Pointed out by: pluknet <pluknet@gmail.com> MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-31 09:38:43 +00:00
Pawel Jakub Dawidek	852ac373cb	Mask only those signals that we want to handle. Suggested by: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-31 06:22:03 +00:00
Pawel Jakub Dawidek	5bdff860e7	Because it is very hard to make fork(2) from threaded process safe (we are limited to async-signal safe functions in the child process), move all hooks execution to the main (non-threaded) process. Do it by maintaining connection (socketpair) between child and parent and sending events from the child to parent, so it can execute the hook. This is a step in the right direction for other reasons too. For example there is one less problem to drop privs in worker processes. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 23:26:10 +00:00
Pawel Jakub Dawidek	6b276294af	We only want to know if descriptors are ready for reading. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 23:19:21 +00:00
Pawel Jakub Dawidek	eea2deaad0	When someone gives NULL as data, assume this is because he wants to declare connection side only. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 23:16:45 +00:00
Pawel Jakub Dawidek	6be3a25c85	Use pjdlog_exit() before fork(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 22:28:04 +00:00
Pawel Jakub Dawidek	b938cdcc9b	Constify arguments we can constify. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 22:26:42 +00:00
Pawel Jakub Dawidek	5b41e64486	Execute hook when connection between the nodes is established or lost. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 00:31:30 +00:00
Pawel Jakub Dawidek	2be8fd75ff	Execute hook when split-brain is detected. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 00:12:10 +00:00
Pawel Jakub Dawidek	6d0c801ea9	Use sigtimedwait(2) for signals handling in primary process. This fixes various races and eliminates use of pthread* API in signal handler. Pointed out by: kib With help from: jilles MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-30 00:06:05 +00:00
Pawel Jakub Dawidek	ff6bb1f8b3	- Move functionality responsible for checking one connection to separate function to make code more readable. - Be sure not to reconnect too often in case of signal delivery, etc. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 22:55:21 +00:00
Pawel Jakub Dawidek	ee087cdf97	Disconnect after logging errors. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 22:17:53 +00:00
Pawel Jakub Dawidek	a870e771b9	- Call hook on role change. - Document new event. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 21:42:45 +00:00
Pawel Jakub Dawidek	ecc99c890e	Allow to run hooks from the main hastd process. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 21:41:53 +00:00
Pawel Jakub Dawidek	25ec2e3e2b	- Add hook_fini() which should be called after fork() from the main hastd process, once it starts to use hooks. - Add hook_check_one() in case the caller expects different child processes and once it can recognize it, it will pass pid and status to hook_check_one(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 21:39:49 +00:00
Pawel Jakub Dawidek	572cdb2216	Implement mtx_destroy() and rw_destroy(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-29 21:37:21 +00:00
Pawel Jakub Dawidek	5da2320932	When SIGTERM or SIGINT is received, terminate worker processes. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 21:28:02 +00:00
Pawel Jakub Dawidek	4767ee29f1	When logging to stdout/stderr, flush after each log. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 21:26:55 +00:00
Pawel Jakub Dawidek	b9cf0cf5fa	Correct when we log interrupted synchronization. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 21:20:32 +00:00
Pawel Jakub Dawidek	eba09893fd	Check if no signals were delivered just before going to sleep. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 20:49:06 +00:00
Pawel Jakub Dawidek	01125a9381	Add hooks execution. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 20:48:12 +00:00
Pawel Jakub Dawidek	ac59403c39	Document new 'exec' parameter. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 15:20:31 +00:00
Pawel Jakub Dawidek	0becad39a7	Allow to execute specified program on various HAST events. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 15:16:52 +00:00
Pawel Jakub Dawidek	1cdaf10c45	- Run hooks in background - don't block waiting for them to finish. - Keep all hooks we're running in a global list, so we can report when they finish and also report when they are running for too long. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:38:12 +00:00
Pawel Jakub Dawidek	e64887c4d6	When logging to stdout/stderr don't close those descriptors after fork(). MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:35:39 +00:00
Pawel Jakub Dawidek	3f828c18e5	Reduce indent where possible. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:28:39 +00:00
Pawel Jakub Dawidek	f7fe83f9f8	Implement keepalive mechanism inside HAST protocol so we can detect secondary node failures quickly for HAST resources that are rarely modified. Remove XXX from a comment now that the guard thread never sleeps infinitely. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:26:37 +00:00
Pawel Jakub Dawidek	8f8c798c13	- Remove redundant and incorrect 'old' word from debug message. - Log disconnects as warnings. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:12:53 +00:00
Pawel Jakub Dawidek	e23d2d0187	Don't increase number synchronized bytes in case of an error. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:10:25 +00:00
Pawel Jakub Dawidek	53d9b386eb	Log that synchronization was interrupted in a proper place. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:08:10 +00:00
Pawel Jakub Dawidek	55ce1e7c8b	We have sync_start() function to start synchronization, introduce sync_stop() function to stop it. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:06:00 +00:00
Pawel Jakub Dawidek	16bd7026a2	Add QUEUE_INSERT() and QUEUE_TAKE() macros that simplify the code a bit. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 14:01:28 +00:00
Pawel Jakub Dawidek	6e5f008ac4	Add mtx_owned() implementation. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 13:58:38 +00:00
Pawel Jakub Dawidek	7087d13fae	Make comment more readable. MFC after: 2 weeks Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com	2010-08-27 13:54:17 +00:00
Pawel Jakub Dawidek	28df1f238a	For some setups sending data in 128kB chunks makes communication very slow. No idea why. 32kB on the other hand seems to work properly everywhere. Reported by: Thomas Steen Rasmussen <thomas@gibfest.dk> MFC after: 3 weeks	2010-08-18 12:09:27 +00:00
Pawel Jakub Dawidek	471bb09914	The 'size' variable is there to limit how many bytes we want to copy from 'addr'. It is very likely that size of 'addr' is larger than 'size', so checking strlcpy() return value is bogus. MFC after: 3 weeks	2010-08-16 21:59:56 +00:00
Joel Dahl	c2025a7660	Fix typos, spelling, formatting and mdoc mistakes found by Nobuyuki while translating these manual pages. Minor corrections by me. Submitted by: Nobuyuki Koganemaru <n-kogane@syd.odn.ne.jp>	2010-08-16 15:18:30 +00:00
Pawel Jakub Dawidek	44d63cff2e	Document 'none' value for remote. Reviewed by: dougb MFC after: 1 month	2010-08-05 19:54:57 +00:00
Pawel Jakub Dawidek	0989854d45	Implement configuration reload on SIGHUP. This includes: - Load added resources. - Stop and forget removed resources. - Update modified resources in least intrusive way, ie. don't touch /dev/hast/<name> unless path to local component or provider name were modified. Obtained from: Wheel Systems Sp. z o.o. http://www.wheelsystems.com MFC after: 1 month	2010-08-05 19:16:31 +00:00
Pawel Jakub Dawidek	bbbb114cda	Prepare configuration parsing code to be called multiple times: - Don't exit on errors if not requested. - Don't keep configuration in global variable, but allocate memory for configuration. - Call yyrestart() before yyparse() so that on error in configuration file we will start from the beginning next time and not from the place we left off. MFC after: 1 month	2010-08-05 19:08:54 +00:00
Pawel Jakub Dawidek	a00829bb71	Make control_set_role() more public. We will need it soon. MFC after: 1 month	2010-08-05 19:04:29 +00:00
Pawel Jakub Dawidek	f377917cdc	Allow to use 'none' keyword as remote address in case second cluster node is not setup yet. MFC after: 1 month	2010-08-05 19:01:57 +00:00
Pawel Jakub Dawidek	a2ef0636b4	Reset signal handlers after fork(). MFC after: 1 month	2010-08-05 18:58:00 +00:00
Pawel Jakub Dawidek	005f438bf5	- Use pjdlog_exitx() to log errors and exit instead of errx(). - Use 'unable to' (instead of 'cannot') consistently. MFC after: 1 month	2010-08-05 18:56:24 +00:00
Pawel Jakub Dawidek	2c5dadc9cf	Assert that various buffers are large enough. MFC after: 1 month	2010-08-05 18:27:41 +00:00
Pawel Jakub Dawidek	524840d8d0	Problem with assertion is that it logs on stderr. Add two macros: PJDLOG_ASSERT() and PJDLOG_VERIFY() that will check the given condition and log the problem where appropriate. The difference between those two is that PJDLOG_VERIFY() always works and PJDLOG_ASSERT() can be turned off by defining NDEBUG. MFC after: 1 month	2010-08-05 18:26:38 +00:00
Pawel Jakub Dawidek	6b97e48326	Keep $FreeBSD$ in __FBSDID() only for C files. MFC after: 1 month	2010-08-05 18:23:43 +00:00
Pawel Jakub Dawidek	e3031161eb	Mark two more places that we won't reach. MFC after: 1 month	2010-08-05 18:21:45 +00:00
Pawel Jakub Dawidek	9bf24e1a00	Now that TCP will be checked last we don't need any knowledge about other protocols. MFC after: 1 month	2010-08-05 17:57:59 +00:00
Pawel Jakub Dawidek	50692f84c6	Add an argument to the proto_register() function which allows protocol to declare it is the default and be placed at the end of the queue so it is checked last. MFC after: 1 month	2010-08-05 17:56:41 +00:00
Joel Dahl	a53bb70bda	Spelling fixes.	2010-07-31 21:09:49 +00:00
Pawel Jakub Dawidek	1a517c3ec5	Actually, only the fullsync mode is implemented, not memsync mode. Correct manual page. MFC after: 3 days	2010-07-22 08:30:14 +00:00
Pawel Jakub Dawidek	f3bd74124a	Correct various log messages. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-06-14 21:46:48 +00:00
Pawel Jakub Dawidek	96610dd9b6	Fix typos. MFC after: 3 days	2010-06-14 21:44:58 +00:00
Pawel Jakub Dawidek	328e0f4b04	Initialize gctl_seq for synchronization requests. Reported by: hiroshi@soupacific.com Analysed by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: hiroshi@soupacific.com, Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-06-14 21:44:20 +00:00
Pawel Jakub Dawidek	7a716e072e	Plug memory leak. Found by: Coverity Prevent CID: 7057 MFC after: 3 days	2010-06-14 21:41:22 +00:00
Pawel Jakub Dawidek	b0dfbe5b27	Plug memory leak. Found by: Coverity Prevent CID: 7056 MFC after: 3 days	2010-06-14 21:37:25 +00:00
Pawel Jakub Dawidek	9f31eddba0	Plug memory leak. Found by: Coverity Prevent CID: 7051 MFC after: 3 days	2010-06-14 21:33:18 +00:00
Pawel Jakub Dawidek	6744284aec	Plug memory leaks. Found by: Coverity Prevent CID: 7052, 7053, 7054, 7055 MFC after: 3 days	2010-06-14 21:25:20 +00:00
Pawel Jakub Dawidek	9fab3c1b94	Remove macros that are not really needed. The idea was to have them in case we grow more descriptors, but I'll reconsider re-adding them once we get there. Passing (a = b) expression to FD_ISSET() is a bad idea, as FD_ISSET() evaluates its argument twice. Found by: Coverity Prevent CID: 5243 MFC after: 3 days	2010-06-14 21:18:58 +00:00
Pawel Jakub Dawidek	a58b195e35	Eliminate dead code. Found by: Coverity Prevent CID: 5158 MFC after: 3 days	2010-06-14 21:01:13 +00:00
Ulrich Spörlein	0b31f1f731	mdoc: move remaining sections into consistent order This pertains mostly to FILES, HISTORY, EXIT STATUS and AUTHORS sections. Found by: mdocml lint run Reviewed by: ru	2010-05-13 12:08:11 +00:00
Pawel Jakub Dawidek	d92714eea8	Default connection timeout is way too long. To make it shorter we have to make socket non-blocking, connect() and if we get EINPROGRESS, we have to wait using select(). Very complex, but I know no other way to define connection timeout for a given socket. Reported by: hiroshi@soupacific.com MFC after: 3 days	2010-04-29 21:55:20 +00:00
Pawel Jakub Dawidek	c6ddcbe009	- Check if the worker process was killed by signal and restart it. - Improve logging. Pointed out by: Garrett Cooper <yanefbsd@gmail.com> MFC after: 3 days	2010-04-29 15:42:24 +00:00
Pawel Jakub Dawidek	5571414ca8	Fix a problem where hastd would get stuck in recv(2) after sending a request to the secondary, which died between send(2) and recv(2). Do it by adding timeout to recv(2) for primary incoming and outgoing sockets and secondary outgoing socket. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> Tested by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-04-29 15:36:32 +00:00
Pawel Jakub Dawidek	83a5671405	Restart worker thread only if the problem was temporary. In case of persistent problem we don't want to loop forever. MFC after: 3 days	2010-04-28 22:41:06 +00:00
Pawel Jakub Dawidek	5abfc9c145	Mark temporary issues as such. MFC after: 3 days	2010-04-28 22:39:47 +00:00
Pawel Jakub Dawidek	06c117d1d1	Use WEXITSTATUS() to obtain real exit code. MFC after: 3 days	2010-04-28 22:26:30 +00:00
Pawel Jakub Dawidek	1228041bd5	Don't assume that "resource" property is in metadata. Reported by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-04-28 22:23:29 +00:00
Pawel Jakub Dawidek	36df4f8d05	Fix compilation with WITHOUT_CRYPT or WITHOUT_OPENSSL options. Reported by: Andrei V. Lavreniyuk <andy.lavr@reactor-xg.kiev.ua> MFC after: 3 days	2010-04-22 19:18:10 +00:00
Pawel Jakub Dawidek	20ec52dc4b	Fix log size calculation which caused message truncation. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-04-16 06:49:12 +00:00
Pawel Jakub Dawidek	09398e9bd4	Fix control socket leak when worker process exits. Submitted by: Mikolaj Golub <to.my.trociny@gmail.com> MFC after: 3 days	2010-04-16 06:47:29 +00:00
Pawel Jakub Dawidek	20b77db949	Increase ggate queue size to maximum value. HAST was not able to stand heavy random load. Reported by: Hiroyuki Yamagami MFC after: 3 days	2010-04-15 17:04:08 +00:00
Pawel Jakub Dawidek	0d9014f354	Don't hold connection lock when doing reconnects as it makes I/Os wait for connection timeouts. Reported by: Kevin Day <toasty@dragondata.com>	2010-03-27 16:35:07 +00:00
Ulrich Spörlein	7729e3ba40	Remove redundant WARNS?=6 overrides and inherit the WARNS setting from the toplevel directory. This does not change any WARNS level and survives a make universe. Approved by: ed (co-mentor)	2010-03-02 18:44:08 +00:00
Ruslan Ermilov	c59ee18a21	Fixed static linkage.	2010-02-26 09:41:16 +00:00
Pawel Jakub Dawidek	2e1facf96f	Changing proto_socketpair.c compilation and linking order revealed a problem - we should simply ignore proto_server() if address doesn't start with socketpair://, and not abort.	2010-02-21 19:56:47 +00:00
Pawel Jakub Dawidek	32115b105a	Please welcome HAST - Highly Available Storage. HAST allows to transparently store data on two physically separated machines connected over the TCP/IP network. HAST works in Primary-Secondary (Master-Backup, Master-Slave) configuration, which means that only one of the cluster nodes can be active at any given time. Only Primary node is able to handle I/O requests to HAST-managed devices. Currently HAST is limited to two cluster nodes in total. HAST operates on block level - it provides disk-like devices in /dev/hast/ directory for use by file systems and/or applications. Working on block level makes it transparent for file systems and applications. There is no difference between using HAST-provided device and raw disk, partition, etc. All of them are just regular GEOM providers in FreeBSD. For more information please consult hastd(8), hastctl(8) and hast.conf(5) manual pages, as well as http://wiki.FreeBSD.org/HAST. Sponsored by: FreeBSD Foundation Sponsored by: OMCnet Internet Service GmbH Sponsored by: TransIP BV	2010-02-18 23:16:19 +00:00

368 Commits