Change communication protocol to be much more resistant on network
problems and to allow for much better performance.
Better performance is achieved by creating two connections between
ggatec and ggated one for sending the data and one for receiving it.
Every connection is handled by separeted thread, so there is no more
synchronous data flow (send and wait for response), now one threads
sends all requests and another receives the data.
Use two threads in ggatec(8):
- sendtd, which takes I/O requests from the kernel and sends them to the
ggated daemon on the other end;
- recvtd, which waits for ggated responses and forwards them to the kernel.
Use three threads in ggated(8):
- recvtd, which waits for I/O requests and puts them onto incoming queue;
- disktd, which takes requests from the incoming queue, does disk operations
and puts finished requests onto outgoing queue;
- sendtd, which takes finished requests from the outgoing queue and sends
responses back to ggatec.
Because there were major changes in communication protocol, there is no
backward compatibility, from now on, both client and server has to run
on 5.x or 6.x (or at least ggated should be from the same FreeBSD version
on which ggatec is running).
For Gbit networks some buffers need to be increased. I use those settings:
kern.ipc.maxsockbuf=16777216
net.inet.tcp.sendspace=8388608
net.inet.tcp.recvspace=8388608
and I use '-S 4194304 -R 4194304' options for both, ggatec and ggated.
Approved by: re (scottl)
This little thing can cause a deadlock, because taste mechanism start
to work after creation of ggate provider and I/O requests are sent from
other classes from the g_event thread, so number of pending events isn't 0.
Now ggatec(8) start second handshake and ggated(8) is trying to open
GEOM provider (for example md(4)) and it can't, because it hangs on
g_waitidle() in g_dev_open(). g_waitidle() cannot finish because
there is a pending read on event queue, and this read can't be
finished, because ggated(8) can't open target device.
GEOM Gate will recover from this deadlock, because requests will
timeout, but it of course isn't the best solution and I don't know
better one for now, so we should avoid opening GEOM providers while
there are pending requests in event queue.