The standard SunOS ypbind(8) (and, until now, the FreeBSD ypbind)
only selects servers based on whether or not they respond to clnt_broadcast().
Ypbind(8) broadcasts to the YPPROC_DOMAIN_NONACK procedure and waits
for answers; whichever server answers first is the one ypbind uses
for the local client binding.
This mechanism fails when binding across subnets is desired. In order
for a client on one subnet to bind to a server on another subnet, the
gateway(s) between the client and server must be configured to forward
broadcasts. If this is not possible, then a slave server must be
installed on the remote subnet. If this is also not possible, you
have to force the client to bind to the remote server with ypset(8).
Unfortunately, this last option is less than ideal. If the remote
server becomes unavailable, ypbind(8) will lose its binding and
revert to its broadcast-based search behavior. Even if there are
other servers available, or even if the original server comes back
up, ypbind(8) will not be able to create a new binding since all
the servers are on remote subnets where its broadcasts won't be heard.
If the administrator isn't around to run ypset(8) again, the system
is hosed.
In some Linux NIS implementations, there exists a yp.conf file where
you can explicitly specify a server address and avoid the use of
ypbind altogether. This is not desireable since it removes the
possibility of binding to an alternate server in the event that the
one specified in yp.conf crashes.
Some people have mentioned to me how they though the 'restricted mode'
operation (using the -S flag) could be used as a solution for this
problem since it allows one to specify a list of servers. In fact,
this is not the case: the -S flag just tells ypbind(8) that when it
listens for replies to its broadcasts, it should only honor them if
the replying hosts appear in the specified restricted list.
This behavior has now been changed. If you use the -m flag in conjunction
with the -S flag, ypbind(8) will use a 'many-cast' instead of a broadcast
for choosing a server. In many-cast mode, ypbind(8) will transmit directly
to the YPPROC_DOMAIN_NONACK procedure of all the servers specified in
the restricted mode list and then wait for a reply. As with the broadcast
method, whichever server from the list answers first is used for the
local binding. All other behavior is the same: ypbind(8) continues
to ping its bound server every 60 seconds to insure it's still alive
and will many-cast again if the server fails to respond. The code used
to achieve this is in yp_ping.c; it includes a couple of modified RPC
library routines.
Note that it is not possible to use this mechanism without using
the restricted list since we need to know the addresses of the available
NIS servers ahead of time in order to transmit to them.
Most-recently-requested by: Tom Samplonius
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
-S domainname,server1,server2,server3,...
The -S flag allows the system administrator to lock ypbind to a
particular domain and group of NIS servers. Up to ten servers can
be specified. There must not be any spaces between the commas in
the domain/server specification. This option is used to insure that
that the system binds only to one domain and only to one of the
specified servers, which is useful for systems that are both NIS
servers and NIS clients: it provides a way to restrict what ma-
chines the system can bind to without the need for specifying the
-ypset or -ypsetme options, which are often considered to be secu-
rity holes. The specified servers must have valid entries in the
local /etc/hosts file. IP addresses may be specified in place of
hostnames. If ypbind can't make sense ouf of the arguments, it will
ignore the -S flag and continue running normally.
Note that ypbind will consider the domainname specified with the -S
flag to be the system default domain.
(According to what Garrett showed me, OSF/1 actually only allows 4 servers
to be specified. Ten seemed to be a bit more reasonable to me.)
Suggested by: G. Wollman
Idea lifted from: OSF/1
- Moved to a more client-driven model. We aggressively attempt to keep
the default domain bound (as before) but we give up on non-default
domains if we lose contact with a server and fail to get a response
after one round of broadcasting. This helps drastically reduce the
amount of network bandwitdh that ypbind consumes: if a client references
the secondary domain at some later point, this will prod ypbind into
establishing a new binding anyway, so continuously broadcasting without
need is pointless.
Note that we still actively seek out a binding for our default domain
even if no client program has queried us yet. I'm not exactly sure if
this matches SunOS's behavior or not, but I decided to do it this way
since we can get into all sorts of trouble if our default domain comes
unbound. Even so, we're still much quieter than we used to be.
- Removed a bunch of no-longer pertinent comments and a couple of
chunks of #ifdef 0'ed code that no longer fit in to the new layout.
- Theo deRaadt must have become frustrated with the callback mechanism
in clnt_broadcast(), because he shamelessly stole the clnt_broadcast()
code right out of the RPC library and hacked it up to suit his needs.
(Comments and all! :)
I can understand why: clnt_broadcast() blocks while awaiting replies.
Changing this behavior requires surgery. However, you can work around
this: fork the broadcast into a child process and relay the results
back to the parent via a pipe. (Careful obervation has shown that the
SunOS ypbind forks children for broadcasting too, though I can only
guess what sort of interprocess communication it uses. pipe() seems to
do the job well enough.)
This may seem like the long way around, but it's not really that
hard to implement, and I'd prefer to use documented RPC library functions
wherever possible. We're careful to limit the number of simultaneous
broadcasters to avoid swamping the system (the current limit is 5).
Each clnt_broadcast() call only sends out a small number of packets
at increasing intervals. We're also careful not to spawn more than one
bradcaster for a given domain.
- Used clntudp_bufcreate() and clnt_call() to implement a ping()
function for directly querying a particular server so that we can
check if it's still alive. This lets me completely remove the old
bradcasting code and use actual RPC library calls instead, at the
cost of more than a few handfulls of torn-out hair. (Make no mistake
folks: I *HATE* RPC.) Currently, the ping interval is one minute.
- Fixed another potential 'nfds too big for select()' bug: use
_rpc_dtablesize() instead of getdtablesize().
- Quieted gcc -Wall a bit.
- Probably a bunch of other stuff that I've forgotten.
ypbind.8:
- Updated man page to reflect modifications.
ypwhich.c:
- Small mind-o fix from last time: decode error results from
ypbind correctly (*groan*)
yplib.c:
- same as above
- Change behavior of _yp_dobind() a little: if we get back a 'Domain
not bound' error for a given domain, retry a few times before giving
up and passing the error back to the caller. We have to sleep for a
few seconds between tries since the 'Domain not bound' error comes
back immediately (by repeatedly looping, we end up pounding on ypbind).
We retry at most 20 times at 5 second intervals. This gives us a full
minute to get a response. This seems to deviate a bit from SunOS
behavior -- it appears to wait forever -- but I don't like the idea
of perpetually hanging inside a library call.
Note that this should fix the problems some people have with bindings
not being established fast enough at boot time; sometimes amd is started
in /etc/rc after ypbind has run but before it gets a binding set up. The
automounter gets annoyed at this and tends to exit. By pausing ther YP
calls until a binding is ready, we avoid this situation.
- Another _yp_dobind() change: if we determine that our binding files
are unlocked or nonexistent, jump directly to code that pokes ypbind
into restablishing the binding. Again, if it fails, we'll time out
eventually and return.