ypbind.c:
Make fewer assumtions about the state of the dom_alive and dom_broadcasting
flags in roc_received().
If select() fails, use syslog() to report the error rather than perror().
Check that all our malloc()s succeed. Report malloc() failure in
ypbindproc_setdom_2() to callers.
yplib.c:
Use #defined constants in ypbinderr_string() rather than hard-coded values.
- If you take the wheel entry out of /etc/group and turn on NIS,
the '+:*::' line is incorrectly flagged as the entry for wheel (the
empty gid section is translated to 0), hence getgrgid() returns '+'
as the name of the group instead of 'wheel.'
- Using just '+:' as the 'turn on NIS' switch in /etc/group makes
getgrgid() dump core because of a null pointer dereference. (Last
time I was in here, I foolishly assumed that fixing the core dump
problems with getgrnam() and getgrent() would fix getgrgid() too.
Silly me.)
- Moved to a more client-driven model. We aggressively attempt to keep
the default domain bound (as before) but we give up on non-default
domains if we lose contact with a server and fail to get a response
after one round of broadcasting. This helps drastically reduce the
amount of network bandwitdh that ypbind consumes: if a client references
the secondary domain at some later point, this will prod ypbind into
establishing a new binding anyway, so continuously broadcasting without
need is pointless.
Note that we still actively seek out a binding for our default domain
even if no client program has queried us yet. I'm not exactly sure if
this matches SunOS's behavior or not, but I decided to do it this way
since we can get into all sorts of trouble if our default domain comes
unbound. Even so, we're still much quieter than we used to be.
- Removed a bunch of no-longer pertinent comments and a couple of
chunks of #ifdef 0'ed code that no longer fit in to the new layout.
- Theo deRaadt must have become frustrated with the callback mechanism
in clnt_broadcast(), because he shamelessly stole the clnt_broadcast()
code right out of the RPC library and hacked it up to suit his needs.
(Comments and all! :)
I can understand why: clnt_broadcast() blocks while awaiting replies.
Changing this behavior requires surgery. However, you can work around
this: fork the broadcast into a child process and relay the results
back to the parent via a pipe. (Careful obervation has shown that the
SunOS ypbind forks children for broadcasting too, though I can only
guess what sort of interprocess communication it uses. pipe() seems to
do the job well enough.)
This may seem like the long way around, but it's not really that
hard to implement, and I'd prefer to use documented RPC library functions
wherever possible. We're careful to limit the number of simultaneous
broadcasters to avoid swamping the system (the current limit is 5).
Each clnt_broadcast() call only sends out a small number of packets
at increasing intervals. We're also careful not to spawn more than one
bradcaster for a given domain.
- Used clntudp_bufcreate() and clnt_call() to implement a ping()
function for directly querying a particular server so that we can
check if it's still alive. This lets me completely remove the old
bradcasting code and use actual RPC library calls instead, at the
cost of more than a few handfulls of torn-out hair. (Make no mistake
folks: I *HATE* RPC.) Currently, the ping interval is one minute.
- Fixed another potential 'nfds too big for select()' bug: use
_rpc_dtablesize() instead of getdtablesize().
- Quieted gcc -Wall a bit.
- Probably a bunch of other stuff that I've forgotten.
ypbind.8:
- Updated man page to reflect modifications.
ypwhich.c:
- Small mind-o fix from last time: decode error results from
ypbind correctly (*groan*)
yplib.c:
- same as above
- Change behavior of _yp_dobind() a little: if we get back a 'Domain
not bound' error for a given domain, retry a few times before giving
up and passing the error back to the caller. We have to sleep for a
few seconds between tries since the 'Domain not bound' error comes
back immediately (by repeatedly looping, we end up pounding on ypbind).
We retry at most 20 times at 5 second intervals. This gives us a full
minute to get a response. This seems to deviate a bit from SunOS
behavior -- it appears to wait forever -- but I don't like the idea
of perpetually hanging inside a library call.
Note that this should fix the problems some people have with bindings
not being established fast enough at boot time; sometimes amd is started
in /etc/rc after ypbind has run but before it gets a binding set up. The
automounter gets annoyed at this and tends to exit. By pausing ther YP
calls until a binding is ready, we avoid this situation.
- Another _yp_dobind() change: if we determine that our binding files
are unlocked or nonexistent, jump directly to code that pokes ypbind
into restablishing the binding. Again, if it fails, we'll time out
eventually and return.
ypbind.c: if a client program asks ypbind for the name of the server
for a particular domain, and there isn't a binding for that domain
available yet, ypbind needs to supply a status value along with its
failure message. Set yprespbody.ypbind_error before returning from
a ypbindproc_domain request.
yplib.c: properly handle the error status messages ypbind now has the
ability to send us. Add a ypbinderr_string() function to decode the
error values.
ypwhich.c: handle ypbind errors correctly: yperr_string() can't handle
ypbind_status messages -- use ypbinderr_string instead.
- it succeeded on non-directories (see POSIX 5.1.2.4).
- it hung on (non-open) named pipes.
- it leaked memory if the second malloc() failed.
- it didn't preserve errno across errors in close().
of the plus or minus lists at all, reject him. This lets you create
a +@netgroup list of users that you want to admit and reject everybody
else. If you end your +@netgroup list with the wildcard line
(+:::::::::) then you'll have a +@netgroup list that remaps the
specified people but leaves people not in any netgroup unaffected.
where one or more of the non-default domains are not yet bound.
If we make a YP request for a domain other than the default domain,
and there is no binding for the new domain yet, _yp_dobind() sees
that the /var/yp/binding/DOMAIN.VERS file for the unbound domain is
not locked (by ypbind) and from this it concludes that the NIS system
is dead, so it gives up.
This behavior has been changed: before giving up in this case, we now
make a second check to see if the binding file for the *default* domain
is also not locked. Only if the default domain binding file is also
unlocked to we now assume that ypbind has bought the farm and bail out.
(Note: this assumes that the user hasn't changed the default domain
while ypbind is running.)
With this change, _do_ypbind() is allowed to proceed into the next
section of code wherein it prods ypbind into establishing a binding
for the new domain. This first call times out after ten seconds,
after which it should retry and succeed. From then on, the binding
for the second domain should be handled normally.
isctype.c:
o The tolower() and toupper() functions duplicated too much code
and were out of date (surprise). This didn't matter because
it was difficult to call them.
o Change formatting to be more like that in <ctype.h> (with
extra parentheses as in the macros). Perhaps this file should
be machine generated or everything should be handled like
__tolower() so that no code is repeated.
nomacros.c:
o Instead of looking at _USE_CTYPE_INLINE_ to see what <ctype.h>
has done, set _EXTERNALIZE_CTYPE_INLINES_ to tell <ctype.h>
what to do, so that we don't have anything left to do. Note
that code is now generated even if inlines are used by default.
This allows users to switch to non-inline versions.
select() returns EINVAL if you try to feed it a value of FD_SETSIZE greater
that 256. You can apparently adjust this by specifying a larger value of
FD_SETSIZE when configuring your kernel. However, if you set the maximum
number of open file descriptors per process to some value greater than
the FD_SETSIZE value that select() expects, many selects() within the RPC
library code will be botched because _rpc_dtablesize() will return
invalid numbers. This is to say that it will return the upper descriptor
table size limit which can be much higher than 256. Unless select() is
prepared to expect this 'unusually' high value, it will fail. (A good
example of this can be seen with NIS enabled: if you type 'unlimit' at
the shell prompt and then run any command that does NIS calls, you'll
be bombarded with errors from clnttcp_create().)
A temporary fix for this is to clamp the value returned by _rpc_dtablesize()
at FD_SETSIZE (as defined in <sys/types.h> (256)). I suppose the Right
Thing would be to provide some mechanism for select() to dynamically
adjust itself to handle FD_SETSIZE values larger than 256, but it's a
bit late in the game for that. Hopefully 256 file descriptors will be enough
to keep RPC happy for now.
add #includes for YP headers when compiling with -DYP to avoid some implicit
declarations.
getgrent.c & getnetgrent.c: add some #includes to avoid implicit declarations
of YP functions.
Obtained from: Casper H. Dik (by vay of Usenet)
Small patch to help improve NIS rebinding times (among other things):
>From: casper@fwi.uva.nl (Casper H.S. Dik)
>Newsgroups: comp.sys.sun.misc,comp.sys.sun.admin
>Subject: FIX for slow rebinding of NIS.
>Summary: a small change in libc makes life with NIS a lot easier.
>Message-ID: <1992Jan17.173905.11727@fwi.uva.nl>
>Date: 17 Jan 92 17:39:05 GMT
>Sender: news@fwi.uva.nl
>Organization: FWI, University of Amsterdam
>Lines: 138
>Nntp-Posting-Host: halo.fwi.uva.nl
Have you been plagued by long waits when your NIS server is rebooted?
READ ON!
Sun has a patch, but the README says:
********************* WARNING ******************************
This is a new version of ypbind that never uses the NIS
binding file to cache the servers binding. This will have
the effect of fixing the current symptom. However, it might
degrade the overall performance of the system when the
server is available. This is most likely to happen on an
overloaded server, which will cause the network to produce
a broadcast storm.
*************************************************************
Therefor, I have produced another fix.
o What goes wrong.
When the NIS server is rebooted, ypserv will obtain different ports
to listen for RPC requests. All clients will continue to use the old
binding they obtained earlier. The NIS server will send ICMP dst unreachable
messages for the RPC requests that arrive at the old port. These ICMPs
are dropped on the floor and the client code will continue sending the
requests until the timer has expired. The small fix at the end of this
message will pick up these ICMP messages and deliver them to the RPC layer.
o Before and after.
I've tested this on some machines and this is the result:
(kill and restart ypserv on the server)
original% time ypmatch user passwd
user:....
0.040u 0.090s 2:35.64 0.0% 0+126k 0+0io 0pf+0w (155 seconds elapsed time)
fixedhost% time ypmatch user passwd
user:....
0.050u 0.050s 0:10.20 0.9% 0+136k 0+0io 0pf+0w (10 seconds elapsed time)
Rebinding is almost instantaneous.
o Other benefits.
RPC calls that use UDP as transport will no longer time out but
will abort much sooner. (E.g., the remote host is unreachable or
111/udp is filtered by an intermediate router)
Grrr. If the dbhash routines weren't grossly overengineered I wouldn't
even need to do this! :-(
Also now export the hash_stats routine. Manpage coming RSN - I promise.
Make sure all arguments to the yp_*() functions are valid before sending
them off to the server. This is somewhat distressing: once again my
FreeBSD box brought down my entire network because of NIS bogosities.
I *think* the poor argument checking in this module is the cause, but
I still haven't been able to reproduce the exact series of events that
lead to the ypserv crashes. For now I've resorted to sticking my FreeBSD
box in a seprate domain. Hopefully a weekend of heavy testing will
uncover the problem.