/var/run resides on an NFS filesystem (flock() always returns 0 in
this case, so we falsely assume that ypbind is dead and bail out).
Settle instead for better failure checking when using clnttcp_create()
and clnt_call() to interact with ypbind. We still try to flock()
/var/yp/binding/$DOMAINNAME.2, but if this doesn't work, we drop into
the code that retrieves the binding information from ypbind directly.
If that also fails, then we're toast. On NFS filesystems, this means
we'll be ignoring the binding file for no reason and always talking to
ypbind even though we don't have to, but at least things will work.
(I could just replace the flock(/var/run/ypbind.lock) check with
an RPC call to ypbind's NULLPROC procedure, but if the flock() of
the binding file doesn't pan out we're going to try to talk to
ypbind later anyway. *sigh* Is NFS file locking ever going to work?)
broken. The translation from network number to ASCII string was not
working correctly (you would sometimes get things like 0.244.0.0 instead
of 244.0.0).
Also copied results of yp_match() to a static buffer for consistency
with gethostbynis.c.
Note: _getnetbynisaddr() chops off trailing .0's, i.e. 244.0.0 is
truncated to 244. By contrast, getnetbyht.c code (for local /etc/networks
lookups) leaves the traling .0's in place. This means that the NIS
and local file lookups will match different things when looking up the
same network number. I'm not sure which is the correct behavior. (I
think the DNS lookup code tries all combinations -- should the NIS
and local host lookup routines do that too?)
the precision; ANSI X3J11 is not crystal clear but certainly says
that the precision specifies the number of /digits/, and signs
and "0x" aren't really digits.
NetBSD already has a similar patch.
of a successful map retrieval. (This has to do with a previous change
to xdr_ypresp_all_seq() and ypxfr_get_map(); originally, yp_all()
would look for a return value of YP_FALSE to signal success, but now
it should be looking for YP_NOMORE. It should not be passing YP_NOMORE
back up to the caller though.)
Noticed by: <aagero@aage.priv.no>
There is also another small bug here, which is that the call to
xdr_free() that happens immediately after the clnt_call() in yp_all()
clobbers the return status value. I've worked around this for now,
but I think the xdr_free() is actually bogus and should be removed.
I want to check some more before I do that though.
a machine with aliase ip addresses on the same subnet of an
interfaces' `real' ip addresses would generate <n> duplicate
broadcasts in clnt_broadcast().
Basically, this fix does a purge on the list of bradcast addresses.
- Fix problem described in PR #1079: _gethostbynisaddr() doesn't
work. Make it accept the same arguments as all the other
gethostby*addr() functions and properly convert the supplied IP
address into a text string so that yp_match() can find it in the
hosts.byaddr map.
- Also fix potential memory leak: copy the results of yp_match() to
a static buffer and free the result (yp_match() returns dynamically
allocated memory).
ether_addr.c:
- Since I was in the neighborhood, fix ether_ntohost() and
ether_hostton() so that they don't bogusly for a free(result)
when yp_match() fails.
matter much on some systems, but on ftp servers (like wcarchive) where
you run with special stripped group and pwd.db files in the anonymous
ftp /etc, this can be a major speedup for ls(1).
ss_flags to SS_DISABLE and SS_ONSTACK. SA_ONSTACK is still used in
struct sigaction. Nowhere in our entire source tree could I find a
single place these were used.
reconnect once using the saved openlog() parameters.
This helps one of the system startup race conditions. If syslogd takes too
long to get going, some daemons can fail the connection and forever log
to the console even though the syslogd is running. That is ..unfortunate..
the statically compiled PS_STRINGS and USRSTACK variables. This prevents
programs using setproctitle from coredumping if the kernel VM is increased,
and stops libkvm users (w, ps, etc) from needing to be recompiled if only
the VM layout changes.
explicit that it is global to the entire "session", and that setsid() or
daemon() are need to have been called at some point.
The most notable offender of setlogin() misuse is XFree86's xdm.
for "fts_open" was wrong. Also, the "fts_info" field of the FTSENT
structure was misleadingly described as containing "flags". Actually, it
contains a single integer value.
in the main text of various man pages.
Thanks to Warner Losh for adding an option to manck to allow
it to scan the entire man page looking for bogus xrefs, instead
of just checking the SEE ALSO section.
resides in read-only memory is going to cause the program to core dump,
and this is commmon with older pre-ANSI C programs.
(I've scratched my head over this one at 3 in the morning before
while trying to port some ancient program)
Suggested by: Gary Kline <kline@tera.com>
Also corrected a few minor formatting errors, file location and cross
references in some of the section 3 man pages.
This shuts up a lot of the output from "manck" for section 3.
Install (optional) libutil.h with prototypes for the functions and
document this in the man page.
minor cleanups to the various routines, include the prototype file, declare
return codes etc.
of signals. Signals are now properly caught, tty state is being
restored, and the previous sigaction triggered. Upon receipt of a
sigcont, echo is turned off again.
SIGTSTP causes a buffer flush, the man page mentions this. (Although
i rather think of it as a feature than a bug.)
This is likely to be my last FreeBSD action for 1995, xearth shows
me that our .au guys must already write 1996. :-)
looking at a high resolution clock for each of the following events:
function call, function return, interrupt entry, interrupt exit,
and interesting branches. The differences between the times of
these events are added at appropriate places in a ordinary histogram
(as if very fast statistical profiling sampled the pc at those
places) so that ordinary gprof can be used to analyze the times.
gmon.h:
Histogram counters need to be 4 bytes for microsecond resolutions.
They will need to be larger for the 586 clock.
The comments were vax-centric and wrong even on vaxes. Does anyone
disagree?
gprof4.c:
The standard gprof should support counters of all integral sizes
and the size of the counter should be in the gmon header. This
hack will do until then. (Use gprof4 -u to examine the results
of non-statistical profiling.)
config/*:
Non-statistical profiling is configured with `config -pp'.
`config -p' still gives ordinary profiling.
kgmon/*:
Non-statistical profiling is enabled with `kgmon -B'. `kgmon -b'
still enables ordinary profiling (and distables non-statistical
profiling) if non-statistical profiling is configured.
is really necessary. Going backwards on a P6 is much slower than forwards
and it's a little slower on a P5. Also moved the count mask and 'std'
down a few lines - it's a couple percent faster this way on a P5.
replace the dozen other various hacks in the code that do all sorts
of crude things including spamming the envrionment strings with the new
argv string.
This version is mainly inspired by the sendmail version, with a couple of
ideas taken from the NetBSD implementation as well.
XDR routines auto-generated by rpcgen don't quite match the format of
the original ones even though tey have the same names (that was one of
the things wrong with the old XDR routines).
rpcgen-erated on the fly (just like librpcsvc).
Makefile: Add rule for generating yp_xdr.c and yp.h.
xdryp.c: gut everything except the special ypresp_all XDR function
needed to to handle yp_all() (this one can't be created on
the fly), and xdr_datum(), which isn't used internally by
libc, but which as documented as being there in yp_prot.h,
so what the hell. We now get everything else from yp_xdr.c.
yplib.c: change a few structure member names to match those found in
yp.h instead of those declared in yp_prot.h.
via mmap() up around the shared library area. Previously the directory
was allocated from space from it's own memory pool. Because of the way it
was being extended on processes with large malloced data segments (ie: inn)
once the page directory was extended for some reason, it was not possible
to lower the heap size any more to return pages to the OS.
(If my understanding is correct, page directory expansion occurs at 4MB,
12MB, 20MB, 28MB, etc.) I was seeing INN allocate a large amount of short
term memory, pushing it over the 28MB mark, and once it's transient demands
hit 28MB, it never freed it's pages and swap space again.)
I've been running this in my libc for about a month...
Also, seperate MALLOC_STATS from EXTRA_SANITY.. I found it useful to call
malloc_dump() from within INN from a ctlinnd command to see where the hell
all the memory was going.. :-) I've left MALLOC_STATS enabled, as it has
no run-time or data storage cost.
Reviewed by: phk
it before before trying to establish a binding. If /var/run/ypbind.lock
doesn't exist, or if it exists and isn't locked, then ypbind isn't
running, which means NIS is either turned off or hosed.
- Have _yp_check() call yp_unbind() after it sucessfully calls yp_bind()
to make sure it frees resources correctly. (I don't think there's really
a memory leak here, but it seems somehow wrong to call yp_bind() without
making a corresponding call to yp_unbind() afterwards.)
This makes the NIS code behave a little better in cases where libc makes
calls to NIS, but it isn't running correctly (i.e. there's no ypbind).
This cleans up some strange libc behavior that manifests itself if
you have the system domain name set, but aren't actually running NIS.
In this event, the getrpcent(3) code could try to call into NIS and
cause several inexplicable "clnttcp_create error: RPC program not
registered" messages to appear. This happens because _yp_check() checks
if the system domain name is set and, if it is, proceeds to call
yp_bind() to attempt to establish a binding. Since there is no
binding file (remember: ypbind isn't running, so /var/yp/binding
will be empty), _yp_dobind() will attempt to contact ypbind to
prod it into binding the domain. And because ypbind isn't running,
the code generates the 'clnttcp_create' error. Ultimately the
_yp_check() fails and the getrpcent(3) code rolls over to the /etc/rpc
file, but the error messages are annoying, and the code should be
smart enough to forgo the binding attempt when NIS is turned off.
both call getservent() to do most of the work, so we only need to modify
this file to take care of everybody).
Note that there is only one NIS services map (services.byname) even
though there are getservbyname() and getservbyport() library functions.
but a commit mail got lost, it's the same as for this commit:
lib/libc/gen confstr.c crypt.c disklabel.c fstab.c getcap.c
getgrent.c getgrouplist.c getpass.c getpwent.c
initgroups.c nlist.c psignal.c pwcache.c setmode.c
sleep.c sysconf.c sysctl.c syslog.c usleep.c
lib/libc/locale none.c read_runemagi.c setlocale.c
lib/libc/net gethostbydns.c getnetbydns.c getnetbynis.c
lib/libc/nls msgcat.c
lib/libc/quad Makefile.inc
lib/libc/regex engine.c regcomp.c regerror.c
Minor cleanup, mostly unused vars and missing #includes.
Limit the number of quad functions we pull in for 'i386'.
I still belive the quad stuff should go back into gcc.
Add compile-time warnings about crypt functions.
- Fix buffer overflow problem once and for all: do away with the buffer
copies to 'user' prior to calling _scancaches() and just pass a pointer
to the buffer returned by yp_match()/yp_first()/yp_next()/whatever.
(We turn the first ':' to a NUL first so strcmp() works, then change it
back later. Submitted by Bill Fenner <fenner@parc.xerox.com> and
tweaked slightly by me.
- Give _pw_breakout_yp() the 'more elegant solution' I promised way back when.
Eliminate several copies to static buffers and replace them with just
one copy. (The buffer returned by the NIS functions is at most
YPMAXRECORD bytes long, so we should only need one static buffer of
the same length (plus 2 for paranoia's sake).)
- Also in _pw_breakout_yp(): always set pw.pw_passwd to the username
obtained via NIS regardless of what pw_fields says: usernames cannot
be overridden so we have no choice but to use the name returned by
NIS.
- _Again_ in _pw_breakout_yp(): before doing anything else, check that
the first character of the NIS-returned buffer is not a '+' or '-'.
If it is, drop the entry. (#define EXTRA_PARANOIA 1 :)
- Probe for the master.passwd.* maps once during __initdb() instead
of doing it each time _getyppass() or _nextyppass() is called.
- Don't copy the NIS data buffers to static memory in _getyppass()
and _nextyppass(): this is done in _pw_breakout_yp() now.
- Test against phkmalloc and phkmalloc/2 (TNG!) to make sure we're
free()ing the yp buffers sanely.
- Put _havemaster(), _getyppass() and nextyppass() prototypes under
#ifdef YP. (Somehow they ended up on the wrong side of the #endif.)
- Remove unused variable ___yp_only.
- In some cases, we don't properly resolve _all_ possible group memberships.
If a user is a member of both local and NIS groups, we sometimes lose some
of the membership info from NIS. (Reported by: Thorsten Kukuk
<kukuk@uni-paderborn.de>)
- Make NIS +groupname overrides actually work the way the SunOS group(5)
man page says they should (make them work for all cases: getgrent(),
getgrnam() and getgrgid()).
- When not compiled with -DYP, grscan() should ignore entries that
begin with a '+'. When compiled _with_ -DYP, grscan() should ignore
+groupname entries that don't refer to real NIS groups.
- Remove redundant redeclaration of fgets(), strsep() and index() inside
grscan(). We already #include all the right header files for these.
Note: -groupname exclusion as specified in the Sun documentation still
isn't supported. This'll be a 2.2 addition. Right now I just want this
stuff to work.
What was happening, is if syslogd was not running, syslog() would do
a strcat("\r\n") on a non-null-terminated buffer, and write it to the console.
This meant that sometimes extra characters could be written to the console
during boot, depending on the stack contents.
This totally avoids the potential problem by using writev() like the rest
of the does, and avoid modifying the buffer after the trouble we've gone to
to carefully protect it.
This is actually a trivial fix, in spite of the long commit message.. :-)
It only appeared during boot and shutdown with syslogd stopped.
running on a tty. (Same as isatty()) The old-style TIOCGETP ioctl
wouldn't fly if the kernel didn't have COMPAT_43.
Submitted by: Carl Fongheiser <cmf@netins.net>
Performance is comparable to gnumalloc if you have sufficient RAM, and
it screams around it if you don't.
Compiled with "EXTRA_SANITY" until further notice.
see malloc.3 for more details.
control hooks.
It is similar to an unrolled multi-part snprintf(), in that a "FILE *" is
attached to a string buffer. There is also an optimisation for the case
where the syslog format string does not contain %m, which should improve
performance of "informational" logging, like from ftpd.
the group map after encountering a badly formatted entry.
getpwent.c: same as above for _nextyppass(), and also turn a couple of
sprintf()s into snprintf()s to avoid potential buffer overruns. (The
other day I nearly went mad because of a username in my NIS database
that's actually 9 characters long instead of 8. Stuffing a 9-character
username into an 8-character buffer can do some strange things.)
(This reminds me: I hope somebody's planning to fix the buffer overrun
security hole in syslog(3) before 2.1 ships.)
on, which is fine, except that _yp_dobind() is called before we check
the cache. The means we can return from the cache check (if we have
a hit) without calling _yp_unbind().
We should do the cache check first and _then_ drop into the section
that binds the server and does the yp_match query.
seperate function to avoid duplication. Also fix getpwent() a
small bit to properly handle the case where the magic NIS '+'
entry appears before the end of the password file.
getgrent.c: be a little more SunOS-ish. Make it look like the NIS
group map is 'inserted' at the the point(s) where the magic NIS '+'
entry/entries appear.
getgrent: fix a file descriptor leak: remember to close the netgroup
file after we determine that we're using NIS-only innetgr() lookups.
Since Bruce changed the #include <res_config.h> to #include "res_config.h"
this is no longer needed, and only makes the 'make' more verbose for
no real reason.
Note that this was done by selective patching from diffs, to not conflict
with the 4.4bsd base code.. This was *not* a trivial task.. I have been
testing this code (apart from cosmetic changes) in my libc for a while now.
Obtained from: Paul Vixie <paul@vix.com>
/usr/include/ufs/ufs/quota (#include <ufs/ufs/quota.h>) that seems to work
ok though.
Closes PR # docs/670: quotactl man page incorr...
Submitted by: evans@scnc.k12.mi.us (Jeffrey Evans)
Fix for PR #510. The original problem was that __ivaliduser() was
failing to grant access to a machine listed in a +@netgroup specified
in /etc/hosts.equiv, even though the host being checked was most
certainly in the +@netgroup.
The /etc/hosts.equiv file in question looked like this:
localhost
+@netgroup
The reason for the failure was had to do with gethostbyaddr(). Inside
the __ivaliduser() routine, we need to do a gethostbyaddr() in order
to get back the actual name of the host we're trying to validate since
we're only passed its IP address. The hostname returned by gethostbyaddr()
is later passed as an argument to innetgr(). The problem is that
__icheckhost() later does a gethostbyname() of its own, which clobbers
the buffer returned by gethostbyaddr().
The fix is just to copy the hostname into a private buffer and use
_that_ as the 'host' argument that gets passed to innetgr().
And here I was crawling all over the innetgr() code thinking the
problem was there. *sigh*
bump it again if something else is added before 2.2.
The xdr_* functions are enabled only in the 2.2 (-current) branch
so far. If that modification is moved to the 2.1 (-stable) branch,
this one should, too.
Reviewed by: the mailing lists
- getnetgrent.c: address some NIS compatibility problems. We really need
to use the netgroup.byuser and netgroup.byhost maps to speed up innetgr()
when using NIS. Also, change the NIS interaction in the following way:
If /etc/netgroup does not exist or is empty (or contains only the
NIS '+' token), we now use NIS exclusively. This lets us use the
'reverse netgroup' maps and is more or less the behavior of other
platforms.
If /etc/netgroup exists and contains local netgroup data (but no '+').
we use only lthe local stuff and ignore NIS.
If /etc/netgroup exists and contains both local data and the '+',
we use the local data nd the netgroup map as a single combined
database (which, unfortunately, can be slow when the netgroup
database is large). This is what we have been doing up until now.
Head off a potential NULL pointer dereference in the old innetgr()
matching code.
Also fix the way the NIS netgroup map is incorporated into things:
adding the '+' is supposed to make it seem as though the netgroup
database is 'inserted' wherever the '+' is placed. We didn't quite
do it that way before.
(The NetBSD people apparently use a real, honest-to-gosh, netgroup.db
database that works just like the password database. This is
actually a neat idea since netgroups is the sort of thing that
can really benefit from having multi-key search capability,
particularly since reverse lookups require more than a trivial
amount of processing. Should we do something like this too?)
- netgroup.5: document all this stuff.
- rcmd.c: some sleuthing with some test programs linked with my own
version of innetgr() has revealed that SunOS always passes the NIS
domain name to innetgr() in the 'domain' argument. We might as well
do the same (if YP is defined).
- ether_addr.c: also fix the NIS interaction so that placing the
'+' token in the /etc/ethers file makes it seem like the NIS
ethers data is 'inserted' at that point. (Chances are nobody will
notice the effect of this change, which is just te way I like it. :)
specified in the top level Makefiles.
Previously I missed dozens of Makefiles that skip the install after
using `cmp -s' to decide that the install isn't necessary.
changeover, so we have to extend the format of timezone files (in a backward-
compatible way, of course). This probably means that libc needs a minor
version number bump before 2.2 is released (or maybe not).
by me). This probably loses for multibyte characters, but I have no
way of telling. I'll let ache decide whether to add this support to
startup_setlocale. Note that for this to make any sense at all, the
symlinks in /usr/share/locale must go. (For the moment, this doesn't
make any difference since there are no locales supplied.)
Obtained from: Arthur David Olson <ado@elsie.nci.nih.gov>
Back out the 'help NIS rebind faster' hack. This change used a
connect()/send() pair rather than the original sendto() to allow
RPC to pass ICMP host unreachable and similar errors up to RPC
programs that use UDP. This is not a terrible thing by itself, but it can
cause trouble in environments with multi-homed hosts: if the portmapper
on the multi-homed machine sends a reply with a source address
that's different than the one associated with the connection by
connect(), the kernel will send a port unreachable message and
drop the reply. For the sake of compatibility with everybody else
on the planet, it's best to revert to the old behavior.
*long, heavy sigh*
like 38400<any 8bit char, isalpha> it not detect this stuff and
produce very big number instead. Fixed by operating with unsigned char
and checking for isascii. (secure/telnetd hits by it f.e.)
the comment before checking for long lines, so there was a possibility
that the wrap-around might be used as an exploitable hostname.
Reviewed by:
Submitted by:
Obtained from:
Strange as it sounds, it should map to YPERR_DOMAIN instead.
The YP_NODOM protocol error code is generally returned by ypserv when you
ask it for data from a domain that it doesn't support. By contrast,
the YPERR_NODOM error code means 'local domain name not set.'
Consequently, this incorrect mapping leads to yperr_string() generating
a very confusing error message. YPERR_DOMAIN says 'couldn't
bind to a server which serves this domain' which is much closer
to the truth.
_gr_breakout_yp(): if we encounter a NULL pointer generated as the
result of a badly formatted NIS passwd entry (e.g. missing fields),
we punt and return an error code, thereby silently skipping the
bad entry.
last night:
_gr_breakout_yp() doesn't check for badly formatted NIS group entries.
For example, a bogus entry like this:
bootp::user1,user2,user3
will lead to a null pointer dereference and a SEGV (note that the GID
field is missing -- this results in one of the strsep(&result, ":")
returning NULL). The symtpom of this problem is programs dumping
core left and right the moment you add a + entry to /etc/group.
Note that while this is similar to an earlier bug, it's caused by a
different set of circumstances.
The fix is to check for the NULL pointers and have _gr_breakout_yp()
punt and return a failure code if it catches one. This is more or
less the behavior of SunOS: if a bad NIS group entry is encountered,
it's silently ignored. I don't think our standard (non-NIS) group
parsing code behaves the same way. It doesn't crash though, so I'm
citing the 'it ain't broken, don't fix it' rule and leaving it alone.
I'll probably have to add similar checks to _pw_breakout_yp() in
getpwent.c to ward off the same problems. It's rare that bad NIS
map entries like this occur, but we should handle them gracefully
when they do.
'cycle in netgroup check too greedy').
PR #508 is apparently due to an inconsistency in the way the 4.4BSD
netgroup code deals with bad netgroups. When 4.4BSD code encounters
a badly formed netgroup entry (e.g. (somehost,-somedomain), which,
because of the missing comma between the '-' and 'somedomain,' has
only 2 fields instead of 3), it generates an error message and
then bails out without doing any more processing on the netgroup
containing the bad entry. Conversely, every other *NIX in the world
that usees netgroups just tries to parse the entry as best it can
and then silently continues on its way.
The result is that two bad things happen: 1) we ignore other valid entries
within the netgroup containing the bogus entry, which prevents
us from interoperating with other systems that don't behave this way,
and 2) by printing an error to stderr from inside libc, we hose certain
programs, in this case rlogind. In the problem report, Bill Fenner
noted that the 'B' from 'Bad' was missing, and that rlogind exited
immediately after generating the error. The missing 'B' is apparently
not caused by any problem in getnetgrent.c; more likely it's getting
swallowed up by rlogind somehow, and the error message itself causes
rlogind to become confused. I was able to duplicate this problem and
discovered that running a simple test program on my FreeBSD system
resulted in a properly formatted (if confusing) error, whereas triggering
the error by trying to rlogin to the machine yielded the missing 'B'
problem.
Anyway, the fixes for this are as follows:
- The error message has been reformatted so that it prints out more useful
information (e.g. Bad entry (somehost,-somedomain) in netgroup "foo").
We check for NULL entries so that we don't print '(null)' anymore too. :)
- Rearranged things in parse_netgrp() so that we make a best guess at
what bad entries are supposed to look like and then continue processing
instead of bailing out.
- Even though the error message has been cleaned up, it's wrapped inside
a #ifdef DEBUG. This way we match the behavior of other systems. Since we
now handle the error condition better anyway, this error message becomes
less important.
PR #507 is another case of inconsistency. The code that handles
duplicate/circular netgroup entries isn't really 'too greedy; -- it's
just too noisy. If you have a netgroup containing duplicate entries,
the code actually does the right thing, but it also generates an error
message. As with the 'Bad netgroup' message, spewing this out from
inside libc can also hose certain programs (like rlogind). Again, no
other system generates an error message in this case.
The only change here is to hide the error message inside an #ifdef DEBUG.
Like the other message, it's largely superfluous since the code handles
the condition correctly.
Note that PR #510 (+@netgroup host matching in /etc/hosts.equiv) is still
being investigated. I haven't been able to duplicate it myself, and I
strongly suspect it to be a configuration problem of some kind. However,
I'm leaving all three PRs open until I get 510 resolved just for the
sake of paranoia.
ypbind.c:
Make fewer assumtions about the state of the dom_alive and dom_broadcasting
flags in roc_received().
If select() fails, use syslog() to report the error rather than perror().
Check that all our malloc()s succeed. Report malloc() failure in
ypbindproc_setdom_2() to callers.
yplib.c:
Use #defined constants in ypbinderr_string() rather than hard-coded values.
- If you take the wheel entry out of /etc/group and turn on NIS,
the '+:*::' line is incorrectly flagged as the entry for wheel (the
empty gid section is translated to 0), hence getgrgid() returns '+'
as the name of the group instead of 'wheel.'
- Using just '+:' as the 'turn on NIS' switch in /etc/group makes
getgrgid() dump core because of a null pointer dereference. (Last
time I was in here, I foolishly assumed that fixing the core dump
problems with getgrnam() and getgrent() would fix getgrgid() too.
Silly me.)
- Moved to a more client-driven model. We aggressively attempt to keep
the default domain bound (as before) but we give up on non-default
domains if we lose contact with a server and fail to get a response
after one round of broadcasting. This helps drastically reduce the
amount of network bandwitdh that ypbind consumes: if a client references
the secondary domain at some later point, this will prod ypbind into
establishing a new binding anyway, so continuously broadcasting without
need is pointless.
Note that we still actively seek out a binding for our default domain
even if no client program has queried us yet. I'm not exactly sure if
this matches SunOS's behavior or not, but I decided to do it this way
since we can get into all sorts of trouble if our default domain comes
unbound. Even so, we're still much quieter than we used to be.
- Removed a bunch of no-longer pertinent comments and a couple of
chunks of #ifdef 0'ed code that no longer fit in to the new layout.
- Theo deRaadt must have become frustrated with the callback mechanism
in clnt_broadcast(), because he shamelessly stole the clnt_broadcast()
code right out of the RPC library and hacked it up to suit his needs.
(Comments and all! :)
I can understand why: clnt_broadcast() blocks while awaiting replies.
Changing this behavior requires surgery. However, you can work around
this: fork the broadcast into a child process and relay the results
back to the parent via a pipe. (Careful obervation has shown that the
SunOS ypbind forks children for broadcasting too, though I can only
guess what sort of interprocess communication it uses. pipe() seems to
do the job well enough.)
This may seem like the long way around, but it's not really that
hard to implement, and I'd prefer to use documented RPC library functions
wherever possible. We're careful to limit the number of simultaneous
broadcasters to avoid swamping the system (the current limit is 5).
Each clnt_broadcast() call only sends out a small number of packets
at increasing intervals. We're also careful not to spawn more than one
bradcaster for a given domain.
- Used clntudp_bufcreate() and clnt_call() to implement a ping()
function for directly querying a particular server so that we can
check if it's still alive. This lets me completely remove the old
bradcasting code and use actual RPC library calls instead, at the
cost of more than a few handfulls of torn-out hair. (Make no mistake
folks: I *HATE* RPC.) Currently, the ping interval is one minute.
- Fixed another potential 'nfds too big for select()' bug: use
_rpc_dtablesize() instead of getdtablesize().
- Quieted gcc -Wall a bit.
- Probably a bunch of other stuff that I've forgotten.
ypbind.8:
- Updated man page to reflect modifications.
ypwhich.c:
- Small mind-o fix from last time: decode error results from
ypbind correctly (*groan*)
yplib.c:
- same as above
- Change behavior of _yp_dobind() a little: if we get back a 'Domain
not bound' error for a given domain, retry a few times before giving
up and passing the error back to the caller. We have to sleep for a
few seconds between tries since the 'Domain not bound' error comes
back immediately (by repeatedly looping, we end up pounding on ypbind).
We retry at most 20 times at 5 second intervals. This gives us a full
minute to get a response. This seems to deviate a bit from SunOS
behavior -- it appears to wait forever -- but I don't like the idea
of perpetually hanging inside a library call.
Note that this should fix the problems some people have with bindings
not being established fast enough at boot time; sometimes amd is started
in /etc/rc after ypbind has run but before it gets a binding set up. The
automounter gets annoyed at this and tends to exit. By pausing ther YP
calls until a binding is ready, we avoid this situation.
- Another _yp_dobind() change: if we determine that our binding files
are unlocked or nonexistent, jump directly to code that pokes ypbind
into restablishing the binding. Again, if it fails, we'll time out
eventually and return.