17 Commits

Author SHA1 Message Date
wpaul
7715d6b5da Some small signal handling tweaks: be sure to keep wait3()ing until all
children are reaped and make sure to block SIGCHLD delivery during handler
execution when installing SIGCHLD handler with sigaction().
1995-07-15 23:27:49 +00:00
rgrimes
4f960dd75f Remove trailing whitespace. 1995-05-30 03:57:47 +00:00
wpaul
f9661b97c5 This is another bug fix that should have gone into my last commit. I
actually had this done at one point and lost it somewhere along the
line. Again, this is an honest to gosh bug fix only: no functionality
is changed.

- After a child broadcaster process dies or is killed, set its dom_pipe_fds
descriptors to -1 so that the 'READFD > 0' test in the select() loop
does the right thing.

Since descriptor values can be re-used, failure to do this can lead
to a situation where a descriptor for an RPC socket can be mistaken for
a pipe. If this happens, RPC sockets could be incorrectly handed off to
handle_children(), which would then clear the descriptor from the select()
descriptor mask and prevent svc_getreqset() from handling them. The end
result would be that some RPC events would go unserviced. Curiously,
the failures only happen intermittently.
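
A rough sketch of the fix, assuming a cut-down stand-in for the binding struct (dom_binding_sketch, release_child_pipe() and is_child_pipe() are hypothetical names). Once the pipe is released, a descriptor that happens to reuse the old number no longer matches.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-domain binding record. */
struct dom_binding_sketch {
    int dom_pipe_fds[2];    /* pipe to the child broadcaster, or -1 */
};

/* Close both pipe ends and mark them unused so they never match again. */
static void release_child_pipe(struct dom_binding_sketch *b)
{
    assert(b != NULL);
    for (int i = 0; i < 2; i++) {
        if (b->dom_pipe_fds[i] >= 0)
            close(b->dom_pipe_fds[i]);
        b->dom_pipe_fds[i] = -1;
    }
}

/* Mirrors the 'READFD > 0' style test: is fd this domain's live read end? */
static int is_child_pipe(const struct dom_binding_sketch *b, int fd)
{
    assert(b != NULL);
    return b->dom_pipe_fds[0] > 0 && b->dom_pipe_fds[0] == fd;
}
```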
1995-05-29 16:39:52 +00:00
wpaul
3eb2d33deb Reviewed by: rgrimes, jkh and davidg (sort of)
Rod, Jordan and David have more or less given me the OK on this
with the understanding that it doesn't change any functionality.
It doesn't: these are bug fixes only. No other part of the system
should be affected. Of course, since I'm the only one working on
NIS, you'll just have to take my word on it. :)

Fixes for the following annoyingly subtle bugs:

- ypbindproc_setdom_2 is supposed to be declared void *, not bool_t *,
and it fails to correctly signal failures back to the ypset(8) command:
we need to call one of the svcerr_*() functions (in this case,
svcerr_noprog() seems a logical choice -- we're really cheating
a bit here because nothing else quite fits) to tell ypset that the
attempt to set the binding for a domain failed. If we don't do this,
failed ypset attempts either appear (incorrectly) to succeed, or
they time out.

- The lock handling for child processes isn't quite right. The
child broadcaster processes have to release all locks on the
binding files and the ypbind.lock file.

- The parent ypbind process will SEGV if you do the following:

-- start ypbind with the -ypset or -ypsetme flag
-- type 'ypwhich -d random_unserved_domain'
-- type 'ypset -d random_unserved_domain anyhost'
-- type 'ypwhich -d random_unserved_domain' again
-- wait about 60 seconds

What happens is this: the ypwhich command causes ypbind to fork a
broadcaster process that searches for a server for random_unserved_domain.
If you then use ypset to force a binding while this process is still alive,
the state flags that tell the ypbind parent process that the child
is running will be cleared. The second ypwhich command then causes
a *second* child process to be forked for random_unserved_domain,
which is verboten. When the first broadcaster exits and tells the
parent that it wasn't able to find a server for the domain, the parent
clobbers the entry for random_unserved_domain. Then the second broadcaster
exits and the same thing happens, only trying to clobber the entry
twice causes a SEGV.

The fix for this is a slight change in program structure: since we
can't have more than one broadcaster for a given domain at a time,
we save the pipe descriptors and pid for the child broadcaster in members
of the _dom_binding struct for the domain. (As a side effect, we
can get rid of the global child_fds variable.) So when rpc_received()
finds that it's been asked to do a ypset for a domain for which a
broadcaster process exists, it sends a SIGINT to the child to kill it
and closes the pipe to the now-dead child. This keeps everything in sync
and ensures that we don't leak file descriptors.
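
The structure change could look roughly like this. It is a sketch under assumptions: the struct and field names are hypothetical stand-ins for the new _dom_binding members, and the waitpid() call is added only to keep the example self-contained (the daemon itself may leave reaping to its SIGCHLD handler).

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-domain record holding the child's pid and pipe. */
struct dom_binding_sketch {
    pid_t dom_broadcaster_pid;  /* 0 when no broadcaster is running */
    int   dom_pipe_fds[2];      /* -1 when closed */
};

/* Kill the domain's broadcaster (if any) and release its pipe. */
static void stop_broadcaster(struct dom_binding_sketch *b)
{
    assert(b != NULL);
    if (b->dom_broadcaster_pid > 0) {
        kill(b->dom_broadcaster_pid, SIGINT);
        /* Assumption: reap here so the sketch leaves no zombie behind. */
        waitpid(b->dom_broadcaster_pid, NULL, 0);
        b->dom_broadcaster_pid = 0;
    }
    for (int i = 0; i < 2; i++) {
        if (b->dom_pipe_fds[i] >= 0)
            close(b->dom_pipe_fds[i]);
        b->dom_pipe_fds[i] = -1;
    }
}
```

Because the pid and descriptors live in the per-domain record, a ypset for that domain only has to call one routine to bring the parent's state back in sync.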

- ping() should be using YPPROC_DOMAIN rather than YPPROC_DOMAIN_NONACK
when it does its clnt_call() to the server.

- Removed the check for client_handle == NULL in ping() and made
client_handle local to ping instead of a member of the _dom_binding
struct. This fixes another potential ypset problem: using ypset to
force a binding to a machine that has an NIS server but which *doesn't*
support the domain we're after can result in permanently bogus bindings.

- the 'server OK' message prints the wrong IP address.
1995-05-26 05:28:00 +00:00
wpaul
8c0e89af78 One for the road: create a ypbind.lock file under /var/run and try to lock
it. If we can't, it means there's already a ypbind running and we should
abort.
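
One way the lock check might be written, as a sketch only: it assumes flock() is the locking primitive (the commit does not say which call is used), and acquire_lock_file() is a hypothetical helper name.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/* Open (creating if needed) and exclusively lock path without blocking.
 * Returns the locked descriptor, or -1 if the lock is held or on error. */
static int acquire_lock_file(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_CREAT | O_RDWR, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd);      /* lock is held elsewhere: caller should abort */
        return -1;
    }
    return fd;          /* keep open for the life of the daemon */
}
```

The descriptor has to stay open for as long as the daemon runs, since closing it drops the lock.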
1995-05-12 16:52:58 +00:00
wpaul
7169c20822 Ack! One slipped through the cracks: remember to return the correctly
filled-in result structure to the caller when a resource allocation
error is encountered in ypbindproc_domain_2.
1995-05-11 00:16:54 +00:00
wpaul
78ce3864de Performance improvements/simplifications/cleanups:
- Make the child process reaper signal-driven. (Previously, we called reaper()
  once a second each time we went through the select() loop. This was
  convenient, but inefficient.)

- Increase main select() timeout from 1 second to 60 seconds and use
  this as the ping timer instead of using timestamps in the _dom_binding
  structure. This and the reaper() change noted above make ypbind a little
  less CPU-intensive.

- Don't flag EINTR's from select() as errors since they will happen as a
  result of incoming SIGCHLD's interrupting select().

- Prevent possible resource hogging. Currently we malloc() memory
  each time a user process asks us to establish a binding for a domain,
  but we never free it. This could lead to serious memory leakage if a
  'clever' user did something like ask ypwhich to check the bindings
  for domains 0.0.0.0.0.0.0.0.0.0 through 9.9.9.9.9.9.9.9.9.9 inclusive.
  (This would also make a mess out of the /var/yp/binding directory.)

  We now avoid this silliness by a) limiting the maximum number of
  simultaneous bindings we can manage to 200, and b) free()ing _dom_binding
  structures of secondary domains whose servers have stopped responding.
  We unlink the /var/yp/binding/domain.vers files for the free()ed
  domains too.

  (This is safe to do since a client can prod us into reestablishing the
  binding, at which time we'll simply allocate a new _dom_binding structure
  for it.)

  We keep count of the total number of domains. If asked to
  allocate more than the maximum, we return an error. I have yet to hear
  of anybody needing 200 simultaneous NIS bindings, so this should be
  enough. (I chose the number 200 arbitrarily. It can be increased if need
  be.)

- Changed "server not responding"/"server OK" messages to display server
  IP addresses again since it looks spiffier.

- Use daemon() to daemonify ourselves.

- Added a SIGTERM handler that removes all binding files and unregisters
  the ypbind service from the portmapper when a SIGTERM is received.

- The comment 'blow away everything in BINDINGDIR' has no associated code.
  Give it some: clean out /var/yp/binding at startup (if it exists).

This completes my ypbind wishlist. Barring bug fixes, I shouldn't need to
go poking around in here anymore. (Of course, this means I can start
working on my ypserv wishlist now... :)
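
The select() changes listed above (a long timeout that doubles as the ping timer, and EINTR no longer treated as an error) can be sketched as follows. poll_once() is a hypothetical single-descriptor helper, not the real main loop, which watches many descriptors.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#define PING_INTERVAL_SECS 60   /* the 60-second timeout from the list above */

/* One pass of a select() loop over a single descriptor.
 * Returns 1 if a byte was read, 0 on timeout or interruption, -1 on error. */
static int poll_once(int fd, long timeout_secs)
{
    fd_set readfds;
    struct timeval tv;
    char byte;
    int n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_secs;
    tv.tv_usec = 0;

    n = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return (errno == EINTR) ? 0 : -1;   /* a signal is not an error */
    if (n == 0)
        return 0;                           /* timeout: time to ping servers */
    return (read(fd, &byte, 1) == 1) ? 1 : -1;
}
```

A caller would normally pass PING_INTERVAL_SECS as the timeout. A real loop would also distinguish a timeout from an interruption so that pings are not sent early.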
1995-05-10 23:02:41 +00:00
wpaul
f9be2b50a3 Cosmetic changes and paranoia checks:

ypbind.c:
Make fewer assumptions about the state of the dom_alive and dom_broadcasting
flags in rpc_received().

If select() fails, use syslog() to report the error rather than perror().

Check that all our malloc()s succeed. Report malloc() failure in
ypbindproc_setdom_2() to callers.

yplib.c:

Use #defined constants in ypbinderr_string() rather than hard-coded values.
1995-05-03 18:34:22 +00:00
wpaul
b72b2e3557 ypbind.c: Major overhaul.
- Moved to a more client-driven model. We aggressively attempt to keep
the default domain bound (as before) but we give up on non-default
domains if we lose contact with a server and fail to get a response
after one round of broadcasting. This helps drastically reduce the
amount of network bandwidth that ypbind consumes: if a client references
the secondary domain at some later point, this will prod ypbind into
establishing a new binding anyway, so continuously broadcasting without
need is pointless.

Note that we still actively seek out a binding for our default domain
even if no client program has queried us yet. I'm not exactly sure if
this matches SunOS's behavior or not, but I decided to do it this way
since we can get into all sorts of trouble if our default domain comes
unbound. Even so, we're still much quieter than we used to be.

- Removed a bunch of no-longer pertinent comments and a couple of
chunks of #ifdef 0'ed code that no longer fit in to the new layout.

- Theo deRaadt must have become frustrated with the callback mechanism
in clnt_broadcast(), because he shamelessly stole the clnt_broadcast()
code right out of the RPC library and hacked it up to suit his needs.
(Comments and all! :)

I can understand why: clnt_broadcast() blocks while awaiting replies.
Changing this behavior requires surgery. However, you can work around
this: fork the broadcast into a child process and relay the results
back to the parent via a pipe. (Careful observation has shown that the
SunOS ypbind forks children for broadcasting too, though I can only
guess what sort of interprocess communication it uses. pipe() seems to
do the job well enough.)

This may seem like the long way around, but it's not really that
hard to implement, and I'd prefer to use documented RPC library functions
wherever possible. We're careful to limit the number of simultaneous
broadcasters to avoid swamping the system (the current limit is 5).
Each clnt_broadcast() call only sends out a small number of packets
at increasing intervals. We're also careful not to spawn more than one
broadcaster for a given domain.

- Used clntudp_bufcreate() and clnt_call() to implement a ping()
function for directly querying a particular server so that we can
check if it's still alive. This lets me completely remove the old
broadcasting code and use actual RPC library calls instead, at the
cost of more than a few handfuls of torn-out hair. (Make no mistake
folks: I *HATE* RPC.) Currently, the ping interval is one minute.

- Fixed another potential 'nfds too big for select()' bug: use
_rpc_dtablesize() instead of getdtablesize().

- Quieted gcc -Wall a bit.

- Probably a bunch of other stuff that I've forgotten.
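
The fork-and-pipe workaround described in the list above might look something like this. It is only a sketch: blocking_search() is a stand-in for the blocking clnt_broadcast() call, and struct bcast_result and spawn_broadcaster() are hypothetical names.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result record a broadcaster child relays to its parent. */
struct bcast_result {
    char domain[64];
    int  found;         /* 1 if a server answered, 0 otherwise */
};

/* Stand-in for the blocking clnt_broadcast() call; always "finds" a server. */
static int blocking_search(const char *domain)
{
    (void)domain;
    return 1;
}

/* Fork a child to run the blocking search. On success returns the child's
 * pid and stores the pipe's read end in *read_fd; returns -1 on failure. */
static pid_t spawn_broadcaster(const char *domain, int *read_fd)
{
    int fds[2];
    pid_t pid;

    assert(domain != NULL && read_fd != NULL);
    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: search, report, exit */
        struct bcast_result r;

        close(fds[0]);
        memset(&r, 0, sizeof(r));
        strncpy(r.domain, domain, sizeof(r.domain) - 1);
        r.found = blocking_search(domain);
        if (write(fds[1], &r, sizeof(r)) != (ssize_t)sizeof(r))
            _exit(1);
        _exit(0);
    }
    close(fds[1]);                  /* parent keeps only the read end */
    *read_fd = fds[0];
    return pid;
}
```

The parent can then add the read end to its select() mask and carry on servicing RPC requests while the child blocks.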

ypbind.8:

- Updated man page to reflect modifications.

ypwhich.c:

- Small mind-o fix from last time: decode error results from
ypbind correctly (*groan*)

yplib.c:

- same as above

- Change behavior of _yp_dobind() a little: if we get back a 'Domain
not bound' error for a given domain, retry a few times before giving
up and passing the error back to the caller. We have to sleep for a
few seconds between tries since the 'Domain not bound' error comes
back immediately (by repeatedly looping, we end up pounding on ypbind).
We retry at most 20 times at 5 second intervals. This gives us up to 100
seconds to get a response. This seems to deviate a bit from SunOS
behavior -- it appears to wait forever -- but I don't like the idea
of perpetually hanging inside a library call.

Note that this should fix the problems some people have with bindings
not being established fast enough at boot time; sometimes amd is started
in /etc/rc after ypbind has run but before it gets a binding set up. The
automounter gets annoyed at this and tends to exit. By pausing the YP
calls until a binding is ready, we avoid this situation.

- Another _yp_dobind() change: if we determine that our binding files
are unlocked or nonexistent, jump directly to code that pokes ypbind
into reestablishing the binding. Again, if it fails, we'll time out
eventually and return.
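
The bounded retry described for _yp_dobind() can be sketched like this. bind_with_retries() and fake_attempt() are hypothetical; the callback stands in for whatever actually asks ypbind for a binding, and the two constants are the figures quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_RETRIES     20  /* figures quoted in the commit message */
#define RETRY_DELAY_SEC 5

/* Callback type: returns 0 once the domain is bound, nonzero otherwise. */
typedef int (*bind_attempt_fn)(void *ctx);

/* Try to bind up to max_tries times, sleeping delay_secs between failures.
 * Returns 0 on success, -1 if every attempt failed. */
static int bind_with_retries(bind_attempt_fn attempt, void *ctx,
                             int max_tries, unsigned delay_secs)
{
    assert(attempt != NULL);
    for (int i = 0; i < max_tries; i++) {
        if (attempt(ctx) == 0)
            return 0;
        if (i + 1 < max_tries)
            sleep(delay_secs);  /* avoid pounding on ypbind */
    }
    return -1;
}

/* Illustrative attempt: fails until the counter in ctx reaches zero. */
static int fake_attempt(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return 1;
    }
    return 0;
}
```

A caller would use bind_with_retries(attempt, ctx, MAX_RETRIES, RETRY_DELAY_SEC) and pass the final error back once the retries run out, rather than hanging forever.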
1995-04-26 19:03:16 +00:00
wpaul
be813a3b68 small NIS binding fixes:
ypbind.c: if a client program asks ypbind for the name of the server
for a particular domain, and there isn't a binding for that domain
available yet, ypbind needs to supply a status value along with its
failure message. Set yprespbody.ypbind_error before returning from
a ypbindproc_domain request.

yplib.c: properly handle the error status messages ypbind now has the
ability to send us. Add a ypbinderr_string() function to decode the
error values.

ypwhich.c: handle ypbind errors correctly: yperr_string() can't handle
ypbind_status messages -- use ypbinderr_string instead.
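
For illustration, a decoder in the spirit of ypbinderr_string() might look like this. The SKETCH_ constants, their values and the message strings are assumptions made for the example; the real definitions live in the YP protocol headers and yplib.c.

```c
/* Illustrative error codes; the real constants live in the YP headers. */
#define SKETCH_YPBIND_ERR_ERR    1  /* internal error */
#define SKETCH_YPBIND_ERR_NOSERV 2  /* no server bound for the domain */
#define SKETCH_YPBIND_ERR_RESC   3  /* resource allocation failure */

/* Map a ypbind error value to a human-readable string. */
static const char *sketch_ypbinderr_string(int code)
{
    switch (code) {
    case 0:
        return "Success";
    case SKETCH_YPBIND_ERR_ERR:
        return "Internal ypbind error";
    case SKETCH_YPBIND_ERR_NOSERV:
        return "Domain not bound";
    case SKETCH_YPBIND_ERR_RESC:
        return "System resource allocation failure";
    default:
        return "Unknown ypbind error";
    }
}
```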
1995-04-21 18:04:36 +00:00
wpaul
77cd0ce3be In environments with multiple NIS servers (a master and several slaves)
one ypbind broadcast can yield several responses. This can lead to
some confusion: the syslog message from ypbind will indicate a rebinding
to the first server that responds, but we may subsequently change our
binding to another server when the other responses arrive. This results
in ypbind reporting 'server OK' to one address and ypwhich reporting a
binding to another.

The behavior of the rpc_received() function has been changed to prevent
this: subsequent responses received after a binding has already been
established are ignored. Rebinding gratuitously each time we get a
new response is silly anyway.
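
In outline, the new rule amounts to the following. The struct and note_response() are hypothetical, and the server address is a plain integer for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slice of the per-domain state. */
struct dom_binding_sketch {
    int      dom_alive;         /* nonzero once a binding is established */
    uint32_t dom_server_addr;   /* server address as a plain integer */
};

/* Record a broadcast reply. Returns 1 if it established the binding,
 * 0 if it was ignored because the domain is already bound. */
static int note_response(struct dom_binding_sketch *b, uint32_t server_addr)
{
    assert(b != NULL);
    if (b->dom_alive)
        return 0;               /* later replies are dropped */
    b->dom_server_addr = server_addr;
    b->dom_alive = 1;
    return 1;
}
```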

Also backed out the non-fix I made in my last ypbind commit. (Pass
me the extra large conical hat, please.)

(At some point I'm going to seriously re-work ypbind and the _yp_dobind()
library function to bring them in line with SunOS's documented behavior:
binding requests are supposed to be 'client-driven.' The _yp_dobind()
function should be responsible for retrying connections in response to
calls from client programs rather than having ypbind broadcasting
continuously until a server responds. The current setup works okay in
normal operation, but we broadcast far more often than we should.)
1995-04-15 23:35:46 +00:00
wpaul
007c073f7a First crack at a man page for ypbind. 1995-04-09 21:59:06 +00:00
wpaul
3c9467d883 Fix long standing bogosity in ypbind: if /var/yp/binding doesn't exist,
ypbind is supposed to create it but it doesn't. This is because when
it checks the return value for the attempted open() of
/var/yp/binding/DOMAIN.VERSION, it tests only for a value of -1. This
is bogus because open() doesn't return -1 in this case. Now it checks
for < 0 instead.
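
A sketch of the intended behavior, assuming a hypothetical helper open_binding_file(): any negative return from open() counts as failure, and a missing directory is created before retrying.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Open path for writing, creating its directory dir if it is missing.
 * Returns a descriptor, or a negative value on failure. */
static int open_binding_file(const char *dir, const char *path)
{
    int fd;

    assert(dir != NULL && path != NULL);
    fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0 && errno == ENOENT) {
        /* The directory is missing: create it and try once more. */
        if (mkdir(dir, 0755) < 0 && errno != EEXIST)
            return -1;
        fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0644);
    }
    return fd;
}
```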

This should make life easier for many NIS-newbies who would otherwise
be left scratching their heads wondering why the NIS client stuff won't
work despite their best efforts. ("I set the domain name on my machine,
and /var/yp exists, but when I start ypbind and try a 'ypcat passwd,'
it says it can't bind to a server for this domain! Please help!")

*long, heavy sigh*
1995-04-02 03:10:55 +00:00
wpaul
c0657f0db1 Submitted by: Sebastian Strollo <seb@erix.eriksson.se>
Fixes to ypbind:

- Correctly report the fact that we've bound to a new server when
logging the 'server OK' message.

- Report 'server not responding' just once instead of every
5 seconds while searching for a new server. (Prevents overstuffing
the syslog.)

- Apply patch from Sebastian Strollo to implement '-s' (secure) flag.
ypbind will reject connections from servers that do not originate
from a reserved TCP port.

- Apply patch from Sebastian Strollo to detect when a YP server has
crashed and come back up on a different port number.
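
The '-s' check presumably boils down to a test on the sender's source port. A sketch, with from_reserved_port() as a hypothetical helper name:

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>

/* Returns 1 if the sender's source port is in the reserved range
 * (below IPPORT_RESERVED), 0 otherwise. */
static int from_reserved_port(const struct sockaddr_in *from)
{
    assert(from != NULL);
    return ntohs(from->sin_port) < IPPORT_RESERVED;
}
```

With the secure flag set, a reply for which this returns 0 would be rejected.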
1995-02-26 04:42:48 +00:00
wpaul
2d9bdef553 ypbind jumbo patch :)
The existing ypbind exhibits some truly anti-social behavior. After
initially establishing a binding with an NIS server, the following events
take place:

- ypbind waits for 60 seconds before trying to broadcast a ping again
- after the 60 seconds expires, ypbind sends out broadcasts every 5 seconds
  come hell or high water.

These broadcasts travel far and wide, even to NIS servers in other domains
which dutifully log the packets even though they don't respond to them.
This leads to lots of unnecessary traffic and bloated log files.

This behavior has been fixed/changed. Here's what happens now:

- We still broadcast every 5 seconds at startup, just like before.

- Once bound, we send out packets once every 60 seconds to the server
  we're bound to AND NO ONE ELSE.

- If we fail to receive a reply from our server within FAIL_THRESHOLD
  seconds, we assume our server has croaked and go back to broadcasting
  everywhere every 5 seconds again until somebody answers. FAIL_THRESHOLD
  is currently set to 20 seconds.
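
The timing rules above can be sketched as a small decision function. Only FAIL_THRESHOLD is a name from the commit; the other constants, struct bind_state and next_step() are hypothetical, and the sketch ignores details such as resetting last_reply when an answer arrives.

```c
#include <assert.h>
#include <stddef.h>

#define PING_INTERVAL      60   /* seconds between pings while bound */
#define BROADCAST_INTERVAL  5   /* seconds between broadcasts while unbound */
#define FAIL_THRESHOLD     20   /* seconds without a reply before giving up */

enum next_action { ACT_WAIT, ACT_PING_SERVER, ACT_BROADCAST };

/* Hypothetical state; all times are seconds on one clock. */
struct bind_state {
    int  bound;         /* nonzero while we have a server */
    long last_sent;     /* when we last sent a ping or broadcast */
    long last_reply;    /* when the bound server last answered */
};

/* Decide what to do at time now, updating the state accordingly. */
static enum next_action next_step(struct bind_state *s, long now)
{
    assert(s != NULL);
    if (s->bound) {
        /* A ping is outstanding and the server has been silent too long. */
        if (s->last_sent > s->last_reply &&
            now - s->last_sent >= FAIL_THRESHOLD) {
            s->bound = 0;
            s->last_sent = now;
            return ACT_BROADCAST;
        }
        if (now - s->last_sent >= PING_INTERVAL) {
            s->last_sent = now;
            return ACT_PING_SERVER;     /* our server and no one else */
        }
        return ACT_WAIT;
    }
    if (now - s->last_sent >= BROADCAST_INTERVAL) {
        s->last_sent = now;
        return ACT_BROADCAST;
    }
    return ACT_WAIT;
}
```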

Other fixes/improvements:

- ypbind now logs 'server not responding' and 'server OK' messages where
  appropriate.

Thanks to Thomas Graichen <graichen@omega.physik.fu-berlin.de> for
reporting the problem and guilt-tripping me into fixing it. :)
1995-02-16 01:21:44 +00:00
dg
8210ef4a47 Don't return the address of a stack variable. 1994-09-23 10:25:38 +00:00
wollman
0d5d309384 Copying YP programs over from 1.1.5, with a slightly different directory
structure than before.
1994-08-08 01:03:58 +00:00