would cause syslogd to eventually kill innocent processes in the
system over time (note: not `could' but `would'). Many thanks to my
colleague Mirko for digging into the kernel structures and providing
me with the debugging framework to find out about the nature of this
bug (and to isolate that syslogd was the culprit) in a rather large
set of distributed machines at client sites where this happened
occasionally.
Whenever a child process is no longer responsive, or when syslogd
receives a SIGHUP and closes all its logging file descriptors, then
for any descriptor that refers to a pipe, syslogd enters the data
about the old logging child process into a `dead queue'. The entry is
removed from there (and the status of the dead kitten fetched) upon
receipt of a SIGCHLD. However, there's a high probability that the
SIGCHLD arrives before the child's data are actually entered into the
dead queue inside the SIGHUP handler, so the SIGCHLD handler has
nothing to fetch and remove and simply continues. Whenever this
happens, the process's data remain on the dead queue forever, and
since domark() tries to get rid of totally unresponsive children by
first sending a SIGTERM and later a SIGKILL, it was only a matter of
time until the system had recycled enough PIDs that an innocent
process got shot to death.
Fix the race by mutually masking SIGHUP and SIGCHLD inside both
handlers. Add additional bandaids ``just in case'', i.e. don't enter
a process into the dead queue if we can't signal it (this should only
happen if it is already dead by that time, so we can fetch the status
immediately instead of deferring this to the SIGCHLD handler); for the
kill(2) inside domark(), check for an error status (/* Can't happen */
:) and remove the process from the dead queue in this case (had this
check been there in the first place, it would have reduced the problem
to a statistically minimal likelihood, so I certainly would never have
noticed the bug at all :).
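
The two bandaids described above can be sketched roughly as follows.
This is an illustration only, not the committed syslogd code: the
struct layout and the names deadq_enter() and deadq_signal_all() are
assumptions made for this example, and the real domark() logic
(SIGTERM first, SIGKILL later) is reduced to a single signal argument.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical dead-queue entry; syslogd's real structure differs. */
struct deadq_entry {
	pid_t			 pid;
	struct deadq_entry	*next;
};

static struct deadq_entry *deadq_head;

/*
 * Enter pid into the dead queue only if it can still be signalled.
 * Returns 1 if the pid was queued, 0 if it was not.
 */
static int
deadq_enter(pid_t pid)
{
	struct deadq_entry *e;
	int status;

	assert(pid > 0);	/* pid <= 0 would address a process group */
	if (kill(pid, 0) != 0) {
		/* Probably already dead: try to fetch the status now. */
		(void)waitpid(pid, &status, WNOHANG);
		return (0);
	}
	if ((e = malloc(sizeof(*e))) == NULL)
		return (0);
	e->pid = pid;
	e->next = deadq_head;
	deadq_head = e;
	return (1);
}

/*
 * Send sig to every queued process.  If kill(2) fails, the pid no
 * longer belongs to us, so drop the entry instead of retrying later.
 * Returns the number of entries dropped.
 */
static int
deadq_signal_all(int sig)
{
	struct deadq_entry **pp = &deadq_head, *e;
	int dropped = 0;

	while ((e = *pp) != NULL) {
		if (kill(e->pid, sig) != 0) {
			*pp = e->next;
			free(e);
			dropped++;
		} else
			pp = &e->next;
	}
	return (dropped);
}
```

Dropping the entry on a failed kill(2) means a stale pid can never be
signalled a second time, which is what keeps a recycled PID from being
hit later.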
Mirko also reviewed the fix in principle (mutual blocking of both
signals inside the handlers), but not the actual code.
Reviewed by: Mirko Kaffka <mirko@interface-business.de>
Approved by: jkh
straight into debug mode if you boot -v. Also conditionalize some
annoying debugging output now that we have this ability.
Partially submitted by: msmith
Approved by: jkh [to make certain wise-acres happy ;)]
- Open the socket() first, then setuid() to the actual user.
- Allow the ping6 preload option only for root.
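
The ordering in the first item can be sketched as below. The function
name open_then_drop() is hypothetical and this is not the ping6 source;
it only shows the general pattern of acquiring the privileged resource
first and then giving up privileges for good.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create the socket while any setuid privilege is still in effect,
 * then permanently switch to the real (invoking) user.  Returns the
 * socket descriptor, or -1 if the socket could not be created or the
 * privileges could not be dropped.
 */
static int
open_then_drop(int domain, int type, int protocol)
{
	int s;

	s = socket(domain, type, protocol);

	/* Drop privileges whether or not socket() succeeded. */
	if (setuid(getuid()) != 0) {
		if (s >= 0)
			close(s);
		return (-1);
	}
	return (s);
}
```

A setuid-root program that needs a raw socket would make this call
before parsing any user-supplied options, so that everything after it
runs with the invoking user's credentials.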
Approved by: jkh
Submitted by: Neil Blakey-Milner <nbm@mithrandr.moria.org>
BSD-style license, as an add-on to phk's beerware license. Please fedex
some beer to phk.
- Add a ``make depend'' line to the jail-building, which fixes openssl,
among other things. Suggested by: kris
- Add ``newaliases'' to the list of things to do when setting up a new
jail, so that the jailed sendmail doesn't complain.
- Correct references to ``kern.jail.set_hostname_allowed'' so that they
now read ``jail.set_hostname_allowed''.
- Add a reference to sysctl.conf where the sysctl can easily be set in
a persistent way.
- Add a list of cross references to the man page.
- Fix a formatting nit or two.
Sorry for the flapping, but no further changes will be made for 4.0.
The official standard will be published around April or later.
If a different format is adopted at that time, support for the new
format will be added to a subsequent FreeBSD 4.x release.
Approved by: jkh
instructions so as to reduce warnings during jail startup, etc.
Add a somewhat bolder warning recommending the use of
kern.jail.set_hostname to limit jail renaming.
a distribution, recognize it and treat it as a fatal media error. This
happens in the case of a timeout on FTP installations where the
user chooses not to select another FTP site, and previously resulted
in a segmentation fault.
Approved by: jkh