Commit Graph

6991 Commits

Author SHA1 Message Date
Don Lewis
6567eef757 A Linux thread created using clone() should not send SIGCHLD to its
parent if no signal is specified in the clone() flags argument.

PR:		42457
MFC after:	2 weeks
2004-02-19 06:43:48 +00:00
Nate Lawson
32869e71fb Add support for 'h' and 'hh' modifiers for printf(9).
Submitted by:	Bruno Ducrot <ducrot AT poupinou.org>
Reviewed by:	bde
2004-02-19 05:29:39 +00:00
Colin Percival
3a1bdbf8d1 Don't ignore errors from vfs_allocate_syncvnode.
PR:		kern/18503
Submitted by:	Anatoly Vorobey <mellon@pobox.com>
Approved by:	rwatson (mentor)
2004-02-18 05:20:54 +00:00
Peter Wemm
df7c361e64 Checkpoint a hack to enable running i386 libc_r binaries on a 64 bit
kernel.  I'm not happy with it yet - refinements are to come.
This hack allows the kern.ps_strings and kern.usrstack sysctls to respond
to a 32 bit request, such as those coming from emulated i386 binaries.
2004-02-18 00:54:17 +00:00
David Malone
a1cc6206fb Correct a comment.
Reviewed by:	alfred, tanimura
2004-02-17 12:30:32 +00:00
Dag-Erling Smørgrav
963385cf22 Mechanical whistespace cleanup. 2004-02-17 10:21:03 +00:00
Dag-Erling Smørgrav
44f4b94b38 Don't bother storing a result when all you need are the side effects. 2004-02-16 18:38:46 +00:00
David Malone
a82294d01c In fdcheckstd the descriptor table should never be shared, so just
KASSERT this rather than trying to deal with what happens when file
descriptors change out from under us.
2004-02-15 21:14:48 +00:00
Bruce Evans
72632ef235 Fixed style bugs near previous commit (mainly formatting errors and
missing parentheses).  Use default handling (trap to debugger) for
udev2dev(x, 1) since it is an error and doesn't happen anywhere in
the sys tree except in bogusly commented out code in coda.
2004-02-15 20:14:47 +00:00
Colin Percival
a20e9655b9 Remove opv_desc_vector from vfs_add_vnodeops, since it is defined
and given a value, but never used.  This has no effect on the
resulting binaries, since gcc optimizes the variable away anyway.

PR:		kern/62684
Approved by:	rwatson (mentor)
2004-02-15 17:27:33 +00:00
Poul-Henning Kamp
2a3faf2fbd Split the initialization of the cdevsw into a separate function. 2004-02-15 10:35:33 +00:00
Robert Watson
402d7aa884 Remove excess brackets. 2004-02-15 00:43:22 +00:00
Poul-Henning Kamp
d60d18d491 Use standard style for cdevsw initialization. 2004-02-14 20:03:36 +00:00
Robert Watson
679a106075 By default, don't allow processes in a jail to list the set of
jails in the system.  Previous behavior (allowed) may be restored
by setting security.jail.list_allowed=1.
2004-02-14 19:19:47 +00:00
Robert Watson
7e440242e5 Fix mismerge in last commit: check that cred->cr_prison is NULL
before dereferencing the prison pointer.
2004-02-14 18:52:43 +00:00
Robert Watson
f08df373a3 By default, when a process in jail calls getfsstat(), only return the
data for the file system on which the jail's root vnode is located.
Previous behavior (show data for all mountpoints) can be restored
by setting security.jail.getfsstatroot_only to 0.  Note: this also
has the effect of hiding other mounts inside a jail, such as /dev,
/tmp, and /proc, but errs on the side of leaking less information.
2004-02-14 18:31:11 +00:00
Poul-Henning Kamp
d08c5d0b9b Remove the check which used to protect us against make_dev() being
called until DEVFS had a chance to initialize.  Since DEVFS is mandatory
and things over in that department coincidentally works from without
any initialization now, this is safe.
2004-02-14 17:19:43 +00:00
Brian Feldman
a0ed09c0af T -CURRENT DO NOT CRASH UPON ^T K PLZ THX.
Also, use sched_pctcpu() instead of assuming td->td_kse is non-NULL.
2004-02-14 01:30:06 +00:00
Brian Feldman
f662a93197 Always socantsendmore() before deallocating a socket. This, in turn,
calls selwakeup() if necessary (which it is, if you don't want freed
memory hanging around on your td->td_selq).

Props to:	alfred
2004-02-12 01:48:40 +00:00
Don Lewis
55b5f2a202 When reparenting a process to init, make sure that p_sigparent is
set to SIGCHLD.  This avoids the creation of orphaned Linux-threaded
zombies that init is unable to reap.  This can occur when the parent
process sets its SIGCHLD to SIG_IGN.  Fix a similar situation in the
PT_DETACH code.

Tested by:	"Steven Hartland" <killing AT multiplay.co.uk>
2004-02-11 22:06:02 +00:00
John Baldwin
e7a44cace2 Argh! Fix a bogon. lim_cur() was returning the hard (max) limit rather
than the soft (cur) limit.

Submitted by:	bde
2004-02-11 18:04:13 +00:00
Mike Silbersack
b49d824e8b Add the SF_NODISKIO flag to sendfile. This flag causes sendfile to be
mindful of blocking on disk I/O and instead return EBUSY when such
blocking would occur.

Results from the DeBox project indicate that blocking on disk I/O
can slow the performance of a kqueue/poll based webserver.  Using
a flag such as SF_NODISKIO and throwing connections that would block
to helper processes/threads helped increase performance.

Currently, only the Flash webserver uses this flag, although it could
probably be applied to thttpd with relative ease.

Idea by:	Yaoping Ruan & Vivek Pai
2004-02-08 07:35:48 +00:00
Alan Cox
c5aebf380c swp_pager_async_iodone() no longer requires Giant. Modify bufdone()
and swapgeom_done() to perform swp_pager_async_iodone() without Giant.

Reviewed by:	tegge
2004-02-07 08:54:50 +00:00
John Baldwin
a875f38546 - Convert the plimit lock to a pool mutex lock.
- Hide struct plimit from userland.

Submitted by:	bde (2)
2004-02-06 19:35:14 +00:00
John Baldwin
f4daf05619 - Correct the translation of old rlimit values to properly handle the old
RLIM_INFINITY case for ogetrlimit().
- Use %jd and intmax_t to output negative time in usec in calcru().
- Rework getrusage() to make a copy of the rusage struct into a local
  variable while holding Giant and then do the copyout from the local
  variable to avoid having to have the original process rusage struct
  locked while doing the copyout (which would not be safe).  This also
  includes a few style fixes from Bruce to getrusage().

Submitted by:	bde (1, parts of 3)
Suggested by:	bde (2)
2004-02-06 19:30:12 +00:00
John Baldwin
99b6e02ba6 A few more style fixes from Bruce including a few I missed last time.
Submitted by:	bde
2004-02-06 19:25:34 +00:00
John Baldwin
4c3558aa82 Always set a process' state to normal when it is fully constructed in
fork1() rather than only doing it for the RFSTOPPED case and then having
to fix it up in other places later on.
2004-02-05 21:01:37 +00:00
John Baldwin
b4323d7729 - A lot of style and whitespace fixes.
- Update a few comments regarding locking notes.

Submitted by:	bde (1, mostly)
2004-02-05 20:53:25 +00:00
Jacques Vidrine
b00a3c85da Correct a reference counting bug in shmat(2). If vm_map_find(9)
failed, the reference count for the virtual memory object referenced
by the specified shared memory segment would have been erroneously
incremented.

Reported by:	Joost Pol <joost@pine.nl>
2004-02-05 18:00:35 +00:00
Alexander Kabaev
dec8868dcc Rename cn_unavailable to cnunavailable for little more consistency.
Garbage collect unused cndebug() function.

Suggested by:	bde
2004-02-05 17:35:28 +00:00
Mike Silbersack
b711d74eaf Style fixes: don't indent variable names.
Submitted by:	bde
2004-02-05 08:29:27 +00:00
Alexander Kabaev
e99c09e2dc Eliminate global cons_unavailable flag and replace it by the status
bit maintained on a per-device basis. Single variable is inadequate
on machines running with multiple consoles enabled.
2004-02-05 01:56:43 +00:00
John Baldwin
91d5354a2c Locking for the per-process resource limits structure.
- struct plimit includes a mutex to protect a reference count.  The plimit
  structure is treated similarly to struct ucred in that is is always copy
  on write, so having a reference to a structure is sufficient to read from
  it without needing a further lock.
- The proc lock protects the p_limit pointer and must be held while reading
  limits from a process to keep the limit structure from changing out from
  under you while reading from it.
- Various global limits that are ints are not protected by a lock since
  int writes are atomic on all the archs we support and thus a lock
  wouldn't buy us anything.
- All accesses to individual resource limits from a process are abstracted
  behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return
  either an rlimit, or the current or max individual limit of the specified
  resource from a process.
- dosetrlimit() was renamed to kern_setrlimit() to match existing style of
  other similar syscall helper functions.
- The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit()
  (it didn't used the stackgap when it should have) but uses lim_rlimit()
  and kern_setrlimit() instead.
- The svr4 compat no longer uses the stackgap for resource limits calls,
  but uses lim_rlimit() and kern_setrlimit() instead.
- The ibcs2 compat no longer uses the stackgap for resource limits.  It
  also no longer uses the stackgap for accessing sysctl's for the
  ibcs2_sysconf() syscall but uses kernel_sysctl() instead.  As a result,
  ibcs2_sysconf() no longer needs Giant.
- The p_rlimit macro no longer exists.

Submitted by:	mtm (mostly, I only did a few cleanups and catchups)
Tested on:	i386
Compiled on:	alpha, amd64
2004-02-04 21:52:57 +00:00
Mike Silbersack
ff5e43a3fd Rename iov_to_uio to uiofromiov to be more consistent with other
uio* functions.

Suggested by:	bde
2004-02-04 08:43:21 +00:00
Pawel Jakub Dawidek
19b0efd32d Allow assert that the current thread does not hold the sx(9) lock.
Reviewed by:		jhb
In cooperation with:	juli, jhb
Approved by:		jhb, scottl (mentor)
2004-02-04 08:14:58 +00:00
Mike Silbersack
2ccbe4b596 Style fixes
Submitted by:	bde
2004-02-04 08:14:47 +00:00
Robert Watson
5e312ddcc6 A variety of further cleanups to ttyinfo():
- Rename temporary variable names ("tmp", "tmp2") to more informative
  names ("load", "pctcpu", "rss", ...)

- Unclutter indentation and return paths: rather than lots of nested
  ifs, simply return earlier if it's not going to work out.  Simplify
  general structure and avoid "deep" code.

- Comment on the thread/process selection and locking.

- Correct handling of "running"/"runnable" states, avoid "unknown"
  that people were seeing for running processes.  This was due to
  a misunderstanding of the more complex state machine / inhibitors
  behavior of KSE.

- Do perform ttyinfo() printing on KSE (P_SA) processes, it seems
  generally to work.

While I initially attempted to formulate this as two commits (one
layout, the other content), I concluded that the layout changes were
really structural changes.

Many elements submitted by:  bde
2004-02-04 05:46:05 +00:00
John Baldwin
3e9ac3ebf2 Remove a bogus assertion.
Noticed by:	bde
Pointy hat to:	jhb
2004-02-03 15:14:27 +00:00
Daniel Eischen
b5426f096b Regen after adding ksem_timedwait(). 2004-02-03 05:11:31 +00:00
Daniel Eischen
aae94fbbb6 Add ksem_timedwait() to complement ksem_wait().
Glanced at by:	alfred
2004-02-03 05:08:32 +00:00
Robert Watson
4f638130c3 Don't dec/inc the amountpipes counter every time we resize a pipe --
instead, just dec/inc in the ctor/dtor.  For now, increment/decrement
in two's, since we're now performing the operation once per pair,
not once per pipe.  Not really any measurable performance change
in my micro-benchmarks, but doing less work is good, especially when
it comes to atomic operations.

Suggested by:	alc
2004-02-03 04:55:24 +00:00
Robert Watson
9a830ddc54 Catch instances of (pipe == NULL) that were obsoleted with recent
changes to jointly allocated pipe pairs.  Replace these checks
with pipe_present checks.  This avoids a NULL pointer dereference
when a pipe is half-closed.

Submitted by:	Peter Edwards <peter.edwards@openet-telecom.com>
2004-02-03 02:50:51 +00:00
John Baldwin
9c9c52a3ed - Assert that witness_cold is not true in enroll().
- Only check witness_watch once in enroll().

Reported by:	ru (2)
2004-02-02 22:15:17 +00:00
Pawel Jakub Dawidek
3410b19324 Fix many issues related to mount/unmount:
1. Root from inside a jail was able to unmount any file system
   (except /).
2. Unprivileged root was able to unmount file systems mounted by
   privileged root (execpt /).
3. User from inside a jail was able to mount file system when
   sysctl vfs.usermount was set to 1.
4. User was able to mount file system when vfs.usermount was set to 1
   (that's ok) and unmount it even if vfs.usermount was equal to 0
   (that's not correct).

Possibility from point 1 was reported by: Dariusz Kowalski <darek@76.pl>

Only a part of this fix will be MFC'ed (if approved).

PR:		kern/60149
Reviewed by:	rwatson
Approved by:	scottl (mentor)
MFC after:	3 days
2004-02-02 19:02:05 +00:00
Mike Silbersack
02ec600572 Remove debugging code that slipped into the previous commit.
Spotted by:	bde
2004-02-02 09:09:59 +00:00
Jeff Roberson
b209e5e3e4 - style fixes to the critical_exit() KASSERT().
Submitted by:	bde
2004-02-02 08:13:27 +00:00
Jeff Roberson
0392e39dff - Allow interactive tasks to use the maximum time-slice. This is not as
detrimental as I thought it would be in the case of massive process
   storms from a shell and it makes regular desktop usage noticeably
   better.
2004-02-01 10:38:13 +00:00
Mike Silbersack
beb699c7ba Rewrite sendfile's header support so that headers are now sent in the first
packet along with data, instead of in their own packet.  When serving files
of size (packetsize - headersize) or smaller, this will result in one less
packet crossing the network.  Quick testing with thttpd and http_load has
shown a noticeable performance improvement in this case (350 vs 330 fetches
per second.)

Included in this commit are two support routines, iov_to_uio, and m_uiotombuf;
these routines are used by sendfile to construct the header mbuf chain that
will be linked to the rest of the data in the socket buffer.
2004-02-01 07:56:44 +00:00
Jeff Roberson
f2f51f8ab8 - Disable ithread binding in all cases for now. This doesn't make as much
sense with sched_4bsd as it does with sched_ule.
 - Use P_NOLOAD instead of the absence of td->td_ithd to determine whether or
   not a thread should be accounted for in sched_tdcnt.
2004-02-01 06:20:18 +00:00
Robert Watson
4795b82c13 Coalesce pipe allocations and frees. Previously, the pipe code
would allocate two 'struct pipe's from the pipe zone, and malloc a
mutex.

- Create a new "struct pipepair" object holding the two 'struct
  pipe' instances, struct mutex, and struct label reference.  Pipe
  structures now have a back-pointer to the pipe pair, and a
  'pipe_present' flag to indicate whether the half has been
  closed.

- Perform mutex init/destroy in zone init/destroy, avoiding
  reallocating the mutex for each pipe.  Perform most pipe structure
  setup in zone constructor.

- VM memory mappings for pageable buffers are still done outside of
  the UMA zone.

- Change MAC API to speak 'struct pipepair' instead of 'struct pipe',
  update many policies.  MAC labels are also handled outside of the
  UMA zone for now.  Label-only policy modules don't have to be
  recompiled, but if a module is recompiled, its pipe entry points
  will need to be updated.  If a module actually reached into the
  pipe structures (unlikely), that would also need to be modified.

These changes substantially simplify failure handling in the pipe
code as there are many fewer possible failure modes.

On half-close, pipes no longer free the 'struct pipe' for the closed
half until a full-close takes place.  However, VM mapped buffers
are still released on half-close.

Some code refactoring is now possible to clean up some of the back
references, etc; this patch attempts not to change the structure
of most of the pipe implementation, only allocation/free code
paths, so as to avoid introducing bugs (hopefully).

This cuts about 8%-9% off the cost of sequential pipe allocation
and free in system call tests on UP and SMP in my micro-benchmarks.
May or may not make a difference in macro-benchmarks, but doing
less work is good.

Reviewed by:	juli, tjr
Testing help:	dwhite, fenestro, scottl, et al
2004-02-01 05:56:51 +00:00