Commit Graph

83 Commits

Author SHA1 Message Date
jhb
86661e93ae Catch up to moving headers:
- machine/ipl.h -> sys/ipl.h
- machine/mutex.h -> sys/mutex.h
2000-10-20 07:58:15 +00:00
truckman
a8ef22bd44 Enforce process limit policy in one place to keep proccnt from diverging
from reality.
2000-09-14 23:07:39 +00:00
jasone
daa58ba7a4 Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*().  See mutex(9).  (Note: The
  alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
  preempted (i386 only).

Partially contributed by:	BSDi (BSD/OS)
Submissions by (at least):	cp, dfr, dillon, grog, jake, jhb, sheldonh
2000-09-07 01:33:02 +00:00
truckman
a233f9bd24 Remove uidinfo hash table lookup and maintenance out of chgproccnt() and
chgsbsize(), which are called rather frequently and may be called from an
interrupt context in the case of chgsbsize().  Instead, do the hash table
lookup and maintenance when credentials are changed, which is a lot less
frequent.  Add pointers to the uidinfo structures to the ucred and pcred
structures for fast access.  Pass a pointer to the credential to chgproccnt()
and chgsbsize() instead of passing the uid.  Add a reference count to the
uidinfo structure and use it to decide when to free the structure rather
than freeing the structure when the resource consumption drops to zero.
Move the resource tracking code from kern_proc.c to kern_resource.c.  Move
some duplicate code sequences in kern_prot.c to separate helper functions.
Change KASSERTs in this code to unconditional tests and calls to panic().
2000-09-05 22:11:13 +00:00
phk
29e5a8dc50 Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by:	bde
2000-07-04 11:25:35 +00:00
phk
b09ca1a9bb Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

        -sysctl_vm_zone SYSCTL_HANDLER_ARGS
        +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
2000-07-03 09:35:31 +00:00
nbm
2ca50fa2d0 Add sysctl descriptions to a few sysctls. Simply "documentation".
PR:		kern/8015
Submitted by:	Stefan Eggers <seggers@semyam.dinoco.de>
2000-06-26 13:52:31 +00:00
alfred
6bd686ba79 fix races in the uidinfo subsystem, several problems existed:
1) while allocating a uidinfo struct malloc is called with M_WAITOK,
   it's possible that while asleep another process by the same user
   could have woken up earlier and inserted an entry into the uid
   hash table.  Having redundant entries causes inconsistancies that
   we can't handle.

   fix: do a non-waiting malloc, and if that fails then do a blocking
   malloc, after waking up check that no one else has inserted an entry
   for us already.

2) Because many checks for sbsize were done as "test then set" in a non
   atomic manner it was possible to exceed the limits put up via races.

   fix: instead of querying the count then setting, we just attempt to
   set the count and leave it up to the function to return success or
   failure.

3) The uidinfo code was inlining and repeating, lookups and insertions
   and deletions needed to be in their own functions for clarity.

Reviewed by: green
2000-06-22 22:27:16 +00:00
jake
5e208b0c18 Back out the previous change to the queue(3) interface.
It was not discussed and should probably not happen.

Requested by:		msmith and others
2000-05-26 02:09:24 +00:00
jake
1d685644e0 Change the way that the queue(3) structures are declared; don't assume that
the type argument to *_HEAD and *_ENTRY is a struct.

Suggested by:	phk
Reviewed by:	phk
Approved by:	mdodd
2000-05-23 20:41:01 +00:00
jlemon
6edfb1e197 Introduce kqueue() and kevent(), a kernel event notification facility. 2000-04-16 18:53:38 +00:00
peter
0543a2a0cc Put on my asbestos underwear and commit the patch that I posted to -arch
some time ago that changes kern.randompid from a boolean to a randomness
range for the next pid assigment.  Too high causes a lot of extra work
to scan for free pids, and too low merely wastes randomness entropy.  It's
still possible to select a completely random range by using PID_MAX (100k)
or -1 as a shortcut to mean "the whole range".
Also, don't waste randomness when doing a wraparound.
1999-12-06 11:13:50 +00:00
luoqi
e1482a2b30 User ldt sharing. 1999-12-06 04:53:08 +00:00
dan
25892aea8e Introduce OpenBSD-like Random PIDs. Controlled by a sysctl knob
(kern.randompid), which is currently defaulted off.  Use ARC4 (RC4) for our
random number generation, which will not get me executed for violating
crypto laws; a Good Thing(tm).

Reviewed and Approved by: bde, imp
1999-11-28 17:51:09 +00:00
phk
0b3822b367 The at_exit and at_fork functions currently use a 'roll your own'
linked list to store the callbak routines.  The patch converts the
lists to queue(3) TAILQs, making the code slightly clearer and ensuring
that callbacks are executed in FIFO order.

Man page also updated as necesary.

(discontinued use of M_TEMP malloc type while here anyway /phk)

Submitted by:   Jake Burkholder jake@checker.org
PR:             14912
1999-11-19 21:29:03 +00:00
phk
9e8e07135f Introduce commandline caching in the kernel.
This fixes some nasty procfs problems for SMP, makes ps(1) run much faster,
and makes ps(1) even less dependent on /proc which will aid chroot and
jails alike.

To disable this facility and revert to previous behaviour:
        sysctl -w kern.ps_arg_cache_limit=0

For full details see the current@FreeBSD.org mail-archives.
1999-11-16 20:31:58 +00:00
phk
bed630ce3c This is a partial commit of the patch from PR 14914:
Alot of the code in sys/kern directly accesses the *Q_HEAD and *Q_ENTRY
   structures for list operations.  This patch makes all list operations
   in sys/kern use the queue(3) macros, rather than directly accessing the
   *Q_{HEAD,ENTRY} structures.

This batch of changes compile to the same object files.

Reviewed by:    phk
Submitted by:   Jake Burkholder <jake@checker.org>
PR:     14914
1999-11-16 10:56:05 +00:00
peter
5f2dea3689 Trim unused options (or #ifdef for undoc options).
Submitted by:	phk
1999-10-11 15:19:12 +00:00
peter
e4b04a2b21 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
alc
fbaba2d628 Fix the following problem:
When creating new processes (or performing exec), the new page
directory is initialized too early.  The kernel might grow before
p_vmspace is initialized for the new process.  Since pmap_growkernel
doesn't yet know about the new page directory, it isn't updated, and
subsequent use causes a failure.

The fix is (1) to clear p_vmspace early, to stop pmap_growkernel
from stomping on memory, and (2) to defer part of the initialization
of new page directories until p_vmspace is initialized.

PR:		kern/12378
Submitted by:	tegge
Reviewed by:	dfr
1999-07-21 18:02:27 +00:00
peter
94cfddacf9 Stop rfork(0) from panicing. (oops!!)
Submitted by:	Peter Holm <peter@holm.cc>
1999-07-03 20:58:44 +00:00
peter
97e2382795 Slight tweak to fork1() calling conventions. Add a third argument so
the caller can easily find the child proc struct.  fork(), rfork() etc
syscalls set p->p_retval[] themselves.  Simplify the SYSINIT_KT() code
and other kernel thread creators to not need to use pfind() to find the
child based on the pid.  While here, partly tidy up some of the fork1()
code for RF_SIGSHARE etc.
1999-06-30 15:33:41 +00:00
phk
79134080a0 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no privisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
luoqi
81e3a332a9 Enable vmspace sharing on SMP. Major changes are,
- %fs register is added to trapframe and saved/restored upon kernel entry/exit.
- Per-cpu pages are no longer mapped at the same virtual address.
- Each cpu now has a separate gdt selector table. A new segment selector
  is added to point to per-cpu pages, per-cpu global variables are now
  accessed through this new selector (%fs). The selectors in gdt table are
  rearranged for cache line optimization.
- fask_vfork is now on as default for both UP and SMP.
- Some aio code cleanup.

Reviewed by:	Alan Cox	<alc@cs.rice.edu>
		John Dyson	<dyson@iquest.net>
		Julian Elischer	<julian@whistel.com>
		Bruce Evans	<bde@zeta.org.au>
		David Greenman	<dg@root.com>
1999-04-28 01:04:33 +00:00
dt
b57fa594b9 Use pointer arithmetic to do pointer arithmetic. 1999-04-24 11:25:01 +00:00
peter
44d31e635d Well folks, this is it - The second stage of the removal for build support
for LKM's..
1999-04-17 08:36:07 +00:00
peter
0d41c1d11f Use the reference-counted PHOLD()/PRELE() rather than P_NOSWAP. 1999-04-06 03:03:34 +00:00
julian
a93530134e Fix thread/process tracking and differentiation for Linux threads emulation.
Submitted by:	Richard Seaman, Jr." <dick@tar.com>

Also clean some compiler warnings in surrounding code.
1999-03-02 00:28:09 +00:00
julian
77672a61df Enable Linux threads support by default.
This takes the conditionals out of the code that has been tested by
various people for a while.
ps and friends (libkvm) will need a recompile as some proc structure
changes are made.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-26 02:38:12 +00:00
julian
b8d69589f3 Changes to the LINUX_THREADS support to only allocate extra memory for
shared signal handling when there is shared signal handling being
used.

This removes the main objection to making the shared signal handling
a standard ability in rfork() and friends and 'unconditionalising'
this code. (i.e. the allocation of an extra 328 bytes per process).

Signal handling information remains in the U area until such a time as
it's reference count would be incremented to > 1. At that point a new
struct is malloc'd and maintained in KVM so that it can be shared between
the processes (threads) using it.

A function to check the reference count and move the struct back to the U
area when it drops back to 1 is also supplied. Signal information is
therefore now swapable for all processes that are not sharing that
information with other processes. THis should addres the concerns raised
by Garrett and others.

Submitted by:	"Richard Seaman, Jr." <dick@tar.com>
1999-01-07 21:23:50 +00:00
julian
f271b9be8e Reviewed by: Luoqi Chen, Jordan Hubbard
Submitted by:	 "Richard Seaman, Jr." <lists@tar.com>
Obtained from:	linux :-)

Code to allow Linux Threads to run under FreeBSD.

By default not enabled
This code is dependent on the conditional
COMPAT_LINUX_THREADS (suggested by Garret)
This is not yet a 'real' option but will be within some number of hours.
1998-12-19 02:55:34 +00:00
truckman
24fe22505c If the session leader dies, s_leader is set to NULL and getsid() may
dereference a NULL pointer, causing a panic.  Instead of following
s_leader to find the session id, store it in the session structure.

Jukka found the following info:

	BTW - I just found what I have been looking for. Std 1003.1
	Part 1: SYSTEM API [C LANGUAGE] section 2.2.2.80 states quite
	explicitly...

	Session lifetime: The period between when a session is created
	and the end of lifetime of all the process groups that remain
	as members of the session.

	So, this quite clearly tells that while there is any single
	process in any process group which is a member of the session,
	the session remains as an independent entity.

Reviewed by:	peter
Submitted by:	"Jukka A. Ukkonen" <jau@jau.tmt.tele.fi>
1998-11-09 15:08:04 +00:00
dyson
87f527a1ea VM level code cleanups.
1)	Start using TSM.
	Struct procs continue to point to upages structure, after being freed.
	Struct vmspace continues to point to pte object and kva space for kstack.
	u_map is now superfluous.
2)	vm_map's don't need to be reference counted.  They always exist either
	in the kernel or in a vmspace.  The vmspaces are managed by reference
	counts.
3)	Remove the "wired" vm_map nonsense.
4)	No need to keep a cache of kernel stack kva's.
5)	Get rid of strange looking ++var, and change to var++.
6)	Change more data structures to use our "zone" allocator.  Added
	struct proc, struct vmspace and struct vnode.  This saves a significant
	amount of kva space and physical memory.  Additionally, this enables
	TSM for the zone managed memory.
7)	Keep ioopt disabled for now.
8)	Remove the now bogus "single use" map concept.
9)	Use generation counts or id's for data structures residing in TSM, where
	it allows us to avoid unneeded restart overhead during traversals, where
	blocking might occur.
10)	Account better for memory deficits, so the pageout daemon will be able
	to make enough memory available (experimental.)
11)	Fix some vnode locking problems. (From Tor, I think.)
12)	Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp.
	(experimental.)
13)	Significantly shrink, cleanup, and make slightly faster the vm_fault.c
	code.  Use generation counts, get rid of unneded collpase operations,
	and clean up the cluster code.
14)	Make vm_zone more suitable for TSM.

This commit is partially as a result of discussions and contributions from
other people, including DG, Tor Egge, PHK, and probably others that I
have forgotten to attribute (so let me know, if I forgot.)

This is not the infamous, final cleanup of the vnode stuff, but a necessary
step.  Vnode mgmt should be correct, but things might still change, and
there is still some missing stuff (like ioopt, and physical backing of
non-merged cache files, debugging of layering concepts.)
1998-01-22 17:30:44 +00:00
dyson
a68f4a7087 We have had support for running the kernel daemons as threads for
quite a while, but forgot to do so.  For now, this code supports
most daemons  running as kernel threads in UP kernels, and as
full processes in SMP.  We will soon be able to run them as
threads in SMP, but not yet.
1997-12-12 04:00:59 +00:00
bde
345c971abd Removed unused includes.
Staticized.

Avoid passing a `retval' to fork1().

Fixed some style bugs.
1997-11-20 16:36:17 +00:00
phk
5f349f6779 Move the "retval" (3rd) parameter from all syscall functions and put
it in struct proc instead.

This fixes a boatload of compiler warning, and removes a lot of cruft
from the sources.

I have not removed the /*ARGSUSED*/, they will require some looking at.

libkvm, ps and other userland struct proc frobbing programs will need
recompiled.
1997-11-06 19:29:57 +00:00
bde
29f6fdf205 Fixed some gratuitous ANSIisms. 1997-08-26 00:15:04 +00:00
peter
964cbb4f27 Print a warning if an unsupported (under SMP) shared address space fork
is attempted rather than just failing with an errno.
1997-08-22 15:10:00 +00:00
dyson
abc6af3236 This is an upgrade so that the kernel supports the AIO calls from
POSIX.4.  Additionally, there is some initial code that supports LIO.
This code supports AIO/LIO for all types of file descriptors, with
few if any restrictions.  There will be a followup very soon that
will support significantly more efficient operation for VCHR type
files (raw.)  This code is also dependent on some kernel features
that don't work under SMP yet.  After I commit the changes to the
kernel to support proper address space sharing on SMP, this code
will also work under SMP.
1997-07-06 02:40:43 +00:00
peter
86ea24b760 Preliminary support for per-cpu data pages.
This eliminates a lot of #ifdef SMP type code.  Things like _curproc reside
in a data page that is unique on each cpu, eliminating the expensive macros
like:    #define curproc (SMPcurproc[cpunumber()])

There are some unresolved bootstrap and address space sharing issues at
present, but Steve is waiting on this for other work.  There is still some
strictly temporary code present that isn't exactly pretty.

This is part of a larger change that has run into some bumps, this part is
standalone so it should be safe.  The temporary code goes away when the
full idle cpu support is finished.

Reviewed by: fsmp, dyson
1997-06-22 16:04:22 +00:00
dyson
8cbe8ae56c Modifications to existing files to support the initial AIO/LIO and
kernel based threading support.
1997-06-16 00:29:36 +00:00
peter
97b25362fe Don't need "opt_smp.h" on these files 1997-05-29 04:52:04 +00:00
peter
2cd137c24a Create sysctl kern.fast_vfork, on for uniprocessor by default, off for
SMP.
1997-04-26 15:59:50 +00:00
peter
9721bff580 Disable RFMEM in vfork for smp case.. It doesn't seem to work too well
yet..
1997-04-26 14:31:36 +00:00
ache
f394565e22 Restore memory space separation (RFMEM) for vfork() after
shell imgact memory clobbering fixed
1997-04-23 22:13:18 +00:00
dyson
45be92f4ae Give up on the fast vfork() for a while. 1997-04-23 01:59:14 +00:00
dyson
f861494908 Re-institute the efficent version of vfork. It appears to make a
difference of approx 3mins in make world on my P6!!!  This means
that vfork now has full address space sharing, so beware with
sloppy vfork programming.  Also, you really do need to apply
the previously committed popen fix in libc.
1997-04-20 16:57:12 +00:00
dyson
28135ae9b2 Make a problem that I cannot reproduce go away for now. This commit
is to decrease the inconvienience of other developers until I can
really fix the code.
Reviewed by:	Donald J. Maddox <dmaddox@scsn.net>
1997-04-14 01:28:58 +00:00
dyson
f540a9f271 Fully implement vfork. Vfork is now much much faster than even our
fork. (On my machine, fork is about 240usecs, vfork is 78usecs.)

Implement rfork(!RFPROC !RFMEM), which allows a thread to divorce its memory
	from the other threads of a group.

Implement rfork(!RFPROC RFCFDG), which closes all file descriptors, eliminating
	possible existing shares with other threads/processes.

Implement rfork(!RFPROC RFFDG), which divorces the file descriptors for a
	thread from the rest of the group.

Fix the case where a thread does an exec.  It is almost nonsense for a thread
	to modify the other threads address space by an exec, so we
	now automatically divorce the address space before modifying it.
1997-04-13 01:48:35 +00:00
peter
a6f7d93a6f Remove explicit zero of p_vmspace on creation, it's now in the startzero
section of the proc struct.
1997-04-07 09:38:39 +00:00