freebsd-skq

Author	SHA1	Message	Date
julian	aa2dc0a5d9	Part 1 of KSE-III The ability to schedule multiple threads per process (one one cpu) by making ALL system calls optionally asynchronous. to come: ia64 and power-pc patches, patches for gdb, test program (in tools) Reviewed by: Almost everyone who counts (at various times, peter, jhb, matt, alfred, mini, bernd, and a cast of thousands) NOTE: this is still Beta code, and contains lots of debugging stuff. expect slight instability in signals..	2002-06-29 17:26:22 +00:00
trhodes	28d42899b7	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
mux	85b0c22bf2	Convert devfs to nmount. Reviewed by: phk	2002-05-02 20:27:42 +00:00
rwatson	63ab78794e	Divorce proc0 and proc1 credentials earlier; while this isn't technically needed in the current code, in the MAC tree, create_init() relies on the ability to modify the credentials present for initproc, and should not perform that modification on a shared credential. Pro-active diff reduction against MAC changes that are in the queue; also facilitates other work, including the capabilities implementation. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-04-19 13:35:53 +00:00
mux	a207e41bef	Rework the kernel environment subsystem. We now convert the static environment needed at boot time to a dynamic subsystem when VM is up. The dynamic kernel environment is protected by an sx lock. This adds some new functions to manipulate the kernel environment : freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be called after every getenv() when you have finished using the string. testenv() only tests if an environment variable is present, and doesn't require a freeenv() call. setenv() and unsetenv() are self explanatory. The kenv(2) syscall exports these new functionalities to userland, mainly for kenv(1). Reviewed by: peter	2002-04-17 13:06:36 +00:00
jhb	db9aa81e23	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
tanimura	9ae6d1242c	The description of fd_mtx is "filedesc structure."	2002-03-29 11:26:05 +00:00
jeff	dff418f166	Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks with this flag. Remove the dup_list and dup_ok code from subr_witness. Now we just check for the flag instead of doing string compares. Also, switch the process lock, process group lock, and uma per cpu locks over to this interface. The original mechanism did not work well for uma because per cpu lock names are unique to each zone. Approved by: jhb	2002-03-27 09:23:41 +00:00
phk	0a13c0f2f6	Move the mount of the root filesystem to happen in the init process before the exec if /sbin/init. This allows the scheduler to get started and kthreads a chance to run before we start filesystem operations.	2002-03-08 10:33:11 +00:00
peter	b7d611f95e	Fix warning. s/microuptime()/binuptime()/ for switchtime initial value.	2002-02-26 01:03:39 +00:00
tanimura	a09da29859	Lock struct pgrp, session and sigio. New locks are: - pgrpsess_lock which locks the whole pgrps and sessions, - pg_mtx which protects the pgrp members, and - s_mtx which protects the session members. Please refer to sys/proc.h for the coverage of these locks. Changes on the pgrp/session interface: - pgfind() needs the pgrpsess_lock held. - The caller of enterpgrp() is responsible to allocate a new pgrp and session. - Call enterthispgrp() in order to enter an existing pgrp. - pgsignal() requires a pgrp lock held. Reviewed by: jhb, alfred Tested on: cvsup.jp.FreeBSD.org (which is a quad-CPU machine running -current)	2002-02-23 11:12:57 +00:00
phk	fa959f1afd	Convert p->p_runtime and PCPU(switchtime) to bintime format.	2002-02-22 13:32:01 +00:00
julian	37369620df	In a threaded world, differnt priorirites become properties of different entities. Make it so. Reviewed by: jhb@freebsd.org (john baldwin)	2002-02-11 20:37:54 +00:00
julian	b5eb64d6f0	Pre-KSE/M3 commit. this is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embeded there but remove all teh code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice,	2002-02-07 20:58:47 +00:00
alfred	5e2f4cf200	Include sys/_lock.h and sys/_mutex.h to reduce namespace pollution. Requested by: jhb	2002-01-13 21:37:49 +00:00
alfred	844237b396	SMP Lock struct file, filedesc and the global file list. Seigo Tanimura (tanimura) posted the initial delta. I've polished it quite a bit reducing the need for locking and adapting it for KSE. Locks: 1 mutex in each filedesc protects all the fields. protects "struct file" initialization, while a struct file is being changed from &badfileops -> &pipeops or something the filedesc should be locked. 1 mutex in each struct file protects the refcount fields. doesn't protect anything else. the flags used for garbage collection have been moved to f_gcflag which was the FILLER short, this doesn't need locking because the garbage collection is a single threaded container. could likely be made to use a pool mutex. 1 sx lock for the global filelist. struct file * fhold(struct file fp); / increments reference count on a file / struct file fhold_locked(struct file fp); / like fhold but expects file to locked / struct file ffind_hold(struct thread , int fd); / finds the struct file in thread, adds one reference and returns it unlocked / struct file ffind_lock(struct thread , int fd); / ffind_hold, but returns file locked */ I still have to smp-safe the fget cruft, I'll get to that asap.	2002-01-13 11:58:06 +00:00
luigi	4893656ff8	Add/correct description for some sysctl variables where it was missing. The description field is unused in -stable, so the MFC there is equivalent to a comment. It can be done at any time, i am just setting a reminder in 45 days when hopefully we are past 4.5-release. MFC after: 45 days	2001-12-16 16:07:20 +00:00
jhb	21b6b26912	Overhaul the per-CPU support a bit: - The MI portions of struct globaldata have been consolidated into a MI struct pcpu. The MD per-CPU data are specified via a macro defined in machine/pcpu.h. A macro was chosen over a struct mdpcpu so that the interface would be cleaner (PCPU_GET(my_md_field) vs. PCPU_GET(md.md_my_md_field)). - All references to globaldata are changed to pcpu instead. In a UP kernel, this data was stored as global variables which is where the original name came from. In an SMP world this data is per-CPU and ideally private to each CPU outside of the context of debuggers. This also included combining machine/globaldata.h and machine/globals.h into machine/pcpu.h. - The pointer to the thread using the FPU on i386 was renamed from npxthread to fpcurthread to be identical with other architectures. - Make the show pcpu ddb command MI with a MD callout to display MD fields. - The globaldata_register() function was renamed to pcpu_init() and now init's MI fields of a struct pcpu in addition to registering it with the internal array and list. - A pcpu_destroy() function was added to remove a struct pcpu from the internal array and list. Tested on: alpha, i386 Reviewed by: peter, jake	2001-12-11 23:33:44 +00:00
jhb	46e3f92a5d	Add a per-thread ucred reference for syscalls and synchronous traps from userland. The per thread ucred reference is immutable and thus needs no locks to be read. However, until all the proc locking associated with writes to p_ucred are completed, it is still not safe to use the per-thread reference. Tested on: x86 (SMP), alpha, sparc64	2001-10-26 08:12:54 +00:00
jhb	b03804752e	Don't initialize proc0's mutex twice. It is already done earlier on in the MD startup code.	2001-09-18 22:09:47 +00:00
peter	3a353f88b0	In the devfs case, have initproc attempt the easy cases of mounting /dev. This works if /dev exists, or if / is read/write (nfsroot). If it is too hard, leave it up to init -d (which will probably fail if /dev does not exist, but there isn't much else we can do short of making a union mount on /). This means we get a proper /dev if you boot a 5.x kernel on a 4.x world, which I happen to do often (the ramdisks on our install netboot servers have 4.x userland worlds on them).	2001-09-15 11:15:22 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to teh previousl -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
dillon	e028603b7e	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
peter	f10fa038c1	With this commit, I hereby pronounce gensetdefs past its use-by date. Replace the a.out emulation of 'struct linker_set' with something a little more flexible. <sys/linker_set.h> now provides macros for accessing elements and completely hides the implementation. The linker_set.h macros have been on the back burner in various forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()), John Polstra (ELF clue) and myself (cleaned up API and the conversion of the rest of the kernel to use it). The macros declare a strongly typed set. They return elements with the type that you declare the set with, rather than a generic void *. For ELF, we use the magic ld symbols (__start_<setname> and __stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the trick about how to force ld to provide them for kld's. For a.out, we use the old linker_set struct. NOTE: the item lists are no longer null terminated. This is why the code impact is high in certain areas. The runtime linker has a new method to find the linker set boundaries depending on which backend format is in use. linker sets are still module/kld unfriendly and should never be used for anything that may be modular one day. Reviewed by: eivind	2001-06-13 10:58:39 +00:00
rwatson	f504530d9f	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed. p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restruction many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparision. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
jhb	87198ba253	- Lock the VM when initializing the vmspace for proc0. - Don't bother releasing Giant while doing a lookup on the vm_map of initproc while starting up init. We have to grab it again right after the lookup anyways.	2001-05-23 22:06:47 +00:00
alfred	a3f0842419	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
jhb	59ffccfbd6	- Move the setting of bootverbose to a MI SI_SUB_TUNABLES SYSINIT. - Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be toggled after boot. - Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just afer the display of the copyright message instead of doing it by hand in three MD places.	2001-05-17 22:28:46 +00:00
grog	4b9d9cbaac	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
grog	1f5de30718	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
jhb	3588cc574a	Stick proc0 in the PID hash table.	2001-04-11 18:50:50 +00:00
jhb	79cf991a6b	Convert the allproc and proctree locks from lockmgr locks to sx locks.	2001-03-28 11:52:56 +00:00
obrien	1814a78315	Do not set a default ELF syscall ABI fallback. If one runs an un-branded Linux static binary that calls Linux's fcntl the machine will reboot when interupted by the FreeBSD syscall ABI.	2001-03-04 11:58:50 +00:00
iedowse	63324ae546	The kernel did not hold a vnode reference associated with the `rootvnode' pointer, but vfs_syscalls.c's checkdirs() assumed that it did. This bug reliably caused a panic at reboot time if any filesystem had been mounted directly over /. The checkdirs() function is called at mount time to find any process fd_cdir or fd_rdir pointers referencing the covered mountpoint vnode. It transfers these to point at the root of the new filesystem. However, this process was not reversed at unmount time, so processes with a cwd/root at a mount point would unexpectedly lose their cwd/root following a mount-unmount cycle at that mountpoint. This change should fix both of the above issues. Start_init() now holds an extra vnode reference corresponding to `rootvnode', and dounmount() releases this reference when the root filesystem is unmounted just before reboot. Dounmount() now undoes the actions taken by checkdirs() at mount time; any process cdir/rdir pointers that reference the root vnode of the unmounted filesystem are transferred to the now-uncovered vnode. Reviewed by: bde, phk	2001-02-28 20:54:28 +00:00
jake	623f06756a	Sigh. Try to get priorities sorted out. Don't bother trying to update native priority, it is diffcult to get right and likely to end up horribly wrong. Use an honestly wrong fixed value that seems to work; PUSER for user threads, and the interrupt priority for ithreads. Set it once when the process is created and forget about it. Suggested by: bde Pointy hat: me	2001-02-28 02:53:44 +00:00
jake	3d95b58cac	Initialize native priority to PRI_MAX. It was usually 0 which made a process's priority go through the roof when it released a (contested) mutex. Only set the native priority in mtx_lock if hasn't already been set. Reviewed by: jhb	2001-02-26 23:27:35 +00:00
jhb	a97d568931	It turns out the kernel console works fine and thus doesn't need quite this much extra testing.	2001-02-24 03:40:23 +00:00
peter	c4ff4fe19e	Stricter style(9) conformance - remove unnecessary blank lines in previous commit.	2001-02-23 23:05:46 +00:00
jhb	1e88ca386b	Test out the kernel console just before launching the AP's.	2001-02-23 19:44:25 +00:00
rwatson	ab5676fc87	o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritence to be a function of credential inheritence. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project	2001-02-21 06:39:57 +00:00
jake	55d5108ac5	Implement a unified run queue and adjust priority levels accordingly. - All processes go into the same array of queues, with different scheduling classes using different portions of the array. This allows user processes to have their priorities propogated up into interrupt thread range if need be. - I chose 64 run queues as an arbitrary number that is greater than 32. We used to have 4 separate arrays of 32 queues each, so this may not be optimal. The new run queue code was written with this in mind; changing the number of run queues only requires changing constants in runq.h and adjusting the priority levels. - The new run queue code takes the run queue as a parameter. This is intended to be used to create per-cpu run queues. Implement wrappers for compatibility with the old interface which pass in the global run queue structure. - Group the priority level, user priority, native priority (before propogation) and the scheduling class into a struct priority. - Change any hard coded priority levels that I found to use symbolic constants (TTIPRI and TTOPRI). - Remove the curpriority global variable and use that of curproc. This was used to detect when a process' priority had lowered and it should yield. We now effectively yield on every interrupt. - Activate propogate_priority(). It should now have the desired effect without needing to also propogate the scheduling class. - Temporarily comment out the call to vm_page_zero_idle() in the idle loop. It interfered with propogate_priority() because the idle process needed to do a non-blocking acquire of Giant and then other processes would try to propogate their priority onto it. The idle process should not do anything except idle. vm_page_zero_idle() will return in the form of an idle priority kernel thread which is woken up at apprioriate times by the vm system. - Update struct kinfo_proc to the new priority interface. Deliberately change its size by adjusting the spare fields. It remained the same size, but the layout has changed, so userland processes that use it would parse the data incorrectly. The size constraint should really be changed to an arbitrary version number. Also add a debug.sizeof sysctl node for struct kinfo_proc.	2001-02-12 00:20:08 +00:00
jhb	6e847a265b	Move the initailization of the proc lock for proc0 very early into the MD startup code.	2001-02-09 16:25:16 +00:00
bmilekic	f364d4ac36	Change and clean the mutex lock interface. mtx_enter(lock, type) becomes: mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks) mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized) similarily, for releasing a lock, we now have: mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN. We change the caller interface for the two different types of locks because the semantics are entirely different for each case, and this makes it explicitly clear and, at the same time, it rids us of the extra `type' argument. The enter->lock and exit->unlock change has been made with the idea that we're "locking data" and not "entering locked code" in mind. Further, remove all additional "flags" previously passed to the lock acquire/release routines with the exception of two: MTX_QUIET and MTX_NOSWITCH The functionality of these flags is preserved and they can be passed to the lock/unlock routines by calling the corresponding wrappers: mtx_{lock, unlock}_flags(lock, flag(s)) and mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN locks, respectively. Re-inline some lock acq/rel code; in the sleep lock case, we only inline the _obtain_lock()s in order to ensure that the inlined code fits into a cache line. In the spin lock case, we inline recursion and actually only perform a function call if we need to spin. This change has been made with the idea that we generally tend to avoid spin locks and that also the spin locks that we do have and are heavily used (i.e. sched_lock) do recurse, and therefore in an effort to reduce function call overhead for some architectures (such as alpha), we inline recursion for this case. Create a new malloc type for the witness code and retire from using the M_DEV type. The new type is called M_WITNESS and is only declared if WITNESS is enabled. Begin cleaning up some machdep/mutex.h code - specifically updated the "optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently need those. Finally, caught up to the interface changes in all sys code. Contributors: jake, jhb, jasone (in no particular order)	2001-02-09 06:11:45 +00:00
jhb	eef4e5d2f9	- Catch up to p_sflag changes. - The MD code now initializes proc0.p_heldmtx, proc0.p_contested, and curproc. - The MD code calls here with Giant already held. - Proc locking.	2001-01-24 10:40:56 +00:00
jake	4f5d8ed825	Use PCPU_GET, PCPU_PTR and PCPU_SET to access all per-cpu variables other then curproc.	2001-01-10 04:43:51 +00:00
jake	a4ad237eaa	- Change the allproc_lock to use a macro, ALLPROC_LOCK(how), instead of explicit calls to lockmgr. Also provides macros for the flags pased to specify shared, exclusive or release which map to the lockmgr flags. This is so that the use of lockmgr can be easily replaced with optimized reader-writer locks. - Add some locking that I missed the first time.	2000-12-13 00:17:05 +00:00
jhb	fb0ab4bf94	- Add a mutex to the proc structure p_mtx that will be used to lock accesses to each individual proc. - Initialize the lock during fork1(), and destroy it in wait1().	2000-12-03 01:22:34 +00:00
jake	95fb73d495	Use an mp-safe callout for endtsleep.	2000-12-01 04:55:52 +00:00
jake	9326f655fc	Use callout_reset instead of timeout(9). Most callouts are statically allocated, 2 have been added to struct proc for setitimer and sleep. Reviewed by: jhb, jlemon	2000-11-27 22:52:31 +00:00
jake	0c0be4e826	Protect the following with a lockmgr lock: allproc zombproc pidhashtbl proc.p_list proc.p_hash nextpid Reviewed by: jhb Obtained from: BSD/OS and netbsd	2000-11-22 07:42:04 +00:00
obrien	c3b27201d1	ELF kernels should use an ELF sysvec. This allows us to move a.out specific files to those platforms that acutally support a.out.	2000-11-05 10:41:35 +00:00
jhb	d944886e4d	Catch up to moving headers: - machine/ipl.h -> sys/ipl.h - machine/mutex.h -> sys/mutex.h	2000-10-20 07:58:15 +00:00
jhb	ad31d928a6	Release Giant before starting up init. Submitted by: jake	2000-09-15 19:25:29 +00:00
dfr	d721d5e1d6	Move the include of <sys/systm.h> so that KTR gets a declaration for snprintf().	2000-09-10 13:54:52 +00:00
jasone	769e0f974d	Major update to the way synchronization is done in the kernel. Highlights include: * Mutual exclusion is used instead of spl(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.) Per-CPU idle processes. * Interrupts are run in their own separate kernel threads and can be preempted (i386 only). Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh	2000-09-07 01:33:02 +00:00
truckman	0575d8a3f9	Remove uidinfo hash table lookup and maintenance out of chgproccnt() and chgsbsize(), which are called rather frequently and may be called from an interrupt context in the case of chgsbsize(). Instead, do the hash table lookup and maintenance when credentials are changed, which is a lot less frequent. Add pointers to the uidinfo structures to the ucred and pcred structures for fast access. Pass a pointer to the credential to chgproccnt() and chgsbsize() instead of passing the uid. Add a reference count to the uidinfo structure and use it to decide when to free the structure rather than freeing the structure when the resource consumption drops to zero. Move the resource tracking code from kern_proc.c to kern_resource.c. Move some duplicate code sequences in kern_prot.c to separate helper functions. Change KASSERTs in this code to unconditional tests and calls to panic().	2000-09-05 22:11:13 +00:00
phk	e47f61e183	Avoid the modules madness I inadvertently introduced by making the cloning infrastructure standard in kern_conf. Modules are now the same with or without devfs support. If you need to detect if devfs is present, in modules or elsewhere, check the integer variable "devfs_present". This happily removes an ugly hack from kern/vfs_conf.c. This forces a rename of the eventhandler and the standard clone helper function. Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include like <sys/queue.h> Remove all #includes of opt_devfs.h they no longer matter.	2000-09-02 19:17:34 +00:00
phk	b648921acc	Remove all traces of Julians DEVFS (incl from kern/subr_diskslice.c) Remove old DEVFS support fields from dev_t. Make uid, gid & mode members of dev_t and set them in make_dev(). Use correct uid, gid & mode in make_dev in disk minilayer. Add support for registering alias names for a dev_t using the new function make_dev_alias(). These will show up as symlinks in DEVFS. Use makedev() rather than make_dev() for MFSs magic devices to prevent DEVFS from noticing this abuse. Add a field for DEVFS inode number in dev_t. Add new DEVFS in fs/devfs. Add devfs cloning to: disk minilayer (ie: ad(4), sd(4), cd(4) etc etc) md(4), tun(4), bpf(4), fd(4) If DEVFS add -d flag to /sbin/inits args to make it mount devfs. Add commented out DEVFS to GENERIC	2000-08-20 21:34:39 +00:00
peter	85c9a2ddc1	Clean up some low level bootstrap code: - stop using the evil 'struct trapframe' argument for mi_startup() (formerly main()). There are much better ways of doing it. - do not use prepare_usermode() - setregs() in execve() will do it all for us as long as the p_md.md_regs pointer is set. (which is now done in machdep.c rather than init_main.c. The Alpha port did it this way all along and is much cleaner). - collect all the magic %cr0 etc register settings into one place and have the AP's call that instead of using magic numbers (!!) that keep changing over and over again. - Make it safe to call kthread_create() earlier, including during the device probe sequence. It doesn't need the callback mechanism that NetBSD's version uses. - kthreads created this way are root-less as they exist before the root filesystem is mounted. init(1) is set up so that it aquires the root pointers prior to running. If other kthreads want filesystem acccess we can make this code more generic. - set all threads start times once we have decided what time it is. - init uses a trampoline rather than the evil prepare_usermode() hack. - kern_descrip.c has a couple of tweaks to deal with forking when there is no rootdir or cwd etc. - adjust the early SYSINIT() sequence so that a few prereqisites are in place. eg: make sure the run queue is initialized before doing forks. With this, the USB code can easily create a kthread to do the device tree discovery. (I have tested it, it works nicely). There are still some open issues before this is truely useful. - tsleep() does not like working before the clock is running. It sort-of tries to spin wait, but it can do more useful things now. - stopping a kthread in kld code at unload time is "interesting" but we have a solution for that. The Alpha code needs no changes for this. It already uses pretty much the same strategies, but a little cleaner.	2000-08-11 09:05:12 +00:00
peter	ee51d18b57	Fix the SYSINIT() bubble sort. This was fixed in kern_linker.c already.	2000-08-02 21:05:21 +00:00
markm	137a7a4f8d	Remove no-longer-relevant comment.	2000-06-25 10:14:06 +00:00
alfred	7f71a1a091	fix races in the uidinfo subsystem, several problems existed: 1) while allocating a uidinfo struct malloc is called with M_WAITOK, it's possible that while asleep another process by the same user could have woken up earlier and inserted an entry into the uid hash table. Having redundant entries causes inconsistancies that we can't handle. fix: do a non-waiting malloc, and if that fails then do a blocking malloc, after waking up check that no one else has inserted an entry for us already. 2) Because many checks for sbsize were done as "test then set" in a non atomic manner it was possible to exceed the limits put up via races. fix: instead of querying the count then setting, we just attempt to set the count and leave it up to the function to return success or failure. 3) The uidinfo code was inlining and repeating, lookups and insertions and deletions needed to be in their own functions for clarity. Reviewed by: green	2000-06-22 22:27:16 +00:00
jkh	71756b1d36	Add new oid, debug.boothowto. This allows userland apps to see how the kernel was booted and perhaps do conditional things based upon it (sysinstall, for example, will now turn Debug mode on automatically if boot -v was done). Submitted by: msmith Suggested by: ulf	2000-02-25 11:43:08 +00:00
grog	1af318cfbc	If we fail to find init, print out the search path used. This helps differentiate between one of three different scenarios: 1. No init. 2. Path to init munged by an incorrect loader configuration. 3. Root file system not mounted. Reviewed-by: billf	1999-12-20 02:50:49 +00:00
phk	1adcecffd9	struct mountlist and struct mount.mnt_list have no business being a CIRCLEQ. Change them to TAILQ_HEAD and TAILQ_ENTRY respectively. This removes ugly mp != (void*)&mountlist comparisons. Requested by: phk Submitted by: Jake Burkholder jake@checker.org PR: 14967	1999-11-20 10:00:46 +00:00
phk	8fca18de89	This is a partial commit of the patch from PR 14914: Alot of the code in sys/kern directly accesses the Q_HEAD and Q_ENTRY structures for list operations. This patch makes all list operations in sys/kern use the queue(3) macros, rather than directly accessing the *Q_{HEAD,ENTRY} structures. This batch of changes compile to the same object files. Reviewed by: phk Submitted by: Jake Burkholder <jake@checker.org> PR: 14914	1999-11-16 10:56:05 +00:00
msmith	6270a8f009	swapinit isn't called from vfs_mountroot, so don't complain about it in a #if 0'ed comment. Call the machine-dependant cpu_rootconf functions from sysinits in their respective areas, don't do it from a stub here.	1999-11-01 23:53:27 +00:00
phk	8e3c3eafed	useracc() the prequel: Merge the contents (less some trivial bordering the silly comments) of <vm/vm_prot.h> and <vm/vm_inherit.h> into <vm/vm.h>. This puts the #defines for the vm_inherit_t and vm_prot_t types next to their typedefs. This paves the road for the commit to follow shortly: change useracc() to use VM_PROT_{READ\|WRITE} rather than B_{READ\|WRITE} as argument.	1999-10-29 18:09:36 +00:00
peter	787140aa42	Trim unused options (or #ifdef for undoc options). Submitted by: phk	1999-10-11 15:19:12 +00:00
bde	10335a428d	Moved the definition of `boottime' and its sysctl to the correct file.	1999-09-13 14:22:27 +00:00
peter	3b842d34e8	$Id$ -> $FreeBSD$	1999-08-28 01:08:13 +00:00
phk	663cbe4fc2	Convert DEVFS hooks in (most) drivers to make_dev(). Diskslice/label code not yet handled. Vinum, i4b, alpha, pc98 not dealt with (left to respective Maintainers) Add the correct hook for devfs to kern_conf.c The net result of this excercise is that a lot less files depends on DEVFS, and devtoname() gets more sensible output in many cases. A few drivers had minor additional cleanups performed relating to cdevsw registration. A few drivers don't register a cdevsw{} anymore, but only use make_dev().	1999-08-23 20:59:21 +00:00
peter	6cb5fe6c6b	Slight reorganization of kernel thread/process creation. Instead of using SYSINIT_KT() etc (which is a static, compile-time procedure), use a NetBSD-style kthread_create() interface. kproc_start is still available as a SYSINIT() hook. This allowed simplification of chunks of the sysinit code in the process. This kthread_create() is our old kproc_start internals, with the SYSINIT_KT fork hooks grafted in and tweaked to work the same as the NetBSD one. One thing I'd like to do shortly is get rid of nfsiod as a user initiated process. It makes sense for the nfs client code to create them on the fly as needed up to a user settable limit. This means that nfsiod doesn't need to be in /sbin and is always "available". This is a fair bit easier to do outside of the SYSINIT_KT() framework.	1999-07-01 13:21:46 +00:00
peter	a9c3f31bb0	Slight tweak to fork1() calling conventions. Add a third argument so the caller can easily find the child proc struct. fork(), rfork() etc syscalls set p->p_retval[] themselves. Simplify the SYSINIT_KT() code and other kernel thread creators to not need to use pfind() to find the child based on the pid. While here, partly tidy up some of the fork1() code for RF_SIGSHARE etc.	1999-06-30 15:33:41 +00:00
jb	154417badf	Use colons instead of semi-colons to behave like UNIX instead of DOS. Suggested by: bde	1999-05-11 10:08:10 +00:00
peter	2bc4e0eea2	Lites2 seems to have pretty much disappeared from the radar, and I suspect far more than this hack would be needed now..	1999-05-09 20:42:45 +00:00
peter	78e46d3bff	s/main/mi_startup/ for the kernel entry point so that egcs doesn't get upset about it (and generate things like __main() calls that are reserved for main()). Renaming was phk's suggestion, but I'd already thought about it too. (phk liked my suggested name tada() but I decided against it :-) Reviewed by: phk	1999-05-09 19:01:49 +00:00
des	183c88e4f6	Nit fix.	1999-05-07 17:37:08 +00:00
jb	8a1c5093d7	Allow the init_path to be customised in an embedded system using the INIT_PATH config option. Also fix two bugs which caused an infinite loop in none of the programs in the init_path were found. That code was obviously not tested!	1999-05-05 12:20:23 +00:00
billf	dd35516544	Add sysctl descriptions to many SYSCTL_XXXs PR: kern/11197 Submitted by: Adrian Chadd <adrian@FreeBSD.org> Reviewed by: billf(spelling/style/minor nits) Looked at by: bde(style)	1999-05-03 23:57:32 +00:00
dt	b1fc5e056c	Set curproc at the end of proc0_init(). This patch also moves the bogus comment (the comment is still not quite right) and (as a side effect) removes some verbose initialisations (we depend on static initialisation to 0 for almost everything in proc0). The alpha kernels are bootable again. The change won't affect i386's until machdep.c is changed. Submitted by: bde	1999-04-29 22:51:59 +00:00
phk	ca21a25f17	This Implements the mumbled about "Jail" feature. This is a seriously beefed up chroot kind of thing. The process is jailed along the same lines as a chroot does it, but with additional tough restrictions imposed on what the superuser can do. For all I know, it is safe to hand over the root bit inside a prison to the customer living in that prison, this is what it was developed for in fact: "real virtual servers". Each prison has an ip number associated with it, which all IP communications will be coerced to use and each prison has its own hostname. Needless to say, you need more RAM this way, but the advantage is that each customer can run their own particular version of apache and not stomp on the toes of their neighbors. It generally does what one would expect, but setting up a jail still takes a little knowledge. A few notes: I have no scripts for setting up a jail, don't ask me for them. The IP number should be an alias on one of the interfaces. mount a /proc in each jail, it will make ps more useable. /proc/<pid>/status tells the hostname of the prison for jailed processes. Quotas are only sensible if you have a mountpoint per prison. There are no privisions for stopping resource-hogging. Some "#ifdef INET" and similar may be missing (send patches!) If somebody wants to take it from here and develop it into more of a "virtual machine" they should be most welcome! Tools, comments, patches & documentation most welcome. Have fun... Sponsored by: http://www.rndassociates.com/ Run for almost a year by: http://www.servetheweb.com/	1999-04-28 11:38:52 +00:00
luoqi	af7e9be5cc	Enable vmspace sharing on SMP. Major changes are, - %fs register is added to trapframe and saved/restored upon kernel entry/exit. - Per-cpu pages are no longer mapped at the same virtual address. - Each cpu now has a separate gdt selector table. A new segment selector is added to point to per-cpu pages, per-cpu global variables are now accessed through this new selector (%fs). The selectors in gdt table are rearranged for cache line optimization. - fask_vfork is now on as default for both UP and SMP. - Some aio code cleanup. Reviewed by: Alan Cox <alc@cs.rice.edu> John Dyson <dyson@iquest.net> Julian Elischer <julian@whistel.com> Bruce Evans <bde@zeta.org.au> David Greenman <dg@root.com>	1999-04-28 01:04:33 +00:00
dt	3843abdc03	Fixed printf format errors on alpha.	1999-04-24 18:50:48 +00:00
des	d3e9d9a7c3	Make the location of init(8) tunable at boot time.	1999-04-20 21:15:13 +00:00
bde	6d8d63664b	Removed all traces of `p_switchtime'. The relevant timestamp is per-cpu, not per-process. Keep it in `switchtime' consistently. It is now clear that the timestamp is always valid in fork_trampoline() except when the child is running on a previously idle cpu, which can only happen if there are multiple cpus, so don't check or set the timestamp in fork_trampoline except in the (i386) SMP case. Just remove the alpha code for setting it unconditionally, since there is no SMP case for alpha and the code had rotted. Parts reviewed by: dfr, phk	1999-02-28 10:53:29 +00:00
bde	d2f1b649d7	Don't forget to update `switchticks' in corner cases (except for the alpha fork_trampoline(), forget it because it I believe it is only necessary for the unsupported SMP case).	1999-02-25 11:03:08 +00:00
luoqi	082d37c1ac	Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper. Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillion <dillon@apollo.backplane.com>	1999-02-19 14:25:37 +00:00
luoqi	8812d69a9a	Initialize procsig0.ps_refcnt to 1 (instead of 2), this would silence complaints about ps_refcnt greater than two when we try to fork() a kthread from proc0 with RFSIGSHARE flag set. Noticed by: Tor Egge <tegge@fast.no> Reviewed by: Richard Seaman, Jr. <dick@tar.com>	1999-02-17 21:03:14 +00:00
msmith	1637c07695	Remove unused "kern.shutdown_timeout" sysctl node.	1999-01-30 19:36:02 +00:00
dillon	a784c6e33d	More const fixes for -Wall, -Wcast-qual	1999-01-29 23:18:50 +00:00
dillon	1c7115fb5f	More -Wall / -Wcast-qual cleanup. Also, EXEC_SET can't use C_DECLARE_MODULE due to the linker_file_sysinit() function making modifications to the data.	1999-01-29 08:36:45 +00:00
julian	05a2232887	Enable Linux threads support by default. This takes the conditionals out of the code that has been tested by various people for a while. ps and friends (libkvm) will need a recompile as some proc structure changes are made. Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-26 02:38:12 +00:00
julian	a7b385889e	Changes to the LINUX_THREADS support to only allocate extra memory for shared signal handling when there is shared signal handling being used. This removes the main objection to making the shared signal handling a standard ability in rfork() and friends and 'unconditionalising' this code. (i.e. the allocation of an extra 328 bytes per process). Signal handling information remains in the U area until such a time as it's reference count would be incremented to > 1. At that point a new struct is malloc'd and maintained in KVM so that it can be shared between the processes (threads) using it. A function to check the reference count and move the struct back to the U area when it drops back to 1 is also supplied. Signal information is therefore now swapable for all processes that are not sharing that information with other processes. THis should addres the concerns raised by Garrett and others. Submitted by: "Richard Seaman, Jr." <dick@tar.com>	1999-01-07 21:23:50 +00:00
dfr	9aad9d912f	Various changes to support OSF1 emulation: * Move the user stack from VM_MAXUSER_ADDRESS to a place below the 32bit boundary (needed to support 32bit OSF programs). This should also save one pagetable per process. * Add cvtqlsv to the set of instructions handled by the floating point software completion code. * Disable all floating point exceptions by default. * A minor change to execve to allow the OSF1 image activator to support dynamic loading.	1998-12-30 10:38:59 +00:00
julian	d718e5c06d	Fix two bogons created by 'patch(1)' in my last commit.	1998-12-19 08:23:31 +00:00
julian	61490236bc	Reviewed by: Luoqi Chen, Jordan Hubbard Submitted by: "Richard Seaman, Jr." <lists@tar.com> Obtained from: linux :-) Code to allow Linux Threads to run under FreeBSD. By default not enabled This code is dependent on the conditional COMPAT_LINUX_THREADS (suggested by Garret) This is not yet a 'real' option but will be within some number of hours.	1998-12-19 02:55:34 +00:00
peter	184dca28fb	Fix sysinit_add(). - Don't include multiple copies of the previous sysinit in the new one. - Leave space for and explicitly null terminate the new list.	1998-10-15 17:09:19 +00:00
peter	80590bd05b	Implement merging SYSINIT's from preloaded KLD modules. This means we check off SYSINIT entries as they are run, and when more arrive, we re-sort and restart (skipping the already-run entries). This can only be done after KMEM (and malloc) is up and running - this is fine because KLD is the only consumer of this and it's done after that. The nice thing about this is that the SYSINIT's within preloaded KLD modules are executed in their natural order. It should be possible to register devices for the probes which follow, etc. (soon.. several key things prevent this, such as use of linker sets for things like pci devices).	1998-10-09 23:42:47 +00:00
dfr	0d42654efc	Make sure that the argv pointers for init are aligned to the correct boundary on the alpha.	1998-10-06 11:55:40 +00:00
sos	8397655514	Remove the SLICE code. This clearly needs alot more thought, and we dont need this to hunt us down in 3.0-RELEASE.	1998-09-14 19:56:42 +00:00
bde	99afdcef51	Cast pointers to intptr_t instead of or before casting to long. Fixed bitrot in K&R support (suword() now takes a long word). Didn't fix corresponding bitrot in store.9 and fetch.9. The correct types for the store and fetch families are problematic. The `word' functions are unfortunately named and need to be split to handle ints/longs/object pointers/function pointers. Storing argv[] as longs is quite broken when longs are longer than pointers, but usually works because it clobbers variables that will soon be reinitialized.	1998-07-15 05:21:48 +00:00
dfr	1d5f38ac22	This commit fixes various 64bit portability problems required for FreeBSD/alpha. The most significant item is to change the command argument to ioctl functions from int to u_long. This change brings us inline with various other BSD versions. Driver writers may like to use (__FreeBSD_version == 300003) to detect this change. The prototype FreeBSD/alpha machdep will follow in a couple of days time.	1998-06-07 17:13:14 +00:00
phk	d3d65c6b2e	Some cleanups related to timecounters and weird ifdefs in <sys/time.h>. Clean up (or if antipodic: down) some of the msgbuf stuff. Use an inline function rather than a macro for timecounter delta. Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead. Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch() This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems. WARNING: Programs which muck about with struct proc in userland will have to be fixed. Reviewed, but found imperfect by: bde	1998-05-28 09:30:28 +00:00
phk	86337bf437	s/nanoruntime/nanouptime/g s/microruntime/microuptime/g Reviewed by: bde	1998-05-17 11:53:46 +00:00
julian	0796a5c56e	Add changes and code to implement a functional DEVFS. This code will be turned on with the TWO options DEVFS and SLICE. (see LINT) Two labels PRE_DEVFS_SLICE and POST_DEVFS_SLICE will deliniate these changes. /dev will be automatically mounted by init (thanks phk) on bootup. See /sys/dev/slice/slice.4 for more info. All code should act the same without these options enabled. Mike Smith, Poul Henning Kamp, Soeren, and a few dozen others This code does not support the following: bad144 handling. Persistance. (My head is still hurting from the last time we discussed this) ATAPI flopies are not handled by the SLICE code yet. When this code is running, all major numbers are arbitrary and COULD be dynamically assigned. (this is not done, for POLA only) Minor numbers for disk slices ARE arbitray and dynamically assigned.	1998-04-19 23:32:49 +00:00
des	396b114475	Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108.	1998-04-17 22:37:19 +00:00
phk	87312aadcc	When pmap_pinit0() allocates a page for proc0's page directory, kernal page table may need to be extended. But while growing the kernel page table (pmap_growkernel()), newly allocated kernel page table pages are entered into every process' page directory. For proc0, the page directory is not allocated yet, and results in a page fault. Eventually, the machine panics with "lockmgr: not holding exclusive lock". PR: 5458 Reviewed by: phk Submitted by: Luoqi Chen <luoqi@luoqi.watermarkgroup.com>	1998-04-11 17:24:06 +00:00
phk	c641c2b9ee	Minor adjustments to the timecounting and proc0. Mostly Submitted by: bde	1998-04-08 09:01:53 +00:00
peter	6e3ec235ff	curproc is initialized in locore at the same time for both SMP and UP now.	1998-04-06 15:51:22 +00:00
phk	5e9a131f20	Time changes mark 2: * Figure out UTC relative to boottime. Four new functions provide time relative to boottime. * move "runtime" into struct proc. This helps fix the calcru() problem in SMP. * kill mono_time. * add timespec{add\|sub\|cmp} macros to time.h. (XXX: These may change!) * nanosleep, select & poll takes long sleeps one day at a time Reviewed by: bde Tested by: ache and others	1998-04-04 13:26:20 +00:00
phk	9b703b1455	Eradicate the variable "time" from the kernel, using various measures. "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part. Most uses of time.tv_sec now uses the new variable time_second instead. gettime() changed to getmicrotime(0. Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it). A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random. Add a new nfs_curusec() function. Mark a couple of bogosities involving the now disappeard time variable. Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args. Change profiling in ncr.c to use ticks instead of time. Resolution is the same. Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences. Reviewed by: bde	1998-03-30 09:56:58 +00:00
dyson	fbe6fe8df6	Make the rootdir handling more consistent. Now, processes always have a root vnode associated with them, and no special checks for the null case are needed. Submitted by: terry@freebsd.org	1998-02-15 04:17:09 +00:00
eivind	4547a09753	Back out DIAGNOSTIC changes.	1998-02-06 12:14:30 +00:00
eivind	c552a9a1c3	Turn DIAGNOSTIC into a new-style option.	1998-02-04 22:34:03 +00:00
phk	bb6f7d8184	Retire LFS. If you want to play with it, you can find the final version of the code in the repository the tag LFS_RETIREMENT. If somebody makes LFS work again, adding it back is certainly desireable, but as it is now nobody seems to care much about it, and it has suffered considerable bitrot since its somewhat haphazard integration. R.I.P	1998-01-30 11:34:06 +00:00
dyson	197bd655c4	VM level code cleanups. 1) Start using TSM. Struct procs continue to point to upages structure, after being freed. Struct vmspace continues to point to pte object and kva space for kstack. u_map is now superfluous. 2) vm_map's don't need to be reference counted. They always exist either in the kernel or in a vmspace. The vmspaces are managed by reference counts. 3) Remove the "wired" vm_map nonsense. 4) No need to keep a cache of kernel stack kva's. 5) Get rid of strange looking ++var, and change to var++. 6) Change more data structures to use our "zone" allocator. Added struct proc, struct vmspace and struct vnode. This saves a significant amount of kva space and physical memory. Additionally, this enables TSM for the zone managed memory. 7) Keep ioopt disabled for now. 8) Remove the now bogus "single use" map concept. 9) Use generation counts or id's for data structures residing in TSM, where it allows us to avoid unneeded restart overhead during traversals, where blocking might occur. 10) Account better for memory deficits, so the pageout daemon will be able to make enough memory available (experimental.) 11) Fix some vnode locking problems. (From Tor, I think.) 12) Add a check in ufs_lookup, to avoid lots of unneeded calls to bcmp. (experimental.) 13) Significantly shrink, cleanup, and make slightly faster the vm_fault.c code. Use generation counts, get rid of unneded collpase operations, and clean up the cluster code. 14) Make vm_zone more suitable for TSM. This commit is partially as a result of discussions and contributions from other people, including DG, Tor Egge, PHK, and probably others that I have forgotten to attribute (so let me know, if I forgot.) This is not the infamous, final cleanup of the vnode stuff, but a necessary step. Vnode mgmt should be correct, but things might still change, and there is still some missing stuff (like ioopt, and physical backing of non-merged cache files, debugging of layering concepts.)	1998-01-22 17:30:44 +00:00
dyson	738872cad6	After one of my analysis passes to evaluate methods for SMP TLB mgmt, I noticed some major enhancements available for UP situations. The number of UP TLB flushes is decreased much more than significantly with these changes. Since a TLB flush appears to cost minimally approx 80 cycles, this is a "nice" enhancement, equiv to eliminating between 40 and 160 instructions per TLB flush. Changes include making sure that kernel threads all use the same PTD, and eliminate unneeded PTD switches at context switch time.	1997-12-14 02:11:23 +00:00
dyson	1835b7ab0e	We have had support for running the kernel daemons as threads for quite a while, but forgot to do so. For now, this code supports most daemons running as kernel threads in UP kernels, and as full processes in SMP. We will soon be able to run them as threads in SMP, but not yet.	1997-12-12 04:00:59 +00:00
sef	c7d273eccb	Changes to allow event-based process monitoring and control.	1997-12-06 04:11:14 +00:00
julian	cf4eb29e47	Shift a few SYSINT() calls around. this results in a few functions becoming static, and the SYSINITs being close to the code they are related to. setting up the dump device is with dumpsys() and kicking off the scheduler is with the scheduler. Mounting root is with the code that does it. Reviewed by: phk	1997-11-25 07:07:48 +00:00
bde	67ac522ce5	Fixed multiple definitions of boothowto. Fixed bitrot in the read-only access to kern.boottime.	1997-11-24 18:35:04 +00:00
phk	4d26888936	Remove a bunch of variables which were unused both in GENERIC and LINT. Found by: -Wunused	1997-11-07 08:53:44 +00:00
phk	4c8218a5c7	Move the "retval" (3rd) parameter from all syscall functions and put it in struct proc instead. This fixes a boatload of compiler warning, and removes a lot of cruft from the sources. I have not removed the /ARGSUSED/, they will require some looking at. libkvm, ps and other userland struct proc frobbing programs will need recompiled.	1997-11-06 19:29:57 +00:00
gibbs	52ace446d2	init_main.c subr_autoconf.c: Add support for "interrupt driven configuration hooks". A component of the kernel can register a hook, most likely during auto-configuration, and receive a callback once interrupt services are available. This callback will occur before the root and dump devices are configured, so the configuration task can affect the selection of those two devices or complete any tasks that need to be performed prior to launching init. System boot is posponed so long as a hook is registered. The hook owner is responsible for removing the hook once their task is complete or the system boot can continue. kern_acct.c kern_clock.c kern_exit.c kern_synch.c kern_time.c: Change the interface and implementation for the kernel callout service. The new implemntaion is based on the work of Adam M. Costello and George Varghese, published in a technical report entitled "Redesigning the BSD Callout and Timer Facilities". The interface used in FreeBSD is a little different than the one outlined in the paper. The new function prototypes are: struct callout_handle timeout(void (func)(void ), void arg, int ticks); void untimeout(void (func)(void ), void arg, struct callout_handle handle); If a client wishes to remove a timeout, it must store the callout_handle returned by timeout and pass it to untimeout. The new implementation gives 0(1) insert and removal of callouts making this interface scale well even for applications that keep 100s of callouts outstanding. See the updated timeout.9 man page for more details.	1997-09-21 22:00:25 +00:00
bde	6ffb8bf9af	Removed unused #includes.	1997-09-02 20:06:59 +00:00
peter	7dfe3e5abe	Clean up the SMP AP bootstrap and eliminate the wretched idle procs. - We now have enough per-cpu idle context, the real idle loop has been revived (cpu's halt now with nothing to do). - Some preliminary support for running some operations outside the global lock (eg: zeroing "free but not yet zeroed pages") is present but appears to cause problems. Off by default. - the smp_active sysctl now behaves differently. It's merely a 'true/false' option. Setting smp_active to zero causes the AP's to halt in the idle loop and stop scheduling processes. - bootstrap is a lot safer. Instead of sharing a statically compiled in stack a number of times (which has caused lots of problems) and then abandoning it, we use the idle context to boot the AP's directly. This should help >2 cpu support since the bootlock stuff was in doubt. - print physical apic id in traps.. helps identify private pages getting out of sync. (You don't want to know how much hair I tore out with this!) More cleanup to follow, this is more of a checkpoint than a 'finished' thing.	1997-08-26 18:10:38 +00:00
fsmp	36308aa291	The promised "better fix" for "Trap 9 When Boot SMP" problem. We now tsleep() in kthread_init() between start_init() and prepare_usermode() while waiting for ALL the idle_loop() processes to come online. Debugged & tested by: "Thomas D. Dean" <tomdean@ix.netcom.com> Reviewed by: David Greenman <dg@root.com>	1997-08-15 02:33:30 +00:00
fsmp	585d84061d	Fixes kern/3835: SMP kernel crash on enable "dumps on wd0" - SMP: set value of curproc in main(), before the SYSINIT stuff runs. Reviewed by: Bruce Evans <bde@zeta.org.au>	1997-08-07 21:22:29 +00:00
dyson	8fa8ae3d0d	Get rid of the ad-hoc memory allocator for vm_map_entries, in lieu of a simple, clean zone type allocator. This new allocator will also be used for machine dependent pmap PV entries.	1997-08-05 00:02:08 +00:00
davidn	9c6cf56f17	Adds sysctl int for shutdown timeout. Reviewed by: Poul-Henning Kamp <phk@dk.tfs.com>	1997-07-10 11:44:42 +00:00
peter	2dc5ff96e7	Preliminary support for per-cpu data pages. This eliminates a lot of #ifdef SMP type code. Things like _curproc reside in a data page that is unique on each cpu, eliminating the expensive macros like: #define curproc (SMPcurproc[cpunumber()]) There are some unresolved bootstrap and address space sharing issues at present, but Steve is waiting on this for other work. There is still some strictly temporary code present that isn't exactly pretty. This is part of a larger change that has run into some bumps, this part is standalone so it should be safe. The temporary code goes away when the full idle cpu support is finished. Reviewed by: fsmp, dyson	1997-06-22 16:04:22 +00:00
dyson	1dcc2689e7	Modifications to existing files to support the initial AIO/LIO and kernel based threading support.	1997-06-16 00:29:36 +00:00
peter	007e29a189	Don't need "opt_smp.h" on these files	1997-05-29 04:52:04 +00:00
tegge	6ea632b44d	Bring in some kernel bootp support. This removes the need for netboot to fill in the nfs_diskless structure, at the cost of some kernel bloat. The advantage is that this code works on a wider range of network adapters than netboot. Several new kernel options are documented in LINT. Obtained from: parts of the code comes from NetBSD.	1997-05-11 18:05:39 +00:00
peter	6323aa10bf	Man the liferafts! Here comes the long awaited SMP -> -current merge! There are various options documented in i386/conf/LINT, there is more to come over the next few days. The kernel should run pretty much "as before" without the options to activate SMP mode. There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment. This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!	1997-04-26 11:46:25 +00:00
peter	ecf50a7463	The biggie: Get rid of the UPAGES from the top of the per-process address space. (!) Have each process use the kernel stack and pcb in the kvm space. Since the stacks are at a different address, we cannot copy the stack at fork() and allow the child to return up through the function call tree to return to user mode - create a new execution context and have the new process begin executing from cpu_switch() and go to user mode directly. In theory this should speed up fork a bit. Context switch the tss_esp0 pointer in the common tss. This is a lot simpler since than swithching the gdt[GPROC0_SEL].sd.sd_base pointer to each process's tss since the esp0 pointer is a 32 bit pointer, and the sd_base setting is split into three different bit sections at non-aligned boundaries and requires a lot of twiddling to reset. The 8K of memory at the top of the process space is now empty, and unmapped (and unmappable, it's higher than VM_MAXUSER_ADDRESS). Simplity the pmap code to manage process contexts, we no longer have to double map the UPAGES, this simplifies and should measuably speed up fork(). The following parts came from John Dyson: Set PG_G on the UPAGES that are now in kernel context, and invalidate them when swapping them out. Move the upages object (upobj) from the vmspace to the proc structure. Now that the UPAGES (pcb and kernel stack) are out of user space, make rfork(..RFMEM..) do what was intended by sharing the vmspace entirely via reference counting rather than simply inheriting the mappings.	1997-04-07 07:16:06 +00:00
bde	0bc1781701	Fixed some invalid (non-atomic) accesses to `time', mostly ones of the form `tv = time'. Use a new function gettime(). The current version just forces atomicicity without fixing precision or efficiency bugs. Simplified some related valid accesses by using the central function.	1997-03-22 06:53:45 +00:00
wosch	9108e06194	Include copyright message from <sys/copyright.h>	1997-03-01 17:49:09 +00:00
peter	94b6d72794	Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.	1997-02-22 09:48:43 +00:00
dyson	10f666af84	This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes. The system boots and can mount UFS filesystems. Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed. Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>	1997-02-10 02:22:35 +00:00
bde	4034801600	Set the soft openfiles limit to maxfiles instead of to NOFILE. The limit is now only used by init, so it may as well be "infinite". Don't use RLIM_INFINITY, since setrlimit() doesn't allow setting that value. Use maxfiles instead of RLIM_INFINITY for the hard limit for the same reason. Similarly for the maxprocesses limits (use the "infinite" value of maxproc instead if MAXUPRC and RLIM_INFINITY). NOFILES, MAXUPRC, CHILD_MAX and OPEN_MAX are no longer used in /usr/src and should go away. Their values are almost guaranteed to be wrong now that login.conf exists, so anything that uses the values is broken. Unfortunately, there are probably a lot of ports that depend on them being defined. The global limits maxfilesperproc and maxprocperuid should go away too.	1997-01-27 12:43:36 +00:00
bde	68c982254b	Reduced #include spam in <sys/sysproto.h> and fixed things that depended on it. makesyscalls.sh: This parsed $Id$. Fixed(?) to parse $FreeBSD$. The output is wrong when the id is not expanded in the source file. syscalls.master: Fixed declaration of sigsuspend(). There are still some bogons and spam involving sigset_t. Use `struct foo ' instead of the equivalent `foo_t ' for some nfs and lfs syscalls so that <sys/sysproto.h> doesn't depend on <sys/mount.h>.	1997-01-16 15:58:32 +00:00
jkh	808a36ef65	Make the long-awaited change from $Id$ to $FreeBSD$ This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long. Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.	1997-01-14 07:20:47 +00:00
alex	ad4ccae81c	Typo fix.	1996-12-17 00:46:07 +00:00
phk	ef2dec2170	init_main.c: pass -d to init if DEVFS_ROOT kern_conf.c: gd driver is a disk. vfs_subr.c: include opt_devfs.h	1996-10-28 11:34:57 +00:00
alex	0392fde426	Fix signed/unsigned comparison warnings. Reviewed by: bde	1996-10-20 21:01:46 +00:00
peter	d984089805	call srandom() during the boot to start the sequence with a slightly less predictable seed.	1996-09-23 04:37:54 +00:00
asami	bbb6994e50	Second phase of merge, get rid of more machine-independent-dependencies. Get rid of pc98/pc98/pc98_device.h. Submitted by: The FreeBSD(98) Development Team	1996-09-03 10:24:29 +00:00
asami	db0af2c4dc	s/pc98/isa/g in struct _device and _driver. Resync along the way. Submitted by: The FreeBSD(98) Development Team	1996-08-31 15:07:42 +00:00

1 2 3 4 5 ...

296 Commits