in various extattr_*() calls to match the rest of the file. Originally,
these bits at the end looked more like style(9). This patch was submitted
by green by way of the TrustedBSD MAC tree, and I fixed a few problems
with it on the way through. Someone with more time on their hands should
convert the entire file to style(9); this commit is for diff reduction
purposes.
Submitted by: green
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
constructing a struct uio and invoking VOP_READ() directly. This cleans
up the code a little, but also has the advantage of making sure almost
all vnode read/write access in the kernel goes through the helper
function, meaning that instrumentation of that helper function can impact
almost all relevant read/write operations. In this case, it permits us
to put MAC hooks into vn_rdwr() and not modify uipc_syscalls.c (yet).
In general, if helper vn_*() functions exist, they should be used in
preference to direct VOP's in system call service code.
Submitted by: green
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
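For illustration, a minimal sketch of the preferred pattern (the vn_rdwr()
argument list shown is approximate and has varied over time; 'vp', 'buf',
'len', 'offset' and 'td' are assumed to be in scope):

    int error, resid;

    /* One helper call replaces hand-rolled uio setup plus VOP_READ(). */
    error = vn_rdwr(UIO_READ, vp, (caddr_t)buf, len, offset, UIO_SYSSPACE,
        IO_NODELOCKED, td->td_ucred, &resid, td);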
needed in the current code, in the MAC tree, create_init() relies on the
ability to modify the credentials present for initproc, and should not
perform that modification on a shared credential. Pro-active diff
reduction against MAC changes that are in the queue; also facilitates
other work, including the capabilities implementation.
Submitted by: green
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
environment needed at boot time to a dynamic subsystem when VM is
up. The dynamic kernel environment is protected by an sx lock.
This adds some new functions to manipulate the kernel environment:
freeenv(), setenv(), unsetenv() and testenv(). freeenv() has to be
called after every getenv() when you have finished using the string.
testenv() only tests if an environment variable is present, and
doesn't require a freeenv() call. setenv() and unsetenv() are self
explanatory.
The kenv(2) syscall exports this new functionality to userland,
mainly for kenv(1).
Reviewed by: peter
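A sketch of the in-kernel usage described above (the "hw.foo.*" variable
names are made up for the example):

    char *val;

    if (testenv("hw.foo.disable"))      /* presence test; no freeenv() */
        printf("hw.foo.disable is set\n");

    val = getenv("hw.foo.mode");
    if (val != NULL) {
        printf("hw.foo.mode=%s\n", val);
        freeenv(val);                   /* required after every getenv() */
    }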
where some client operations might be unexpectedly cancelled during
an unsuccessful non-forced unmount attempt. This causes problems
for amd(8), because it periodically attempts a non-forced unmount
to check if the filesystem is still in use.
Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set
only during the operation of a forced unmount. Use this instead of
MNTK_UNMOUNT to trigger the cancellation of hung NFS operations.
Also correct a problem where dounmount() might inadvertently clear
the MNTK_UNMOUNT flag.
Reported by: simokawa
MFC after: 1 week
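In the NFS client this plausibly reduces to a test along these lines
(a sketch, not the literal patch; 'mp' is the relevant struct mount *):

    /* Cancel hung operations only when the unmount is forced. */
    if ((mp->mnt_kern_flag & MNTK_UNMOUNTF) != 0)
        return (EINTR);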
- Use temporary variables to hold a pointer to a pgrp while we dink with it
without holding either the associated proc lock or proctree_lock. It
is in theory possible that p->p_pgrp could change out from under us.
sx lock. Trying to get the lock order between these locks was getting
too complicated as the locking in wait1() was being fixed.
- leavepgrp() now requires an exclusive lock of proctree_lock to be held
when it is called.
- fixjobc() no longer gets a shared lock of proctree_lock now that it
requires an xlock be held by the caller.
- Locking notes in sys/proc.h are adjusted to note that everything that
used to be protected by the pgrpsess_lock is now protected by the
proctree_lock.
Apply the change as a continuous slew rather than as a series of
discrete steps and make it possible to adjust arbitrarily huge
amounts of time in either direction.
In practice this is done by hooking into the same once-per-second
loop as the NTP PLL and setting a suitable frequency offset, deducting
the amount slewed from the remainder. If the remaining delta is
larger than 1 second we slew at 5000PPM (5msec/sec), for a delta
less than a second we slew at 500PPM (500usec/sec) and for the last
one second period we will slew at whatever rate (less than 500PPM)
it takes to eliminate the delta entirely.
The old implementation stepped the clock a number of microseconds
every HZ to achieve the same effect, using the same rates of change.
Eliminate the global variables tickadj, tickdelta and timedelta and
their various uses and initializations.
This removes the most significant obstacle to running timecounter and
NTP housekeeping from a timeout rather than hardclock.
information related to bucket size efficiency. Three things are printed on
each row:
Size is the size the user actually asked for rounded to 16 bytes.
Requests is the number of times this size was asked for.
Real Size is the size we actually handed out.
At the end the total memory used and total waste are displayed. Currently my
system displays about 33% wasted memory.
The intent of this code is to gather statistics for tuning the malloc bucket
sizes. It is not intended to be run with INVARIANTS and it is not entirely
mp safe. It can be enabled via 'options MALLOC_PROFILE' which was committed
earlier.
Updated the kmemzones logic such that the ks_size bitmap can be used as an
index into it to report the size of the zone used.
Create the kern.malloc sysctl which replaces the kvm mechanism to report
similar data. This will provide an easy place for statistics aggregation if
malloc_type statistics become per cpu data.
Add some code ifdef'd under MALLOC_PROFILING to facilitate a tool for sizing
the malloc buckets.
we can use td_ucred.
- In killpg1(), the proc lock is sufficient to check if p_stat is SZOMB
or not. We don't need sched_lock.
- Close some races in psignal(). In psignal() there is a big switch
statement based on p_stat. All the different cases are assuming that
the process (or thread) isn't going to change state out from under it.
To ensure this is true, just lock sched_lock for the entire switch. We
practically held it the entire time already anyway. This also
simplifies the locking somewhat and actually results in fewer lock
operations.
- Allow signotify() to be called with the sched_lock held since psignal()
now does that.
- Use td_ucred in a couple of places.
process so it can use td_ucred.
- Require the target process of donice() to be locked when donice() is
called.
- Use td_ucred.
- Lock the target process across p_cansee() and while reading the
credentials of a process.
- Change the logic of rtprio() slightly so it does its copyin() if needed
prior to locking the target process.
- rtprio() no longer needs Giant. In theory with full KSE it would still
need Giant to protect p_ucred of curproc for the p_canfoo() functions
but p_canfoo() will be changing to using td_ucred of curthread before
full KSE hits the tree.
allocate a blank cred first, lock the process, perform checks on the
old process credential, copy the old process credential into the new
blank credential, modify the new credential, update the process
credential pointer, unlock the process, and cleanup rather than trying
to allocate a new credential after performing the checks on the old
credential.
- Cleanup _setugid() a little bit.
- setlogin() doesn't need Giant thanks to pgrp/session locking and
td_ucred.
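The allocate-then-swap pattern described above, roughly sketched below;
change_euid() stands in for whatever modification a particular syscall
performs, and 'p' and 'euid' are assumed to be in scope:

    newcred = crget();          /* may sleep, so do it before locking */
    PROC_LOCK(p);
    oldcred = p->p_ucred;
    /* ... permission checks against oldcred ... */
    crcopy(newcred, oldcred);
    change_euid(newcred, euid); /* modify only the private copy */
    p->p_ucred = newcred;
    PROC_UNLOCK(p);
    crfree(oldcred);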
and acquire the proctree_lock if needed first. Then we lock the process
if necessary and fiddle with it as appropriate. Finally we drop locks and
do any needed copyout's. This greatly simplifies the locking.
belong to a user virtual address; while this happens to work on some
architectures, it can't on sparc64, since user and kernel virtual
address spaces overlap there (the distinction between them is done via
separate address space identifiers).
Instead, look up the page in the vm_map of the process in question.
Reviewed by: jake
so it can use td_ucred.
- Push Giant down into the end of settime() where we actually set the time
on the timecounter and time of day clock.
- Remove Giant from clock_settime().
- Push Giant down in settimeofday() to just protect the 'tz' global
variable.
linker_search_module().
Without this, modules loaded from loader.conf that then try to load
in additional modules (such as digi.ko loading a card's BIOS) die
badly in the vn_open() called from linker_search_module().
It may be worth checking (KASSERTing?) that rootdev != NODEV in
vn_open() too.
mod_depend * (which may be NULL). The only consumer of this
function at the moment is digi_loadmoduledata(), and that passes
a NULL mod_depend *.
In linker_reference_module(), check to see if we've already got
the required module loaded. If we have, bump the reference count
and return that, otherwise continue the module search as normal.
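Callers might then look like this sketch (the module name is illustrative):

    linker_file_t lf;
    int error;

    /* A NULL mod_depend * accepts any version of the module. */
    error = linker_reference_module("digi_BIOS", NULL, &lf);
    if (error != 0)
        return (error);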
is called.
- Change sysctl_out_proc() to require that the process is locked when it
is called and to drop the lock before it returns. If this proves too
complex we can change sysctl_out_proc() to simply acquire the lock at
the very end and have the calling code drop the lock right after it
returns.
- Lock the process we are going to export before the p_cansee() in the
loop in sysctl_kern_proc() and hold the lock until we call
sysctl_out_proc().
- Don't call p_cansee() on the process about to be exported twice in
the aforementioned loop.
p_pgrp since the pgrp locking went in. We also don't need it to check for
invalid values in the options argument to wait1(), so push Giant down
slightly.
behavior by default. Also, change the options line to reflect this.
If there are no problems reported this will become the only behavior and the
knob will be removed in a month or so.
Demanded by: obrien
separate strings instead of passing "foo=bar".
o Don't forget to clear the VMOUNT flag on the vnode when vfs_nmount()
fails because the fs doesn't implement VFS_NMOUNT (and in vfs_mount()
when the fs doesn't implement VFS_MOUNT); also decrement the vfs
refcount in the !MNT_UPDATE case.
that we can compile gcc. This is a hack because it adds a fixed 2MB to
each process's VSIZE regardless of how much is really being used since
there is no grow-up stack support. At least it isn't physical memory.
Sigh.
Add a sysctl to enable tweaking it for new processes.
a set of helper routines to deal with real-time clocks. The generic
functions access the clock driver using a kobj interface. This is intended
to reduce code duplication and make it easy to support more than one
clock model on a single architecture.
This code is currently only used on sparc64, but it is planned to convert
the code of the other architectures to it later.
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
the generic lock type for use with witness. If this argument is NULL then
the lock name is used as the lock type. Add a macro for a lock type name
for network driver locks.
point to a more generic name for a lock that is more suitable for use by
witness when grouping locks. For example, although network driver locks
use the interface name for the name of each lock, they should all share a
single lock type and be treated the same by witness. Another example is that
all UMA zone locks should be treated the same. The witness code has also
been updated to print out the lock type in addition to the lock name in a
few places where it is relevant.
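Under this scheme a network driver might initialize its lock as in the
sketch below ('sc' and 'dev' are assumed driver softc and device_t): a
unique per-instance name, but one shared type so witness can group them:

    mtx_init(&sc->sc_mtx, device_get_nameunit(dev), MTX_NETWORK_LOCK,
        MTX_DEF);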
This shrinks the size by 4 bytes on alpha, down to the same 276 bytes
as all other platforms.
Construct a hack to make old ioctls work on new kernels.
Once world is recompiled only the new and correct sysctls will be
used.
This hack will become annoying around the 1st of May to make people
rebuild their worlds and it will be gone before 5.0.
they aren't in the usual path of execution for syscalls and traps.
The main complication for this is that we have to set flags to control
ast() everywhere that changes the signal mask.
Avoid locking in userret() in most of the remaining cases.
Submitted by: luoqi (first part only, long ago, reorganized by me)
Reminded by: dillon
inline function sigsetmasked() and a new macro SIGPENDING(). CURSIG()
will soon be moved out of the normal path of execution for syscalls and
traps. Then its efficiency will be less important but the new interfaces
will be useful for checking for unmasked pending signals in more places.
Submitted by: luoqi (long ago, in a slightly different form)
Assert that sched_lock is not held in CURSIG().
We get enough protection from the lock on the individual lists that we
acquire later.
Noticed/Tested by: Steven G. Kargl <kargl@troutmask.apl.washington.edu>
Submitted by: Jonathan Mini <mini@haikugeek.com>
securelevel_*() to be NULL for a while now.
- Use KASSERT() instead of if (foo) panic(); to optimize the
!INVARIANTS case.
Submitted by: Martin Faxer <gmh003532@brfmasthugget.se>
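The transformation is mechanical; for example (the condition shown is
generic):

    /* Before: compiled in even without INVARIANTS. */
    if (cred == NULL)
        panic("securelevel_gt: cred is NULL");

    /* After: compiles away entirely in the !INVARIANTS case. */
    KASSERT(cred != NULL, ("securelevel_gt: cred is NULL"));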
without removing the buffer from the vnode's dirty buffer list, which
can result in a panic in NFS. Replaced the code with a call to bundirty()
which deals with it properly.
PR: kern/36108, kern/36174
Submitted by: various people
Special mention: to Danny Schales <dan@coes.LaTech.edu> for providing a core dump that helped me track this down.
MFC after: 1 day
even when the number of records approaches the size of the hash table.
Besides, the previous implementation (using linear probing) was broken :)
Also, use the newly introduced MTX_SYSINIT.
various machdep.c's to being declared in kern_mutex.c.
- Add a new function mutex_init() used to perform early initialization
needed for mutexes such as setting up thread0's contested lock list
and initializing MI mutexes. Change the various MD startup routines
to call this function instead of duplicating all the code themselves.
Tested on: alpha, i386
locks to be able to set up a SYSINIT call. This helps in places where
a lock is needed to protect some data, but the data is not truly
associated with a subsystem that can properly initialize its lock.
The macros use the mtx_sysinit() and sx_sysinit() functions,
respectively, as the handler argument to SYSINIT().
Reviewed by: alfred, jhb, smp@
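A sketch of the resulting idiom for a file-scope lock with no obvious
owning subsystem (the "foo" names are placeholders):

    static struct mtx foo_mtx;
    MTX_SYSINIT(foo_mtx, &foo_mtx, "foo mutex", MTX_DEF);

    static struct sx foo_sx;
    SX_SYSINIT(foo_sx, &foo_sx, "foo sx");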
release times. Measurements are made and stored in nanoseconds but
presented in microseconds, which should be sufficient for the locks for
which we actually want this (those that are held long and / or often).
Also, rename some variables and structure members to unit-agnostic names.
Rename memlock to sysctllock, and MEMLOCK()/MEMUNLOCK() to SYSCTL_LOCK()/
SYSCTL_UNLOCK(), with related changes to make the lock names make more
sense.
Submitted by: Jonathan Mini <mini@haikugeek.com>
following sysctl variables:
debug.mutex.prof.enable enable / disable profiling
debug.mutex.prof.acquisitions number of mutex acquisitions recorded
debug.mutex.prof.records number of acquisition points recorded
debug.mutex.prof.maxrecords max number of acquisition points
debug.mutex.prof.rejected number of rejections (due to full table)
debug.mutex.prof.hashsize hash size
debug.mutex.prof.collisions number of hash collisions
debug.mutex.prof.stats profiling statistics
The code records four numbers for each acquisition point (identified by
source file name and line number): longest time held, total time held,
number of non-recursive acquisitions, average time held. The measurements
are in clock cycles (as returned by get_cyclecount(9)); this may cause
measurements on some SMP systems to be unreliable. This can probably be
worked around by replacing get_cyclecount(9) by some incarnation of
nanotime(9).
This work was derived from initial patches by eivind.
and cpu_critical_exit() and moves associated critical prototypes into their
own header file, <arch>/<arch>/critical.h, which is only included by the
three MI source files that need it.
Backout and re-apply improperly committed syntactical cleanups made to files
that were still under active development. Backout improperly committed program
structure changes that moved localized declarations to the top of two
procedures. Partially re-apply one of the program structure changes to
move 'mask' into an intermediate block rather than in three separate
sub-blocks to make the code more readable. Re-integrate bug fixes that Jake
made to the sparc64 code.
Note: In general, developers should not gratuitously move declarations out
of sub-blocks. They are where they are for reasons of structure, grouping,
readability, compiler-localizability, and to avoid developer-introduced bugs
similar to several found in recent years in the VFS and VM code.
Reviewed by: jake
general cleanup of the API. The entire API now consists of two functions
similar to the pre-KSE API. The suser() function takes a thread pointer
as its only argument. The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0. The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.
Discussed on: smp@
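In use, the two entry points look roughly like this sketch ('td' and 'p'
are assumed to be in scope):

    /* Common case: check the current thread's credential. */
    if (suser(td) != 0)
        return (EPERM);

    /* Explicit credential, honoring a jail's limited root. */
    error = suser_cred(p->p_ucred, PRISON_ROOT);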
dump the trace buffer feasible.
- Remove KTR_EXTEND. This changes the format of the trace entries when
activated, making writing a userland tool which is not tied to a specific
kernel configuration difficult.
- Use get_cyclecount() for timestamps. nanotime() is much too heavyweight
and requires recursion protection due to ktr traces occurring as a result
of ktr traces. KTR_VERBOSE may still require recursion protection, which
is now conditional on it.
- Allow KTR_CPU to be overridden by MD code. This is so that it is possible
to trace early in startup before pcpu and/or curthread are setup.
- Add a version number for the ktr interface. A userland tool can check this
to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
version is no more.
These changes will break the gdb macros floating around that decode the
extended version of the trace buffer. Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.
Approved by: jhb
Tested on: i386, sparc64
Caveats:
The new savecore program is not complete in the sense that it emulates
enough of the old savecore's features to do the job, but implements none
of the options yet.
I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a users point of view, compression,
email-notification, space reservation etc etc. (send me email if
you are interested).
Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.
All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.
Documentation is quite sparse at this time, more to come.
Details:
ATA and SCSI drivers should work as the dump formatting code has been
removed. The IDA, TWE and AAC have not yet been converted.
Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev. To implement the "off" argument, /dev/null
is used as the device.
Savecore will fail if handed any options since they are not (yet)
implemented. All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record. The header record
is dumped in readable format in the .info file. The kernel
is not saved. Only complete dumps will be saved.
All maintainer rights for this code are disclaimed: feel free to
improve and extend.
Sponsored by: DARPA, NAI Labs
the non-GEOM code as well. This simplifies the kernel-dumping
and disk-management tools as less compatibility cruft will be needed.
Sponsored by: DARPA and NAI Labs.
while holding the proc lock, and by holding the pargs structure when
accessing it from outside of the owner.
Submitted by: Jonathan Mini <mini@haikugeek.com>
These functions use DEV_STRATEGY() which can easily return a short
count (with no error) for reads near EOF. EOF happens for "disks" too
small to contain a label sector (mainly for empty slices). The functions
didn't understand this at all, and looked for labels in the garbage
in the buffer beyond what DEV_STRATEGY() returned. The recent UMA
changes combined with my local changes and configuration resulted in
the garbage often containing a valid but garbage label left over from
a previous call.
Bugs in EOF handling in -current limited the problem to "disks" with
size precisely LABELSECTOR sectors. LABELSECTOR happens to be a very
unusual "disk" size since it is only 0 for non-i386 arches that don't
usually have disks with DOS MBRs.
back into the calling MD code. The MD code must ensure no races between
checking the astpending flag and returning to usermode.
Submitted by: peter (ia64 bits)
Tested on: alpha (peter, jeff), i386, ia64 (peter), sparc64
in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240.
- flag2 is a low quality variable name, change it to kern_flag.
- strncpy NUL-terminates f_fstypename and f_mntonname since the strings
have length <= <buffer length> - 1, so the explicit NUL-termination is
bogus.
- M_ZERO'ing space for fstype and fspath is stupid since we never use the
space beyond the end of the string.
- Do various style(9) cleanups in both functions.
Submitted by: bde
Reviewed by: phk
can be called both with and without the pipe mutex held. (For example,
if called by pipeselwakeup(), it is held. Whereas, if called by kqueue_scan(),
it is not.)
Reviewed by: alfred
There are still some locations where the PROC lock should be held
in order to prevent inconsistent views from outside (like the
proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be
fixed later.
Submitted by: Jonathan Mini <mini@haikugeek.com>
with this flag. Remove the dup_list and dup_ok code from subr_witness. Now
we just check for the flag instead of doing string compares.
Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface. The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.
Approved by: jhb
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).
Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.
This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.
Reviewed by: core
Approved by: core
- return error -> return (error);
- move a declaration to the top of the function.
- become bug for bug compatible with if (error) lines.
Submitted by: bde
new vfs_getopt()/vfs_copyopt() API. This is intended to be used
later, when there are filesystems implementing the VFS_NMOUNT
operation. The mount(2) system call will disappear when all
filesystems have been converted to the new API. Documentation will
be committed in a while.
Reviewed by: phk
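Retrieval inside a filesystem might look like the following sketch ('opts'
is the struct vfsoptlist * handed to the filesystem; the "from" option is
conventionally the device path):

    void *from;
    int error, len;

    error = vfs_getopt(opts, "from", &from, &len);
    if (error != 0)
        return (error);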
modules (i.e. procfs.ko).
When the kernel loads a dynamic filesystem module, it looks for any of the
VOP operations specified by the new filesystem that have not already been
registered by the currently known filesystems. If any such operations exist,
the vfs_add_vnops function calls vfs_opv_recalc, which rebuilds the vop_t
vectors for each filesystem and sets all global pointers like ufs_vnops_p,
devfs_specop_p, etc. to the new values and then frees the old pointers. This
behavior is bad because there might already be active vnodes whose v_op
fields are left pointing at random garbage, leading to an inevitable crash
soon.
Submitted by: Alexander Kabaev <ak03@gte.com>
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
source of the recent panics occurring around kldloading file system
support modules.
Requested by: rwatson
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses. Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
code that is still not safe. suser() reads p_ucred so it still needs
Giant for the time being. This should allow kern.giant.proc to be set
to 0 for the time being.
Move the network code from using cr_cansee() to check whether a
socket is visible to a requesting credential to using a new
function, cr_canseesocket(), which accepts a subject credential
and object socket. Implement cr_canseesocket() so that it does a
prison check, a uid check, and add a comment where shortly a MAC
hook will go. This will allow MAC policies to separately
instrument the visibility of sockets from the visibility of
processes.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
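Callers that walk socket lists would then filter roughly as in this sketch
('req' and 'so' are assumed sysctl request and socket pointers):

    /* Skip sockets the requesting credential may not see. */
    if (cr_canseesocket(req->td->td_ucred, so) != 0)
        continue;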
to test req->td for NULL values and then do somewhat more bizarre things
relating to securelevel special-casing and suser checks. Remove the
testing and conditional security checks based on req->td!=NULL, and insert
a KASSERT that td != NULL. Callers to sysctl must always specify the
thread (be it kernel or otherwise) requesting the operation, or a
number of current sysctls will fail due to assumptions that the thread
exists.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
Discussed with: bde
NULL, turn warning printf's into panic's, since this call has been
restructured such that a NULL cred would result in a page fault anyway.
There appears to be one case where NULL is explicitly passed in in the
sysctl code, and this is believed to be in error, so will be modified.
Securelevels now always require a credential context so that per-jail
securelevels are properly implemented.
Obtained from: TrustedBSD Project
Sponsored by: NAI Labs
Discussed with: bde
made aware in jail environments. Supposedly something is broken, so
this should be backed out until further investigation proves otherwise,
or a proper fix can be provided.
method-based inter-process security checks. To do this, introduce
a new cr_seeotheruids(u1, u2) function, which encapsulates the
"see_other_uids" logic. Call out to this policy following the
jail security check for all of {debug,sched,see,signal} inter-process
checks. This more consistently enforces the check, and makes the
check easy to modify. Eventually, it may be that this check should
become a MAC policy, loaded via a module.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
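The encapsulated check plausibly reduces to something like the sketch
below (details may differ; see_other_uids is the tunable named above):

    static int
    cr_seeotheruids(struct ucred *u1, struct ucred *u2)
    {

        /* With see_other_uids off, require matching real uids. */
        if (!see_other_uids && u1->cr_ruid != u2->cr_ruid)
            return (ESRCH);
        return (0);
    }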
Instead of caching the ucred reference, just go ahead and eat the
decrement and increment of the refcount. Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case. In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place. This also removes td_cache_ucred from struct thread. This is
still only done #ifdef DIAGNOSTIC.
[ missed this file in the previous commit ]
Tested on: i386, alpha
- Add a cred_free_thread() function (conditional on DIAGNOSTIC) that drops
a per-thread ucred reference to be used in debugging code when leaving
the kernel.
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
low hanging fruit for the repair of securelevel checking of linking and
unlinking files from within jails. There is more to be done here.
Reviewed by: rwatson
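The added mount check is presumably of this shape (a sketch; 'td' is the
calling thread):

    /* Deny mounts once the securelevel exceeds 0. */
    error = securelevel_gt(td->td_ucred, 0);
    if (error != 0)
        return (error);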
all the global bits of ``module'' data. This commit adds a few generic
macros, MOD_SLOCK, MOD_XLOCK, etc., that are meant to be used as ways
of accessing the SX lock. It is also the first step in helping to lock
down the kernel linker and module systems.
Reviewed by: jhb, jake, smp@
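Usage is then a conventional reader/writer pattern, e.g. (a sketch; the
*_UNLOCK names are assumed counterparts of the macros named above):

    MOD_SLOCK;                  /* shared: walking the module list */
    /* ... look up a module ... */
    MOD_SUNLOCK;

    MOD_XLOCK;                  /* exclusive: (un)registering a module */
    /* ... modify the module list ... */
    MOD_XUNLOCK;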
locking flags when acquiring a vnode. The immediate purpose is
to allow polling lock requests (LK_NOWAIT) needed by soft updates
to avoid deadlock when enlisting other processes to help with
the background cleanup. For the future it will allow the use of
shared locks for read access to vnodes. This change touches a
lot of files as it affects most filesystems within the system.
It has been well tested on FFS, loopback, and CD-ROM filesystems, and
only lightly on the others, so if you find a problem there, please
let me (mckusick@mckusick.com) know.
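For the soft updates case this permits a polling acquisition along these
lines (sketch; 'vp' and 'td' assumed in scope):

    /* Poll for the vnode lock; back off instead of deadlocking. */
    error = vget(vp, LK_EXCLUSIVE | LK_NOWAIT, td);
    if (error != 0)
        return (0);     /* busy; skip and retry this vnode later */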
pmap_qremove. pmap_kenter is not safe to use in MI code because it is not
guaranteed to flush the mapping from the tlb on all cpus. If the process
in question is preempted and migrates cpus between the call to pmap_kenter
and pmap_kremove, the original cpu will be left with stale mappings in its
tlb. This is currently not a problem for i386 because we do not use PG_G on
SMP, and thus all mappings are flushed from the tlb on context switches, not
just user mappings. This is not the case on all architectures, and if PG_G
is to be used with SMP on i386 it will be a problem. This was committed by
peter earlier as part of his fine grained tlb shootdown work for i386, which
was backed out for other reasons.
Reviewed by: peter
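The safe MI idiom is therefore the pmap_qenter()/pmap_qremove() pair,
sketched here ('kva', 'pages' and 'npages' assumed in scope):

    pmap_qenter(kva, pages, npages);    /* shoots down stale TLB entries */
    /* ... use the mapping at 'kva' ... */
    pmap_qremove(kva, npages);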
the bio and buffer structures to have daddr64_t bio_pblkno,
b_blkno, and b_lblkno fields which allows access to disks
larger than a Terabyte in size. This change also requires
that the VOP_BMAP vnode operation accept and return daddr64_t
blocks. This delta should not affect system operation in
any way. It merely sets up the necessary interfaces to allow
the development of disk drivers that work with these larger
disk block addresses. It also allows for the development of
UFS2 which will use 64-bit block addresses.
kern/kern_descrip.c:
Acquire Giant in fdrop_locked when the file refcount hits zero; this removes
the requirement for the caller to own Giant for the most part.
kern/kern_ktrace.c:
Acquire Giant in ktrgenio; simplifies locking in upper read/write syscalls.
kern/vfs_bio.c:
Acquire Giant in bwillwrite if needed.
kern/sys_generic.c
Giant pushdown, remove Giant for:
read, pread, write and pwrite.
readv and writev aren't done yet because of the possible malloc calls
for iov to uio processing.
kern/sys_socket.c
Grab giant in the socket fo_read/write functions.
kern/vfs_vnops.c
Grab giant in the vnode fo_read/write functions.
Missed a place where the pipe sleep lock was needed in order to safely grab
Giant, fix it and add an assertion to make sure this doesn't happen again.
Fix typos in the PIPE_GET_GIANT/PIPE_DROP_GIANT that could cause the
wrong mutex to get passed to PIPE_LOCK/PIPE_UNLOCK.
Fix a location where the wrong pipe was being passed to
PIPE_GET_GIANT/PIPE_DROP_GIANT.
Problem:
selwakeup required calling pfind which would cause lock order
reversals with the allproc_lock and the per-process filedesc lock.
Solution:
Instead of recording the pid of the select()'ing process into the
selinfo structure, actually record a pointer to the thread. To
avoid dereferencing a bad address all the selinfo structures that
are in use by a thread are kept in a list hung off the thread
(protected by sellock). When a selwakeup occurs the selinfo is
removed from that thread's list, it is also removed on the way out
of select or poll where the thread will traverse its list removing
all the selinfos from its own list.
Problem:
Previously the PROC_LOCK was used to provide the mutual exclusion
needed to ensure proper locking, this couldn't work because there
was a single condvar used for select and poll and condvars can
only be used with a single mutex.
Solution:
Introduce a global mutex 'sellock' which is used to provide mutual
exclusion when recording events to wait on as well as performing
notification when an event occurs.
Interesting note:
schedlock is required to manipulate the per-thread TDF_SELECT
flag, however if given its own field it would not need schedlock,
also because TDF_SELECT is only manipulated under sellock one
doesn't actually use schedlock for synchronization, only to protect
against corruption.
Proc locks are no longer used in select/poll.
Portions contributed by: davidc
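In outline, the wakeup side becomes something like the sketch below
(field names are approximate):

    void
    selwakeup(struct selinfo *sip)
    {
        struct thread *td;

        mtx_lock(&sellock);
        td = sip->si_thread;    /* a thread pointer now, not a pid */
        if (td != NULL) {
            /* Unhook this selinfo from the owning thread's list. */
            TAILQ_REMOVE(&td->td_selq, sip, si_thrlist);
            sip->si_thread = NULL;
            /* ... clear TDF_SELECT under sched_lock, wake the thread ... */
        }
        mtx_unlock(&sellock);
    }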
While doing this, move it earlier in the sysinit boot process so that the
VM system can use it.
After that, the system is now able to use sx locks instead of lockmgr
locks in the VM system. To accomplish this, some of the more
questionable uses of the locks (such as testing whether they are
owned or not, as well as allowing shared+exclusive recursion) are
removed, and simpler logic throughout is used so locks should also be
easier to understand.
This has been tested on my laptop for months, and has not shown any
problems on SMP systems, either, so appears quite safe. One more
user of lockmgr down, many more to go :)
The stat() and open() calls have been changed to make use of this new
functionality. Using shared locks in these cases is sufficient and can
significantly reduce their latency if IO is pending to these vnodes. Also,
this reduces the number of exclusive locks that are floating around in the
system, which helps reduce the number of deadlocks that occur.
A new kernel option "LOOKUP_SHARED" has been added. It defaults to off so
this patch can be turned on for testing, and should eventually go away once
it is proven to be stable. I have personally been running this patch for
over a year now, so it is believed to be fully stable.
Reviewed by: jake, obrien
Approved by: jake
in "missing dependencies" error when loading some kld modules. It is sad to
see how often these days style cleanus break doesn't broken things. Perhaps
people should recall good old principle: "don't fix it if it isn't broken".
fully instantiated.
Revert the logic in pipeclose so that we don't have the entire function
pretty much under a single if() statement; instead, invert the test and
just return if it fails.
Submitted (in different form) by: bde
Don't use pool mutexes for pipes. We can not use pool mutexes
because we will need to grab the select lock while holding a pipe
lock which is not allowed because you may not acquire additional
mutexes when holding a pool mutex.
Instead malloc(9) space for the mutex that is shared between the
pipes.
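Concretely, both endpoints end up pointing at a single malloc(9)'d mutex,
roughly as sketched here (the malloc type is illustrative; 'rpipe' and
'wpipe' are the two pipe endpoints):

    struct mtx *pmtx;

    pmtx = malloc(sizeof(*pmtx), M_TEMP, M_WAITOK | M_ZERO);
    mtx_init(pmtx, "pipe mutex", NULL, MTX_DEF);
    /* Both ends of the pipe share the one lock. */
    rpipe->pipe_mtxp = wpipe->pipe_mtxp = pmtx;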
simply need to prevent switching from another CPU and do not need
interrupts disabled.
- Add a comment to witness_list() about why displaying spin locks for
threads on other CPU's really is just a bad idea and probably shouldn't
be done.
to exhaust all kmaps. The only reward for setting maxproc
to a value which will cause kmap exhaustion is a panic
during a forkbomb attack.
MFC after: 3 days
- Move jail checks and some other checks involving constants and stack
variables out from under Giant. This isn't perfectly safe atm because
jail_sysvipc_allowed is read w/o a lock meaning that its value could be
stale. This global variable will soon become a per-jail flag, however,
at which time it will either not need a lock or will use the prison lock.
people working on the MAC tree from getting toasted whenever system call
numbers are allocated in the main tree (for example, for KSE :-).
Calls allocated: __mac_{get,set}_proc, __mac_{get,set}_{fd,file}().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
Includes some minor whitespace changes, and re-ordering to be able to document
properly (e.g., grouping of variables and the SYSCTL macro calls for them, where
the documentation has been added.)
Reviewed by: phk (but all errors are mine)
be allocated as arrays indexed by the cpu id. Previously the only reliable
way to know the max cpu id was through MAXCPU. mp_ncpus isn't useful here
because cpu ids may be sparsely mapped, although x86 and alpha do not do this.
Also, call cpu_mp_probe much earlier so the max cpu id is known before the VM
starts up. This is intended to help support per cpu queues for the new
allocator, but may be useful elsewhere.
Reviewed by: jake
Approved by: jake
This allows other power-management systems (APM for now) to generate
power profile change events (i.e. AC-line status changes), and other
kernel components, not only the ACPI components, can be notified of
the events.
- move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
- call power_profile_set_state() also from APM driver when AC-line
status changes
- add a call-back function for Crusoe LongRun control on power
profile changes, as an example
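Consumers subscribe through the eventhandler mechanism; the LongRun hook
mentioned above might be registered as in this sketch:

    static void
    longrun_power_profile(void *arg)
    {

        switch (power_profile_get_state()) {
        case POWER_PROFILE_PERFORMANCE:
            /* ... select the full-speed LongRun window ... */
            break;
        case POWER_PROFILE_ECONOMY:
            /* ... select the power-saving LongRun window ... */
            break;
        }
    }
    EVENTHANDLER_REGISTER(power_profile_change, longrun_power_profile,
        NULL, 0);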
PAGE_SIZE / MCLBYTES == 1 crash. Fix them by changing the
appropriate "allocate new page and bucket" code in mb_alloc to use
the macro for properly grabbing an allocated object from a bucket,
the one that checks whether the bucket is empty.
This should allow ken to continue testing zero-copy stuff on -CURRENT.
Noticed and provided debug info: ken
fill out netc_anon (a `struct ucred'), and add an XXX around the
entire operation since it isn't clear whether it's doing the right
thing with things like cr_uidinfo and cr_prison.
the data was supplied as a uio or an mbuf. Previously the limit was
ignored for mbuf data, and NFS could run the kernel out of mbufs
when an ipfw rule blocked retransmissions.
the pipe is locked and shouldn't be.
initialize pipe->pipe_mtxp to NULL when creating pipes in order not
to trip the above assertions.
swap pipe lock with giant around calls to pipe_destroy_write_buffer()
pipe_destroy_write_buffer issue noticed by: jhb
fully protect p_ucred yet so Giant is needed until all the p_ucred
locking is done. This is the original reason td_ucred was not used
immediately after its addition. Unfortunately, not using td_ucred is
not enough to avoid problems. Since p_ucred could be stale, we could
actually be dereferencing a stale pointer to dink with the refcount, so
we really need Giant to avoid foot-shooting. This allows td_ucred to
be safely used as well.
as arguments. The correct hostname is copied into the buffer
while having the prison's lock acquired in a jailed process'
case.
Reviewed by: jhb, rwatson
There is some unresolved badness that has been eluding me, particularly
affecting uniprocessor kernels. Turning off PG_G helped (which is a bad
sign) but didn't solve it entirely. Userland programs still crashed.
Both ends of the pipe share a pool_mutex, this makes allocation
and deadlock avoidance easy.
Remove some un-needed FILE_LOCK ops while I'm here.
There are some issues wrt select and the f{s,g}etown code that
we'll have to deal with, I think we may also need to move the calls
to vfs_timestamp outside of the sections covered by PIPE_LOCK.
spares (the size of the field was changed from u_short to u_int to
reflect what it really ends up being). Accordingly, change users of
xucred to set and check this field as appropriate. In the kernel,
this is being done inside the new cru2x() routine which takes a
`struct ucred' and fills out a `struct xucred' according to the
former. This also has the pleasant side effect of removing some
duplicate code.
Reviewed by: rwatson
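cru2x() itself is essentially a field-by-field copy plus the version
stamp; a sketch:

    void
    cru2x(struct ucred *cr, struct xucred *xcr)
    {

        bzero(xcr, sizeof(*xcr));
        xcr->cr_version = XUCRED_VERSION;
        xcr->cr_uid = cr->cr_uid;
        xcr->cr_ngroups = cr->cr_ngroups;
        bcopy(cr->cr_groups, xcr->cr_groups, sizeof(xcr->cr_groups));
    }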
shootdowns in a couple of key places. Do the same for i386. This also
hides some physical addresses from higher levels and has it use the
generic vm_page_t's instead. This will help for PAE down the road.
Obtained from: jake (MI code, suggestions for MD part)
enabled in critical sections and streamline critical_enter() and
critical_exit().
This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes. Architectures that do not wish to do
this are not affected by this change.
This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.
This commit is just the first stage. Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.
The following changes have been made:
* critical_enter() and critical_exit() for I386 now simply increment
and decrement curthread->td_critnest. They no longer disable
hard interrupts. When critical_exit() decrements the counter to
0 it effectively calls a routine to deal with whatever interrupts
were deferred during the time the code was operating in a critical
section.
Other architectures are unaffected.
* fork_exit() has been conditionalized to remove MD assumptions for
the new code. Old code will still use the old MD assumptions
in regards to hard interrupt disablement. In STAGE-2 this will
be turned into a subroutine call into MD code rather than hardcoded
in MI code.
The new code places the burden of entering the critical section
in the trampoline code where it belongs.
* I386: interrupts are now enabled while we are in a critical section.
The interrupt vector code has been adjusted to deal with the fact.
If it detects that we are in a critical section it currently defers
the interrupt by adding the appropriate bit to an interrupt mask.
* In order to accomplish the deferral, icu_lock is required. This
is i386-specific. Thus icu_lock can only be obtained by mainline
i386 code while interrupts are hard disabled. This change has been
made.
* Because interrupts may or may not be hard disabled during a
context switch, cpu_switch() can no longer simply assume that
PSL_I will be in a consistent state. Therefore, it now saves and
restores eflags.
* FAST INTERRUPT PROVISION. Fast interrupts are currently deferred.
The intention is to eventually allow them to operate either while
we are in a critical section or, if we are able to restrict the
use of sched_lock, while we are not holding the sched_lock.
* ICU and APIC vector assembly for I386 cleaned up. The ICU code
has been cleaned up to match the APIC code in regards to format
and macro availability. Additionally, the code has been adjusted
to deal with deferred interrupts.
* Deferred interrupts use a per-cpu boolean int_pending, and
masks ipending, spending, and fpending. Being per-cpu variables
it is not currently necessary to lock; bus cycles modifying them.
Note that the same mechanism will enable preemption to be
incorporated as a true software interrupt without having to
further hack up the critical nesting code.
* Note: the old critical_enter() code in kern/kern_switch.c is
currently #ifdef to be compatible with both the old and new
methodology. In STAGE-2 it will be moved entirely to MD code.
Performance issues:
One of the purposes of this commit is to enhance critical section
performance, specifically to greatly reduce bus overhead to allow
the critical section code to be used to protect per-cpu caches.
These caches, such as Jeff's slab allocator work, can potentially
operate very quickly making the effective savings of the new
critical section code's performance very significant.
The second purpose of this commit is to allow architectures to
enable certain interrupts while in a critical section. Specifically,
the intention is to eventually allow certain FAST interrupts to
operate rather than defer.
The third purpose of this commit is to begin to clean up the
critical_enter()/critical_exit()/cpu_critical_enter()/
cpu_critical_exit() API which currently has serious cross pollution
in MI code (in fork_exit() and ast() for example).
The fourth purpose of this commit is to provide a framework that
allows kernel-preempting software interrupts to be implemented
cleanly. This is currently used for two forward interrupts in I386.
Other architectures will have the choice of using this infrastructure
or building the functionality directly into critical_enter()/
critical_exit().
Finally, this commit is designed to greatly improve the flexibility
of various architectures to manage critical section handling,
software interrupts, preemption, and other highly integrated
architecture-specific details.
on for a while:
- fine grained TLB shootdown for SMP on i386
- ranged TLB shootdowns.. eg: specify a range of pages to shoot down with
a single IPI, since the IPI is very expensive. Adjust some callers
that used to trigger this inside tight loops to do a ranged shootdown
at the end instead.
- PG_G support for SMP on i386 (options ENABLE_PG_G)
- defer PG_G activation till after we decide what we are going to do with
PSE and the 4MB pages at the start of the kernel. This should solve
some rumored strangeness about stale PG_G entries getting stuck
underneath the 4MB pages.
- add some instrumentation for the fine TLB shootdown
- convert some asm instruction wrappers from functions to inlines. gcc
seems to do a fair bit better with this.
- [temporarily!] pessimize the tlb shootdown IPI handlers. I will fix
this again shortly.
This has been working fairly well for me for a while, but I have tweaked
it again prior to commit since my last major testing round. The only
outstanding problem that I know of is PG_G related, which is why there
is an option for it (not on by default for SMP). I have seen world
speedups of a few percent (as much as 4 or 5% in one case) but I have
*not* accurately measured this - I am a bit sceptical of these numbers.
but never accept'ed, so they must be destroyed. Originally, unp_drop()
detected this situation by checking if so->so_head is non-NULL.
However, since revision 1.54 of uipc_socket.c (Feb 1999), so->so_head
is set to NULL before calling soabort(), so any unix-domain sockets
waiting to be accept'ed are leaked if the server socket is closed.
Resolve this by moving the socket destruction code into uipc_abort()
itself, and making it unconditional (the other caller of unp_drop()
never needs the socket to be destroyed). Use unp_detach() to avoid
the original code duplication when destroying the socket.
PR: kern/17895
Reviewed by: dwmalone (an earlier version of the patch)
MFC after: 1 week
our feet when we look inside timecounter structures.
Make the "sync_other" code more robust by never overwriting the
tc_next field.
Add counters for the bin[up]time functions.
Call tc_windup() in tc_init() and switch_timecounter() to make sure
we get all the fields set right.
New locks are:
- pgrpsess_lock which locks the whole pgrps and sessions,
- pg_mtx which protects the pgrp members, and
- s_mtx which protects the session members.
Please refer to sys/proc.h for the coverage of these locks.
Changes on the pgrp/session interface:
- pgfind() needs the pgrpsess_lock held.
- The caller of enterpgrp() is responsible to allocate a new pgrp and
session.
- Call enterthispgrp() in order to enter an existing pgrp.
- pgsignal() requires a pgrp lock held.
Reviewed by: jhb, alfred
Tested on: cvsup.jp.FreeBSD.org
(which is a quad-CPU machine running -current)
While in userland, keep the thread's ucred reference in a shadow
field so that the usual place to store it is NULL.
If DIAGNOSTIC is not set, the thread ucred is kept valid until the next
kernel entry, at which time it is checked against the process cred
and possibly corrected. Produces a BIG speedup in
kernels with INVARIANTS set. (A previous commit corrected it
for the non INVARIANTS case already)
Reviewed by: dillon@freebsd.org