freebsd-nq

Author	SHA1	Message	Date
John Baldwin	5f36700a32	- Add trylock variants of shared and exclusive locks. - The sx assertions don't actually need the internal sx mutex lock, so don't bother doing so. - Add a new assertion SX_ASSERT_LOCKED() that asserts that either a shared or exclusive lock should be held. This assertion should be used instead of SX_ASSERT_SLOCKED() in almost all cases. - Adjust some KASSERT()'s to include file and line information. - Use the new witness_assert() function in the WITNESS case for sx slock asserts to verify that the current thread actually owns a slock.	2001-06-27 06:39:37 +00:00
John Baldwin	04297fe609	- Add a new witness_assert() to perform arbitrary locking assertions. - Clean up the KTR tracepoints to be slightly more consistent and useful. - Fix a bug in WITNESS where we would recurse indefinitely and blow the stack when acquiring Giant after sleeping with a sleepable lock held. Reported by: tanimura (3)	2001-06-27 06:27:29 +00:00
John Baldwin	776e0b3693	- Always use the proc lock of the task leader to protect the peers list of processes. - Don't construct fake call args and then call kill(). psignal() is not any more complicated and is quicker and not prone to locking problems. Calling psignal() avoids having to do a pfind() since we already have a proc pointer and also allows us to keep the task leader locked while we kill all the peer processes so the list is kept coherent. - When a kthread exits, do a wakeup() on its proc pointers. This can be used by kernel modules that have kthreads and want to ensure they have safely exited before completing the MOD_UNLOAD event. Connectivity provided by: Usenix wireless	2001-06-27 06:15:44 +00:00
John Baldwin	b7e554f5d6	- Move the 'clk' spinlock below other spin locks since KTR trace events may need the clock lock for nanotime(). - Add KTR trace events for lock list manipulations and other witness operations. - Use a temporary variable instead of setting the lock list head directly and then setting up the links to add a new lock list entry to the lock list. This small race could result in witness "forgetting" about all the locks held by this process temporarily during an interrupt. - Close a more fatal race condition when removing a lock from a list. Removing a lock from the list entails both decrementing the count of items in this bucket as well as shuffling items in the current bucket up a notch to replace the gap left by the removed item. Wrap these operations in a critical section.	2001-06-25 23:17:52 +00:00
John Baldwin	1715f07da3	- Replace the unused KTR_IDLELOOP trace class with a new KTR_WITNESS trace class to trace witness events. - Make the ktr_cpu field of ktr_entry be a standard field rather than one present only in the KTR_EXTEND case. - Move the default definition of KTR_ENTRIES from sys/ktr.h to kern/kern_ktr.c. It has not been needed in the header file since KTR was un-inlined. - Minor include cleanup in kern/kern_ktr.c. - Fiddle with the ktr_cpumask in ktr_tracepoint() to disable KTR events on the current CPU while we are processing an event. - Set the current CPU inside of the critical section to ensure we don't migrate CPU's after the critical section but before we set the CPU.	2001-06-25 23:09:31 +00:00
John Baldwin	1d79f1bb9a	- Sort includes. - Count the context switches during shutdown when we give ithreads a chance to run as voluntary context switches. Submitted by: bde (2)	2001-06-25 18:30:42 +00:00
John Baldwin	c4f7a18726	Count the context switch when blocking on a mutex as a voluntary context switch. Count the context switch when preempting the current thread to let a higher priority thread blocked on a mutex we just released run as an involuntary context switch. Reported by: bde	2001-06-25 18:29:32 +00:00
John Baldwin	84bbc4dbda	Count the switch when an ithread goes idle as a voluntary context switch. Submitted by: bde	2001-06-25 18:27:33 +00:00
David Malone	db3cc2d09f	Don't dereference a NULL pointer if we fail to get a sendfilebuf.	2001-06-24 12:27:30 +00:00
Matthew Dillon	c7503f60c4	After exhaustive discussions and some meandering and confusion, enough people are on track with the cause and effect of this, and although fixing this severely degenerate case appears to violate the letter of POSIX.1-200x, Bruce and I (and enough others) agree that it should be committed. So, this patch generates an ENOENT error for any attempt to do a path lookup through an empty symlink (e.g. open(), stat()). Submitted by: "Andrey A. Chernov" <ache@nagual.pp.ru> Reviewed by: bde Discussed exhaustively on: freebsd-current Previously committed to: NetBSD 4 years ago	2001-06-24 05:24:41 +00:00
John Baldwin	1df95969b5	- Lock CURSIG() with the proc lock to close the signal race with psignal. - Grab Giant around ktrace points. - Clean up KTR_PROC tracepoints to not display the value of sched_lock.mtx_lock as it isn't really needed anymore and just obfuscates the messages. - Add a few if conditions to replace gotos. - Ensure that every msleep KTR event ends up with a matching msleep resume KTR event (this was broken when we didn't do a mi_switch()). - Only note via ktrace that we resumed from a switch once rather than twice in several places in msleep(). - Remove spl's from asleep and await as the proc lock and sched_lock provide all the needed locking. - In mawait() add in a needed ktrace point for noting that we are about to switch out.	2001-06-22 23:11:26 +00:00
John Baldwin	87f9ffb805	- Lock CURSIG with the proc lock and don't release the proc lock until after grabbing the sched lock to close a race. - Lock ktrace points with Giant.	2001-06-22 23:06:38 +00:00
John Baldwin	06c836bbca	- Grab the proc lock around CURSIG and postsig(). Don't release the proc lock until after grabbing the sched_lock to avoid CURSIG racing with psignal. - Don't grab Giant for addupc_task() as it isn't needed. Reported by: tegge (signal race), bde (addupc_task a while back)	2001-06-22 23:05:11 +00:00
John Baldwin	2ad7d3049a	- Change CURSIG() and postsig() to require that the proc lock is held rather than grabbing it and releasing it themselves. This allows callers of these functions to get the lock to close race conditions. - Grab Giant around ktrace in postsig. - Count the switches performed on SIGSTOP's as involuntary context switches in the resource usage stats. Reported by: tegge (signal race), bde (missing csw stats)	2001-06-22 23:02:37 +00:00
Matt Jacob	2f7f966cb8	int -> size_t fix	2001-06-22 19:54:38 +00:00
Matt Jacob	8f5a1742c2	Temporary fix at least: define NCPU_PRESENT, which will be mp_ncpus for SMP kernels and one (1) for non-SMP.	2001-06-22 16:03:23 +00:00
Jim Pirzyk	f83ae79fbe	changed hostid from long to unsigned long to be able to store values > 2GB on i386 platforms. Also changed SYSCTL type from INT to ULONG and removed comment about it. PR: kern/21132 MFC after: 1 month	2001-06-22 16:03:14 +00:00
Bosko Milekic	08442f8a82	Introduce numerous SMP friendly changes to the mbuf allocator. Namely, introduce a modified allocation mechanism for mbufs and mbuf clusters; one which can scale under SMP and which offers the possibility of resource reclamation to be implemented in the future. Notable advantages: o Reduce contention for SMP by offering per-CPU pools and locks. o Better use of data cache due to per-CPU pools. o Much less code cache pollution due to excessively large allocation macros. o Framework for `grouping' objects from same page together so as to be able to possibly free wired-down pages back to the system if they are no longer needed by the network stacks. Additional things changed with this addition: - Moved some mbuf specific declarations and initializations from sys/conf/param.c into mbuf-specific code where they belong. - m_getclr() has been renamed to m_get_clrd() because the old name is really confusing. m_getclr() HAS been preserved though and is defined to the new name. No tree sweep has been done "to change the interface," as the old name will continue to be supported and is not deprecated. The change was merely done because m_getclr() sounds too much like "m_get a cluster." - TEMPORARILY disabled mbtypes statistics displaying in netstat(1) and systat(1) (see TODO below). - Fixed systat(1) to display number of "free mbufs" based on new per-CPU stat structures. - Fixed netstat(1) to display new per-CPU stats based on sysctl-exported per-CPU stat structures. All info is fetched via sysctl. TODO (in order of priority): - Re-enable mbtypes statistics in both netstat(1) and systat(1) after introducing an SMP friendly way to collect the mbtypes stats under the already introduced per-CPU locks (i.e. hopefully don't use atomic() - it seems too costly for a mere stat update, especially when other locks are already present). - Optionally have systat(1) display not only "total free mbufs" but also "total free mbufs per CPU pool." - Fix minor length-fetching issues in netstat(1) related to recently re-enabled option to read mbuf stats from a core file. - Move reference counters at least for mbuf clusters into an unused portion of the cluster itself, to save space and need to allocate a counter. - Look into introducing resource freeing possibly from a kproc. Reviewed by (in parts): jlemon, jake, silby, terry Tested by: jlemon (Intel & Alpha), mjacob (Intel & Alpha) Preliminary performance measurements: jlemon (and me, obviously) URL: http://people.freebsd.org/~bmilekic/mb_alloc/	2001-06-22 06:35:32 +00:00
John Baldwin	fbd26f7594	Fix some lock order reversals where we called free() while holding a proc lock. We now use temporary variables to save the process argument pointer and just update the pointer while holding the lock. We then perform the free on the cached pointer after releasing the lock.	2001-06-20 23:10:06 +00:00
Bosko Milekic	f5eece3fb9	Change m_devget()'s outdated and unused `offset' argument to actually mean something: offset into the first mbuf of the target chain before copying the source data over. Make drivers using m_devget() with a first argument "data - ETHER_ALIGN" to use the offset argument to pass ETHER_ALIGN in. The way it was previously done is potentially dangerous if the source data was at the top of a page and the offset caused the previous page to be copied (if the previous page has not yet been appropriately mapped). The old `offset' argument in m_devget() is not used anywhere (it's always 0) and dates back to ~1995 (and earlier?) when support for ethernet trailers existed. With that support gone, it was merely collecting dust. Tested on alpha by: jlemon Partially submitted by: jlemon Reviewed by: jlemon MFC after: 3 weeks	2001-06-20 19:48:35 +00:00
John Baldwin	2e1aacccac	Preemption by an interrupt thread is an involuntary switch, not a voluntary one. Pointy-hat to: me	2001-06-20 18:26:41 +00:00
Dag-Erling Smørgrav	0e79fe6f0e	Constify (silence warnings introduced by last commit to sys/module.h)	2001-06-20 16:08:45 +00:00
Garrett Wollman	37336173d3	After one too many PRs on the subject, bite the bullet and define IOV_MAX and its associated constants. Implement _SC_IOV_MAX in the usual way. Be a bit sloppy about the namespace question; this should get cleared up in time for 5.0. MFC after: 1 month	2001-06-18 20:24:54 +00:00
John Baldwin	6fad32afc9	Lock Giant in postsig() for the KTRACE case as ktrpsig() needs Giant when it writes out to the trace file. Reported by: peter, gallatin, and others	2001-06-18 19:23:43 +00:00
Brian Somers	09dbb40410	Add linker_reference_module(). This function loads a module if required, otherwise bumps the reference count -- the opposite of linker_file_unload().	2001-06-18 15:09:33 +00:00
Brian Somers	21ff14e0f9	Don't remove the SI_CHEAPCLONE for unsupported minors	2001-06-18 09:22:30 +00:00
Peter Wemm	b85db19691	Move setugid() a little sooner to before we release tracing in case crdup() or change_e*id() block on malloc() or mutex.	2001-06-16 23:34:23 +00:00
Peter Wemm	5a280d9cd1	Add INTR_TYPE_AV so that we can get to the PI_AV priority in the ithread handlers. This is beneficial since it means that pcm's MPSAFE handler can get run before things that will block on Giant in the shared irq case.	2001-06-16 22:42:19 +00:00
Jonathan Lemon	9fa416ca19	Fix warnings: 112: warning: cast to pointer from integer of different size 125: warning: cast to pointer from integer of different size	2001-06-16 07:02:47 +00:00
Jonathan Lemon	7b748f0a21	Correctly hook up the write kqfilter to pipes. Submitted by: Niels Provos <provos@citi.umich.edu>	2001-06-15 20:45:01 +00:00
Peter Wemm	b93c3c5ed6	Fix some warnings in kern_environment.c. Make the getenv*() family take a const 'name', since they don't modify anything. 159: warning: passing arg 1 of `getenv_int' discards qualifiers... 167: warning: passing arg 1 of `getenv' discards qualifiers from pointer..	2001-06-15 07:29:17 +00:00
Peter Wemm	ee24290963	As per comments in sys/linker_set.h: BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK! <reload> BANG! BANG! BANG! BANG! BANG! BANG! CLICK! CLICK! CLICK! CLICK! CLICK!	2001-06-14 01:28:56 +00:00
Peter Wemm	f41325db5f	With this commit, I hereby pronounce gensetdefs past its use-by date. Replace the a.out emulation of 'struct linker_set' with something a little more flexible. <sys/linker_set.h> now provides macros for accessing elements and completely hides the implementation. The linker_set.h macros have been on the back burner in various forms since 1998 and has ideas and code from Mike Smith (SET_FOREACH()), John Polstra (ELF clue) and myself (cleaned up API and the conversion of the rest of the kernel to use it). The macros declare a strongly typed set. They return elements with the type that you declare the set with, rather than a generic void *. For ELF, we use the magic ld symbols (__start_<setname> and __stop_<setname>). Thanks to Richard Henderson <rth@redhat.com> for the trick about how to force ld to provide them for kld's. For a.out, we use the old linker_set struct. NOTE: the item lists are no longer null terminated. This is why the code impact is high in certain areas. The runtime linker has a new method to find the linker set boundaries depending on which backend format is in use. linker sets are still module/kld unfriendly and should never be used for anything that may be modular one day. Reviewed by: eivind	2001-06-13 10:58:39 +00:00
Peter Wemm	db957588c9	Patch up a blunder I made a few days ago. nmbcnt was being initialized too late. Noted by: bmilekic Pointy-hat to: peter	2001-06-13 00:36:41 +00:00
Peter Wemm	2398f0cd1d	Hints overhaul: - Replace some very poorly thought out API hacks that should have been fixed a long while ago. - Provide some much more flexible search functions (resource_find_*()) - Use strings for storage instead of an outgrowth of the rather inconvenient temporary ioconf table from config(). We already had a fallback to using strings before malloc/vm was running anyway.	2001-06-12 09:40:04 +00:00
Dag-Erling Smørgrav	8f7e4eb568	Rename nextpid to lastpid and externalize it.	2001-06-11 21:54:19 +00:00
Dag-Erling Smørgrav	fe46349692	Blah, I cut out a tad too much in the previous commit. (thanks again, Jake!)	2001-06-11 18:43:32 +00:00
Dag-Erling Smørgrav	e3b373228c	copyin(9) doesn't return ENAMETOOLONG. (thanks, Jake!)	2001-06-11 18:36:18 +00:00
Dag-Erling Smørgrav	b0def2b548	Add sbuf_copyin(). Also add 'b' variants of sbuf_{cat,copyin,cpy}() which ignore NUL bytes in the source string.	2001-06-11 17:05:52 +00:00
Hajimu UMEMOTO	3384154590	Sync with recent KAME. This work was based on kame-20010528-freebsd43-snap.tgz and some critical problems found after the snap was out were fixed. There are many many changes since last KAME merge. TODO: - The definitions of SADB_* in sys/net/pfkeyv2.h are still different from RFC2407/IANA assignment because of binary compatibility issue. It should be fixed under 5-CURRENT. - ip6po_m member of struct ip6_pktopts is no longer used. But, it is still there because of binary compatibility issue. It should be removed under 5-CURRENT. Reviewed by: itojun Obtained from: KAME MFC after: 3 weeks	2001-06-11 12:39:29 +00:00
David Malone	c7fd62da6c	Try to make the setting of the SIGCHLD handler the same as setting of the NOCLDWAIT flag. SUSv2 seems to require this. Submitted by: Cejka Rudolf <cejkar@dcse.fee.vutbr.cz> Reviewed by: dillon	2001-06-11 09:15:41 +00:00
Dag-Erling Smørgrav	d647935801	sbuf_new(9) now returns a struct sbuf * instead of an int. If the caller does not provide a struct sbuf, sbuf_new(9) will allocate one and return a pointer to it.	2001-06-10 15:48:04 +00:00
Peter Wemm	0978669829	"Fix" the previous initial attempt at fixing TUNABLE_INT(). This time around, use a common function for looking up and extracting the tunables from the kernel environment. This saves duplicating the same function over and over again. This way typically has an overhead of 8 bytes + the path string, versus about 26 bytes + the path string.	2001-06-08 05:24:21 +00:00
Peter Wemm	4422746fdf	Back out part of my previous commit. This was a last minute change and I botched testing. This is a perfect example of how NOT to do this sort of thing. :-(	2001-06-07 03:17:26 +00:00
Thomas Moestl	c0a0fb85e2	Fix an instance of NDINIT in the extattrctl syscall: LOCKLEAF was or'ed to the operation parameter, not to the flags as it should be. Reviewed by: rwatson	2001-06-06 23:34:38 +00:00
Peter Wemm	81930014ef	Make the TUNABLE_() macros look and behave more consistently like the SYSCTL_() macros. TUNABLE_INT_DECL() was an odd name because it didn't actually declare the int, which is what the name suggests it would do.	2001-06-06 22:17:08 +00:00
John Baldwin	5beb572b41	We don't need to hold a lock just to test a flag.	2001-06-06 22:05:48 +00:00
Ruslan Ermilov	4589be70fe	Unbreak setregid(2). Spotted by: Alexander Leidinger <Alexander@Leidinger.net>	2001-06-06 13:58:03 +00:00
John Baldwin	262c9f8a3b	Don't hold sched_lock across addupc_task(). Reported by: David Taylor <davidt@yadt.co.uk> Submitted by: bde	2001-06-06 00:57:24 +00:00
Dima Dorfman	ddf5b79683	Add a line discipline close routine which restores some functionality I accidentally nuked in rev. 1.54. Also rework the error handling in snplwrite a little.	2001-06-05 05:07:53 +00:00
Dima Dorfman	f09f49f136	Style and cosmetic cleanups. This driver is now reasonably style(9) compliant. All the variable definitions and function names are reasonably consistent, and the functions which should be static (i.e., all of them) are. Other assorted fixes were made. The majority of the delta is indentation fixes. Partially reviewed by: bde	2001-06-05 05:00:17 +00:00
Dima Dorfman	7fd72392d9	Use the l_nullioctl exported from tty_conf.c rather than rolling our own.	2001-06-04 23:31:21 +00:00
Dima Dorfman	22cf0fb34d	Unstaticize l_nullioctl; it is needed elsewhere (like in tty_snoop.c). Suggested by: bde	2001-06-04 23:30:47 +00:00
Matthew Dillon	1b3e974a71	The pipe_write() code was locking the pipe without busying it first in certain cases, and a close() by another process could potentially rip the pipe out from under the (blocked) locking operation. Reported-by: Alexander Viro <viro@math.psu.edu>	2001-06-04 04:04:45 +00:00
Dima Dorfman	87826386e0	Remove unused includes, use *min() inline functions rather than a home-grown macro, rewrite a confusing conditional in snpdevtotty(), and change ibuf to 512 bytes instead of 1024 bytes in dsnwrite(). Reviewed by: bde	2001-06-03 05:17:39 +00:00
Dima Dorfman	b8edb44cc3	When trying to find out if this is a request for a write in kernel_sysctl and userland_sysctl, check for whether new is NULL, not whether newlen is 0. This allows one to set a string sysctl to "".	2001-06-03 04:58:51 +00:00
Dima Dorfman	c0b824f97d	Include sys/mutex.h to silence a warning.	2001-06-03 02:19:07 +00:00
Jesper Skriver	5b86eac4e5	Revert the last bits of my bogus move of NMBCLUSTERS to <sys/param.h>	2001-06-01 21:47:34 +00:00
Thomas Moestl	d279178df7	Clean up the code exporting interrupt statistics via sysctl a bit: - move the sysctl code to kern_intr.c - do not use INTRCNT_COUNT, but rather eintrcnt - intrcnt to determine the length of the intrcnt array - move the declarations of intrnames, eintrnames, intrcnt and eintrcnt from machine-dependent include files to sys/interrupt.h - remove the hw.nintr sysctl, it is not needed. - fix various style bugs Requested by: bde Reviewed by: bde (some time ago)	2001-06-01 13:23:28 +00:00
Ruslan Ermilov	0b381bf1fd	Remove vestiges of MFS.	2001-06-01 10:07:28 +00:00
David E. O'Brien	240ef84277	Back out jesper's 2001/05/31 14:58:11 PDT commit. It does not compile.	2001-06-01 09:51:14 +00:00
Jesper Skriver	e916d96e64	Move the definition of NMBCLUSTERS from src/sys/kern/uipc_mbuf.c to <sys/param.h>, so it's available to src/sys/netinet/ip_input.c, and remove the now unneeded includes of "opt_param.h". MFC after: 1 week	2001-05-31 21:56:44 +00:00
Dima Dorfman	a723c4e173	Export via sysctl: * all members of msginfo from sysv_msg.c; * msqids from sysv_msg.c; * sema from sysv_sem.c; and * shmsegs from sysv_shm.c; These will be used by ipcs(1) in non-kvm mode. Reviewed by: tmm	2001-05-30 03:28:59 +00:00
Poul-Henning Kamp	22628ccf96	Remove the hack-around for the slice/label code, it didn't cover the hole.	2001-05-29 18:19:57 +00:00
Ian Dowse	5f558fa42f	Since the netexport struct was centralised to 'struct mount', attempting to remove nonexistent exports with MNT_DELEXPORT returns an error; before this change it always succeeded. This caused mountd(8) to log "can't delete exports for /whatever" warnings. Change the error code from EINVAL to a more specific ENOENT, and make mountd ignore this error when deleting the export list. I could have just restored the previous behaviour of returning success, but I think an error return is a useful diagnostic. Reviewed by: phk	2001-05-29 17:46:52 +00:00
Poul-Henning Kamp	b63436919d	Remove a comment which was past its shelf life. PR: 18750 Submitted by: Tony Finch <dot@dotat.at>	2001-05-29 09:22:22 +00:00
Poul-Henning Kamp	c01a009dc5	With the new kernel dev_t conversions done at release 4.X, it becomes possible to trap in ptsstop() in kern/tty_pty.c if the slave side has never been opened during the life of a kernel. What happens is that calls to ttyflush() done from ptyioctl() for the controlling side end up calling ptsstop() [via (tp->t_stop)(tp, <X>)] which evaluates the following: struct pt_ioctl pti = tp->t_dev->si_drv1; In order for tp->t_dev to be set, the slave device must first be opened in ttyopen() [kern/tty.c]. It appears that the only problem is calls to (*tp->t_stop)(tp, <n>), so this could also happen with other ioctls initiated by the controlling side before the slave has been opened. PR: 27698 Submitted by: David Bein bein@netapp.com MFC after: 6 days	2001-05-28 20:22:12 +00:00
Poul-Henning Kamp	507fbee0ad	The disklabel/slice code is more twisted than I thought. Revert to calling the cdevsw_add() unconditionally.	2001-05-28 16:12:55 +00:00
Brian Somers	04bd20e31d	Handle NULL struct device *s	2001-05-28 01:00:03 +00:00
Robert Watson	823c224e95	o uifree() the cr_ruidinfo in crfree() as well as cr_uidinfo now that the real uid info is in the credential also. Submitted by: egge	2001-05-27 21:43:46 +00:00
Robert Watson	7cb8e4d277	o pcred-removal changes included modifications to optimize the setting of the saved uid and gid during execve(). Unfortunately, the optimizations were incorrect in the case where the credential was updated, skipping the setting of the saved uid and gid when new credentials were generated. This change corrects that problem by handling the newcred!=NULL case correctly. Reported/tested by: David Malone <dwmalone@maths.tcd.ie> Obtained from: TrustedBSD Project	2001-05-26 19:59:44 +00:00
Poul-Henning Kamp	3344c5a17e	Create a general facility for making dev_t's depend on another dev_t. The dev_depends(dev_t, dev_t) function is for tying them to each other. When destroy_dev() is called on a dev_t, all dev_t's depending on it will also be destroyed (depth first order). Rewrite the make_dev_alias() to use this dependency facility. kern/subr_disk.c: Make the disk mini-layer use dependencies to make sure all relevant dev_t's are removed when the disk disappears. Make the disk mini-layer precreate some magic sub devices which the disk/slice/label code expects to be there. kern/subr_disklabel.c: Remove some now unneeded variables. kern/subr_diskmbr.c: Remove some ancient, commented out code. kern/subr_diskslice.c: Minor cleanup. Use name from dev_t instead of dsname()	2001-05-26 08:27:58 +00:00
John Baldwin	9d127f9ffb	Add vm locking to sendfile(2) and sf_buf_free(). Reported by: Tamiji Homma <thomma@BayNetworks.com> Tested by: Tamiji Homma <thomma@BayNetworks.com>	2001-05-25 19:23:04 +00:00
Robert Watson	b1fc0ec1a7	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
Poul-Henning Kamp	5696db457d	Make the PTY drivers cloning algorithm create "CHEAPCLONE" dev_t, so that some twit cannot allocate all 256 PTY's with "ls -l".	2001-05-25 13:23:42 +00:00
Poul-Henning Kamp	2613d3fec9	Use the name given to the dev_t, rather than creating our own. This makes it possible to give sensible information for /dev/fd.720 and similar "special" devices.	2001-05-25 09:06:52 +00:00
Ruslan Ermilov	1166fb516b	- sys/msdosfs moved to sys/fs/msdosfs - msdos.ko renamed to msdosfs.ko - /usr/include/msdosfs moved to /usr/include/fs/msdosfs	2001-05-25 08:14:14 +00:00
Poul-Henning Kamp	25e0288d07	Don't rely on cdevsw_add() when we hack about with dev_t's.	2001-05-24 20:28:06 +00:00
Poul-Henning Kamp	8576c652b4	Don't take the detour around devsw() to find out if the proto-cdevsw is already initialized.	2001-05-24 20:27:16 +00:00
Alfred Perlstein	0cea693084	whitespace/style	2001-05-24 18:06:22 +00:00
Matthew Dillon	ac8f990bde	This patch implements O_DIRECT about 80% of the way. It takes a patchset Tor created a while ago, removes the raw I/O piece (that has cache coherency problems), and adds a buffer cache / VM freeing piece. Essentially this patch causes O_DIRECT I/O to not be left in the cache, but does not prevent it from going through the cache, hence the 80%. For the last 20% we need a method by which the I/O can be issued directly to buffer supplied by the user process and bypass the buffer cache entirely, but still maintain cache coherency. I also have the code working under -stable but the changes made to sys/file.h may not be MFCable, so an MFC is not on the table yet. Submitted by: tegge, dillon	2001-05-24 07:22:27 +00:00
Dima Dorfman	028f979d1d	Correct style bugs with regards to long lines and comments. Reviewed by: bde	2001-05-23 23:38:05 +00:00
John Baldwin	0dfefe6829	Don't acquire Giant just to call trap_fatal(), we are about to panic anyway so we'd rather see the printf's than block if the system is hosed.	2001-05-23 22:58:09 +00:00
John Baldwin	bdc60f5bd3	Don't release Giant around vm_object_page_clean() in fsync() as the pager putpages called will need Giant.	2001-05-23 22:55:13 +00:00
John Baldwin	8aa66068ed	- Always call bfreekva() w/o vm_mtx held. - Always call vfs_setdirty() with vm_mtx held. - Fix an old comment: vm_hold_unload_pages is called vm_hold_free_pages() nowadays. - Always call vm_hold_free_pages() w/o vm_mtx held.	2001-05-23 22:24:49 +00:00
John Baldwin	1b2555b243	- Lock the VM when initializing the vmspace for proc0. - Don't bother releasing Giant while doing a lookup on the vm_map of initproc while starting up init. We have to grab it again right after the lookup anyways.	2001-05-23 22:06:47 +00:00
John Baldwin	613c83cbf1	Lock the VM while twiddling the vmspace.	2001-05-23 22:05:08 +00:00
Bosko Milekic	629db60492	Increment mbstat.m_mpfail, not mbstat.m_mcfail, when m_pullup() fails. This slipped in accidentally a few commits back.	2001-05-23 20:44:54 +00:00
John Baldwin	5bd57bc8b7	Don't release the vm lock just to turn around and grab it again.	2001-05-23 19:51:12 +00:00
John Baldwin	b516d2f5e1	Add in assertions to ensure that we always call msleep or mawait with either a timeout or a held mutex to detect unprotected infinite sleeps that can easily lead to deadlock. Submitted by: alfred	2001-05-23 19:38:26 +00:00
Poul-Henning Kamp	4787f91d6b	syslogd gets kernel log messages only once every 30 seconds or at the top of the minute, whichever comes first. It seems logtimeout() is only called once after the kernel log is opened and then never again after that. So I guess syslogd only gets kernel log messages by virtue of syncer(4)'s flushes ...? PR: 27361 Submitted by: pkern@utcc.utoronto.ca MFC after: 1 week	2001-05-23 19:02:50 +00:00
Alfred Perlstein	53240603ee	Acquire vm_mutex a little bit earlier to protect a pmap call.	2001-05-23 10:26:36 +00:00
Ruslan Ermilov	99d300a1ec	- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.	2001-05-23 09:42:29 +00:00
Dima Dorfman	0150c6e83d	Unifdef DEV_SNP; snp(4) no longer requires these ugly hacks. Silence by: -hackers, -audit	2001-05-22 22:16:18 +00:00
Dima Dorfman	47eaa5f542	Convert this driver to (ab?)use line disciplines to get the input it needs instead of relying on idiosyncratic hacks in the tty subsystem. Also add module code since this can now be compiled as a module. Silence by: -hackers, -audit	2001-05-22 22:13:14 +00:00
Bruce Evans	1c1771cb5b	Convert npx interrupts into traps instead of vice versa. This is much simpler for npx exceptions that start as traps (no assembly required...) and works better for npx exceptions that start as interrupts (there is no longer a problem for nested interrupts). Submitted by: original (pre-SMPng) version by luoqi	2001-05-22 21:20:49 +00:00
Dima Dorfman	a8dbafbe87	Correct the vm_mtx handling; specifically, don't acquire it in shm_deallocate_segment because shmexit_myhook calls it, and the latter should always be called with it already held. Submitted by: dwmalone, dd Approved by: alfred	2001-05-22 03:56:26 +00:00
Alfred Perlstein	a4d22b8035	Remove KASSERT test for sleeping on vm_mtx, instead let WITNESS catch it. Requested by: jhb	2001-05-22 00:58:20 +00:00
John Baldwin	9dceb26b23	Sort includes.	2001-05-21 18:52:02 +00:00
John Baldwin	270b041d95	- Assert that the vm mutex is held in pipe_free_kmem(). - Don't release the vm mutex early in pipespace() but instead hold it across vm_object_deallocate() if vm_map_find() returns an error and across pipe_free_kmem() if vm_map_find() succeeds. - Add a XXX above a zfree() since zalloc already has its own locking, one would hope that zfree() wouldn't need the vm lock.	2001-05-21 18:47:17 +00:00
John Baldwin	d8aad40c88	Axe unneeded spl()'s.	2001-05-21 18:30:50 +00:00
Alfred Perlstein	67d1f21cbe	Acquire vm mutex when releasing sysv shm segments. Obtained from: Dima Dorfman <dima@unixfreak.org>	2001-05-20 20:37:47 +00:00
Jonathan Lemon	1890520a77	Add convenience function kernel_sysctlbyname() for kernel consumers, so they don't have to roll their own sysctlbyname function.	2001-05-19 05:45:55 +00:00
Alfred Perlstein	5ee5c3aa1f	remove my private assertions from tsleep. add one assertion to ensure we don't sleep while holding vm.	2001-05-19 01:40:48 +00:00
Alfred Perlstein	2c3c846931	Regen syscalls that were made mpsafe via vm_mtx obreak, getpagesize, sbrk, sstk, mmap, ovadvise, munmap, mprotect, madvise, mincore, mmap, mlock, munlock, minherit, msync, mlockall, munlockall	2001-05-19 01:37:12 +00:00
Alfred Perlstein	2395531439	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
John Baldwin	1ad5401134	- Don't panic on a try lock operation for a sleep lock if we hold a spin lock. Since we won't actually block on a try lock operation, it's not a problem. Add a comment explaining why it is safe to skip lock order checking with try locks. - Remove the ithread list lock spin lock from the order list.	2001-05-17 22:44:56 +00:00
John Baldwin	4d29cb2db9	- Remove the global ithread_list_lock spin lock in favor of per-ithread sleep locks. - Delay returning from ithread_remove_handler() until we are certain that the interrupt handler being removed has in fact been removed from the ithread. - XXX: There is still a problem in that nothing protects the kernel from adding a new handler while the ithread is running, though with our current architectures this is not a problem. Requested by: gibbs (2)	2001-05-17 22:43:26 +00:00
John Baldwin	7a08bae6ec	- Move the setting of bootverbose to a MI SI_SUB_TUNABLES SYSINIT. - Attach a writable sysctl to bootverbose (debug.bootverbose) so it can be toggled after boot. - Move the printf of the version string to a SI_SUB_COPYRIGHT SYSINIT just after the display of the copyright message instead of doing it by hand in three MD places.	2001-05-17 22:28:46 +00:00
Robert Watson	6bd1912df4	o Modify access control checks in p_candebug() such that the policy is as follows: the effective uid of p1 (subject) must equal the real, saved, and effective uids of p2 (object), p2 must not have undergone a credential downgrade. A subject with appropriate privilege may override these protections. In the future, we will extend these checks to require that p1 effective group membership must be a superset of p2 effective group membership. Obtained from: TrustedBSD Project	2001-05-17 21:48:44 +00:00
Alfred Perlstein	0fd061c0c4	Cleanup Remove comment about setting error for reads on EOF, read returns 0 on EOF so the code should be ok. Remove non-effective priority boost, PRIO+1 doesn't do anything (according to McKusick), if a real priority boost is needed it should have been +4. Style fixes: .) return foo -> return (foo) .) FLAG1|FlAG2 -> FLAG1 | FlAG2 .) wrap long lines .) unwrap short lines .) for(i=0;i=foo;i++) -> for (i = 0; i=foo; i++) .) remove braces for some conditionals with a single statement .) fix continuation lines. md5 couldn't verify the binary because some code had to be shuffled around to address the style issues.	2001-05-17 19:47:09 +00:00
Alfred Perlstein	2deb4a20c3	initialize pipe pointers	2001-05-17 18:22:58 +00:00
Alfred Perlstein	82a283fcf3	pipe_create has to zero out the select record earlier to avoid returning a half-initialized pipe and causing pipeclose() to follow a junk pointer. Discovered by: "Nick S" <snicko@noid.org>	2001-05-17 17:59:28 +00:00
Ian Dowse	0864ef1e8a	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
Alfred Perlstein	a428c5ffef	remove include of ipl.h because it no longer exists	2001-05-16 02:52:06 +00:00
John Baldwin	8bd57f8fc2	Remove unneeded includes of sys/ipl.h and machine/ipl.h.	2001-05-15 23:22:29 +00:00
John Baldwin	74fc745594	- Remove unneeded include of sys/ipl.h. - Lock the process before calling killproc() to kill it for exceeding the maximum CPU limit.	2001-05-15 23:15:06 +00:00
John Baldwin	9081e5e826	- Remove unneeded include of sys/ipl.h. - Require the proc lock be held for killproc() to allow for the vmdaemon to kill a process when memory is exhausted while holding the lock of the process to kill.	2001-05-15 23:13:58 +00:00
Brian Somers	eeee064735	Support /dev/ctty again Submitted by: peter	2001-05-15 18:12:38 +00:00
Seigo Tanimura	1b36970495	Back out scanning file descriptors with holding a process lock. selrecord() requires allproc sx in pfind(), resulting in lock order reversal between allproc and a process lock.	2001-05-15 10:19:57 +00:00
Jonathan Lemon	97f6754ff1	When calling poll() on a fd associated with a filesystem, let POLLIN/POLLOUT behave identically to POLLRDNORM/POLLWRNORM. Submitted by: bde PR: 27287 merge after: 1 week	2001-05-14 14:37:25 +00:00
Poul-Henning Kamp	241e77c8a5	Use the new ability to avoid practically all the gunk in this file. When people access /dev/tty, locate their controlling tty and return the dev_t of it to them. This basically makes /dev/tty act like a variant symlink sort of thing which is much simpler than all the mucking about with vnodes.	2001-05-14 08:22:56 +00:00
Seigo Tanimura	265fc98f36	- Convert msleep(9) in select(2) and poll(2) to cv_wait(9). - Since polling should not involve sleeping, keep holding a process lock upon scanning file descriptors. - Hold a reference to every file descriptor prior to entering polling loop in order to avoid lock order reversal between lockmgr and p_mtx upon calling fdrop() in fo_poll(). (NOTE: this work has not been done for netncp and netsmb yet because a socket itself has no reference counts.) Reviewed by: jhb	2001-05-14 05:26:48 +00:00
John Baldwin	1efb92b7ca	Simplify the vm fault trap handling code a bit by using if-else instead of duplicating code in the then case and then using a goto to jump around the else case.	2001-05-11 23:50:08 +00:00
Ian Dowse	1feb7a6efa	In vrele() and vput(), avoid triggering the confusing "missed vn_close" KASSERT when vp->v_usecount is zero or negative. In this case, the "v*: negative ref cnt" panic that follows is much more appropriate. Reviewed by: mckusick	2001-05-11 20:42:41 +00:00
John Baldwin	9e5620599e	Check witness_dead in more functions to avoid panic'ing when assertions fail due to witness exhausting its internal resources and shutting down. Reported by: Szilveszter Adam <sziszi@petra.hos.u-szeged.hu> Tested by: David Wolfskill <david@catwhisker.org>	2001-05-11 20:25:29 +00:00
Tor Egge	dd1c45f3ca	Regenerate.	2001-05-11 17:05:47 +00:00
Tor Egge	b4b469e6bb	gettimeofday() is MP safe on both -current and -stable.	2001-05-11 17:05:12 +00:00
John Baldwin	ba228f6d96	- Split out the support for per-CPU data from the SMP code. UP kernels have per-CPU data and gdb on the i386 at least needs access to it. - Clean up includes in kern_idle.c and subr_smp.c. Reviewed by: jake	2001-05-10 17:45:49 +00:00
Alfred Perlstein	97d4578662	Remove an 'optimization' I hope to never see again. The pipe code could not handle running out of kva, it would panic if that happened. Instead return ENFILE to the application which is an acceptable error return from pipe(2). There were some slightly tricky things that needed to be worked on, namely that the pipe code can 'realloc' the size of the buffer if it detects that the pipe could use a bit more room. However if it failed the reallocation it could not cope and would panic. Fix this by attempting to grow the pipe while holding onto our old resources. If all goes well free the old resources and use the new ones, otherwise continue to use the smaller buffer already allocated. While I'm here add a few blank lines for style(9) and remove 'register'.	2001-05-08 09:09:18 +00:00
Poul-Henning Kamp	e0e0b6610e	Always initialize bio_resid from bio_bcount in the disk mini-layer so that the drivers don't have to do it umpteen times.	2001-05-08 08:24:54 +00:00
Akinori MUSHA	3b26be6ae1	Properly copy the P_ALTSTACK flag in struct proc::p_flag to the child process on fork(2). It is the supposed behavior stated in the manpage of sigaction(2), and Solaris, NetBSD and FreeBSD 3-STABLE correctly do so. The previous fix against libc_r/uthread/uthread_fork.c fixed the problem only for the programs linked with libc_r, so back it out and fix fork(2) itself to help those not linked with libc_r as well. PR: kern/26705 Submitted by: KUROSAWA Takahiro <fwkg7679@mb.infoweb.ne.jp> Tested by: knu, GOTOU Yuuzou <gotoyuzo@notwork.org>, and some other people Not objected by: hackers MFC in: 3 days	2001-05-07 18:07:29 +00:00
Poul-Henning Kamp	079f2df393	Make the disk mini-layer check for and handle zero-length transfers instead of the underlying drivers.	2001-05-06 21:55:22 +00:00
Poul-Henning Kamp	a468031ce8	Actually biofinish(struct bio *, struct devstat *, int error) is more general than the bioerror(). Most of this patch is generated by scripts.	2001-05-06 20:00:03 +00:00
Poul-Henning Kamp	b966319db7	Fix return type of vop_stdputpages() Noticed by: rwatson	2001-05-06 17:40:22 +00:00
Robert Watson	29b2efeb6b	o First step in cleaning up authorization code for the posix4 implementation. Move from direct uid 0 comparison to using suser_xxx() call with the same semantics. Simplify CAN_AFFECT() macro as passed pcred was redundant. The checks here still aren't "right", but they are probably "better". Obtained from: TrustedBSD Project	2001-05-06 16:15:42 +00:00
Matthew Dillon	1766b2e5fa	Raise the SysV shared memory defaults to more reasonable values. Mainly increases the shared memory limit from 4M to 32M (approx). Many more programs these days use SysV shared memory, especially X-related programs.	2001-05-04 18:43:19 +00:00
John Baldwin	6c49a8e295	Fix a bug in the pfind() changes due to confusing the process returned by pfind() ('pp') with the process being detached from ptrace. Reported by: bde	2001-05-04 18:13:11 +00:00
John Baldwin	2d96f0b145	- Move state about lock objects out of struct lock_object and into a new struct lock_instance that is stored in the per-process and per-CPU lock lists. Previously, the lock lists just kept a pointer to each lock held. That pointer is now replaced by a lock instance which contains a pointer to the lock object, the file and line of the last acquisition of a lock, and various flags about a lock including its recursion count. - If we sleep while holding a sleepable lock, then mark that lock instance as having slept and ignore any lock order violations that occur while acquiring Giant when we wake up with slept locks. This is ok because of Giant's special nature. - Allow witness to differentiate between shared and exclusive locks and unlocks of a lock. Witness will now detect the case when a lock is acquired first in one mode and then in another. Mutexes are always locked and unlocked exclusively. Witness will also now detect the case where a process attempts to unlock a shared lock while holding an exclusive lock and vice versa. - Fix a bug in the lock list implementation where we used the wrong constant to detect the case where a lock list entry was full.	2001-05-04 17:15:16 +00:00
John Baldwin	ac07d659c3	Don't hold the process mutex across calls to FREE() since the vm system uses lockmgr locks and this leads to a lock order reversal. At this point in wait1() the process is not on any process lists or in the process tree, so no other process should be able to find it or have a reference to it anyways, so the locking is not needed.	2001-05-04 16:13:28 +00:00
Poul-Henning Kamp	a62615e59b	Implement vop_std{get|put}pages() and add them to the default vop[]. Un-copy&paste all the VOP_{GET|PUT}PAGES() functions which do nothing but the default.	2001-05-01 08:34:45 +00:00
Mark Murray	fb919e4d5a	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
Alfred Perlstein	aad7597ce0	When panic()'ing because of recursion on a non-recursive mutex, print out the location it was initially locked. Ok'd by: jake	2001-04-30 01:01:52 +00:00
Jake Burkholder	e6af1080c2	Make rtprio work again. - add a missing break which caused RTP_SET to always return EINVAL - break instead of returning if p_can fails so proc_lock is always dropped correctly - only copyin data that is actually needed - use break instead of goto - make rtp_to_pri return EINVAL instead of -1 if the values are out of range so we don't have to translate	2001-04-29 22:09:26 +00:00
Robert Watson	46157a65d7	o As part of the move to not maintaining copies of the vnode owning uid and gid in the ACL, vaccess_acl_posix1e() was changed to accept explicit file_uid and file_gid as arguments. However, in making the change, I explicitly checked file_gid against cr->cr_groups[0], rather than using groupmember, resulting in ACL_GROUP_OBJ entries being compared to the caller's effective gid only, not the remainder of its groups. This was recently corrected for the version of the group call without privilege, but the second test (when privilege is added) was missed. This change replaces an additional cr->cr_groups[0] check with groupmember(). Pointed out by: jedgar Reviewed by: jedgar Obtained from: TrustedBSD Project	2001-04-29 19:53:50 +00:00
Poul-Henning Kamp	855aa097af	VOP_BALLOC was never really a VOP in the first place, so convert it to UFS_BALLOC like the other "between UFS and FFS function interfaces".	2001-04-29 12:36:52 +00:00
Poul-Henning Kamp	b7ebffbc08	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
Greg Lehey	60fb0ce365	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00
Alfred Perlstein	6157b69f4a	Instead of asserting that a mutex is not still locked after unlocking it, assert that the mutex is owned and not recursed prior to unlocking it. This should give a clearer diagnostic when a programming error is caught.	2001-04-28 12:11:01 +00:00
John Baldwin	6caa8a1501	Overhaul of the SMP code. Several portions of the SMP kernel support have been made machine independent and various other adjustments have been made to support Alpha SMP. - It splits the per-process portions of hardclock() and statclock() off into hardclock_process() and statclock_process() respectively. hardclock() and statclock() call the _process() functions for the current process so that UP systems will run as before. For SMP systems, it is simply necessary to ensure that all other processors execute the _process() functions when the main clock functions are triggered on one CPU by an interrupt. For the alpha 4100, clock interrupts are delievered in a staggered broadcast fashion, so we simply call hardclock/statclock on the boot CPU and call the _process() functions on the secondaries. For x86, we call statclock and hardclock as usual and then call forward_hardclock/statclock in the MD code to send an IPI to cause the AP's to execute forwared_hardclock/statclock which then call the _process() functions. - forward_signal() and forward_roundrobin() have been reworked to be MI and to involve less hackery. Now the cpu doing the forward sets any flags, etc. and sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically return so that they can execute ast() and don't bother with setting the astpending or needresched flags themselves. This also removes the loop in forward_signal() as sched_lock closes the race condition that the loop worked around. - need_resched(), resched_wanted() and clear_resched() have been changed to take a process to act on rather than assuming curproc so that they can be used to implement forward_roundrobin() as described above. - Various other SMP variables have been moved to a MI subr_smp.c and a new header sys/smp.h declares MI SMP variables and API's. The IPI API's from machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h. 
- The globaldata_register() and globaldata_find() functions as well as the SLIST of globaldata structures has become MI and moved into subr_smp.c. Also, the globaldata list is only available if SMP support is compiled in. Reviewed by: jake, peter Looked over by: eivind	2001-04-27 19:28:25 +00:00
Alfred Perlstein	3abedb4e01	Actually show the values that tripped the assertion "receive 1"	2001-04-27 13:42:50 +00:00
Robert Watson	80c9c40df9	o Remove the disabled p_cansched() test cases that permitted users to modify the scheduling properties of processes with a different real uid but the same effective uid (i.e., daemons, et al). (note: these cases were previously commented out, so this does not change the compiled code at all) Obtained from: TrustedBSD Project	2001-04-27 01:56:32 +00:00
Poul-Henning Kamp	8ee8b21b48	vfs_subr.c is getting rather fat. The underlying repocopy and this commit moves the filesystem export handling code to vfs_export.c	2001-04-26 20:47:14 +00:00
Alfred Perlstein	06336fb26d	Sendfile is documented to return 0 on success; however, when a sf_hdtr is used to provide writev(2) style headers/trailers on the sent data, the return value is actually either the result of writev(2) from the trailers, or from the headers if no trailers are specified. Fix sendfile to comply with the documentation by returning 0 on success. Ok'd by: dg	2001-04-26 00:14:14 +00:00
Seigo Tanimura	ebdc3f1d2d	Do not leave a process with no credential in zombproc. Reviewed by: jhb	2001-04-25 10:22:35 +00:00
Kirk McKusick	112f737245	When closing the last reference to an unlinked file, it is freed by the inactive routine. Because the freeing causes the filesystem to be modified, the close must be held up during periods when the filesystem is suspended. For snapshots to be consistent across crashes, they must write blocks that they copy and claim those written blocks in their on-disk block pointers before the old blocks that they referenced can be allowed to be written. Close a loophole that allowed unwritten blocks to be skipped when doing ffs_sync with a request to wait for all I/O activity to be completed.	2001-04-25 08:11:18 +00:00
Poul-Henning Kamp	a13234bb35	Move the netexport structure from the fs-specific mountstructure to struct mount. This makes the "struct netexport *" parameter to the vfs_export and vfs_checkexport interface unneeded. Consequently, all non-stacking filesystems can use vfs_stdcheckexp(). At the same time, make it a pointer to a struct netexport in struct mount, so that we can remove the bogus AF_MAX and #include <net/radix.h> from <sys/mount.h>	2001-04-25 07:07:52 +00:00
Thomas Moestl	83f3198b2b	Change uipc_sockaddr so that a sockaddr_un without a path is returned in nam for an unbound socket instead of leaving nam untouched in that case. This way, the getsockname() output can be used to determine the address family of such sockets (AF_LOCAL). Reviewed by: iedowse Approved by: rwatson	2001-04-24 19:09:23 +00:00
John Baldwin	33a9ed9d0e	Change the pfind() and zpfind() functions to lock the process that they find before releasing the allproc lock and returning. Reviewed by: -smp, dfr, jake	2001-04-24 00:51:53 +00:00
Thomas Moestl	e15480f8dd	Fix a bug introduced in the last commit: vaccess_acl_posix1e only checked the file gid against the egid of the accessing process for the ACL_GROUP_OBJ case, and ignored supplementary groups. Approved by: rwatson	2001-04-23 22:52:26 +00:00
Greg Lehey	d98dc34f52	Correct #includes to work with fixed sys/mount.h.	2001-04-23 09:05:15 +00:00
Robert Watson	5ea6583e2d	o Remove comment indicating policy permits loop-back debugging, but semantics don't: in practice, both policy and semantics permit loop-back debugging operations, only it's just a subset of debugging operations (i.e., a proc can open its own /dev/mem), and that's at a higher layer.	2001-04-21 22:41:45 +00:00
John Baldwin	9d4f526475	Spelling nit: acquring -> acquiring. Reported by: T. William Wells <bill@twwells.com>	2001-04-21 01:50:32 +00:00
Alfred Perlstein	98689e1e70	Assert that when using an interlock mutex it is not recursed when lockmgr() is called. Ok'd by: jhb	2001-04-20 22:38:40 +00:00
John Baldwin	242d02a13f	Make the ap_boot_mtx mutex static.	2001-04-20 01:09:05 +00:00
John Baldwin	d8915a7f34	- Whoops, forgot to enable the clock lock in the spin order list on the alpha. - Change the Debugger() functions to pass in the real function name.	2001-04-19 15:49:54 +00:00
Bosko Milekic	d04d50d1f7	Fix inconsistency in setup of kernel_map: we need to make sure that we also reserve _adequate_ space for the mb_map submap; i.e. we need space for nmbclusters, nmbufs, _and_ nmbcnt. Furthermore, we need to rounddown, and not roundup, so that we are consistent. Pointed out by: bde	2001-04-18 23:54:13 +00:00
Alfred Perlstein	2f3cf91876	Check validity of signal callback requested via aio routines. Also move the insertion of the request to after the request is validated. It still looks like there may be some problems if an invalid address is passed to the aio routines: a possible leak, or a not completely initialized structure on the queue, may still be possible. A new sig macro, _SIG_VALID, was made to check the validity of a signal; it would be advisable to use it from now on (in kern/kern_sig.c) rather than rolling your own. PR: kern/17152	2001-04-18 22:18:39 +00:00
Seigo Tanimura	759cb26335	Reclaim directory vnodes held in namecache if few free vnodes are available. Only directory vnodes holding no child directory vnodes held in v_cache_src are recycled, so that directory vnodes near the root of the filesystem hierarchy remain in namecache and directory vnodes are not reclaimed in cascade. The period of vnode reclaiming attempt and the number of vnodes attempted to reclaim can be tuned via sysctl(2). Suggested by: tegge Approved by: phk	2001-04-18 11:19:50 +00:00
Poul-Henning Kamp	793d6d5d57	bread() is a special case of breadn(), so don't replicate code.	2001-04-18 07:16:07 +00:00
Dima Dorfman	25c7870e5d	Make this driver play ball with devfs(5). Reviewed by: brian	2001-04-17 20:53:11 +00:00
Alfred Perlstein	e04670b734	Add a sanity check on ucred refcount. Submitted by: Terry Lambert <terry@lambert.org>	2001-04-17 20:50:43 +00:00
Alfred Perlstein	603c86672c	Implement client side NFS locks. Obtained from: BSD/os Import Ok'd by: mckusick, jkh, motd on builder.freebsd.org	2001-04-17 20:45:23 +00:00
Poul-Henning Kamp	0dfba3cef1	Write a switch statement as less obscure if statements.	2001-04-17 20:22:07 +00:00
John Baldwin	e3ee8974e3	Fix an old bug related to BETTER_CLOCK. Call forward_clock if SMP and __i386__ are defined rather than if SMP and BETTER_CLOCK are defined. The removal of BETTER_CLOCK would have broken this except that kern_clock.c doesn't include <machine/smptests.h>, so it doesn't see the definition of BETTER_CLOCK, and forward_clock isn't called, even on 4.x. This seems to fix the problem where an n-way SMP system would see 100 * n clk interrupts and 128 * n rtc interrupts.	2001-04-17 17:53:36 +00:00
Poul-Henning Kamp	f84e29a06c	This patch removes the VOP_BWRITE() vector. VOP_BWRITE() was a hack which made it possible for NFS client side to use struct buf with non-bio backing. This patch takes a more general approach and adds a bp->b_op vector where more methods can be added. The success of this patch depends on bp->b_op being initialized in all relevant places for some value of "relevant" which is not easy to determine. For now the buffers have grown a b_magic element which will make such issues a tiny bit easier to debug.	2001-04-17 08:56:39 +00:00
Kirk McKusick	5819ab3f12	Add debugging option to always read/write cylinder groups as full sized blocks. To enable this option, use: `sysctl -w debug.bigcgs=1'. Add debugging option to disable background writes of cylinder groups. To enable this option, use: `sysctl -w debug.dobkgrdwrite=0'. These debugging options should be tried on systems that are panicking with corrupted cylinder group maps to see if it makes the problem go away. The set of panics in question are: ffs_clusteralloc: map mismatch ffs_nodealloccg: map corrupted ffs_nodealloccg: block not in map ffs_alloccg: map corrupted ffs_alloccg: block not in map ffs_alloccgblk: cyl groups corrupted ffs_alloccgblk: can't find blk in cyl ffs_checkblk: partially free fragment The following panics are less likely to be related to this problem, but might be helped by these debugging options: ffs_valloc: dup alloc ffs_blkfree: freeing free block ffs_blkfree: freeing free frag ffs_vfree: freeing free inode If you try these options, please report whether they helped reduce your bitmap corruption panics to Kirk McKusick at <mckusick@mckusick.com> and to Matt Dillon <dillon@earth.backplane.com>.	2001-04-17 05:37:51 +00:00
Robert Watson	b114e127e6	In my first reading of POSIX.1e, I misinterpreted handling of the ACL_USER_OBJ and ACL_GROUP_OBJ fields, believing that modification of the access ACL could be used by privileged processes to change file/directory ownership. In fact, this is incorrect; ACL_*_OBJ (+ ACL_MASK and ACL_OTHER) should have undefined ae_id fields; this commit attempts to correct that misunderstanding. o Modify arguments to vaccess_acl_posix1e() to accept the uid and gid associated with the vnode, as those can no longer be extracted from the ACL passed as an argument. Perform all comparisons against the passed arguments. This actually has the effect of simplifying a number of components of this call, as well as reducing the indent level, but now seperates handling of ACL_GROUP_OBJ from ACL_GROUP. o Modify acl_posix1e_check() to return EINVAL if the ae_id field of any of the ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} entries is a value other than ACL_UNDEFINED_ID. As a temporary work-around to allow clean upgrades, set the ae_id field to ACL_UNDEFINED_ID before each check so that this cannot cause a failure in the short term (this work-around will be removed when the userland libraries and utilities are updated to take this change into account). o Modify ufs_sync_acl_from_inode() so that it forces ACL_{USER_OBJ,GROUP_OBJ,MASK,OTHER} ae_id fields to ACL_UNDEFINED_ID when synchronizing the ACL from the inode. o Modify ufs_sync_inode_from_acl to not propagate uid and gid information to the inode from the ACL during ACL update. Also modify the masking of permission bits that may be set from ALLPERMS to (S_IRWXU\|S_IRWXG\|S_IRWXO), as ACLs currently do not carry none-ACCESSPERMS (S_ISUID, S_ISGID, S_ISTXT). o Modify ufs_getacl() so that when it emulates an access ACL from the inode, it initializes the ae_id fields to ACL_UNDEFINED_ID. 
o Clean up ufs_setacl() substantially since it is no longer possible to perform chown/chgrp operations using vop_setacl(), so all the access control for that can be eliminated. o Modify ufs_access() so that it passes owner uid and gid information into vaccess_acl_posix1e(). Pointed out by: jedgar Obtained from: TrustedBSD Project	2001-04-17 04:33:34 +00:00
John Baldwin	abd9053ee4	Blow away the panic mutex in favor of using a single atomic_cmpset() on a panic_cpu shared variable. I used a simple atomic operation here instead of a spin lock as it seemed to be excessive overhead. Also, this can avoid recursive panics if, for example, witness is broken.	2001-04-17 04:18:08 +00:00
John Baldwin	3c41f323c9	Check to see if enroll() returns NULL in the witness initialization. This can happen if witness runs out of resources during initialization or if witness_skipspin is enabled. Sleuthing by: Peter Jeremy <peter.jeremy@alcatel.com.au>	2001-04-17 03:35:38 +00:00
John Baldwin	7141f2ad46	Exit and re-enter the critical section while spinning for a spinlock so that interrupts can come in while we are waiting for a lock.	2001-04-17 03:34:52 +00:00
John Hay	24dbea46a9	Update to the 2001-04-02 version of the nanokernel code from Dave Mills.	2001-04-16 13:05:05 +00:00
Brian Somers	56700d4634	Call strlen() once instead of twice.	2001-04-14 21:33:58 +00:00
Robert Watson	e9e7ff5b22	o Since uid checks in p_cansignal() are now identical between P_SUGID and non-P_SUGID cases, simplify p_cansignal() logic so that the P_SUGID masking of possible signals is independent from uid checks, removing redundant code and generally improving readability. Reviewed by: tmm Obtained from: TrustedBSD Project	2001-04-13 14:33:45 +00:00
Alfred Perlstein	1375ed7eb7	convert if/panic -> KASSERT, explain what triggered the assertion	2001-04-13 10:15:53 +00:00
Murray Stokely	a4e6da691f	Generate useful error messages.	2001-04-13 09:37:25 +00:00
Mark Murray	f0b60d7560	Handle a rare but fatal race invoked sometimes when SIGSTOP is invoked.	2001-04-13 09:29:34 +00:00
John Baldwin	7a9aa5d372	- Add a comment at the start of the spin locks list. - The alpha SMP code uses an "ap boot" spinlock as well.	2001-04-13 08:31:38 +00:00
Robert Watson	44c3e09cdc	o Disallow two "allow this" exceptions in p_cansignal() restricting the ability of unprivileged processes to deliver arbitrary signals to daemons temporarily taking on unprivileged effective credentials when P_SUGID is not set on the target process: Removed: (p1->p_cred->cr_ruid != ps->p_cred->cr_uid) (p1->p_ucred->cr_uid != ps->p_cred->cr_uid) o Replace two "allow this" exceptions in p_cansignal() restricting the ability of unprivileged processes to deliver arbitrary signals to daemons temporarily taking on unprivileged effective credentials when P_SUGID is set on the target process: Replaced: (p1->p_cred->p_ruid != p2->p_ucred->cr_uid) (p1->p_cred->cr_uid != p2->p_ucred->cr_uid) With: (p1->p_cred->p_ruid != p2->p_ucred->p_svuid) (p1->p_ucred->cr_uid != p2->p_ucred->p_svuid) o These changes have the effect of making the uid-based handling of both P_SUGID and non-P_SUGID signal delivery consistent, following these four general cases: p1's ruid equals p2's ruid p1's euid equals p2's ruid p1's ruid equals p2's svuid p1's euid equals p2's svuid The P_SUGID and non-P_SUGID cases can now be largely collapsed, and I'll commit this in a few days if no immediate problems are encountered with this set of changes. o These changes remove a number of warning cases identified by the proc_to_proc inter-process authorization regression test. o As these are new restrictions, we'll have to watch out carefully for possible side effects on running code: they seem reasonable to me, but it's possible this change might have to be backed out if problems are experienced. Submitted by: src/tools/regression/security/proc_to_proc/testuid Reviewed by: tmm Obtained from: TrustedBSD Project	2001-04-13 03:06:22 +00:00
Robert Watson	0489082737	o Disable two "allow this" exceptions in p_cansched(), restricting the ability of unprivileged processes to modify the scheduling properties of daemons temporarily taking on unprivileged effective credentials. These cases (p1->p_cred->p_ruid == p2->p_ucred->cr_uid) and (p1->p_ucred->cr_uid == p2->p_ucred->cr_uid), respectively permitting a subject process to influence the scheduling of a daemon if the subject process has the same real uid or effective uid as the daemon's effective uid. This removes a number of the warning cases identified by the proc_to_proc inter-process authorization regression test. o As these are new restrictions, we'll have to watch out carefully for possible side effects on running code: they seem reasonable to me, but it's possible this change might have to be backed out if problems are experienced. Reported by: src/tools/regression/security/proc_to_proc/testuid Obtained from: TrustedBSD Project	2001-04-12 22:46:07 +00:00
Robert Watson	e386f9bda3	o Make kqueue's filt_procattach() function use the error value returned by p_can(...P_CAN_SEE), rather than returning EACCES directly. This brings the error code used here into line with similar arrangements elsewhere, and prevents the leakage of pid usage information. Reviewed by: jlemon Obtained from: TrustedBSD Project	2001-04-12 21:32:02 +00:00
Robert Watson	d34f8d3030	o Limit process information leakage by introducing a p_can(...P_CAN_SEE...) in rtprio()'s RTP_LOOKUP implementation. Obtained from: TrustedBSD Project	2001-04-12 20:46:26 +00:00
Robert Watson	eb9e5c1d72	o Reduce information leakage into jails by adding invocations of p_can(...P_CAN_SEE...) to getpgid(), getsid(), and setpgid(), blocking these operations on processes that should not be visible by the requesting process. Required to reduce information leakage in MAC environments. Obtained from: TrustedBSD Project	2001-04-12 19:39:00 +00:00
Robert Watson	4c5eb9c397	o Replace p_cankill() with p_cansignal(), remove wrappage of p_can() from signal authorization checking. o p_cansignal() takes three arguments: subject process, object process, and signal number, unlike p_cankill(), which only took into account the processes and not the signal number, improving the abstraction such that CANSIGNAL() from kern_sig.c can now also be eliminated; previously CANSIGNAL() special-cased the handling of SIGCONT based on process session. privused is now deprecated. o The new p_cansignal() further limits the set of signals that may be delivered to processes with P_SUGID set, and restructures the access control check to allow it to be extended more easily. o These changes take into account work done by the OpenBSD Project, as well as by Robert Watson and Thomas Moestl on the TrustedBSD Project. Obtained from: TrustedBSD Project	2001-04-12 02:38:08 +00:00
Robert Watson	40829dd2dc	o Regenerated following introduction of __setugid() system call for "options REGRESSION". Obtained from: TrustedBSD Project	2001-04-11 20:21:37 +00:00
Robert Watson	130d0157d1	o Introduce a new system call, __setsugid(), which allows a process to toggle the P_SUGID bit explicitly, rather than relying on it being set implicitly by other protection and credential logic. This feature is introduced to support inter-process authorization regression testing by simplifying userland credential management allowing the easy isolation and reproduction of authorization events with specific security contexts. This feature is enabled only by "options REGRESSION" and is not intended to be used by applications. While the feature is not known to introduce security vulnerabilities, it does allow processes to enter previously inaccessible parts of the credential state machine, and is therefore disabled by default. It may not constitute a risk, and therefore in the future pending further analysis (and appropriate need) may become a published interface. Obtained from: TrustedBSD Project	2001-04-11 20:20:40 +00:00
John Baldwin	7b531e6037	Stick proc0 in the PID hash table.	2001-04-11 18:50:50 +00:00
John Baldwin	2fea957dc5	Rename the IPI API from smp_ipi_* to ipi_* since the smp_ prefix is just "redundant noise" and to match the IPI constant namespace (IPI_*). Requested by: bde	2001-04-11 17:06:02 +00:00
Chris D. Faulhaber	fb1af1f2bf	Correct the following defines to match the POSIX.1e spec: ACL_PERM_EXEC -> ACL_EXECUTE ACL_PERM_READ -> ACL_READ ACL_PERM_WRITE -> ACL_WRITE Obtained from: TrustedBSD	2001-04-11 02:19:01 +00:00
Peter Wemm	9d10eb0c0c	Create debug.hashstat.[raw]nchash and debug.hashstat.[raw]nfsnode to enable easy access to the hash chain stats. The raw prefixed versions dump an integer array to userland with the chain lengths. This cheats and calls it an array of 'struct int' rather than 'int' or sysctl -a faithfully dumps out the 128K array on an average machine. The non-raw versions return 4 integers: count, number of chains used, maximum chain length, and percentage utilization (fixed point, multiplied by 100). The raw forms are more useful for analyzing the hash distribution, while the other form can be read easily by humans and stats loggers.	2001-04-11 00:39:20 +00:00
John Baldwin	ca7ef17c08	Remove the BETTER_CLOCK #ifdef's. The code is on by default and is here to stay for the foreseeable future. OK'd by: peter (the idea)	2001-04-10 21:34:13 +00:00
John Baldwin	6a0fa9a023	Add an MI API for sending IPI's. I used the same API present on the alpha because: - it used a better namespace (smp_ipi_* rather than _ipi), - it used better constant names for the IPI's (IPI_ rather than X*_OFFSET), and - this API also somewhat exists for both alpha and ia64 already.	2001-04-10 21:04:32 +00:00
Boris Popov	681a5bbef2	Import kernel part of SMB/CIFS requester. Add smbfs(CIFS) filesystem. Userland part will be in the ports tree for a while. Obtained from: smbfs-1.3.7-dev package.	2001-04-10 07:59:06 +00:00
Boris Popov	16162e5789	Avoid endless recursion on panic. Reviewed by: jhb	2001-04-10 00:56:19 +00:00
John Baldwin	d53d22496f	Maintain a reference count on the witness struct. When the reference count drops to 0 in witness_destroy, set the w_name and w_file pointers to point to the string "(dead)" and the w_line field to 0. This way, if a mutex of a given name is used only in a module, then as long as all mutexes in the module are destroyed when the module is unloaded, witness will not maintain stale references to the mutex's name in the module's data section causing a panic later on when the w_name or w_file fields are examined.	2001-04-09 22:34:05 +00:00
Nick Hibma	7032578aac	Remove a stale file.	2001-04-09 10:28:33 +00:00
Jake Burkholder	1681b00a79	Fix a precedence bug. ! has higher precedence than &.	2001-04-08 04:15:26 +00:00
Nick Hibma	6d0b13c1ba	Use getopt instead of a home grown one Submitted by: DES	2001-04-07 20:51:24 +00:00
John Baldwin	3dcb6789d7	- Split out the functionality of displaying the contents of a single lock list into a public witness_list_locks() function. Call this function twice in witness_list() instead of using an evil goto. - Adjust the 'show locks' command to take an optional parameter which specifies the pid of a process to list the locks of. By default the locks held by the current process are displayed.	2001-04-06 21:37:52 +00:00
Bosko Milekic	4b8ae40a7c	- Change the msleep()s to condition variables. The mbuf and mcluster free lists now each "own" a condition variable, m_starved. - Clean up minor indentation issues in sys/mbuf.h caused by previous commit.	2001-04-03 04:50:13 +00:00
Alfred Perlstein	5ea487f34d	Use only one mutex for the entire mbuf subsystem. Don't use atomic operations for the stats updating, instead protect the counts with the mbuf mutex. Most twiddling of the stats was done right before or after releasing a mutex. By doing this we reduce the number of locked ops needed as well as allow a sysctl to gain a consistent view of the entire stats structure. In the future... This will allow us to chain common mbuf operations that would normally need to acquire/release 2 or 3 of the locks to build an mbuf with a cluster or external data attached into a single op requiring only one lock. Simplify the per-cpu locks that are planned. There's also some if (1) code that should check if the "how" operation specifies blocking/non-blocking behavior, we _could_ make it so that we hold onto the mutex through calls into kmem_alloc when non-blocking requests are made, but for safety reasons we currently drop and reacquire the mutex around the calls. Also, note that calling kmem_alloc is rare and only happens during a shortage so drop/re-getting the mutex will not be a common occurrence. Remove some #define's that seemed to obfuscate the code to me. Remove an extraneous comment. Remove an XXX, including mutex.h isn't a crime. Reviewed by: bmilekic	2001-04-03 03:15:11 +00:00
John Baldwin	5b3047d59f	Change stop() to require the sched_lock as well as p's process lock to avoid silly lock contention on sched_lock since in 2 out of the 3 places that we call stop(), we get sched_lock right after calling it and we were locking sched_lock inside of stop() anyways.	2001-04-03 01:39:23 +00:00
John Baldwin	1333047621	- Move the second stop() of process 'p' in issignal() to be after we send SIGCHLD to our parent process. Otherwise, we could block while obtaining the process lock for our parent process and switch out while we were in SSTOP. Even worse, when we try to resume from the mutex being blocked on our p_stat will be SRUN, not SSTOP. - Fix a comment above stop() to indicate that it requires that the proc lock be held, not a proctree lock. Reported by: markm Sleuthing by: jake	2001-04-02 17:26:51 +00:00
Robert Watson	685574864e	o Part two of introduction of extattr_{delete,get,set}_fd() system calls, regenerate necessary automatically-generated code. Obtained from: TrustedBSD Project	2001-03-31 16:21:19 +00:00
Robert Watson	fec605c882	o Introduce extattr_{delete,get,set}_fd() to allow extended attribute operations on file descriptors, which complement the existing set of calls, extattr_{delete,get,set}_file() which act on paths. In doing so, restructure the system call implementation such that the two sets of functions share most of the relevant code, rather than duplicating it. This pushes the vnode locking into the shared code, but keeps the copying in of some arguments in the system call code. Allowing access via file descriptors reduces the opportunity for race conditions when managing extended attributes. Obtained from: TrustedBSD Project	2001-03-31 16:20:05 +00:00
Robert Watson	f8e6ab29c2	o Restructure privilege check associated with process visibility for ps_showallprocs such that if superuser is present to override process hiding, the search falls through [to success]. When additional restrictions are placed on process visibility, such as MAC, new clauses will be placed above the return(0). Obtained from: TrustedBSD Project	2001-03-29 22:59:44 +00:00
Robert Watson	ed6397209d	o introduce u_cansee(), which performs access control checks between two subject ucreds. Unlike p_cansee(), u_cansee() doesn't have process lock requirements, only valid ucred reference requirements, so is preferred as process locking improves. For now, back p_cansee() into u_cansee(), but eventually p_cansee() will go away. Reviewed by: jhb, tmm Obtained from: TrustedBSD Project	2001-03-28 20:50:15 +00:00
John Baldwin	026e76f43e	Close a race condition where if we were obtaining a sleep lock and no spin locks were held, we could be preempted and switch CPU's in between the time that we set a variable to the list of spin locks on our CPU and the time that we checked that variable to ensure no spinlocks were held while grabbing a sleep lock. Losing the race resulted in checking some other CPU's spin lock list and bogusly panicking.	2001-03-28 16:11:51 +00:00
John Baldwin	f7012f592a	- s/mutexes/locks/g in appropriate comments. - Rename the 'show mutexes' ddb command to 'show locks' since it shows a list of all the lock objects held by the current process.	2001-03-28 12:39:40 +00:00
John Baldwin	1005a129e5	Convert the allproc and proctree locks from lockmgr locks to sx locks.	2001-03-28 11:52:56 +00:00
John Baldwin	c739adbf42	Pass in a pointer to the mutex's lock_object as the second argument to WITNESS_SLEEP() rather than the mutex itself.	2001-03-28 10:41:15 +00:00
John Baldwin	f34fa851e0	Catch up to header include changes: - <sys/mutex.h> now requires <sys/systm.h> - <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>	2001-03-28 09:17:56 +00:00
John Baldwin	192846463a	Rework the witness code to work with sx locks as well as mutexes. - Introduce lock classes and lock objects. Each lock class specifies a name and set of flags (or properties) shared by all locks of a given type. Currently there are three lock classes: spin mutexes, sleep mutexes, and sx locks. A lock object specifies properties of an additional lock along with a lock name and all of the extra stuff needed to make witness work with a given lock. This abstract lock stuff is defined in sys/lock.h. The lockmgr constants, types, and prototypes have been moved to sys/lockmgr.h. For temporary backwards compatibility, sys/lock.h includes sys/lockmgr.h. - Replace proc->p_spinlocks with a per-CPU list, PCPU(spinlocks), of spin locks held. By making this per-cpu, we do not have to jump through magic hoops to deal with sched_lock changing ownership during context switches. - Replace proc->p_heldmtx, formerly a list of held sleep mutexes, with proc->p_sleeplocks, which is a list of held sleep locks including sleep mutexes and sx locks. - Add helper macros for logging lock events via the KTR_LOCK KTR logging level so that the log messages are consistent. - Add some new flags that can be passed to mtx_init(): - MTX_NOWITNESS - specifies that this lock should be ignored by witness. This is used for the mutex that blocks a sx lock for example. - MTX_QUIET - this is not new, but you can pass this to mtx_init() now and no events will be logged for this lock, so that one doesn't have to change all the individual mtx_lock/unlock() operations. - All lock objects maintain an initialized flag. Use this flag to export a mtx_initialized() macro that can be safely called from drivers. Also, we no longer walk the all_mtx list if MUTEX_DEBUG is defined as witness performs the corresponding checks using the initialized flag. - The lock order reversal messages have been improved to output slightly more accurate file and line numbers.	2001-03-28 09:03:24 +00:00
John Baldwin	c31146a14e	- Resort some includes to deal with the new witness code coming in shortly. - Make sure we have Giant locked before calling coredump() in sigexit(). Spotted by: peter (2)	2001-03-28 08:41:04 +00:00
John Baldwin	486b8ac04a	Don't explicitly zero p_intr_nesting_level and p_aioinfo in fork.	2001-03-28 03:14:14 +00:00
John Baldwin	0006681fe6	Switch from save/disable/restore_intr() to critical_enter/exit().	2001-03-28 03:06:10 +00:00
John Baldwin	b944b9033a	Catch up to the mtx_saveintr -> mtx_savecrit change.	2001-03-28 02:46:21 +00:00
John Baldwin	35a472461a	Use mtx_intr_enable() on sched_lock to ensure child processes always start with interrupts enabled rather than calling the no-longer MI function enable_intr(). This is bogus anyways and in theory shouldn't even be needed.	2001-03-28 02:44:11 +00:00
John Baldwin	6283b7d01b	- Switch from using save/disable/restore_intr to using critical_enter/exit and change the u_int mtx_saveintr member of struct mtx to a critical_t mtx_savecrit. - On the alpha we no longer need a custom _get_spin_lock() macro to avoid an extra PAL call, so remove it. - Partially fix using mutexes with WITNESS in modules. Change all the _mtx_{un,}lock_{spin,}_flags() macros to accept explicit file and line parameters and rename them to use a prefix of two underscores. Inside of kern_mutex.c, generate wrapper functions for _mtx_{un,}lock_{spin,}_flags() (only using a prefix of one underscore) that are called from modules. The macros mtx_{un,}lock_{spin,}_flags() are mapped to the __mtx_* macros inside of the kernel to inline the usual case of mutex operations and map to the internal _mtx_* functions in the module case so that modules will use WITNESS and KTR logging if the kernel is compiled with support for it.	2001-03-28 02:40:47 +00:00
Paul Saab	6b8b8c7fdc	Last commit was broken.. It always prints '[CTRL-C to abort]'. Move duplicate code for printing the status of the dump and checking for abort into a separate function. Pointy hat to: me	2001-03-28 01:37:29 +00:00
David Malone	d1cadeb02a	Don't leak the memory we've just malloced if we can't find the process we're looking for. (I don't think this can currently happen, but it depends how the function is called). PR: 25932 Submitted by: David Xu <davidx@viasoft.com.cn>	2001-03-27 20:49:51 +00:00
Yaroslav Tykhiy	ae5fa19aa9	Make cblock_alloc_cblocks() spell its own name correctly in its warning message. PR: kern/7693	2001-03-27 10:21:26 +00:00
Kenneth D. Merry	3393f8daa3	Rewrite of the CAM error recovery code. Some of the major changes include: - The SCSI error handling portion of cam_periph_error() has been broken out into a number of subfunctions to better modularize the code that handles the hierarchy of SCSI errors. As a result, the code is now much easier to read. - String handling and error printing has been significantly revamped. We now use sbufs to do string formatting instead of using printfs (for the kernel) and snprintf/strncat (for userland) as before. There is a new catchall error printing routine, cam_error_print() and its string-based counterpart, cam_error_string() that allow the kernel and userland applications to pass in a CCB and have errors printed out properly, whether or not they're SCSI errors. Among other things, this helped eliminate a fair amount of duplicate code in camcontrol. We now print out more information than before, including the CAM status and SCSI status and the error recovery action taken to remedy the problem. - sbufs are now available in userland, via libsbuf. This change was necessary since most of the error printing code is shared between libcam and the kernel. - A new transfer settings interface is included in this checkin. This code is #ifdef'ed out, and is primarily intended to aid discussion with HBA driver authors on the final form the interface should take. There is example code in the ahc(4) driver that implements the HBA driver side of the new interface. The new transfer settings code won't be enabled until we're ready to switch all HBA drivers over to the new interface. src/Makefile.inc1, lib/Makefile: Add libsbuf. It must be built before libcam, since libcam uses sbuf routines. libcam/Makefile: libcam now depends on libsbuf. libsbuf/Makefile: Add a makefile for libsbuf. This pulls in the sbuf sources from sys/kern. bsd.libnames.mk: Add LIBSBUF. camcontrol/Makefile: Add -lsbuf. 
Since camcontrol is statically linked, we can't depend on the dynamic linker to pull in libsbuf. camcontrol.c: Use cam_error_print() instead of checking for CAM_SCSI_STATUS_ERROR on every failed CCB. sbuf.9: Change the prototypes for sbuf_cat() and sbuf_cpy() so that the source string is now a const char *. This is more in line with the standard system string functions, and helps eliminate warnings when dealing with a const source buffer. Fix a typo. cam.c: Add description strings for the various CAM error status values, as well as routines to look up those strings. Add new cam_error_string() and cam_error_print() routines for userland and the kernel. cam.h: Add a new CAM flag, CAM_RETRY_SELTO. Add enumerated types for the various options available with cam_error_print() and cam_error_string(). cam_ccb.h: Add new transfer negotiation structures/types. Change inq_len in the ccb_getdev structure to be "reserved". This field has never been filled in, and will be removed when we next bump the CAM version. cam_debug.h: Fix typo. cam_periph.c: Modularize cam_periph_error(). The SCSI error handling part of cam_periph_error() is now in camperiphscsistatuserror() and camperiphscsisenseerror(). In cam_periph_lock(), increase the reference count on the periph while we wait for our lock attempt to succeed so that the periph won't go away while we're sleeping. cam_xpt.c: Add new transfer negotiation code. (ifdefed out) Add a new function, xpt_path_string(). This is a string/sbuf analog to xpt_print_path(). scsi_all.c: Revamp string handling and error printing code. We now use sbufs for much of the string formatting code. More of that code is shared between userland and the kernel. scsi_all.h: Get rid of SS_TURSTART, it wasn't terribly useful in the first place. Add a new error action, SS_REQSENSE. (Send a request sense and then retry the command.) This is useful when the controller hasn't performed autosense for some reason. Change the default actions around a bit. 
scsi_cd.c, scsi_da.c, scsi_pt.c, scsi_ses.c: SF_RETRY_SELTO -> CAM_RETRY_SELTO. Selection timeouts shouldn't be covered by a sense flag. scsi_pass.[ch]: SF_RETRY_SELTO -> CAM_RETRY_SELTO. Get rid of the last vestiges of a read/write interface. libkern/bsearch.c, sys/libkern.h, conf/files: Add bsearch.c, which is needed for some of the new table lookup routines. aic7xxx_freebsd.c: Define AHC_NEW_TRAN_SETTINGS if CAM_NEW_TRAN_CODE is defined. sbuf.h, subr_sbuf.c: Add the appropriate #ifdefs so sbufs can compile and run in userland. Change sbuf_printf() to use vsnprintf() instead of kvprintf(), which is only available in the kernel. Change the source string for sbuf_cpy() and sbuf_cat() to be a const char *. Add __BEGIN_DECLS and __END_DECLS around function prototypes since they're now exported to userland. kdump/mkioctls: Include stdio.h before cam.h since cam.h now includes a function with a FILE * argument. Submitted by: gibbs (mostly) Reviewed by: jdp, marcel (libsbuf makefile changes) Reviewed by: des (sbuf changes) Reviewed by: ken	2001-03-27 05:45:52 +00:00
Boris Popov	602ef63172	Previous commit broke interlock locking for !LK_RETRY case.	2001-03-26 12:45:35 +00:00
Poul-Henning Kamp	f83880518b	Send the remains (such as I have located) of "block major numbers" to the bit-bucket.	2001-03-26 12:41:29 +00:00
Boris Popov	71d8277b51	Prevent race condition by using msleep() instead of mtx_unlock()/tsleep(). Reviewed by: alfred	2001-03-26 03:10:07 +00:00
Bosko Milekic	2ba1a89559	Move the atomic() mbstat.m_drops incrementing to the MGET(HDR) and MCLGET macros in order to avoid incrementing the drop count twice. Otherwise, in some cases, we may increment m_drops once in m_mballoc() for example, and increment it again in m_mballoc_wait() if the wait fails.	2001-03-24 23:47:52 +00:00
John Baldwin	1f723035c8	Use (..., "%s", foo) instead of (..., foo) to avoid a warning about a non-constant format string when calling kthread_create() to create an ithread.	2001-03-24 06:26:47 +00:00
Peter Wemm	32e479705a	This is kind of a hack, but it should work. Currently, world is broken because libc/rpc/key_call.c references uname(), and ps/print.c also defines uname(), and ps is linked statically. This leads to a symbol clash. The userland uname(3) kinda sucked anyway as the hostname etc was too short. And since the libc rpc interface now uses the utsname.nodename which gets truncated, I was tempted into doing something about it. Create a new userland uname function, called __xuname() which takes an extra argument that allows you to change the size of the fields. uname() becomes a static inline function in sys/utsname.h that passes the extra argument in. struct utsname has its field members expanded by default now in userland. We still provide a 'uname' externally linkable function for things that either think that they ``know'' the utsname format and assume 32 character strings and bypass the include file, or objects that are linked against old libcs. ie: just about every plausible case that I can think of is covered. Should we ever change the default lengths again, a libc major bump should not be required as the size is now passed to the function. XXX the uname(2) in the kernel is for FreeBSD 1.1 binary compatibility! All the uname(3) functions that are exported to userland are actually implemented in libc with sysctl. uname(1) uses sysctl directly and does not call uname(3). PR: bin/4688	2001-03-24 04:40:49 +00:00
John Baldwin	bae3a80b16	Just use the proc lock to protect read accesses to p_pptr rather than the more expensive proctree lock.	2001-03-24 04:00:01 +00:00
John Baldwin	8d2725181a	Protect p_wmesg and p_wchan with sched_lock while checking for deadlocks with other byte range file locks.	2001-03-24 03:57:44 +00:00
Alfred Perlstein	ec4dff5e50	replace calls to non-existent bail() subroutine with calls to the die() builtin function.	2001-03-23 11:48:50 +00:00
Boris Popov	a91f68bca6	o Actually extract version of interface and store it along with the name. o Add new parameter to the modlist_lookup() function to perform lookups with strict version matching. o Collapse duplicate code to function(s).	2001-03-22 08:58:45 +00:00
Boris Popov	303b15f193	Slightly reorganize code in the linker_load_dependancies() function to make codepath more straightforward.	2001-03-22 07:55:33 +00:00
Boris Popov	804f27299d	Remove support for old way of handling module dependencies. Approved by: peter	2001-03-22 07:14:42 +00:00
Poul-Henning Kamp	71d033119f	Make the pseudo-driver for "/dev/fd/*" handle fd's larger than 255. PR: 25936	2001-03-20 13:26:13 +00:00
Poul-Henning Kamp	15b6f00fd1	Add a KASSERT on unit2minor() so that we catch it if people try to pass us unit numbers which don't fit in 24 bits.	2001-03-20 13:24:24 +00:00
Bruce Evans	0abc15fd0b	Fixed breakage of access() in rev.1.164. Wrong credentials were used for the final path component.	2001-03-20 09:38:05 +00:00
Peter Wemm	439fea92c2	Use the same API as the example code. Allow the initial hash value to be passed in, as the examples do. Incrementally hash in the dvp->v_id (using the official api) rather than add it. This seems to help power-of-two predictable filename trees where the filenames repeat on a power-of-two cycle and the directory trees have power-of-two components in it. The simple add then mask was causing things like 12000+ entry collision chains while most other entries have between 0 and 3 entries each. This way seems to improve things.	2001-03-20 02:10:18 +00:00
Robert Watson	231b9e916a	o Rename "namespace" argument to "attrnamespace" as namespace is a C++ reserved word. Part 2 of syscalls.master commit to catch rebuilt files. Submitted by: jkh Obtained from: TrustedBSD Project	2001-03-19 05:48:58 +00:00
Robert Watson	3063207147	o Rename "namespace" argument to "attrnamespace" as namespace is a C++ reserved word. Submitted by: jkh Obtained from: TrustedBSD Project	2001-03-19 05:44:15 +00:00
Bosko Milekic	9612101e4c	Fix a couple of things in the internal mbuf allocation interface: - Make sure that m_mballoc() really doesn't allow over nmbufs mbufs to be allocated from mb_map. In the case where nmbufs-reserved space is not an exact multiple of PAGE_SIZE (which it should be, but anyway...), we hold nmbufs as an absolute maximum which need not ever be reached. - Clean up m_clalloc(); make it more consistent in the sense that the first argument `ncl' really means "the number of clusters ensured to be allocated" and not "the number of pages worth of clusters to be allocated," as was previously the case. This also makes it consistent with m_mballoc() as well as the comment that precedes it. Reviewed by: jlemon	2001-03-17 23:23:24 +00:00
Peter Wemm	6eb39ac8fc	Use a generic implementation of the Fowler/Noll/Vo hash (FNV hash). Make the name cache hash as well as the nfsnode hash use it. As a special tweak, create an unsigned version of register_t. This allows us to use a special tweak for the 64 bit versions that significantly speeds up the i386 version (ie: int64 XOR int64 is slower than int64 XOR int32). The code layout is a little strange for the string function, but I was able to get between 5 to 10% improvement over the original version I started with. The layout affects gcc code generation choices and this way was fastest on x86 and alpha. Note that 'CPUTYPE=p3' etc makes a fair difference to this. It is around 45% faster with -march=pentiumpro on a p6 cpu.	2001-03-17 09:31:06 +00:00
Jonathan Lemon	4d286823c5	When doing a recv(.. MSG_WAITALL) for a message which is larger than the socket buffer size, the receive is done in sections. After completing a read, call pru_rcvd on the underlying protocol before blocking again. This allows the protocol to take appropriate action, such as sending a TCP window update to the peer, if the window happened to close because the socket buffer was filled. If the protocol is not notified, a TCP transfer may stall until the remote end sends a window probe.	2001-03-16 22:37:06 +00:00
Peter Wemm	50e2347e68	Kill the 4MB kernel limit dead. [I hope :-)]. For UP, we were using $tmp_stk as a stack from the data section. If the kernel text section grew beyond ~3MB, the data section would be pushed beyond the temporary 4MB P==V mapping. This would cause the trampoline up to high memory to fault. The hack workaround I did was to use all of the page table pages that we already have while preparing the initial P==V mapping, instead of just the first one. For SMP, the AP bootstrap process suffered the same sort of problem and got the same treatment. MFC candidate - this breaks on 4.x just the same.. Thanks to: Richard Todd <rmtodd@ichotolot.servalan.com>	2001-03-15 05:10:06 +00:00
Peter Wemm	6fe01250f4	Jake essentially rewrote this. It is not by any stretch of the imagination a derivative of what I did before.	2001-03-15 05:02:08 +00:00
Peter Wemm	043cc5a602	Regenerate after rwatson's commit to syscalls.master (rev 1.85)	2001-03-15 04:43:57 +00:00
Robert Watson	70f3685105	o Change the API and ABI of the Extended Attribute kernel interfaces to introduce a new argument, "namespace", rather than relying on a first- character namespace indicator. This is in line with more recent thinking on EA interfaces on various mailing lists, including the posix1e, Linux acl-devel, and trustedbsd-discuss forums. Two namespaces are defined by default, EXTATTR_NAMESPACE_SYSTEM and EXTATTR_NAMESPACE_USER, where the primary distinction lies in the access control model: user EAs are accessible based on the normal MAC and DAC file/directory protections, and system attributes are limited to kernel-originated or appropriately privileged userland requests. o These API changes occur at several levels: the namespace argument is introduced in the extattr_{get,set}_file() system call interfaces, at the vnode operation level in the vop_{get,set}extattr() interfaces, and in the UFS extended attribute implementation. Changes are also introduced in the VFS extattrctl() interface (system call, VFS, and UFS implementation), where the arguments are modified to include a namespace field, as well as modified to avoid direct access to userspace variables from below the VFS layer (in the style of recent changes to mount by adrian@FreeBSD.org). This required some cleanup and bug fixing regarding VFS locks and the VFS interface, as a vnode pointer may now be optionally submitted to the VFS_EXTATTRCTL() call. Updated documentation for the VFS interface will be committed shortly. o In the near future, the auto-starting feature will be updated to search two sub-directories to the ".attribute" directory in appropriate file systems: "user" and "system" to locate attributes intended for those namespaces, as the single filename is no longer sufficient to indicate what namespace the attribute is intended for. Until this is committed, all attributes auto-started by UFS will be placed in the EXTATTR_NAMESPACE_SYSTEM namespace. 
o The default POSIX.1e attribute names for ACLs and Capabilities have been updated to no longer include the '$' in their filename. As such, if you're using these features, you'll need to rename the attribute backing files to the same names without '$' symbols in front. o Note that these changes will require changes in userland, which will be committed shortly. These include modifications to the extended attribute utilities, as well as to libutil for new namespace string conversion routines. Once the matching userland changes are committed, a buildworld is recommended to update all the necessary include files and verify that the kernel and userland environments are in sync. Note: If you do not use extended attributes (most people won't), upgrading is not imperative although since the system call API has changed, the new userland extended attribute code will no longer compile with old include files. o Couple of minor cleanups while I'm there: make more code compilation conditional on FFS_EXTATTR, which should recover a bit of space on kernels running without EA's, as well as update copyright dates. Obtained from: TrustedBSD Project	2001-03-15 02:54:29 +00:00
Søren Schmidt	b417a1a8c8	Don't call device close and ioctl functions if device has disappeared. Reviewed by: phk	2001-03-13 08:45:05 +00:00
Dag-Erling Smørgrav	9cbd039343	Assert that the process we're trying to enqueue isn't already there.	2001-03-11 18:57:30 +00:00
Alan Cox	136446540a	When aio_read/write() is used on a raw device, physical buffers are used for up to "vfs.aio.max_buf_aio" of the requests. If a request size is MAXPHYS, but the request base isn't page aligned, vmapbuf() will map the end of the user space buffer into the start of the kva allocated for the next physical buffer. Don't use a physical buffer in this case. (This change addresses problem report 25617.) When an aio_read/write() on a raw device has completed, timeout() is used to schedule a signal to the process. Thus, the reporting is delayed up to 10 ms (assuming hz is 100). The process might have terminated in the meantime, causing a trap 12 when attempting to deliver the signal. Thus, the timeout must be cancelled when removing the job. aio jobs in state JOBST_JOBQGLOBAL should be removed from the kaio_jobqueue list during process rundown. During process rundown, some aio jobs might move from one list to a different list that has already been "emptied", causing the rundown to be incomplete. Retry the rundown. A call to BUF_KERNPROC() is needed after obtaining a physical buffer to disassociate the lock from the running process since it can return to userland without releasing that lock. PR: 25617 Submitted by: tegge	2001-03-10 22:47:57 +00:00
Alfred Perlstein	9708152c20	Don't call malloc with M_WAITOK while holding a mutex.	2001-03-09 18:40:34 +00:00
Jonathan Lemon	c0647e0d07	Push the test for a disconnected socket when accept()ing down to the protocol layer. Not all protocols behave identically. This fixes the brokenness observed with unix-domain sockets (and postfix)	2001-03-09 08:16:40 +00:00
John Baldwin	5db078a9be	Fix mtx_legal2block. The only time that it is bad to block on a mutex is if we hold a spin mutex, since we can trivially get into deadlocks if we start switching out of processes that hold spinlocks. Checking to see if interrupts were disabled was a sort of cheap way of doing this since most of the time interrupts were only disabled when holding a spin lock. At least on the i386. To fix this properly, use a per-process counter p_spinlocks that counts the number of spin locks currently held, and instead of checking to see if interrupts are disabled in the witness code, check to see if we hold any spin locks. Since child processes always start up with the sched lock magically held in fork_exit(), we initialize p_spinlocks to 1 for child processes. Note that proc0 doesn't go through fork_exit(), so it starts with no spin locks held. Consulting from: cp	2001-03-09 07:24:17 +00:00
Alan Cox	c9a970a79f	Use the kthread API to create and destroy AIO daemons. Submitted by: jhb	2001-03-09 06:27:01 +00:00
John Baldwin	3a3f608288	Add a new informative KASSERT to ensure that a process is in the SRUN state before we return it to cpu_switch().	2001-03-09 03:59:50 +00:00
Bosko Milekic	4bde2ac539	Fix a race condition similar to the one that existed in the mbuf code. When we go into an interruptible sleep and we increment a sleep count, we make sure that we are the thread that will decrement the count when we wake up. Otherwise, what happens is that if we get interrupted (signal) and we have to wake up, but before we get our mutex, some thread that wants to wake us up detects that the count is non-zero and so enters wakeup_one(), but there's nothing on the sleep queue and so we don't get woken up. The thread will still decrement the sleep count, which is bad because we will also decrement it again later (as we got interrupted) and are already off the sleep queue.	2001-03-08 19:21:45 +00:00
David Malone	2239c07de9	Make the wait for sendfile buffers interruptible. Stops one process consuming them all and then getting stuck. Reviewed by: dg Reviewed by: bmilekic Observed by: Andreas Persson <pap@garen.net>	2001-03-08 16:28:10 +00:00
Thomas Moestl	3a51557243	Make the SYSCTL_OUT handlers sysctl_old_user() and sysctl_old_kernel() more robust. They would correctly return ENOMEM for the first time when the buffer was exhausted, but subsequent calls in this case could cause writes outside of the buffer bounds. Approved by: rwatson	2001-03-08 01:20:43 +00:00
Kirk McKusick	589c7af992	Fixes to track snapshot copy-on-write checking in the specinfo structure rather than assuming that the device vnode would reside in the FFS filesystem (which is obviously a broken assumption with the device filesystem).	2001-03-07 07:09:55 +00:00
Kirk McKusick	393d77ffad	Bitch more loudly when someone botches changes to kinfo_proc in the hopes that they will actually read the comment above it and follow the instructions so as to cause all the rest of us a lot less grief.	2001-03-07 06:52:12 +00:00
John Baldwin	5641ae5dc3	- Don't hold the proc lock across VREF and the fd* functions to avoid lock order reversals. - Add some preliminary locking in the !RF_PROC case. - Protect p_estcpu with sched_lock.	2001-03-07 05:21:47 +00:00
John Baldwin	f227364a17	- Release Giant a bit earlier on syscall exit. - Don't try to grab Giant before postsig() in userret() as it is no longer needed. - Don't grab Giant before psignal() in ast() but get the proc lock instead.	2001-03-07 03:53:39 +00:00
John Baldwin	19eb87d22a	Grab the process lock while calling psignal and before calling psignal.	2001-03-07 03:37:06 +00:00
John Baldwin	15e9ec5153	Proc locking including using proc lock in place of proctree where appropriate and locking processes while we signal them.	2001-03-07 03:28:50 +00:00
John Baldwin	e65897c381	Proc locking.	2001-03-07 03:27:32 +00:00
John Baldwin	28aa95b6ee	Use the proc lock to protect access to p_sigacts->ps_sigintr.	2001-03-07 03:26:39 +00:00
John Baldwin	731a1aea4c	- Proc locking. - Remove some unneeded spl()'s.	2001-03-07 03:06:18 +00:00
John Baldwin	378240232a	Lock the process while sending it SIGALRM and updating p_realtimer.	2001-03-07 03:02:56 +00:00
John Baldwin	eed4805444	- Proc locking. - Remove unneeded spl()'s.	2001-03-07 03:01:53 +00:00
John Baldwin	628d2653d6	- Proc locking. Most of signal handling is now MP safe and doesn't require Giant. The only exception is the CANSIGNAL() macro. Unlocking the proc lock around sendsig() in trapsignal() is also questionable. Note that the functions sigexit(), psignal(), and issignal() must be called with the proc lock of the process in question held. postsig() and trapsignal() should not be called with the proc lock held, but they also do not require Giant anymore either. - Remove spl's that are now no longer needed as they are fully replaced.	2001-03-07 02:59:54 +00:00
John Baldwin	87729a2b64	Lock initproc when we send SIGINT to init during shutdown.	2001-03-07 02:50:09 +00:00
John Baldwin	1b43703b47	- Add an extra check in priority_propagation() for UP systems to ensure we don't end up back at ourselves which would indicate deadlock. - Add the proc lock to the witness dup_list as we may hold more than one process lock at a time. - Don't assert a mutex is owned in _mtx_unlock_sleep() as that is too late. We do the checks in the macros instead.	2001-03-07 02:45:15 +00:00
John Baldwin	6451855f6d	- Use _PHOLD and move it before a PROC_UNLOCK to reduce the number of mutex operations in kthread_create(). - Lock a kthread's proc before changing its parent via proc_reparent(). - Test P_KTHREAD not P_SYSTEM in kthread_suspend() and kthread_resume(). P_SYSTEM just means that the process shouldn't be swapped and is used for vinum's daemon for example. - Lock all the signal state used for suspending and resuming kthreads with the proc lock.	2001-03-07 02:36:47 +00:00
John Baldwin	57934cd3c8	- Lock the forklist with an sx lock. - Add proc locking to fork1(). Always lock the child process (new process) first when both processes need to be locked at the same time. - Remove unneeded spl()'s as the data they protected is now locked. - Ensure that the proctree is exclusively locked and the new process is locked when setting up the parent process pointer. - Lock the check for P_KTHREAD in p_flag in fork_exit().	2001-03-07 02:30:39 +00:00
John Baldwin	2aa33d2f1e	Check to see if p_fd is NULL before dereferencing it in checkdirs(). It's possible for us to see a process in the early stages of fork before p_fd has been initialized. Ideally, we wouldn't stick a process on the allproc list until it was fully created however.	2001-03-07 02:25:13 +00:00
John Baldwin	c65437a326	- Call proc_reparent() when handing a process off to init in exit rather than dinking around in the process lists explicitly. - Hold both the proctree lock and proc lock of the child process when reparenting a process via proc_reparent. - Lock processes while sending them signals. - Miscellaneous proc locking. - proc_reparent() now asserts that the child is locked in addition to an exclusive proctree lock.	2001-03-07 02:22:31 +00:00
John Baldwin	7331c2a252	In order to avoid recursing on the backing mutex for sx locks in the INVARIANTS case, define the actual KASSERT() in _SX_ASSERT_[SX]LOCKED macros that are used in the sx code itself and convert the SX_ASSERT_[SX]LOCKED macros to simple wrappers that grab the mutex for the duration of the check.	2001-03-06 23:13:15 +00:00
Dag-Erling Smørgrav	cab5b963a0	Make the KASSERTs report the correct function names. Fix two off-by-one errors that would sometimes cause the final length of the sbuf to include the trailing zero.	2001-03-06 17:48:26 +00:00
Robert Watson	5293465fef	o Introduce filesystem-independent POSIX.1e ACL utility routines to support implementations of ACLs in file systems. Introduce the following new functions: vaccess_acl_posix1e(), a vaccess() that accepts an ACL; acl_posix1e_mode_to_perm(), convert mode bits to ACL rights; acl_posix1e_mode_to_entry(), build ACL entry from mode/uid/gid; acl_posix1e_perms_to_mode(), generate file mode from ACL; acl_posix1e_check(), syntax verification for ACL. These functions allow a file system to rely on central ACL evaluation and syntax checking, as well as providing useful utilities to allow ACL-based file systems to generate mode/owner/etc information to return via VOP_GETATTR(), and to support file systems that split their ACL information over their existing inode storage (mode, uid, gid) and extended ACL into extended attributes (additional users, groups, ACL mask). o Add prototypes for exported functions to sys/acl.h, sys/vnode.h Reviewed by: trustedbsd-discuss, freebsd-arch Obtained from: TrustedBSD Project	2001-03-06 17:28:24 +00:00
Alan Cox	9c8a2647f6	Add a missing splx() to aio_fphysio(). (This change is a no-op in -5.0, but potentially significant in -4.x.) Eliminate a pointless parameter to aio_fphysio(). Remove unnecessary casts from aio_fphysio() and aio_physwakeup().	2001-03-06 15:54:38 +00:00
Bosko Milekic	af76144992	- Add sx_descr description member to sx lock structure - Add sx_xholder member to sx struct which is used for INVARIANTS-enabled assertions. It indicates the thread that presently owns the xlock. - Add some assertions to the sx lock code that will detect the fatal API abuse cases xlock --> xlock and xlock --> slock, which now works thanks to sx_xholder. Notice that the remaining two problematic cases, slock --> xlock and slock --> slock (a little less problematic, but still recursion), will need to be handled by witness eventually, as they are more involved. Reviewed by: jhb, jake, jasone	2001-03-06 06:17:05 +00:00
Jason Evans	6281b30a73	Implement shared/exclusive locks. Reviewed by: bmilekic, jake, jhb	2001-03-05 19:59:41 +00:00
Alan Cox	88ed460e6b	Eliminate the aio_freejobs list. Its purpose was to store free aiocb's allocated by zalloc(). In other words, zfree() was never called. Now, we call zfree(). Why eliminate this micro-optimization? At some later point, when we multithread the AIO system, we would need a mutex to synchronize access to aio_freejobs, making its use nearly indistinguishable in cost from zalloc() and zfree(). Remove unnecessary fhold() and fdrop() calls from aio_qphysio(), undo'ing a part of revision 1.86. The reference count on the file structure is already incremented by _aio_aqueue() before it calls aio_qphysio(). (Update the comments to document this fact.) Remove unnecessary casts from _aio_aqueue(), aio_read(), aio_write() and aio_waitcomplete(). Remove an unnecessary "return;" from aio_process(). Add "static" in various places.	2001-03-05 01:30:23 +00:00
David E. O'Brien	828c9e13a3	Do not set a default ELF syscall ABI fallback. If one runs an un-branded Linux static binary that calls Linux's fcntl the machine will reboot when interpreted by the FreeBSD syscall ABI.	2001-03-04 11:58:50 +00:00
Assar Westerlund	3617ddfc33	implement OCRNL, ONOCR, and ONLRET Obtained from: NetBSD	2001-03-04 06:04:50 +00:00
Alan Cox	fb579e9a61	Remove the field privatemodes from struct __aiocb_private and the related code from aio_read() and aio_write(). This field was intended, but never used, to allow a mythical user-level library to make an aio_read() or aio_write() behave like an ordinary read() or write(), i.e., a blocking I/O operation.	2001-03-04 01:22:23 +00:00
Adrian Chadd	fbedc11796	Mismatched MFSNAMELEN and MNAMELEN with fstype / fspath. Submitted by: Naoki Kobayashi <shibata@geo.titech.ac.jp>	2001-03-02 14:05:49 +00:00
John Baldwin	003fb9ec2f	Ok, the kernel will panic in kmem_malloc() if the kernel map is full, so malloc with M_WAITOK can't actually return NULL. I wish I could get two people to give me the same answer about this when I ask... Submitted by: jake	2001-03-02 06:07:38 +00:00
John Baldwin	653dd8c243	- Check to see if malloc() returned NULL even with M_WAITOK. - Add a KASSERT() to ensure an ithread has a backing kernel thread when we schedule it. - Don't attempt to preemptively switch to an ithread if p_stat of curproc is not SRUN.	2001-03-02 05:33:03 +00:00
Adrian Chadd	f3a90da995	Reviewed by: jlemon An initial tidyup of the mount() syscall and VFS mount code. This code replaces the earlier work done by jlemon in an attempt to make linux_mount() work. * the guts of the mount work has been moved into vfs_mount(). * move `type', `path' and `flags' from being userland variables into being kernel variables in vfs_mount(). `data' remains a pointer into userspace. * Attempt to verify the `type' and `path' strings passed to vfs_mount() aren't too long. * rework mount() and linux_mount() to take the userland parameters (besides data, as mentioned) and pass kernel variables to vfs_mount(). (linux_mount() already did this, I've just tidied it up a little more.) * remove the copyin() stuff for `path'. `data' still requires copyin() since it's a pointer into userland. * set `mount->mnt_statf_mntonname' in vfs_mount() rather than in each filesystem. This variable is generally initialised with `path', and each filesystem can override it if they want to. * NOTE: f_mntonname is initialised with "/" in the case of a root mount.	2001-03-01 21:00:17 +00:00
Ian Dowse	a90ef2ae0f	The kernel did not hold a vnode reference associated with the `rootvnode' pointer, but vfs_syscalls.c's checkdirs() assumed that it did. This bug reliably caused a panic at reboot time if any filesystem had been mounted directly over /. The checkdirs() function is called at mount time to find any process fd_cdir or fd_rdir pointers referencing the covered mountpoint vnode. It transfers these to point at the root of the new filesystem. However, this process was not reversed at unmount time, so processes with a cwd/root at a mount point would unexpectedly lose their cwd/root following a mount-unmount cycle at that mountpoint. This change should fix both of the above issues. Start_init() now holds an extra vnode reference corresponding to `rootvnode', and dounmount() releases this reference when the root filesystem is unmounted just before reboot. Dounmount() now undoes the actions taken by checkdirs() at mount time; any process cdir/rdir pointers that reference the root vnode of the unmounted filesystem are transferred to the now-uncovered vnode. Reviewed by: bde, phk	2001-02-28 20:54:28 +00:00
Julian Elischer	a96dcd84d2	Shuffle netgraph mutexes a bit and hold a reference on a node from the function that is calling the destructor.	2001-02-28 18:49:09 +00:00
Matthew Dillon	63692125a9	Fix lockup for loopback NFS mounts. The pipelined I/O limitations could be hit on the client side and prevent the server side from retiring writes. Pipeline operations turned off for all READs (no big loss since reads are usually synchronous) and for NFS writes, and left on for the default bwrite(). (MFC expected prior to 4.3 freeze) Testing by: mjacob, dillon	2001-02-28 04:13:11 +00:00
Jake Burkholder	5b270b2a55	Sigh. Try to get priorities sorted out. Don't bother trying to update native priority, it is difficult to get right and likely to end up horribly wrong. Use an honestly wrong fixed value that seems to work; PUSER for user threads, and the interrupt priority for ithreads. Set it once when the process is created and forget about it. Suggested by: bde Pointy hat: me	2001-02-28 02:53:44 +00:00
Jonathan Lemon	ea0237ed11	Correctly declare variables as u_int rather than doing typecasts. Kill some register declarations while I'm here. Submitted by: bde (1)	2001-02-27 15:11:31 +00:00
Ruslan Ermilov	8ac6dca795	In soshutdown(), use SHUT_{RD,WR,RDWR} instead of FREAD and FWRITE. Also, return EINVAL if `how' is invalid, as required by POSIX spec.	2001-02-27 13:48:07 +00:00
Jonathan Lemon	0b7088c4d0	Cast nfds to u_int before range checking it in order to catch negative values. PR: 25393	2001-02-27 00:50:20 +00:00
Jake Burkholder	be15bfc091	Initialize native priority to PRI_MAX. It was usually 0 which made a process's priority go through the roof when it released a (contested) mutex. Only set the native priority in mtx_lock if hasn't already been set. Reviewed by: jhb	2001-02-26 23:27:35 +00:00
Jake Burkholder	a10f496636	Remove brackets around variables in a function that used to be a macro.	2001-02-25 16:18:13 +00:00
Peter Wemm	d6df01d823	Make this compile in a.out mode. link.h has extra dependencies for a.out.	2001-02-25 07:26:54 +00:00
Peter Wemm	1a5f13cfbf	Manually add an extra _ to _DYNAMIC since it is provided by ld, not gcc. Make the rest compile.	2001-02-25 07:25:05 +00:00
Bosko Milekic	096e2dd9d8	Remove superfluous m_pkthdr.rcv_if = NULL assignment following m_gethdr() mbuf allocation, which already does this for us.	2001-02-25 06:33:50 +00:00
Julian Elischer	7433466190	Move netgraph spinlock order entries out of the #ifdef SMP section. They need to be there for UP too.	2001-02-25 04:56:23 +00:00
Jake Burkholder	631d7bf3da	- Rename the lcall system call handler from Xsyscall to Xlcall_syscall to be more like Xint0x80_syscall and less like c function syscall(). - Reduce code duplication between the int0x80 and lcall handlers by shuffling the eflags into the right place, saving the sizeof the instruction in tf_err and jumping into the common int0x80 code. Reviewed by: peter	2001-02-25 02:53:06 +00:00
David E. O'Brien	21a3ee0ead	MFS: bring the consistent `compat_3_brand' support into -CURRENT (the work was first done in the RELENG_4 branch near a release during a MFC to make the code cleaner and more consistent)	2001-02-24 22:20:11 +00:00
John Baldwin	1103f3b05b	Grrr, s/INVARIANTS_SUPPORT/INVARIANT_SUPPORT/.	2001-02-24 21:29:32 +00:00
John Baldwin	15ec816acc	- Axe RETIP() as it was very i386 specific and unwieldy. Instead, use the passed in filename and line number in the KTR tracepoint message. - Even though it is #if 0'd code, change the code to detect that a process is an interrupt thread to check p->p_ithd against NULL rather than checking non-existent process flags from BSD/OS. - Use '%p' to print pointers in KTR log messages instead of assuming sizeof(int) == sizeof(void *). - Don't set p_mtxname to NULL when releasing a mutex. It doesn't hurt to leave it set (we don't clear w_mesg for example) and at least at one time in the past, there used to be race conditions in the kernel that would result in setting this to NULL causing the kernel to dereference NULL. - Make the _mtx_assert() function be compiled in if INVARIANT_SUPPORT is defined rather than if INVARIANTS is defined so that a KLD compiled with INVARIANTS that uses mtx_assert() can be used with a kernel that just has INVARIANT_SUPPORT compiled in.	2001-02-24 19:36:13 +00:00
Boris Popov	d8589bd5cb	Introduce API for sequential reads/writes (build/dissect) of mbuf chains. Reviewed by: Ian Dowse <iedowse@maths.tcd.ie>, Bosko Milekic <bmilekic@technokratis.com>, Julian Elischer <julian@elischer.org> and arch@/net@ Obtained from: smbfs	2001-02-24 15:44:30 +00:00
Julian Elischer	33338e7370	Add knowledge of the netgraph spinlocks into the Witness code. Well, at least I think that's how it's done.	2001-02-24 14:29:47 +00:00
Jake Burkholder	f32ded2fb5	- Assert that the proc to return is not NULL in runq_choose the same as runq_remove. - bzero the whole struct runq in runq_init just in case it's not statically allocated.	2001-02-24 14:06:36 +00:00
John Baldwin	130c1f25a4	It turns out the kernel console works fine and thus doesn't need quite this much extra testing.	2001-02-24 03:40:23 +00:00
Jonathan Lemon	24607d88ed	Add an EV_SET() convenience macro for initializing struct kevent prior to the call to kevent(). Update the copyright notices as well.	2001-02-24 01:44:03 +00:00
Jonathan Lemon	da403b9df8	Introduce a NOTE_LOWAT flag for use with the read/write filters, which allows the watermark to be passed in via the data field during the EV_ADD operation. Hook this up to the socket read/write filters; if specified, it overrides the so_{rcv|snd}.sb_lowat values in the filter. Inspired by: "Ronald F. Guilmette" <rfg@monkeys.com>	2001-02-24 01:41:31 +00:00
Jonathan Lemon	b07540c837	When returning EV_EOF for the socket read/write filters, also return the current socket error in fflags. This may be useful for determining why a connect() request fails. Inspired by: "Jonathan Graehl" <jonathan@graehl.org>	2001-02-24 01:33:12 +00:00
Peter Wemm	3e688165a9	Stricter style(9) conformance - remove unnecessary blank lines in previous commit.	2001-02-23 23:05:46 +00:00
Jonathan Lemon	89bbe051bb	Fix typo in comment (knode -> knote).	2001-02-23 20:32:42 +00:00
Jonathan Lemon	7df2842dee	Add a NOTE_REVOKE flag for vnodes, which is triggered from within vclean(). Use this to tell a filter attached to a vnode that the underlying vnode is no longer valid, by returning EV_EOF. PR: kern/25309, kern/25206	2001-02-23 20:06:01 +00:00
John Baldwin	0b1d793211	Test out the kernel console just before launching the AP's.	2001-02-23 19:44:25 +00:00
Peter Wemm	f1532aadee	Activate USER_LDT by default. The new thread libraries are going to depend on this. The linux ABI emulator tries to use it for some linux binaries too. VM86 had a bigger cost than this and it was made default a while ago. Reviewed by: jhb, imp	2001-02-23 01:25:02 +00:00
Tor Egge	9d0ddf1861	Streamline updating of switchtime (don't copy code from kern_sync.c). Submitted by: jhb	2001-02-22 20:16:51 +00:00
Tor Egge	35030da9f8	Backout previous commit. sched_lock is held, thus interrupts are prevented here. Submitted by: jhb	2001-02-22 20:12:52 +00:00
Tor Egge	0d139b3741	Protect update of the per processor switchtime variable against interrupts. Protect usage of the per processor switchtime variable against interrupts in calcru(). This seems to eliminate the "microuptime() went backwards" warnings.	2001-02-22 19:50:37 +00:00
John Baldwin	feb43c5f37	The p_md.md_regs member of proc is used in signal handling to reference the original trapframe of the syscall, trap, or interrupt that entered the kernel. Before SMPng, ast's were handled via a pseudo trap at the end of doreti. With the SMPng commit, ast's were broken out into a separate ast() function that was called from doreti to match the behavior of other architectures. Unfortunately, when this was done, the p_md.md_regs member of curproc was not updated in ast(), thus when signals are handled by userret() after an interrupt that returns to userland, we end up using a stale trapframe that will result in the registers from the old trapframe overwriting the real trapframe and smashing all the registers right before we return to usermode. The saved %cs:%eip from where we were in usermode are saved in the trapframe for example.	2001-02-22 19:35:20 +00:00
John Baldwin	51c9129957	Since the PC is a pointer to a code address, change the second parameter of addupc_task() and addupc_intr() to be a uintptr_t instead of a u_long.	2001-02-22 18:07:31 +00:00
John Baldwin	f308e0d714	- Change ast() to take a pointer to a trapframe like other architectures. - Don't use an atomic operation to update cnt.v_soft in ast(). This is the only place the variable is written to, and sched_lock is always held when it is written, so it is already protected and the mutex release of sched_lock asserts a memory barrier that ensures the value will be updated in a timely fashion.	2001-02-22 18:05:15 +00:00
John Baldwin	26f9f5c7c7	- Use TRAPF_PC() on the alpha to access the PC in the trap frame. - Don't hold sched_lock around addupc_task() as this apparently breaks profiling badly due to sched_lock being held across copyin(). Reported by: bde (2)	2001-02-22 16:23:12 +00:00
John Baldwin	c978f49e20	Add a mtx_assert() in maybe_resched() just to be sure it's always called with sched_lock held.	2001-02-22 13:47:01 +00:00
John Baldwin	3a18729505	Lock need_resched with sched_lock. Reported by: des	2001-02-22 13:46:09 +00:00
John Baldwin	de271f01c2	Work around a race condition where an interrupt handler can be removed from an interrupt thread while the interrupt thread is blocked on Giant waiting to execute the interrupt handler being removed. The result was that the intrhand structure would be free'd, and we would call 0xdeadc0de. The workaround is to check to see if the interrupt thread is idle when removing a handler. If not, then we mark the interrupt handler as being dead using the new IH_DEAD flag and don't remove it from the interrupt thread's list of handlers. When the interrupt thread resumes, it will see a dead handler while traversing the list of handlers and will remove the handler then.	2001-02-22 02:18:32 +00:00
John Baldwin	60f2b032fe	Just use the ithread->it_proc directly in a KTR tracepoint instead of assigning a local var to it and using it, as otherwise the local var wasn't used, and generated a warning in the !KTR case. Noticed by: bde	2001-02-22 02:15:57 +00:00
John Baldwin	addec20c38	Add KTR tracepoints for adding/removing interrupt handlers, creating/destroying interrupt threads, and updating the state of an interrupt thread.	2001-02-22 02:14:08 +00:00
John Baldwin	25d209f260	- Use the NOCPU constant. - Move the ithread spin locks before sched lock and clk in preparation for future commits to the ithread code.	2001-02-22 02:12:54 +00:00
John Baldwin	9764c9d36e	Quiet a warning with a uintptr_t cast. Noticed by: bde	2001-02-22 02:10:33 +00:00
John Baldwin	5a93f3e851	- Use the new NOCPU constant. - Fix a warning. Noticed by: bde (2)	2001-02-22 00:32:13 +00:00
John Baldwin	76bd604e7d	Fix a bug where the 'ithread' variable was being set in a KASSERT() condition and thus was not initialized properly in the !INVARIANTS case. Noticed by: bde Pointy hat to: me	2001-02-22 00:23:56 +00:00
John Baldwin	719f43d3df	Remove attempt to add in PREEMPTION #ifdef test in MI code that didn't work because opt_preemption.h wasn't #include'd. Instead, make use of the do_switch parameter to ithread_schedule() and do the check in the alpha interrupt code.	2001-02-21 22:51:00 +00:00
Boris Popov	03137ec82e	Fix parameter order in the calls to MGET().	2001-02-21 09:24:13 +00:00
Robert Watson	91421ba234	o Move per-process jail pointer (p->pr_prison) to inside of the subject credential structure, ucred (cr->cr_prison). o Allow jail inheritance to be a function of credential inheritance. o Abstract prison structure reference counting behind pr_hold() and pr_free(), invoked by the similarly named credential reference management functions, removing this code from per-ABI fork/exit code. o Modify various jail() functions to use struct ucred arguments instead of struct proc arguments. o Introduce jailed() function to determine if a credential is jailed, rather than directly checking pointers all over the place. o Convert PRISON_CHECK() macro to prison_check() function. o Move jail() function prototypes to jail.h. o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the flag in the process flags field itself. o Eliminate that "const" qualifier from suser/p_can/etc to reflect mutex use. Notes: o Some further cleanup of the linux/jail code is still required. o It's now possible to consider resolving some of the process vs credential based permission checking confusion in the socket code. o Mutex protection of struct prison is still not present, and is required to protect the reference count plus some fields in the structure. Reviewed by: freebsd-arch Obtained from: TrustedBSD Project	2001-02-21 06:39:57 +00:00
Tor Egge	d82b3e319a	Ensure that RLIMIT_NPROC limits are at least 1 to avoid bad interaction with chgproccnt. MFC candidate. Reviewed by: alfred	2001-02-20 23:34:16 +00:00

4189 Commits