freebsd-nq

Author	SHA1	Message	Date
Jeff Roberson	4b6049cafa	- Closer inspection revealed a possible deadlock situation in vn_lock() that was introduced by my last commit but not caught by stress testing. Fix that and slightly restructure the code so that it is more readable.	2002-08-22 07:57:43 +00:00
Jeff Roberson	9abf54f032	- Make vn_lock() vget() and VOP_LOCK() all behave the same way WRT LK_INTERLOCK. The interlock will never be held on return from these functions even when there is an error. Errors typically only occur when the XLOCK is held which means this isn't the vnode we want anyway. Almost all users of these interfaces expected this behavior even though it was not provided before.	2002-08-22 07:44:45 +00:00
Jeff Roberson	510939d089	- Return two shared locks to exclusive locks. This was premature. - Document the problems that prevent us from using shared locks.	2002-08-22 07:26:18 +00:00
Jeff Roberson	6c54a1f5f0	- Fix interlock handling in vn_lock(). Previously, vn_lock() could return with interlock held in error conditions when the caller did not specify LK_INTERLOCK. - Add several comments to vn_lock() describing the rationale behind the code flow since it was not immediately obvious.	2002-08-22 06:58:11 +00:00
Jeff Roberson	183158485a	- Fix interlock handling in vn_lock(). Previously, vn_lock() could return with interlock held in error conditions when the caller did not specify LK_INTERLOCK. - Add several comments to vn_lock() describing the rationale behind the code flow since it was not immediately obvious.	2002-08-22 06:51:06 +00:00
Archie Cobbs	55f7c614fd	Don't use "NULL" when "0" is really meant.	2002-08-21 23:39:52 +00:00
Julian Elischer	721e591067	Revert some suspension/sleep/signal code from KSE-III We need to rethink a bit of this and it doesn't matter if we break the KSE test program for now as long as non-KSE programs act as expected. Submitted by: David Xu <bsddiy@yahoo.com> (this guy's just asking to get hit with a commit bit..)	2002-08-21 20:03:55 +00:00
Jeff Roberson	0b600db425	- Document two cases, one in vget and the other in vn_lock, where the state of interlock on exit is not consistent. There are probably several bugs relating to this.	2002-08-21 08:34:48 +00:00
Jeff Roberson	88cf6b94bd	- If vn_lock fails with the LK_INTERLOCK flag set, interlock will not be released. vcanrecycle() failed to unlock interlock under this condition. - Remove an extra VOP_UNLOCK from a failure case in vcanrecycle(). Pointed out by: rwatson	2002-08-21 06:40:34 +00:00
Jeff Roberson	71ea4ba57c	- Add two new debugging macros: ASSERT_VI_LOCKED and ASSERT_VI_UNLOCKED - Use the new VI asserts in place of the old mtx_assert checks. - Add the VI asserts to the automated lock checking in the VOP calls. The interlock should not be held across vops with a few exceptions. - Add the vop_(un)lock_{pre,post} functions to assert that interlock is held when LK_INTERLOCK is set.	2002-08-21 06:19:29 +00:00
Jeff Roberson	856d3a056f	- Hold the vnode lock across unlink() so that the v_vflag check is safe. - Fix the long broken error handling for VV_ROOT and VDIR.	2002-08-21 03:55:35 +00:00
Robert Watson	e5cb5e37d4	Close a race in process label changing opened due to dropping the proc locking when revoking access to mmaps. Instead, perform this later once we've changed the process label (hold onto a reference to the new cred so that we don't lose it when we release the process lock if another thread changes the credential). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 20:26:32 +00:00
Robert Watson	8815d2e899	Regen.	2002-08-19 20:02:29 +00:00
Robert Watson	f61b85492c	mac_syscall is now implemented, switch to MSTD. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 20:01:31 +00:00
Robert Watson	177142e458	Pass active_cred and file_cred into the MAC framework explicitly for mac_check_vnode_{poll,read,stat,write}(). Pass in fp->f_cred when calling these checks with a struct file available. Otherwise, pass NOCRED. All currently MAC policies use active_cred, but could now offer the cached credential semantic used for the base system security model. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 19:04:53 +00:00
Robert Watson	27f2eac7f3	Provide an implementation of mac_syscall() so that security modules can offer new services without reserving system call numbers, or augmented versions of existing services. User code requests a target policy by name, and specifies the policy-specific API plus target. This is required in particular for our port of SELinux/FLASK to the MAC framework since it offers additional security services. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 17:59:48 +00:00
Robert Watson	c024c3eeb1	Break out mac_check_pipe_op() into component check entry points: mac_check_pipe_poll(), mac_check_pipe_read(), mac_check_pipe_stat(), and mac_check_pipe_write(). This improves consistency with other access control entry points and permits security modules to only control the object methods that they are interested in, avoiding switch statements. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 16:59:37 +00:00
Robert Watson	7f724f8b51	Break out mac_check_vnode_op() into three separate checks: mac_check_vnode_poll(), mac_check_vnode_read(), mac_check_vnode_write(). This improves the consistency with other existing vnode checks, and allows policies to avoid implementing switch statements to determine what operations they do and do not want to authorize. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 16:43:25 +00:00
Robert Watson	b12baf55a4	Assert process locks in process-related access control checks. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 15:30:30 +00:00
Robert Watson	851704bbd0	Add a missing vnode assertion for the exec() check. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 15:28:39 +00:00
Poul-Henning Kamp	fee7d450d8	Keep a copy of the credential used to mount filesystems around so we can check and use it later on. Change the pieces of code which relied on mount->mnt_stat.f_owner to check which user mounted the filesystem. This became needed as the EA code needs to be able to allocate blocks for "system" EA users like ACLs. There seems to be some half-baked (probably only quarter- actually) notion that the superuser for a given filesystem is the user who mounted it, but this has far from been carried through. It is unclear if it should be. Sponsored by: DARPA & NAI Labs.	2002-08-19 06:52:21 +00:00
Poul-Henning Kamp	91afe0874d	A side effect of some debugging: prototypify and deregister.	2002-08-18 21:24:22 +00:00
Maxim Sobolev	62f7648682	Increase size of ifnet.if_flags from 16 bits (short) to 32 bits (int). To avoid breaking application ABI use unused ifreq.ifru_flags[1] for upper 16 bits in SIOCSIFFLAGS and SIOCGIFFLAGS ioctl's. Reviewed by: -hackers, -net	2002-08-18 07:05:00 +00:00
Robert Watson	d49fa1ca6e	In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation. - Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred. - In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's. - Update kqueue_ioctl() for its new argument. - Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization. - Update soo_ioctl() for its new argument. - Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-17 02:36:16 +00:00
David Greenman	79cb7eb41c	Further improved the performance of sbreserve() by moving the calculation of the adjusted sb_max into a sysctl handler for sb_max and assigning it to a variable that is used instead. This eliminates the 32bit multiply and divide from the fast path that was being done previously.	2002-08-16 18:41:48 +00:00
Robert Watson	f050add5c1	Wrap maintenance of various nmac{objectname} counters in MAC_DEBUG so we can avoid the cost of a large number of atomic operations if we're not interested in the object count statistics. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-16 14:21:38 +00:00
Robert Watson	49cde51dfd	Correct white space nits that crept in during my recent merges of trustedbsd_mac material.	2002-08-16 14:12:40 +00:00
Robert Watson	ea6027a8e1	Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential. Trickle this change down into fo_stat/poll() implementations: - badfo_poll(), badfo_stat(): modify/add arguments. - kqueue_poll(), kqueue_stat(): modify arguments. - pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred. - soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics. - sopoll(): modify arguments. - vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics. - vn_close(): rename cred to file_cred to reflect reality while I'm here. - vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics. - fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here. Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-16 12:52:03 +00:00
David Greenman	8c71ce8a4e	Rewrote the space check algorithm in sbreserve() so that the extremely expensive (!) 64bit multiply, divide, and comparison aren't necessary (this came in originally from rev 1.19 to fix an overflow with large sb_max or MCLBYTES). The 64bit math in this function was measured in some kernel profiles as being as much as 5-8% of the total overhead of the TCP/IP stack and is eliminated with this commit. There is a harmless rounding error (of about .4% with the standard values) introduced with this change, however this is in the conservative direction (downward toward a slightly smaller maximum socket buffer size). MFC after: 3 days	2002-08-16 05:08:46 +00:00
Robert Watson	9ca435893b	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
Robert Watson	d61198e422	Rename mac_check_socket_receive() to mac_check_socket_deliver() so that we can use the names _receive() and _send() for the receive() and send() checks. Rename related constants, policy implementations, etc. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 18:51:26 +00:00
Robert Watson	4b9c2fa1fb	Fix return case for negative namelen by jumping to normal exit processing rather than immediately returning, or we may not unlock necessary locks. Noticed by: Mike Heffner <mheffner@acm.vt.edu>	2002-08-15 17:34:03 +00:00
Bosko Milekic	5fee904c3c	Make m_flags an int instead of a short, this is consistent with the type of the 'flags' argument m_getcl() was using anyway; m_extadd() needed to be changed to accept an int instead of a short for 'flags.' This makes things more consistent and also gives us more bits to use for m_flags in the future (we have almost run out). Requested by: sam (Sam Leffler)	2002-08-15 14:09:16 +00:00
Robert Watson	99fa64f863	Sync to trustedbsd_mac tree: default to sigsegv rather than copy-on-write during a label change resulting in an mmap removal. This is "fail stop" behavior, which is preferred, although it offers slightly less transparency. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 02:28:32 +00:00
Alfred Perlstein	b618bb96f0	return foo -> return (foo)	2002-08-15 02:10:12 +00:00
David Greenman	9e63574ea4	Moved sf_buf_alloc and sf_buf_free function declarations to sys/socketvar.h so that they can be seen by external callers.	2002-08-13 19:03:19 +00:00
David Greenman	a370c70055	Remove obsolete comment about sf_buf_* functions being static. They were made un-static in rev 1.114.	2002-08-13 18:20:08 +00:00
Poul-Henning Kamp	6f21160218	Remember to unlock the (optional) vnode in vfs_stdextattrctl(). Failing to do this made the following script hang: #!/bin/sh set -ex extattrctl start /tmp extattrctl initattr 64 /tmp/EA00 extattrctl enable /tmp user ea00 /tmp/EA00 extattrctl showattr /tmp/EA00 if the filesystem backing /tmp did not support EAs. The real solution is probably to have the extattrctl syscall do the unlocking rather than depend on the filesystem to do it. Considering that extattrctl is going to be made obsolete anyway, this has dogwash priority. Sponsored by: DARPA & NAI Labs.	2002-08-13 11:11:51 +00:00
Poul-Henning Kamp	7f52a691f0	Add a #include for <sys/mount.h>	2002-08-13 10:07:05 +00:00
Alfred Perlstein	149004e99d	Make SYSVSEM mpsafe. Each semaphore set gets its own lock, however there is a global lock over the undo structures because of the way they are managed. Switch to using SLIST instead of rolling our own linked list. Fix several races where a permission check was done before a copyin/copyout, if the copy happened to fault it may have been possible to race for access to a semaphore set that one shouldn't have access to. Requested by: rwatson Tested by: NetBSD regression suite.	2002-08-13 08:47:17 +00:00
Alfred Perlstein	4b6ef3a176	Make SYSVMSG mpsafe. Right now there is a global lock over the entire subsystem, we could move to per-message queue locks, however the messages themselves seem to come from a global pool and to avoid over-locking this code (locking individual queues, then the global pool) I've opted to just do it this way. Requested by: rwatson Tested by: NetBSD's regression suite.	2002-08-13 08:00:36 +00:00
Jeff Roberson	619eb6e579	- Hold the vnode lock throughout execve. - Set VV_TEXT in the top level execve code. - Fixup the image activators to deal with the newly locked vnode.	2002-08-13 06:55:28 +00:00
Jeff Roberson	055c012332	- Extend the vnode_free_list_mtx to cover numvnodes and freevnodes. This was done only some of the time before, and now it is uniformly applied.	2002-08-13 05:29:48 +00:00
Robert Watson	925860774d	Introduce support for labeling and access control of pipe objects as part of the TrustedBSD MAC framework. Instrument the creation and destruction of pipes, as well as relevant operations, with necessary calls to the MAC framework. Note that the locking here is probably not quite right yet, but fixes will be forthcoming. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-13 02:47:13 +00:00
Robert Watson	5c5384fe80	Use the credential authorizing the socket creation operation to perform the jail check and the MAC socket labeling in socreate(). This handles socket creation using a cached credential better (such as in the NFS client code when rebuilding a socket following a disconnect: the new socket should be created using the nfsmount cached cred, not the cred of the thread causing the socket to be rebuilt). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 16:49:03 +00:00
Robert Watson	818d7e6d8a	Enforce MAC policy in cttyread() as well as the other operations already instrumented. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 16:45:19 +00:00
Robert Watson	0231c03df4	Implement IO_NOMACCHECK in vn_rdwr() -- perform MAC checks (assuming 'options MAC') as long as IO_NOMACCHECK is not set in the IO flags. If IO_NOMACCHECK is set, bypass MAC checks in vn_rdwr(). This allows vn_rdwr() to be used as a utility function inside of file systems where MAC checks have already been performed, or where the operation is being done on behalf of the kernel not the user. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 16:15:34 +00:00
Robert Watson	7ba28492c5	Declare a module service "kernel_mac_support" when MAC support is enabled and the kernel provides the MAC registration and entry point service. Declare a dependency on that module service for any MAC module registered using mac_policy.h. For now, hard code the version as 1, but once we've come up with a versioning policy, we'll move to a #define of some sort. In the mean time, this will prevent loading a MAC module when 'options MAC' isn't present, which (due to a bug in the kernel linker) can result if the MAC module is preloaded via loader.conf. This particular evil recommended by: peter Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 02:00:21 +00:00
Semen Ustimenko	87df4f8f18	Fix sendfile(), which was calling vn_rdwr() without the aresid parameter and thus hitting EIO at the end of file. This is believed to be a feature (not a bug) of vn_rdwr(), so we turn it off by supplying the aresid param. Reviewed by: rwatson, dg	2002-08-11 20:33:11 +00:00
Alan Cox	ad49abc087	o Make a correction to the last change: In aio_cancel(2) return AIO_ALLDONE instead of EINVAL if p->p_aioinfo is NULL.	2002-08-11 19:04:17 +00:00
David Malone	af338bea64	Make kern.log_console_output a tunable as well as a sysctl. MFC after: 1 week	2002-08-11 18:47:42 +00:00
Jens Schweikhardt	2b239dd118	Fix typos; each file has at least one s/seperat/separat/ (I skipped those in contrib/, gnu/ and crypto/) While I was at it, fixed a lot more found by ispell that I could identify with certainty to be errors. All of these were in comments or text, not in actual code. Suggested by: bde MFC after: 3 days	2002-08-11 13:05:30 +00:00
Alan Cox	b6c1f1efa2	o In aio_cancel(2), make sure that p->p_aioinfo isn't NULL before dereferencing it. Submitted by: saureen <sshah@apple.com>	2002-08-11 04:09:14 +00:00
Maxime Henrion	5965373e69	- Introduce a new struct xvfsconf, the userland version of struct vfsconf. - Make getvfsbyname() take a struct xvfsconf *. - Convert several consumers of getvfsbyname() to use struct xvfsconf. - Correct the getvfsbyname.3 manpage. - Create a new vfs.conflist sysctl to dump all the struct xvfsconf in the kernel, and rewrite getvfsbyname() to use this instead of the weird existing API. - Convert some {set,get,end}vfsent() consumers to use the new vfs.conflist sysctl. - Convert a vfsload() call in nfsiod.c to kldload() and remove the useless vfsisloadable() and endvfsent() calls. - Add a warning printf() in vfs_sysctl() to tell people they are using an old userland. After these changes, it's possible to modify struct vfsconf without breaking the binary compatibility. Please note that these changes don't break this compatibility either. When bp will have updated mount_smbfs(8) with the patch I sent him, there will be no more consumers of the {set,get,end}vfsent(), vfsisloadable() and vfsload() API, and I will promptly delete it.	2002-08-10 20:19:04 +00:00
Maxime Henrion	306e6b8393	Introduce a new sysctl flag, CTLFLAG_SKIP, which will cause sysctl_sysctl_next() to skip this sysctl. The sysctl is still available, but doesn't appear in a "sysctl -a". This is especially useful when you want to deprecate a sysctl, and add a warning into it to warn users that they are using an old interface. Without this flag, the warning would get echoed when running "sysctl -a" (which happens at boot).	2002-08-10 19:56:45 +00:00
Jacques Vidrine	5b770403b5	While we're at it, add range checks similar to those in previous commit to getsockname() and getpeername(), too.	2002-08-09 12:58:11 +00:00
Robert Watson	82d9ad331a	Add additional range checks for copyout targets. Submitted by: Silvio Cesare <silvio@qualys.com>	2002-08-09 05:50:32 +00:00
Bosko Milekic	850be9af25	Only my brain can fart while fixing a previous brain fart.	2002-08-08 13:31:57 +00:00
Bosko Milekic	0584320e56	YIKES, I take the pointy-hat for a really big braino here. I apologize to those of you who may have been seeing crashes in code that uses sendfile(2) or other types of external buffers with mbufs. Pointed out by, and provided trace: Niels Chr. Bank-Pedersen <ncbp at bank-pedersen.dk>	2002-08-08 13:29:32 +00:00
Robert Watson	92e35b6006	Due to layering problems, remove the MAC checks from vn_rdwr() -- this VOP wrapper is called from within file systems so can result in odd loopback effects when MAC enforcement is used with the active (as opposed to saved) credential. These checks will be moved elsewhere. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-08 12:45:30 +00:00
Julian Elischer	6933e3c12b	Do some work on keeping better track of stopped/continued state. I'm not sure what happened to the original setting of the P_CONTINUED flag. It appears to have been lost in the paper shuffling... Submitted by: David Xu <bsddiy@yahoo.com>	2002-08-08 06:18:41 +00:00
Robert Watson	55ac5e1861	Correct a bug introduced in 1.26: M_PKTHDR is set in the 'flags' argument, not the 'type' argument. As a result of the bug, the MAC label on some packet header mbufs might not be set in mbufs allocated using m_getcl(), resulting in a page fault. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-07 20:15:29 +00:00
Thomas Moestl	d88d37d604	Use the CPU_* OID constants instead of OID_AUTO for the clock-related sysctls for compatibility with old applications.	2002-08-07 19:43:54 +00:00
Robert Watson	2d70161756	Cache the credential provided during accton() for use in later accounting vnode operations. This permits the rights of the user (typically root) used to turn on accounting to be used when writing out accounting entries, rather than the credentials of the process generating the accounting record. This fixes accounting in a number of environments, including file systems that offer revocation support, MAC environments, some securelevel scenarios, and in some NFS environments. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-07 19:30:16 +00:00
Robert Watson	4d1a4bb79f	Refresh the credential on the first initproc thread following divorcing the initproc credential from the proc0 credential. Otherwise, the proc0 credential is used instead of initproc's credential when authorizing start_init() activities prior to initproc hitting userland for the first time. This could result in the incorrect credential being used to authorize mounting of the root file system, which could in turn cause problems for NFS when used in combination with uid/gid ipfw rules, or with MAC. Discussed with: julian	2002-08-07 17:53:31 +00:00
Matthew N. Dodd	df95311a10	Move code block added in 1.157 to a safer part of fork1(). Submitted by: jake	2002-08-07 11:31:45 +00:00
Alan Cox	b46f1c55f9	Set the ident field of the struct kevent that is registered by _aio_aqueue() to the address of the user's aiocb rather than the kernel's aiocb. (In other words, prior to this change, the ident field returned by kevent(2) on completion of an AIO was effectively garbage.) Submitted by: Romer Gil <rgil@cs.rice.edu>	2002-08-06 19:01:08 +00:00
Jake Burkholder	a520b73cdc	Remove new console devices with cnremove before initializing them in cninit. This allows a console driver to replace the existing console by calling cninit again, eg during the device probe. Otherwise the multiple console code sends output to both, which is unfortunate if they're using the same hardware.	2002-08-06 18:56:41 +00:00
Bruce Evans	1c530be49c	Try harder to "set signal flags proprly [sic] for ast()". See rev.1.154.	2002-08-06 15:22:09 +00:00
Robert Watson	d2118dfaba	Regen.	2002-08-06 15:16:55 +00:00
Robert Watson	280f0785e8	Rename mac_policy() to mac_syscall() to be more reflective of its purpose. Submitted by: cvance@tislabs.com Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-06 15:15:53 +00:00
Don Lewis	b74f2c1878	Don't automagically call vslock() from SYSCTL_OUT(). Instead, complain about calls to SYSCTL_OUT() made with locks held if the buffer has not been pre-wired. SYSCTL_OUT() should not be called while holding locks, but if this is not possible, the buffer should be wired by calling sysctl_wire_old_buffer() before grabbing any locks.	2002-08-06 11:28:09 +00:00
Alan Cox	20fb589d13	o The introduction of kevent() broke lio_listio(): _aio_aqueue() thought that LIO_READ and LIO_WRITE were requests for kevent()-based notification of completion. Modify _aio_aqueue() to recognize LIO_READ and LIO_WRITE. Notes: (1) The patch provided by the PR perpetuates a second bug in this code, a direct access to user-space memory. This change fixes that bug as well. (2) This change is to code that implements a deprecated interface. It should probably be removed after an MFC. PR: kern/39556	2002-08-05 19:14:27 +00:00
Dag-Erling Smørgrav	ea4c8f8ca1	Check the far end before registering an EVFILT_WRITE filter on a pipe.	2002-08-05 15:03:03 +00:00
Jeff Roberson	8947be9ba0	- Move some logic from getnewvnode() to a new function vcanrecycle() - Unlock the free list mutex around vcanrecycle to prevent a lock order reversal.	2002-08-05 10:15:56 +00:00
Jeff Roberson	18c6acee26	- Move a VOP assert to the right place. Spotted by: i386 tinderbox	2002-08-05 08:55:53 +00:00
Alfred Perlstein	4442e4a436	Cleanup: Fix line wrapping. Remove 'register'. malloc(9) with M_WAITOK can't fail, so remove checks for that.	2002-08-05 05:16:09 +00:00
Luigi Rizzo	fc1c73c21a	Temporarily disable polling when no processes are active, while I investigate the problem described below. I am seeing some strange livelock on recent -current sources with a slow box under heavy load, which disappears with this change. This might suggest some kind of problem (either insufficient locking, or mishandling of priorities) in the poll_idle thread.	2002-08-04 21:00:49 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Alan Cox	4453ada654	o Convert a vm_page_sleep_busy() into a vm_page_sleep_if_busy() with appropriate page queue locking.	2002-08-04 06:27:37 +00:00
Matthew N. Dodd	9ccba881d9	Kernel modifications necessary to allow to follow fork()ed children. PR: bin/25587 (in part) MFC after: 3 weeks	2002-08-04 01:07:02 +00:00
Alan Cox	3327872297	o Convert two instances of vm_page_sleep_busy() to vm_page_sleep_if_busy() with appropriate page queue locking.	2002-08-03 18:59:19 +00:00
Maxime Henrion	f2b17113cf	Make the consumers of the linker_load_file() function use linker_load_module() instead. This fixes a bug where the kernel was unable to properly locate and load a kernel module in vfs_mount() (and probably in the netgraph code as well since it was using the same function). This is because the linker_load_file() does not properly search the module path. Problem found by: peter Reviewed by: peter Thanks to: peter	2002-08-02 20:56:07 +00:00
Robert Watson	18b770b2fb	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC framework entry points to authorize readdir() operations in the native ABI. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 20:44:52 +00:00
Julian Elischer	67759b33f6	Fix a comment.	2002-08-01 19:10:40 +00:00
Julian Elischer	04774f2357	Slight cleanup of some comments/whitespace. Make idle process state more consistent. Add an assert on thread state. Clean up idleproc/mi_switch() interaction. Use a local instead of referencing curthread 7 times in a row (I've been told curthread can be expensive on some architectures). Remove some commented out code. Add a little commented out code (completion coming soon). Reviewed by: jhb@freebsd.org	2002-08-01 18:45:10 +00:00
Robert Watson	ee0812f320	Since we have the struct file data pointer cached in vp, use that instead when invoking VOP_POLL().	2002-08-01 18:29:30 +00:00
Alan Cox	46086ddf91	o Acquire the page queues lock before calling vm_page_io_finish(). o Assert that the page queues lock is held in vm_page_io_finish().	2002-08-01 17:57:42 +00:00
Robert Watson	f9d0d52459	Include file cleanup; mac.h and malloc.h at one point had ordering relationship requirements, and no longer do. Reminded by: bde	2002-08-01 17:47:56 +00:00
Robert Watson	4a58340e98	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC framework entry points to authorize a number of vnode operations, including read, write, stat, poll. This permits MAC policies to revoke access to files following label changes, and to limit information spread about the file to user processes. Note: currently the file cached credential is used for some of these authorization checks. We will need to expand some of the MAC entry point APIs to permit multiple creds to be passed to the access control check to allow diverse policy behavior. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 17:23:22 +00:00
Robert Watson	37bde6c0a3	Introduce support for Mandatory Access Control and extensible kernel access control. Restructure the vn_open_cred() access control checks to invoke the MAC entry point for open authorization. Note that MAC can reject open requests where existing DAC code skips the open authorization check due to O_CREAT. However, the failure mode here is the same as other failure modes following creation, wherein an empty file may be left behind. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 17:14:28 +00:00
Robert Watson	f4d2cfdda6	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke appropriate MAC entry points to authorize the following operations: truncate on open() (write) access() (access) readlink() (readlink) chflags(), lchflags(), fchflags() (setflag) chmod(), fchmod(), lchmod() (setmode) chown(), fchown(), lchown() (setowner) utimes(), lutimes(), futimes() (setutimes) truncate(), ftruncate() (write) revoke() (revoke) fhopen() (open) truncate on fhopen() (write) extattr_set_fd, extattr_set_file() (setextattr) extattr_get_fd, extattr_get_file() (getextattr) extattr_delete_fd(), extattr_delete_file() (setextattr) These entry points permit MAC policies to enforce a variety of protections on vnodes. More vnode checks to come, especially in non-native ABIs. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 15:37:12 +00:00
Robert Watson	339b79b939	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke an appropriate MAC entry point to authorize execution of a file by a process. The check is placed slightly differently than it appears in the trustedbsd_mac tree so that it prevents a little more information leakage about the target of the execve() operation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 14:31:58 +00:00
Bosko Milekic	abc1263a51	Move the MAC label init/destroy stuff to more appropriate places so that the inits/destroys are done without the cache locks held even in the persistent-lock calls. I may be cheating a little by using the MAC "already initialized" flag for now.	2002-08-01 14:24:41 +00:00
John Baldwin	12240b1159	Revert previous revision which accidentally snuck in with another commit. It just removed a comment that doesn't make sense to me personally.	2002-08-01 13:44:33 +00:00
John Baldwin	0711ca46c5	Revert previous revision which was accidentally committed and has not been tested yet.	2002-08-01 13:39:33 +00:00
John Baldwin	fbd140c786	If we fail to write to a vnode during a ktrace write, then we drop all other references to that vnode as a trace vnode in other processes as well as in any pending requests on the todo list. Thus, it is possible for a ktrace request structure to have a NULL ktr_vp when it is destroyed in ktr_freerequest(). We shouldn't call vrele() on the vnode in that case. Reported by: bde	2002-08-01 13:35:38 +00:00
Robert Watson	b3e13e1c3f	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument chdir() and chroot()-related system calls to invoke appropriate MAC entry points to authorize the two operations. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 03:50:08 +00:00
Robert Watson	b827919594	Introduce support for Mandatory Access Control and extensible kernel access control. Implement two IOCTLs at the socket level to retrieve the primary and peer labels from a socket. Note that this user process interface will be changing to improve multi-policy support. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 03:45:40 +00:00
Robert Watson	b285e7f9a8	Improve formatting and variable use consistency in extattr system calls. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:29:03 +00:00
Robert Watson	956fc3f8a5	Simplify the logic to enter VFS_EXTATTRCTL(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:26:07 +00:00
Robert Watson	d03db4290d	Introduce support for Mandatory Access Control and extensible kernel access control. Authorize vop_readlink() and vop_lookup() activities during recursive path lookup via namei() via calls to appropriate MAC entry points. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:21:40 +00:00
Robert Watson	6ea48a903c	Introduce support for Mandatory Access Control and extensible kernel access control. Authorize the creation of UNIX domain sockets in the file system namespace via an appropriate invocation of a MAC framework entry point. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:18:42 +00:00
Robert Watson	b65f6f6b69	When invoking NDINIT() in preparation for CREATE, set SAVENAME since we'll use nd.ni_cnp later. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:16:22 +00:00
Robert Watson	62b24bcc26	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument ctty driver invocations of various vnode operations on the terminal controlling tty to perform appropriate MAC framework authorization checks. Note: VOP_IOCTL() on the ctty appears to be authorized using NOCRED in the existing code rather than td->td_ucred. Why? Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:09:54 +00:00
Robert Watson	467a273ca0	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument the ktrace write operation so that it invokes the MAC framework's vnode write authorization check. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:07:03 +00:00
Robert Watson	c86ca022eb	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument the kernel ACL retrieval and modification system calls to invoke MAC framework entry points to authorize these operations. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-01 01:04:16 +00:00
Robert Watson	62f5f684fb	Introduce support for Mandatory Access Control and extensible kernel access control. Instrument connect(), listen(), and bind() system calls to invoke MAC framework entry points to permit policies to authorize these requests. This can be useful for policies that want to limit the activity of processes involving particular types of IPC and network activity. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 16:39:49 +00:00
Dag-Erling Smørgrav	aefe27a25c	Have the kern.file sysctl export xfiles rather than files. The truth is out there! Sponsored by: DARPA, NAI Labs	2002-07-31 12:26:52 +00:00
Dag-Erling Smørgrav	3072197229	Nit in previous commit: the correct sysctl type is "S,xvnode"	2002-07-31 12:25:28 +00:00
Dag-Erling Smørgrav	217b2a0b61	Initialize v_cachedid to -1 in getnewvnode(). Reintroduce the kern.vnode sysctl and make it export xvnodes rather than vnodes. Sponsored by: DARPA, NAI Labs	2002-07-31 12:24:35 +00:00
Dag-Erling Smørgrav	4eee8de77c	Introduce struct xvnode, which will be used instead of struct vnode for sysctl purposes. Also add two fields to struct vnode, v_cachedfs and v_cachedid, which hold the vnode's device and file id and are filled in by vn_open_cred() and vn_stat(). Sponsored by: DARPA, NAI Labs	2002-07-31 12:19:49 +00:00
Alan Cox	67c1fae92e	o Lock page accesses by vm_page_io_start() with the page queues lock. o Assert that the page queues lock is held in vm_page_io_start().	2002-07-31 07:27:08 +00:00
Robert Watson	335654d73e	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on sockets. In particular, invoke entry points during socket allocation and destruction, as well as creation by a process or during an accept-scenario (sonewconn). For UNIX domain sockets, also assign a peer label. As the socket code isn't locked down yet, locking interactions are not yet clear. Various protocol stack socket operations (such as peer label assignment for IPv4) will follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 03:03:22 +00:00
Robert Watson	07bdba7e2d	Note that the privilege indicating flag to vaccess() originally used by the process accounting system is now deprecated.	2002-07-31 02:05:12 +00:00
Robert Watson	a0ee6ed1c0	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on vnodes. In particular, initialize the label when the vnode is allocated or reused, and destroy the label when the vnode is going to be released, or reused. Wow, an object where there really is exactly one place where it's allocated, and one other where it's freed. Amazing. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 02:03:46 +00:00
Robert Watson	e32a5b94d8	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke additional MAC entry points when an mbuf packet header is copied to another mbuf: release the old label if any, reinitialize the new header, and ask the MAC framework to copy the header label data. Note that this requires a potential allocation operation, but m_copy_pkthdr() is not permitted to fail, so we must block. Since we now use interrupt threads, this is possible, but not desirable. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 01:51:34 +00:00
Robert Watson	a3abeda755	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on header mbufs. In particular, invoke entry points during the two mbuf header allocation cases, and the mbuf freeing case. Pass the "how" argument at allocation time to the MAC framework so that it can determine if it is permitted to block (as with policy modules), and permit the initialization entry point to fail if it needs to allocate memory but is not permitted to, failing the mbuf allocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 01:42:19 +00:00
Robert Watson	2712d0ee89	Introduce support for Mandatory Access Control and extensible kernel access control. Implement MAC framework access control entry points relating to operations on mountpoints. Currently, this consists only of access control on mountpoint listing using the various statfs() variations. In the future, it might also be desirable to implement checks on mount() and unmount(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 01:27:33 +00:00
Robert Watson	a87cdf8335	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on mount structures. In particular, invoke entry points for initialization and destruction in various scenarios (root, non-root). Also introduce an entry point in the boot procedure following the mount of the root file system, but prior to the start of the userland init process to permit policies to perform further initialization. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 01:11:29 +00:00
Robert Watson	8a1d977d66	Introduce support for Mandatory Access Control and extensible kernel access control. Implement inter-process access control entry points for the MAC framework. This permits policy modules to augment the decision making process for process and socket visibility, process debugging, re-scheduling, and signaling. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 00:48:24 +00:00
Robert Watson	4024496496	Introduce support for Mandatory Access Control and extensible kernel access control. Invoke the necessary MAC entry points to maintain labels on process credentials. In particular, invoke entry points for the initialization and destruction of struct ucred, the copying of struct ucred, and permit the initial labels to be set for both process 0 (parent of all kernel processes) and process 1 (parent of all user processes). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-31 00:39:19 +00:00
Robert Watson	47ac133d33	Regen.	2002-07-31 00:16:58 +00:00
Robert Watson	55fb783052	Introduce support for Mandatory Access Control and extensible kernel access control. Replace 'void ' with 'struct mac ' now that mac.h is in the base tree. The current POSIX.1e-derived userland MAC interface is scheduled for replacement, but will act as a functional placeholder until the replacement is done. These system calls allow userland processes to get and set labels on the current process, as well as on file system objects and file descriptor backed objects.	2002-07-30 22:43:20 +00:00
Robert Watson	f8ef020e2e	Begin committing support for Mandatory Access Control and extensible kernel access control. The MAC framework permits loadable kernel modules to link to the kernel at compile-time, boot-time, or run-time, and augment the system security policy. This commit includes the initial kernel implementation, although the interface with the userland components of the operating system is still under work, and not all kernel subsystems are supported. Later in this commit sequence, documentation of which kernel subsystems will not work correctly with a kernel compiled with MAC support will be added. Introduce two new vnode operations required to support MAC. First, VOP_REFRESHLABEL(), which will be invoked by callers requiring that vp->v_label be sufficiently "fresh" for access control purposes. Second, VOP_SETLABEL(), which will be invoked by callers requiring that the passed label contents be updated. The file system is responsible for updating v_label if appropriate in coordination with the MAC framework, as well as committing to disk. File systems that are not MAC-aware need not implement these VOPs, as the MAC framework will default to maintaining a single label for all vnodes based on the label on the file system mount point. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 22:15:09 +00:00
Robert Watson	95fab37ea8	Begin committing support for Mandatory Access Control and extensible kernel access control. The MAC framework permits loadable kernel modules to link to the kernel at compile-time, boot-time, or run-time, and augment the system security policy. This commit includes the initial kernel implementation, although the interface with the userland components of the operating system is still under work, and not all kernel subsystems are supported. Later in this commit sequence, documentation of which kernel subsystems will not work correctly with a kernel compiled with MAC support will be added. kern_mac.c contains the body of the MAC framework. Kernel and user APIs defined in mac.h are implemented here, providing a front end to loaded security modules. This code implements a module registration service, state (label) management, security configuration and policy composition. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 21:36:05 +00:00
Julian Elischer	4d492b4369	Don't need to hold schedlock specifically for stop() as it calls wakeup() that locks it anyhow. Reviewed by: jhb@freebsd.org	2002-07-30 21:13:48 +00:00
Bosko Milekic	c89137ff90	Make reference counting for mbuf clusters [only] work like in RELENG_4. While I don't think this is the best solution, it certainly is the fastest and in trying to find bottlenecks in network related code I want this out of the way, so that I don't have to think about it. What this means, for mbuf clusters anyway is: - one less malloc() to do for every cluster allocation (replaced with a relatively quick calculation + assignment) - no more free() in the cluster free case (replaced with empty space) :-) This can offer a substantial throughput improvement, but it may not for all cases. Particularly noticeable for larger buffer sends/recvs. See http://people.freebsd.org/~bmilekic/code/measure2.txt for a rough idea.	2002-07-30 21:06:27 +00:00
Alan Cox	1812190d09	o Replace vm_page_sleep_busy() with vm_page_sleep_if_busy() in vfs_busy_pages().	2002-07-30 20:41:10 +00:00
Julian Elischer	b8e45df779	Remove code that removes thread from sleep queue before adding it to a condvar wait. We do not have asleep() any more so this can not happen.	2002-07-30 20:34:30 +00:00
Alan Cox	1161b86a15	o In do_sendfile(), replace vm_page_sleep_busy() by vm_page_sleep_if_busy() and extend the scope of the page queues lock to cover all accesses to the page's flags and busy fields.	2002-07-30 18:51:07 +00:00
Robert Watson	e66c87b70e	When referencing nd_cnp after namei(), always pass SAVENAME into NDINIT() operation flags. Submitted by: green Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 18:48:25 +00:00
Robert Watson	e37b1fcdee	Make M_COPY_PKTHDR() macro into a wrapper for a m_copy_pkthdr() function. This permits conditionally compiled extensions to the packet header copying semantic, such as extensions to copy MAC labels. Reviewed by: bmilekic Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 18:28:58 +00:00
Robert Watson	4266d0d0ce	Regen.	2002-07-30 16:52:22 +00:00
Robert Watson	aedbd622fe	Introduce a mac_policy() system call that will provide MAC policies with a general purpose front end entry point for user applications to invoke. The MAC framework will route the system call to the appropriate policy by name. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 16:50:25 +00:00
Jacques Vidrine	89ab930718	For processes which are set-user-ID or set-group-ID, the kernel performs a few special actions for safety. One of these is to make sure that file descriptors 0..2 are in use, by opening /dev/null for those that are not already open. Another is to close any file descriptors 0..2 that reference procfs. However, these checks were made out of order, so that it was still possible for a set-user-ID or set-group-ID process to be started with some of the file descriptors 0..2 unused. Submitted by: Georgi Guninski <guninski@guninski.com>	2002-07-30 15:38:29 +00:00
Seigo Tanimura	133267776c	In endtsleep() and cv_timedwait_end(), a thread marked TDF_TIMEOUT may be swapped out. Do not put such a thread directly back to the run queue. Spotted by: David Xu <davidx@viasoft.com.cn> While I am here, s/PS_TIMEOUT/TDF_TIMEOUT/.	2002-07-30 10:12:11 +00:00
Jeff Roberson	1e4c7a1368	- Acknowledge recursive vnode locks in the vop_unlock specification. The vnode may not be unlocked even if the operation succeeded.	2002-07-30 08:50:52 +00:00
Seigo Tanimura	9eb881f804	- Optimize wakeup() and its friends; if a thread waken up is being swapped in, we do not have to ask for the scheduler thread to do that. - Assert that a process is not swapped out in runq functions and swapout(). - Introduce thread_safetoswapout() for readability. - In swapout_procs(), perform a test that may block (check of a thread working on its vm map) first. This lets us call swapout() with the sched_lock held, providing a better atomicity.	2002-07-30 06:54:05 +00:00
Mike Silbersack	c4441bc769	Update docs to reflect change in count of procs reserved for root from 1 to 10. PR: kern/40515 Submitted by: David Schultz <dschultz@uclink.Berkeley.EDU> MFC after: 1 day	2002-07-30 05:37:00 +00:00
Robert Watson	03a719dcd1	Rebuild of files generated from syscalls.master. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 02:09:24 +00:00
Robert Watson	5d37d00afc	Prototype function arguments, only with MAC-specific structures replaced with void until we bring in the actual structure definitions. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 02:06:34 +00:00
Robert Watson	7bc8250003	Stubs for the TrustedBSD MAC system calls to permit TrustedBSD MAC userland code to operate on kernels from the main tree. Not much in this file yet. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 02:04:05 +00:00
Julian Elischer	1d7b9ed2e6	Create a new thread state to describe threads that would be ready to run except for the fact that they are presently swapped out. Also add a process flag to indicate that the process has started the struggle to swap back in. This will be needed for the case where multiple threads start the swapin action, to stop a collision. Also add code to stop a process from being swapped out if one of the threads in this process is actually off running on another CPU.. that might hurt... Submitted by: Seigo Tanimura <tanimura@r.dl.itc.u-tokyo.ac.jp>	2002-07-29 18:33:32 +00:00
Jeff Roberson	a562685f65	- Backout the patch made in revision 1.75 of vfs_mount.c. The vputs here were hiding the real problem of the missing unlock in sync_inactive. - Add the missing unlock in sync_inactive. Submitted by: iedowse	2002-07-29 06:26:55 +00:00
Don Lewis	9e74cba35a	Make a temporary copy of the output data in the generic sysctl handlers so that the data is less likely to be inconsistent if SYSCTL_OUT() blocks. If the data is large, wire the output buffer instead. This is somewhat less than optimal, since the handler could skip the copy if it knew that the data was static. If the data is dynamic, we are still not guaranteed to get a consistent copy since another processor could change the data while the copy is in progress because the data is not locked. This problem could be solved if the generic handlers had the ability to grab the proper lock before the copy and release it afterwards. This may duplicate work done in other sysctl handlers in the kernel which also copy the data, possibly while a lock is held, before they call a generic handler to output the data. These handlers should probably call SYSCTL_OUT() directly.	2002-07-28 21:06:14 +00:00
Don Lewis	5c38b6dbce	Wire the sysctl output buffer before grabbing any locks to prevent SYSCTL_OUT() from blocking while locks are held. This should only be done when it would be inconvenient to make a temporary copy of the data and defer calling SYSCTL_OUT() until after the locks are released.	2002-07-28 19:59:31 +00:00
David Malone	25dec7474c	If a socket is disconnected for some reason (like a TCP connection not responding) then drop any data on the outgoing queue in soisdisconnected because there is no way to get it to its destination any longer. The only objection to this patch I got on -net was from Terry, who wasn't sure that the condition in question could arise, so I provided some example code.	2002-07-27 23:06:52 +00:00
Robert Watson	d06c0d4d40	Slight restructuring of the logic for credential change case identification during execve() to use a 'credential_changing' variable. This makes it easier to have outstanding patchsets against this code, as well as to add conditionally defined clauses. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-27 18:06:49 +00:00
John Baldwin	ce39e722ec	Disable optimization of spinlocks on UP kernels w/o debugging for now since it breaks mtx_owned() on spin mutexes when used outside of mtx_assert(). Unfortunately we currently use it in the i386 MD code and in the sio(4) driver. Reported by: bde	2002-07-27 16:54:23 +00:00
Jeff Roberson	0ea5e55265	- The default for lock, unlock, and islocked is now std* instead of no*.	2002-07-27 05:16:20 +00:00
Robert Drehmel	e25dadb05d	Fix -Werror build for sparc64: Use the appropriate conversion specifier for an 'unsigned int' argument.	2002-07-26 12:57:57 +00:00
Julian Elischer	8625e3721f	get suspension counting right. fix an error message Submitted by: David Xu <bsddiy@yahoo.com>	2002-07-25 03:21:35 +00:00
Julian Elischer	e3b9bf7198	fix some style problems and remove a mis-merged assert.	2002-07-25 00:27:39 +00:00
Julian Elischer	294e6308bf	slight stylisations to take into account recent code changes.	2002-07-24 23:59:15 +00:00
Julian Elischer	b6d5995e5f	Add some locking asserts and some comments	2002-07-24 23:21:05 +00:00
Julian Elischer	cf19bf911d	When single threading a multithreaded program, awaken the 'single threading thread' when the last other thread suspends. I had this code in there before but it seems to have been accidentally deleted somewhere along the way. This would only affect multithreaded processes. Reviewed by: David Xu <bsddiy@yahoo.com>	2002-07-24 19:50:08 +00:00
Maxime Henrion	dae0abedbd	Fix a stupid bug where I wasn't initializing the names of 0-length mount options.	2002-07-24 19:50:00 +00:00
Robert Watson	eeb9251884	Under #ifdef DIAGNOSTIC, NULL out componentname pointers if we free the pnbuf to increase the chances of detecting use of a free'd name buffer if SAVENAME or SAVESTART wasn't passed in. Curiously, running with these changes doesn't panic the kernel, and should.	2002-07-24 15:42:22 +00:00
Bosko Milekic	4151d2e620	Move m_freem() from uipc_mbuf.c to subr_mbuf.c so it can take advantage of the inlines, like its cousin, m_free(). Also, make a small (first step?) optimisation of m_free() to use the MBP_PERSIST{,ENT} interface to hold the lock across frees when possible. The thing is that right now, we can only do this easily for at most across one mbuf + one cluster free, as the comment mentions (it also explains why). Anyway, some basic tests revealed a 5-10% overall improvement. Some of the results can be found here: http://people.freebsd.org/~bmilekic/code/measure.txt	2002-07-24 15:11:23 +00:00
Mike Barcroft	5f0de71223	Catch up to rev 1.87 of sys/sys/socketvar.h (sb_cc changed from u_long to u_int). Noticed by: sparc64 tinderbox	2002-07-24 14:21:41 +00:00
Julian Elischer	205683663f	When suspending a thread, update the appropriate (sic) statistic.	2002-07-24 07:29:16 +00:00
Julian Elischer	38038891e9	revert some of the handling of STOP signals in issignal(). Let thread_suspend_check() actually do the suspension at the user boundary. Submitted by: David Xu <bsddiy@yahoo.com>	2002-07-24 07:23:41 +00:00
John Polstra	f824b5187e	Widen struct sockbuf's sb_timeo member to int from short. With non-default but reasonable values of hz this member overflowed, breaking NFS over UDP. Also, as long as I'm plowing up struct sockbuf ... Change certain members from u_long/long to u_int/int in order to reduce wasted space on 64-bit machines. This change was requested by Andrew Gallatin. Netstat and systat need to be rebuilt. I am incrementing __FreeBSD_version in case any ports need to change.	2002-07-24 03:02:43 +00:00
Alfred Perlstein	b605b54ce3	Attempt to clarify comment in selrecord.	2002-07-24 00:29:22 +00:00
Bosko Milekic	dd4ac026f7	Introduce mb_free() to the MBP_PERSIST{,ENT} interface. What this means is that grouped frees will be done as often as possible without dropping the cache lock in between. So, for the most part, they'll be done without the lock being dropped. This is particularly true if you have something that does a grouped m_getm() or m_getcl() (a cluster and mbuf at the same time) - most likely getting the buffers from the same per-CPU cache - and then frees them with m_free{,m}(). Unless the buffers' underlying buckets were moved, the free will be done without the lock getting dropped in between. So far, only m_free() has been shown how to do this, and m_freem() will shortly follow. Since I'm here, I also fixed a small (but mostly harmless) type-mismatch introduced in the last commit.	2002-07-23 14:55:33 +00:00
Alexander Kabaev	1c4229a6a7	Fix DIOCGMEDIASIZE and DIOCGSECTORSIZE ioctls to work for all disk devices. This fixes the problem with these ioctls returning EINVAL for plain slice devices with no disklabel on them. The patch incorporates improvements and style fixes from BDE. Reviewed by: bde Approved by: obrien (mentor)	2002-07-23 14:30:27 +00:00
Andrew R. Reiter	5d3232048e	- Make use of the VM_ALLOC_WIRED flag in the call to vm_page_alloc() in do_sendfile(). This allows us to rearrange an if statement in order to avoid doing an unnecessary call to vm_page_lock_queues(), and an attempt at re-wiring the pages (which were wired in the vm_page_alloc() call). Reviewed by: alc, jhb	2002-07-23 01:09:34 +00:00
Alfred Perlstein	1a5a641600	Remove unneeded caddr_t casts.	2002-07-22 19:05:44 +00:00
Alfred Perlstein	fd6d9be4f5	Cleanup: Define a debug printf macro rather than wrapping all calls to printf with #ifdefs.	2002-07-22 18:27:54 +00:00
Alfred Perlstein	8209f090f1	Change struct vmspace->vm_shm from void * to struct shmmap_state *, this removes the need for casts in several cases.	2002-07-22 16:22:27 +00:00
Alfred Perlstein	2cc593fd8e	Remove caddr_t.	2002-07-22 16:12:55 +00:00
Alfred Perlstein	d452ec95a9	remove caddr_t from fo_ioctl calls	2002-07-22 15:46:51 +00:00
Alfred Perlstein	0a3e28cf1c	remove caddr_t	2002-07-22 15:44:27 +00:00
Robert Watson	0b1040cb88	Set VAPPEND in open mode when O_APPEND is specified as an argument to open() or fhopen(). Currently this has no actual effect due to the treatment of VAPPEND in vaccess() and vaccess_acl() as a subset of VWRITE, but when MAC comes in, MAC will distinguish the two. Note: if any file systems are cutting their own permission models, they may wish to now take this into account. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-22 12:51:06 +00:00
Don Lewis	dcbe050b29	Pre-wire the output buffer so that sysctl_kern_function_list() doesn't block in SYSCTL_OUT() while holding a lock.	2002-07-22 08:28:09 +00:00
Don Lewis	0600730d73	Provide a way for sysctl handlers to pre-wire their output buffer before they grab a lock so that they don't block in SYSCTL_OUT() with the lock being held.	2002-07-22 08:25:37 +00:00
Robert Watson	b02aac465d	Teach discretionary access control methods for files about VAPPEND and VALLPERM. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-22 03:57:07 +00:00
Alan Cox	c1d5e2741e	o Lock page queue accesses by vm_page_free().	2002-07-21 19:06:46 +00:00
Johan Karlsson	5b60674451	Save flags returned by vn_open and use them when calling vn_close. Reviewed by: bde Approved by: sheldonh (mentor)	2002-07-21 15:22:56 +00:00
Warner Losh	5878eb3fca	Add bus_child_present and the child_present method to bus_if.m	2002-07-21 03:28:43 +00:00
Robert Watson	4f18efe220	Do preserve the error result from calling p_cansee() and use that when failing because of the error. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-20 22:44:39 +00:00
Peter Wemm	3ebc124838	Infrastructure tweaks to allow having both an Elf32 and an Elf64 executable handler in the kernel at the same time. Also, allow for the exec_new_vmspace() code to build a different sized vmspace depending on the executable environment. This is a big help for execing i386 binaries on ia64. The ELF exec code grows the ability to map partial pages when there is a page size difference, eg: emulating 4K pages on 8K or 16K hardware pages. Flesh out the i386 emulation support for ia64. At this point, the only binary that I know of that fails is cvsup, because the cvsup runtime tries to execute code in pages not marked executable. Obtained from: dfr (mostly, many tweaks from me).	2002-07-20 02:56:12 +00:00
Alan Cox	4aca0b1510	o Use vm_page_alloc(... | VM_ALLOC_WIRED) in place of vm_page_wire().	2002-07-19 19:35:06 +00:00
Maxime Henrion	0f3b0aa87c	Wrap a line longer than 80 characters.	2002-07-19 17:44:44 +00:00
Maxime Henrion	72fda5bc50	- Merge the mount options at MNT_UPDATE time with vfs_mergeopts(). - Sanity check the mount options list (remove duplicates) with vfs_sanitizeopts(). - Fix some malloc(0)/free(NULL) bugs. Reviewed by: rwatson (some time ago)	2002-07-19 16:05:31 +00:00
Kirk McKusick	7aca6291e3	Add support to UFS2 to provide storage for extended attributes. As this code is not actually used by any of the existing interfaces, it seems unlikely to break anything (famous last words). The internal kernel interface to manipulate these attributes is invoked using two new IO_ flags: IO_NORMAL and IO_EXT. These flags may be specified in the ioflags word of VOP_READ, VOP_WRITE, and VOP_TRUNCATE. Specifying IO_NORMAL means that you want to do I/O to the normal data part of the file and IO_EXT means that you want to do I/O to the extended attributes part of the file. IO_NORMAL and IO_EXT are mutually exclusive for VOP_READ and VOP_WRITE, but may be specified individually or together in the case of VOP_TRUNCATE. For example, when removing a file, VOP_TRUNCATE is called with both IO_NORMAL and IO_EXT set. For backward compatibility, if neither IO_NORMAL nor IO_EXT is set, then IO_NORMAL is assumed. Note that the BA_ and IO_ flags have been `merged' so that they may both be used in the same flags word. This merger is possible by assigning the IO_ flags to the low sixteen bits and the BA_ flags the high sixteen bits. This works because the high sixteen bits of the IO_ word are reserved for read-ahead and help with write clustering so will never be used for flags. This merge lets us get away from code of the form: if (ioflags & IO_SYNC) flags |= BA_SYNC; For the future, I have considered adding a new field to the vattr structure, va_extsize. This addition could then be exported through the stat structure to allow applications to find out the size of the extended attribute storage and also would provide a more standard interface for truncating them (via VOP_SETATTR rather than VOP_TRUNCATE). I am also contemplating adding a pathconf parameter (for concreteness, let's call it _PC_MAX_EXTSIZE) which would let an application determine the maximum size of the extended attribute storage. Sponsored by: DARPA & NAI Labs.	2002-07-19 07:29:39 +00:00
Julian Elischer	9f189ade99	Clear up confusion in ugly code. ^T gave wrong results for RSS. I misinterpreted this code when changing it to handle threads. (there are still issues here) Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	2002-07-18 21:19:56 +00:00
Peter Wemm	02fb42b0a8	ia64 does not have the same degree of stealth include file nesting, so it needs an explicit #include <machine/frame.h> to get 'struct trapframe'. The fact that it needs this at this level is rather bogus but it will not compile without it.	2002-07-17 23:43:55 +00:00
Peter Wemm	8a2bd34560	Pacify gcc on ia64	2002-07-17 23:32:13 +00:00
Julian Elischer	2d014fd7f8	Fix a reversed test. Fix some style nits. Fix a KASSERT message. Add/fix some comments. Submitted by: bde@freebsd.org	2002-07-17 19:20:48 +00:00
Julian Elischer	cad4143a58	Make sure the process state for the idle proc is set correctly from the beginning.	2002-07-17 19:18:45 +00:00
John Baldwin	3d3f20cbe6	Preallocate a struct file as the first thing in falloc() before we lock the filelist_lock and check nfiles. This closes a race where we had to unlock the filedesc to re-lock the filelist_lock. Reported by: David Xu Reviewed by: bde (mostly)	2002-07-17 02:48:43 +00:00
John Baldwin	627ed43ba7	Add a KASSERT() to assert that td_critnest is == 1 when mi_switch() is called.	2002-07-17 02:46:13 +00:00
Andrew Gallatin	fe79953325	Allow alphas to do crashdumps: Refuse to run anything in choosethread() after a panic which is not an interrupt thread, or the thread which caused the panic. Also, remove panicstr checks from msleep() and from cv_wait() in order to allow threads to go to sleep and yield the cpu to the panicking thread, or to an interrupt thread which might be doing the crashdump. Reviewed by: jhb (and it was mostly his idea too)	2002-07-17 02:23:44 +00:00
Kirk McKusick	fb36a3d847	Change utimes to set the file creation time (for filesystems that support creation times such as UFS2) to the value of the modification time if the value of the modification time is older than the current creation time. See utimes(2) for further details. Sponsored by: DARPA & NAI Labs.	2002-07-17 02:03:19 +00:00
Kirk McKusick	faab4e2722	Change the name of st_createtime to st_birthtime. This change is made to reduce confusion between st_ctime and st_createtime. Submitted by: Eric Allman <eric@sendmail.org> Sponsored by: DARPA & NAI Labs.	2002-07-16 22:36:00 +00:00
Mark Murray	f0d2d03884	Fix a bazillion lint and WARNS warnings. One major fix is the removal of semicolons from the end of macros: #define FOO() bar(a,b,c); becomes #define FOO() bar(a,b,c) Thus requiring the semicolon in the invocation of FOO. This is much cleaner syntax and more consistent with expectations when writing function-like things in source. With both peril-sensitive sunglasses and flame-proof undies on, tighten up some types, and work around some warnings generated by this. There are some _horrible_ const/non-const issues in this code.	2002-07-15 17:28:34 +00:00
Mark Murray	b90cce95e0	Use ISO 9X variadic macro format; arguments are not optional, just variable.	2002-07-15 17:17:56 +00:00
Bosko Milekic	185c2244ce	o Introduce new m_getcl() interface routine that allocates an mbuf and a cluster in one shot. o Introduce MBP_PERSIST and MBP_PERSISTENT control bits to mb_alloc(); MBP_PERSIST means "if you can allocate, then keep the cache lock held on exit," and MBP_PERSISTENT means "a cache lock is already held on entry, so allocate from the specified (already locked) cache." They may be used in combination. o m_getcl() uses the MBP_PERSIST/MBP_PERSISTENT interface so that it doesn't drop the cache lock in between the mbuf and cluster allocations. o m_getm(), which takes a size and allocates an mbuf + cluster "best fit" chain, has been moved from uipc_mbuf.c to subr_mbuf.c and shown how to use MBP_PERSIST/MBP_PERSISTENT to attempt to do a grouped allocation without dropping the cache lock in between. Why this is good: much less bus-locked lock acquires/drops when they're not needed. Also, prototype for m_getcl(): struct mbuf * m_getcl(int how, short type, int flags); "how" and "type" are self-explanatory. "flags" may be M_PKTHDR, in which case m_getcl() will make the mbuf a pkthdr-mbuf. While I'm in subr_mbuf.c: o Every exported routine now has a nice comment with a description of the expected arguments. Eventually, mbuf(9) needs to be re-vamped but there's still more code to write/finalize before I get to that. o internal macros have been changed a bit. o consistently use 'short' for "type." This somehow slipped through before (that 'type' was sometimes declared as int). Alfred has been pushing for the MBP_PERSIST{,ENT} thing for almost a year now. Luigi asked for m_getcl(), and will probably MFC that part of this commit. TODO [Related]: teach mb_free() about MBP_PERSIST{, ENT}.	2002-07-15 15:32:59 +00:00

5417 Commits