freebsd-skq

Author	SHA1	Message	Date
jmg	bc1805c6e8	Add locking to the kqueue subsystem. This also makes the kqueue subsystem a more complete subsystem, and removes the knowledge of how things are implemented from the drivers. Include locking around filter ops, so a module like aio will know when not to be unloaded if there are outstanding knotes using its filter ops. Currently, it uses the MTX_DUPOK even though it is not always safe to acquire duplicate locks. Witness currently doesn't support the ability to discover if a dup lock is ok (in some cases). Reviewed by: green, rwatson (both earlier versions)	2004-08-15 06:24:42 +00:00
rwatson	37eebe5058	Flag a broad range of VFS operations as GIANT_REQUIRED in order to catch leaking into VFS without Giant. Inch Giant a little lower in several file descriptor operations on vnodes to cover only VFS operations that need it, rather than file flag reading, etc.	2004-08-06 22:25:35 +00:00
rwatson	92f30976fe	Push Giant acquisition down into fo_stat() from most callers. Acquire Giant conditional on debug.mpsafenet in the socket soo_stat() routine, unconditionally in vn_statfile() for VFS, and otherwise don't acquire Giant. Accept an unlocked read in kqueue_stat(), and cryptof_stat() is a no-op. Don't acquire Giant in fstat() system call. Note: in fdescfs, fo_stat() is called while holding Giant due to the VFS stack sitting on top, and therefore there will still be Giant recursion in this case.	2004-07-22 20:40:23 +00:00
rwatson	861b3c4416	Push acquisition of Giant from fdrop_closed() into fo_close() so that individual file object implementations can optionally acquire Giant if they require it: - soo_close(): depends on debug.mpsafenet - pipe_close(): Giant not acquired - kqueue_close(): Giant required - vn_close(): Giant required - cryptof_close(): Giant required (conservative) Notes: Giant is still acquired in close() even when closing MPSAFE objects due to kqueue requiring Giant in the calling closef() code. Microbenchmarks indicate that this removal of Giant cuts 3%-3% off of pipe create/destroy pairs from user space with SMP compiled into the kernel. The cryptodev and opencrypto code appears MPSAFE, but I'm unable to test it extensively and so have left Giant over fo_close(). It can probably be removed given some testing and review.	2004-07-22 18:35:43 +00:00
marcel	c20ced5cd2	Update for the KDB framework: o Call kdb_enter() instead of Debugger().	2004-07-10 21:47:53 +00:00
tjr	02a7d287a2	Change the types of vn_rdwr_inchunks()'s len and aresid arguments to size_t and size_t *, respectively. Update callers for the new interface. This is a better fix for overflows that occurred when dumping segments larger than 2GB to core files.	2004-06-05 02:18:28 +00:00
rwatson	41a003003f	Rather than assert f_type==DTYPE_VNODE, conditionally perform the file lock release based on f_type==DTYPE_VNODE. vn_closefile() is used by non-vnode types as well (fifo).	2004-06-01 23:36:47 +00:00
rwatson	1e76056c09	Push the VOP_ADVLOCK() call to release advisory locks on vnode file descriptors out of fdrop_locked() and into vn_closefile(). This removes all knowledge of vnodes from fdrop_locked(), since the lock behavior was specific to vnodes. This also removes the specific requirement for Giant in fdrop_locked(), it's now only required by code that it calls into. Add GIANT_REQUIRED to vn_closefile() since VFS requires Giant.	2004-06-01 18:03:20 +00:00
rwatson	13656d723e	Assert Giant in vn_start_write() and vn_finished_write().	2004-05-31 20:56:10 +00:00
imp	74cf37bd00	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
bde	58d250bc29	Align the offset in vn_rdwr_inchunks() so that at most the first and the last chunk are misaligned relative to a MAXBSIZE byte boundary. vn_rdwr_inchunks() is used mainly for elf core dumps, and elf sections are usually perfectly misaligned relative to MAXBSIZE, and chunking prevents the file system from doing much realigning. This gives a surprisingly large speedup for core dumps -- from 50 to 13 seconds for a 512MB core dump here. The pessimization was mostly from an interaction of the misalignment with IO_DIRECT. It increased the number of i/o's for each chunk by a factor of 5 (3 writes and 2 read-before-writes instead of 1 write).	2004-03-13 02:56:27 +00:00
bde	7d91626477	v_vxproc was a bogus name for a thread (pointer).	2003-12-28 09:12:56 +00:00
jeff	c1590e8666	- If we are called with LK_NOWAIT in vn_lock() we may be holding a mutex and should not sleep while waiting for XLOCK to clear. Care needs to be taken in functions that use this capability to avoid spinning.	2003-10-04 14:35:22 +00:00
rwatson	d2f7ae9f88	Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the kernel ACL interfaces and system call names. Break out UFS2 and FFS extattr delete and list vnode operations from setextattr and getextattr to deleteextattr and listextattr, which cleans up the implementations, and makes the results more readable, and makes the APIs more clear. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-07-28 18:53:29 +00:00
phk	b80d7fd8a0	Pass the fdidx argument from vn_open{_cred}() onto VOP_OPEN()	2003-07-27 20:05:36 +00:00
phk	d4d7ca154a	Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.	2003-07-27 17:04:56 +00:00
phk	6221ef9078	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful, in particular, to fdescfs and /dev/fd/. For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
rwatson	45b727fb41	Prefer the vop_rmextattr() vnode operation for removing extended attributes from objects over vop_setextattr() with a NULL uio; if the file system doesn't support the vop_rmextattr() method, fall back to the vop_setextattr() method. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-06-22 23:03:07 +00:00
phk	c81c59299b	Add a f_vnode field to struct file. Several of the subtypes have an associated vnode which is used for stuff like the f*() functions. By giving the vnode a separate field, a number of checks for the specific subtype can be replaced simply with a check for f_vnode != NULL, and we can later free f_data up to subtype specific use. At this point in time, f_data still points to the vnode, so any code I might have overlooked will still work.	2003-06-22 08:41:43 +00:00
phk	a81d7fdac7	Introduce a new flag on a file descriptor: DFLAG_SEEKABLE and use that rather than assume that only DTYPE_VNODE is seekable.	2003-06-18 19:53:59 +00:00
phk	591f399cfe	Initialize struct fileops with C99 sparse initialization.	2003-06-18 18:16:40 +00:00
obrien	3b8fff9e4c	Use __FBSDID().	2003-06-11 00:56:59 +00:00
rwatson	16f34ab413	Assert the vnode lock when returning successfully from vn_open_cred().	2003-06-04 00:54:27 +00:00
kan	9468fdaf14	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
tegge	d9da9de257	fp->f_offset doesn't need any protection when it isn't accessed.	2003-03-26 19:21:12 +00:00
alfred	29fb7c2bce	Do not allow kqueues to be passed via unix domain sockets.	2003-02-15 06:04:55 +00:00
dillon	ccd5574cc6	Bow to the whining masses and change a union back into void *. Retain removal of unnecessary casts and throw in some minor cleanups to see if anyone complains, just for the hell of it.	2003-01-13 00:33:17 +00:00
dillon	ddf9ef103e	Change struct file f_data to un_data, a union of the correct struct pointer types, and remove a huge number of casts from code using it. Change struct xfile xf_data to xun_data (ABI is still compatible). If we need to add a #define for f_data and xf_data we can, but I don't think it will be necessary. There are no operational changes in this commit.	2003-01-12 01:37:13 +00:00
green	19fd807f21	In vn_open(), unset ndp->ni_vp when returning failure so that code which expects it to be NULL unless the return value was 0 will work. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-01-07 20:59:55 +00:00
dillon	4ecb4d83e4	Abstract-out the constants for the sequential heuristic. No operational changes. MFC after: 1 day	2002-12-28 20:28:10 +00:00
phk	22ca3b530e	White-space changes.	2002-12-24 09:44:51 +00:00
phk	b9e7819690	Detediousficate declaration of fileops array members by introducing typedefs for them.	2002-12-23 21:53:20 +00:00
mckusick	6b1611bd94	Within ufs, the ffs_sync and ffs_fsync functions did not always check for and/or report I/O errors. The result is that a VFS_SYNC or VOP_FSYNC called with MNT_WAIT could loop infinitely on ufs in the presence of a hard error writing a disk sector or in a filesystem full condition. This patch ensures that I/O errors will always be checked and returned. This patch also ensures that every call to VFS_SYNC or VOP_FSYNC with MNT_WAIT set checks for and takes appropriate action when an error is returned. Sponsored by: DARPA & NAI Labs.	2002-10-25 00:20:37 +00:00
rwatson	ae81971478	Drop in the MAC check for file creation as part of open(). Approved by: re Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-10-19 20:56:44 +00:00
phk	ccb0271ad3	Under DIAGNOSTIC, complain if ENOIOCTL leaks out through VOP_IOCTL().	2002-09-26 21:21:13 +00:00
charnier	7dd9d47059	Replace various spelling with FALLTHROUGH which is lint()able	2002-08-25 13:23:09 +00:00
jeff	da601a39ac	- Fix a mistake in my last few commits. The PDROP flag stops msleep from re-acquiring the mutex. Pointy hat to: me Noticed by: tegge	2002-08-23 00:32:03 +00:00
jeff	14c25eb632	- Closer inspection revealed a possible deadlock situation in vn_lock() that was introduced by my last commit but not caught by stress testing. Fix that and slightly restructure the code so that it is more readable.	2002-08-22 07:57:43 +00:00
jeff	6c5497f47a	- Make vn_lock() vget() and VOP_LOCK() all behave the same way WRT LK_INTERLOCK. The interlock will never be held on return from these functions even when there is an error. Errors typically only occur when the XLOCK is held which means this isn't the vnode we want anyway. Almost all users of these interfaces expected this behavior even though it was not provided before.	2002-08-22 07:44:45 +00:00
jeff	120149c075	- Return two shared locks to exclusive locks. This was premature. - Document the problems that prevent us from using shared locks.	2002-08-22 07:26:18 +00:00
jeff	820f26ad86	- Fix interlock handling in vn_lock(). Previously, vn_lock() could return with interlock held in error conditions when the caller did not specify LK_INTERLOCK. - Add several comments to vn_lock() describing the rationale behind the code flow since it was not immediately obvious.	2002-08-22 06:58:11 +00:00
jeff	275611472a	- Document two cases, one in vget and the other in vn_lock, where the state of interlock on exit is not consistent. There are probably several bugs relating to this.	2002-08-21 08:34:48 +00:00
rwatson	a1cb1e3bed	Pass active_cred and file_cred into the MAC framework explicitly for mac_check_vnode_{poll,read,stat,write}(). Pass in fp->f_cred when calling these checks with a struct file available. Otherwise, pass NOCRED. All current MAC policies use active_cred, but could now offer the cached credential semantic used for the base system security model. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 19:04:53 +00:00
rwatson	1a7cd1a210	Break out mac_check_vnode_op() into three separate checks: mac_check_vnode_poll(), mac_check_vnode_read(), mac_check_vnode_write(). This improves the consistency with other existing vnode checks, and allows policies to avoid implementing switch statements to determine what operations they do and do not want to authorize. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-19 16:43:25 +00:00
rwatson	3246fbf45f	In continuation of early fileop credential changes, modify fo_ioctl() to accept an 'active_cred' argument reflecting the credential of the thread initiating the ioctl operation. - Change fo_ioctl() to accept active_cred; change consumers of the fo_ioctl() interface to generally pass active_cred from td->td_ucred. - In fifofs, initialize filetmp.f_cred to ap->a_cred so that the invocations of soo_ioctl() are provided access to the calling f_cred. Pass ap->a_td->td_ucred as the active_cred, but note that this is required because we don't yet distinguish file_cred and active_cred in invoking VOP's. - Update kqueue_ioctl() for its new argument. - Update pipe_ioctl() for its new argument, pass active_cred rather than td_ucred to MAC for authorization. - Update soo_ioctl() for its new argument. - Update vn_ioctl() for its new argument, use active_cred rather than td->td_ucred to authorize VOP_IOCTL() and the associated VOP_GETATTR(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-17 02:36:16 +00:00
rwatson	2b82cd24f1	Make similar changes to fo_stat() and fo_poll() as made earlier to fo_read() and fo_write(): explicitly use the cred argument to fo_poll() as "active_cred" using the passed file descriptor's f_cred reference to provide access to the file credential. Add an active_cred argument to fo_stat() so that implementers have access to the active credential as well as the file credential. Generally modify callers of fo_stat() to pass in td->td_ucred rather than fp->f_cred, which was redundantly provided via the fp argument. This set of modifications also permits threads to perform these operations on behalf of another thread without modifying their credential. Trickle this change down into fo_stat/poll() implementations: - badfo_poll(), badfo_stat(): modify/add arguments. - kqueue_poll(), kqueue_stat(): modify arguments. - pipe_poll(), pipe_stat(): modify/add arguments, pass active_cred to MAC checks rather than td->td_ucred. - soo_poll(), soo_stat(): modify/add arguments, pass fp->f_cred rather than cred to pru_sopoll() to maintain current semantics. - sopoll(): modify arguments. - vn_poll(), vn_statfile(): modify/add arguments, pass new arguments to vn_stat(). Pass active_cred to MAC and fp->f_cred to VOP_POLL() to maintain current semantics. - vn_close(): rename cred to file_cred to reflect reality while I'm here. - vn_stat(): Add active_cred and file_cred arguments to vn_stat() and consumers so that this distinction is maintained at the VFS as well as 'struct file' layer. Pass active_cred instead of td->td_ucred to MAC and to VOP_GETATTR() to maintain current semantics. - fifofs: modify the creation of a "filetemp" so that the file credential is properly initialized and can be used in the socket code if desired. Pass ap->a_td->td_ucred as the active credential to soo_poll(). If we teach the vnop interface about the distinction between file and active credentials, we would use the active credential here.
Note that current inconsistent passing of active_cred vs. file_cred to VOP's is maintained. It's not clear why GETATTR would be authorized using active_cred while POLL would be authorized using file_cred at the file system level. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-16 12:52:03 +00:00
rwatson	44404e4547	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation.
Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
rwatson	d14df136e2	Implement IO_NOMACCHECK in vn_rdwr() -- perform MAC checks (assuming 'options MAC') as long as IO_NOMACCHECK is not set in the IO flags. If IO_NOMACCHECK is set, bypass MAC checks in vn_rdwr(). This allows vn_rdwr() to be used as a utility function inside of file systems where MAC checks have already been performed, or where the operation is being done on behalf of the kernel not the user. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 16:15:34 +00:00
rwatson	551b018164	Due to layering problems, remove the MAC checks from vn_rdwr() -- this VOP wrapper is called from within file systems so can result in odd loopback effects when MAC enforcement is used with the active (as opposed to saved) credential. These checks will be moved elsewhere. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-08 12:45:30 +00:00
jeff	02517b6731	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00

207 Commits