freebsd-dev

Author	SHA1	Message	Date
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Konstantin Belousov	e30cf87ba1	Do not assert any locks for VOP_PRINT. In particular, do not assert that the vnode interlock is not held. vn_printf() already correctly handles locked and unlocked vnode interlocks, and all the in-tree vop_print methods are interlock-agnostic. Some code calls vprintf() with the vnode interlock held, that causes unjustified panics with INVARIANTS (ffs_syncvnode() as example). Reported by: Peter Holm	2008-02-26 12:16:35 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the useless extra argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Konstantin Belousov	d413d21071	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Kip Macy	2f6a774be4	change vop_lock handling to allowing tracking of callers' file and line for acquisition of lockmgr locks Approved by: scottl (standing in for mentor rwatson)	2006-11-13 05:51:22 +00:00
Diomidis Spinellis	23efd78d03	Remove two locking assertion entries that: a) were incorrectly written and therefore never compiled into assertions, and b) were incorrectly specified and when compiled resulted in a failed assertion.	2006-05-31 14:06:06 +00:00
Diomidis Spinellis	f69ec7af12	Assertion code specifications are introduced using special character sequences that are distinct from comments. %% is used for argument locks; %! for pre- and post-conditions.	2006-05-30 20:49:54 +00:00
Diomidis Spinellis	b1b4282160	Remove incorrect lock validation specifications that caused failed assertions with DEBUG_VFS_LOCKS. We should reinstate them with correct specifications, possibly after extending vnode_if.awk. Noted by: truckman@	2006-05-30 20:21:51 +00:00
Diomidis Spinellis	0e1c7fb8ea	Add missing % signs in the lock annotations of the functions: lookup, rename, strategy, islocked The missing % sign meant that the lines were processed as plain comments and the corresponding assertions were never generated.	2006-05-28 07:24:12 +00:00
Dag-Erling Smørgrav	0430a5e289	Eradicate caddr_t from the VFS API.	2005-12-14 00:49:52 +00:00
Suleiman Souhlal	679985d03a	Allow EVFILT_VNODE events to work on every filesystem type, not just UFS by: - Making the pre and post hooks for the VOP functions work even when DEBUG_VFS_LOCKS is not defined. - Moving the KNOTE activations into the corresponding VOP hooks. - Creating a MNTK_NOKNOTE flag for the mnt_kern_flag field of struct mount that permits filesystems to disable the new behavior. - Creating a default VOP_KQFILTER function: vfs_kqfilter() My benchmarks have not revealed any performance degradation. Reviewed by: jeff, bde Approved by: rwatson, jmg (kqueue changes), grehan (mentor)	2005-06-09 20:20:31 +00:00
Jeff Roberson	17c916e321	- Mark the VOPs that require exclusive locks. Those that aren't marked with E may be called with a shared lock held. This list really could be made per filesystem if we had any filesystems which differed from ffs in locking guarantees. VFS itself is not sensitive to this except where vgone() etc. are concerned. Sponsored by: Isilon Systems, Inc.	2005-04-11 15:19:29 +00:00
Jeff Roberson	4e6746965e	- CLOSE, REVOKE, INACTIVE, and RECLAIM are not L L L, that's a locked vnode on enter, exit, error. This allows for the removal of the XLOCK. Sponsored by: Isilon Systems, Inc.	2005-03-13 11:42:16 +00:00
Poul-Henning Kamp	7ee4eb6192	VOP_DESTROYVOBJECT() is no more.	2005-02-07 09:26:58 +00:00
Poul-Henning Kamp	729fcf7efb	Take VOP_GETVOBJECT() out to pasture. We use the direct pointer now.	2005-01-25 00:42:16 +00:00
Poul-Henning Kamp	69816ea35e	Kill VOP_CREATEVOBJECT(), it is now the responsibility of the filesystem for a given vnode to create a vnode_pager object if one is needed.	2005-01-25 00:12:24 +00:00
Poul-Henning Kamp	8df6bac4c7	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to be the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
Warner Losh	9454b2d864	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 23:35:40 +00:00
Poul-Henning Kamp	9c83534dd8	Make VOP_BMAP return a struct bufobj for the underlying storage device instead of a vnode for it. The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)	2004-11-15 09:18:27 +00:00
Poul-Henning Kamp	c108bb741c	Remove VOP_SPECSTRATEGY() from the system.	2004-10-29 10:59:28 +00:00
Poul-Henning Kamp	883d3c0c07	Remove the buffercache/vnode side of BIO_DELETE processing in preparation for integration of p4::phk_bufwork. In the future, local filesystems will talk to GEOM directly and they will consequently be able to issue BIO_DELETE directly. Since the removal of the fla driver, BIO_DELETE has effectively been a no-op anyway.	2004-09-13 06:50:42 +00:00
Warner Losh	7f8a436ff2	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999. Approved by: core	2004-04-05 21:03:37 +00:00
Robert Watson	9080ff25cf	Rename VOP_RMEXTATTR() to VOP_DELETEEXTATTR() for consistency with the kernel ACL interfaces and system call names. Break out UFS2 and FFS extattr delete and list vnode operations from setextattr and getextattr to deleteextattr and listextattr, which cleans up the implementations, and makes the results more readable, and makes the APIs more clear. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-07-28 18:53:29 +00:00
Poul-Henning Kamp	1b6c609507	Call the new argument "fdidx" that is more precise than "fd".	2003-07-27 17:03:20 +00:00
Poul-Henning Kamp	a8d43c90af	Add a "int fd" argument to VOP_OPEN() which in the future will contain the filedescriptor number on opens from userland. The index is used rather than a "struct file *" since it conveys a bit more information, which may be useful to in particular fdescfs and /dev/fd/ For now pass -1 all over the place.	2003-07-26 07:32:23 +00:00
Robert Watson	77533ed2aa	Expose vop_rmextattr as an explicit operation at the vnode operation interface, rather than relying on a NULL uio for the deletion operation. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-06-22 22:45:24 +00:00
Stefan Eßer	c2ef4dd48a	Add comment about **vpp being special-cased in vnode_if.awk (1.38)	2003-06-20 12:24:06 +00:00
Robert Watson	a6f1342ff6	Add vop_listextattr(), similar to vop_getextattr() but without a specific attribute name. It will have the same semantics as the older vop_getextattr() "retrieve the names" hack, returning a buffer with ASCII nul-separated names. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2003-06-05 05:53:35 +00:00
Poul-Henning Kamp	f5b11b6e2d	Temporarily introduce a new VOP_SPECSTRATEGY operation while I try to sort out disk-io from file-io in the vm/buffer/filesystem space. The intent is to sort VOP_STRATEGY calls into those which operate on "real" vnodes and those which operate on VCHR vnodes. For the latter kind, the call will be changed to VOP_SPECSTRATEGY, possibly conditionally for those places where dual-use happens. Add a default VOP_SPECSTRATEGY method which will call the normal VOP_STRATEGY. First time it is called it will print debugging information. This will only happen if a normal vnode is passed to VOP_SPECSTRATEGY by mistake. Add a real VOP_SPECSTRATEGY in specfs, which does what VOP_STRATEGY does on a VCHR vnode today. Add a new VOP_STRATEGY method in specfs to catch instances where the conversion to VOP_SPECSTRATEGY has not yet happened. Handle the request just like we always did, but first time called print debugging information. Apart up to two instances of console messages per boot, this amounts to a glorified no-op commit. If you get any of the messages on your console I would very much like a copy of them mailed to phk@freebsd.org	2003-01-04 22:10:36 +00:00
Robert Watson	79191eca57	Flush vop_refreshlabel() definition, since it is no longer used. Obtained from: TrustedBSD Project Sponsored by: DARPA, Network Associates Laboratories	2002-12-24 19:47:13 +00:00
Jeff Roberson	6fc15f9bdf	- We don't need any automated lock checking for vop_islocked.	2002-09-26 00:31:16 +00:00
Don Lewis	fa288043e2	VOP_FSYNC() requires that its vnode argument be locked, which nfs_link() wasn't doing. Rather than just lock and unlock the vnode around the call to VOP_FSYNC(), implement rwatson's suggestion to lock the file vnode in kern_link() before calling VOP_LINK(), since the other filesystems also locked the file vnode right away in their link methods. Remove the locking and unlocking from the leaf filesystem link methods. Reviewed by: rwatson, bde (except for the unionfs_link() changes)	2002-09-19 13:32:45 +00:00
Poul-Henning Kamp	e1657bbb97	Introduce the VOP_OPENEXTATTR() and VOP_CLOSEEXTATTR() methods. Together these two implement a simple transaction style grouping for modifications of extended attributes on a vnode. VOP_CLOSEEXTATTR() takes a boolean "commit" argument, which determines if the aggregate changes are attempted written or not. A commit will fail if any of the VOP_SETEXTATTR() calls since the VOP_OPENEXTATTR() have failed to meet their objective or if the flush to disk fails. The default operations for these two VOP's is to return EOPNOTSUPP. This API may still be subject to change. Sponsored by: DARPA & NAI Labs	2002-09-05 20:56:14 +00:00
Jeff Roberson	71ea4ba57c	- Add two new debugging macros: ASSERT_VI_LOCKED and ASSERT_VI_UNLOCKED - Use the new VI asserts in place of the old mtx_assert checks. - Add the VI asserts to the automated lock checking in the VOP calls. The interlock should not be held across vops with a few exceptions. - Add the vop_(un)lock_{pre,post} functions to assert that interlock is held when LK_INTERLOCK is set.	2002-08-21 06:19:29 +00:00
Robert Watson	f8ef020e2e	Begin committing support for Mandatory Access Control and extensible kernel access control. The MAC framework permits loadable kernel modules to link to the kernel at compile-time, boot-time, or run-time, and augment the system security policy. This commit includes the initial kernel implementation, although the interface with the userland components of the operating system is still under work, and not all kernel subsystems are supported. Later in this commit sequence, documentation of which kernel subsystems will not work correctly with a kernel compiled with MAC support will be added. Introduce two new vnode operations required to support MAC. First, VOP_REFRESHLABEL(), which will be invoked by callers requiring that vp->v_label be sufficiently "fresh" for access control purposes. Second, VOP_SETLABEL(), which will be invoked by callers requiring that the passed label contents be updated. The file system is responsible for updating v_label if appropriate in coordination with the MAC framework, as well as committing to disk. File systems that are not MAC-aware need not implement these VOPs, as the MAC framework will default to maintaining a single label for all vnodes based on the label on the file system mount point. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-07-30 22:15:09 +00:00
Jeff Roberson	1e4c7a1368	- Acknowledge recursive vnode locks in the vop_unlock specification. The vnode may not be unlocked even if the operation succeeded.	2002-07-30 08:50:52 +00:00
Jeff Roberson	50bfcee1cb	- Use the new vop_lookup_{pre,post} instead of simpler locking specification.	2002-07-09 19:55:06 +00:00
Jeff Roberson	41a5470d03	- Require locks for getattr. At some point this could only require shared locks.	2002-07-07 22:37:45 +00:00
Jeff Roberson	e818064e98	- Disable original vop_strategy lock specification. - Switch to the new vop_strategy_pre for lock validation. VOP_STRATEGY requires only that the buf is locked UNLESS the block numbers need to be translated. There may be other reasons, but as long as the underlying layer uses a VOP to perform the operations they will be caught later.	2002-07-06 05:23:17 +00:00
Jeff Roberson	13e407efee	Use the new #! directive for vop_rename. Leave the old lock specification intact but disabled.	2002-07-06 04:41:27 +00:00
Poul-Henning Kamp	98b0c78978	Make daddr_t and u_daddr_t 64bits wide. Retire daddr64_t and use daddr_t instead. Sponsored by: DARPA & NAI Labs.	2002-05-14 11:09:43 +00:00
Kirk McKusick	0d2af52141	Introduce the new 64-bit size disk block, daddr64_t. Change the bio and buffer structures to have daddr64_t bio_pblkno, b_blkno, and b_lblkno fields which allows access to disks larger than a Terabyte in size. This change also requires that the VOP_BMAP vnode operation accept and return daddr64_t blocks. This delta should not affect system operation in any way. It merely sets up the necessary interfaces to allow the development of disk drivers that work with these larger disk block addresses. It also allows for the development of UFS2 which will use 64-bit block addresses.	2002-03-15 18:49:47 +00:00
Robert Watson	eae1306746	Per discussion at BSDCon, note that the vop_getattr locking protocol should require a shared lock, rather than an exclusive lock, which can improve performance. No actual code change here, since a number of VFS locking fixes are in the works.	2002-02-18 00:22:57 +00:00
Robert Watson	1745909176	Add a comment indicating that the locking protocol should be updated to be 'L L L' for vop_getattr(). Don't update it yet, because there are still many offenders.	2002-02-10 21:46:16 +00:00
Robert Watson	74237f55b0	Part I: Update extended attribute API and ABI: o Modify the system call syntax for extattr_{get,set}_{fd,file}() so as not to use the scatter gather API (which appeared not to be used by any consumers, and be less portable), rather, accepts 'data' and 'nbytes' in the style of other simple read/write interfaces. This changes the API and ABI. o Modify system call semantics so that extattr_get_{fd,file}() return a size_t. When performing a read, the number of bytes read will be returned, unless the data pointer is NULL, in which case the number of bytes of data are returned. This changes the API only. o Modify the VOP_GETEXTATTR() vnode operation to accept a *size_t argument so as to return the size, if desirable. If set to NULL, the size will not be returned. o Update various filesystems (pseudofs, ufs) to DTRT. These changes should make extended attributes more useful and more portable. More commits to rebuild the system call files, as well as update userland utilities to follow. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-02-10 04:43:22 +00:00
Julian Elischer	b40ce4165d	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
Assar Westerlund	2b3dc41c15	correct description of `vpp' for mknod/symlink: they are actually returned locked	2001-07-24 16:16:00 +00:00
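
Commit dfdcada31e above describes synchronous deadlock detection: a lock request that would deadlock is rejected immediately, including when it is blocked by more than one owner. The following is a minimal user-space sketch of that idea as a wait-for graph, not the kernel's implementation. The class `WaitGraph` and its methods `block` and `wake` are hypothetical names invented for this illustration.

```python
class WaitGraph:
    """Toy wait-for graph (hypothetical, for illustration only).

    Maps each blocked lock owner to the set of owners it is waiting on.
    """

    def __init__(self):
        self.waits_on = {}

    def _reaches(self, start, target):
        # Depth-first search along wait edges from start, looking for target.
        stack, seen = [start], set()
        while stack:
            owner = stack.pop()
            if owner == target:
                return True
            if owner in seen:
                continue
            seen.add(owner)
            stack.extend(self.waits_on.get(owner, ()))
        return False

    def block(self, owner, blockers):
        """Try to queue owner behind every owner in blockers.

        Returns False, without queueing, if any blocker already waits
        (directly or transitively) on owner; that would close a cycle.
        """
        if any(self._reaches(b, owner) for b in blockers):
            return False
        self.waits_on[owner] = set(blockers)
        return True

    def wake(self, owner):
        # The owner got its lock (or gave up); it no longer waits on anyone.
        self.waits_on.pop(owner, None)


graph = WaitGraph()
graph.block("A", {"B"})                  # A waits on B
graph.block("B", {"C", "D"})             # B is blocked by two owners at once
deadlock = not graph.block("D", {"A"})   # D -> A -> B -> D would be a cycle
```

Because the check runs before the request is queued, a request that is accepted can never be part of a cycle later, which is the property the commit message contrasts with the old 'deferred deadlock' behaviour.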

91 Commits