freebsd-skq

Author	SHA1	Message	Date
attilio	07441f19e1	Optimize lockmgr in order to get rid of the pool mutex interlock, of the state transitioning flags and of msleep(9) callings. Use, instead, an algorithm very similar to what sx(9) and rwlock(9) alredy do and direct accesses to the sleepqueue(9) primitive. In order to avoid writer starvation a mechanism very similar to what rwlock(9) uses now is implemented, with the correspective per-thread shared lockmgrs counter. This patch also adds 2 new functions to lockmgr KPI: lockmgr_rw() and lockmgr_args_rw(). These two are like the 2 "normal" versions, but they both accept a rwlock as interlock. In order to realize this, the general lockmgr manager function "__lockmgr_args()" has been implemented through the generic lock layer. It supports all the blocking primitives, but currently only these 2 mappers live. The patch drops the support for WITNESS atm, but it will be probabilly added soon. Also, there is a little race in the draining code which is also present in the current CVS stock implementation: if some sharers, once they wakeup, are in the runqueue they can contend the lock with the exclusive drainer. This is hard to be fixed but the now committed code mitigate this issue a lot better than the (past) CVS version. In addition assertive KA_HELD and KA_UNHELD have been made mute assertions because they are dangerous and they will be nomore supported soon. In order to avoid namespace pollution, stack.h is splitted into two parts: one which includes only the "struct stack" definition (_stack.h) and one defining the KPI. In this way, newly added _lockmgr.h can just include _stack.h. Kernel ABI results heavilly changed by this commit (the now committed version of "struct lock" is a lot smaller than the previous one) and KPI results broken by lockmgr_rw() / lockmgr_args_rw() introduction, so manpages and __FreeBSD_version will be updated accordingly. Tested by: kris, pho, jeff, danger Reviewed by: jeff Sponsored by: Google, Summer of Code program 2007	2008-04-06 20:08:51 +00:00
kib	dae02901d4	The temporary workaround for the call to the vget() without lock type in the fdesc_allocvp(). The caller of the fdesc_allocvp() expects that the returned vnode is not reclaimed. Do lock the vnode exclusive and drop the lock after. Reported by: pho Reviewed by: jeff	2008-04-04 09:37:57 +00:00
kib	eff8c6d35e	Add the support for the AT_FDCWD and fd-relative name lookups to the namei(9). Based on the submission by rdivacky, sponsored by Google Summer of Code 2007 Reviewed by: rwatson, rdivacky Tested by: pho	2008-03-31 12:01:21 +00:00
jeff	30e699e885	- Simplify null_hashget() and null_hashins() by using vref() rather than a complex series of steps involving vget() without a lock type to emulate the same thing.	2008-03-29 23:24:54 +00:00
dfr	79d2dfdaa6	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
jeff	a9d123c3ab	- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)	2008-03-22 09:15:16 +00:00
kib	de73f6b678	Do not dereference cdev->si_cdevsw, use the dev_refthread() to properly obtain the reference. In particular, this fixes the panic reported in the PR. Remove the comments stating that this needs to be done. PR: kern/119422 MFC after: 1 week	2008-03-20 16:08:42 +00:00
jeff	acb93d599c	Remove kernel support for M:N threading. While the KSE project was quite successful in bringing threading to FreeBSD, the M:N approach taken by the kse library was never developed to its full potential. Backwards compatibility will be provided via libmap.conf for dynamically linked binaries and static binaries will be broken.	2008-03-12 10:12:01 +00:00
rwatson	1e5ea4d085	Replace lockmgr lock protecting nwfs vnode hash table with an sx lock. MFC after: 1 month	2008-03-02 19:02:30 +00:00
rwatson	f0f0802303	Replace lockmgr lock protecting smbfs node hash table with sx lock. MFC after: 1 month	2008-03-02 18:56:13 +00:00
attilio	0d87334131	- Handle buffer lock waiters count directly in the buffer cache instead than rely on the lockmgr support [1]: * bump the waiters only if the interlock is held * let brelvp() return the waiters count * rely on brelvp() instead than BUF_LOCKWAITERS() in order to check for the waiters number - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr. - Modify flags accepted by lockinit(): * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2] * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used This patch breaks KPI so __FreBSD_version will be bumped and manpages updated by further commits. Additively, 'struct buf' changes results in a disturbed ABI also. [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon. [1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>	2008-03-01 19:47:50 +00:00
kib	eb463d58c6	Rename fdescfs vnode from "fdesc" to "fdescfs" to avoid name collision of the vnode lock with the fdesc_mtx mutex. Having different kinds of locks with the same name confuses witness.	2008-02-26 10:10:55 +00:00
rwatson	7abbbd5936	Add "Make MPSAFE" to the Coda todo list. MFC after: 3 days	2008-02-26 09:27:47 +00:00
attilio	4014b55830	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
attilio	0d54671a48	Introduce some functions in the vnode locks namespace and in the ffs namespace in order to handle lockmgr fields in a controlled way instead than spreading all around bogus stubs: - VN_LOCK_AREC() allows lock recursion for a specified vnode - VN_LOCK_ASHARE() allows lock sharing for a specified vnode In FFS land: - BUF_AREC() allows lock recursion for a specified buffer lock - BUF_NOREC() disallows recursion for a specified buffer lock Side note: union_subr.c::unionfs_node_update() is the only other function directly handling lockmgr fields. As this is not simple to fix, it has been left behind as "sole" exception.	2008-02-24 16:38:58 +00:00
marcel	c8e1e08fd3	Don't check the bpbSecPerTrack and bpbHeads fields of the BPB. They are typically 0 on new ia64 systems. Since we don't use either field, there's no harm in not checking.	2008-02-21 03:19:46 +00:00
rwatson	c8bd1d9da0	Remove custom queue macros in Coda, replacing them with queue(9) tailq macros. The only semantic change was the need to add a vc_opened field to struct vcomm since we can no longer use the request queue returning to an uninitialized state to hold whether or not the device is open. MFC after: 1 month	2008-02-17 14:33:28 +00:00
rwatson	dae55bda62	Remove namecache performance-tuning todo for Coda: we now use the FreeBSD name cache. MFC after: 1 month	2008-02-17 12:40:27 +00:00
rwatson	8009d6be3e	The possibly interruptible msleep in coda_call() means well, but is fundamentally fairly confused about how signals work and when it is appropriate for upcalls to be interrupted. In particular, we should be exempting certain upcalls from interruption, we should not always eventually time out sleeping on a upcall, and we should not be interrupting the sleep for certain signals that we currently are (including SIGINFO). This code needs to be reworked in the style of NFS interruptible mounts. MFC after: 1 month	2008-02-15 13:31:35 +00:00
rwatson	860deb0bbc	Spell replys as replies. MFC after: 1 month	2008-02-15 12:11:45 +00:00
rwatson	ac6e0fc083	Reorder and clean up make_coda_node(), annotate weaknesses in the implementation. MFC after: 1 month	2008-02-15 11:58:11 +00:00
rwatson	021108eeb6	Remove debugging code under OLD_DIAGNOSTIC; this is all >10 years old and hasn't been used in that time. MFC after: 1 month	2008-02-14 00:55:03 +00:00
rwatson	0cf10cfc00	In Coda, flush the attribute cache for a cnode when its fid is changed, as its synthesized inode number may have changed and we want stat(2) to pick up the new inode number. MFC after: 1 month	2008-02-14 00:30:06 +00:00
rwatson	5e4721882e	Update cache flushing behavior in light of recent namecache and access cache improvements: - Flush just access control state on CODA_PURGEUSER, not the full namecache for /coda. - When replacing a fid on a cnode as a result of, e.g., reintegration after offline operation, we no longer need to purge the namecache entries associated with its vnode. MFC after: 1 month	2008-02-13 19:50:17 +00:00
rwatson	621bdec0f6	Implement a rudimentary access cache for the Coda kernel module, modeled on the access cache found in NFS, smbfs, and the Linux coda module. This is a positive access cache of a single entry per file, tracking recently granted rights, but unlike NFS and smbfs, supporting explicit invalidation by the distributed file system. For each cnode, maintain a C_ACCCACHE flag indicating the validity of the cache, and a cached uid and mode tracking recently granted positive access control decisions. Prefer the cache to venus_access() in VOP_ACCESS() if it is valid, and when we must fall back to venus_access(), update the cache. Allow Venus to clear the access cache, either the whole cache on CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER. Unlike the Coda module on Linux, we don't flush all entries on a user purge using a generation number, we instead walk present cnodes and clear only entries for the specific user, meaning it is somewhat more expensive but won't hit all users. Since the Coda module is agressive about not keeping around unopened cnodes, the utility of the cache is somewhat limited for files, but works will for directories. We should make Coda less agressive about GCing cnodes in VOP_INACTIVE() in order to improve the effectiveness of in-kernel caching of attributes and access rights. MFC after: 1 month	2008-02-13 15:45:12 +00:00
rwatson	70ff5c5a61	Remove now-unused Coda namecache. MFC after: 1 month	2008-02-13 13:26:01 +00:00
rwatson	a9d8becadf	Rather than having the Coda module use its own namecache, use the global VFS namecache, as is done by the Coda module on Linux. Unlike the Coda namecache, the global VFS namecache isn't tagged by credential, so use ore conservative flushing behavior (for now) when CODA_PURGEUSER is issued by Venus. This improves overall integration with the FreeBSD VFS, including allowing __getcwd() to work better, procfs/procstat monitoring, and so on. This improves shell behavior in many cases, and improves ".." handling. It may lead to some slowdown until we've implemented a specific access cache, which should net improve performance, but in the mean time, lookup access control now always goes to Venus, whereas previously it didn't. MFC after: 1 month	2008-02-13 13:06:22 +00:00
attilio	313dc11b0b	Fix a lock leak in the ntfs locking scheme: When ntfs_ntput() reaches 0 in the refcount the inode lockmgr is not released and directly destroyed. Fix this by unlocking the lockmgr() even in the case of zero-refcount. Reported by: dougb, yar, Scot Hetzel <swhetzel at gmail dot com> Submitted by: yar	2008-02-13 13:02:12 +00:00
rwatson	8d55ec6003	Clean up coda_pathconf() slightly while debugging a problem there. MFC after: 1 month	2008-02-11 00:01:45 +00:00
rwatson	8a831a9f22	Since we're now actively maintaining the Coda module in the FreeBSD source tree, restyle everything but coda.h (which is more explicitly shared across systems) into a closer approximation to style(9). Remove a few more unused function prototypes. Add or clarify some comments. MFC after: 1 month	2008-02-10 11:18:12 +00:00
rwatson	5b9b5c8121	Various further non-functional cleanups to coda: - Rename print_vattr to coda_print_vattr and make static, rename print_cred to coda_print_cred. - Remove unused coda_vop_nop. - Add XXX comment because coda_readdir forwards to the cache vnode's readdir rather than venus_readdir, and annotate venus_readdir as unused. - Rename vc_nb_* to vc_. - Use d_open_t, d_close_t, d_read_t, d_write_t, d_ioctl_t and d_poll_t for prototyping vc_ as that is the intent, don't use our own definitions. - Rename coda_nb_statfs to coda_statfs, rename NB_SFS_SIZ to CODA_SFS_SIZ. - Replace one more OBE reference to NetBSD with a reference to FreeBSD. - Tidy up a little vertical whitespace here and there. - Annotate coda_nc_zapvnode as unused. - Remove unused vcodattach. - Annotate VM_INTR as unused. - Annotate that coda_fhtovp is unused and doesn't match the FreeBSD prototype, so isn't hooked up to vfs_fhtovp. If we want NFS export of Coda to work someday, this needs to be fixed. - Remove unused getNewVnode. - Remove unused coda_vget, coda_init, coda_quotactl prototypes. MFC after: 1 month	2008-02-09 12:49:18 +00:00
rwatson	f2fd79dc06	No reason not to maintain stats on statfs in Coda, as it's done for other VFS operations, so uncomment the existing statistics gathering. MFC after: 1 month	2008-02-09 11:40:49 +00:00
rwatson	c06ae37dfb	Remove unused devtomp(), which exploited UFS-specific knowledge to find the mountpoint for a specific device. This was implemented incorrectly, a bad idea in a fundamental sense, and also never used, so presumably a long-idle debugging function. MFC after: 1 month	2008-02-09 11:12:18 +00:00
rwatson	e16828cb77	Since Coda is effectively a stacked file system, use VOP_EOPNOTSUPP for vop_bmap; delete the existing stub that returned either EINVAL or EOPNOTSUPP, and had unreachable calls to VOP_BMAP on the cache vnode. MFC after: 1 month	2008-02-09 09:33:19 +00:00
rwatson	7445f79ec2	Lock cache vnode when VOP_FSYNC() is called on a Coda vnode. MFC after: 1 month	2008-02-09 00:12:22 +00:00
rwatson	0a37acb8a6	Make all calls to vn_lock() in Coda, including recently added ones, use LK_RETRY, since failure is undesirable (and not handled). MFC after: 1 month Pointed out by: kib	2008-02-09 00:03:22 +00:00
rwatson	409c34ce7d	The Coda module was originally ported to NetBSD from Mach by rvb, and then later to FreeBSD. Update various NetBSD-related comments: in some cases delete them because they don't appply, in others update to say FreeBSD as they still apply but in FreeBSD (and might for that matter no longer apply on NetBSD), and flag one case where I'm not sure whether it applies. MFC after: 1 month	2008-02-08 23:15:36 +00:00
rwatson	83dcd82dd9	Before invoking vnode operations on cache vnodes, acquire the vnode locks of those vnodes. Probably, Coda should do the same lock sharing/ pass-through that is done for nullfs, but in the mean time this ensures that locks are adequately held to prevent corruption of data structures in the cache file system. Assuming most operations came from the top layer of Coda and weren't performed directly on the cache vnodes, in practice this corruption was relatively unlikely as the Coda vnode locks were ensuring exclusive access for most consumers. This causes WITNESS to squeal like a pig immediately when Coda is used, rather than waiting until file close; I noticed these problems because of the lack of said squealing. MFC after: 1 month	2008-02-08 23:01:40 +00:00
rwatson	09506e9ff8	Remove undefined coda excluded by #if 1 #else, which previously protected vget() calls using inode numbers to query the root of /coda, which is not needed since we now cache the root vnode with the mountpoint. MFC after: 1 month	2008-02-08 22:37:15 +00:00
attilio	e1db4e70b3	Conver all explicit instances to VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only acquire curthread as argument; this will lead in axing the additional argument from both functions, making the code cleaner. Reviewed by: jeff, kib	2008-02-08 21:45:47 +00:00
rwatson	3b2455b135	Remove Giant acquisition around soreceive() and sosend() in fifofs. The bug that caused us to reintroduce it is believed to be fixed, and Kris says he no longer sees problems with fifofs in highly parallel builds. If this works out, we'll MFC it for 7.1. MFC after: 3 months Pointed out by: kris	2008-01-26 12:34:23 +00:00
attilio	7213f4c32b	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and no currently used. Hopefully this will be soonly replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Addictionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavilly broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo	2008-01-24 12:34:30 +00:00
rwatson	b33bafcdc2	Put "coda_rdwr: Internally Opening" printf generated by in-kernel writes to files, such as ktrace output, under CODA_VERBOSE. Otherwise, each such call to VOP_WRITE() results in a kernel printf. MFC after: 3 days Obtained from: NetBSD	2008-01-21 21:39:08 +00:00
rwatson	a718996964	Replace references to VOP_LOCK() w/o LK_RETRY to vn_lock() with LK_RETRY, avoiding extra error handling, or in some cases, missing error handling. MFC after: 3 days Discussed with: kib	2008-01-21 21:19:07 +00:00
rwatson	8294510902	Remove unused oldhash definition from Coda namecache. MFC after: 3 days	2008-01-19 19:21:07 +00:00
rwatson	a45d8c6482	Improve default vnode operation handling for Coda: - Don't specify vnode operations for mknod, lease, and advlock--let them fall through to vop_default. - Implement vop_default with &default_vnodeops, rather than with VOP_PANIC, so that unimplemented vnode operations are handled in more sensible ways than panicking, such as EOPNOTSUPP on ACL queries generated by bsdtar, or mknod. MFC after: 3 days	2008-01-19 17:12:44 +00:00
rwatson	5baa8fe000	Rework coda_statfs(): no longer need to zero the statfs structure or fill out all fields, just fill out the ones the file system knows about. Among other things, this causes the outpuf of "mount" and "df" to make quite a bit more sense as /dev/cfs0 is specified as the mountfrom name. MFC after: 3 days	2008-01-19 16:39:14 +00:00
rwatson	1d78104fa0	Zero mi_rotovp and coda_ctlvp immediately after calling vrele() on the vnodes during coda_unmount() in order to detect errant use of them after the vnode references may no longer be valid. No need to clear the VV_ROOT flag on mi_rootvp flag (especially after the vnode reference is no longer valid) as this isn't done on other file systems. MFC after: 3 days	2008-01-19 15:40:46 +00:00
rwatson	fc2cdfa748	Don't acquire an additional vnode reference to a vnode when it is opened and then release it when it is closed: we rely on the caller to keep the vnode around with a valid reference. This avoids vrele() destroying the vnode vop_close() is being called from during a call to vop_close(), and a crash due to lockmgr recursing the vnode lock when a Coda unmount occurs. MFC after: 3 days	2008-01-19 15:39:10 +00:00
rwatson	735d73fd1d	Don't declare functions as extern. Move all extern variable definitions to associated .h files, move some extern variable definitions between include files to place them more appropriately. MFC after: 3 days	2008-01-19 14:32:44 +00:00

1 2 3 4 5 ...

2086 Commits