freebsd-dev

Author	SHA1	Message	Date
Jeff Roberson	1c4bcd050a	- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. reads proceed without locks. - Aggregate exiting threads rusage in thread_exit() such that the exiting thread's rusage is not lost. - Provide a new routine, rufetch() to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to it's exit. The exited process's rusage is still available via p_ru. - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits. Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)	2007-06-01 01:12:45 +00:00
John Baldwin	a1054d5776	Various fixes to the NFS Directio support. - Fix for a bug where a close would not wait for all (directio) dirty buffers to drain. The nfsnode was not marked NMODIFIED when there were directio dirtied buffers pending, causing this. - No reason to vhold/vrele the vp when enqueueing DirectIO requests for the nfsiods. The vnode can't really go way since the close has to wait for these requests to drain. MFC after: 1 week Submitted by: mohans	2007-04-25 20:34:55 +00:00
Mohan Srinivasan	f9bb753844	Over NFS, an open() call could result in multiple over-the-wire GETATTRs being generated - one from lookup()/namei() and the other from nfs_open() (for cto consistency). This change eliminates the GETATTR in nfs_open() if an otw GETATTR was done from the namei() path. Instead of extending the vop interface, we timestamp each attr load, and use this to detect whether a GETATTR was done from namei() for this syscall. Introduces a thread-local variable that counts the syscalls made by the thread and uses <pid, tid, thread syscalls> as the attrload timestamp. Thanks to jhb@ and peter@ for a discussion on thread state that could be used as the timestamp with minimal overhead.	2007-03-09 04:02:38 +00:00
Mohan Srinivasan	4e99994cc9	Fix for a vnode lock leak in nfs_create() in the event of an error. Spotted by ups@.	2007-01-31 23:10:27 +00:00
Konstantin Belousov	2cc7d26f7f	Cylinder group bitmaps and blocks containing inode for a snapshot file are after snaplock, while other ffs device buffers are before snaplock in global lock order. By itself, this could cause deadlock when bdwrite() tries to flush dirty buffers on snapshotted ffs. If, during the flush, COW activity for snapshot needs to allocate block and ffs_alloccg() selects the cylinder group that is being written by bdwrite(), then kernel would panic due to recursive buffer lock acquision. Avoid dealing with buffers in bdwrite() that are from other side of snaplock divisor in the lock order then the buffer being written. Add new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in the bdwrite(). Default implementation, bufbdflush(), refactors the code from bdwrite(). For ffs device buffers, specialized implementation is used. Reviewed by: tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes) Tested by: Peter Holm X-MFC after: 3 weeks (if ever: it changes ABI)	2007-01-23 10:01:19 +00:00
Mohan Srinivasan	87c125cecc	Fix to readdir+ reply handling. When inserting an entry into the namecache, initialize the nfsnode's ctime. Otherwise a subsequent lookup purges the just entered namecache entry.	2006-11-16 23:02:37 +00:00
Bruce Evans	6a72ff6b09	Don't do null Setattr RPCs for VA_MARK_ATIME. When we added the VA_MARK_ATIME feature to fix POSIX conformance fore execve() and mmap(), we thought that it was optimized well enough for the one file system that supports it (ffs) and harmless for other file systems (except layered ones which already get the layering for VOP_SETATTR() wrong). However, nfs_setattr() doesn't do much parameter checking, so when it gets a combination of parameters that it doesn't understand, it always does a Setattr RPC. This RPC can't do anything good, and for VA_MARK_ATIME it is null except for wasting a lot of time. This is the smallest and easiest to fix of several bugs that have increased the number of RPCs for kernel builds on nfs by more than 100% since 2004-11-05. The real-time increase depends on network latency and parallelization and can also be very large (approaching the same percentage for unparallelized operations like "make depend" on systems with fast CPUs and high-latency networks).	2006-10-14 07:25:11 +00:00
Tor Egge	a1e363f256	Add mnt_noasync counter to better handle interleaved calls to nmount(), sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.	2006-09-26 04:15:59 +00:00
Mohan Srinivasan	7d7d9e2242	Fixes up the handling of shared vnode lock lookups in the NFS client, adds a FS type specific flag indicating that the FS supports shared vnode lock lookups, adds some logic in vfs_lookup.c to test this flag and set lock flags appropriately. - amd on 6.x is a non-starter (without this change). Using amd under heavy load results in a deadlock (with cascading vnode locks all the way to the root) very quickly. - This change should also fix the more general problem of cascading vnode deadlocks when an NFS server goes down. Ideally, we wouldn't need these changes, as enabling shared vnode lock lookups globally would work. Unfortunately, UFS, for example isn't ready for shared vnode lock lookups, crashing pretty quickly. This change is the result of discussions with Stephan Uphoff (ups@). Reviewed by: ups@	2006-09-13 18:39:09 +00:00
Konstantin Belousov	201599c3af	Always supply curthread as argument to nfs_asyncio and nfs_doio in nfs_strategy. Otherwise, for some buffers, signals would be ignored at the intr mounts. Reviewed by: mohan MFC after: 1 month Approved by: kan (mentor)	2006-07-08 15:36:51 +00:00
Mohan Srinivasan	f1cdf89911	Changes to make the NFS client MP safe. Thanks to Kris Kennaway for testing and sending lots of bugs my way.	2006-05-19 00:04:24 +00:00
Mohan Srinivasan	5ef7d50da5	Keep track of the number of in-progress async direct IO writes in the nfsnode. Make fsync/close wait until all of these drain. Add a check to nfs_getpage() and nfs_putpage().	2006-04-06 01:20:30 +00:00
Chuck Lever	9f5349f23d	Fix a bug in NFSv3 READDIRPLUS reply processing The client's READDIRPLUS logic skips the attributes and filehandle of the ".." entry. If the server doesn't send attributes but does send a filehandle for "..", the client's logic doesn't account for the extra "value follows" field that indicates whether the filehandle is present, causing the remaining entries in the reply to be ignored. Sponsored by: Network Appliance, Inc. Reviewed by: rick, mohans Approved by: silby MFC after: 2 weeks	2006-03-08 01:43:01 +00:00
Xin LI	fc9fac4c78	Correct a typo	2005-12-28 10:03:48 +00:00
Paul Saab	3834aac17e	- Always return success from NFS strategy. nfs_doio(), in the event of an error, does the right thing, in terms of setting the error flags in the buf header. That fixes a crash from bstrategy(). - Treat ETIMEDOUT as a "recoverable" error, causing the buffer to be re-dirtied. ETIMEDOUT can occur on soft mounts, when the number of retries are exceeded, and we don't want data loss in that case. Submitted by: Mohan Srinivasan	2005-11-21 19:23:46 +00:00
Jonathan Chen	0b3e7451da	fix a crash when an nfsv2 mount fails MFC after: 1 week	2005-11-10 23:25:16 +00:00
Paul Saab	9c31df40bb	Fix for a crash (from nfs_lookup() in an error case). Submitted by: Mohan Srinivasan	2005-11-03 19:24:54 +00:00
Paul Saab	41ce2892bb	In nfs_flush(), clear the NMODIFIED bit only if there are no dirty buffers and there are no buffers queued up for writing. The bug was that NMODIFIED was being cleared even while there were buffers scheduled to be written out, which leads to all sorts of interesting bugs - one where the file could shrink (because of a post-op getattr load, say) causing data in buffer(s) queued for write to be tossed, resulting in data corruption. Submitted by: Mohan Srinivasan	2005-11-03 07:42:15 +00:00
Jeff Roberson	5b5f16b5a8	- cache_lookup() relocks the parent in the DOTDOT case for us. Spotted by: phk Sponsored by: Isilon Systems, Inc.	2005-04-14 07:08:34 +00:00
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
Jeff Roberson	da1c9cb2b5	- Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c prevents any callers from doing a modifying op without LOCKPARENT or WANTPARENT.	2005-03-29 13:09:42 +00:00
Jeff Roberson	5c5e51fd9a	- cache_lookup() now locks the new vnode for us to prevent some races. Remove redundant code. Sponsored by: Isilon Systems, Inc.	2005-03-29 13:00:37 +00:00
Jeff Roberson	f6576f194e	- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. - Network filesystems are written with a special idiom that checks the cache first, and may even unlock dvp before discovering that a network round-trip is required to resolve the name. I believe dvp is prevented from being recycled even in the forced unmount case by the shared lock on the mount point. If not, this code should grow checks for VI_DOOMED after it relocks dvp or it will access NULL v_data fields. Sponsored by: Isilon Systems, Inc.	2005-03-28 09:29:58 +00:00
Jeff Roberson	30144f05f0	- It is no longer necessary to lock and unlock the vnode in nfs_close() as the top level does this for us now. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:11:23 +00:00
Poul-Henning Kamp	33822d53bf	vp->v_id is a private field for the vfs namecache and it is a big mistake that NFS ever started using it. Long time ago I added the necessary vhold()/vdrop() calls to replace it, but forgot to remove the v_id code. Do it now.	2005-02-22 14:52:00 +00:00
Poul-Henning Kamp	dfd4be14bd	Try to unbreak the vnode locking around vop_reclaim() (based mostly on patch from kan@). Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on close. This is not yet a generally safe function, but for this very specific use it is safe. This solves the problem with buffers not being flushed by unmount or after failed mount attempts.	2005-02-19 11:44:57 +00:00
Robert Watson	38aa565976	Style cleanup for O_DIRECT sysctl comment introduced in nfs_vnops.c:1.242.	2005-01-29 23:19:08 +00:00
Poul-Henning Kamp	b3a4d73ebe	Create a vnode_pager object when a file is opened.	2005-01-24 23:03:29 +00:00
Poul-Henning Kamp	56dd36b1a6	Remove unused cred arg from nfs_vinvalbuf() and many bogus arguments passed for it.	2005-01-24 12:31:06 +00:00
Poul-Henning Kamp	6ef8480a88	Add BO_SYNC() and add a default which uses the secret vnode pointer and VOP_FSYNC() for now.	2005-01-11 10:43:08 +00:00
Poul-Henning Kamp	8df6bac4c7	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Paul Saab	72af302481	Turn NFS directio off until the stability issues are resolved.	2004-12-23 21:30:30 +00:00
Paul Saab	cb36cd34b8	Change the NFS sillyrename convention so that we won't run out of sillyrenames (which were limited to 58 per pid per directory, for no good reason). The new format of sillyrenames looks like .nfs.0000b31a.00d24.4 ^^^^^^^^ ^^^^^ ticks pid Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-16 19:28:37 +00:00
Paul Saab	a7500bceb0	First cut of NFS direct IO support. - NFS direct IO completely bypasses the buffer and page caches. If a file is open for direct IO all caching is disabled. - Direct IO for Directories will be addressed later. - 2 new NFS directio related sysctls are added. One is a knob to disable NFS direct IO completely (direct IO is enabled by default). The other is to disallow mmaped IO on a file that has at least one O_DIRECT open (see the comment in nfs_vnops.c for more details). The default is to allow mmaps on a file that has O_DIRECT opens. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-15 22:20:22 +00:00
Marcel Moolenaar	129999637e	Revert rev 1.233. The null-pointer function call (a dereference on ia64) was not the result of a change in the vector operations. It was caused by the NFS locking code using a FIFO and those bypassing the vnode. This indirectly caused the panic. The NFS locking code has been changed. Requested by: phk	2004-12-11 21:36:29 +00:00
Paul Saab	41bc38d132	In nfs_rename(), skip the otw rename operation if the fsync (to either src or dst) fails. This closes a potential data loss case (where the fsync failed with ENOSPC, for example). Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-10 03:29:02 +00:00
Paul Saab	4342aac774	Store a hint in the nfsnode to detect sequential access of the file. Kick off a readahead only when sequential access is detected. This eliminates wasteful readaheads in random file access. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-10 03:27:12 +00:00
Paul Saab	5e5f905de3	Fix for a Lock Order Reversal in the nfs_flush() path, between the vnode interlock and the proc lock. Reported by: marcel Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-07 21:16:32 +00:00
Paul Saab	35ec46b7f2	Rewrite of the NFS client's reply handling. We now have NFS socket upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 21:11:15 +00:00
Paul Saab	ddc6c40075	2 fixes that improve on the consistency of the NFS client cache. - Change the cached mtime to a 'struct timespec' from a time_t. Improving the precision of the cached mtime tightens up NFS' "close-to-open" consistency considerably. - Always force an over-the-wire consistency check from nfs_open() (unless the file is marked modified). This further improves NFS' "close-to-open" consistency. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 19:18:00 +00:00
Paul Saab	d54d263a79	Serialize NFS vinvalbuf operations by acquiring/upgrading to the vnode EXCLUSIVE lock. This prevents threads from adding pages to the vnode while an invalidation is in progress, closing potential races. In the bioread() path, callers acquire the SHARED vnode lock - so while an invalidate was in progress, it was possible to fault in new pages onto the vnode causing the invalidation to take a while or fail. We saw these races at Yahoo! with very large files+heavy concurrent access. Forcing an upgrade to EXCLUSIVE lock before doing the invalidation closes all these races. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 18:52:28 +00:00
Paul Saab	8fefdf0057	- If all data has been committed to stable storage on the server, it is safe to turn off the nfsnode's NMODIFIED flag. - Move the check for signals to the top of the loop where we loop around the dirty buffers on the vnode, scheduling writes. This ensures that we'll break ouf of the flush operation on reception of a signal. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 16:35:58 +00:00
Marcel Moolenaar	061f5ec825	Fix null-pointer indirect function calls introduced in the previous commit. In the new world order, the transitive closure on the vector operations is not precomputed. As such, it's unsafe to actually use any of the function pointers in an indirect function call. They can be null, and we need to use the default vector in that case. This is mostly a quick fix for the four function pointers that are ed explicitly. A more generic or scalable solution is likely to see the light of day. No pathos on: current@	2004-12-05 22:30:28 +00:00
Poul-Henning Kamp	aec0fb7b40	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
Poul-Henning Kamp	ccae7d65f7	Scripted modification of vop_* prototypes to use typedefs.	2004-12-01 19:08:40 +00:00
Poul-Henning Kamp	e9d823dde4	Add missing #include	2004-12-01 07:34:08 +00:00
Paul Saab	cd15125084	Fix for a race between lookup and readdirplus, that causes a deadlock (with NFS exclusive vnode locks enabled). Lookup grabs the parent's lock and wants to lock child. Readdirplus locks the child and wants to lock parent (for loading the attrs for ".."). The fix is to not load the attrs for ".." in readdirplus. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-12-01 06:51:07 +00:00
Paul Saab	3e9c9e432a	Clean all dirty pages (dirtied by mmap'ed writes) in nfs_close(). This closes a major hole in close-to-open consistency support. Added a new sysctl so that this can be disabled for single NFS client applications with very large amounts of mmap'ed IO (for performance). Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-12-01 06:48:54 +00:00
Paul Saab	74f44849b5	Fix for a bug in nfs_mkdir() that called vrele() instead of vput() in the error cases, causing panics. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-11-29 23:05:30 +00:00

1 2 3 4 5 ...

276 Commits