freebsd-dev

Author	SHA1	Message	Date
Julian Elischer	8b07e49a00	Add code to allow the system to handle multiple routing tables. This particular implementation is designed to be fully backwards compatible and to be MFC-able to 7.x (and 6.x) Currently the only protocol that can make use of the multiple tables is IPv4 Similar functionality exists in OpenBSD and Linux. From my notes: ----- One thing where FreeBSD has been falling behind, and which by chance I have some time to work on is "policy based routing", which allows different packet streams to be routed by more than just the destination address. Constraints: ------------ I want to make some form of this available in the 6.x tree (and by extension 7.x) , but FreeBSD in general needs it so I might as well do it in -current and back port the portions I need. One of the ways that this can be done is to have the ability to instantiate multiple kernel routing tables (which I will now refer to as "Forwarding Information Bases" or "FIBs" for political correctness reasons). Which FIB a particular packet uses to make the next hop decision can be decided by a number of mechanisms. The policies these mechanisms implement are the "Policies" referred to in "Policy based routing". One of the constraints I have if I try to back port this work to 6.x is that it must be implemented as a EXTENSION to the existing ABIs in 6.x so that third party applications do not need to be recompiled in timespan of the branch. This first version will not have some of the bells and whistles that will come with later versions. It will, for example, be limited to 16 tables in the first commit. Implementation method, Compatible version. (part 1) ------------------------------- For this reason I have implemented a "sufficient subset" of a multiple routing table solution in Perforce, and back-ported it to 6.x. (also in Perforce though not always caught up with what I have done in -current/P4). The subset allows a number of FIBs to be defined at compile time (8 is sufficient for my purposes in 6.x) and implements the changes needed to allow IPV4 to use them. I have not done the changes for ipv6 simply because I do not need it, and I do not have enough knowledge of ipv6 (e.g. neighbor discovery) needed to do it. Other protocol families are left untouched and should there be users with proprietary protocol families, they should continue to work and be oblivious to the existence of the extra FIBs. To understand how this is done, one must know that the current FIB code starts everything off with a single dimensional array of pointers to FIB head structures (One per protocol family), each of which in turn points to the trie of routes available to that family. The basic change in the ABI compatible version of the change is to extent that array to be a 2 dimensional array, so that instead of protocol family X looking at rt_tables[X] for the table it needs, it looks at rt_tables[Y][X] when for all protocol families except ipv4 Y is always 0. Code that is unaware of the change always just sees the first row of the table, which of course looks just like the one dimensional array that existed before. The entry points rtrequest(), rtalloc(), rtalloc1(), rtalloc_ign() are all maintained, but refer only to the first row of the array, so that existing callers in proprietary protocols can continue to do the "right thing". Some new entry points are added, for the exclusive use of ipv4 code called in_rtrequest(), in_rtalloc(), in_rtalloc1() and in_rtalloc_ign(), which have an extra argument which refers the code to the correct row. In addition, there are some new entry points (currently called rtalloc_fib() and friends) that check the Address family being looked up and call either rtalloc() (and friends) if the protocol is not IPv4 forcing the action to row 0 or to the appropriate row if it IS IPv4 (and that info is available). These are for calling from code that is not specific to any particular protocol. The way these are implemented would change in the non ABI preserving code to be added later. One feature of the first version of the code is that for ipv4, the interface routes show up automatically on all the FIBs, so that no matter what FIB you select you always have the basic direct attached hosts available to you. (rtinit() does this automatically). You CAN delete an interface route from one FIB should you want to but by default it's there. ARP information is also available in each FIB. It's assumed that the same machine would have the same MAC address, regardless of which FIB you are using to get to it. This brings us as to how the correct FIB is selected for an outgoing IPV4 packet. Firstly, all packets have a FIB associated with them. if nothing has been done to change it, it will be FIB 0. The FIB is changed in the following ways. Packets fall into one of a number of classes. 1/ locally generated packets, coming from a socket/PCB. Such packets select a FIB from a number associated with the socket/PCB. This in turn is inherited from the process, but can be changed by a socket option. The process in turn inherits it on fork. I have written a utility call setfib that acts a bit like nice.. setfib -3 ping target.example.com # will use fib 3 for ping. It is an obvious extension to make it a property of a jail but I have not done so. It can be achieved by combining the setfib and jail commands. 2/ packets received on an interface for forwarding. By default these packets would use table 0, (or possibly a number settable in a sysctl(not yet)). but prior to routing the firewall can inspect them (see below). (possibly in the future you may be able to associate a FIB with packets received on an interface.. An ifconfig arg, but not yet.) 3/ packets inspected by a packet classifier, which can arbitrarily associate a fib with it on a packet by packet basis. A fib assigned to a packet by a packet classifier (such as ipfw) would over-ride a fib associated by a more default source. (such as cases 1 or 2). 4/ a tcp listen socket associated with a fib will generate accept sockets that are associated with that same fib. 5/ Packets generated in response to some other packet (e.g. reset or icmp packets). These should use the FIB associated with the packet being reponded to. 6/ Packets generated during encapsulation. gif, tun and other tunnel interfaces will encapsulate using the FIB that was in effect withthe proces that set up the tunnel. thus setfib 1 ifconfig gif0 [tunnel instructions] will set the fib for the tunnel to use to be fib 1. Routing messages would be associated with their process, and thus select one FIB or another. messages from the kernel would be associated with the fib they refer to and would only be received by a routing socket associated with that fib. (not yet implemented) In addition Netstat has been edited to be able to cope with the fact that the array is now 2 dimensional. (It looks in system memory using libkvm (!)). Old versions of netstat see only the first FIB. In addition two sysctls are added to give: a) the number of FIBs compiled in (active) b) the default FIB of the calling process. Early testing experience: ------------------------- Basically our (IronPort's) appliance does this functionality already using ipfw fwd but that method has some drawbacks. For example, It can't fully simulate a routing table because it can't influence the socket's choice of local address when a connect() is done. Testing during the generating of these changes has been remarkably smooth so far. Multiple tables have co-existed with no notable side effects, and packets have been routes accordingly. ipfw has grown 2 new keywords: setfib N ip from anay to any count ip from any to any fib N In pf there seems to be a requirement to be able to give symbolic names to the fibs but I do not have that capacity. I am not sure if it is required. SCTP has interestingly enough built in support for this, called VRFs in Cisco parlance. it will be interesting to see how that handles it when it suddenly actually does something. Where to next: -------------------- After committing the ABI compatible version and MFCing it, I'd like to proceed in a forward direction in -current. this will result in some roto-tilling in the routing code. Firstly: the current code's idea of having a separate tree per protocol family, all of the same format, and pointed to by the 1 dimensional array is a bit silly. Especially when one considers that there is code that makes assumptions about every protocol having the same internal structures there. Some protocols don't WANT that sort of structure. (for example the whole idea of a netmask is foreign to appletalk). This needs to be made opaque to the external code. My suggested first change is to add routing method pointers to the 'domain' structure, along with information pointing the data. instead of having an array of pointers to uniform structures, there would be an array pointing to the 'domain' structures for each protocol address domain (protocol family), and the methods this reached would be called. The methods would have an argument that gives FIB number, but the protocol would be free to ignore it. When the ABI can be changed it raises the possibilty of the addition of a fib entry into the "struct route". Currently, the structure contains the sockaddr of the desination, and the resulting fib entry. To make this work fully, one could add a fib number so that given an address and a fib, one can find the third element, the fib entry. Interaction with the ARP layer/ LL layer would need to be revisited as well. Qing Li has been working on this already. This work was sponsored by Ironport Systems/Cisco Reviewed by: several including rwatson, bz and mlair (parts each) Obtained from: Ironport systems/Cisco	2008-05-09 23:03:00 +00:00
Konstantin Belousov	eab626f110	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Jeff Roberson	698b1a6643	- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)	2008-03-22 09:15:16 +00:00
Attilio Rao	7fbfba7bf8	- Handle buffer lock waiters count directly in the buffer cache instead than rely on the lockmgr support [1]: * bump the waiters only if the interlock is held * let brelvp() return the waiters count * rely on brelvp() instead than BUF_LOCKWAITERS() in order to check for the waiters number - Remove a namespace pollution introduced recently with lockmgr.h including lock.h by including lock.h directly in the consumers and making it mandatory for using lockmgr. - Modify flags accepted by lockinit(): * introduce LK_NOPROFILE which disables lock profiling for the specified lockmgr * introduce LK_QUIET which disables ktr tracing for the specified lockmgr [2] * disallow LK_SLEEPFAIL and LK_NOWAIT to be passed there so that it can only be used on a per-instance basis - Remove BUF_LOCKWAITERS() and lockwaiters() as they are no longer used This patch breaks KPI so __FreBSD_version will be bumped and manpages updated by further commits. Additively, 'struct buf' changes results in a disturbed ABI also. [2] Really, currently there is no ktr tracing in the lockmgr, but it will be added soon. [1] Submitted by: kib Tested by: pho, Andrea Barberio <insomniac at slackware dot it>	2008-03-01 19:47:50 +00:00
Attilio Rao	81c794f998	Axe the 'thread' argument from VOP_ISLOCKED() and lockstatus() as it is always curthread. As KPI gets broken by this patch, manpages and __FreeBSD_version will be updated by further commits. Tested by: Andrea Barberio <insomniac at slackware dot it>	2008-02-25 18:45:57 +00:00
Attilio Rao	84887fa362	- Add real assertions to lockmgr locking primitives. A couple of notes for this: * WITNESS support, when enabled, is only used for shared locks in order to avoid problems with the "disowned" locks * KA_HELD and KA_UNHELD only exists in the lockmgr namespace in order to assert for a generic thread (not curthread) owning or not the lock. Really, this kind of check is bogus but it seems very widespread in the consumers code. So, for the moment, we cater this untrusted behaviour, until the consumers are not fixed and the options could be removed (hopefully during 8.0-CURRENT lifecycle) * Implementing KA_HELD and KA_UNHELD (not surported natively by WITNESS) made necessary the introduction of LA_MASKASSERT which specifies the range for default lock assertion flags * About other aspects, lockmgr_assert() follows exactly what other locking primitives offer about this operation. - Build real assertions for buffer cache locks on the top of lockmgr_assert(). They can be used with the BUF_ASSERT_*(bp) paradigm. - Add checks at lock destruction time and use a cookie for verifying lock integrity at any operation. - Redefine BUF_LOCKFREE() in order to not use a direct assert but let it rely on the aforementioned destruction time check. KPI results evidently broken, so __FreeBSD_version bumping and manpage update result necessary and will be committed soon. Side note: lockmgr_assert() will be used soon in order to implement real assertions in the vnode namespace replacing the legacy and still bogus "VOP_ISLOCKED()" way. Tested by: kris (earlier version) Reviewed by: jhb	2008-02-13 20:44:19 +00:00
Attilio Rao	2433c4883e	Conver all explicit instances to VOP_ISLOCKED(arg, NULL) into VOP_ISLOCKED(arg, curthread). Now, VOP_ISLOCKED() and lockstatus() should only acquire curthread as argument; this will lead in axing the additional argument from both functions, making the code cleaner. Reviewed by: jeff, kib	2008-02-08 21:45:47 +00:00
Attilio Rao	0e9eb108f0	Cleanup lockmgr interface and exported KPI: - Remove the "thread" argument from the lockmgr() function as it is always curthread now - Axe lockcount() function as it is no longer used - Axe LOCKMGR_ASSERT() as it is bogus really and no currently used. Hopefully this will be soonly replaced by something suitable for it. - Remove the prototype for dumplockinfo() as the function is no longer present Addictionally: - Introduce a KASSERT() in lockstatus() in order to let it accept only curthread or NULL as they should only be passed - Do a little bit of style(9) cleanup on lockmgr.h KPI results heavilly broken by this change, so manpages and FreeBSD_version will be modified accordingly by further commits. Tested by: matteo	2008-01-24 12:34:30 +00:00
Attilio Rao	d638e093d6	- Introduce the function lockmgr_recursed() which returns true if the lockmgr lkp, when held in exclusive mode, is recursed - Introduce the function BUF_RECURSED() which does the same for bufobj locks based on the top of lockmgr_recursed() - Introduce the function BUF_ISLOCKED() which works like the counterpart VOP_ISLOCKED(9), showing the state of lockmgr linked with the bufobj BUF_RECURSED() and BUF_ISLOCKED() entirely replace the usage of bogus BUF_REFCNT() in a more explicative and SMP-compliant way. This allows us to axe out BUF_REFCNT() and leaving the function lockcount() totally unused in our stock kernel. Further commits will axe lockcount() as well as part of lockmgr() cleanup. KPI results, obviously, broken so further commits will update manpages and freebsd version. Tested by: kris (on UFS and NFS)	2008-01-19 17:36:23 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjuction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Julian Elischer	3745c395ec	Rename the kthread_xxx (e.g. kthread_create()) calls to kproc_xxx as they actually make whole processes. Thos makes way for us to add REAL kthread_create() and friends that actually make theads. it turns out that most of these calls actually end up being moved back to the thread version when it's added. but we need to make this cosmetic change first. I'd LOVE to do this rename in 7.0 so that we can eventually MFC the new kthread_xxx() calls.	2007-10-20 23:23:23 +00:00
Alfred Perlstein	77465d9390	Get rid of qaddr_t. Requested by: bde	2007-10-16 10:54:55 +00:00
Jeff Roberson	1c4bcd050a	- Move rusage from being per-process in struct pstats to per-thread in td_ru. This removes the requirement for per-process synchronization in statclock() and mi_switch(). This was previously supported by sched_lock which is going away. All modifications to rusage are now done in the context of the owning thread. reads proceed without locks. - Aggregate exiting threads rusage in thread_exit() such that the exiting thread's rusage is not lost. - Provide a new routine, rufetch() to fetch an aggregate of all rusage structures from all threads in a process. This routine must be used in any place requiring a rusage from a process prior to it's exit. The exited process's rusage is still available via p_ru. - Aggregate tick statistics only on demand via rufetch() or when a thread exits. Tick statistics are kept in the thread and protected by sched_lock until it exits. Initial patch by: attilio Reviewed by: attilio, bde (some objections), arch (mostly silent)	2007-06-01 01:12:45 +00:00
Bruce Evans	63cb891e8b	Rename some functions and variables from nfs_* to nfs4_* to avoid collisions with nfsclient's names. Even static names should have a unique prefix so that they can be debugged easily. Hide the unused colliding variable nfsv3_commit_on_close in "#if 0" together with other unused sysctl variables. Duplicating the nfs sysctl under nfs4 is probably just a bug. Fix some nearby style bugs. Remove duplicate $FreeBSD$.	2007-01-25 14:33:13 +00:00
Bruce Evans	8754c03a11	Rename some functions and variables (mainly vfsops entry points) from nfs_* to nfs4_* to avoid collisions with nfsclient's names. Even static names should have a unique prefix so that they can be debugged easily. Most of the renamed functions can probably be shared. nfs4_cmount() and nfs4_sync() are identical to the nfs_* versions, and all the others except nfs4_vfsops() seem to be idendentical except for style bugs, missing support for mountroot, and bugs. Fix some nearby style bugs. Remove duplicate $FreeBSD$.	2007-01-25 14:18:40 +00:00
Bruce Evans	e43982a801	Unstaticize nfs_iosize() in nfsclient and use it in nfs4client instead of duplicating it except for larger style bugs in the copy. Fix some nearby style bugs (including a harmless type mismatch) in and near the remaining copy. This is part of fixing collisions of the 2 nfs*client's names. Even static names should have a unique prefixes so that they can be debugged easily.	2007-01-25 13:07:25 +00:00
Konstantin Belousov	2cc7d26f7f	Cylinder group bitmaps and blocks containing inode for a snapshot file are after snaplock, while other ffs device buffers are before snaplock in global lock order. By itself, this could cause deadlock when bdwrite() tries to flush dirty buffers on snapshotted ffs. If, during the flush, COW activity for snapshot needs to allocate block and ffs_alloccg() selects the cylinder group that is being written by bdwrite(), then kernel would panic due to recursive buffer lock acquision. Avoid dealing with buffers in bdwrite() that are from other side of snaplock divisor in the lock order then the buffer being written. Add new BOP, bop_bdwrite(), to do dirty buffer flushing for same vnode in the bdwrite(). Default implementation, bufbdflush(), refactors the code from bdwrite(). For ffs device buffers, specialized implementation is used. Reviewed by: tegge, jeff, Russell Cattelan (cattelan xfs org, xfs changes) Tested by: Peter Holm X-MFC after: 3 weeks (if ever: it changes ABI)	2007-01-23 10:01:19 +00:00
Jim Rees	c1ecb4d7c3	NFSv4 client: Add support for va_birthtime Fix va_ctime to use TIME_METADATA, not TIME_CREATE	2006-11-28 19:33:28 +00:00
Mohan Srinivasan	7d7d9e2242	Fixes up the handling of shared vnode lock lookups in the NFS client, adds a FS type specific flag indicating that the FS supports shared vnode lock lookups, adds some logic in vfs_lookup.c to test this flag and set lock flags appropriately. - amd on 6.x is a non-starter (without this change). Using amd under heavy load results in a deadlock (with cascading vnode locks all the way to the root) very quickly. - This change should also fix the more general problem of cascading vnode deadlocks when an NFS server goes down. Ideally, we wouldn't need these changes, as enabling shared vnode lock lookups globally would work. Unfortunately, UFS, for example isn't ready for shared vnode lock lookups, crashing pretty quickly. This change is the result of discussions with Stephan Uphoff (ups@). Reviewed by: ups@	2006-09-13 18:39:09 +00:00
Konstantin Belousov	a881fd5a0a	Always supply curthread as argument to nfs_asyncio and nfs_doio in nfs_strategy. Otherwise, for some buffers, signals would be ignored at the intr mounts. Reviewed by: mohan, cel MFC after: 1 month Approved by: pjd (mentor)	2006-07-12 09:16:35 +00:00
Chuck Lever	52bde90e30	While reviewing NFS client for another PR, noticed this omission in the NFSv4 client READDIR logic. This change matches the logic in the version 2 and 3 code. Sponsored by: Network Appliance, Incorporated	2006-05-24 15:56:36 +00:00
Chuck Lever	6d0699a5ba	NFS over TCP retransmit behavior should default to a 60 second time out, mimicing the NFS reference implementation. NFS over TCP does not need fast retransmit timeouts, since network loss and congestion are managed by the transport (TCP), unlike with NFS over UDP. A long timeout prevents the unnecessary retransmission of non- idempotent NFS requests. Reviewed by: mohans, silby, rees? Sponsored by: Network Appliance, Incorporated	2006-05-23 18:48:07 +00:00
Mohan Srinivasan	f1cdf89911	Changes to make the NFS client MP safe. Thanks to Kris Kennaway for testing and sending lots of bugs my way.	2006-05-19 00:04:24 +00:00
Chuck Lever	5f396e80f0	Add better sanity checking to the logic that handles ioctl processing for nfsclient and nfs4client in order to prevent local root users from panicing the system. PR: kern/77463 Submitted by: Wojciech A. Koszek Reviewed by: cel, rees MFC after: 2 weeks Security: Local root users can panic the system at will	2006-05-13 00:16:35 +00:00
Jim Rees	c1f097baba	Use nfs4_disconnect for connections opened with nfs4_connect. Submitted by: cel@citi.umich.edu MFC after: 1 week	2006-01-19 22:48:31 +00:00
Tor Egge	82be0a5a24	Add marker vnodes to ensure that all vnodes associated with the mount point are iterated over when using MNT_VNODE_FOREACH. Reviewed by: truckman	2006-01-09 20:42:19 +00:00
Tor Egge	012cbd3181	Obtain mount point lock before restarting sync loop if vget() failed. Reviewed by: truckman	2006-01-09 18:57:35 +00:00
Robert Watson	5bb84bc84b	Normalize a significant number of kernel malloc type names: - Prefer '_' to ' ', as it results in more easily parsed results in memory monitoring tools such as vmstat. - Remove punctuation that is incompatible with using memory type names as file names, such as '/' characters. - Disambiguate some collisions by adding subsystem prefixes to some memory types. - Generally prefer lower case to upper case. - If the same type is defined in multiple architecture directories, attempt to use the same name in additional cases. Not all instances were caught in this change, so more work is required to finish this conversion. Similar changes are required for UMA zone names.	2005-10-31 15:41:29 +00:00
Jeff Roberson	3b06a01534	- We want if (mrep != NULL) not if (m_freem != NULL). m_freem will never be NULL and we will always leak mrep in the error case. Submitted by: Greg Taleck <gtaleck@isilon.com>	2005-04-25 05:11:19 +00:00
Jeff Roberson	5b5f16b5a8	- cache_lookup() relocks the parent in the DOTDOT case for us. Spotted by: phk Sponsored by: Isilon Systems, Inc.	2005-04-14 07:08:34 +00:00
Jeff Roberson	4585e3ac5a	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. Se vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
Jeff Roberson	da1c9cb2b5	- Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c prevents any callers from doing a modifying op without LOCKPARENT or WANTPARENT.	2005-03-29 13:09:42 +00:00
Jeff Roberson	5c5e51fd9a	- cache_lookup() now locks the new vnode for us to prevent some races. Remove redundant code. Sponsored by: Isilon Systems, Inc.	2005-03-29 13:00:37 +00:00
Jeff Roberson	f6576f194e	- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. - Network filesystems are written with a special idiom that checks the cache first, and may even unlock dvp before discovering that a network round-trip is required to resolve the name. I believe dvp is prevented from being recycled even in the forced unmount case by the shared lock on the mount point. If not, this code should grow checks for VI_DOOMED after it relocks dvp or it will access NULL v_data fields. Sponsored by: Isilon Systems, Inc.	2005-03-28 09:29:58 +00:00
Jeff Roberson	a176ceb322	- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem. Sponsored by: Isilon Systems, Inc.	2005-03-24 07:39:03 +00:00
David Schultz	ab3ac78792	Remove dead code. Found by: Coverity Prevent analysis tool	2005-03-18 22:33:10 +00:00
Jeff Roberson	dcc18814ba	- It is no longer necessary to lock and unlock the vnode in nfs4_close() as the top level does this for us now. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:16:45 +00:00
Poul-Henning Kamp	44b3f4ab59	Follow v_id changes in NFSv[23]	2005-02-22 15:15:28 +00:00
Poul-Henning Kamp	56dd36b1a6	Remove unused cred arg from nfs_vinvalbuf() and many bogus arguments passed for it.	2005-01-24 12:31:06 +00:00
Poul-Henning Kamp	14bcbf608e	This file fell out of the list when adding bufsync.	2005-01-11 11:36:26 +00:00
Poul-Henning Kamp	8df6bac4c7	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Paul Saab	35ec46b7f2	Rewrite of the NFS client's reply handling. We now have NFS socket upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 21:11:15 +00:00
Poul-Henning Kamp	ea87d2a2d4	Convert to nmount, add omount compat. Take the cheap way and just put struct nfs_args in a nmount arg, we will need a userland mount_nfs4(8) program anyhow.	2004-12-06 20:33:24 +00:00
Paul Saab	ddc6c40075	2 fixes that improve on the consistency of the NFS client cache. - Change the cached mtime to a 'struct timespec' from a time_t. Improving the precision of the cached mtime tightens up NFS' "close-to-open" consistency considerably. - Always force an over-the-wire consistency check from nfs_open() (unless the file is marked modified). This further improves NFS' "close-to-open" consistency. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 19:18:00 +00:00
Poul-Henning Kamp	743312367a	VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases doesn't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that. Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.	2004-12-05 22:41:02 +00:00
Poul-Henning Kamp	aec0fb7b40	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to belive this would ever be feasible in the the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and protype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
Jim Rees	088c777637	don't confuse NFSMNT_ flags with MNT_ flags in statfs Approved by: alfred	2004-12-01 21:47:51 +00:00

1 2

89 Commits