freebsd-skq

Author	SHA1	Message	Date
jeff	d7efebc4db	- Convert the bufobj lock to rwlock. - Use a shared bufobj lock in getblk() and inmem(). - Convert softdep's lk to rwlock to match the bufobj lock. - Move INFREECNT to b_flags and protect it with the buf lock. - Remove unnecessary locking around bremfree() and BKGRDINPROG. Sponsored by: EMC / Isilon Storage Division Discussed with: mckusick, kib, mdf	2013-05-31 00:43:41 +00:00
attilio	15bf891afe	Rename VM_OBJECT_LOCK(), VM_OBJECT_UNLOCK() and VM_OBJECT_TRYLOCK() to their "write" versions. Sponsored by: EMC / Isilon storage division	2013-02-20 12:03:20 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitive to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
glebius	8e20fa5ae9	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
kib	560aa751e0	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
kib	8f845e475e	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
delphij	51d47afd85	Honor NFSv3 commit call (RFC 1813, Section 3.3.21) where when count is 0, the full length from offset is being flushed. Note that for now VOP_FSYNC does not support offset and length parameters so we still do the same full VOP_FSYNC. This issue was reported at FreeNAS support site as FreeNAS ticket #1096. Submitted by: "ceckerle" <ce.freenas eckerle net> Prodded by: gcooper Reviewed by: rmacklem MFC after: 2 weeks	2011-12-15 02:26:53 +00:00
jhb	6dededc9d9	Enhance the sequential access heuristic used to perform readahead in the NFS server and reuse it for writes as well to allow writes to the backing store to be clustered. - Use a prime number for the size of the heuristic table (1017 is not prime). - Move the logic to locate a heuristic entry from the table and compute the sequential count out of VOP_READ() and into a separate routine. - Use the logic from sequential_heuristic() in vfs_vnops.c to update the seqcount when a sequential access is performed rather than just increasing seqcount by 1. This lets the clustering count ramp up faster. - Allow for some reordering of RPCs and if it is detected leave the current seqcount as-is rather than dropping back to a seqcount of 1. Also, when out of order access is encountered, cut seqcount in half rather than dropping it all the way back to 1 to further aid with reordering. - Fix the new NFS server to properly update the next offset after a successful VOP_READ() so that the readahead actually works. Some of these changes came from an earlier patch by Bjorn Gronwall that was forwarded to me by bde@. Discussed with: bde, rmacklem, fs@ Submitted by: Bjorn Gronwall (1, 4) MFC after: 2 weeks	2011-12-01 18:46:28 +00:00
rmacklem	3e62df9adb	Fix the NFS servers so that they can do a Lookup of "..", which requires that ni_strictrelative be set to 0, post-r224810. Tested by: swills (earlier version), geo dot liaskos at gmail.com Approved by: re (kib)	2011-09-03 00:28:53 +00:00
netchild	cc4128c6b1	Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/ PMC/SYSV/...). No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:11:01 +00:00
alc	11491a4c5e	Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are incorrectly calling vm_object_page_clean(). They are passing the length of the range rather than the ending offset of the range. Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the callers. Reviewed by: kib MFC after: 3 weeks	2011-02-05 21:21:27 +00:00
pjd	76df586660	ZFS might not return monotonically increasing directory offset cookies, so turn off UFS-specific hack that assumes so in ZFS case. Before the change we can miss returning some directory entries to a NFS client. I believe that the hack should be moved to ufs_readdir(), but until we find somebody who will do it, turn it off for ZFS in NFS server code. Submitted by: rmacklem Discussed with: rmacklem, mckusick MFC after: 3 days	2010-12-28 21:12:15 +00:00
pjd	f4e75b41ae	Use newly added NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() to keep mount point busy. This fixes a race where we can pass invalid mount point to VFS_VGET() via vp->v_mount when exported file system was forcibly unmounted between nfsrv_fhtovp() and VFS_VGET(). Reviewed by: kib MFC after: 5 days	2010-12-21 23:15:40 +00:00
pjd	8503dc84a4	- Move pubflag and lockflag handling from nfsrv_fhtovp() to nfs_namei() - this is the only place that is different from all the other nfsrv_fhtovp() consumers. This simplifies nfsrv_fhtovp() a bit and also eliminates one vn_lock/VOP_UNLOCK() cycle in case of NFSv3. - Implement NFSRV_FLAG_BUSY flag for nfsrv_fhtovp() that tells it to leave mount point busy. Reviewed by: kib MFC after: 5 days	2010-12-21 23:12:45 +00:00
pjd	c1aba2a1e9	Reduce lock scope a little.	2010-12-19 18:06:20 +00:00
kib	d0d6cc47b6	VOP_ISLOCKED() should not be used to determine if the vnode is locked. Explicitly track the locked status of the vnode. Reviewed by: pjd Tested by: avg MFC after: 1 week	2010-12-15 12:46:53 +00:00
kib	5fa1e0e510	Fix a bug in r214049. The nvp == vp case shall be handled specially only for !usevget case. If VFS_VGET is working, the vnode shared lock is obtained recursively and vput() shall be done, not vunref(). Submitted by: rmacklem Tested by: Josh Carroll <josh.carroll gmail com> MFC after: 3 days	2010-11-05 21:13:16 +00:00
kib	c4752b1717	When readdirplus() is handled on the exported filesystem that does not support VFS_VGET, like msdosfs, do not call VOP_LOOKUP() for dotdot on the root directory. Our filesystems expect that VFS handles dotdot lookups on root on its own. Reported and tested by: kevlo MFC after: 2 weeks	2010-10-19 08:55:31 +00:00
pjd	f8dd61b4d9	- When VFS_VGET() is not supported, switch to VOP_LOOKUP(). - We are fine by only share-locking the vnode. - Remove assertion that doesn't hold for ZFS where we cross mount points boundaries by going into .zfs/snapshot/<name>/. Reviewed by: rmacklem MFC after: 1 month	2010-08-26 23:41:40 +00:00
jhb	045df5c8fd	Properly return an error reply if an NFS remove or link operation fails. Previously the failing operation would allocate an mbuf and construct an error reply, but because the function did not return 0, the NFS server assumed it had failed to generate a reply and would leak the reply mbuf as well as not sending the reply to the NFS client. PR: kern/140853 Submitted by: Ted Faber faber at isi edu (remove) Reviewed by: rmacklem (remove) MFC after: 1 week	2009-12-03 20:59:28 +00:00
pjd	62c08fc476	Ensure that tv_sec is between INT32_MIN and INT32_MAX, so ZFS won't object. This completes the fix from r185586. PR: kern/139059 Reported by: Daniel Braniss <danny@cs.huji.ac.il> Submitted by: Jaakko Heinonen <jh@saunalahti.fi> Tested by: Daniel Braniss <danny@cs.huji.ac.il> MFC after: 3 days	2009-09-26 18:23:16 +00:00
pjd	f02d7edecd	Correct typo after manual patching. Noticed by: b. f.	2009-09-09 13:23:26 +00:00
pjd	ef2355e38d	Fix usecount leak in mknod(2) on file system exported over NFS. While I'm here, correct typo in comment. Reviewed by: kan, kib MFC after: 3 days	2009-09-09 12:56:05 +00:00
dfr	5d248bb05f	Remove the old kernel RPC implementation and the NFS_LEGACYRPC option. Approved by: re	2009-06-30 19:03:27 +00:00
attilio	1dcb84131b	Remove the thread argument from the FSD (File-System Dependent) parts of the VFS. Now all the VFS_* functions and relating parts don't want the context as long as it always refers to curthread. In some points, in particular when dealing with VOPs and functions living in the same namespace (eg. vflush) which still need to be converted, pass curthread explicitly in order to retain the old behaviour. Such loose ends will be fixed ASAP. While here fix a bug: now, UFS_EXTATTR can be compiled alone without the UFS_EXTATTR_AUTOSTART option. VFS KPI is heavily changed by this commit so third-party modules need to be recompiled. Bump __FreeBSD_version in order to signal such situation.	2009-05-11 15:33:26 +00:00
jhb	26e338d6fc	Use shared vnode locks when invoking VOP_READDIR(). MFC after: 1 month	2009-02-13 18:18:14 +00:00
kensmith	7d088b7fca	Handle VFS_VGET() failing with an error other than EOPNOTSUPP in addition to failing with that error. PR: 125149 Submitted by: Jaakko Heinonen (jh <at> saunalahti <dot> fi) Reviewed by: mohans, kan MFC after: 3 days	2008-12-16 04:34:09 +00:00
kan	c7b0520697	Change nfsserver slightly so that it does not trip over the timestamp validation code on ZFS. Problem: when opening file with O_CREAT|O_EXCL NFS has to jump through extra hoops to ensure O_EXCL semantics. Namely, client supplies 8 bytes (NFSX_V3CREATEVERF) of verification data to uniquely identify this create request. Server then creates a new file with access mode 0, copies received 8 bytes into va_atime member of struct vattr and attempts to set the atime on file using VOP_SETATTR. If that succeeds, it fetches file attributes with VOP_GETATTR and verifies that atime timestamps match. If timestamps do not match, NFS server concludes it has probably lost the race to another process creating the file with the same name and bails with EEXIST. This scheme works OK when exported FS is FFS, but if underlying filesystem is ZFS _and_ server is running 64bit kernel, it breaks down due to sanity checking in zfs_setattr function, which refuses to accept any timestamps which have tv_sec that cannot be represented as 32bit int. Since struct timespec fields are 64 bit integers on 64bit platforms and server just copies NFSX_V3CREATEVERF bytes into va_atime, all eight bytes supplied by client end up in va_atime.tv_sec, forcing it out of valid 32bit range. The solution this change implements is simple: it treats NFSX_V3CREATEVERF as two 32bit integers and unpacks them separately into va_atime.tv_sec and va_atime.tv_nsec respectively, thus guaranteeing that tv_sec remains in 32 bit range and ZFS remains happy. Reviewed by: kib	2008-12-03 17:54:09 +00:00
dfr	2fb03513fc	Implement support for RPCSEC_GSS authentication to both the NFS client and server. This replaces the RPC implementation of the NFS client and server with the newer RPC implementation originally developed (actually ported from the userland sunrpc code) to support the NFS Lock Manager. I have tested this code extensively and I believe it is stable and that performance is at least equal to the legacy RPC implementation. The NFS code currently contains support for both the new RPC implementation and the older legacy implementation inherited from the original NFS codebase. The default is to use the new implementation - add the NFS_LEGACYRPC option to fall back to the old code. When I merge this support back to RELENG_7, I will probably change this so that users have to 'opt in' to get the new code. To use RPCSEC_GSS on either client or server, you must build a kernel which includes the KGSSAPI option and the crypto device. On the userland side, you must build at least a new libc, mountd, mount_nfs and gssd. You must install new versions of /etc/rc.d/gssd and /etc/rc.d/nfsd and add 'gssd_enable=YES' to /etc/rc.conf. As long as gssd is running, you should be able to mount an NFS filesystem from a server that requires RPCSEC_GSS authentication. The mount itself can happen without any kerberos credentials but all access to the filesystem will be denied unless the accessing user has a valid ticket file in the standard place (/tmp/krb5cc_<uid>). There is currently no support for situations where the ticket file is in a different place, such as when the user logged in via SSH and has delegated credentials from that login. This restriction is also present in Solaris and Linux. In theory, we could improve this in future, possibly using Brooks Davis' implementation of variant symlinks. Supporting RPCSEC_GSS on a server is nearly as simple. You must create service creds for the server in the form 'nfs/<fqdn>@<REALM>' and install them in /etc/krb5.keytab. 
The standard heimdal utility ktutil makes this fairly easy. After the service creds have been created, you can add a '-sec=krb5' option to /etc/exports and restart both mountd and nfsd. The only other difference an administrator should notice is that nfsd doesn't fork to create service threads any more. In normal operation, there will be two nfsd processes, one in userland waiting for TCP connections and one in the kernel handling requests. The latter process will create as many kthreads as required - these should be visible via 'top -H'. The code has some support for varying the number of service threads according to load but initially at least, nfsd uses a fixed number of threads according to the value supplied to its '-n' option. Sponsored by: Isilon Systems MFC after: 1 month	2008-11-03 10:38:00 +00:00
trhodes	3c9c77e154	Document a few sysctls in the NFS client and server code. Minor style(9) where applicable. Approved by: alfred (slightly older version)	2008-11-02 17:00:23 +00:00
trasz	0ad8692247	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
des	66f807ed8b	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
rwatson	25ed07a2fe	Turn XXX's for unlocked writes of NFS server statistics to simple notes, as we consider it a feature to exchange performance for consistency. MFC after: 3 days	2008-10-12 20:06:59 +00:00
attilio	23ff3dbeb8	Remove the suser(9) interface from the kernel. It has been replaced from years by the priv_check(9) interface and just very few places are left. Note that compatibility stub with older FreeBSD version (all above the 8 limit though) are left in order to reduce diffs against old versions. It is responsibility of the maintainers for any module, if they think it is the case, to axe out such cases. This patch breaks KPI so __FreeBSD_version will be bumped into a later commit. This patch needs to be credited 50-50 with rwatson@ as he found time to explain me how the priv_check() works in detail and to review patches. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com> Reviewed by: rwatson	2008-09-17 15:49:44 +00:00
attilio	a9873f87a6	Decontext-alize the nfsserver module. Now, only some few places still require thread passing (mostly the ones which access to VOP_* functions) and will be fixed once the primitive also will be. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-09-16 21:57:39 +00:00
attilio	dbf35e279f	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
kib	f4b9bd396a	Change the fix in the rev. 1.179 to use nfsrv_lockedpair_nd(). Tested by: pho MFC after: 3 days	2008-05-28 16:23:17 +00:00
kib	2e00a34c1d	Initialize vfslocked prior to calling nfsm_srvmtofh where it was forgotten. Reported by: Andrew Edwards <aedwards sandvine com> Tested by: pho MFC after: 3 days	2008-05-28 16:21:32 +00:00
ru	3b1bf8c2e9	Replaced the misleading uses of a historical artefact M_TRYWAIT with M_WAIT. Removed dead code that assumed that M_TRYWAIT can return NULL; it's not true since the advent of MBUMA. Reviewed by: arch There are ongoing disputes as to whether we want to switch to directly using UMA flags M_WAITOK/M_NOWAIT for mbuf(9) allocation.	2008-03-25 09:39:02 +00:00
jeff	a9d123c3ab	- Complete part of the unfinished bufobj work by consistently using BO_LOCK/UNLOCK/MTX when manipulating the bufobj. - Create a new lock in the bufobj to lock bufobj fields independently. This leaves the vnode interlock as an 'identity' lock while the bufobj is an io lock. The bufobj lock is ordered before the vnode interlock and also before the mnt ilock. - Exploit this new lock order to simplify softdep_check_suspend(). - A few sync related functions are marked with a new XXX to note that we may not properly interlock against a non-zero bv_cnt when attempting to sync all vnodes on a mountlist. I do not believe this race is important. If I'm wrong this will make these locations easier to find. Reviewed by: kib (earlier diff) Tested by: kris, pho (earlier diff)	2008-03-22 09:15:16 +00:00
kib	02dada141b	Fix the Giant leak in the nfsrv_remove(). Reported by: pluknet <pluknet gmail com> MFC after: 1 week	2008-03-04 11:05:03 +00:00
attilio	71b7824213	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
attilio	18d0a0dd51	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
jhb	1f6b3a5f2c	Add a -z flag to nfsstat which zeros the NFS statistics after displaying them. MFC after: 1 week Requested by: ps Submitted by: ps (6 years ago)	2007-10-18 16:38:07 +00:00
rwatson	f938c62f4e	Include priv.h to pick up suser(9) definitions, missed in an earlier commit. Warnings spotted by: kris	2007-06-13 22:42:43 +00:00
mjacob	24a416aad5	Init timespec to zero to quiesce warnings.	2007-06-10 04:42:20 +00:00
jhb	c2c01f044f	Initialize vfslocked to 0 before nfsm_srvmtofh() so that the variable is not used uninitialized in 'nfsmout' if nfsm_srvmtofh() gets an internal error. CID: 1766 Found by: Coverity Prevent (tm)	2007-03-26 15:14:58 +00:00
jeff	d43d58ff45	- Turn all explicit giant acquires into conditional VFS_LOCK_GIANTs. Only ops which used namei still remained. - Implement a scheme for reducing the overhead of tracking which vops require giant by constantly reducing the number of recursive giant acquires to one, leaving us with only one vfslocked variable. - Remove all NFSD lock acquisition and release from the individual nfs ops. Careful examination has shown that they are not required. This greatly simplifies the code. Sponsored by: Isilon Systems, Inc. Discussed with: rwatson Tested by: kkenn Approved by: re	2007-03-17 18:18:08 +00:00
pjd	cb2d7c85a8	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
mpp	f66eda706d	Get the vfs giant lock before calling nfs_access. Reviewed by: mohan	2007-02-13 03:27:45 +00:00
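
The message for commit c7b0520697 above describes treating the 8-byte NFSv3 create verifier as two 32-bit integers so that tv_sec always stays representable as a 32-bit int. The following is only a minimal userland sketch of that idea, not the committed kernel code: `struct sketch_timespec`, `unpack_createverf()`, `fits_in_int32()` and `CREATEVERF_LEN` are hypothetical names introduced here for illustration, and the struct merely stands in for a timespec with 64-bit fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror NFSX_V3CREATEVERF (8 bytes, per the commit message). */
#define CREATEVERF_LEN 8

/* Hypothetical stand-in for struct timespec on a 64-bit platform. */
struct sketch_timespec {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/*
 * Split the verifier into two 32-bit halves and store one half in each
 * field, instead of copying all 8 bytes over tv_sec.
 */
static void
unpack_createverf(const unsigned char verf[CREATEVERF_LEN],
    struct sketch_timespec *ts)
{
	int32_t halves[2];

	assert(verf != NULL && ts != NULL);
	memcpy(halves, verf, sizeof(halves));
	ts->tv_sec = halves[0];		/* sign-extended, still within int32 range */
	ts->tv_nsec = halves[1];
}

/* The kind of range check the commit message attributes to zfs_setattr. */
static int
fits_in_int32(int64_t v)
{
	return (v >= INT32_MIN && v <= INT32_MAX);
}
```

Because each field is filled from an int32_t, any verifier the client sends yields a tv_sec that passes a 32-bit range check, which is the property the commit message relies on.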


218 Commits