freebsd-dev

Author	SHA1	Message	Date
Pedro F. Giffuni	b187108f66	ext2fs: Add some DOINGASYNC checks to match ffs. This is mostly cosmetic. Reviewed by: bde MFC after: 3 days	2013-01-18 19:11:17 +00:00
John Baldwin	d177f14da9	Use vfs_timestamp() to set file timestamps rather than invoking getmicrotime() or getnanotime() directly in NFS. Reviewed by: rmacklem, bde MFC after: 1 week	2013-01-18 18:43:38 +00:00
John Baldwin	39804bc89d	Remove a no-longer-used variable after the previous change to use VA_UTIMES_NULL. Submitted by: bde, rmacklem MFC after: 1 week	2013-01-17 18:45:20 +00:00
John Baldwin	5055536eec	Use the VA_UTIMES_NULL flag to detect when NULL was passed to utimes() instead of comparing the desired time against the current time as a heuristic. Reviewed by: rmacklem MFC after: 1 week	2013-01-16 21:52:31 +00:00
Konstantin Belousov	e8f966eeb8	Remove the filtering of the acceptable mount options for nullfs, added in r245004. Although the report was for noatime option which is non-functional for the nullfs, other standard options like nosuid or noexec are useful with it. Reported by: Dewayne Geraghty <dewayne.geraghty@heuristicsystems.com.au> MFC after: 3 days	2013-01-16 05:32:49 +00:00
John Baldwin	6910d7a0d8	- More properly handle interrupted NFS requests on an interruptible mount by returning an error of EINTR rather than EACCES. - While here, bring back some (but not all) of the NFS RPC statistics lost when krpc was committed. Reviewed by: rmacklem MFC after: 1 week	2013-01-15 22:08:17 +00:00
Konstantin Belousov	603f963e56	The current default size of the nullfs hash table used to lookup the existing nullfs vnode by the lower vnode is only 16 slots. Since the default mode for the nullfs is to cache the vnodes, hash has extremely huge chains. Size the nullfs hashtbl based on the current value of desiredvnodes. Use vfs_hash_index() to calculate the hash bucket for a given vnode. Pointy hat to: kib Diagnosed and reviewed by: peter Tested by: peter, pho (previous version) Sponsored by: The FreeBSD Foundation MFC after: 5 days	2013-01-14 05:44:47 +00:00
Konstantin Belousov	6b17595133	When nullfs mount is forcibly unmounted and nullfs vnode is reclaimed, get back the leased write reference from the lower vnode. There is no other path which can correct v_writecount on the lowervp. Reported by: flo Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 days	2013-01-10 18:24:48 +00:00
Baptiste Daroussin	3d94054c30	Add support for the IO_APPEND flag in fuse. This makes open(..., O_APPEND) actually work on fuse filesystems. Reviewed by: attilio	2013-01-08 12:21:50 +00:00
Pedro F. Giffuni	98f5c0d41a	ext2fs: clean up the dinode structure. It was plagued with style errors and the offsets had been lost. While here, take the time to update the fields according to the latest ext4 documentation. Reviewed by: bde MFC after: 3 days	2013-01-07 03:36:32 +00:00
Gleb Kurtsou	4fd5efe79e	tmpfs: Replace directory entry linked list with RB-Tree. Use file name hash as a tree key, handle duplicate keys. Both VOP_LOOKUP and VOP_READDIR operations utilize the same tree for search. Directory entry offset (cookie) is either file name hash or incremental id in case of hash collisions (duplicate-cookies). Keep a sorted per-directory list of duplicate-cookie entries to facilitate cookie number allocation. Don't fail if previous VOP_READDIR() offset is no longer valid, start with next dirent instead. Other file systems handle it similarly. Work around race-prone tn_readdir_last[pn] fields update. Add tmpfs_dir_destroy() to free all dirents. Set NFS cookies in tmpfs_dir_getdents(). Return EJUSTRETURN from tmpfs_dir_getdents() instead of hard coded -1. Mark directory traversal routines static as they are no longer used outside of tmpfs_subr.c	2013-01-06 22:15:44 +00:00
Konstantin Belousov	268dd286a0	Fix reversed condition in the assertion. Pointy hat to: kib MFC after: 13 days	2013-01-04 07:52:47 +00:00
Konstantin Belousov	9cf4c952ca	Add the "nocache" nullfs mount option, which disables the caching of the free nullfs vnodes, switching nullfs behaviour to pre-r240285. The option is mostly intended as the last-resort when higher pressure on the vnode cache due to doubling of the vnode counts is not desirable. Note that disabling the cache costs more than 2x wall time in the metadata-hungry scenarios. The default is "cache". Tested and benchmarked by: pho (previous version) MFC after: 2 weeks	2013-01-03 19:17:57 +00:00
Konstantin Belousov	6b54784391	Remove the last use of the deprecated MNT_VNODE_FOREACH interface in the tree. With the help from: mjg Tested by: Ronald Klop <ronald-freebsd8@klop.yi.org> MFC after: 2 weeks	2013-01-03 19:01:56 +00:00
Konstantin Belousov	ad9789f6db	Do not force a writer to the devfs file to drain the buffer writes. Requested and tested by: Ian Lepore <freebsd@damnhippie.dyndns.org> MFC after: 2 weeks	2012-12-23 22:43:27 +00:00
Pedro F. Giffuni	e28f5d5222	More constant renaming in preparation for newer features. We also try to make better use of the fs flags instead of trying to adapt the code according to the fs structures. In the case of subsecond timestamps and birthtime we now check that the feature is explicitly enabled: previously we only checked that the reserved space was available and silently wrote them. This approach is much safer, especially if the filesystem happens to use embedded inodes or support EAs. Discussed with: Zheng Liu MFC after: 5 days	2012-12-20 02:22:36 +00:00
Rick Macklem	ef8f1261d2	Add "nfsstat -m" support for the two new NFS mount options added by r244042.	2012-12-09 22:23:50 +00:00
Rick Macklem	1f60bfd822	Move the NFSv4.1 client patches over from projects/nfsv4.1-client to head. I don't think the NFS client behaviour will change unless the new "minorversion=1" mount option is used. It includes basic NFSv4.1 support plus support for pNFS using the Files Layout only. All problems detected during an NFSv4.1 Bakeathon testing event in June 2012 have been resolved in this code and it has been tested against the NFSv4.1 server available to me. Although not reviewed, I believe that kib@ has looked at it.	2012-12-08 22:52:39 +00:00
Gleb Smirnoff	eb1b1807af	Mechanically substitute flags from historic mbuf allocator with malloc(9) flags within sys. Exceptions: - sys/contrib not touched - sys/mbuf.h edited manually	2012-12-05 08:04:20 +00:00
Rick Macklem	99d2727d67	Add an nfssvc() option to the kernel for the new NFS client which dumps out the actual options being used by an NFS mount. This will be used to implement a "-m" option for nfsstat(1). Reviewed by: alfred MFC after: 2 weeks	2012-12-02 01:16:04 +00:00
Pedro F. Giffuni	371f338bfd	Update some definitions or make them match NetBSD's headers. Bring several definitions required for newer ext4 features. Rename EXT2F_COMPAT_HTREE to EXT2F_COMPAT_DIRHASHINDEX since it is not being used yet and the new name is more compatible with NetBSD and Linux. This change is purely cosmetic and has no effect on the real code. Obtained from: NetBSD MFC after: 3 days	2012-11-28 15:48:32 +00:00
Pedro F. Giffuni	7306dea4e8	Partially bring r242520 to ext2fs. When a file is first being written, the dynamic block reallocation (implemented by ext2_reallocblks) relocates the file's blocks so as to cluster them together into a contiguous set of blocks on the disk. When the cluster crosses the boundary into the first indirect block, the first indirect block is initially allocated in a position immediately following the last direct block. Block reallocation would usually destroy locality by moving the indirect block out of the way to keep the data blocks contiguous. The issue was diagnosed long ago by Bruce Evans on ffs and surfaced on ext2fs when block reallocation was ported. This is only a partial solution based on the similarities with FFS. We still require more review of the allocation details that vary in ext2fs. Reported by: bde MFC after: 1 week	2012-11-28 00:36:40 +00:00
Davide Italiano	42039c5bce	- smbfs_rename() might return an error value without correctly upgrading the vnode use count, and this might cause the kernel to panic if compiled with WITNESS enabled. - Be sure to put the '\0' terminator to the rpath string. Sponsored by: iXsystems inc.	2012-11-26 04:29:47 +00:00
Davide Italiano	2c4415419f	- Remove reset of vpp pointer in some places as long as it's not really useful and has the side effect of obfuscating the code a bit. - Remove spurious references to simple_lock. Reported by: attilio [1] Sponsored by: iXsystems inc.	2012-11-22 09:13:45 +00:00
Davide Italiano	80704a47af	Until now, smbfs_fullpath() computed the full path starting from the vnode and following back the chain of n_parent pointers up to the root, without acquiring the locks of the n_parent vnodes analyzed during the computation. This is immediately wrong because if the vnode lock is not held there's no guarantee on the validity of the vnode pointer or the data. In order to fix, store the whole path in the smbnode structure so that smbfs_fullpath() can use this information. Discussed with: kib Reported and tested by: pho Sponsored by: iXsystems inc.	2012-11-22 08:58:29 +00:00
Konstantin Belousov	6db79c26ce	Remove the check and panic for an impossible condition. The NULL lowervp vnode v_vnlock would cause panic due to NULL pointer dereference much earlier. MFC after: 1 week	2012-11-20 15:25:00 +00:00
Attilio Rao	c6e0355cee	r16312 has not been relevant for many years (likely since VFS received granular locking), but the comment present in UFS has been copied incorrectly, several times, all over other filesystems' code. Remove comments that make no sense now. Reviewed by: kib MFC after: 3 days	2012-11-19 22:43:45 +00:00
Konstantin Belousov	134eb42e24	In pget(9), if PGET_NOTWEXIT flag is not specified, also search the zombie list for the pid. This allows several kern.proc sysctls to report useful information for zombies. Hold the allproc_lock around all searches instead of relocking it. Remove private pfind_locked() from the new nfs client code. Requested and reviewed by: pjd Tested by: pho MFC after: 3 weeks	2012-11-16 08:25:06 +00:00
Konstantin Belousov	6feceb86ab	Remove M_USE_RESERVE from the devfs cdp allocator, which is one of two uses of M_USE_RESERVE in the kernel. This allocation is not special. Reviewed by: alc Tested by: pho MFC after: 2 weeks	2012-11-14 19:50:21 +00:00
Davide Italiano	e631d5ab78	Get rid of some old debug code. It provides checks similar to the one offered by RedZone so there's no need to keep it. Sponsored by: iXsystems inc.	2012-11-14 19:10:50 +00:00
Davide Italiano	9dbe0b121c	Fix the lookup in the DOTDOT case in the same way as other filesystems do, i.e. inlining the vn_vget_ino() algorithm. Sponsored by: iXsystems inc.	2012-11-14 18:43:58 +00:00
Attilio Rao	1750b7b9c8	- Protect mnt_data and mnt_flags under the mount interlock - Move mp->mnt_stat manipulation where all of them happens Reported by: davide Discussed with: kib Tested by: flo MFC after: 2 months X-MFC: 241519, 242536,242616, 242727	2012-11-10 19:32:16 +00:00
Attilio Rao	bc2258da88	Complete MPSAFE VFS interface and remove MNTK_MPSAFE flag. Porters should refer to __FreeBSD_version 1000021 for this change as it may have happened at the same timeframe.	2012-11-09 18:02:25 +00:00
Attilio Rao	d9454fab30	- Current caching mode is completely broken because it simply relies on timing of the operations and not real lookup, bringing too many false positives. Remove the whole mechanism. If it needs to be implemented, next time it should really be done in the proper way. - Fix VOP_GETATTR() in order to cope with userland bugs that would change the type of file and not panic. Instead it gets the entry as if it is not existing. Reported and tested by: flo MFC after: 2 months X-MFC: 241519, 242536,242616	2012-11-08 00:32:49 +00:00
Attilio Rao	2810826df9	fuse_io* must be able to crunch also VDIR vnodes. Update assert appropriately. Reported and Tested by: flo MFC after: 2 months X-MFC: 241519,242536	2012-11-05 15:23:54 +00:00
Attilio Rao	6de3b00db6	Fix a bug where operations were carried out even if not implemented, leading to handling of an invalid fdip object. Reported and tested by: flo MFC after: 2 months X-MFC: 241519	2012-11-03 23:32:32 +00:00
Konstantin Belousov	140dedb81c	The r241025 fixed the case when a binary, executed from nullfs mount, was still possible to open for write from the lower filesystem. There is a symmetric situation where the binary could already have file descriptors opened for write, but it can be executed from the nullfs overlay. Handle the issue by passing one v_writecount reference to the lower vnode if nullfs vnode has non-zero v_writecount. Note that only one write reference can be donated, since nullfs only keeps one use reference on the lower vnode. Always use the lower vnode v_writecount for the checks. Introduce the VOP_GET_WRITECOUNT to read v_writecount, which is currently always bypassed to the lower vnode, and VOP_ADD_WRITECOUNT to manipulate the v_writecount value, which manages a single bypass reference to the lower vnode. Calling the VOPs instead of directly accessing v_writecount provides the fix described in the previous paragraph. Tested by: pho MFC after: 3 weeks	2012-11-02 13:56:36 +00:00
Davide Italiano	8680dc800f	- Do not put in the mntqueue half-constructed vnodes. - Change the code so that it relies on vfs_hash rather than on a home-made hashtable. - There's no need to inline fnv_32_buf(). Reviewed by: delphij Tested by: pho Sponsored by: iXsystems inc.	2012-10-31 03:55:33 +00:00
Davide Italiano	afe097512c	Fix panic due to page faults while in kernel mode, under conditions of VM pressure. The reason is that in some codepaths pointers to stack variables were passed from one thread to another. In collaboration with: pho Reported by: pho's stress2 suite Sponsored by: iXsystems inc.	2012-10-31 03:34:07 +00:00
Davide Italiano	994f027fbc	Change the code to use %jd as printf() placeholder for uio_offset and cast to intmax_t. Suggested by: pjd Sponsored by: iXsystems inc.	2012-10-31 02:54:44 +00:00
Davide Italiano	469cb18f88	Fix build in case we have SMBVDEBUG turned on. Reviewed by: gnn Approved by: gnn Sponsored by: iXsystems inc.	2012-10-25 21:08:02 +00:00
Davide Italiano	8d9495bb1d	- Remove the references to the deprecated zalloc kernel interface - Use M_ZERO flag in malloc() rather than bzero() - malloc() with M_NOWAIT can't return NULL so there's no need to check Reviewed by: alc Approved by: alc	2012-10-25 20:23:04 +00:00
Konstantin Belousov	5050aa86cf	Remove the support for using non-mpsafe filesystem modules. In particular, do not lock Giant conditionally when calling into the filesystem module, remove the VFS_LOCK_GIANT() and related macros. Stop handling buffers belonging to non-mpsafe filesystems. The VFS_VERSION is bumped to indicate the interface change which does not result in the interface signatures changes. Conducted and reviewed by: attilio Tested by: pho	2012-10-22 17:50:54 +00:00
Eitan Adler	db702c59cf	remove duplicate semicolons where possible. Approved by: cperciva MFC after: 1 week	2012-10-22 03:00:37 +00:00
Ed Schouten	9671713425	Remove unneeded D_NEEDMINOR. This is only needed when using clonelists. This got removed in r238693.	2012-10-18 19:28:31 +00:00
Rick Macklem	6001db296e	Add two new options to the nfssvc(2) syscall that allow processes running as root to suspend/resume execution of the kernel nfsd threads. An earlier version of this patch was tested by Vincent Hoffman (vince at unsane.co.uk) and John Hickey (jh at deterlab.net). Reviewed by: kib MFC after: 2 weeks	2012-10-14 22:33:17 +00:00
Konstantin Belousov	82ed933c6f	Grammar fixes. Submitted by: bf MFC after: 1 week	2012-10-14 18:13:33 +00:00
Konstantin Belousov	806efacae0	Replace the XXX comment with the proper description. MFC after: 1 week	2012-10-14 17:07:34 +00:00
Attilio Rao	4cff153b87	Rename s/DEBUG()/FS_DEBUG() and s/DEBUG2G()/FS_DEBUG2G() in order to avoid a name clash in sparc64. MFC after: 2 months X-MFC: r241519	2012-10-14 03:51:59 +00:00
Attilio Rao	5fe580195f	Import a FreeBSD port of the FUSE Linux module. This was developed during two Summer of Code mandates and revived by gnn recently. The functionality in this commit entirely mirrors the content of the fusefs-kmod port, which doesn't need to be installed anymore for -CURRENT setups. In order to get some sparse technical notes, please refer to: http://lists.freebsd.org/pipermail/freebsd-fs/2012-March/013876.html or to the project branch: svn://svn.freebsd.org/base/projects/fuse/ which also contains the granular history of changes made during port refinements. This commit does not come from the branch reintegration itself because it seems svn is not behaving properly for this functionality at the moment. Partly Sponsored by: Google, Summer of Code program 2005, 2011 Originally submitted by: ilya, Csaba Henk <csaba-ml AT creo DOT hu > In collaboration with: pho Tested by: flo, gnn, Gustau Perez, Kevin Oberman <rkoberman AT gmail DOT com> MFC after: 2 months	2012-10-13 23:54:26 +00:00
Konstantin Belousov	877d24ac8a	Fix the mis-handling of the VV_TEXT on the nullfs vnodes. If you have a binary on a filesystem which is also mounted over by nullfs, you could execute the binary from the lower filesystem, or from the nullfs mount. When executed from lower filesystem, the lower vnode gets VV_TEXT flag set, and the file cannot be modified while the binary is active. But, if executed as the nullfs alias, only the nullfs vnode gets VV_TEXT set, and you still can open the lower vnode for write. Add a set of VOPs for the VV_TEXT query, set and clear operations, which are correctly bypassed to lower vnode. Tested by: pho (previous version) MFC after: 2 weeks	2012-09-28 11:25:02 +00:00
Matthew D Fleming	fc8fdae0df	Fix up kernel sources to be ready for a 64-bit ino_t. Original code by: Gleb Kurtsou	2012-09-27 23:30:49 +00:00
Rick Macklem	c52005a31d	Modify the NFSv4 client so that it can handle owner and owner_group strings that consist entirely of digits, interpreting them as the uid/gid number. This change was needed since new (>= 3.3) Linux servers reply with these strings by default. This change is mandated by the rfc3530bis draft. Reported on freebsd-stable@ under the Subject heading "Problem with Linux >= 3.3 as NFSv4 server" by Norbert Aschendorff on Aug. 20, 2012. Tested by: norbert.aschendorff at yahoo.de Reviewed by: jhb MFC after: 2 weeks	2012-09-20 02:49:25 +00:00
Ed Schouten	7cbef24e1a	Prefer __containerof() above member2struct(). The first does proper checking of the argument types, while the latter does not.	2012-09-15 19:28:54 +00:00
Konstantin Belousov	df3cbc41fa	The deadfs VOPs for vop_ioctl and vop_bmap call itself recursively, which is an elaborate way to cause kernel panic. Change the VOPs implementation to return EBADF for a reclaimed vnode. While the calls to vop_bmap should not reach deadfs, it is indeed possible for vop_ioctl, because the VOP locking protocol is to pass the vnode to VOP unlocked. The actual panic was observed when ioctl was called on procfs filedescriptor which pointed to an exited process. Reported by: zont Tested by: pho MFC after: 1 week	2012-09-13 13:05:45 +00:00
Kevin Lo	95c79b6082	Add VFCF_READONLY flag that indicates ntfs and xfs file systems are only supported as read-only.	2012-09-12 03:42:52 +00:00
Kevin Lo	6297d5d6f9	Prevent nump NULL pointer dereference in bmap_getlbns()	2012-09-11 09:38:32 +00:00
Kevin Lo	8e46bf68d1	Fix style nit	2012-09-11 08:36:41 +00:00
Rick Macklem	f4e2c07e73	Add a simple printf() based debug facility to the new nfs client. Use it for a printf() that can be harmlessly generated for mmap()'d files. It will be used extensively for the NFSv4.1 client. Debugging printf()s are enabled by setting vfs.nfs.debuglevel to a non-zero value. The higher the value, the more debugging printf()s. Reviewed by: jhb MFC after: 2 weeks	2012-09-09 21:00:45 +00:00
Konstantin Belousov	d9e9650a36	Allow shared lookups for nullfs mounts, if lower filesystem supports it. There are two problems which shall be addressed for shared lookups use to have measurable effect on nullfs scalability: 1. When vfs_lookup() calls VOP_LOOKUP() for nullfs, which passes lookup operation to lower fs, resulting vnode is often only shared-locked. Then null_nodeget() cannot instantiate covering vnode for lower vnode, since insmntque1() and null_hashins() require exclusive lock on the lower. Change the assert that lower vnode is exclusively locked to only require any lock. If null hash failed to find pre-existing nullfs vnode for lower vnode and the vnode is shared-locked, the lower vnode lock is upgraded. 2. Nullfs reclaims its vnodes on deactivation. This is due to nullfs inability to detect reclamation of the lower vnode. Reclamation of a nullfs vnode at deactivation time prevents a reference to the lower vnode from becoming stale. Change nullfs VOP_INACTIVE to not reclaim the vnode, instead use the VFS_RECLAIM_LOWERVP to get notification and reclaim upper vnode together with the reclamation of the lower vnode. Note that nullfs reclamation procedure calls vput() on the lowervp vnode, temporarily unlocking the vnode being reclaimed. This seems to be fine for MPSAFE filesystems, but not-MPSAFE code often put partially initialized vnode on some globally visible list, and later can decide that half-constructed vnode is not needed. If nullfs mount is created above such filesystem, then other threads might catch such not properly initialized vnode. Instead of trying to overcome this case, e.g. by recursing the lower vnode lock in null_reclaim_lowervp(), I decided to rely on nearby removal of the support for non-MPSAFE filesystems. In collaboration with: pho MFC after: 3 weeks	2012-09-09 19:20:23 +00:00
Pedro F. Giffuni	051b0df565	Add some basic definitions for a future htree implementation. MFC after: 3 days	2012-08-24 01:12:07 +00:00
Kevin Lo	5bb295c408	Fix typo	2012-08-18 16:13:16 +00:00
Mateusz Guzik	1ec9bedabe	Remove unused member of struct indir (in_exists) from UFS and EXT2 code. Reviewed by: mckusick Approved by: trasz (mentor) MFC after: 1 week	2012-08-17 17:45:27 +00:00
Hans Petter Selasky	07da61a6cc	Streamline use of cdevpriv and correct some corner cases. 1) It is not useful to call "devfs_clear_cdevpriv()" from "d_close" callbacks, since, for example, read, write, ioctl and so on might be sleeping at the time "d_close" is called, and the freed private data can then still be accessed. Examples: dtrace, linux_compat, ksyms (all fixed by this patch) 2) In sys/dev/drm* there are some cases in which memory will be freed twice, if open fails, first by code in the open routine, secondly by the cdevpriv destructor. Move registration of the cdevpriv to the end of the drm open routines. 3) devfs_clear_cdevpriv() is not called if the "d_open" callback registered cdevpriv data and the "d_open" callback function returned an error. Fix this. Discussed with: phk MFC after: 2 weeks	2012-08-15 16:19:39 +00:00
Konstantin Belousov	b6c00483e9	Do not leave invalid pages in the object after a short read for network file systems (not only NFS proper). Short reads cause pages other than the requested one, which were not filled by read response, to stay invalid. Change the vm_page_readahead_finish() interface to not take the error code, but instead to make a decision to free or to (de)activate the page only by its validity. As a result, not requested invalid pages are freed even if the read RPC indicated success. Noted and reviewed by: alc MFC after: 1 week	2012-08-14 11:45:47 +00:00
Konstantin Belousov	1c771f9222	After the PHYS_TO_VM_PAGE() function was de-inlined, the main reason to pull vm_param.h was removed. Other big dependency of vm_page.h on vm_param.h are PA_LOCK* definitions, which are only needed for in-kernel code, because modules use KBI-safe functions to lock the pages. Stop including vm_param.h into vm_page.h. Include vm_param.h explicitly for the kernel code which needs it. Suggested and reviewed by: alc MFC after: 2 weeks	2012-08-05 14:11:42 +00:00
Konstantin Belousov	0055cbd3c5	Reduce code duplication and exposure of direct access to struct vm_page oflags by providing helper function vm_page_readahead_finish(), which handles completed reads for pages with indexes other than the requested one, for VOP_GETPAGES(). Reviewed by: alc MFC after: 1 week	2012-08-04 18:16:43 +00:00
Konstantin Belousov	843dcea09e	The header uma_int.h is internal uma header, unused by this source file. Do not include it needlessly. Reviewed by: alc MFC after: 1 week	2012-08-04 18:12:54 +00:00
David Xu	5ff2bb52cc	Comparing the current pipe code with the one in 8.3-STABLE r236165, I found that 8.3 is a historical BSD version that uses sockets to implement FIFO pipes; it uses a per-file seqcount to compare with the writer generation stored in the per-pipe object. The concept is that after all writers are gone, the pipe enters the next generation, and all old readers that have not closed the pipe should get an indication that the pipe is disconnected: they should get EPIPE, SIGPIPE, or POLLHUP in poll(). A newcomer, however, should not know that previous writers were gone; it should treat the pipe as a fresh session. I am trying to bring the FIFO pipe back to the historical behavior. It is still unclear whether a single EOF flag can represent both SBS_CANTSENDMORE and SBS_CANTRCVMORE, which the socket-based version uses, but I have run the poll regression test in the tools directory, and the output is now the same as on 8.3-STABLE. I think the output "not ok 18 FIFO state 6b: poll result 0 expected 1. expected POLLHUP; got 0" might be bogus, because a newcomer should not know that old writers were gone. I got the same behavior on Linux. Our implementation always returns POLLIN for a disconnected pipe even when it should return POLLHUP, but I think it is not wise to remove POLLIN, for compatibility reasons: this is our historical behavior. Regression test: /usr/src/tools/regression/poll	2012-07-31 05:48:35 +00:00
David Xu	12a480fa41	When a thread is blocked in the direct write state, it sets only the PIPE_DIRECTW flag and not PIPE_WANTW, but the FIFO pipe code does not understand this internal state. When a FIFO peer reader closes the pipe and wants to notify the writer, it checks PIPE_WANTW and, if the flag is not set, skips calling wakeup(), so the blocked writer never notices. In general, the writer should return from the syscall with an EPIPE error code and may get a SIGPIPE signal. Setting PIPE_WANTW fixes the problem; turning off direct write should fix it too. This bug was found via PR/170203. Another bug in the FIFO pipe code is that when a peer closes the pipe, the other end blocked in select() or poll() is not notified, because the call to pipeselwakeup() is missing. A third problem was found in the poll regression test: the existing code cannot pass tests 6b, 6c, and 6d, but FreeBSD-4 works. This commit does not fix that problem; I still need to study more to find the cause. PR: 170203 Tested by: Garrett Copper < yanegomi at gmail dot com >	2012-07-31 02:00:37 +00:00
Kevin Lo	f7a3729c91	Use NULL instead of 0 for pointers	2012-07-22 15:40:31 +00:00
Christian Brueffer	01cc0b6531	Simplify error handling by moving the allocation of np down to where it is actually used. While here, improve style a little. Submitted by: mjg MFC after: 2 weeks	2012-07-16 22:07:29 +00:00
Christian Brueffer	9ce63ce2a3	Save a bzero() by using M_ZERO. Obtained from: Dragonfly BSD (change 4faaf07c3d7ddd120deed007370aaf4d90b72ebb) MFC after: 2 weeks	2012-07-15 15:50:12 +00:00
Attilio Rao	5ebe387ff7	Remove a check on MNTK_UPDATE that is not really necessary as it is handled in a code snippet above.	2012-07-10 00:23:25 +00:00
Attilio Rao	8806edb4ab	- Remove the unused and not completed write support for NTFS. - Fix a bug where vfs_mountedfrom() is called also when the filesystem is not mounted successfully. Tested by: pho	2012-07-10 00:01:00 +00:00
Kevin Lo	d6cc34a1ad	Fix a typo	2012-07-03 08:03:07 +00:00
Konstantin Belousov	c5c1199c83	Extend the KPI to lock and unlock f_offset member of struct file. It now fully encapsulates all accesses to f_offset, and extends f_offset locking to other consumers that need it, in particular, to lseek() and variants of getdirentries(). Ensure that on 32bit architectures f_offset, which is a 64bit quantity, is always read and written under the mtxpool protection. This fixes apparently easy to trigger race when parallel lseek()s or lseek() and read/write could destroy file offset. The already broken ABI emulations, including iBCS and SysV, are not converted (yet). Tested by: pho No objections from: jhb MFC after: 3 weeks	2012-07-02 21:01:03 +00:00
Konstantin Belousov	9d232eec30	Do not override an error from uiomove() with (non-)error result from bwrite(). VFS needs to know about EFAULT from uiomove() and does not care much that partially filled block writeback after EFAULT was successful. Early return without error causes short write to be reported to usermode. Reported and tested by: andreast MFC after: 3 weeks	2012-07-02 09:53:08 +00:00
Konstantin Belousov	ddfc47fdc9	Enable deadlock avoidance code for NFS client. MFC after: 2 weeks	2012-06-21 09:26:06 +00:00
Rick Macklem	53e1b8fba5	Fix the NFSv4 client for the case where mmap'd files are written, but not msync'd by a process. A VOP_PUTPAGES() called when VOP_RECLAIM() happens will usually fail, since the NFSv4 Open has already been closed by VOP_INACTIVE(). Add a vm_object_page_clean() call to the NFSv4 client's VOP_INACTIVE(), so that the write happens before the NFSv4 Open is closed. kib@ suggested using vgone() instead and I will explore this, but this patch fixes things in the meantime. For some reason, the VOP_PUTPAGES() is still attempted in VOP_RECLAIM(), but having this fail doesn't cause any problems except a "stateid0 in write" being logged. Reviewed by: kib MFC after: 1 week	2012-06-18 22:17:28 +00:00
Rick Macklem	79cafccd40	Move the nfsrpc_close() call in ncl_reclaim() for the NFSv4 client to below the vnode_destroy_vobject() call, since that is where writes are flushed. Suggested by: kib MFC after: 1 week	2012-06-17 18:34:04 +00:00
Konstantin Belousov	bfb68a9e43	Improve handling of uiomove(9) errors for the NFS client. Do not brelse() the buffer unconditionally with BIO_ERROR set if uiomove() failed. The brelse() treats most buffers with BIO_ERROR as B_INVAL, dropping their content. Instead, if the write request covered the whole buffer, remember the cached state and brelse() with BIO_ERROR set only if the buffer was not cached previously. Update the buffer dirtyoff/dirtyend based on the progress recorded by uiomove() in passed struct uio, even in the presence of error. Otherwise, usermode could see changed data in the backed pages, but later the buffer is destroyed without write-back. If uiomove() failed for IO_UNIT request, try to truncate the vnode back to the pre-write state, and rewind the progress in passed uio accordingly, following the FFS behaviour. Reviewed by: rmacklem (some time ago) Tested by: pho MFC after: 1 month	2012-06-06 16:30:16 +00:00
Konstantin Belousov	6eec26f5ad	Capitalize start of sentence. MFC after: 3 days	2012-05-30 14:00:23 +00:00
Marcel Moolenaar	e9b29d1604	Catch a corner case where ssegs could be 0 and thus i would be 0 and we index suinfo out of bounds (i.e. -1). Approved by: gber	2012-05-28 16:33:58 +00:00
Ed Schouten	0078e2fdeb	Fix style and consistency: - Use tabs, not spaces. - Add tab after #define. - Don't mix the use of BSD and ISO C unsigned integer types. Prefer the ISO C ones.	2012-05-27 09:34:47 +00:00
Gleb Kurtsou	fb2a3e6ea1	Use C99-style initialization for struct dirent in preparation for changing the structure. Sponsored by: Google Summer of Code 2011	2012-05-25 09:16:59 +00:00
Alexander Motin	d499701b0c	Revert devfs part of r235911. I was unaware about old but unfinished discussion between kib@ and gibbs@ about it.	2012-05-24 18:19:23 +00:00
Alexander Motin	f6ad3f237a	MFprojects/zfsd: Revamp the CAM enclosure services driver. This updated driver uses an in-kernel daemon to track state changes and publishes physical path location information for disk elements into the CAM device database. Sponsored by: Spectra Logic Corporation Sponsored by: iXsystems, Inc. Submitted by: gibbs, will, mav	2012-05-24 14:07:44 +00:00
Rick Macklem	f4b9a05a90	A problem with the NFSv4 server was reported by Andrew Leonard to freebsd-fs@, where the setfacl of an NFSv4 acl would fail. This was caused by the VOP_ACLCHECK() call for ZFS replying EOPNOTSUPP. After discussion with rwatson@, it was determined that a call to VOP_ACLCHECK() before doing VOP_SETACL() is not required. This patch fixes the problem by deleting the VOP_ACLCHECK() call. Tested by: Andrew Leonard (previous version) MFC after: 1 week	2012-05-17 21:52:17 +00:00
Grzegorz Bernacki	7f725bcd5c	Import work done under project/nand (@235533) into head. The NAND Flash environment consists of several distinct components: - NAND framework (drivers harness for NAND controllers and NAND chips) - NAND simulator (NANDsim) - NAND file system (NAND FS) - Companion tools and utilities - Documentation (manual pages) This work is still experimental. Please use with caution. Obtained from: Semihalf Supported by: FreeBSD Foundation, Juniper Networks	2012-05-17 10:11:18 +00:00
Pedro F. Giffuni	553c9b4d08	Fix a couple of issues that appear to be inherited from the old 8.x code: - If the lock cannot be acquired immediately unlocks 'bar' vnode and then locks both vnodes in order. - wrong vnode type panics from cache_enter_time after calls by ext2_lookup. The fix merges the fixes from ufs/ufs_lookup.c. Submitted by: Mateusz Guzik Approved by: jhb@ (mentor) Reviewed by: kib@ MFC after: 1 week	2012-05-16 15:53:38 +00:00
Gleb Kurtsou	ac13a90c4b	Skip directory entries with zero inode number during traversal. Entries with zero inode number are considered placeholders by libc and UFS. Fix remaining uses of VOP_READDIR in kernel: vop_stdvptocnp, unionfs. Sponsored by: Google Summer of Code 2011	2012-05-16 10:44:09 +00:00
Rick Macklem	2108487ead	Fix two cases in the new NFS server where a tsleep() is used, when the code should actually protect the tested variable with a mutex. Since the tsleep()s had a 10sec timeout, the race would have only delayed the allocation of a new clientid for a client. The sleeps will also rarely occur in practice, since having a callback in progress when a fresh clientid is being acquired by a client is unlikely. MFC after: 1 month	2012-05-12 22:20:55 +00:00
Rick Macklem	7af1242a34	PR# 165923 reported intermittent write failures for dirty memory mapped pages being written back on an NFS mount. Since any thread can call VOP_PUTPAGES() to write back a dirty page, the credentials of that thread may not have write access to the file on an NFS server. (Often the uid is 0, which may be mapped to "nobody" in the NFS server.) Although there is no completely correct fix for this (NFS servers check access on every write RPC instead of at open/mmap time), this patch avoids the common cases by holding onto a credential that recently opened the file for writing and uses that credential for the write RPCs being done by VOP_PUTPAGES() for both NFS clients. Tested by: Joel Ray Holveck (joelh at juniper.net) PR: kern/165923 Reviewed by: kib MFC after: 2 weeks	2012-05-12 12:02:51 +00:00
Sergey Kandaurov	7d5f5d83f5	Fix mount interlock oversights from the previous change in r234386. Reported by: dougb Submitted by: Mateusz Guzik <mjguzik at gmail com> Reviewed by: Kirk McKusick Tested by: pho	2012-05-10 20:28:33 +00:00
John W. De Boskey	3676a0d890	Use the common api helper routine instead of freeing the namei buffer directly. Approved by: rmacklem (mentor) MFC after: 1 month	2012-05-08 03:39:44 +00:00
Daichi GOTO	508a31f1a8	fixed a unionfs_readdir math issue PR: 132987 Submitted by: Matthew Fleming <mfleming@isilon.com>	2012-05-03 07:22:29 +00:00
Daichi GOTO	cb5736b73b	- fixed a vnode lock hang-up issue. - fixed an incorrect lock status issue. - fixed an incorrect lock issue of unionfs root vnode removed. (pointed out by keith) - fixed an infinite loop issue. (pointed out by dumbbell) - changed to do LK_RELEASE expressly when unlocked. Submitted by: ozawa@ongs.co.jp	2012-05-01 07:46:30 +00:00
Rick Macklem	4964d80705	It was reported via email that some non-FreeBSD NFS servers do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server. Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks	2012-04-27 22:23:06 +00:00
Rick Macklem	a607cc6d8e	Fix a leak of namei lookup path buffers that occurs when a ZFS volume is exported via the new NFS server. The leak occurred because the new NFS server code didn't handle the case where a file system sets the SAVENAME flag in its VOP_LOOKUP() and ZFS does this for the DELETE case. Tested by: Oliver Brandmueller (ob at gruft.de), hrs PR: kern/167266 MFC after: 1 month	2012-04-27 20:23:24 +00:00
Edward Tomasz Napierala	af6e6b87ad	Remove unused thread argument to vrecycle(). Reviewed by: kib	2012-04-23 14:10:34 +00:00
Edward Tomasz Napierala	c52fd858ae	Remove unused thread argument from vtruncbuf(). Reviewed by: kib	2012-04-23 13:21:28 +00:00
Kirk McKusick	f257ebbb2e	This change creates a new list of active vnodes associated with a mount point. Active vnodes are those with a non-zero use or hold count, e.g., those vnodes that are not on the free list. Note that this list is in addition to the list of all the vnodes associated with a mount point. To avoid adding another set of linkage pointers to the vnode structure, the active list uses the existing linkage pointers used by the free list (previously named v_freelist, now renamed v_actfreelist). This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-20 06:50:44 +00:00
Jaakko Heinonen	fd1062ce4c	Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots. Suggested by: bde	2012-04-18 15:22:08 +00:00
Kirk McKusick	71469bb38f	Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL. The primary changes are that the user of the interface no longer needs to manage the mount-mutex locking and that the vnode that is returned has its mutex locked (thus avoiding the need to check to see if it is DOOMED or other possible end of life scenarios). To minimize compatibility issues for third-party developers, the old MNT_VNODE_FOREACH interface will remain available so that this change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH will be removed in head. The reason for this update is to prepare for the addition of the MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-17 16:28:22 +00:00
Jaakko Heinonen	587fdb536f	Sync tmpfs_chflags() with the recent changes to UFS: - Add a check for unsupported file flags. - Return EPERM when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.	2012-04-16 18:10:34 +00:00
Jaakko Heinonen	c5ab5ce345	tmpfs: Allow update mounts only for certain options. Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options. Reviewed by: kevlo, trociny	2012-04-16 18:07:42 +00:00
Gleb Kurtsou	f8439900d6	Provide better description for vfs.tmpfs.memory_reserved sysctl. Suggested by: Anton Yuzhaninov <citrin@citrin.ru>	2012-04-15 21:59:28 +00:00
Jaakko Heinonen	295a542d96	Apply changes from r234103 to ext2fs: Return EPERM from ext2_setattr() when a user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags. Flags are now stored to ip->i_flags in one place after all checks. Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support that flag. Reviewed by: bde	2012-04-13 05:48:31 +00:00
Jaakko Heinonen	e6b8bdf252	Restore the blank line incorrectly removed in r234104. Pointed out by: bde	2012-04-11 15:48:50 +00:00
Jaakko Heinonen	034efc61ba	Apply changes from r233787 to ext2fs: - Use more natural ip->i_flags instead of vap->va_flags in the final flags check. - Style improvements. No functional change intended. MFC after: 2 weeks	2012-04-10 16:05:52 +00:00
Attilio Rao	a0f2c37b6f	- Introduce a cache-miss optimization for consistency with other accesses of the cache member of vm_object objects. - Use novel vm_page_is_cached() for checks outside of the vm subsystem. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039	2012-04-09 17:05:18 +00:00
Kirk McKusick	827e334c01	Add I/O accounting to msdos filesystem. Suggested and reviewed by: kib	2012-04-08 06:18:18 +00:00
Gleb Kurtsou	9295c62814	tmpfs supports only INT_MAX nodes due to limitations of unit number allocator. Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in memory is not likely to become possible in the foreseeable future and would require a new unit number allocator. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:30:46 +00:00
Gleb Kurtsou	0ff93c48da	Add vfs_getopt_size. Support human readable file system options in tmpfs. Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:27:34 +00:00
Gleb Kurtsou	da7aa2778e	Add reserved memory limit sysctl to tmpfs. Clean up available and used memory functions. Check if free pages are available before allocating new node. Discussed with: delphij	2012-04-07 15:23:51 +00:00
Konstantin Belousov	a53373fabe	Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client behaviour on error from write RPC back to behaviour of old nfs client. When set to not zero, the pages for which write failed are kept dirty. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks	2012-03-17 23:03:20 +00:00
Gleb Kurtsou	db94ad126a	Prevent tmpfs_rename() deadlock in a way similar to UFS Unlock vnodes and try to lock them one by one. Relookup fvp and tvp. Approved by: mdf (mentor)	2012-03-14 09:15:50 +00:00
Gleb Kurtsou	ca846258e2	Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same. Don't mess with vnode holdcount, vget() takes care of it already. Approved by: mdf (mentor)	2012-03-14 08:29:21 +00:00
Kevin Lo	11753bd018	Use NULL instead of 0	2012-03-13 10:04:13 +00:00
Konstantin Belousov	0e738b4c0f	Update comment. Submitted by: gianni	2012-03-11 15:58:27 +00:00
Konstantin Belousov	b80dcb55aa	Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h. Submitted by: gianni	2012-03-11 12:19:58 +00:00
Pedro F. Giffuni	035e4e0494	Add support for ns timestamps and birthtime to the ext2/3 driver. When using big inodes there is sufficient space in ext3 to keep extra resolution and birthtime (creation) timestamps. The appropriate fields in the on-disk inode have been approved for a long time but support for this in ext3 has not been widely distributed. In preparation for ext4 most linux distributions have enabled by default such bigger inodes and some people use nanosecond timestamps in ext3. We now support those when the inode is big enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE, we maintain the extra timestamps even when they are not used. An additional note by Bruce Evans: We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but all file systems have always done that. When POSIX gets around to specifying the behaviour, it will probably require certain rounding to the fs's resolution and not rejecting the request. This unfortunately means that syscalls that set times can't really tell if they succeeded without reading back the times using stat() or similar and checking that they were set close enough. Reviewed by: bde Approved by: jhb (mentor) MFC after: 2 weeks	2012-03-08 21:06:05 +00:00
John Baldwin	b47f624183	Add KTR_VFS traces to track modifications to a vnode's writecount.	2012-03-08 20:27:20 +00:00
Konstantin Belousov	f950879e16	The pipe_poll() performs lockless access to the vnode to test fifo_iseof() condition, allowing the v_fifoinfo to be reset and freed by fifo_cleanup(). Precalculate EOF at the places where fo_wgen is changed, and cache the state in a new pipe state flag PIPE_SAMEWGEN. Reported and tested by: bf Submitted by: gianni MFC after: 1 week (a backport)	2012-03-07 07:31:50 +00:00
Konstantin Belousov	31452ff75e	Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs. Reported and tested by: pho MFC after: 2 weeks	2012-03-05 11:38:02 +00:00
Konstantin Belousov	ea4072446b	Remove unneeded cast to u_int. The values are small enough to fit into int, besides the use of the MIN macro which performs type promotions. Submitted by: bde MFC after: 3 weeks	2012-03-04 14:51:42 +00:00
Kevin Lo	c225ad032d	Remove unnecessary casts	2012-03-04 09:48:58 +00:00
Kevin Lo	dd104b3305	Clean up style(9) nits	2012-03-04 09:38:20 +00:00
Rick Macklem	b76ec2db93	The name caching changes of r230394 exposed an intermittent bug in the new NFS server for NFSv4, where it would report ENOENT when the file actually existed on the server. This turned out to be caused by not initializing ni_topdir before calling lookup() and there was a rare case where the value on the stack location assigned to ni_topdir happened to be a pointer to a ".." entry, such that "dp == ndp->ni_topdir" succeeded in lookup(). This patch initializes ni_topdir to fix the problem. MFC after: 5 days	2012-03-03 16:13:20 +00:00
Rick Macklem	5e99212d36	Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks	2012-03-03 01:06:54 +00:00
John Baldwin	58d65e8031	Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname. MFC after: 2 weeks	2012-03-02 18:55:19 +00:00
Konstantin Belousov	66f02f4b25	Do not expose unlocked unconstructed nullfs vnode on mount list. Lock the native nullfs vnode lock before switching the locks. Tested by: pho MFC after: 1 week	2012-03-02 09:48:46 +00:00
Rick Macklem	4cf7d12840	Fix the NFS clients so that they use copyin() instead of bcopy(), when doing direct I/O. This direct I/O code is not enabled by default. Submitted by: kib (earlier version) Reviewed by: kib MFC after: 1 week	2012-03-01 03:53:07 +00:00
Martin Matuska	b362f0a78b	Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it. Fixes mountd warnings. Reported by: kib MFC after: 1 week	2012-02-29 16:16:36 +00:00
Konstantin Belousov	37a1046e61	Allow shared locks for reads when the lower filesystem accepts shared locking. Tested by: pho MFC after: 1 week	2012-02-29 15:18:53 +00:00
Konstantin Belousov	cec1d07726	Document that null_nodeget() cannot take shared-locked lowervp due to insmntque() requirements. Tested by: pho MFC after: 1 week	2012-02-29 15:18:04 +00:00
Konstantin Belousov	409b12c08a	In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, panicked later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week	2012-02-29 15:15:36 +00:00
Konstantin Belousov	e4e1d9f382	Always request exclusive lock for the lower vnode in nullfs_vget(). The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:09:20 +00:00
Konstantin Belousov	67e3d54f80	Move the code to destroy half-constructed nullfs vnode into helper function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock. In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:06:00 +00:00
Konstantin Belousov	da732fc69f	Merge a split multi-line comment. MFC after: 1 week	2012-02-29 14:43:27 +00:00
Martin Matuska	41c0675e6e	Add procfs to jail-mountable filesystems. Reviewed by: jamie MFC after: 1 week	2012-02-29 00:30:18 +00:00
Kevin Lo	19b029a487	Remove an unused structure and unnecessary cast	2012-02-24 07:30:44 +00:00
Kevin Lo	a61d3d5a99	Check if the user has necessary permissions on the device	2012-02-24 07:29:06 +00:00
Martin Matuska	bf3db8aa65	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks	2012-02-23 18:51:24 +00:00
Kip Macy	11ac7ec076	merge pipe and fifo implementations Also reviewed by: jhb, jilles (initial revision) Tested by: pho, jilles Submitted by: gianni Reviewed by: bde	2012-02-23 18:37:30 +00:00
Rick Macklem	7cfce7cec7	hrs@ reported a panic to freebsd-stable@ under the subject line "panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused by use of a mix of tsleep() and msleep() calls on the same event in the new NFS server DRC code. It did "mtx_unlock(); tsleep();" in two places, which kib@ noted introduced a slight risk that the wakeup() would occur before the tsleep(), resulting in a 10sec delay before waking up. This patch fixes the problem by replacing "mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also changes a nfsmsleep() call to mtx_sleep() so that the code uses mtx_sleep() consistently within the file. Tested by: hrs (in progress) Reviewed by: jhb MFC after: 5 days	2012-02-23 16:47:05 +00:00
Konstantin Belousov	2aacee7779	Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests. Noted by: bde MFC after: 1 week	2012-02-22 13:01:17 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Kevin Lo	3d74f18ba4	Remove an unnecessary cast.	2012-02-20 09:56:14 +00:00
Bjoern A. Zeeb	9dba179d5e	IFC @231845 Sponsored by: Cisco Systems, Inc.	2012-02-17 00:27:48 +00:00
Rick Macklem	13b2772f8e	Delete a couple of out of date comments that are no longer true in the new NFS client. Requested by: bde MFC after: 1 week	2012-02-16 02:19:53 +00:00
Tijl Coosemans	0662ee9826	Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.	2012-02-14 11:24:24 +00:00
Ed Schouten	8fac9b7b7d	Merge si_name and __si_namebuf. The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.	2012-02-10 12:40:50 +00:00
Martin Matuska	61f0e25abf	Allow mounting nullfs(5) inside jails. This is now possible thanks to r230129. MFC after: 1 month	2012-02-09 10:39:01 +00:00
Martin Matuska	0cc207a6f5	Add support for mounting devfs inside jails. A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules. Utilizes new functions introduced in r231265. Reviewed by: jamie MFC after: 1 month	2012-02-09 10:22:08 +00:00
Martin Matuska	17d84d611f	Introduce the "ruleset=number" option for devfs(5) mounts. Add support for updating the devfs mount (currently only changing the ruleset number is supported). Check mnt_optnew with vfs_filteropt(9). This new option sets the specified ruleset number as the active ruleset of the new devfs mount and applies all its rules at mount time. If the specified ruleset doesn't exist, a new empty ruleset is created. MFC after: 1 month	2012-02-09 10:09:12 +00:00
Pedro F. Giffuni	3cc6ae1f57	Update the data structures with some fields reserved for ext4 but that can be used in ext3 mode. Also adjust the internal inode to carry the birthtime, like in UFS, which is starting to get some use when big inodes are available. Right now these are just placeholders for features to come. Approved by: jhb (mentor) MFC after: 2 weeks	2012-02-07 22:31:28 +00:00
Rick Macklem	8c9c322347	r228827 fixed a problem where copying of NFSv4 open credentials into a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in. Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time. MFC after: 5 days	2012-02-07 16:32:43 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP use the mnt_noasync counter to temporarily remove the MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Bjoern A. Zeeb	81d5d46b3c	Add multi-FIB IPv6 support to the core network stack supplementing the original IPv4 implementation from r178888: - Use RT_DEFAULT_FIB in the IPv4 implementation where noticed. - Use rtfib() KPI with explicit RT_DEFAULT_FIB where applicable in the NFS code. - Use the new in6_rt KPI in TCP, gif(4), and the IPv6 network stack where applicable. - Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally prevent multiple initializations of callouts in in6_inithead(). - Use wrapper functions where needed to preserve the current KPI to ease MFCs. Use BURN_BRIDGES to indicate expected future cleanup. - Fix (related) comments (both technical or style). - Convert to rtinit() where applicable and only use custom loops where currently not possible otherwise. - Multicast group, most neighbor discovery address actions and faith(4) are locked to the default FIB. Individual IPv6 addresses will only appear in the default FIB, however redirect information and prefixes of connected subnets are automatically propagated to all FIBs by default (mimicking IPv4 behavior as closely as possible). Sponsored by: Cisco Systems, Inc.	2012-02-03 13:08:44 +00:00
Rick Macklem	87b633678b	When a "mount -u" switches an NFS mount point from TCP to UDP, any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick. Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick) MFC after: 2 weeks	2012-01-31 03:58:26 +00:00
Rick Macklem	7f763fc39c	A problem with respect to data read through the buffer cache for both NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since the former is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes. Reported by: pjd Reviewed by: kib MFC after: 2 weeks	2012-01-27 02:46:12 +00:00
Rick Macklem	0149d177fb	Revert r230516, since it doesn't really fix the problem.	2012-01-26 00:07:34 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
John Baldwin	0b17c7bea5	Add a timeout on positive name cache entries in the NFS client. That is, we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely. Reviewed by: rmacklem MFC after: 1 week	2012-01-25 20:05:58 +00:00
Rick Macklem	6403723880	If a mount -u is done to either NFS client that switches it from TCP to UDP and the rsize/wsize/readdirsize is greater than NFS_MAXDGRAMDATA, it is possible for a thread doing an I/O RPC to get stuck repeatedly doing retries. This happens because the RPC will use an rsize/wsize/readdirsize that won't work for UDP and, as such, it will keep failing indefinitely. This patch returns an error for this case, to avoid the problem. A discussion on freebsd-fs@ seemed to indicate that returning an error was preferable to silently ignoring the "udp"/"mntudp" option. This problem was discovered while investigating a problem reported by pjd@ via email. MFC after: 2 weeks	2012-01-25 00:22:53 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Rick Macklem	23b3566364	Martin Cracauer reported a problem to freebsd-current@ under the subject "Data corruption over NFS in -current". During investigation of this, I came across an ugly bogusity in the new NFS client where it replaced the cr_uid with the one used for the mount. This was done so that "system operations" like the NFSv4 Renew would be performed as the user that did the mount. However, if any other thread shares the credential with the one doing this operation, it could do an RPC (or just about anything else) as the wrong cr_uid. This patch fixes the above, by using the mount credentials instead of the one provided as an argument for this case. It appears to have fixed Martin's problem. This patch is needed for NFSv4 mounts and NFSv3 mounts against some non-FreeBSD servers that do not put post operation attributes in the NFSv3 Statfs RPC reply. Tested by: Martin Cracauer (cracauer at cons.org) Reviewed by: jhb MFC after: 2 weeks	2012-01-20 00:58:51 +00:00
Eygene Ryabinkin	15c75a0d9f	Subject: NULLFS: properly destroy node hash Use hashdestroy() instead of naive free(). Approved by: kib MFC after: 2 weeks	2012-01-18 11:23:46 +00:00
Kevin Lo	e0d3195bd6	Return EOPNOTSUPP since we only support update mounts for NFS export. Spotted by: trociny	2012-01-17 01:25:53 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Kevin Lo	57eb5548c9	Add nfs export support to tmpfs(5) Reviewed by: kib	2012-01-16 10:25:22 +00:00
Alan Cox	0b05cac3d2	When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size. Reviewed by: kib MFC after: 3 weeks	2012-01-16 00:26:49 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted process stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Alan Cox	93431cb74c	Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object. Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.) vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value. There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked. Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spams your console if data corruption is imminent.) Reviewed by: kib MFC after: 3 weeks	2012-01-14 23:04:27 +00:00
Rick Macklem	5b79362b47	Tai Horgan reported via email that there were two places in the new NFSv4 server where the code follows the wrong list. Fortunately, for these fairly rare cases, the lc_stateid[] lists are normally empty. This patch fixes the code to follow the correct list. Reported by: tai.horgan at isilon.com Discussed with: zack MFC after: 2 weeks	2012-01-14 04:04:58 +00:00
Rick Macklem	a16cd9c05e	jwd@ reported via email that the "CacheSize" field reported by "nfsstat -e -s" would go negative after using the "-z" option to zero out the stats. This patch fixes that by not zeroing out the srvcache_size field for "-z", since it is the size of the cache and not a counter. MFC after: 2 weeks	2012-01-11 02:46:42 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Rick Macklem	f725864490	opt_inet6.h was missing from some files in the new NFS subsystem. The effect of this was, for clients mounted via inet6 addresses, that the DRC cache would never have a hit in the server. It also broke NFSv4 callbacks when an inet6 address was the only one available in the client. This patch fixes the above, plus deletes opt_inet6.h from a couple of files it is not needed for. MFC after: 2 weeks	2012-01-08 01:54:46 +00:00
Jaakko Heinonen	d467c9472a	r222004 changed sbuf_finish() to not clear the buffer error status. As a consequence sbuf_len() will return -1 for buffers which had the error status set prior to sbuf_finish() call. This causes a problem in pfs_read() which purposely uses a fixed size sbuf to discard bytes which are not needed to fulfill the read request. Work around the problem by using the full buffer length when sbuf_finish() indicates an overflow. An overflowed sbuf with fixed size is always full. PR: kern/163076 Approved by: des MFC after: 2 weeks	2012-01-06 10:12:59 +00:00
Jaakko Heinonen	9cb24e3c98	Check the return value of sbuf_finish() in pfs_readlink() and return ENAMETOOLONG if the buffer overflowed. Approved by: des MFC after: 2 weeks	2012-01-06 09:17:34 +00:00
Dimitry Andric	f39adedd5b	In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode pointer 'lowervp' instead of 'vp', which is uninitialized at that point. Reviewed by: kib MFC after: 1 week	2012-01-05 17:06:04 +00:00
Konstantin Belousov	dd0f9532f3	Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week	2012-01-03 21:09:07 +00:00
Konstantin Belousov	48a1e3f624	Document the state of the lowervp vnode for null_nodeget(). Tested by: pho MFC after: 1 week	2012-01-03 21:03:20 +00:00
Pedro F. Giffuni	5eda6329b2	Minor cleanups to ntfs code bzero -> memset rename variables to avoid shadowing. PR: 142401 Obtained from: NetBSD Approved by jhb (mentor)	2012-01-03 19:09:01 +00:00
Alan Cox	04f883d798	Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation. MFC after: 3 weeks	2012-01-03 03:29:01 +00:00
Ed Schouten	dc15eac046	Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.	2012-01-02 12:12:10 +00:00
Ed Schouten	8f8d30274a	Migrate ufs and ext2fs from skpc() to memcchr(). While there, remove a useless check from the code. memcchr() always returns characters unequal to 0xff in this case, so inosused[i] ^ 0xff can never be equal to zero. Also, the fact that memcchr() returns a pointer instead of the number of bytes until the end, makes conversion to an offset far more easy.	2012-01-01 20:47:33 +00:00
Kevin Lo	824be4a073	Discard local array based on return values. Pointed out by: uqs Found with: Coverity Prevent(tm) CID: 10089	2011-12-24 15:49:52 +00:00
Rick Macklem	f855a3c570	During investigation of an NFSv4 client crash reported by glebius@, jhb@ spotted that nfscl_getstateid() might modify credentials when called from nfsrpc_read() for the case where p != NULL, whereas nfsrpc_read() only did a crdup() to get new credentials for p == NULL. This bug was introduced by r195510, since pre-r195510 nfscl_getstateid() only modified credentials for the p == NULL case. This patch modifies nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case. It is conceivable that this bug caused the crash reported by glebius@, but that will not be determined for some time, since the crash occurred after about 1 month of operation. Tested by: glebius Reviewed by: jhb MFC after: 2 weeks	2011-12-23 02:04:35 +00:00
Kevin Lo	e2ee19e346	Discarding local array based on return values	2011-12-22 06:31:29 +00:00
Rick Macklem	713f46ac47	jwd@ reported a problem via email where the old NFS client would get a reply of EEXIST from an NFS server when a Mkdir RPC was retried, for an NFS over UDP mount. Upon investigation, it was found that the client was retransmitting the Mkdir RPC request over UDP, but with a different xid. As such, the retransmitted message would miss the Duplicate Request Cache in the server, causing it to reply EEXIST. The kernel client side UDP rpc code has two timers. The first one causes a retransmit using the same xid and socket and was set to a fixed value of 3 seconds. (The default can be overridden via CLSET_RETRY_TIMEOUT.) The second one creates a new socket and xid and should be larger than the first. However, both NFS clients were setting the second timer to nm_timeo ("timeout=<value>" mount argument), which defaulted to 1 second, so the first timer would never time out. This patch fixes both NFS clients so that they set the first timer using nm_timeo and makes the second timer larger than the first one. Reported by: jwd Tested by: jwd Reviewed by: jhb MFC after: 2 weeks	2011-12-21 02:45:51 +00:00
Pedro F. Giffuni	5ed5554f0a	Style cleanups by jh@. Fix a comment from the previous commit. Use M_ZERO instead of bzero() in ext2_vfsops.c Add include guards from PR. PR: 162564 Approved by: jhb (mentor) MFC after: 2 weeks	2011-12-16 15:47:43 +00:00
Rick Macklem	22ea9f58f0	Patch the new NFS server in a manner analogous to r228520 for the old NFS server, so that it correctly handles a count == 0 argument for Commit. PR: kern/118126 MFC after: 2 weeks	2011-12-16 00:58:41 +00:00
Pedro F. Giffuni	5b63c1252b	Bring in reallocblk to ext2fs. The feature has been standard for a while in UFS as a means to reduce fragmentation, therefore maintaining consistent performance with filesystem aging. This is also very similar to what ext4 calls "delayed allocation". In his 2010 GSoC, Zheng Liu ported and benchmarked the missing FANCY_REALLOC code to find more consistent performance improvements than with the preallocation approach. PR: 159233 Author: Zheng Liu <gnehzuil AT SPAMFREE gmail DOT com> Sponsored by: Google Inc. Approved by: jhb (mentor) MFC after: 2 weeks	2011-12-15 20:31:18 +00:00
Pedro F. Giffuni	c14d4ad1c6	Merge ext2_readwrite.c into ext2_vnops.c as done in UFS in r101729. This removes the obfuscations mentioned in ext2_readwrite and places the clustering function in a location similar to other UFS-based implementations. No performance or functional changes are expected from this move. PR: kern/159232 Suggested by: bde Approved by: jhb (mentor) MFC after: 2 weeks	2011-12-14 22:04:14 +00:00
John Baldwin	e517e6f12c	Explicitly use curthread while manipulating td_fpop during last close of a devfs file descriptor in devfs_close_f(). The passed in td argument may be NULL if the close was invoked by garbage collection of open file descriptors in pending control messages in the socket buffer of a UNIX domain socket after it was closed. PR: kern/151758 Submitted by: Andrey Shidakov andrey shidakov ru Submitted by: Ruben van Staveren ruben verweg com Reviewed by: kib MFC after: 2 weeks	2011-12-09 17:49:34 +00:00
Konstantin Belousov	d8e8af3166	Initialize fifoinfo fi_wgen field on open. The only thing that matters is the difference between fi_wgen and f_seqcount, so the change is purely cosmetic, but it makes the code easier to understand. Submitted by: gianni MFC after: 2 weeks	2011-12-04 19:25:49 +00:00
Rick Macklem	34f2e649d0	This patch adds a sysctl to the NFSv4 server which optionally disables the check for a UTF-8 compliant file name. Enabling this sysctl results in an NFSv4 server that is non-RFC3530 compliant, therefore it is not enabled by default. However, enabling this sysctl results in NFSv3 compatible behaviour and fixes the problem reported by "dan at sunsaturn.com" to freebsd-current@ on Nov. 14, 2011 under the subject "NFSV4 readlink_stat". Tested by: dan at sunsaturn.com Reviewed by: zack MFC after: 2 weeks	2011-12-04 16:33:04 +00:00
Rick Macklem	7a2e4d803c	Post r223774, the NFSv4 client no longer has multiple instances of the same lock_owner4 string. As such, the handling of cleanup of lock_owners could be simplified. This simplification permitted the client to do a ReleaseLockOwner operation when the process that the lock_owner4 string represents, has exited. This permits the server to release any storage related to the lock_owner4 string before the associated open is closed. Without this change, it is possible to exhaust a server's storage when a long running process opens a file and then many child processes do locking on the file, because the open doesn't get closed. A similar patch was applied to the Linux NFSv4 client recently so that it wouldn't exhaust a server's storage. Reviewed by: zack MFC after: 2 weeks	2011-12-03 02:27:26 +00:00
John Baldwin	574862c8ba	Enhance the sequential access heuristic used to perform readahead in the NFS server and reuse it for writes as well to allow writes to the backing store to be clustered. - Use a prime number for the size of the heuristic table (1017 is not prime). - Move the logic to locate a heuristic entry from the table and compute the sequential count out of VOP_READ() and into a separate routine. - Use the logic from sequential_heuristic() in vfs_vnops.c to update the seqcount when a sequential access is performed rather than just increasing seqcount by 1. This lets the clustering count ramp up faster. - Allow for some reordering of RPCs and if it is detected leave the current seqcount as-is rather than dropping back to a seqcount of 1. Also, when out of order access is encountered, cut seqcount in half rather than dropping it all the way back to 1 to further aid with reordering. - Fix the new NFS server to properly update the next offset after a successful VOP_READ() so that the readahead actually works. Some of these changes came from an earlier patch by Bjorn Gronwall that was forwarded to me by bde@. Discussed with: bde, rmacklem, fs@ Submitted by: Bjorn Gronwall (1, 4) MFC after: 2 weeks	2011-12-01 18:46:28 +00:00
Konstantin Belousov	dc874f9881	Rename vm_page_set_valid() to vm_page_set_valid_range(). The vm_page_set_valid() is the most reasonable name for the m->valid accessor. Reviewed by: attilio, alc	2011-11-30 17:39:00 +00:00
Kevin Lo	bdcdb55387	Add unicode support to ntfs Obtained from: imura	2011-11-27 15:43:49 +00:00
Mikolaj Golub	beb7471b16	In procfs_doproccmdline() if arguments are not cached read them from the process stack. Suggested by: kib Reviewed by: kib Tested by: pho MFC after: 2 weeks	2011-11-22 20:43:03 +00:00
Ivan Voras	6e92aee4e2	Avoid panics from recursive rename operations. Not a perfect patch but good enough for now. PR: kern/159418 Submitted by: Gleb Kurtsou Reviewed by: kib MFC after: 1 month	2011-11-22 16:18:12 +00:00
Konstantin Belousov	54cf919857	Put all the messages from msdosfs under the MSDOSFS_DEBUG ifdef. They are confusing to user, and not informative for general consumption. MFC after: 1 week	2011-11-22 13:30:36 +00:00
Rick Macklem	6854d64811	This patch enables the new/default NFS server's use of shared vnode locking for read, readdir, readlink, getattr and access. It is hoped that this will improve server performance for these operations, since they will no longer be serialized for a given file/vnode.	2011-11-22 00:35:30 +00:00
Xin LI	296a25a245	Improve the way to calculate available pages in tmpfs: - Don't deduct wired pages from total usable counts because it does not make any sense. To make things worse, on systems where swap size is smaller than physical memory and use a lot of wired pages (e.g. ZFS), tmpfs can suddenly have free space of 0 because of this; - Count cached pages as available; [1] - Don't count inactive pages as available, technically we could but that might be too aggressive; [1] [1] Suggested by kib@ MFC after: 1 week	2011-11-21 20:26:22 +00:00
Rick Macklem	f9340edfc0	Clean up some cruft in the NFSv4 client left over from the OpenBSD port, so that it is more readable. No logic change is made by this commit. MFC after: 2 weeks	2011-11-21 16:06:23 +00:00
Rick Macklem	034235528f	Add two arguments to the nfsrpc_rellockown() function in the NFSv4 client. This does not change the client's behaviour, but prepares the code so that nfsrpc_rellockown() can be called elsewhere in a future commit. MFC after: 2 weeks	2011-11-20 16:46:50 +00:00
Rick Macklem	d57a9d5f52	Since the nfscl_cleanup() function isn't used by the FreeBSD NFSv4 client, delete the code and fix up the related comments. This should not have any functional effect on the client. MFC after: 2 weeks	2011-11-20 01:18:47 +00:00
Rick Macklem	2f27585ef9	Post r223774 the NFSv4 client never uses the linked list with the head nfsc_defunctlockowner. This patch simply removes the code that loops through this always empty list, since the code no longer does anything useful. It should not have any effect on the client's behaviour. MFC after: 2 weeks	2011-11-20 00:39:15 +00:00
Konstantin Belousov	f82360acf2	Existing VOP_VPTOCNP() interface has a fatal flaw that is critical for nullfs. The problem is that resulting vnode is only required to be held on return from the successful call to vop, instead of being referenced. Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination with the VOP_VPTOCNP() interface means that the directory vnode returned from VOP_VPTOCNP() is reclaimed in advance, causing vn_fullpath() to error with EBADF or like. Change the interface for VOP_VPTOCNP(), now the dvp must be referenced. Convert all in-tree implementations of VOP_VPTOCNP(), which is trivial, because vhold(9) and vref(9) are similar in the locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(), if any, should have no trouble with the fix. Tested by: pho Reviewed by: mckusick MFC after: 3 weeks (subject of re approval)	2011-11-19 07:50:49 +00:00
Konstantin Belousov	f82ee01c1c	Do not use NULLVPTOLOWERVP() in the null_print(). If diagnostic is compiled in, and show vnode is used from ddb on the faulty nullfs vnode, we get panic instead of vnode dump. MFC after: 1 week	2011-11-19 07:41:37 +00:00
Konstantin Belousov	4d2310dd81	Use the plain panic calls, without additional printing around them. The debugger and dumping support is adequate. Tested by: pho MFC after: 1 week	2011-11-19 07:40:13 +00:00
Kevin Lo	41f1dccceb	Add unicode support to msdosfs and smbfs; original patches from imura, bug fixes by Kuan-Chung Chiu <buganini at gmail dot com>. Tested by me in production for several days at work.	2011-11-18 03:05:20 +00:00
Konstantin Belousov	1fb5311e00	Fix build, use %d for int value formatting.	2011-11-16 18:41:59 +00:00
Peter Holm	50546f8ffe	Handle invalid large values for getdirentries(2) data buffer size. In collaboration with: kib Reviewed by: des Reported by: The iknowthis syscall fuzzer. MFC after: 1 week	2011-11-16 10:11:55 +00:00
Rick Macklem	a5e583eea0	Modify the new NFS client so that nfs_fsync() only calls ncl_flush() for regular files. Since other file types don't write into the buffer cache, calling ncl_flush() is almost a no-op. However, it does clear the NMODIFIED flag and this shouldn't be done by nfs_fsync() for directories. MFC after: 2 weeks	2011-11-15 23:35:43 +00:00
Peter Holm	3c93d4433f	Removed extra PRELE() call. MFC after: 1 week	2011-11-15 09:23:21 +00:00
Rick Macklem	e42a8d7e24	Move the setting of the default value for nm_wcommitsize to before the nfs_decode_args() call in the new NFS client, so that a specified command line value won't be overwritten. Also, modify the calculation for small values of desiredvnodes to avoid an unusually large value or a divide by zero crash. It seems that the default value for nm_wcommitsize is very conservative and may need to change at some time. PR: kern/159351 Submitted by: onwahe at gmail.com (earlier version) Reviewed by: jhb MFC after: 2 weeks	2011-11-15 01:39:02 +00:00
John Baldwin	840fb1c02b	Finish making 'wcommitsize' an NFS client mount option. Reviewed by: rmacklem MFC after: 1 week	2011-11-14 18:52:07 +00:00
John Baldwin	e43c042fec	Sync with the old NFS client: Remove an obsolete comment.	2011-11-14 18:23:50 +00:00
Rick Macklem	670bf6f126	Since NFSv4 byte range locking only works for regular files, add a sanity check for the vnode type to the NFSv4 client. MFC after: 2 weeks	2011-11-14 00:10:11 +00:00
Rick Macklem	90379d6116	Move the assignment of default values for some mount options to before the nfs_decode_args() call in the new NFS client, so they don't overwrite the value specified on the command line. MFC after: 2 weeks	2011-11-13 23:09:26 +00:00
Eitan Adler	3b6dc18ef5	- fix duplicate "a a" in some comments Submitted by: eadler Approved by: simon MFC after: 3 days	2011-11-13 17:06:33 +00:00
Konstantin Belousov	c8997bf02a	Lock the thread lock around block that retrieves td_wmesg. Otherwise, procfs could see a thread with assigned td_wchan but still NULL td_wmesg. Reported and tested by: pho MFC after: 1 week	2011-11-09 17:15:51 +00:00
Marcel Moolenaar	82543c5928	Don asbestos garment and remove the warning about TMPFS being experimental -- highly experimental even. So far the closest to a bug in TMPFS that people have gotten to relates to how ZFS can take away from the memory that TMPFS needs. One can argue that such is not a bug in TMPFS. Irrespective, even if there is a bug here and there in TMPFS, it's not to our own advantage to scare people away from using TMPFS. I for one have been using it, even with ZFS, very successfully.	2011-11-07 16:21:50 +00:00
Ed Schouten	6472ac3d8a	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
Ed Schouten	d745c852be	Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs. This means that their use is restricted to a single C file.	2011-11-07 06:44:47 +00:00
Ed Schouten	8f80f103b4	Remove MALLOC_DECLAREs of nonexisting malloc-pools. After careful grepping, it seems none of these pools can be found in our source tree. They are not in use, nor are they defined.	2011-11-06 20:16:50 +00:00
Konstantin Belousov	25cc6027cf	Fix typo. MFC after: 3 days	2011-11-05 09:04:13 +00:00
John Baldwin	dccc45e4c0	Move the cleanup of f_cdevpriv when the reference count of a devfs file descriptor drops to zero out of _fdrop() and into devfs_close_f() as it is only relevant for devfs file descriptors. Reviewed by: kib MFC after: 1 week	2011-11-04 03:39:31 +00:00
Konstantin Belousov	1fef78c3f0	Fix kernel panic when d_fdopen csw method is called for NULL fp. This may happen when kernel consumer calls VOP_OPEN(). Reported by: Tavis Ormandy <taviso cmpxchg8b com> through delphij MFC after: 3 days	2011-11-03 18:55:18 +00:00
Peter Holm	948fa27d49	Added missing cache purge of from argument for rename(). Reported by: Anton Yuzhaninov <citrin citrin ru> In collaboration with: kib MFC after: 1 week	2011-11-01 12:33:06 +00:00
Konstantin Belousov	17edcd764d	The use of VOP_ISLOCKED() without a check for the return values can cause false positives. Replace the #ifdef block with the proper ASSERT_VOP_UNLOCKED() assert. Tested by: pho MFC after: 1 week	2011-10-24 13:56:31 +00:00
Konstantin Belousov	234ab7412e	The only possible error return from null_nodeget() is due to insmntque1 failure (the getnewvnode cannot return an error). In this case, the null_insmntque_dtr() already unlocked the reclaimed vnode, so VOP_UNLOCK() in the nullfs_mount() after null_nodeget() failure is wrong. Tested by: pho MFC after: 1 week	2011-10-24 13:53:32 +00:00
Konstantin Belousov	ffa43617e8	The covered vnode must be relocked if it was unlocked. Remove VOP_ISLOCKED test because of this and also because it can lead to false positives. Tested by: pho MFC after: 1 week	2011-10-24 13:48:13 +00:00
Peter Holm	9ce7379778	Only unlock if the lock is exclusive. Reported by: Subbsd <subbsd gmail com> Discussed with: kib	2011-10-24 10:35:37 +00:00
Dag-Erling Smørgrav	0fc93d0b00	Trace attempts to open a portal device. Ceterum censeo portalfs esse delendam.	2011-10-18 07:31:49 +00:00
Edward Tomasz Napierala	5c0c5a182f	Make unionfs also clear VAPPEND when clearing VWRITE, since VAPPEND is just a modifier for VWRITE. Submitted by: rmacklem	2011-10-10 21:32:08 +00:00
Konstantin Belousov	084e62e91b	Export devfs inode number allocator for the kernel consumers. Reviewed by: jhb MFC after: 2 weeks	2011-10-05 16:50:15 +00:00
Kip Macy	8451d0dd78	In order to maximize the re-usability of kernel code in user space this patch modifies makesyscalls.sh to prefix all of the non-compatibility calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel entry points and all places in the code that use them. It also fixes an additional name space collision between the kernel function psignal and the libc function of the same name by renaming the kernel psignal kern_psignal(). By introducing this change now we will ease future MFCs that change syscalls. Reviewed by: rwatson Approved by: re (bz)	2011-09-16 13:58:51 +00:00
Konstantin Belousov	3407fefef6	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify aflags. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)	2011-09-06 10:30:11 +00:00
Rick Macklem	322c8d9e25	Fix the NFS servers so that they can do a Lookup of "..", which requires that ni_strictrelative be set to 0, post-r224810. Tested by: swills (earlier version), geo dot liaskos at gmail.com Approved by: re (kib)	2011-09-03 00:28:53 +00:00
Rick Macklem	de67b4966c	Fix the NFSv4 server so that it returns NFSERR_SYMLINK when an attempt to do an Open operation on any type of file other than VREG is done. A recent discussion on the IETF working group's mailing list (nfsv4@ietf.org) decided that NFSERR_SYMLINK should be returned for all non-regular files and not just symlinks, so that the Linux client would work correctly. This change does not affect the FreeBSD NFSv4 client and is not believed to have a negative effect on other NFSv4 clients. Reviewed by: zkirsch Approved by: re (kib) MFC after: 2 weeks	2011-08-20 21:26:35 +00:00
Konstantin Belousov	4c023a3365	Do not return success and a string "unknown" when vn_fullpath() was unable to resolve the path of the text vnode of the process. The behaviour is very confusing for any consumer of the procfs, in particular, java. Reported and tested by: bf MFC after: 2 weeks Approved by: re (bz)	2011-08-16 20:13:17 +00:00
Konstantin Belousov	9c00bb9190	Add the fo_chown and fo_chmod methods to struct fileops and use them to implement fchown(2) and fchmod(2) support for several file types that previously lacked it. Add MAC entries for chown/chmod done on posix shared memory and (old) in-kernel posix semaphores. Based on the submission by: glebius Reviewed by: rwatson Approved by: re (bz)	2011-08-16 20:07:47 +00:00
Jonathan Anderson	985a88e2a6	Fix a merge conflict. r224086 added "goto out"-style error handling to nfssvc_nfsd(), in order to reliably call NFSEXITCODE() before returning. Our Capsicum changes, based on the old "return (error)" model, did not merge nicely. Approved by: re (kib), mentor (rwatson) Sponsored by: Google Inc	2011-08-16 14:23:16 +00:00
Robert Watson	a9d2f8d84f	Second-to-last commit implementing Capsicum capabilities in the FreeBSD kernel for FreeBSD 9.0: Add a new capability mask argument to fget(9) and friends, allowing system call code to declare what capabilities are required when an integer file descriptor is converted into an in-kernel struct file *. With options CAPABILITIES compiled into the kernel, this enforces capability protection; without, this change is effectively a no-op. Some cases require special handling, such as mmap(2), which must preserve information about the maximum rights at the time of mapping in the memory map so that they can later be enforced in mprotect(2) -- this is done by narrowing the rights in the existing max_protection field used for similar purposes with file permissions. In namei(9), we assert that the code is not reached from within capability mode, as we're not yet ready to enforce namespace capabilities there. This will follow in a later commit. Update two capability names: CAP_EVENT and CAP_KEVENT become CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they represent. Approved by: re (bz) Submitted by: jonathan Sponsored by: Google Inc	2011-08-11 12:30:23 +00:00
Konstantin Belousov	e047ade947	Do not update mountpoint generation counter to the value which was not yet acted upon by devfs_populate(). Submitted by: Kohji Okuno <okuno.kohji jp panasonic com> Approved by: re (bz) MFC after: 1 week	2011-08-09 20:53:33 +00:00
Zack Kirsch	06521fbb49	Fix an NFS server issue where it was not correctly setting the eof flag when a READ had hit the end of the file. Also, clean up some cruft in the code. Approved by: re (kib) Reviewed by: rmacklem MFC after: 2 weeks	2011-08-03 18:50:19 +00:00
Rick Macklem	e2eb210c09	Fix a LOR in the NFS client which could cause a deadlock. This was reported to the mailing list freebsd-net@freebsd.org on July 21, 2011 under the subject "LOR with nfsclient sillyrename". The LOR occurred when nfs_inactive() called vrele(sp->s_dvp) while holding the vnode lock on the file in s_dvp. This patch modifies the client so that it performs the vrele(sp->s_dvp) as a separate task to avoid the LOR. This fix was discussed with jhb@ and kib@, who both proposed variations of it. Tested by: pho, jlott at averesystems.com Submitted by: jhb (earlier version) Reviewed by: kib Approved by: re (kib) MFC after: 2 weeks	2011-08-02 11:28:42 +00:00
Rick Macklem	6b3dfc6ab0	Fix rename in the new NFS server so that it does not require a recursive vnode lock on the directory for the case where the new file name is in the same directory as the old one. The patch handles this as a special case, recognized by the new directory having the same file handle as the old one and just VREF()s the old dir vnode for this case, instead of doing a second VFS_FHTOVP() to get it. This is required so that the server will work for file systems like msdosfs, that do not support recursive vnode locking. This problem was discovered during recent testing by pho@ when exporting an msdosfs file system via the new NFS server. Tested by: pho Reviewed by: zkirsch Approved by: re (kib) MFC after: 2 weeks	2011-07-31 20:06:11 +00:00
Rick Macklem	d1907de2ba	The new NFS client failed to vput() the new vnode if a setattr failed after the file was created in nfs_create(). This would probably only happen during a forced dismount. The old NFS client does have a vput() for this case. Detected by pho during recent testing, where an open syscall returned with a vnode still locked. Tested by: pho Approved by: re (kib) MFC after: 2 weeks	2011-07-30 22:57:38 +00:00
Kirk McKusick	6beb3bb4eb	This update changes the mnt_flag field in the mount structure from 32 bits to 64 bits and eliminates the unused mnt_xflag field. The existing mnt_flag field is completely out of bits, so this update gives us room to expand. Note that the f_flags field in the statfs structure is already 64 bits, so the expanded mnt_flag field can be exported without having to make any changes in the statfs structure. Approved by: re (bz)	2011-07-24 17:43:09 +00:00
Zack Kirsch	061c683cc2	Revert revision 224079 as Rick pointed out that I would be calling VOP_PATHCONF without the vnode lock held. Implicitly approved by: zml (mentor)	2011-07-17 03:44:05 +00:00
Rick Macklem	6a536ceea5	The new NFSv4 client handled NFSERR_GRACE as a fatal error for the remove and rename operations. Some NFSv4 servers will report NFSERR_GRACE for these operations. This patch changes the behaviour of the client so that it handles NFSERR_GRACE like NFSERR_DELAY for non-state related operations like remove and rename. It also exempts the delegreturn operation from handling within newnfs_request() for NFSERR_DELAY/NFSERR_GRACE so that it can handle NFSERR_GRACE in the same manner as before. This problem was resolved thanks to discussion with bfields at fieldses.org. The problem was identified at the recent NFSv4 interoperability bakeathon. MFC after: 2 weeks	2011-07-16 20:53:27 +00:00
Zack Kirsch	a9285ae5c4	Add DEXITCODE plumbing to NFS. Isilon has the concept of an in-memory exit-code ring that saves the last exit code of a function and allows for stack tracing. This is very helpful when debugging tough issues. This patch is essentially a no-op for BSD at this point, until we upstream the dexitcode logic itself. The patch adds DEXITCODE calls to every NFS function that returns an errno error code. A number of code paths were also reorganized to have single exit paths, to reduce code duplication. Submitted by: David Kwan <dkwan@isilon.com> Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:51:09 +00:00
Zack Kirsch	68347a92db	Simple find/replace of VOP_ISLOCKED -> NFSVOPISLOCKED. This is done so that NFSVOPISLOCKED can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:41 +00:00
Zack Kirsch	a998963469	Simple find/replace of VOP_UNLOCK -> NFSVOPUNLOCK. This is done so that NFSVOPUNLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:36 +00:00
Zack Kirsch	98f234f338	Simple find/replace of vn_lock -> NFSVOPLOCK. This is done so that NFSVOPLOCK can be modified later to add enhanced logging and assertions. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:31 +00:00
Zack Kirsch	c383087c0c	Remove unnecessary thread pointer from VOPLOCK macros and current users. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:26 +00:00
Zack Kirsch	51c099f522	Change loadattr and fillattr to ask the file system for the pathconf variable. Small modification where VOP_PATHCONF was being called directly. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:21 +00:00
Zack Kirsch	40435b74f4	Move nfsvno_pathconf to be accessible to sys/fs/nfs; no functionality change. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:05:17 +00:00
Zack Kirsch	b008a72c86	Small acl patch to return the aclerror that comes back from nfsrv_dissectacl(). This fixes a problem where ATTRNOTSUPP was being returned instead of BADOWNER. Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-07-16 08:04:57 +00:00
Konstantin Belousov	724ce55b5b	While fixing the looping of a thread while devfs vnode is reclaimed, r179247 introduced a possibility of devfs_allocv() returning spurious ENOENT. If the vnode is selected by vnlru daemon for reclamation, then devfs_allocv() can get ENOENT from vget() due to devfs_close() dropping vnode lock around the call to cdevsw d_close method. Use LK_RETRY in the vget() call, and do some part of the devfs_reclaim() work in devfs_allocv(), clearing vp->v_data and de->de_vnode. Retry the allocation of the vnode, now with de->de_vnode == NULL. The check vp->v_data == NULL at the start of devfs_close() cannot be affected by the change, since vnode lock must be held while VI_DOOMED is set, and only dropped after the check. Reported and tested by: Kohji Okuno <okuno.kohji jp panasonic com> Reviewed by: attilio MFC after: 3 weeks	2011-07-13 21:07:41 +00:00
Rick Macklem	305a0c9111	r222389 introduced a case where the NFSv4 client could loop in nfscl_getcl() when a forced dismount is in progress, because nfsv4_lock() will return 0 without sleeping when MNTK_UNMOUNTF is set. This patch fixes it so it won't loop calling nfsv4_lock() for this case. MFC after: 2 weeks	2011-07-13 00:48:36 +00:00
Jonathan Anderson	cd6dac7dff	Make a comment more accurate. This comment refers to CAP_NT_SMBS, which does not exist; it should refer to SMB_CAP_NT_SMBS. Fixing this comment makes it easier for people interested in Capsicum to grep around for capability rights, whose identifiers are of the form 'CAP_[A-Z_]'. Approved by: mentor (rwatson), re (Capsicum blanket) Sponsored by: Google Inc	2011-07-07 17:00:42 +00:00
Rick Macklem	98a7b279a2	The algorithm used by nfscl_getopen() could have resulted in multiple instances of the same lock_owner when a process both inherited an open file descriptor plus opened the same file itself. Since some NFSv4 servers cannot handle multiple instances of the same lock_owner string, this patch changes the algorithm used by nfscl_getopen() in the new NFSv4 client to keep that from happening. The new algorithm is simpler, since there is no longer any need to ascend the process's parentage tree because all NFSv4 Closes for a file are done at VOP_INACTIVE()/VOP_RECLAIM(), making the Opens indistinct w.r.t. use with Lock Ops. This problem was discovered at the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-07-04 23:32:09 +00:00
Rick Macklem	1171f21dab	Modify the new NFSv4 client so that it appends a file handle to the lock_owner4 string that goes on the wire. Also, add code to do a ReleaseLockOwner Op on the lock_owner4 string before a Close. Apparently not all NFSv4 servers handle multiple instances of the same lock_owner4 string, at least not in a compatible way. This patch avoids having multiple instances, except for one unusual case, which will be fixed by a future commit. Found at the recent NFSv4 interoperability Bakeathon. Tested by: tdh at excfb.com MFC after: 2 weeks	2011-07-03 21:44:26 +00:00
Alan Cox	6bbee8e28a	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib	2011-06-29 16:40:41 +00:00
Rick Macklem	4875024b26	Fix the new NFSv4 client so that it doesn't fill the cached mode attribute in as 0 when doing writes. The change adds the Mode attribute plus the others except Owner and Owner_group to the list requested by the NFSv4 Write Operation. This fixed a problem where an executable file built by "cc" would get mode 0111 instead of 0755 for some NFSv4 servers. Found at the recent NFSv4 interoperability Bakeathon. Tested by: tdh at excfb.com MFC after: 2 weeks	2011-06-28 22:52:38 +00:00
Rick Macklem	7bb55def77	Plug an mbuf leak in the new NFS client that occurred when a server replied NFS3ERR_JUKEBOX/NFS4ERR_DELAY to an rpc. This affected both NFSv3 and NFSv4. Found during testing at the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-06-22 21:10:12 +00:00
Rick Macklem	72b7c8ddb1	Fix the new NFSv4 client so that it uses the same uid as was used for doing a mount when performing system operations on AUTH_SYS mounts. This resolved an issue when mounting a Linux server. Found during testing at the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-06-22 19:47:45 +00:00
Rick Macklem	53f476cab3	Fix the new NFSv4 server so that it checks for VREAD_ACL when a client does a Getattr for an ACL and not VREAD_ATTRIBUTES. This was found during the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-06-21 19:58:29 +00:00
Rick Macklem	37b88c2d51	Fix the new NFSv4 server so that it only allows Lookup of directories and symbolic links when traversing non-exported file systems. Found during the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-06-20 22:02:01 +00:00
Rick Macklem	5a55e04ffa	Fix the new NFSv4 server so that it allows Access and Readlink operations while traversing non-exported file systems. This is required for some non-FreeBSD clients to do NFSv4 mounts. Found during the recent NFSv4 interoperability Bakeathon. MFC after: 2 weeks	2011-06-20 21:57:26 +00:00
Rick Macklem	4e22c98a39	Fix a number of places where the new NFS server did not lock the mutex when manipulating rc_flag in the DRC cache. This is believed to fix a hung server that was reported to the freebsd-fs@ list on June 9 under the subject heading "New NFS server stress test hang", where all the threads were waiting for the RC_LOCKED flag to clear. Tested by: jwd at slowblink.com MFC after: 2 weeks	2011-06-19 23:54:01 +00:00
Rick Macklem	7e7fd7d177	Fix the kgssapi so that it can be loaded as a module. Currently the NFS subsystems use five of the rpcsec_gss/kgssapi entry points, but since it was not obvious which others might be useful, all nineteen were included. Basically the nineteen entry points are set in a structure called rpc_gss_entries and inline functions defined in sys/rpc/rpcsec_gss.h check for the entry points being non-NULL and then call them. A default value is returned otherwise. Requested by rwatson. Reviewed by: jhb MFC after: 2 weeks	2011-06-19 22:08:55 +00:00
Rick Macklem	8f0e65c915	Add DTrace support to the new NFS client. This is essentially cloned from the old NFS client, plus additions for NFSv4. A review of this code is in progress, however it was felt by the reviewer that it could go in now, before code slush. Any changes required by the review can be committed as bug fixes later.	2011-06-18 23:02:53 +00:00
Rick Macklem	fb35711d76	Add support for flock(2) locks to the new NFSv4 client. I think this should be ok, since the client now delays NFSv4 Close operations until VOP_INACTIVE()/VOP_RECLAIM(). As such, there should be no risk that the NFSv4 Open is closed while an associated byte range lock still exists. Tested by: avg MFC after: 2 weeks	2011-06-05 20:22:56 +00:00
Rick Macklem	f8f4e256e7	The new NFSv4 client was erroneously using "p" instead of "p_leader" for the "id" for POSIX byte range locking. I think this would only have affected processes created by rfork(2) with the RFTHREAD flag specified. This patch fixes that by passing the "id" down through the various functions from nfs_advlock(). MFC after: 2 weeks	2011-06-05 18:17:37 +00:00
Rick Macklem	2301f58fe5	Fix the new NFSv4 client so that it doesn't crash when a mount is done for a VIMAGE kernel. Tested by: glz at hidden-powers dot com Reviewed by: bz MFC after: 2 weeks	2011-06-05 17:31:44 +00:00
Rick Macklem	c5c142f652	Modify the new NFS server so that the NFSv3 Pathconf RPC doesn't return an error when the underlying file system lacks support for any of the four _PC_xxx values used, by falling back to default values. Tested by: avg MFC after: 2 weeks	2011-06-04 01:13:09 +00:00
Konstantin Belousov	031ec8c10a	In the VOP_PUTPAGES() implementations, change the default error from VM_PAGER_AGAIN to VM_PAGER_ERROR for the unwritten pages. Return VM_PAGER_AGAIN for the partially written page. Always forward at least one page in the loop of vm_object_page_clean(). VM_PAGER_ERROR causes the page reactivation and does not clear the page dirty state, so the write is not lost. The change fixes an infinite loop in vm_object_page_clean() when the filesystem returns permanent errors for some page writes. Reported and tested by: gavin Reviewed by: alc, rmacklem MFC after: 1 week	2011-06-01 21:00:28 +00:00
Rick Macklem	b398d10657	Fix the new NFS client so that it doesn't do an NFSv3 Pathconf RPC for cases where the reply doesn't include the answer. This fixes a problem reported by avg@ where the NFSv3 Pathconf RPC would fail when "ls -l" did an lpathconf(2) for _PC_ACL_NFS4. Tested by: avg MFC after: 2 weeks	2011-05-31 17:43:25 +00:00
Rick Macklem	ff29f3b241	Fix the new NFS client so that it handles NFSv4 state correctly during a forced dismount. This required that the exclusive and shared (refcnt) sleep lock functions check for MNTK_UNMOUNTF before sleeping, so that they won't block while nfscl_umount() is getting rid of the state. As such, a "struct mount *" argument was added to the locking functions. I believe the only remaining case where a forced dismount can get hung in the kernel is when a thread is already attempting to do a TCP connect to a dead server when the krpc client structure called nr_client is NULL. This will only happen just after a "mount -u" with options that force a new TCP connection is done, so it shouldn't be a problem in practice. MFC after: 2 weeks	2011-05-27 22:05:10 +00:00
Rick Macklem	8b5e8315a7	Add a check for MNTK_UNMOUNTF at the beginning of nfs_sync() in the new NFS client so that a forced dismount doesn't get stuck in the VFS_SYNC() call that happens before VFS_UNMOUNT() in dounmount(). Additional changes are needed before forced dismounts will work. MFC after: 2 weeks	2011-05-26 22:05:35 +00:00
Rick Macklem	81ddb192e8	Add some missing mutex locking to the new NFS client. MFC after: 2 weeks	2011-05-25 21:17:53 +00:00
Rick Macklem	147206ae68	Fix the new NFS client so that it correctly sets the "must_commit" argument for a write RPC when it succeeds for the first one and fails for a subsequent RPC within the same call to the function. This makes it compatible with the old NFS client for this case. MFC after: 2 weeks	2011-05-25 20:53:08 +00:00
Rick Macklem	484c842d57	Set the MNT_NFS4ACLS flag for an NFSv4 client mount if the NFSv4 server supports it. Requested by trasz. MFC after: 2 weeks	2011-05-23 22:31:42 +00:00
Alan Cox	76036f2bbd	Eliminate duplicate #include's.	2011-05-22 18:11:41 +00:00
Rick Macklem	694a586a43	Add a lock flags argument to the VFS_FHTOVP() file system method, so that callers can indicate the minimum vnode locking requirement. This will allow some file systems to choose to return a LK_SHARED locked vnode when LK_SHARED is specified for the flags argument. This patch only adds the flag. It does not change any file system to use it and all callers specify LK_EXCLUSIVE, so file system semantics are not changed. Reviewed by: kib	2011-05-22 01:07:54 +00:00
Rick Macklem	b70cddba44	Add a sanity check for the existence of an "addr" option to both NFS clients. This avoids the crash reported by Sergey Kandaurov (pluknet@gmail.com) to the freebsd-fs@ list with subject "[old nfsclient] different nmount() args passed from mount vs mount_nfs" dated May 17, 2011. Tested by: pluknet at gmail.com (old nfs client) MFC after: 2 weeks	2011-05-18 18:36:40 +00:00
Rick Macklem	1f3765902c	Change the sysctl naming for the old and new NFS clients to vfs.oldnfs.xxx and vfs.nfs.xxx respectively. This makes the default nfs client use vfs.nfs.xxx after r221124.	2011-05-15 20:52:43 +00:00
John Baldwin	5b4f35a4f0	Merge comments about converting directory entries to be more direct and concise. Inspired by: Gleb Kurtsou	2011-05-14 01:10:57 +00:00
Rick Macklem	a0c2c3691c	Change the new NFS server so that it uses vfs.nfsd naming for its sysctls instead of vfs.newnfs. This separates the names from the ones used by the client.	2011-05-08 01:01:27 +00:00
Rick Macklem	1dcad8ec9a	Set the initial value of maxfilesize to OFF_MAX in the new NFS client. It will then be reduced to whatever the server says it can support. There might be an argument that this could be one block larger, but since NFS is a byte granular system, I chose not to do that. Suggested by: Matt Dillon Tested by: Daniel Braniss (earlier version) MFC after: 2 weeks	2011-05-06 17:51:00 +00:00
Alexander Motin	08aadbe3b4	Increase NFS_TICKINTVL value from 10 to 500. The callout now does useful work only once per second, so the other 99 calls per second were useless and only kept an idle system from sleeping properly. Reviewed by: rmacklem	2011-05-06 13:11:50 +00:00
Rick Macklem	78e4b1f838	Change the new NFS server so that it returns 0 when the f_bavail or f_ffree fields of "struct statfs" are negative, since the values that go on the wire are unsigned and will appear to be very large positive values otherwise. This makes the handling of a negative f_bavail compatible with the old/regular NFS server. MFC after: 2 weeks	2011-05-06 01:29:14 +00:00
Rick Macklem	f96712c2e6	Fix the new NFS client so that it handles the 64bit fields that are now in "struct statfs" for NFSv3 and NFSv4. Since the ffiles value is uint64_t on the wire, I clip the value to INT64_MAX to avoid setting f_ffree negative. Tested by: kib MFC after: 2 weeks	2011-05-05 00:11:09 +00:00
Rick Macklem	5a816b92a3	Add a comment noting that the NFS code assumes that the values of error numbers in sys/errno.h will be the same as the ones specified by the NFS RFCs and that the code needs to be fixed if error numbers are changed in sys/errno.h. Suggested by: Peter Jeremy MFC after: 2 weeks	2011-05-04 22:02:33 +00:00
Rick Macklem	2e3b981a4d	Add kernel support for NFSSVC_ZEROCLTSTATS and NFSSVC_ZEROSRVSTATS so that they can be used by nfsstat(1) to implement the "-z" option for the new NFS subsystem. MFC after: 2 weeks	2011-05-04 13:36:18 +00:00
Rick Macklem	2b08b570cb	Revert r221306, since NFSSVC_ZEROSTATS zeroed both client and server stats, when separate modifiers for NFSSVC_GETSTATS for each of client and server stats are what is required by nfsstat(1).	2011-05-04 13:30:38 +00:00
Ruslan Ermilov	e2f2b37089	Implemented a mount option "nocto" that disables cache coherency checking at open time. It may improve performance for read-only NFS mounts. Use deliberately. MFC after: 1 week Reviewed by: rmacklem, jhb (earlier version)	2011-05-04 13:27:45 +00:00
Ruslan Ermilov	55cde634cf	In ncl_printf(), call vprintf() instead of printf(). MFC after: 3 days	2011-05-04 11:22:52 +00:00
Rick Macklem	b2946fadcd	Add the kernel support needed to zero out the nfsstats structure for the new NFS subsystem. This will be used by nfsstats.c to implement the "-z" option. MFC after: 2 weeks	2011-05-01 22:19:52 +00:00
Konstantin Belousov	4417ac326a	Clarify the comment. MFC after: 1 week	2011-04-30 13:49:03 +00:00
Rick Macklem	8b713a2f8a	The build was broken by r221190 for 64bit arches like amd64. This patch fixes it. MFC after: 2 weeks	2011-04-29 12:30:15 +00:00
Rick Macklem	61c827204b	Fix the new NFS client so that it handles the "nfs_args" value in mnt_optnew. This is needed so that the old mount(2) syscall works and that is needed so that amd(8) works. The code was basically just cribbed from sys/nfsclient/nfs_vfsops.c with minor changes. This patch is mainly to fix the new NFS client so that amd(8) works with it. Thanks go to Craig Rodrigues for helping with this. Tested by: Craig Rodrigues (for amd) MFC after: 2 weeks	2011-04-28 23:21:50 +00:00
John Baldwin	7d74606889	Update a comment since ext2fs does not use SU. Reviewed by: kib	2011-04-28 20:25:15 +00:00
John Baldwin	466a71d75e	The b_dep field of buffers is always empty for ext2fs, it is only used for SU in FFS. Reported by: kib	2011-04-28 17:36:26 +00:00
John Baldwin	9e880b876d	Sync with several changes in UFS/FFS: - 77115: Implement support for O_DIRECT. - 98425: Fix a performance issue introduced in 70131 that was causing reads before writes even when writing full blocks. - 98658: Rename the BALLOC flags from B_* to BA_* to avoid confusion with the struct buf B_ flags. - 100344: Merge the BA_ and IO_ flags so that they may both be used in the same flags word. This merger is possible by assigning the IO_ flags to the low sixteen bits and the BA_ flags the high sixteen bits. - 105422: Fix a file-rewrite performance case. - 129545: Implement IO_INVAL in VOP_WRITE() by marking the buffer as "no cache". - Re-add the DOINGASYNC() macro and use it to control asynchronous writes. Change i-node updates to honor DOINGASYNC() instead of always being synchronous. - Use a PRIV_VFS_RETAINSUGID check instead of checking cr_uid against 0 directly when deciding whether or not to clear suid and sgid bits. Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-04-28 14:27:17 +00:00
Rick Macklem	afea74655f	Fix module names and dependencies so the NFS clients will load correctly as modules after r221124.	2011-04-27 20:42:30 +00:00
John Baldwin	bbfe24fbf2	Use a private EXT2_ROOTINO constant instead of redefining ROOTINO. Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-04-27 18:25:35 +00:00
John Baldwin	4d2ede6798	Various style fixes including using uint_t instead of u_int_t. Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-04-27 18:15:34 +00:00
Rick Macklem	4309e17add	This patch changes head so that the default NFS client is now the new NFS client (which I guess is no longer experimental). The fstype "newnfs" is now "nfs" and the regular/old NFS client is now fstype "oldnfs". Although mounts via fstype "nfs" will usually work without userland changes, an updated mount_nfs(8) binary is needed for kernels built with "options NFSCL" but not "options NFSCLIENT". Updated mount_nfs(8) and mount(8) binaries are needed to do mounts for fstype "oldnfs". The GENERIC kernel configs have been changed to use options NFSCL and NFSD (the new client and server) instead of NFSCLIENT and NFSSERVER. For kernels being used on diskless NFS root systems, "options NFSCL" must be in the kernel config. Discussed on freebsd-fs@.	2011-04-27 17:51:51 +00:00
Rick Macklem	541cb7a358	Fix a kernel linking problem introduced by r221032, r221040 when building kernels that don't have "options NFS_ROOT" specified. I plan on moving the functions that use these data structures into the shared code in sys/nfs/nfs_diskless.c in a future commit. At that time, these definitions will no longer be needed in nfs_vfsops.c and nfs_clvfsops.c. MFC after: 2 weeks	2011-04-26 13:50:11 +00:00
Rick Macklem	8954032f0d	Modify the experimental (newnfs) NFS client so that it uses the same diskless NFS root code as the regular client, which was moved to sys/nfs by r221032. This fixes the newnfs client so that it can do an NFSv3 diskless root file system. MFC after: 2 weeks	2011-04-25 23:12:18 +00:00
Rick Macklem	151c163e4d	Fix the experimental NFS client so that it does not bogusly set the f_flags field of "struct statfs". This had the interesting effect of making the NFSv4 mounts "disappear" after r221014, since NFSMNT_NFSV4 and MNT_IGNORE became the same bit. MFC after: 2 weeks	2011-04-25 14:51:08 +00:00
Rick Macklem	385edc8e71	Modify the experimental NFS client so that it uses the same "struct nfs_args" as the regular NFS client. This is needed so that the old mount(2) syscall will work and it makes sharing of the diskless NFS root code easier. Early in the porting exercise I introduced a new revision of nfs_args, but didn't actually need it, thanks to nmount(2). I re-introduced the NFSMNT_KERB flag, since it does essentially the same thing and the old one would not have been used because it never worked. I also added a few new NFSMNT_xxx flags to sys/nfsclient/nfs_args.h that are used by the experimental NFS client. MFC after: 2 weeks	2011-04-25 13:09:32 +00:00
Rick Macklem	24e2bcc006	Remove the nm_mtx mutex locking from the test for nm_maxfilesize. This value rarely, if ever, changes and the nm_mtx mutex is locked/unlocked earlier in the function, which should be sufficient to avoid getting a stale cached value for it. There is a discussion w.r.t. what these tests should be, but I've left them basically the same as the regular NFS client for now. Suggested by: pjd MFC after: 2 weeks	2011-04-21 19:56:06 +00:00
Rick Macklem	920ae5d96a	Revert r220906, since the vp isn't always locked when nfscl_request() is called. It will need a more involved patch.	2011-04-21 12:38:12 +00:00
Rick Macklem	69bcf84509	Add a check for VI_DOOMED at the beginning of nfscl_request() so that it won't try and use vp->v_mount to do an RPC during a forced dismount. There needs to be at least one more kernel commit, plus a change to the umount(8) command before forced dismounts will work for the experimental NFS client. MFC after: 2 weeks	2011-04-20 23:25:18 +00:00
Rick Macklem	b29b9bcbfb	Modify the offset + size checks for read and write in the experimental NFS client to take care of overflows for the calls above the buffer cache layer in a manner similar to r220876. Thanks go to dillon at apollo.backplane.com for providing the snippet of code that does this. MFC after: 2 weeks	2011-04-20 01:15:22 +00:00
Rick Macklem	b1297f142f	Modify the offset + size checks for read and write in the experimental NFS client to take care of overflows. Thanks go to dillon at apollo.backplane.com for providing the snippet of code that does this. MFC after: 2 weeks	2011-04-20 00:21:51 +00:00
Rick Macklem	58c969c8de	Fix up handling of the nfsmount structure in read and write within the experimental NFS client. Mostly add mutex locking and use the same rsize, wsize during the operation by keeping a local copy of it. This is another change that brings it closer to the regular NFS client. MFC after: 2 weeks	2011-04-19 01:09:51 +00:00
Rick Macklem	a8bafa5d3b	Revert r220761 since, as kib@ pointed out, the case of adding the check to nfsrpc_close() isn't useful. Also, the check in nfscl_getcl() must be more involved, since it needs to check before and after the acquisition of the refcnt on nfsc_lock, while the mutex that protects the client state data is held.	2011-04-18 23:35:16 +00:00
Rick Macklem	bc62b5cf6a	Add a vput() to nfs_lookitup() in the experimental NFS client for a case that will probably never happen. It can only happen if a server were to successfully lookup a file, but not return attributes for that file. Although technically allowed by the NFSv3 RFC, I doubt any server would ever do this. However, if it did, the client would have not vput()'d the new vnode when it needed to do so. MFC after: 2 weeks	2011-04-18 01:02:43 +00:00
Rick Macklem	ab42af2708	Add vput() calls in two places in the experimental NFS client that would be needed if, in the future, nfscl_loadattrcache() were to return an error. Currently nfscl_loadattrcache() never returns an error, so these cases never currently happen. MFC after: 2 weeks	2011-04-18 00:41:23 +00:00
Rick Macklem	78d8a60009	Change the mutex locking for several locations in the experimental NFS client's vnode op functions to make them compatible with the regular NFS client. I'll admit I'm not sure that the mutex locks around the assignments are needed, but the regular client has them, so I added them. Also, add handling of the case of partial attributes in setattr to be compatible with the regular client. MFC after: 2 weeks	2011-04-17 23:56:57 +00:00
Rick Macklem	be8b35eda7	Add checks for MNTK_UNMOUNTF at the beginning of three functions, so that threads don't get stuck in them during a forced dismount. nfs_sync/VFS_SYNC() needs this, since it is called by dounmount() before VFS_UNMOUNT(). The nfscl_nget() case makes sure that a thread doing a VOP_OPEN() or VOP_ADVLOCK() call doesn't get blocked before attempting the RPC. Attempting RPCs doesn't block, since they all fail once a forced dismount is in progress. The third one at the beginning of nfsrpc_close() is done so threads don't get blocked while doing VOP_INACTIVE() as the vnodes are cleared out. These three changes, plus a change to the umount(8) command so that it doesn't do "sync()" for the forced case, seem to make forced dismounts work for the experimental NFS client. MFC after: 2 weeks	2011-04-17 23:04:03 +00:00
Rick Macklem	ebd9ef339f	Get rid of the "nfscl: consider increasing kern.ipc.maxsockbuf" message that was generated when doing experimental NFS client mounts. I put that message in because the krpc would hang with the default size for mounts that used large rsize/wsize values. Since the bug that caused these hangs was fixed by r213756, I think the message is no longer needed. MFC after: 2 weeks	2011-04-17 20:01:32 +00:00
Rick Macklem	0a9f005dff	Fix up some of the sysctls for the experimental NFS client so that they use the same names as the regular client. Also add string descriptions for them. MFC after: 2 weeks	2011-04-17 18:56:17 +00:00
Rick Macklem	8e82d541da	Change some defaults in the experimental NFS client to be the same as the regular NFS client for NFSv3. The main one is making use of a reserved port# the default. Also, set the retry limit for TCP the same and fix the code so that it doesn't disable readdirplus for NFSv4. MFC after: 2 weeks	2011-04-17 14:10:12 +00:00
Rick Macklem	f5613c1d97	Fix readdirplus in the experimental NFS client so that it skips over ".." to avoid a LOR race with nfs_lookup(). This fix is analogous to r138256 in the regular NFS client. MFC after: 2 weeks	2011-04-17 02:44:51 +00:00
Rick Macklem	4b3a38ecdf	Add a lktype flags argument to nfscl_nget() and ncl_nget() in the experimental NFS client so that its nfs_lookup() function can use cn_lkflags in a manner analogous to the regular NFS client. MFC after: 2 weeks	2011-04-16 23:20:21 +00:00
Rick Macklem	f8a2f6b03a	Add mutex locking on the nfs node in ncl_inactive() for the experimental NFS client. MFC after: 2 weeks	2011-04-16 22:15:59 +00:00
Rick Macklem	7b8c319be4	Change the experimental NFS client so that it creates nfsiod threads in the same manner as the regular NFS client after r214026 was committed. This resolves the LORs fixed by r214026 and its predecessors for the regular client. Reviewed by: jhb MFC after: 2 weeks	2011-04-15 23:07:48 +00:00
Rick Macklem	a09001a82b	Fix the experimental NFSv4 server so that it uses VOP_PATHCONF() to determine if a file system supports NFSv4 ACLs. Since VOP_PATHCONF() must be called with a locked vnode, the function is called before nfsvno_fillattr() and the result is passed in as an extra argument. MFC after: 2 weeks	2011-04-14 23:46:15 +00:00
Rick Macklem	07c0c166e4	Modify the experimental NFSv4 server so that it handles crossing of server mount points properly. The functions nfsvno_fillattr() and nfsv4_fillattr() were modified to take the extra arguments that are the mount point, a flag to indicate that it is a file system root and the mounted on fileno. The mount point argument needs to be busy when nfsvno_fillattr() is called, since the vp argument is not locked. Reviewed by: kib MFC after: 2 weeks	2011-04-14 21:49:52 +00:00
Rick Macklem	149ce1025c	Add VOP_PATHCONF() support to the experimental NFS client so that it can, along with other things, report whether or not NFS4 ACLs are supported. MFC after: 2 weeks	2011-04-13 22:37:28 +00:00
Rick Macklem	3707cf8962	Fix the experimental NFSv4 client so that it recognizes server mount point crossings correctly. It was testing the wrong flag. Also, try harder to make sure that the fsid is different than the one assigned to the client mount point, by hashing the server's fsid (just to create a different value deterministically) when it is the same. MFC after: 2 weeks	2011-04-13 22:16:52 +00:00
Rick Macklem	f659876f01	Vrele ni_startdir in the experimental NFS server for the case of NFSv2 getting an error return from VOP_MKNOD(). Without this patch, the server file system remains busy after an NFSv2 VOP_MKNOD() fails. MFC after: 2 weeks	2011-04-11 20:54:30 +00:00
Rick Macklem	806e2e4bb6	Add some cleanup code to the module unload operation for the experimental NFS server, so that it doesn't leak memory when unloaded. However, unloading the NFSv4 server is not recommended, since all NFSv4 state will be lost by the unload and clients will have to recover the state after a server reload/restart as if the server crashed/rebooted. MFC after: 2 weeks	2011-04-10 20:43:07 +00:00
Rick Macklem	8d2f180ea4	Add a VOP_UNLOCK() for the directory, when that is not what VOP_LOOKUP() returned. This fixes a bug in the experimental NFS server for the case where VFS_VGET() fails returning EOPNOTSUPP in the ReaddirPlus RPC, forcing the use of VOP_LOOKUP() instead. MFC after: 2 weeks	2011-04-09 23:55:27 +00:00
Konstantin Belousov	e06c3d4363	Linuxolator calls VOP_READDIR with ncookies pointer. Implement a workaround for fdescfs to not panic when ncookies is not NULL, similar to the one committed as r152254, but simpler, due to fdescfs_readdir() not calling vfs_read_dirent(). PR: kern/156177 MFC after: 1 week	2011-04-09 21:40:48 +00:00
Edward Tomasz Napierala	722581d9e6	Add RACCT_NOFILE accounting. Sponsored by: The FreeBSD Foundation Reviewed by: kib (earlier version)	2011-04-06 19:13:04 +00:00
Zack Kirsch	418802a96c	This patch fixes the Experimental NFS client to properly deal with 32 bit or 64 bit fileid's in NFSv2 and NFSv3. Without this fix, invalid casting (and sign extension) was creating problems for any fileid greater than 2^31. We discovered this because we have test clusters with more than 2 billion allocated files and 64-bit ino_t's (and friend structures). Reviewed by: rmacklem Approved by: zml (mentor) MFC after: 2 weeks	2011-03-30 01:10:11 +00:00
Konstantin Belousov	9ba671debc	Report EBUSY instead of EROFS for attempt of deleting or renaming the root directory of msdosfs mount. The VFS code would handle deletion case itself too, assuming VV_ROOT flag is not lost. The msdosfs_rename() should also note attempt to rename root via doscheckpath() or different mount point check leading to EXDEV. Nonetheless, keep the checks for now. The change is inspired by NetBSD change referenced in PR, but return EBUSY like kern_unlinkat() does. PR: kern/152079 MFC after: 1 week	2011-03-25 22:31:28 +00:00
John Baldwin	8e6fa660f2	Fix some locking nits with the p_state field of struct proc: - Hold the proc lock while changing the state from PRS_NEW to PRS_NORMAL in fork to honor the locking requirements. While here, expand the scope of the PROC_LOCK() on the new process (p2) to avoid some LORs. Previously the code was locking the new child process (p2) after it had locked the parent process (p1). However, when locking two processes, the safe order is to lock the child first, then the parent. - Fix various places that were checking p_state against PRS_NEW without having the process locked to use PROC_LOCK(). Every place was already locking the process, just after the PRS_NEW check. - Remove or reduce the use of PROC_SLOCK() for places that were checking p_state against PRS_NEW. The PROC_LOCK() alone is sufficient for reading the current state. - Reorder fill_kinfo_proc() slightly so it only acquires PROC_SLOCK() once. MFC after: 1 week	2011-03-24 18:40:11 +00:00
Alexander Leidinger	de5b19526b	Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/PMC/SYSV/...). No FreeBSD version bump, the userland application to query the features will be committed last and can serve as an indication of the availability if needed. Sponsored by: Google Summer of Code 2010 Submitted by: kibab Reviewed by: arch@ (parts by rwatson, trasz, jhb) X-MFC after: to be determined in last commit with code from this project	2011-02-25 10:11:01 +00:00
John Baldwin	056c6c933c	Use ffs() to locate free bits in the inode and block bitmaps rather than loops with bit shifts.	2011-02-24 22:11:36 +00:00
Rebecca Cran	974206cf70	Fix typos - remove duplicate "is". PR: docs/154934 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-23 09:22:33 +00:00
Alan Cox	4d2f3d2cde	Eliminate two dubious attempts at optimizing the implementation of a file's last accessed, modified, and changed times: TMPFS_NODE_ACCESSED and TMPFS_NODE_CHANGED should be set unconditionally in tmpfs_remove() without regard to the number of hard links to the file. Otherwise, after the last directory entry for a file has been removed, a process that still has the file open could read stale values for the last accessed and changed times with fstat(2). Similarly, tmpfs_close() should update the time-related fields even if all directory entries for a file have been removed. In this case, the effect is that the time-related fields will have values that are later than expected. They will correspond to the time at which fstat(2) is called. In collaboration with: kib MFC after: 1 week	2011-02-22 14:47:10 +00:00
Rebecca Cran	6bccea7c2b	Fix typos - remove duplicate "the". PR: bin/154928 Submitted by: Eitan Adler <lists at eitanadler.com> MFC after: 3 days	2011-02-21 09:01:34 +00:00
Alan Cox	7ded42ba28	tmpfs_remove() isn't modifying the file's data, so it shouldn't set TMPFS_NODE_MODIFIED on the node. PR: 152488 Submitted by: Anton Yuzhaninov Reviewed by: kib MFC after: 1 week	2011-02-19 21:04:36 +00:00
Bjoern A. Zeeb	1fb51a12f2	Mfp4 CH=177274,177280,177284-177285,177297,177324-177325 VNET socket push back: try to minimize the number of places where we have to switch vnets and narrow down the time we stay switched. Add assertions to the socket code to catch possibly unset vnets as seen in r204147. While this reduces the number of vnet recursion in some places like NFS, POSIX local sockets and some netgraph, .. recursions are impossible to fix. The current expectations are documented at the beginning of uipc_socket.c along with the other information there. Sponsored by: The FreeBSD Foundation Sponsored by: CK Software GmbH Reviewed by: jhb Tested by: zec Tested by: Mikolaj Golub (to.my.trociny gmail.com) MFC after: 2 weeks	2011-02-16 21:29:13 +00:00
Alan Cox	4673c751f8	Further simplify tmpfs_reg_resize(). Also, update its comments, including style fixes.	2011-02-14 15:36:38 +00:00
Alan Cox	b10d1d5d60	Eliminate tn_reg.tn_aobj_pages. Instead, correctly maintain the vm object's size field. Previously, that field was always zero, even when the object tn_reg.tn_aobj contained numerous pages. Apply style fixes to tmpfs_reg_resize(). In collaboration with: kib	2011-02-13 14:46:39 +00:00
John Baldwin	73dd6d1f8f	After reading a bitmap block for i-nodes or blocks, recheck the count of free i-nodes or blocks to handle a race where another thread might have allocated the last i-node or block while we were waiting for the buffer. Tested by: dougb	2011-02-08 13:02:25 +00:00
Alan Cox	17f3095d1a	Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are incorrectly calling vm_object_page_clean(). They are passing the length of the range rather than the ending offset of the range. Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the callers. Reviewed by: kib MFC after: 3 weeks	2011-02-05 21:21:27 +00:00
John Baldwin	a3ebd02675	Collapse duplicate definitions of EXT2_SB(). Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-02-04 14:20:27 +00:00
John Baldwin	8e42a40607	Fix build with DIAGNOSTIC enabled. Pointy hat to: jhb	2011-02-02 14:59:05 +00:00
John Baldwin	45641afb72	Some cosmetic fixes and remove a duplicate constant. Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-02-01 18:30:52 +00:00
John Baldwin	c767faa558	- Set the next_alloc fields for an i-node after allocating a new block so that future allocations start with most recently allocated block rather than the beginning of the filesystem. - Fix ext2_alloccg() to properly scan for 8 block chunks that are not aligned on 8-bit boundaries. Previously this was causing new blocks to be allocated in a highly fragmented fashion (block 0 of a file at lbn N, block 1 at lbn N + 8, block 2 at lbn N + 16, etc.). - Cosmetic tweaks to the currently-disabled fancy realloc sysctls. PR: kern/153584 Discussed with: bde Tested by: Pedro F. Giffuni giffunip at yahoo, Zheng Liu (lz)	2011-02-01 18:21:45 +00:00
George V. Neville-Neil	64181ef324	Quick fix to a comment.	2011-01-27 03:32:16 +00:00
Dmitry Chagin	a5c1afadeb	Add macro to test the sv_flags of any process. Change some places to test the flags instead of explicit comparing with address of known sysentvec structures. MFC after: 1 month	2011-01-26 20:03:58 +00:00
John Baldwin	cd2895aab0	- Move special inode constants to ext2_dinode.h and rename them to match NetBSD. - Add a constant for the HASJOURNAL compat flag. PR: kern/153584 Submitted by: Pedro F. Giffuni giffunip at yahoo	2011-01-21 22:00:40 +00:00
John Baldwin	84edda0a2c	Restore support for the 'async' and 'sync' mount options lost when switching to nmount(2). While here, sort the options. PR: kern/153584 Submitted by: Pedro F. Giffuni giffunip at yahoo MFC after: 1 week	2011-01-21 21:33:46 +00:00
Konstantin Belousov	9fb9c623a6	In tmpfs_readdir(), normalize handling of the directory entries that either overflow the supplied buffer, or cause uiomove to fail. Do not advance cached de when directory entry was not copied out. Do not return EOF when no entries could be copied due to first entry too large for supplied buffer, signal EINVAL instead. Reported by: Beat Gätzi <beat chruetertee ch> MFC after: 1 week	2011-01-20 09:39:16 +00:00
John Baldwin	a2add8d070	Fix build with KDB defined. Pointy hat to: jhb Submitted by: jkim	2011-01-19 19:49:48 +00:00
John Baldwin	08b1d53573	Whitespace and style fixes.	2011-01-19 16:55:32 +00:00
John Baldwin	f82a066c72	Move calculation of 'bmask' earlier to match its current location in ufs_lookup().	2011-01-19 16:52:22 +00:00
John Baldwin	007c620744	Merge 118969 from UFS: Eliminate the i_devvp field from the incore inodes, we can get the same value from ip->i_ump->um_devvp. Submitted by: Pedro F. Giffuni giffunip at yahoo MFC after: 1 week	2011-01-19 16:46:13 +00:00
Rick Macklem	8207db3ec3	Fix the experimental NFSv4 server so that it uses VOP_ACCESSX() to check for VREAD_ACL instead of VOP_ACCESS(). MFC after: 3 days	2011-01-18 14:34:45 +00:00
Rick Macklem	5f73287a6e	Modify the experimental NFSv4 server so that it posts a SIGUSR2 signal to the master nfsd daemon whenever the stable restart file has been modified. This will allow the master nfsd daemon to maintain an up to date backup copy of the file. This is enabled via the nfssvc() syscall, so that older nfsd daemons will not be signaled. Reviewed by: jhb MFC after: 1 week	2011-01-14 23:30:35 +00:00
Zack Kirsch	770b49a314	In the experimental NFS server, when converting an open-owner to a lock-owner, start at sequence id 1 instead of 0, to match up with both Solaris and Linux. Reviewed by: rmacklem Approved by: zml (mentor)	2011-01-12 23:46:12 +00:00
Zack Kirsch	52776c502b	Clean up the experimental NFS server replay cache when the module is unloaded. Reviewed by: rmacklem Approved by: zml (mentor)	2011-01-12 23:34:09 +00:00
Rick Macklem	f9266eb1f9	Modify readdirplus in the experimental NFS server in a manner analogous to r216633 for the regular server. This change busies the file system so that VFS_VGET() is guaranteed to be using the correct mount point even during a forced dismount attempt. Since nfsd_fhtovp() is not called immediately before readdirplus, the patch is actually a clone of pjd@'s nfs_serv.c.4.patch instead of the one committed in r216633. Reviewed by: kib MFC after: 10 days	2011-01-09 02:10:54 +00:00
Rick Macklem	fbf0af3fcb	Delete the NFS_STARTWRITE() and NFS_ENDWRITE() macros that obscured vn_start_write() and vn_finished_write() for the old OpenBSD port, since most uses have been replaced by the correct calls. MFC after: 12 days	2011-01-06 20:31:33 +00:00
Rick Macklem	8974bc2f3a	Since the VFS_LOCK_GIANT() code in the experimental NFS server is broken and the major file systems are now all mpsafe, modify the server so that it will only export mpsafe file systems. This was discussed on freebsd-fs@ and removes a fair bit of crufty code. MFC after: 12 days	2011-01-06 19:50:11 +00:00
Rick Macklem	785f073be9	Modify the experimental NFS server so that it calls vn_start_write() with a non-NULL vp. That way it will find the correct mount point mp and use that mp for the subsequent vn_finished_write() call. Also, it should fail without crashing if the mount point is being forced dismounted because vn_start_write() will set the mp NULL via VOP_GETWRITEMOUNT(). Reviewed by: kib MFC after: 12 days	2011-01-05 19:35:35 +00:00
Rick Macklem	47524363da	Fix the experimental NFS server to use vfs_busyfs() instead of vfs_getvfs() so that the mount point is busied for the VFS_FHTOVP() call. This is analogous to r185432 for the regular NFS server. Reviewed by: kib MFC after: 12 days	2011-01-05 18:46:05 +00:00
Rick Macklem	90305aa38b	Fix the nlm so that it no longer depends on the regular nfs client and, as such, can be loaded for the experimental nfs client without the regular client. Reviewed by: jhb MFC after: 2 weeks	2011-01-03 20:37:31 +00:00
Rick Macklem	fa5ecdd3b9	Fix the experimental NFS server so that it doesn't leak a reference count on the directory when creating device special files. MFC after: 2 weeks	2011-01-03 00:40:13 +00:00
Rick Macklem	81f78d997d	Modify the experimental NFSv4 server so that the lookup ops return a locked vnode. This ensures that the associated mount point will always be valid for the code that follows the operation. Also add a couple of additional checks for non-error to the other functions that create file objects. MFC after: 2 weeks	2011-01-03 00:33:32 +00:00
Rick Macklem	c9aad40f5f	Delete some cruft from the experimental NFS server that was only used by the OpenBSD port for its pseudo-fs. MFC after: 2 weeks	2011-01-02 21:34:01 +00:00
Rick Macklem	629fa50e68	Add checks for VI_DOOMED and vn_lock() failures to the experimental NFS server, to handle the case where an exported file system is forced dismounted while an RPC is in progress. Further commits will fix the cases where a mount point is used when the associated vnode isn't locked. Reviewed by: kib MFC after: 2 weeks	2011-01-02 19:58:39 +00:00
Rick Macklem	5a12538bd7	Add support for shared vnode locks for the Read operation in the experimental NFSv4 server. Reviewed by: kib MFC after: 2 weeks	2011-01-01 18:50:49 +00:00
Rick Macklem	bd2fa726e0	Delete the nfsvno_localconflict() function in the experimental NFS server since it is no longer used and is broken. MFC after: 2 weeks	2010-12-28 23:50:13 +00:00
Rick Macklem	17891d0082	Modify the experimental NFS server so that it uses LK_SHARED for RPC operations when it can. Since VFS_FHTOVP() currently always gets an exclusively locked vnode and is usually called at the beginning of each RPC, the RPCs for a given vnode will still be serialized. As such, passing a lock type argument to VFS_FHTOVP() would be preferable to doing the vn_lock() with LK_DOWNGRADE after the VFS_FHTOVP() call. Reviewed by: kib MFC after: 2 weeks	2010-12-25 21:56:25 +00:00
Rick Macklem	0cf42b622b	Add an argument to nfsvno_getattr() in the experimental NFS server, so that it can avoid calling VOP_ISLOCKED() when the vnode is known to be locked. This will allow LK_SHARED to be used for these cases, which happen to be all the cases that can use LK_SHARED. This does not fix any bug, but it reduces the number of calls to VOP_ISLOCKED() and prepares the code so that it can be switched to using LK_SHARED in a future patch. Reviewed by: kib MFC after: 2 weeks	2010-12-24 21:31:18 +00:00
Rick Macklem	a852f40b7a	Simplify vnode locking in the experimental NFS server's readdir functions. In particular, get rid of two bogus VOP_ISLOCKED() calls. Removing the VOP_ISLOCKED() calls is the only actual bug fixed by this patch. Reviewed by: kib MFC after: 2 weeks	2010-12-24 20:24:07 +00:00
Rick Macklem	63e1cb4308	Since VOP_READDIR() for ZFS does not return monotonically increasing directory offset cookies, disable the UFS related loop that skips over directory entries at the beginning of the block for the experimental NFS server. This loop is required for UFS since it always returns directory entries starting at the beginning of the block that the requested directory offset is in. In discussion with pjd@ and mckusick@ it seems that this behaviour of UFS should maybe change, with this fix being an interim patch until then. This patch only fixes the experimental server, since pjd@ is working on a patch for the regular server. Discussed with: pjd, mckusick MFC after: 5 days	2010-12-24 18:46:44 +00:00
Rick Macklem	d6ec8427bc	Fix two vnode locking problems in nfsd_recalldelegation() in the experimental NFSv4 server. The first was a bogus use of VOP_ISLOCKED() in a KASSERT() and the second was the need to lock the vnode for the nfsrv_checkremove() call. Also, delete a "__unused" that was bogus, since the argument is used. Reviewed by: zack.kirsch at isilon.com MFC after: 2 weeks	2010-12-17 22:18:09 +00:00
Jaakko Heinonen	2d843e7d34	Don't allow user created symbolic links to cover other entries marked with DE_USER. If a devfs rule hid such an entry, it was possible to create an infinite number of symbolic links with the same name. Reviewed by: kib	2010-12-15 16:49:47 +00:00
Jaakko Heinonen	ef456eec95	- Assert that dm_lock is exclusively held in devfs_rules_apply() and in devfs_vmkdir() while adding the entry to de_list of the parent. - Apply devfs rules to newly created directories and symbolic links. PR: kern/125034 Submitted by: Mateusz Guzik (original version)	2010-12-15 16:42:44 +00:00
Jaakko Heinonen	2f66e90fc7	Handle the special ruleset 0 in devfs_ruleset_use(). An attempt to set the current ruleset to 0 with command "devfs ruleset 0" triggered a KASSERT in devfs_ruleset_create(). PR: kern/125030 Submitted by: Mateusz Guzik	2010-12-12 08:52:13 +00:00
Rick Macklem	b4a8d95279	Disable attempts to establish a callback connection from the experimental NFSv4 server to a NFSv4 client when delegations are not being issued, even if the client advertises a callback path. This avoids a problem where a Linux client advertises a callback path that doesn't work, due to a firewall, and then times out an Open attempt before the FreeBSD server gives up its callback connection attempt. (Suggested by drb at karlov.mff.cuni.cz to fix the Linux client problem that he reported on the fs-stable mailing list.) The server should probably have a 1sec timeout on callback connection attempts when there are no delegations issued to the client, but that patch will require changes to the krpc and this serves as a work around until then. Tested by: drb at karlov.mff.cuni.cz MFC after: 5 days	2010-12-09 19:02:23 +00:00
Edward Tomasz Napierala	ef694c1ac4	Replace pointer to "struct uidinfo" with pointer to "struct ucred" in "struct vm_object". This is required to make it possible to account for per-jail swap usage. Reviewed by: kib@ Tested by: pho@ Sponsored by: FreeBSD Foundation	2010-12-02 17:37:16 +00:00
Konstantin Belousov	847e02e941	For non-stopped threads, td_frame pointer is undefined. As a consequence, fill_regs() and fill_fpregs() access random data, usually on the thread kernel stack. Most often the td_frame points to the previous frame saved by last kernel entry sequence, but this is not guaranteed. For /proc/<pid>/{regs,fpregs} read access, require the thread to be in stopped state. Otherwise, return EBUSY as is done for write case. Reported and tested by: pho Approved by: des (procfs maintainer) MFC after: 1 week	2010-12-02 12:44:51 +00:00
Konstantin Belousov	730b63b0c2	Remove prtactive variable and related printf()s in the vop_inactive and vop_reclaim() methods. They seem to be unused, and the reported situation is normal for the forced unmount. MFC after: 1 week X-MFC-note: keep prtactive symbol in vfs_subr.c	2010-11-19 21:17:34 +00:00
John Baldwin	b3e3402d3a	Remove unused includes of <sys/mutex.h> and <machine/mutex.h>.	2010-11-09 20:41:10 +00:00
Rick Macklem	f93d95cbf6	Modify nfs_open() in the experimental NFS client to be compatible with the regular NFS client. Also, fix a couple of mutex lock issues. MFC after: 1 week	2010-10-29 13:46:21 +00:00
Rick Macklem	0661e0348b	Add a call for nfsrpc_close() to ncl_reclaim() in the experimental NFSv4 client, since the call in ncl_inactive() might be missed because VOP_INACTIVE() is not guaranteed to be called before VOP_RECLAIM(). MFC after: 1 week	2010-10-29 13:34:57 +00:00
Rick Macklem	c5dd9d8c37	Add a flag to the experimental NFSv4 client to indicate when delegations are being returned for reasons other than a Recall. Also, re-organize nfscl_recalldeleg() slightly, so that it leaves clearing NMODIFIED to the ncl_flush() call and invalidates the attribute cache after flushing. It is hoped that these changes might fix the problem others have seen when using the NFSv4 client with delegations enabled, since I can't reliably reproduce the problem. These changes only affect the client when doing NFSv4 mounts with delegations enabled. MFC after: 10 days	2010-10-26 23:18:37 +00:00
Rick Macklem	377c50f67a	Modify the experimental NFSv4 server's file handle hash function to use the generic hash32_buf() function. Although adding the bytes seemed sufficient for UFS and ZFS, since most of the bytes are the same for file handles on the same volume, this might not be sufficient for other file systems. Use of a generic function also seems preferable to one specific to NFSv4. Suggested by: gleb.kurtsou at gmail.com MFC after: 10 days	2010-10-23 22:28:29 +00:00
Rick Macklem	91027b4ef0	Modify the file handle hash function in the experimental NFS server so that it will work better for non-UFS file systems. The new function simply sums the bytes of the fh_fid field of fhandle_t. MFC after: 10 days	2010-10-22 21:38:56 +00:00
Rick Macklem	8a1b5ade5f	Modify the experimental NFS server in a manner analogous to r214049 for the regular NFS server, so that it will not do a VOP_LOOKUP() of ".." when at the root of a file system when performing a ReaddirPlus RPC. MFC after: 10 days	2010-10-21 18:49:12 +00:00
Rick Macklem	4d4f9a3721	Fix the type of the 3rd argument for nm_getinfo so that it works for architectures like sparc64. Suggested by: kib MFC after: 2 weeks	2010-10-19 11:55:58 +00:00
Rick Macklem	ca27c028d8	Modify the NFS clients and the NLM so that the NLM can be used by both clients. Since the NLM uses various fields of the nfsmount structure, those fields were extracted and put in a separate nfs_mountcommon structure stored in sys/nfs/nfs_mountcommon.h. This structure also has a function pointer for a function that extracts the required information from the mount point and nfs vnode for that particular client, for information stored differently by the clients. Reviewed by: jhb MFC after: 2 weeks	2010-10-19 00:20:00 +00:00
Kevin Lo	4bc8fad7bd	Fix a possible race where the directory dirent is moved to the location that was used by the ".." entry. This change seems to fix a panic during attempts to access msdosfs data over nfs. Reviewed by: kib MFC after: 1 week	2010-10-18 03:34:33 +00:00
Rui Paulo	0b53cc9f56	Ignore the return value of DE_INTERNALIZE().	2010-10-13 11:37:39 +00:00
Andriy Gapon	e07b64c567	tmpfs + sendfile: do not produce partially valid pages for vnode's tail See r213730 for details of analogous change in ZFS. MFC after: 3 days	2010-10-12 17:16:51 +00:00
Jaakko Heinonen	27877c9903	Format prototypes to follow style(9) more closely. Discussed with: kib, phk	2010-10-12 15:58:52 +00:00
Rick Macklem	db0a33d219	Try and make the nfsrv_localunlock() function in the experimental NFSv4 server more readable. Mostly changes to comments, but a case of >= is changed to >, since == can never happen. Also, I've added a couple of KASSERT()s and a slight optimization, since once the "else if" case happens, subsequent locks in the list can't have any effect. None of these changes fixes any known bug. MFC after: 2 weeks	2010-10-11 23:15:18 +00:00
Konstantin Belousov	d0cc54f3b4	The r184588 changed the layout of struct export_args, causing an ABI breakage for old mount(2) syscall, since most struct <filesystem>_args embed export_args. The mount(2) is supposed to provide ABI compatibility for pre-nmount mount(8) binaries, so restore ABI to pre-r184588. Requested and reviewed by: bde MFC after: 2 weeks	2010-10-10 07:05:47 +00:00
Konstantin Belousov	b0d5391101	Add a comment describing the reason for calling cache_purge(fvp). Requested by: danfe MFC after: 6 days	2010-10-08 07:17:22 +00:00
Konstantin Belousov	4d477d5c77	The msdosfs lookup is case insensitive. Several aliases may be inserted for a single directory entry. As a consequence, name cache purge done by lookup for fvp when DELETE op for namei is specified, might be not enough to expunge all namecache entries that were installed for this direntry. Explicitly call cache_purge(fvp) when msdosfs_rename() succeeded. PR: kern/93634 MFC after: 1 week	2010-10-07 08:36:02 +00:00
Alan Cox	a03e344a7f	M_USE_RESERVE has been deprecated for a decade. Eliminate any uses that have no run-time effect.	2010-10-02 17:58:57 +00:00
Jaakko Heinonen	47bcfb6422	Add a new function devfs_dev_exists() to be able to find out if a specific devfs path already exists. The function will be used from kern_conf.c to detect duplicate device registrations. Callers must hold the devmtx mutex. Reviewed by: kib	2010-09-27 18:20:56 +00:00
Jaakko Heinonen	d318c565d7	Add reference counting for devfs paths containing user created symbolic links. The reference counting is needed to be able to determine if a specific devfs path exists. For true device file paths we can traverse the cdevp_list but a separate directory list is needed for user created symbolic links. Add a new directory entry flag DE_USER to mark entries which should unreference their parent directory on deletion. A new function to traverse cdevp_list and the directory list will be introduced in a separate commit. Idea from: kib Reviewed by: kib	2010-09-27 17:47:09 +00:00
Jaakko Heinonen	6adc52306a	Modify devfs_fqpn() for future use in devfs path reference counting code: - Accept devfs_mount and devfs_dirent as the arguments instead of a vnode. This generalizes the function so that it can be used from contexts where vnode references are not available. - Accept NULL cnp argument. No '/' will be appended, if a NULL cnp is provided. - Make the function global and add its prototype to devfs.h. Reviewed by: kib	2010-09-21 16:49:02 +00:00
Rick Macklem	a212c01aac	Fix nfsrv_freeallnfslocks() in the experimental NFSv4 server so that it frees local locks correctly upon close. In order for nfsrv_localunlock() to work correctly, the lock can no longer be in the lockowner's stateid list. As such, nfsrv_freenfslock() has to be called before nfsrv_localunlock(), to get rid of the lock structure on the lockowner's stateid list. This only affected operation when local locks (vfs.newnfs.enable_locallocks=1) are enabled, which is not the default at this time. MFC after: 1 week	2010-09-19 01:18:03 +00:00
Rick Macklem	c7aafc24c4	Fix the experimental NFSv4 server so that it performs local VOP_ADVLOCK() unlock operations correctly. It was passing in F_SETLK instead of F_UNLCK as the operation for the unlock case. This only affected operation when local locking (vfs.newnfs.enable_locallocks=1) was enabled. MFC after: 1 week	2010-09-19 01:05:19 +00:00
Jaakko Heinonen	8570d045e5	- For consistency, remove "." and ".." entries from de_dlist before calling devfs_delete() (and thus possibly dropping dm_lock) in devfs_rmdir_empty(). - Assert that we don't return doomed entries from devfs_find(). [1] Suggested by: kib [1] Reviewed by: kib	2010-09-18 18:37:41 +00:00
Jaakko Heinonen	89d10571db	Remove empty devfs directories automatically. devfs_delete() now recursively removes empty parent directories unless the DEVFS_DEL_NORECURSE flag is specified. devfs_delete() can't be called anymore with a parent directory vnode lock held because the possible parent directory deletion needs to lock the vnode. Thus we unlock the parent directory vnode in devfs_remove() before calling devfs_delete(). Call devfs_populate_vp() from devfs_symlink() and devfs_vptocnp() as now directories can get removed. Add a check for DE_DOOMED flag to devfs_populate_vp() because devfs_delete() drops dm_lock before the VI_DOOMED vnode flag gets set. This ensures that devfs_populate_vp() returns an error for directories which are in progress of deletion. Reviewed by: kib Discussed on: freebsd-current (mostly silence)	2010-09-15 14:23:55 +00:00
Andriy Gapon	21bd3e2576	tmpfs, zfs + sendfile: mark page bits as valid after populating it with data Otherwise, adding insult to injury, in addition to double-caching of data we would always copy the data into a vnode's vm object page from backend. This is specific to sendfile case only (VOP_READ with UIO_NOCOPY). PR: kern/141305 Reported by: Wiktor Niesiobedzki <bsd@vink.pl> Reviewed by: alc Tested by: tools/regression/sockets/sendfile MFC after: 2 weeks	2010-09-15 10:31:27 +00:00
Rick Macklem	2c6d0e01f8	This patch applies one of the two fixes suggested by zack.kirsch at isilon.com for a race between nfsrv_freeopen() and nfsrv_getlockfile() in the experimental NFS server that he found during testing. Although nfsrv_freeopen() holds a sleep lock on the lock file structure when called with cansleep != 0, nfsrv_getlockfile() could still search the list, once it acquired the NFSLOCKSTATE() mutex. I believe that acquiring the mutex in nfsrv_freeopen() fixes the race. MFC after: 2 weeks	2010-09-10 23:49:33 +00:00
Rick Macklem	37fe683250	Fix the NFSVNO_CMPFH() macro in the experimental NFS server so that it works correctly for ZFS file handles. It is possible to have two ZFS file handles that differ only in the bytes in the fid_reserved field of the generic "struct fid" and comparing the bytes in fid_data didn't catch this case. This patch changes the macro to compare all bytes of "struct fid". Tested by: gull at gull.us MFC after: 2 weeks	2010-09-10 23:18:45 +00:00
Rick Macklem	a8c0af5906	Fix the experimental NFS client so that it doesn't panic when NFSv2,3 byte range locking is attempted. A fix that allows the nlm_advlock() to work with both clients is in progress, but may take a while. As such, I am doing this commit so that the kernel doesn't panic in the meantime. Submitted by: jh MFC after: 2 weeks	2010-09-09 15:45:11 +00:00
Ivan Voras	b2143ecb99	Avoid "Entry can disappear before we lock fdvp" panic. PR: 150143 Submitted by: Gleb Kurtsou <gk at FreeBSD.org> Pretty sure it won't blow up: mckusick MFC after: 2 weeks	2010-09-07 22:40:45 +00:00
John Baldwin	8e27c18282	Store the full timestamp when caching timestamps of files and directories for purposes of validating name cache entries. This closes races where two updates to a file or directory within the same second could result in stale entries in the name cache. While here, remove the 'n_expiry' field as it is no longer used. Reviewed by: rmacklem MFC after: 1 week	2010-09-07 14:29:45 +00:00
Daichi GOTO	21f9b7b28a	Allow unionfs to use a file system that does not support whiteouts as the upper layer. Until now, unionfs refused to use that kind of file system as the upper layer. With this change, a file system without whiteout support (e.g., tmpfs) can serve as the upper layer. This is very useful for combining tmpfs as the upper layer with a read-only file system as the lower layer. By definition, without whiteout support from the file system backing the upper layer, there is no way that delete and rename operations on lower layer objects can be done. EOPNOTSUPP is returned for this kind of operation, as generated by VOP_WHITEOUT(), along with any others which would make modifications to the lower layer, such as chmod(1). This change is suggested by ed. Submitted by: ed	2010-09-05 04:58:16 +00:00
Rick Macklem	848fd2c0e2	Change the code in ncl_bioread() in the experimental NFS client to return an error when rabp is not set, so it behaves the same way as the regular NFS client for this case. It does not affect NFSv4, since nfs_getcacheblk() only fails for "intr" mounts and NFSv4 can't use the "intr" mount option. MFC after: 2 weeks	2010-09-05 00:47:44 +00:00
Rick Macklem	0372f5f411	Disable use of the NLM in the experimental NFS client, since it will crash the kernel because it uses the nfsmount and nfsnode structures of the regular NFS client. MFC after: 2 weeks	2010-09-05 00:10:18 +00:00
Ulf Lilleengen	0cc17ce608	- Remove duplicate comment. PR: kern/148820 Submitted by: pluknet <pluknet - at - gmail.com>	2010-09-01 05:34:17 +00:00
Rick Macklem	2d0c83b139	Add a null_remove() function to nullfs, so that the v_usecount of the lower level vnode is incremented to greater than 1 when the upper level vnode's v_usecount is greater than one. This is necessary for the NFS clients, so that they will do a silly rename of the file instead of actually removing it when the file is still in use. It is "racy", since the v_usecount is incremented in many places in the kernel with minimal synchronization, but an extraneous silly rename is preferred to not doing a silly rename when it is required. The only other file systems that currently check the value of v_usecount in their VOP_REMOVE() functions are nwfs and smbfs. These file systems choose to fail a remove when the v_usecount is greater than 1 and I believe will function more correctly with this patch, as well. Tested by: to.my.trociny at gmail.com Submitted by: to.my.trociny at gmail.com (earlier version) Reviewed by: kib MFC after: 2 weeks	2010-08-31 01:16:45 +00:00
Rick Macklem	b5cb66df25	Add acquisition of a reference count on nfsv4root_lock to the nfsd_recalldelegation() function, since this function is called by nfsd threads when they are handling NFSv2 or NFSv3 RPCs, where no reference count would have been acquired. MFC after: 2 weeks	2010-08-28 23:50:09 +00:00
Rick Macklem	2ec3f92528	The timer routine in the experimental NFS server did not acquire the correct mutex when checking nfsv4root_lock. Although this could be fixed by adding mutex lock/unlock calls, zack.kirsch at isilon.com suggested a better fix that uses a non-blocking acquisition of a reference count on nfsv4root_lock. This fix allows the weird NFSLOCKSTATE(); NFSUNLOCKSTATE(); synchronization to be deleted. This patch applies this fix. Tested by: zack.kirsch at isilon.com MFC after: 2 weeks	2010-08-28 21:41:18 +00:00
Jaakko Heinonen	4136388a18	Set de_dir for user created symbolic links. This will be needed to be able to resolve their parent directories.	2010-08-26 16:01:29 +00:00
Edward Tomasz Napierala	81f6480d42	Revert r210194, adding a comment explaining why calls to chgproccnt() in unionfs are actually needed. I have a better fix in trasz_hrl p4 branch, but now is not a good moment to commit it. Reported by: Alex Kozlov	2010-08-25 21:32:08 +00:00
Jaakko Heinonen	f5efcd64f4	Call devfs_populate_vp() from devfs_getattr(). It was possible that fstat(2) returned stale information through an open file descriptor.	2010-08-25 15:29:12 +00:00
Jaakko Heinonen	0f6bb099ae	Introduce and use devfs_populate_vp() to unlock a vnode before calling devfs_populate(). This is a prerequisite for the automatic removal of empty directories which will be committed in the future. Reviewed by: kib (previous version)	2010-08-22 16:08:12 +00:00
Ed Schouten	99d57a6bd8	Add support for whiteouts on tmpfs. Right now unionfs only allows filesystems to be mounted on top of another if it supports whiteouts. Even though I have sent a patch to daichi@ to let unionfs work without it, we'd better also add support for whiteouts to tmpfs. This patch implements .vop_whiteout and makes necessary changes to lookup() and readdir() to take them into account. We must also make sure that when adding or removing a file, we honour the componentname's DOWHITEOUT and ISWHITEOUT, to prevent duplicate filenames. MFC after: 1 month	2010-08-22 05:36:06 +00:00

3320 Commits