freebsd-dev

Author	SHA1	Message	Date
John W. De Boskey	3676a0d890	Use the common api helper routine instead of freeing the namei buffer directly. Approved by: rmacklem (mentor) MFC after: 1 month	2012-05-08 03:39:44 +00:00
Daichi GOTO	508a31f1a8	fixed a unionfs_readdir math issue PR: 132987 Submitted by: Matthew Fleming <mfleming@isilon.com>	2012-05-03 07:22:29 +00:00
Daichi GOTO	cb5736b73b	- fixed a vnode lock hang-up issue. - fixed an incorrect lock status issue. - fixed an incorrect lock issue of unionfs root vnode removed. (pointed out by keith) - fixed an infinity loop issue. (pointed out by dumbbell) - changed to do LK_RELEASE expressly when unlocked. Submitted by: ozawa@ongs.co.jp	2012-05-01 07:46:30 +00:00
Rick Macklem	4964d80705	It was reported via email that some non-FreeBSD NFS servers do not include file attributes in the reply to an NFS create RPC under certain circumstances. This resulted in a vnode of type VNON that was not usable. This patch adds an NFS getattr RPC to nfs_create() for this case, to fix the problem. It was tested by the person that reported the problem and confirmed to fix this case for their server. Tested by: Steven Haber (steven.haber at isilon.com) MFC after: 2 weeks	2012-04-27 22:23:06 +00:00
Rick Macklem	a607cc6d8e	Fix a leak of namei lookup path buffers that occurs when a ZFS volume is exported via the new NFS server. The leak occurred because the new NFS server code didn't handle the case where a file system sets the SAVENAME flag in its VOP_LOOKUP() and ZFS does this for the DELETE case. Tested by: Oliver Brandmueller (ob at gruft.de), hrs PR: kern/167266 MFC after: 1 month	2012-04-27 20:23:24 +00:00
Edward Tomasz Napierala	af6e6b87ad	Remove unused thread argument to vrecycle(). Reviewed by: kib	2012-04-23 14:10:34 +00:00
Edward Tomasz Napierala	c52fd858ae	Remove unused thread argument from vtruncbuf(). Reviewed by: kib	2012-04-23 13:21:28 +00:00
Kirk McKusick	f257ebbb2e	This change creates a new list of active vnodes associated with a mount point. Active vnodes are those with a non-zero use or hold count, e.g., those vnodes that are not on the free list. Note that this list is in addition to the list of all the vnodes associated with a mount point. To avoid adding another set of linkage pointers to the vnode structure, the active list uses the existing linkage pointers used by the free list (previously named v_freelist, now renamed v_actfreelist). This update adds the MNT_VNODE_FOREACH_ACTIVE interface that loops over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-20 06:50:44 +00:00
Jaakko Heinonen	fd1062ce4c	Return EOPNOTSUPP rather than EPERM for the SF_SNAPSHOT flag because tmpfs doesn't support snapshots. Suggested by: bde	2012-04-18 15:22:08 +00:00
Kirk McKusick	71469bb38f	Replace the MNT_VNODE_FOREACH interface with MNT_VNODE_FOREACH_ALL. The primary changes are that the user of the interface no longer needs to manage the mount-mutex locking and that the vnode that is returned has its mutex locked (thus avoiding the need to check to see if its is DOOMED or other possible end of life senarios). To minimize compatibility issues for third-party developers, the old MNT_VNODE_FOREACH interface will remain available so that this change can be MFC'ed to 9. Following the MFC to 9, MNT_VNODE_FOREACH will be removed in head. The reason for this update is to prepare for the addition of the MNT_VNODE_FOREACH_ACTIVE interface that will loop over just the active vnodes associated with a mount point (typically less than 1% of the vnodes associated with the mount point). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks	2012-04-17 16:28:22 +00:00
Jaakko Heinonen	587fdb536f	Sync tmpfs_chflags() with the recent changes to UFS: - Add a check for unsupported file flags. - Return EPERM when an user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags.	2012-04-16 18:10:34 +00:00
Jaakko Heinonen	c5ab5ce345	tmpfs: Allow update mounts only for certain options. Since r230208 update mounts were allowed if the list of mount options contained the "export" option. This is not correct as tmpfs doesn't really support updating all options. Reviewed by: kevlo, trociny	2012-04-16 18:07:42 +00:00
Gleb Kurtsou	f8439900d6	Provide better description for vfs.tmpfs.memory_reserved sysctl. Suggested by: Anton Yuzhaninov <citrin@citrin.ru>	2012-04-15 21:59:28 +00:00
Jaakko Heinonen	295a542d96	Apply changes from r234103 to ext2fs: Return EPERM from ext2_setattr() when an user without PRIV_VFS_SYSFLAGS privilege attempts to toggle SF_SETTABLE flags. Flags are now stored to ip->i_flags in one place after all checks. Also, remove SF_NOUNLINK from the checks because ext2fs doesn't support that flag. Reviewed by: bde	2012-04-13 05:48:31 +00:00
Jaakko Heinonen	e6b8bdf252	Restore the blank line incorrectly removed in r234104. Pointed out by: bde	2012-04-11 15:48:50 +00:00
Jaakko Heinonen	034efc61ba	Apply changes from r233787 to ext2fs: - Use more natural ip->i_flags instead of vap->va_flags in the final flags check. - Style improvements. No functional change intended. MFC after: 2 weeks	2012-04-10 16:05:52 +00:00
Attilio Rao	a0f2c37b6f	- Introduce a cache-miss optimization for consistency with other accesses of the cache member of vm_object objects. - Use novel vm_page_is_cached() for checks outside of the vm subsystem. Reviewed by: alc MFC after: 2 weeks X-MFC: r234039	2012-04-09 17:05:18 +00:00
Kirk McKusick	827e334c01	Add I/O accounting to msdos filesystem. Suggested and reviewed by: kib	2012-04-08 06:18:18 +00:00
Gleb Kurtsou	9295c62814	tmpfs supports only INT_MAX nodes due to limitations of unit number allocator. Replace UINT32_MAX checks with INT_MAX. Keeping more than 2^31 nodes in memory is not likely to become possible in foreseeable feature and would require new unit number allocator. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:30:46 +00:00
Gleb Kurtsou	0ff93c48da	Add vfs_getopt_size. Support human readable file system options in tmpfs. Increase maximum tmpfs file system size to 4GB*PAGE_SIZE on 32 bit archs. Discussed with: delphij MFC after: 2 weeks	2012-04-07 15:27:34 +00:00
Gleb Kurtsou	da7aa2778e	Add reserved memory limit sysctl to tmpfs. Cleanup availble and used memory functions. Check if free pages available before allocating new node. Discussed with: delphij	2012-04-07 15:23:51 +00:00
Konstantin Belousov	a53373fabe	Add sysctl vfs.nfs.nfs_keep_dirty_on_error to switch the nfs client behaviour on error from write RPC back to behaviour of old nfs client. When set to not zero, the pages for which write failed are kept dirty. PR: kern/165927 Reviewed by: alc MFC after: 2 weeks	2012-03-17 23:03:20 +00:00
Gleb Kurtsou	db94ad126a	Prevent tmpfs_rename() deadlock in a way similar to UFS Unlock vnodes and try to lock them one by one. Relookup fvp and tvp. Approved by: mdf (mentor)	2012-03-14 09:15:50 +00:00
Gleb Kurtsou	ca846258e2	Don't enforce LK_RETRY to get existing vnode in tmpfs_alloc_vp() Doomed vnode is hardly of any use here, besides all callers handle error case. vfs_hash_get() does the same. Don't mess with vnode holdcount, vget() takes care of it already. Approved by: mdf (mentor)	2012-03-14 08:29:21 +00:00
Kevin Lo	11753bd018	Use NULL instead of 0	2012-03-13 10:04:13 +00:00
Konstantin Belousov	0e738b4c0f	Update comment. Submitted by: gianni	2012-03-11 15:58:27 +00:00
Konstantin Belousov	b80dcb55aa	Remove fifo.h. The only used function declaration from the header is migrated to sys/vnode.h. Submitted by: gianni	2012-03-11 12:19:58 +00:00
Pedro F. Giffuni	035e4e0494	Add support for ns timestamps and birthtime to the ext2/3 driver. When using big inodes there is sufficient space in ext3 to keep extra resolution and birthtime (creation) timestamps. The appropriate fields in the on-disk inode have been approved for a long time but support for this in ext3 has not been widely distributed. In preparation for ext4 most linux distributions have enabled by default such bigger inodes and some people use nanosecond timestamps in ext3. We now support those when the inode is big enough and while we do recognize the EXT4F_ROCOMPAT_EXTRA_ISIZE, we maintain the extra timestamps even when they are not used. An additional note by Bruce Evans: We blindly accept unrepresentable tv_nsec in VOP_SETATTR(), but all file systems have always done that. When POSIX gets around to specifying the behaviour, it will probably require certain rounding to the fs's resolution and not rejecting the request. This unfortunately means that syscalls that set times can't really tell if they succeeded without reading back the times using stat() or similar and checking that they were set close enough. Reviewed by: bde Approved by: jhb (mentor) MFC after: 2 weeks	2012-03-08 21:06:05 +00:00
John Baldwin	b47f624183	Add KTR_VFS traces to track modifications to a vnode's writecount.	2012-03-08 20:27:20 +00:00
Konstantin Belousov	f950879e16	The pipe_poll() performs lockless access to the vnode to test fifo_iseof() condition, allowing the v_fifoinfo to be reset and freed by fifo_cleanup(). Precalculate EOF at the places were fo_wgen is changed, and cache the state in a new pipe state flag PIPE_SAMEWGEN. Reported and tested by: bf Submitted by: gianni MFC after: 1 week (a backport)	2012-03-07 07:31:50 +00:00
Konstantin Belousov	31452ff75e	Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs. Reported and tested by: pho MFC after: 2 weeks	2012-03-05 11:38:02 +00:00
Konstantin Belousov	ea4072446b	Remove unneeded cast to u_int. The values as small enough to fit into int, beside the use of MIN macro which performs type promotions. Submitted by: bde MFC after: 3 weeks	2012-03-04 14:51:42 +00:00
Kevin Lo	c225ad032d	Remove unnecessary casts	2012-03-04 09:48:58 +00:00
Kevin Lo	dd104b3305	Clean up style(9) nits	2012-03-04 09:38:20 +00:00
Rick Macklem	b76ec2db93	The name caching changes of r230394 exposed an intermittent bug in the new NFS server for NFSv4, where it would report ENOENT when the file actually existed on the server. This turned out to be caused by not initializing ni_topdir before calling lookup() and there was a rare case where the value on the stack location assigned to ni_topdir happened to be a pointer to a ".." entry, such that "dp == ndp->ni_topdir" succeeded in lookup(). This patch initializes ni_topdir to fix the problem. MFC after: 5 days	2012-03-03 16:13:20 +00:00
Rick Macklem	5e99212d36	Post r230394, the Lookup RPC counts for both NFS clients increased significantly. Upon investigation this was caused by name cache misses for lookups of "..". For name cache entries for non-".." directories, the cache entry serves double duty. It maps both the named directory plus ".." for the parent of the directory. As such, two ctime values (one for each of the directory and its parent) need to be saved in the name cache entry. This patch adds an entry for ctime of the parent directory to the name cache. It also adds an additional uma zone for large entries with this time value, in order to minimize memory wastage. As well, it fixes a couple of cases where the mtime of the parent directory was being saved instead of ctime for positive name cache entries. With this patch, Lookup RPC counts return to values similar to pre-r230394 kernels. Reported by: bde Discussed with: kib Reviewed by: jhb MFC after: 2 weeks	2012-03-03 01:06:54 +00:00
John Baldwin	58d65e8031	Similar to the fixes in 226967 and 226987, purge any name cache entries associated with the previous vnode (if any) associated with the target of a rename(). Otherwise, a lookup of the target pathname concurrent with a rename() could re-add a name cache entry after the namei(RENAME) lookup in kern_renameat() had purged the target pathname. MFC after: 2 weeks	2012-03-02 18:55:19 +00:00
Konstantin Belousov	66f02f4b25	Do not expose unlocked unconstructed nullfs vnode on mount list. Lock the native nullfs vnode lock before switching the locks. Tested by: pho MFC after: 1 week	2012-03-02 09:48:46 +00:00
Rick Macklem	4cf7d12840	Fix the NFS clients so that they use copyin() instead of bcopy(), when doing direct I/O. This direct I/O code is not enabled by default. Submitted by: kib (earlier version) Reviewed by: kib MFC after: 1 week	2012-03-01 03:53:07 +00:00
Martin Matuska	b362f0a78b	Add "export" to devfs_opts[] and return EOPNOTSUPP if called with it. Fixes mountd warnings. Reported by: kib MFC after: 1 week	2012-02-29 16:16:36 +00:00
Konstantin Belousov	37a1046e61	Allow shared locks for reads when lower filesystem accept shared locking. Tested by: pho MFC after: 1 week	2012-02-29 15:18:53 +00:00
Konstantin Belousov	cec1d07726	Document that null_nodeget() cannot take shared-locked lowervp due to insmntque() requirements. Tested by: pho MFC after: 1 week	2012-02-29 15:18:04 +00:00
Konstantin Belousov	409b12c08a	In null_reclaim(), assert that reclaimed vnode is fully constructed, instead of accepting half-constructed vnode. Previous code cannot decide what to do with such vnode anyway, and although processing it for hash removal, paniced later when getting rid of nullfs reference on lowervp. While there, remove initializations from the declaration block. Tested by: pho MFC after: 1 week	2012-02-29 15:15:36 +00:00
Konstantin Belousov	e4e1d9f382	Always request exclusive lock for the lower vnode in nullfs_vget(). The null_nodeget() requires exclusive lock on lowervp to be able to insmntque() new vnode. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:09:20 +00:00
Konstantin Belousov	67e3d54f80	Move the code to destroy half-contructed nullfs vnode into helper function null_destroy_proto() from null_insmntque_dtr(). Also apply null_destroy_proto() in null_nodeget() when we raced and a vnode is found in the hash, so the currently allocated protonode shall be destroyed. Lock the vnode interlock around reassigning the v_vnlock. In fact, this path will not be exercised after several later commits, since null_nodeget() cannot take shared-locked lowervp at all due to insmntque() requirements. Reported by: rea Tested by: pho MFC after: 1 week	2012-02-29 15:06:00 +00:00
Konstantin Belousov	da732fc69f	Merge a split multi-line comment. MFC after: 1 week	2012-02-29 14:43:27 +00:00
Martin Matuska	41c0675e6e	Add procfs to jail-mountable filesystems. Reviewed by: jamie MFC after: 1 week	2012-02-29 00:30:18 +00:00
Kevin Lo	19b029a487	Remove an unused structure and unnecessary cast	2012-02-24 07:30:44 +00:00
Kevin Lo	a61d3d5a99	Check if the user has necessary permissions on the device	2012-02-24 07:29:06 +00:00
Martin Matuska	bf3db8aa65	To improve control over the use of mount(8) inside a jail(8), introduce a new jail parameter node with the following parameters: allow.mount.devfs: allow mounting the devfs filesystem inside a jail allow.mount.nullfs: allow mounting the nullfs filesystem inside a jail Both parameters are disabled by default (equals the behavior before devfs and nullfs in jails). Administrators have to explicitly allow mounting devfs and nullfs for each jail. The value "-1" of the devfs_ruleset parameter is removed in favor of the new allow setting. Reviewed by: jamie Suggested by: pjd MFC after: 2 weeks	2012-02-23 18:51:24 +00:00
Kip Macy	11ac7ec076	merge pipe and fifo implementations Also reviewed by: jhb, jilles (initial revision) Tested by: pho, jilles Submitted by: gianni Reviewed by: bde	2012-02-23 18:37:30 +00:00
Rick Macklem	7cfce7cec7	hrs@ reported a panic to freebsd-stable@ under the subject line "panic in 8.3-PRERELEASE" on Feb. 22, 2012. This panic was caused by use of a mix of tsleep() and msleep() calls on the same event in the new NFS server DRC code. It did "mtx_unlock(); tsleep();" in two places, which kib@ noted introduced a slight risk that the wakeup() would occur before the tsleep(), resulting in a 10sec delay before waking up. This patch fixes the problem by replacing "mtx_unlock(); tsleep();" with mtx_sleep(..PDROP..). It also changes a nfsmsleep() call to mtx_sleep() so that the code uses mtx_sleep() consistently within the file. Tested by: hrs (in progress) Reviewed by: jhb MFC after: 5 days	2012-02-23 16:47:05 +00:00
Konstantin Belousov	2aacee7779	Use DOINGASYNC() to test for async allowance, to honor VFS syncing requests. Noted by: bde MFC after: 1 week	2012-02-22 13:01:17 +00:00
Konstantin Belousov	526d0bd547	Fix found places where uio_resid is truncated to int. Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from the usermode. Discussed with: bde, das (previous versions) MFC after: 1 month	2012-02-21 01:05:12 +00:00
Kevin Lo	3d74f18ba4	Remove an unnecessary cast.	2012-02-20 09:56:14 +00:00
Bjoern A. Zeeb	9dba179d5e	IFC @231845 Sponsored by: Cisco Systems, Inc.	2012-02-17 00:27:48 +00:00
Rick Macklem	13b2772f8e	Delete a couple of out of date comments that are no longer true in the new NFS client. Requested by: bde MFC after: 1 week	2012-02-16 02:19:53 +00:00
Tijl Coosemans	0662ee9826	Replace PRIdMAX with "jd" in a printf call. Cast the corresponding value to intmax_t instead of uintmax_t, because the original type is off_t.	2012-02-14 11:24:24 +00:00
Ed Schouten	8fac9b7b7d	Merge si_name and __si_namebuf. The si_name pointer always points to the __si_namebuf member inside the same object. Remove it and rename __si_namebuf to si_name.	2012-02-10 12:40:50 +00:00
Martin Matuska	61f0e25abf	Allow mounting nullfs(5) inside jails. This is now possible thanks to r230129. MFC after: 1 month	2012-02-09 10:39:01 +00:00
Martin Matuska	0cc207a6f5	Add support for mounting devfs inside jails. A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for mounting devfs inside jails. A value of -1 disables mounting devfs in jails, a value of zero means no restrictions. Nested jails can only have mounting devfs disabled or inherit parent's enforcement as jails are not allowed to view or manipulate devfs(8) rules. Utilizes new functions introduced in r231265. Reviewed by: jamie MFC after: 1 month	2012-02-09 10:22:08 +00:00
Martin Matuska	17d84d611f	Introduce the "ruleset=number" option for devfs(5) mounts. Add support for updating the devfs mount (currently only changing the ruleset number is supported). Check mnt_optnew with vfs_filteropt(9). This new option sets the specified ruleset number as the active ruleset of the new devfs mount and applies all its rules at mount time. If the specified ruleset doesn't exist, a new empty ruleset is created. MFC after: 1 month	2012-02-09 10:09:12 +00:00
Pedro F. Giffuni	3cc6ae1f57	Update the data structures with some fields reserved for ext4 but that can be used in ext3 mode. Also adjust the internal inode to carry the birthtime, like in UFS, which is starting to get some use when big inodes are available. Right now these are just placeholders for features to come. Approved by: jhb (mentor) MFC after: 2 weeks	2012-02-07 22:31:28 +00:00
Rick Macklem	8c9c322347	r228827 fixed a problem where copying of NFSv4 open credentials into a credential structure would corrupt it. This happened when the p argument was != NULL. However, I now realize that the copying of open credentials should only happen for p == NULL, since that indicates that it is a read-ahead or write-behind. This patch fixes this. After this commit, r228827 could be reverted, but I think the code is clearer and safer with the patch, so I am going to leave it in. Without this patch, it was possible that a NFSv4 VOP_SETATTR() could have changed the credentials of the caller. This would have happened if the process doing the VOP_SETATTR() did not have the file open, but some other process running as a different uid had the file open for writing at the same time. MFC after: 5 days	2012-02-07 16:32:43 +00:00
John Baldwin	bf40d24a3f	Rename cache_lookup_times() to cache_lookup() and retire the old API and ABI stub for cache_lookup().	2012-02-06 17:00:28 +00:00
Konstantin Belousov	c480f781ea	Current implementations of sync(2) and syncer vnode fsync() VOP uses mnt_noasync counter to temporary remove MNTK_ASYNC mount option, which is needed to guarantee a synchronous completion of the initiated i/o before syscall or VOP return. Global removal of MNTK_ASYNC option is harmful because not only i/o started from corresponding thread becomes synchronous, but all i/o is synchronous on the filesystem which is initiated during sync(2) or syncer activity. Instead of removing MNTK_ASYNC from mnt_kern_flag, provide a local thread flag to disable async i/o for current thread only. Use the opportunity to move DOINGASYNC() macro into sys/vnode.h and consistently use it through places which tested for MNTK_ASYNC. Some testing demonstrated 60-70% improvements in run time for the metadata-intensive operations on async-mounted UFS volumes, but still with great deviation due to other reasons. Reviewed by: mckusick Tested by: scottl MFC after: 2 weeks	2012-02-06 11:04:36 +00:00
Bjoern A. Zeeb	81d5d46b3c	Add multi-FIB IPv6 support to the core network stack supplementing the original IPv4 implementation from r178888: - Use RT_DEFAULT_FIB in the IPv4 implementation where noticed. - Use rtfib() KPI with explicit RT_DEFAULT_FIB where applicable in the NFS code. - Use the new in6_rt KPI in TCP, gif(4), and the IPv6 network stack where applicable. - Split in6_rtqtimo() and in6_mtutimo() as done in IPv4 and equally prevent multiple initializations of callouts in in6_inithead(). - Use wrapper functions where needed to preserve the current KPI to ease MFCs. Use BURN_BRIDGES to indicate expected future cleanup. - Fix (related) comments (both technical or style). - Convert to rtinit() where applicable and only use custom loops where currently not possible otherwise. - Multicast group, most neighbor discovery address actions and faith(4) are locked to the default FIB. Individual IPv6 addresses will only appear in the default FIB, however redirect information and prefixes of connected subnets are automatically propagated to all FIBs by default (mimicking IPv4 behavior as closely as possible). Sponsored by: Cisco Systems, Inc.	2012-02-03 13:08:44 +00:00
Rick Macklem	87b633678b	When a "mount -u" switches an NFS mount point from TCP to UDP, any thread doing an I/O RPC with a transfer size greater than NFS_UDPMAXDATA will be hung indefinitely, retrying the RPC. After a discussion on freebsd-fs@, I decided to add a warning message for this case, as suggested by Jeremy Chadwick. Suggested by: freebsd at jdc.parodius.com (Jeremy Chadwick) MFC after: 2 weeks	2012-01-31 03:58:26 +00:00
Rick Macklem	7f763fc39c	A problem with respect to data read through the buffer cache for both NFS clients was reported to freebsd-fs@ under the subject "NFS corruption in recent HEAD" on Nov. 26, 2011. This problem occurred when a TCP mounted root fs was changed to using UDP. I believe that this problem was caused by the change in mnt_stat.f_iosize that occurred because rsize was decreased to the maximum supported by UDP. This patch fixes the problem by using v_bufobj.bo_bsize instead of f_iosize, since the latter is set to f_iosize when the vnode is allocated, but does not change for a given vnode when f_iosize changes. Reported by: pjd Reviewed by: kib MFC after: 2 weeks	2012-01-27 02:46:12 +00:00
Rick Macklem	0149d177fb	Revert r230516, since it doesn't really fix the problem.	2012-01-26 00:07:34 +00:00
Konstantin Belousov	d5210589b7	Fix remaining calls to cache_enter() in both NFS clients to provide appropriate timestamps. Restore the assertions which verify that NCF_TS is set when timestamp is asked for. Reviewed by: jhb (previous version) MFC after: 2 weeks	2012-01-25 20:48:20 +00:00
John Baldwin	0b17c7bea5	Add a timeout on positive name cache entries in the NFS client. That is, we will only trust a positive name cache entry for a specified amount of time before falling back to a LOOKUP RPC, even if the ctime for the file handle matches the cached copy in the name cache entry. The timeout is configured via a new 'nametimeo' mount option and defaults to 60 seconds. It may be set to zero to disable positive name caching entirely. Reviewed by: rmacklem MFC after: 1 week	2012-01-25 20:05:58 +00:00
Rick Macklem	6403723880	If a mount -u is done to either NFS client that switches it from TCP to UDP and the rsize/wsize/readdirsize is greater than NFS_MAXDGRAMDATA, it is possible for a thread doing an I/O RPC to get stuck repeatedly doing retries. This happens because the RPC will use a resize/wsize/readdirsize that won't work for UDP and, as such, it will keep failing indefinitely. This patch returns an error for this case, to avoid the problem. A discussion on freebsd-fs@ seemed to indicate that returning an error was preferable to silently ignoring the "udp"/"mntudp" option. This problem was discovered while investigating a problem reported by pjd@ via email. MFC after: 2 weeks	2012-01-25 00:22:53 +00:00
John Baldwin	5aefb4cbbf	Close a race in NFS lookup processing that could result in stale name cache entries on one client when a directory was renamed on another client. The root cause for the stale entry being trusted is that each per-vnode nfsnode structure has a single 'n_ctime' timestamp used to validate positive name cache entries. However, if there are multiple entries for a single vnode, they all share a single timestamp. To fix this, extend the name cache to allow filesystems to optionally store a timestamp value in each name cache entry. The NFS clients now fetch the timestamp associated with each name cache entry and use that to validate cache hits instead of the timestamps previously stored in the nfsnode. Another part of the fix is that the NFS clients now use timestamps from the post-op attributes of RPCs when adding name cache entries rather than pulling the timestamps out of the file's attribute cache. The latter is subject to races with other lookups updating the attribute cache concurrently. Some more details: - Add a variant of nfsm_postop_attr() to the old NFS client that can return a vattr structure with a copy of the post-op attributes. - Handle lookups of "." as a special case in the NFS clients since the name cache does not store name cache entries for ".", so we cannot get a useful timestamp. It didn't really make much sense to recheck the attributes on the the directory to validate the namecache hit for "." anyway. - ABI compat shims for the name cache routines are present in this commit so that it is safe to MFC. MFC after: 2 weeks	2012-01-20 20:02:01 +00:00
Rick Macklem	23b3566364	Martin Cracauer reported a problem to freebsd-current@ under the subject "Data corruption over NFS in -current". During investigation of this, I came across an ugly bogusity in the new NFS client where it replaced the cr_uid with the one used for the mount. This was done so that "system operations" like the NFSv4 Renew would be performed as the user that did the mount. However, if any other thread shares the credential with the one doing this operation, it could do an RPC (or just about anything else) as the wrong cr_uid. This patch fixes the above, by using the mount credentials instead of the one provided as an argument for this case. It appears to have fixed Martin's problem. This patch is needed for NFSv4 mounts and NFSv3 mounts against some non-FreeBSD servers that do not put post operation attributes in the NFSv3 Statfs RPC reply. Tested by: Martin Cracauer (cracauer at cons.org) Reviewed by: jhb MFC after: 2 weeks	2012-01-20 00:58:51 +00:00
Eygene Ryabinkin	15c75a0d9f	Subject: NULLFS: properly destroy node hash Use hashdestroy() instead of naive free(). Approved by: kib MFC after: 2 weeks	2012-01-18 11:23:46 +00:00
Kevin Lo	e0d3195bd6	Return EOPNOTSUPP since we only support update mounts for NFS export. Spotted by: trociny	2012-01-17 01:25:53 +00:00
Kirk McKusick	cc672d3599	Make sure all intermediate variables holding mount flags (mnt_flag) and that all internal kernel calls passing mount flags are declared as uint64_t so that flags in the top 32-bits are not lost. MFC after: 2 weeks	2012-01-17 01:08:01 +00:00
Kevin Lo	57eb5548c9	Add nfs export support to tmpfs(5) Reviewed by: kib	2012-01-16 10:25:22 +00:00
Alan Cox	0b05cac3d2	When tmpfs_write() resets an extended file to its original size after an error, we want tmpfs_reg_resize() to ignore I/O errors and unconditionally update the file's size. Reviewed by: kib MFC after: 3 weeks	2012-01-16 00:26:49 +00:00
Mikolaj Golub	fe7f89b71a	Abrogate nchr argument in proc_getargv() and proc_getenvv(): we always want to read strings completely to know the actual size. As a side effect it fixes the issue with kern.proc.args and kern.proc.env sysctls, which didn't return the size of available data when calling sysctl(3) with the NULL argument for oldp. Note, in get_ps_strings(), which does actual work for proc_getargv() and proc_getenvv(), we still have a safety limit on the size of data read in case of a corrupted procces stack. Suggested by: kib MFC after: 3 days	2012-01-15 18:47:24 +00:00
Ulrich Spörlein	9a14aa017b	Convert files to UTF-8	2012-01-15 13:23:18 +00:00
Alan Cox	93431cb74c	Neither tmpfs_nocacheread() nor tmpfs_mappedwrite() needs to call vm_object_pip_{add,subtract}() on the swap object because the swap object can't be destroyed while the vnode is exclusively locked. Moreover, even if the swap object could have been destroyed during tmpfs_nocacheread() and tmpfs_mappedwrite() this code is broken because vm_object_pip_subtract() does not wake up the sleeping thread that is trying to destroy the swap object. Free invalid pages after an I/O error. There is no virtue in keeping them around in the swap object creating more work for the page daemon. (I believe that any non-busy page in the swap object will now always be valid.) vm_pager_get_pages() does not return a standard errno, so its return value should not be returned by tmpfs without translation to an errno value. There is no reason for the wakeup on vpg in tmpfs_mappedwrite() to occur with the swap object locked. Eliminate printf()s from tmpfs_nocacheread() and tmpfs_mappedwrite(). (The swap pager already spam your console if data corruption is imminent.) Reviewed by: kib MFC after: 3 weeks	2012-01-14 23:04:27 +00:00
Rick Macklem	5b79362b47	Tai Horgan reported via email that there were two places in the new NFSv4 server where the code follows the wrong list. Fortunately, for these fairly rare cases, the lc_stateid[] lists are normally empty. This patch fixes the code to follow the correct list. Reported by: tai.horgan at isilon.com Discussed with: zack MFC after: 2 weeks	2012-01-14 04:04:58 +00:00
Rick Macklem	a16cd9c05e	jwd@ reported via email that the "CacheSize" field reported by "nfsstat -e -s" would go negative after using the "-z" option to zero out the stats. This patch fixes that by not zeroing out the srvcache_size field for "-z", since it is the size of the cache and not a counter. MFC after: 2 weeks	2012-01-11 02:46:42 +00:00
Alan Cox	2971897d51	Correct an error of omission in the implementation of the truncation operation on POSIX shared memory objects and tmpfs. Previously, neither of these modules correctly handled the case in which the new size of the object or file was not a multiple of the page size. Specifically, they did not handle partial page truncation of data stored on swap. As a result, stale data might later be returned to an application. Interestingly, a data inconsistency was less likely to occur under tmpfs than POSIX shared memory objects. The reason being that a different mistake by the tmpfs truncation operation helped avoid a data inconsistency. If the data was still resident in memory in a PG_CACHED page, then the tmpfs truncation operation would reactivate that page, zero the truncated portion, and leave the page pinned in memory. More precisely, the benevolent error was that the truncation operation didn't add the reactivated page to any of the paging queues, effectively pinning the page. This page would remain pinned until the file was destroyed or the page was read or written. With this change, the page is now added to the inactive queue. Discussed with: jhb Reviewed by: kib (an earlier version) MFC after: 3 weeks	2012-01-08 20:09:26 +00:00
Rick Macklem	f725864490	opt_inet6.h was missing from some files in the new NFS subsystem. The effect of this was, for clients mounted via inet6 addresses, that the DRC cache would never have a hit in the server. It also broke NFSv4 callbacks when an inet6 address was the only one available in the client. This patch fixes the above, plus deletes opt_inet6.h from a couple of files it is not needed for. MFC after: 2 weeks	2012-01-08 01:54:46 +00:00
Jaakko Heinonen	d467c9472a	r222004 changed sbuf_finish() to not clear the buffer error status. As a consequence sbuf_len() will return -1 for buffers which had the error status set prior to sbuf_finish() call. This causes a problem in pfs_read() which purposely uses a fixed size sbuf to discard bytes which are not needed to fulfill the read request. Work around the problem by using the full buffer length when sbuf_finish() indicates an overflow. An overflowed sbuf with fixed size is always full. PR: kern/163076 Approved by: des MFC after: 2 weeks	2012-01-06 10:12:59 +00:00
Jaakko Heinonen	9cb24e3c98	Check the return value of sbuf_finish() in pfs_readlink() and return ENAMETOOLONG if the buffer overflowed. Approved by: des MFC after: 2 weeks	2012-01-06 09:17:34 +00:00
Dimitry Andric	f39adedd5b	In sys/fs/nullfs/null_subr.c, in a KASSERT, output the correct vnode pointer 'lowervp' instead of 'vp', which is uninitialized at that point. Reviewed by: kib MFC after: 1 week	2012-01-05 17:06:04 +00:00
Konstantin Belousov	dd0f9532f3	Do the vput() for the lowervp in the null_nodeget() for error case too. Several callers of null_nodeget() did the cleanup itself, but several missed it, most prominent being null_bypass(). Remove the cleanup from the callers, now null_nodeget() handles lowervp free itself. Reported and tested by: pho MFC after: 1 week	2012-01-03 21:09:07 +00:00
Konstantin Belousov	48a1e3f624	Document the state of the lowervp vnode for null_nodeget(). Tested by: pho MFC after: 1 week	2012-01-03 21:03:20 +00:00
Pedro F. Giffuni	5eda6329b2	Minor cleanups to ntfs code bzero -> memset rename variables to avoid shadowing. PR: 142401 Obtained from: NetBSD Approved by jhb (mentor)	2012-01-03 19:09:01 +00:00
Alan Cox	04f883d798	Don't pass VM_ALLOC_ZERO to vm_page_grab() in tmpfs_mappedwrite() and tmpfs_nocacheread(). It is both unnecessary and a pessimization. It results in either the page being zeroed twice or zeroed first and then overwritten by an I/O operation. MFC after: 3 weeks	2012-01-03 03:29:01 +00:00
Ed Schouten	dc15eac046	Use strchr() and strrchr(). It seems strchr() and strrchr() are used more often than index() and rindex(). Therefore, simply migrate all kernel code to use it. For the XFS code, remove an empty line to make the code identical to the code in the Linux kernel.	2012-01-02 12:12:10 +00:00
Ed Schouten	8f8d30274a	Migrate ufs and ext2fs from skpc() to memcchr(). While there, remove a useless check from the code. memcchr() always returns characters unequal to 0xff in this case, so inosused[i] ^ 0xff can never be equal to zero. Also, the fact that memcchr() returns a pointer instead of the number of bytes until the end, makes conversion to an offset far more easy.	2012-01-01 20:47:33 +00:00
Kevin Lo	824be4a073	Discard local array based on return values. Pointed out by: uqs Found with: Coverity Prevent(tm) CID: 10089	2011-12-24 15:49:52 +00:00
Rick Macklem	f855a3c570	During investigation of an NFSv4 client crash reported by glebius@, jhb@ spotted that nfscl_getstateid() might modify credentials when called from nfsrpc_read() for the case where p != NULL, whereas nfsrpc_read() only did a crdup() to get new credentials for p == NULL. This bug was introduced by r195510, since pre-r195510 nfscl_getstateid() only modified credentials for the p == NULL case. This patch modifies nfsrpc_read()/nfsrpc_write() so that they do crdup() for the p != NULL case. It is conceivable that this bug caused the crash reported by glebius@, but that will not be determined for some time, since the crash occurred after about 1month of operation. Tested by: glebius Reviewed by: jhb MFC after: 2 weeks	2011-12-23 02:04:35 +00:00
Kevin Lo	e2ee19e346	Discarding local array based on return values	2011-12-22 06:31:29 +00:00
Rick Macklem	713f46ac47	jwd@ reported a problem via email where the old NFS client would get a reply of EEXIST from an NFS server when a Mkdir RPC was retried, for an NFS over UDP mount. Upon investigation, it was found that the client was retransmitting the Mkdir RPC request over UDP, but with a different xid. As such, the retransmitted message would miss the Duplicate Request Cache in the server, causing it to reply EEXIST. The kernel client side UDP rpc code has two timers. The first one causes a retransmit using the same xid and socket and was set to a fixed value of 3seconds. (The default can be overridden via CLSET_RETRY_TIMEOUT.) The second one creates a new socket and xid and should be larger than the first. However, both NFS clients were setting the second timer to nm_timeo ("timeout=<value>" mount argument), which defaulted to 1second, so the first timer would never time out. This patch fixes both NFS clients so that they set the first timer using nm_timeo and makes the second timer larger than the first one. Reported by: jwd Tested by: jwd Reviewed by: jhb MFC after: 2 weeks	2011-12-21 02:45:51 +00:00

1 2 3 4 5 ...

2875 Commits