freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	245b204491	When restoring the mount after umount failed, the MNTK_UNMOUNT flag prevents insmntque() from placing reallocated syncer vnode on mount list, that causes panic in vfs_allocate_syncvnode(). Introduce MNTK_NOINSMNTQ flag, that marks the period when insmntque() is not allowed to succeed, instead of MNTK_UNMOUNT. The MNTK_NOINSMNTQ is set and cleared simultaneously with MNTK_UNMOUNT, except on umount error path, where it is cleared just before the syncer vnode is going to be allocated. Reported by: Peter Jeremy <peterjeremy optushome com au> Suggested by: tegge Approved by: re (rwatson)	2007-09-12 16:31:32 +00:00
John Baldwin	1dc5b1cc56	On 6.x this works: % mount | grep home /dev/ad4s1e on /home (ufs, local, noatime, soft-updates) % mount -u -o atime /home % mount | grep home /dev/ad4s1e on /home (ufs, local, soft-updates) Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow In addition, on 7.x, the following are equivalent: mount -u -o atime /home mount -u -o nonoatime /home Ideally, when we introduce new mount options, we should avoid options starting with "no". :) Requested by: jhb Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com> Approved by: re (bmah) Proxy commit for: rodrigc	2007-08-15 17:40:09 +00:00
Pawel Jakub Dawidek	68c1a246ae	The v_mountedhere field is protected by the vnode lock, not vnode's internal lock. Approved by: re (rwatson)	2007-07-26 16:52:57 +00:00
Craig Rodrigues	d7f81adbd4	Revert previous commits which I committed by mistake. Approved by: re (implicit) Pointy hat to: me	2007-07-14 21:23:31 +00:00
Craig Rodrigues	d678780e60	The last entry in the ext2_opts array must be NULL, otherwise the kernel will crash in vfs_filteropt() if an invalid mount option is passed to ext2fs. Approved by: re (kensmith)	2007-07-14 21:18:19 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Konstantin Belousov	e5ea32c290	Allow the dounmount() to proceed even for doomed coveredvp. In dounmount(), before or while vn_lock(coveredvp) is called, coveredvp vnode may be VI_DOOMED due to one of the following: - other thread finished unmount and vput()ed it, and vnode was chosen for recycling, while vn_lock() slept; - forced unmount of the coveredvp->v_mount fs. In the first case, next check for changed v_mountedhere or mnt_gen counter would be successful. In the second case, the unmount shall be allowed. Submitted by: sobomax MFC after: 2 weeks	2007-04-26 08:56:56 +00:00
Pawel Jakub Dawidek	7760d8409f	Export vfs_mount_alloc() as it is used in ZFS.	2007-04-17 21:14:06 +00:00
Pawel Jakub Dawidek	24b0502ee0	Fix jails and jail-friendly file systems handling: - We need to allow for PRIV_VFS_MOUNT_OWNER inside a jail. - Move security checks to vfs_suser() and deny unmounting and updating for jailed root from different jails, etc. OK'ed by: rwatson	2007-04-13 23:54:22 +00:00
Nate Lawson	a363f67a81	Restore the locking for the sleep/wakeup to avoid waiting an extra 1 sec if a race was lost. We're still single-threaded at this point, but just be safe for the future.	2007-04-09 21:10:04 +00:00
Nate Lawson	6b1e469ea5	Clean up the root mount and mount wait code. No mutexes are needed here since a spurious wakeup() is the only possible outcome and this is fine in the BSD programming model.	2007-04-09 19:23:52 +00:00
Pawel Jakub Dawidek	2eb68d493f	Add root_mounted() function that returns true if the root file system is already mounted.	2007-04-08 23:54:01 +00:00
Pawel Jakub Dawidek	f3a8d2f93c	Add security.jail.mount_allowed sysctl, which allows mounting and unmounting jail-friendly file systems from within a jail. Precisely, it grants PRIV_VFS_MOUNT, PRIV_VFS_UNMOUNT and PRIV_VFS_MOUNT_NONUSER privileges for a jailed super-user. It is turned off by default. A jail-friendly file system is a file system whose driver registers itself with VFCF_JAIL flag via VFS_SET(9) API. The lsvfs(1) command can be used to see which file systems are jail-friendly ones. There are currently no jail-friendly file systems; ZFS will be the first one. In the future we may consider marking file systems like nullfs as jail-friendly. Reviewed by: rwatson	2007-04-05 21:03:05 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Pawel Jakub Dawidek	afd894bb12	Add root_mount_wait() function which can be used to wait until the root file system is mounted. This is useful for kernel modules loaded from /boot/loader.conf, that have to access file system.	2007-04-03 11:45:28 +00:00
Pawel Jakub Dawidek	5c1c2e82e2	I think the code I'm removing here is completely bogus. vfs_flags field is used for VFCF_* flags which are given at file system driver creation time (via the VFS_SET(9) macro). What this code did was basically this: If file system registers itself with VFCF_UNICODE flag (stores file names as Unicode), it will gain MNT_SOFTDEP flag (UFS soft-updates). If file system registers itself with VFCF_LOOPBACK flag (aliases some other mounted FS), it will gain MNT_SUIDDIR flag (special handling of SUID on dirs). The latter will be quite dangerous, but those flags are reset later in vfs_domount(). MFC after: 1 month	2007-04-01 13:08:05 +00:00
Pawel Jakub Dawidek	695919ad9a	Make vfs_mount_destroy() and vfs_freeopts() non-static, I'd like to use them.	2007-03-31 22:44:45 +00:00
Pawel Jakub Dawidek	9a2fd584b4	Don't deny unmounting file systems for jailed processes immediately, allow prison_priv_check() to decide what to do. This change is supposed not to change current (security) behaviour in any way. This change is similar to the change of PRIV_VFS_MOUNT in previous revision.	2007-03-18 02:39:19 +00:00
Pawel Jakub Dawidek	7533652025	Don't deny mounting for jailed processes immediately, allow prison_priv_check() to decide what to do. This change is supposed not to change current (security) behaviour in any way. Reviewed by: rwatson	2007-03-14 13:09:59 +00:00
Pawel Jakub Dawidek	f7d4e990c7	White space nits.	2007-03-14 12:54:10 +00:00
Robert Watson	873fbcd776	Further system call comment cleanup: - Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde) - Remove extra blank lines in some cases. - Add extra blank lines in some cases. - Remove no-op comments consisting solely of the function name, the word "syscall", or the system call name. - Add punctuation. - Re-wrap some comments.	2007-03-05 13:10:58 +00:00
Olivier Houchard	38cc2a5caa	Make vfs_getopts() set *error to ENOENT if the option wasn't found, so that consumers don't have to check for both error and the return value (some of them actually don't do it). MFC After: 1 week	2007-02-13 01:28:48 +00:00
Craig Rodrigues	2892f3bbfa	Add a function vfs_deleteopt() which searches through the vfsoptlist linked list of mount options by name, and deletes the option if it finds it.	2006-12-16 15:44:03 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Robert Watson	aed5570872	Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now contains the userspace and user<->kernel API and definitions, with all in-kernel interfaces moved to mac_framework.h, which is now included across most of the kernel instead. This change is the first step in a larger cleanup and sweep of MAC Framework interfaces in the kernel, and will not be MFC'd. Obtained from: TrustedBSD Project Sponsored by: SPARTA	2006-10-22 11:52:19 +00:00
Konstantin Belousov	30af71199e	Fix the remaining race in the revs. 1.232, 1.233 that could occur during unmount when mp structure is reused while waiting for coveredvp lock. Introduce struct mount generation count, increment it on each reuse and compare the generations before and after obtaining the coveredvp lock. Reviewed by: tegge, pjd Approved by: pjd (mentor) MFC after: 2 weeks	2006-10-03 10:47:04 +00:00
Poul-Henning Kamp	f645b0b51c	First part of a little cleanup in the calendar/timezone/RTC handling. Move relevant variables to <sys/clock.h> and fix #includes as necessary. Use libkern's much more time- & space-efficient BCD routines.	2006-10-02 12:59:59 +00:00
Tor Egge	e60c361218	Reduce fluctuations of mnt_flag to allow unlocked readers to get a slightly more consistent view.	2006-09-26 04:20:09 +00:00
Tor Egge	fba924ce9b	Don't restore MNT_QUOTA bit in mnt_flag after a failed mount with MNT_UPDATE flag, closing a race between nmount() and quotactl().	2006-09-26 04:18:36 +00:00
Tor Egge	a1e363f256	Add mnt_noasync counter to better handle interleaved calls to nmount(), sync() and sync_fsync() without losing MNT_ASYNC. Add MNTK_ASYNC flag which is set only when MNT_ASYNC is set and mnt_noasync is zero, and check that flag instead of MNT_ASYNC before initiating async io.	2006-09-26 04:15:59 +00:00
Tor Egge	cea9d840d8	Don't restore mnt_kern_flag on failed MNT_UPDATE mount, it can race with dounmount(), causing loss of MNTK_UNMOUNT flag.	2006-09-26 04:15:04 +00:00
Tor Egge	5da56ddb21	Use mount interlock to protect all changes to mnt_flag and mnt_kern_flag. This eliminates a race where MNT_UPDATE flag could be lost when nmount() raced against sync(), sync_fsync() or quotactl().	2006-09-26 04:12:49 +00:00
Konstantin Belousov	f37e633887	Fix the bug in rev. 1.232. If vfs_suser returned false, coveredvp shall be unlocked only if it really exists. Found with: Coverity Prevent(tm) CID: 1535 Approved by: pjd (mentor)	2006-09-19 14:04:12 +00:00
Konstantin Belousov	4dec8579bd	Fix the race while waiting for coveredvp lock during unmount. The vnode may be recycled during the sleep, wrap the vn_lock with vhold/vdrop. Check that coveredvp still points to the same mp after sleep (needed because sleep dropped Giant). Move check for user rights for unmount after coveredvp lock is obtained. Tested by: Peter Holm Reviewed by: tegge Approved by: kan (mentor) MFC after: 2 weeks	2006-09-18 15:35:22 +00:00
Marius Strobl	aed760ef8a	Fix another bug introduced with rev. 1.204; in vfs_donmount() if the 'vfs_getopt(optlist, "errmsg", (void **)&errmsg, &errmsg_len)' call fails, 'errmsg' is left uninitialized, making the later tests against NULL meaningless, and the uses bogus. Thus initialize 'errmsg' to NULL beforehand. [1] While at it, remove the superfluous assignment of 0 to 'errmsg_len' if the above mentioned call fails as it's already initialized to 0. Submitted by: Michael Plass [1]	2006-08-26 16:28:19 +00:00
Pawel Jakub Dawidek	bebabf24bb	Fix comment.	2006-08-25 15:13:49 +00:00
Marius Strobl	3a30d178fe	Fix a bug introduced with rev. 1.204; in vfs_donmount() use copyout(9) instead of copystr(9) for copying the errmsg from kernel- to user-space. This fixes a panic on sparc64 when using the nmount(2)-converted mountd(8). While at it, use bcopy(3) instead of strncpy(3) in the kernel- to kernel-space case for consistency with vfs_buildopts() and between kernel- to user-space and kernel- to kernel-space case.	2006-08-24 18:52:28 +00:00
John Baldwin	597d608f86	- Expand the scope of Giant some in mount(2) to protect the vfsp structure from going away. mount(2) is now MPSAFE. - Expand the scope of Giant some in unmount(2) to protect the mp structure (or rather, to handle concurrent unmount races) from going away. umount(2) is now MPSAFE, as well as linux_umount() and linux_oldumount(). - nmount(2) and linux_mount() were already MPSAFE.	2006-06-27 14:46:31 +00:00
Robert Watson	7ebfc8df78	Audit some arguments to nmount(), mount(), umount(). Submitted by: wsalamon Obtained from: TrustedBSD Project	2006-06-05 15:32:07 +00:00
Pawel Jakub Dawidek	1f58dd4956	Fix a problem introduced in revision 1.220. On mount(2) failure, don't forget to unbusy file system before its destruction. This fixes the following warning on mount failure: Mount point <X> had 1 dangling refs Tested by: wkoszek	2006-06-02 20:29:02 +00:00
Craig Rodrigues	0c89bb0a02	Add "update" mount option to global_opts array, for use with vfs_filteropt().	2006-05-26 02:38:48 +00:00
Craig Rodrigues	5eb304a91a	Remove calls to vfs_export() for exporting a filesystem for NFS mounting from individual filesystems. Call it instead in vfs_mount.c, after we call VFS_MOUNT() for a specific filesystem.	2006-05-26 00:32:21 +00:00
Kelly Yancey	c9ad8a67af	Restore the ability to mount procfs and fdescfs filesystems via the mount(2) system call: * Add cmount hook to fdescfs and pseudofs (and, by extension, procfs and linprocfs). This (mostly) restores the ability to mount these filesystems using the old mount(2) system call (see below for the rest of the fix). * Remove not-NULL check for the data argument from the mount(2) entry point. Per the mount(2) man page, it is up to the individual filesystem being mounted to verify data. Or, in the case of procfs, etc. the filesystem is free to ignore the data parameter if it does not use it. Enforcing data to be not-NULL in the mount(2) system call entry point prevented passing NULL to filesystems which ignored the data pointer value. Apparently, passing NULL was common practice in such cases, as even our own mount_std(8) used to do it in the pre-nmount(2) world. All userland programs in the tree were converted to nmount(2) long ago, but I've found at least one external program which broke due to this (presumably unintentional) mount(2) API change. One could argue that external programs should also be converted to nmount(2), but then there isn't much point in keeping the mount(2) interface for backward compatibility if it isn't backward compatible.	2006-05-15 19:42:10 +00:00
Craig Rodrigues	5250012a1d	For nmount(), if "rw" is specified as a mount option, add "noro" to the list of mount options. This allows a read-only mount to be converted to read-write via: mount -u -o rw Requested by: kris	2006-05-14 01:51:38 +00:00
Jeff Roberson	ba5eb429e3	- When there are dangling vnodes at unmount print them before we panic. Sponsored by: Isilon Systems, Inc.	2006-03-31 23:38:15 +00:00
Jeff Roberson	a218edceb2	- Allocate mounts from a uma zone that uses UMA_ZONE_NOFREE to prevent mount memory from being reclaimed. This resolves a number of race conditions described in vfs_default.c and introduced with the VFS_LOCK_GIANT macros. - Let the mtx and lock remain valid after the mount structure has been freed by using init and fini calls. Technically fini will never be called but is included for completeness. - Consistently use lockmgr directly rather than lockmgr to lock and vfs_unbusy to unlock. Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:49:51 +00:00
Ruslan Ermilov	936ddefcd6	The mount(8) manpage says: "In case of conflicting options being specified, the rightmost option takes effect." Fix code to obey this. This makes e.g. "mount -r /usr" or "mount -ar" actually mount file systems read-only.	2006-03-13 14:58:37 +00:00
Tor Egge	791dd2fade	Use vn_start_secondary_write() and vn_finished_secondary_write() as a replacement for vn_write_suspend_wait() to better account for secondary write processing. Close race where secondary writes could be started after ffs_sync() returned but before the file system was marked as suspended. Detect if secondary writes or softdep processing occurred during vnode sync loop in ffs_sync() and retry the loop if needed.	2006-03-08 23:43:39 +00:00
Jeff Roberson	a4aeaefe5a	- We can not hold a vnode lock while we do a lookup. Search for and load modules prior to looking up the directory which we will cover to avoid this problem in mount. - We must hold the coveredvp locked before we can busy the mountpoint to prevent a lock order reversal with the vfs_busy() in lookup which holds the directory lock prior to doing a vfs_busy(). The directory lock is required to safely clear the v_mountedhere field on the directory. MFC After: 1 week	2006-02-22 06:29:55 +00:00
Jeff Roberson	04f6d3effa	- Add a ref count to the mount structure. Sleep for up to 3 seconds in vfs_mount_destroy waiting for this ref to hit 0. We don't print an error if we are rebooting as the root mount always retains some references by init proc. - Acquire a mnt ref for every vnode allocated to a mount point. Drop this ref only once vdestroy() has been called and the mount has been freed. - No longer NULL the v_mount pointer in delmntque() so that we may release the ref after vgone() has been called. This allows us to guarantee that the mount point structure will be valid until the last vnode has lost its last ref. - Fix a few places that rely on checking v_mount to detect recycling. Sponsored by: Isilon Systems, Inc. MFC After: 1 week	2006-02-06 10:19:50 +00:00


265 Commits
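
Several of the commit messages above describe how mount options are resolved: the rightmost of two conflicting options takes effect (2006-03-13), "rw" is treated as "noro" (2006-05-14), and for options such as noatime the positive form "atime" is equivalent to "nonoatime" (2007-08-15). The following is a minimal Python sketch of that behavior only; it is not the kernel's C implementation, and the names `NEGATED_FLAGS`, `normalize` and `apply_options` are hypothetical, introduced purely for illustration.

```python
# Illustrative model only; the real logic lives in the kernel's C mount code.
# Options listed in the 2007-08-15 commit whose positive form clears the flag.
NEGATED_FLAGS = {"noatime", "noclusterr", "noclusterw",
                 "noexec", "nosuid", "nosymfollow"}
# Flags this sketch knows how to toggle ("ro" is added for the rw/noro case).
KNOWN_FLAGS = NEGATED_FLAGS | {"ro"}


def normalize(opt):
    """Map one option string to (flag, enabled)."""
    if opt == "rw":
        return ("ro", False)            # "rw" behaves like "noro"
    if opt in KNOWN_FLAGS:
        return (opt, True)              # e.g. "noatime", "ro"
    if "no" + opt in NEGATED_FLAGS:
        return ("no" + opt, False)      # e.g. "atime" clears "noatime"
    if opt.startswith("no") and opt[2:] in KNOWN_FLAGS:
        return (opt[2:], False)         # e.g. "nonoatime", "noro"
    return (opt, True)                  # unknown options pass through


def apply_options(active, options):
    """Apply options left to right, so the rightmost conflicting one wins."""
    result = set(active)
    for opt in options:
        flag, enabled = normalize(opt)
        if enabled:
            result.add(flag)
        else:
            result.discard(flag)
    return result


if __name__ == "__main__":
    # Mirrors "mount -u -o atime /home" from the 2007-08-15 commit message.
    print(sorted(apply_options({"noatime", "soft-updates"}, ["atime"])))
```

Processing the options strictly left to right is what makes the rightmost option win without any extra conflict-detection step.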