freebsd-dev

Author	SHA1	Message	Date
Mateusz Guzik	dc20b834ca	vfs: add optional root vnode caching Root vnodes are looked up all the time, e.g. when crossing a mount point. Currently used routines always perform a costly lookup which can be trivially avoided. Reviewed by: jeff (previous version), kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21646	2019-10-06 22:14:32 +00:00
Andrew Turner	50bb04b750	Check the vfs option length is valid before accessing through When a VFS option passed to nmount is present but NULL the kernel will place an empty option in its internal list. This will have a NULL pointer and a length of 0. When we come to read one of these the kernel will try to load from the last address of virtual memory. This is normally invalid so will fault resulting in a kernel panic. Fix this by checking if the length is valid before dereferencing. MFC after: 3 days Sponsored by: DARPA, AFRL	2019-09-27 16:22:28 +00:00
Sean Eric Fagan	ba7a55d934	Add two options to allow mount to avoid covering up existing mount points. The two options are * nocover/cover: Prevent/allow mounting over an existing root mountpoint. E.g., "mount -t ufs -o nocover /dev/sd1a /usr/local" will fail if /usr/local is already a mountpoint. * emptydir/noemptydir: Prevent/allow mounting on a non-empty directory. E.g., "mount -t ufs -o emptydir /dev/sd1a /usr" will fail. Neither of these options is intended to be a default, for historical and compatibility reasons. Reviewed by: allanjude, kib Differential Revision: https://reviews.freebsd.org/D21458	2019-09-23 04:28:07 +00:00
Mateusz Guzik	b488246b45	vfs: group fields used for per-cpu ops in one cacheline Sponsored by: The FreeBSD Foundation	2019-09-19 21:23:14 +00:00
Mateusz Guzik	4cace859c2	vfs: convert struct mount counters to per-cpu There are 3 counters modified all the time in this structure - one for keeping the structure alive, one for preventing unmount and one for tracking active writers. Exact values of these counters are very rarely needed, which makes them a prime candidate for conversion to a per-cpu scheme, resulting in much better performance. Sample benchmark performing fstatfs (modifying 2 out of 3 counters) on a 104-way 2 socket Skylake system: before: 852393 ops/s after: 76682077 ops/s Reviewed by: kib, jeff Tested by: pho Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21637	2019-09-16 21:37:47 +00:00
Mateusz Guzik	a8c8e44bf0	vfs: manage mnt_ref with atomics New primitive is introduced to denote sections that can operate locklessly on aspects of struct mount, but which can also be disabled if necessary. This provides an opportunity to start scaling common case modifications while providing stable state of the struct when facing unmount, write suspension or other events. mnt_ref is the first counter to start being managed in this manner with the intent to make it per-cpu. Reviewed by: kib, jeff Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21425	2019-09-16 21:31:02 +00:00
Konstantin Belousov	e671edac06	De-commission the MNTK_NOINSMNTQ kernel mount flag. After all the changes, its dynamic scope is same as for MNTK_UNMOUNT, but to allow the syncer vnode to be re-installed on unmount failure. But the case of syncer was already handled by using the VV_FORCEINSMQ flag for quite some time. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-08-23 19:40:10 +00:00
Mateusz Guzik	4b3f767340	vfs: fix up r351193 ("stop always overwriting ->mnt_stat in VFS_STATFS") fs-specific part of vfs_statfs routines only fill in small portion of the structure. Previous code was always copying everything at a higher layer to accommodate it and this patch does the same. 'df' (no arguments) worked fine because the caller uses mnt_stat itself as the target buffer, making all the copying a no-op for its own case. 'df /' and similar use a different consumer which passes its own buffer and this is where you can run into trouble. Reported by: cy Fixes: r351193 Sponsored by: The FreeBSD Foundation	2019-08-19 14:11:54 +00:00
Mateusz Guzik	e7c1709aaf	vfs: stop always overwriting ->mnt_stat in VFS_STATFS The struct is already populated on each mount (and remount). Fields are either constant or not used by filesystem in the first place. Some infrequently used functions use it to avoid having to allocate a new buffer and are left alone. The current code results in an avoidable copying single-threaded and significant cache line bouncing multithreaded. While here deduplicate initial filling of the struct. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21317	2019-08-18 18:40:12 +00:00
Conrad Meyer	daec92844e	Include ktr.h in more compilation units Similar to r348026, exhaustive search for uses of CTRn() and cross reference ktr.h includes. Where it was obvious that an OS compat header of some kind included ktr.h indirectly, .c files were left alone. Some of these files clearly got ktr.h via header pollution in some scenarios, or tinderbox would not be passing prior to this revision, but go ahead and explicitly include it in files using it anyway. Like r348026, these CUs did not show up in tinderbox as missing the include. Reported by: peterj (arm64/mp_machdep.c) X-MFC-With: r347984 Sponsored by: Dell EMC Isilon	2019-05-21 20:38:48 +00:00
Kirk McKusick	13c31c29ca	Some filesystems (like cd9660 and ext3) require that VFS_STATFS() be called before VFS_ROOT() is called. Move the call for VFS_STATFS() so that it is done after VFS_MOUNT(), but before VFS_ROOT(). This change actually improves the robustness of the mount system call because it returns an error rather than failing silently when VFS_STATFS() returns failure. Reported by: Rebecca Cran <rebecca@bluestop.org> Sponsored by: Netflix	2018-12-21 01:09:25 +00:00
Kirk McKusick	e04d2a3c5a	Under UFS/FFS the VFS_ROOT() function will return an error if the inode check-hash fails. Panic'ing is not an appropriate response. So, check for an error return from VFS_ROOT() and when an error is reported, unwind and return the error. Reported by: Gary Jennejohn (gj) Sponsored by: Netflix	2018-12-15 19:04:50 +00:00
Mateusz Guzik	cc426dd319	Remove unused argument to priv_check_cred. Patch mostly generated with coccinelle: @@ expression E1,E2; @@ - priv_check_cred(E1,E2,0) + priv_check_cred(E1,E2) Sponsored by: The FreeBSD Foundation	2018-12-11 19:32:16 +00:00
Mark Johnston	970a174f3b	Add FALLTHROUGH comments to appease Coverity. CID: 1017862-1017864, 1017866-1017868 MFC after: 2 weeks	2018-10-25 15:43:21 +00:00
Konstantin Belousov	4fceda6206	Correct condition to detect mount(2) support by a filesystem. Reported and tested by: cy Sponsored by: The FreeBSD Foundation Approved by: re (rgrimes)	2018-10-24 19:40:09 +00:00
Konstantin Belousov	8ff7fad1d7	Only call sigdeferstop() for NFS. Use bypass to catch any NFS VOP dispatch and route it through the wrapper which does sigdeferstop() and then dispatches original VOP. NFS does not need a bypass below it, which is not supported. The vop offset in the vop_vector is added since otherwise it is impossible to get vop_op_t from the internal table, and I did not want to create the layered fs only to wrap NFS VOPs. VFS_OP()s wrap is straightforward. Requested and reviewed by: mjg (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D17658	2018-10-23 21:43:41 +00:00
Jamie Gritton	0e5c6bd436	Make it easier for filesystems to count themselves as jail-enabled, by doing most of the work in a new function prison_add_vfs in kern_jail.c Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and the rest is taken care of. This includes adding a jail parameter like allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed. Both of these used to be a static list of known filesystems, with predefined permission bits. Reviewed by: kib Differential Revision: D14681	2018-05-04 20:54:27 +00:00
Andriy Gapon	31260bf042	vfs_donmount: in certain cases try r/o mount if r/w mount fails If the operation is not an update, if neither r/w nor r/o mode is explicitly requested, if the error code hints at the possibility of the media being read-only, and if the fallback is allowed, then we can try to automatically downgrade to the readonly mode. This is especially useful for auto-mounting of removable media that sometimes can happen to be write-protected. The fallback to r/o is not enabled by default. It can be requested on a per-mount basis with a new mount option, 'autoro'. Or it can be globally allowed by setting vfs.default_autoro. Reviewed by: cem, kib MFC after: 3 weeks Relnotes: yes Differential Revision: https://reviews.freebsd.org/D13361	2018-03-27 14:31:42 +00:00
Ian Lepore	ac579135b0	Use EVENTHANDLER_DIRECT_INVOKE for [un]mount events, for better performance.	2018-01-07 18:07:22 +00:00
Pedro F. Giffuni	51369649b0	sys: further adoption of SPDX licensing ID tags. Mainly focus on files that use BSD 3-Clause license. The Software Package Data Exchange (SPDX) group provides a specification to make it easier for automated tools to detect and summarize well known opensource licenses. We are gradually adopting the specification, noting that the tags are considered only advisory and do not, in any way, supersede or replace the license texts. Special thanks to Wind River for providing access to "The Duke of Highlander" tool: an older (2014) run over FreeBSD tree was useful as a starting point.	2017-11-20 19:43:44 +00:00
Andriy Gapon	f92e3400bc	remove process and jail directory machinations from dounmount The manipulations done by mountcheckdirs() are not that useful during the unmount, they can bring about unexpected security consequences. This change effectively reverts the change in r73241. The change also allows to simplify the handling of rootvnode global variable. Discussed with: mckusick, mjg, kib Reviewed by: trasz MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D12366	2017-10-13 09:42:05 +00:00
Konstantin Belousov	9770475ce7	Do not vrele() covered vnode under the mp mutex. If vrele() changes the hold count to zero, it needs to acquire the vnode lock. Sponsored by: The FreeBSD Foundation Discussed with: avg X-MFC with: r323578	2017-09-19 16:49:45 +00:00
Andriy Gapon	cbc785c293	dounmount: do not release the mount point's reference on the covered vnode As long as mnt_ref is not zero there can be a consumer that might try to access mnt_vnodecovered. For this reason the covered vnode must not be freed until mnt_ref goes to zero. So, move the release of the covered vnode to vfs_mount_destroy. Reviewed by: kib MFC after: 3 weeks Differential Revision: https://reviews.freebsd.org/D12329	2017-09-14 08:47:06 +00:00
Ed Maste	3e85b721d6	Remove register keyword from sys/ and ANSIfy prototypes A long long time ago the register keyword told the compiler to store the corresponding variable in a CPU register, but it is not relevant for any compiler used in the FreeBSD world today. ANSIfy related prototypes while here. Reviewed by: cem, jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D10193	2017-05-17 00:34:34 +00:00
Konstantin Belousov	2f304845e2	Do not allocate struct statfs on kernel stack. Right now size of the structure is 472 bytes on amd64, which is already large and stack allocations are undesirable. With the ino64 work, MNAMELEN is increased to 1024, which will make it impossible to have struct statfs on the stack. Extracted from: ino64 work by gleb Discussed with: mckusick Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2017-01-05 17:19:26 +00:00
Konstantin Belousov	714b7df502	Provide simple mutual exclusion between mount point update and unmount. Currently mount update keeps vfs_busy(9) reference on the mount point during MNT_UPDATE VFS_MOUNT() vfsops call. This already provides the exclusion, but is problematic for filesystems which need to perform namei(9) during VFS_MOUNT(MNT_UPDATE) operations, e.g. to refresh mnt_from path, because namei(9) must not be called while the vfs_busy(9) reference is owned. Check for MNT_UPDATE flag before setting MNTK_UNMOUNT, and for MNTK_UNMOUNT before entering innards of vfs_domount_update(), failing syscalls with EBUSY if conflict is detected. Keep vfs_busy(9) reference around VFS_MOUNT(MNT_UPDATE) calls still to not change VFS KPI. In the update path in ffs_mount(), drop vfs_busy() reference around namei(), which is now safe due to unmount never executing in parallel with VFS_MOUNT(MNT_UPDATE), and which avoids the deadlock. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2016-11-13 21:49:51 +00:00
Konstantin Belousov	9eb8f495b8	Move common cleanup code into helper. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2016-11-13 21:39:55 +00:00
Mateusz Guzik	45571f8886	vfs: assert empty tmp free list on unmount	2016-10-08 13:38:05 +00:00
Konstantin Belousov	f71d08566c	Limit scope of the optimization in r306608 to dounmount() caller only. Other uses of cache_purgevfs() do rely on the cache purge for correct operations, when paths are invalidated without unmount. Reported and tested by: jkim Discussed with: mjg Sponsored by: The FreeBSD Foundation	2016-10-07 11:38:28 +00:00
Mateusz Guzik	5bb81f9b2d	vfs: batch free vnodes in per-mnt lists Previously free vnodes would always be directly returned to the global LRU list. With this change up to mnt_free_list_batch vnodes are collected first. syncer runs always return the batch regardless of its size. While vnodes on per-mnt lists are not counted as free, they can be returned in case of vnode shortage. Reviewed by: kib Tested by: pho	2016-09-30 17:27:17 +00:00
Edward Tomasz Napierala	e313b4dd95	Fix bug introduced with r302388, which could cause processes accessing automounted shares to hang with "vfs_busy" wchan. (As a workaround one can run 'automount -u' from cron.) Reviewed by: kib@ MFC after: 1 month	2016-09-21 05:44:13 +00:00
Ed Maste	69a2875821	Renumber license clauses in sys/kern to avoid skipping #3	2016-09-15 13:16:20 +00:00
Edward Tomasz Napierala	411455a8fb	Replace all remaining calls to vprint(9) with vn_printf(9), and remove the old macro. MFC after: 1 month	2016-08-10 16:12:31 +00:00
Edward Tomasz Napierala	debc480e03	Add new unmount(2) flag, MNT_NONBUSY, to check whether there are any open vnodes before proceeding. Make autounmountd(8) use this flag. Without it, even an unsuccessful unmount causes filesystem flush, which interferes with normal operation. Reviewed by: kib@ Approved by: re (gjb@) MFC after: 1 month Differential Revision: https://reviews.freebsd.org/D7047	2016-07-07 09:03:57 +00:00
Konstantin Belousov	9fdbfd3b6c	Do not assume that we own the use reference on the covered vnode until we set MNTK_UNMOUNT flag on the mp. Otherwise parallel unmount which wins race with us could dereference the covered vnode, and we are left with the locked freed memory. Reported and tested by: pho Sponsored by: The FreeBSD Foundation Approved by: re (gjb) MFC after: 1 week	2016-06-15 15:56:03 +00:00
Andriy Gapon	8614f45b2d	dounmount: do not call mountcheckdirs() for mounts with MNT_IGNORE This is a bit hackish, but the flag is currently set only for ZFS snapshots mounted under .zfs. mountcheckdirs() can change cdir/rdir references to a covered vnode. But for the said snapshots the covered vnode is really ephemeral and it must never be accessed (except for a few specific cases). To do: consider removing mountcheckdirs() entirely MFC after: 5 days	2016-05-16 07:23:24 +00:00
Konstantin Belousov	76c404fce5	Do not copy by field when converting struct oexport_args to struct export_args on mount update, bzero() is consistent with vfs_oexport_conv(). Make the code structure more explicit by using switch. Return EINVAL if export option layout (deduced from size) is unknown. Based on the submission by: bde Sponsored by: The FreeBSD Foundation	2016-02-04 16:32:21 +00:00
Edward Tomasz Napierala	c9ba65040f	Make vfs_unmountall() unmount /dev after /, not before. The only reason this didn't result in an unclean shutdown is that devfs ignores MNT_FORCE flag. Reviewed by: kib@ MFC after: 1 month Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D3467	2015-08-24 13:18:13 +00:00
Konstantin Belousov	1965f86c72	Vnode is not referenced by the vfs_domount() at the point where asserts are made. Remove them, since we might dereference freed memory. Leaked locks are asserted by the syscall return code anyway. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2015-07-02 14:31:47 +00:00
Konstantin Belousov	780dca1b1e	Right now, dounmount() is called with unreferenced mount point. Nothing stops a parallel unmount to succeed before the given call to dounmount() checks and locks the covered vnode. Prevent dounmount() from acting on the freed (although type-stable) memory by changing the interface to require the mount point to be referenced. dounmount() consumes the reference on return, regardless of the successful or erroneous result. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-05-27 09:22:50 +00:00
Konstantin Belousov	5d6f5b24ca	Mountd iterating over the mount points may race with the parallel unmount, which causes error from nmount(2) call when performing MNT_DELEXPORT over the directory which ceased to be a mount point. The race is legitimate and innocent, but results in the chatty mountd. Silence it by providing a distinguished error code for the situation, and ignoring the error in mountd loop. Based on the patch by: Andreas Longwitz <longwitz@incore.de> Prodded and tested by: bdrewery Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2015-02-10 18:00:32 +00:00
Konstantin Belousov	b2344ab5ff	Do not call VFS_SYNC() before VFS_UNMOUNT() for forced unmount. Since VFS does not/cannot stop writes, sync might run indefinitely, or be a wrong thing to do at all. E.g. NFS ignores VFS_SYNC() for forced unmounts, since non-responding server does not allow sync to finish. On the other hand, filesystems can and do stop writes using fs-specific facilities, and should already fully flush caches in VFS_UNMOUNT() due to the race. Adjust msdosfs to sync in unmount for forced call, to accommodate the new behaviour. Note that it is still racy, since writes are not stopped. Discussed with: avg, bjk, mckusick Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 3 weeks	2014-12-09 10:00:47 +00:00
Edward Tomasz Napierala	3914ddf8a7	Bring in the new automounter, similar to what's provided in most other UNIX systems, eg. MacOS X and Solaris. It uses Sun-compatible map format, has proper kernel support, and LDAP integration. There are still a few outstanding problems; they will be fixed shortly. Reviewed by: allanjude@, emaste@, kib@, wblock@ (earlier versions) Phabric: D523 MFC after: 2 weeks Relnotes: yes Sponsored by: The FreeBSD Foundation	2014-08-17 09:44:42 +00:00
Konstantin Belousov	168f4ee0a8	Remove Giant acquisition from the mount and unmount paths. It could be claimed that two things were reasonably protected by Giant. One is vfsconf list links, which is converted to the new dedicated sx vfsconf_sx. Another is vfsconf.vfc_refcount, which is now updated with atomics. Note that vfc_refcount still has the same races now as it has under the Giant, the unload of filesystem modules can happen while the module is still in use. Tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 2 weeks	2014-08-03 03:27:54 +00:00
Bryan Drewery	97c0df733f	Use proper MFSNAMELEN for fs type. MFC after: 2 weeks Reviewed by: rodrigc Also spotted by: ambrisko	2014-04-12 21:39:17 +00:00
Sean Bruno	d3baefa809	Change len checks for fstypelen and fspathlen to be against absolute len not strlen as they are not strings. Discovered by GSOC student, Mike Ma <mikemandarine@gmail.com> during his fuse.glusterfs port to FreeBSD. Final patch from mckusick@ Submitted by: mckusick@ Approved by: re (hrs) MFC after: 2 weeks	2013-10-03 22:52:03 +00:00
Rick Macklem	8fe6bddff7	Forced dismounts of NFS mounts can fail when thread(s) are stuck waiting for an RPC reply from the server while holding the mount point busy (mnt_lockref incremented). This happens because dounmount() msleep()s waiting for mnt_lockref to become 0, before calling VFS_UNMOUNT(). This patch adds a new VFS operation called VFS_PURGE(), which the NFS client implements as purging RPCs in progress. Making this call before checking mnt_lockref fixes the problem, by ensuring that the VOP_xxx() calls will fail and unbusy the mount point. Reported by: sbruno Reviewed by: kib MFC after: 2 weeks	2013-09-01 23:02:59 +00:00
Marcel Moolenaar	8939c0693c	Add vfs_mounted and vfs_unmounted events so that components can be informed about mount and unmount events. This is used by Juniper to implement a more optimal implementation of NetBSD's veriexec. This change differs from r253224 in the following way: o The vfs_mounted handler is called before mountcheckdirs() and with newdp locked. vp is unlocked. o The event handlers are declared in <sys/eventhandler.h> and not in <sys/mount.h>. The <sys/mount.h> header is used in user land code that pretends to be kernel code and as such creates a very convoluted environment. It's hard to untangle. Submitted by: stevek@juniper.net Discussed with: pjd@ Obtained from: Juniper Networks, Inc.	2013-07-10 15:35:25 +00:00
Marcel Moolenaar	4612275fdb	Revert r251590. It unexpectedly broke the build and there were some questions on locking. As part of commit-bit grooming, I'd like Steve to handle this, but can't leave things broken in the mean time.	2013-06-10 15:22:27 +00:00
Marcel Moolenaar	8c7ca16f63	Add vfs_mounted and vfs_unmounted events so that components can be informed about mount and unmount events. This is used by Juniper to implement a more optimal implementation of NetBSD's veriexec. Submitted by: stevek@juniper.net Obtained from: Juniper Networks, Inc	2013-06-09 23:51:26 +00:00
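
The option-length fix described in commit 50bb04b750 above can be illustrated with a small userland sketch. This is not the kernel code: struct mount_opt and opt_is_terminated() are hypothetical names invented here, and the real change may check different conditions. The point it shows is only the ordering, that the length is validated before the last byte is read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel mount option; not the real structure. */
struct mount_opt {
	const char *value;	/* NULL when the option was passed without a value */
	size_t len;		/* length of value in bytes; 0 when value is NULL */
};

/*
 * Return true if the option holds a NUL-terminated string.
 * The length is checked first: with len == 0 the index len - 1 wraps
 * around to SIZE_MAX, so value[len - 1] on a NULL pointer would address
 * the very top of the address space.
 */
static bool
opt_is_terminated(const struct mount_opt *opt)
{
	if (opt->value == NULL || opt->len == 0)
		return (false);
	return (opt->value[opt->len - 1] == '\0');
}
```

An empty option (NULL, 0) is rejected without any memory access, while an option such as ("ro", 3) is accepted because its third byte is the terminating NUL.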

409 Commits
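
The per-cpu counter idea from commit 4cace859c2 can be sketched in plain C as follows. This is a simplified userland model under stated assumptions, not the kernel implementation: NCPU, CACHE_LINE, struct pcpu_counter and the pcpu_* functions are hypothetical, and the CPU index is passed explicitly, whereas a kernel would use the current CPU and guard the update against preemption.

```c
#include <stdint.h>

#define NCPU		4	/* hypothetical CPU count */
#define CACHE_LINE	64	/* assumed cache line size in bytes */

/* One slot per CPU, padded so adjacent slots do not share a cache line
 * (assuming the array itself starts on a line boundary). */
struct pcpu_counter {
	struct {
		int64_t val;
		char pad[CACHE_LINE - sizeof(int64_t)];
	} slot[NCPU];
};

static void
pcpu_init(struct pcpu_counter *c)
{
	for (unsigned i = 0; i < NCPU; i++)
		c->slot[i].val = 0;
}

/* Fast path: touch only the slot belonging to the given CPU. */
static void
pcpu_add(struct pcpu_counter *c, unsigned cpu, int64_t delta)
{
	c->slot[cpu % NCPU].val += delta;
}

/* Slow path: the exact value is the sum over all slots. */
static int64_t
pcpu_sum(const struct pcpu_counter *c)
{
	int64_t total = 0;

	for (unsigned i = 0; i < NCPU; i++)
		total += c->slot[i].val;
	return (total);
}
```

A single slot may go negative when a reference is taken on one CPU and dropped on another; only the sum is meaningful. That is why the scheme suits counters whose exact value is rarely needed, as the commit message notes.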