freebsd-nq

Author	SHA1	Message	Date
Doug Moore	f9f4c60aa7	Including <sys/tmpfs.h> into non-kernel software leads to a compilation error because, without _KERNEL defined, the macro TMPFS_VALIDATE_DIR is invoked, but never defined. User-level software that includes sys/tmpfs.h must define _KERNEL to make the definition of TMPFS_VALIDATE_DIR visible. This change puts all the inline functions that, directly or indirectly, invoke MPASS into the scope of the _KERNEL block, allowing many user-space includers of <sys/tmpfs.h> to stop defining _KERNEL. Reviewed by: alc, kib MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D22874	2019-12-19 16:39:52 +00:00
Mateusz Guzik	6fa079fc3f	vfs: flatten vop vectors This eliminates the following loop from all VOP calls: while(vop != NULL && \ vop->vop_spare2 == NULL && vop->vop_bypass == NULL) vop = vop->vop_default; Reviewed by: jeff Tesetd by: pho Differential Revision: https://reviews.freebsd.org/D22738	2019-12-16 00:06:22 +00:00
Jeff Roberson	a808177864	Add a deferred free mechanism for freeing swap space that does not require an exclusive object lock. Previously swap space was freed on a best effort basis when a page that had valid swap was dirtied, thus invalidating the swap copy. This may be done inconsistently and requires the object lock which is not always convenient. Instead, track when swap space is present. The first dirty is responsible for deleting space or setting PGA_SWAP_FREE which will trigger background scans to free the swap space. Simplify the locking in vm_fault_dirty() now that we can reliably identify the first dirty. Discussed with: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D22654	2019-12-15 03:15:06 +00:00
Rick Macklem	f808cf7294	Silence some "might not be initialized" warnings for riscv64. None of these case were actually using the variable(s) uninitialized, but I figured that silencing the warnings via initializing them made sense. Some of these predated r355677.	2019-12-13 21:38:08 +00:00
Rick Macklem	bf6ac05aa3	Add some more initializations to quiet riscv build. The one case in nfs_copy_file_range() was a legitimate case, although it would probably never occur in practice.	2019-12-13 01:34:25 +00:00
Rick Macklem	95bf2e523b	Fix the build for MAC not defined and a couple of might not be initialized. r355677 broke the build for the not MAC defined case and a couple of might not be initialized warnings were generated for riscv. Others seem to be erroneous. Hopefully there won't be too many more build errors. Pointy hat goes on me.	2019-12-13 00:45:14 +00:00
Rick Macklem	c057a37818	Add support for NFSv4.2 to the NFS client and server. This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes (RFC-8276) to the NFS client and server. NFSv4.2 is comprised of several optional features that can be supported in addition to NFSv4.1. This patch adds the following optional features: - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED) - posix_fallocate() - intra server file range copying via the copy_file_range(2) syscall --> Avoiding data tranfer over the wire to/from the NFS client. - lseek(SEEK_DATA/SEEK_HOLE) - Extended attribute syscalls for "user" namespace attributes as defined by RFC-8276. Although this patch is fairly large, it should not affect support for the other versions of NFS. However it does add two new sysctls that allow a sysadmin to limit which minor versions of NFSv4 a server supports, allowing a sysadmin to disable NFSv4.2. Unfortunately, when the NFS stats structure was last revised, it was assumed that there would be no additional operations added beyond what was specified in RFC-7862. However RFC-8276 did add additional operations, forcing the NFS stats structure to revised again. It now has extra unused entries in all arrays, so that future extensions to NFSv4.2 can be accomodated without revising this structure again. A future commit will update nfsstat(1) to report counts for the new NFSv4.2 specific operations/procedures. This patch affects the internal interface between the nfscommon, nfscl and nfsd modules and, as such, they all must be upgraded simultaneously. I will do a version bump (although arguably not needed), due to this. This code has survived a "make universe" but has not been built with a recent GCC. If you encounter build problems, please email me. Relnotes: yes	2019-12-12 23:22:55 +00:00
Mateusz Guzik	c8b29d1212	vfs: locking primitives which elide ->v_vnlock and shared locking disablement Both of these features are not needed by many consumers and result in avoidable reads which in turn puts them on profiles due to cache-line ping ponging. On top of that the current lockgmr entry point is slower than necessary single-threaded. As an attempted clean up preparing for other changes, provide new routines which don't support any of the aforementioned features. With these patches in place vop_stdlock and vop_stdunlock disappear from flamegraphs during -j 104 buildkernel. Reviewed by: jeff (previous version) Tested by: pho Differential Revision: https://reviews.freebsd.org/D22665	2019-12-11 23:11:21 +00:00
Mateusz Guzik	abd80ddb94	vfs: introduce v_irflag and make v_type smaller The current vnode layout is not smp-friendly by having frequently read data avoidably sharing cachelines with very frequently modified fields. In particular v_iflag inspected for VI_DOOMED can be found in the same line with v_usecount. Instead make it available in the same cacheline as the v_op, v_data and v_type which all get read all the time. v_type is avoidably 4 bytes while the necessary data will easily fit in 1. Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new flag field with a new value: VIRF_DOOMED. Reviewed by: kib, jeff Differential Revision: https://reviews.freebsd.org/D22715	2019-12-08 21:30:04 +00:00
Rick Macklem	a95cd06e9a	Delete an unused external declaration. Since nfsv4_opflag is no longer used in nfs_clcomsubs.c, delete the external declaration of it. Found during NFSv4.2 code merge. MFC after: 2 weeks	2019-12-08 16:59:36 +00:00
Rick Macklem	8e1906f700	Fix kernel handling of a NFSERR_MINORVERSMISMATCH NFSv4 server reply. When an NFSv4 server replies NFSERR_MINORVERSMISMATCH, it does not generate a status result for the first operation in the compound. Without this patch, this will result in a bogus EBADXDR error return. Returning EBADXDR is relatively harmless, but a correct reply of NFSERR_MINORVERSMISMATCH is needed by the pNFS client to select the correct minor version to use for a File Layout DS now that there can be NFSv4.2 DS servers. mount_nfs.c still needs to be fixed for this, although how the mount fails is only useful to help sysadmins isolate why a mount fails. Found during testing of the NFSv4.2 client and server. MFC after: 2 weeks	2019-12-08 00:06:00 +00:00
Rick Macklem	238da71f91	Add some definitions for NFSv4.2 which will be used by subsequent commits. This is a preliminary commit of NFSv4.2 definitions that will be used by subsequent commits which adds NFSv4.2 support to the NFS client and server. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys.	2019-12-07 23:13:51 +00:00
Rick Macklem	394dae30b1	Set the XATTRSUPPORT attribute bit for NFSv4.2, always cleared for now. Since r355472 added code which clears the XATTRSUPPORT bit for non-NFSv4.2 mounts, it is now safe to set it. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys. This commit completes updates to nfsproto.h required by the NFSv4.2.	2019-12-07 01:10:38 +00:00
Rick Macklem	2096ce0339	Add a couple of definitions for NFSv4.2 and update macros to use them. This patch adds code to macros to clear attribute bits not supported by NFSv4.2. For now, these bits are never set anyhow, but this prepares the code for the addition of NFSv4.2 support in a future commit. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys.	2019-12-06 23:51:11 +00:00
Rick Macklem	8f9259a550	Add some definitions for NFSv4.2 which will be used by subsequent commits. This is a preliminary commit of NFSv4.2 definitions that will be used by subsequent commits which adds NFSv4.2 support to the NFS client and server. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys.	2019-12-06 01:53:02 +00:00
Mateusz Guzik	1e0006e49c	nullfs: locklessly check for entries in null_hashget During random sampling over poudriere -j 104 over 10% of calls returned NULL.	2019-12-05 13:41:22 +00:00
Konstantin Belousov	a51c8071a3	Stop using per-mount tmpfs zones. Requested and reviewed by: jeff Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D22643	2019-12-05 00:03:17 +00:00
Rick Macklem	348d9b567e	Add some definitions for NFSv4.2 which will be used by subsequent commits. This is a preliminary commit of NFSv4.2 definitions that will be used by subsequent commits which adds NFSv4.2 support to the NFS client and server. There will be a series of these preliminary commits that will prepare for a major commit of the NFSv4.2 client/server changes currently found in subversion under projects/nfsv42/sys.	2019-12-04 23:24:40 +00:00
Mateusz Guzik	ba08feecbf	tmpfs: use proper macros for permission values in tmpfs_access While here group them in one var to prevent overy long lines. Perhaps a general macro of the same sort should be introduced. Requested by: kib	2019-12-01 00:34:49 +00:00
Kyle Evans	1b50b999f9	tty: implement TIOCNOTTY Generally, it's preferred that an application fork/setsid if it doesn't want to keep its controlling TTY, but it could be that a debugger is trying to steal it instead -- so it would hook in, drop the controlling TTY, then do some magic to set things up again. In this case, TIOCNOTTY is quite handy and still respected by at least OpenBSD, NetBSD, and Linux as far as I can tell. I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as long as it doesn't impose a major burden. Reviewed by: bcr (manpages), kib Differential Revision: https://reviews.freebsd.org/D22572	2019-11-30 20:10:50 +00:00
Mateusz Guzik	a02cab334c	devfs: introduce a per-dev lock to protect ->si_devsw This allows bumping threadcount without taking the global devmtx lock. In particular this eliminates contention on said lock while using bhyve with multiple vms. Reviewed by: kib Tested by: markj MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22548	2019-11-30 16:46:19 +00:00
Mateusz Guzik	0f4b850e85	tmpfs: add fast path to tmpfs_access for common case lookup VEXEC consists of vast majority of all calls and almost all targets have at least 0111.	2019-11-30 16:41:47 +00:00
Konstantin Belousov	9698d99230	In nfs_lock(), recheck vp->v_data after lock before accessing it. We might race with reclaim, and then this is no longer a nfs vnode, in which case we do not need to handle deferred vnode_pager_setsize() either. Reported by: rk@ronald.org PR: 242184 Sponsored by: The FreeBSD Foundation MFC after: 3 days	2019-11-29 13:55:56 +00:00
Rick Macklem	e1cda5eea6	Fix two races while handling nfsuserd daemon start/stop. A crash was reported where the nr_client field was NULL during an upcall to the nfsuserd daemon. Since nr_client == NULL only occurs when the nfsuserd daemon is being shut down, it appeared to be caused by a race between doing an upcall and the daemon shutting down. By inspection two races were identified: 1 - The nfsrv_nfsuserd variable is used to indicate whether or not the daemon is running. However it did not handle the intermediate phase where the daemon is starting or stopping. This was fixed by making nfsrv_nfsuserd tri-state and having the functions that are called during start/stop to obey the intermediate state. 2 - nfsrv_nfsuserd was checked to see that the daemon was running at the beginning of an upcall, but nothing prevented the daemon from being shut down while an upcall was still in progress. This race probably caused the crash. The patch fixes this by adding a count of upcalls in progress and having the shut down function delay until this count goes to zero before getting rid of nr_client and related data used by an upcall. Tested by: avg (Panzura QA) Reported by: avg Reviewed by: avg MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D22377	2019-11-28 23:34:23 +00:00
Konstantin Belousov	dbe257d253	tmpfs: resolve deadlock between rename and unmount. Top-level kern_renameat() increases the writecount on the mount point, which, together with tmpfs unmount suspending the mount, already ensures that unmount cannot proceed while rename unlocks and relocks all operated vnodes. Remove vfs_busy() call from tmpfs_rename() which was done while holding a vnode lock, creating the deadlock. The only intent of the busy operation seems to be the prevention of unmount, which is already ensured. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-11-24 19:06:38 +00:00
Rick Macklem	14eff785e8	Fix the pNFS server's reporting of SpaceUsed (va_bytes). The pNFS server currently reports SpaceUsed (va_bytes) for the metadata file. This in not correct, since the metadata file is always empty and, as such, va_bytes is just the allocation for the empty file. This patch adds va_bytes to the list of attributes acquired from the DS for a file, so that it includes the allocated data size and is updated when the file is written. For files created on a pNFS server before this patch is applied, the va_bytes value is estimated by rounding va_size up to a multiple of BLKDEV_IOSIZE. Once the file is written after this patch has been applied to the metadata server, the va_bytes returned for the file will be correct. This patch only affects a pNFS metadata server. Found during testing of the NFSv4.2 pNFS server for the Allocate operation. (Not yet in head/current.) MFC after: 2 weeks	2019-11-22 00:22:55 +00:00
Mateusz Guzik	1fccb43c39	vfs: change si_usecount management to count used vnodes Currently si_usecount is effectively a sum of usecounts from all associated vnodes. This is maintained by special-casing for VCHR every time usecount is modified. Apart from complicating the code a little bit, it has a scalability impact since it forces a read from a cacheline shared with said count. There are no consumers of the feature in the ports tree. In head there are only 2: revoke and devfs_close. Both can get away with a weaker requirement than the exact usecount, namely just the count of active vnodes. Changing the meaning to the latter means we only need to modify it on 0<->1 transitions, avoiding the check plenty of times (and entirely in something like vrefact). Reviewed by: kib, jeff Tested by: pho Differential Revision: https://reviews.freebsd.org/D22202	2019-11-20 12:05:59 +00:00
Jeff Roberson	639676877b	Simplify anonymous memory handling with an OBJ_ANON flag. This eliminates reudundant complicated checks and additional locking required only for anonymous memory. Introduce vm_object_allocate_anon() to create these objects. DEFAULT and SWAP objects now have the correct settings for non-anonymous consumers and so individual consumers need not modify the default flags to create super-pages and avoid ONEMAPPING/NOSPLIT. Reviewed by: alc, dougm, kib, markj Tested by: pho Differential Revision: https://reviews.freebsd.org/D22119	2019-11-19 23:19:43 +00:00
Jeff Roberson	67d0e29304	Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY flag and use the same system. This enables further fault locking improvements by allowing more faults to proceed with a shared lock. Reviewed by: kib Tested by: pho Differential Revision: https://reviews.freebsd.org/D22116	2019-10-29 21:06:34 +00:00
Mateusz Guzik	2e310f6f72	pseudofs: hashed vncache Vast majority of uses the cache are just checking if there is an entry present on process exit (and evicting it if so). Both checking and eviction process are very expensive and put the lock protecting it high up on the profile during poudriere -j 104. Convert the linked list into a hash. This allows to almost always avoid taking the lock in the first place (and consequently almost removes it from the profile). Note only one lock is preserved as a split did not meaningfully impact contention. Should the cache be used for something it will still run into contention issues. The code needs a rewrite, but should someone want to tidy it up further the following can be done: 1) per-chain locks (or at least an array) 2) hashing by something else than just pid Sponsored by: The FreeBSD Foundation	2019-10-22 22:52:53 +00:00
Konstantin Belousov	c6ba06d86c	Fix interface between nfsclient and vnode pager. Make the nfsclient always call vnode_pager_setsize() with the vnode exclusively locked. This ensures that page fault always can find the backing page if the object size check succeeded. Set VV_VMSIZEVNLOCK flag on NFS nodes. The main offender breaking the interface in nfsclient is nfs_loadattrcache(), which is used whenever server responded with updated attributes, which can happen on non-changing operations as well. Also, iod threads only have buffers locked (and even that is LK_KERNPROC), but they still may call nfs_loadattrcache() on RPC response. Instead of immediately calling vnode_pager_setsize() if server response indicated changed file size, but the vnode is not exclusively locked, set a new node flag NVNSETSZSKIP. When the vnode exclusively locked, or when we can temporary upgrade the lock to exclusive, call vnode_pager_setsize(), by providing the nfsclient VOP_LOCK() implementation. Tested by: pho Discussed with: rmacklem Sponsored by: The FreeBSD Foundation MFC after: 2 weeks Differential revision: https://reviews.freebsd.org/D21883	2019-10-22 16:17:38 +00:00
Jeff Roberson	0012f373e4	(4/6) Protect page valid with the busy lock. Atomics are used for page busy and valid state when the shared busy is held. The details of the locking protocol and valid and dirty synchronization are in the updated vm_page.h comments. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21594	2019-10-15 03:45:41 +00:00
Jeff Roberson	63e9755548	(1/6) Replace busy checks with acquires where it is trival to do so. This is the first in a series of patches that promotes the page busy field to a first class lock that no longer requires the object lock for consistency. Reviewed by: kib, markj Tested by: pho Sponsored by: Netflix, Intel Differential Revision: https://reviews.freebsd.org/D21548	2019-10-15 03:35:11 +00:00
Mateusz Guzik	9c04e4c01e	tmpfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:41 +00:00
Mateusz Guzik	48c426f226	pseudofs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:25 +00:00
Mateusz Guzik	be4cd6912f	nullfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:42:04 +00:00
Mateusz Guzik	4c9ba39aea	devfs: use MNTK_NOMSYNC Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D22009	2019-10-13 15:41:47 +00:00
Konstantin Belousov	e3fdd051f9	devfs_vptocnp(): correct the component name when node is not at top. Node' cdp.si_name is the full path as provided by make_dev(9), it should not be returned by VOP_VPTOCNP() when only the last component is requested. Use the dirent entry instead. With this note, handling of VDIR and VCHR nodes only differs in handling of root vnode, which simplifies and unifies the logic. Reported by: Li, Zhichao1 <Zhichao_Li1@Dell.com> Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-11 18:41:24 +00:00
Konstantin Belousov	53fcc6c960	Plug the rest of undef behavior places that were missed in r337456. There are three more places in msdosfs_fat.c which might shift one into the sign bit. While there, fix formatting of KASSERTs. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-11 18:37:02 +00:00
Doug Moore	2288078c5e	Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map. In case the implementation ever changes from using a chain of next pointers, then changing the macro definition will be necessary, but changing all the files that iterate over vm_map entries will not. Drop a counter in vm_object.c that would have an effect only if the vm_map entry count was wrong. Discussed with: alc Reviewed by: markj Tested by: pho (earlier version) Differential Revision: https://reviews.freebsd.org/D21882	2019-10-08 07:14:21 +00:00
Mateusz Guzik	d511f93e45	nfsclient: add root vnode caching See r353150. Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21646	2019-10-06 22:17:29 +00:00
Mateusz Guzik	7682d0be2b	tmpfs: add root vnode caching See r353150. Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21646	2019-10-06 22:17:11 +00:00
Mateusz Guzik	559ac49d41	devfs: add root vnode caching See r353150. Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21646	2019-10-06 22:16:55 +00:00
Mateusz Guzik	dfa8dae493	devfs: plug redundant bwillwrite avoidance vn_write already checks for vnode type to see if bwillwrite should be called. This effectively reverts r244643. Reviewed by: kib Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D21905	2019-10-05 17:44:33 +00:00
Konstantin Belousov	c5dac63c15	tmpfs_readdir(): unlock the locked node. During readdir() we guarantee that the tn_dir.tn_parent does not go away, but it might be replaced by a parallel rename. Read tn_parent only once, then use the cached value. Reported and tested by: pho Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-03 19:55:05 +00:00
Konstantin Belousov	f7e69a6fa0	tmpfs_rename: style. Reformat multi-line comments to follow style. Also fix some typos. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2019-10-03 19:51:56 +00:00
Konstantin Belousov	d60ac9d561	Remove unnecessary vm/vm_page.h and vm/vm_pager.h includes from tmpfs/tmpfs_vnodes.c. Submitted by: ota@j.email.ne.jp MFC after: 1 week Differential revision: https://reviews.freebsd.org/D21881	2019-10-03 08:25:09 +00:00
Rick Macklem	ee7201a725	Replace all mtx_assert() calls for n_mtx and ncl_iod_mutex with macros. To be consistent with replacing the mtx_lock()/mtx_unlock() calls on the NFS node mutex (n_mtx) and ncl_iod_mutex, this patch replaces all mtx_assert() calls on these mutexes with macros as well. This will simplify changing these locks to sx locks in a future commit. However, this change may be delayed indefinitely, since it appears there is a deadlock when vnode_pager_setsize() is called to shrink the size and the NFS node lock is held. There is no semantic change as a result of this commit. Suggested by: kib MFC after: 1 week	2019-09-26 02:54:45 +00:00
Rick Macklem	b662b41e62	Replace all mtx_lock()/mtx_unlock() on the iod lock with macros. Since the NFS node mutex needs to change to an sx lock so it can be held when vnode_pager_setsize() is called and the iod lock is held when the NFS node lock is acquired, the iod mutex will need to be changed to an sx lock as well. To simply the future commit that changes both the NFS node lock and iod lock to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the iod lock with macros. There is no semantic change as a result of this commit. I don't know when the future commit will happen and be MFC'd, so I have set the MFC on this commit to one week so that it can be MFC'd at the same time. Suggested by: kib MFC after: 1 week	2019-09-24 23:38:10 +00:00
Rick Macklem	5d85e12f44	Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros. For a long time, some places in the NFS code have locked/unlocked the NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas others have simply used mtx_lock()/mtx_unlock(). Since the NFS node mutex needs to change to an sx lock so it can be held when vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock with the macros to simply making the change to an sx lock in future commit. There is no semantic change as a result of this commit. I am not sure if the change to an sx lock will be MFC'd soon, so I put an MFC of 1 week on this commit so that it could be MFC'd with that commit. Suggested by: kib MFC after: 1 week	2019-09-24 01:58:54 +00:00

1 2 3 4 5 ...

4214 Commits