Commit Graph

4494 Commits

Author SHA1 Message Date
Rick Macklem
90cf38f22e Fix a bug introduced by r363001 for the ext_pgs case.
r363001 added support for ext_pgs mbufs to nfsm_uiombuf().
By inspection, I noticed that "mlen" was not set non-zero and, as such, there
would be an iteration of the loop that did nothing.
This patch sets it.
This bug would have no effect on the system, since the ext_pgs mbuf code
is not yet enabled.
2020-08-12 04:35:49 +00:00
Conrad Meyer
0ac9e27ba9 devfs: Abstract locking assertions
The conversion was largely mechanical: sed(1) with:

  -e 's|mtx_assert(&devmtx, MA_OWNED)|dev_lock_assert_locked()|g'
  -e 's|mtx_assert(&devmtx, MA_NOTOWNED)|dev_lock_assert_unlocked()|g'

The definitions of these abstractions in fs/devfs/devfs_int.h are the
only non-mechanical change.

No functional change.
2020-08-12 00:32:31 +00:00
Mateusz Guzik
3b44443626 devfs: rework si_usecount to track opens
This removes a lot of special casing from the VFS layer.

Reviewed by:	kib (previous version)
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D25612
2020-08-11 14:27:57 +00:00
Rick Macklem
02511d2112 Add an argument to newnfs_connect() that indicates use TLS for the connection.
For NFSv4.0, the server creates a server->client TCP connection for callbacks.
If the client mount on the server is using TLS, enable TLS for this callback
TCP connection.
TLS connections from clients will not be supported until the kernel RPC
changes are committed.

Since this changes the internal ABI between the NFS kernel modules that
will require a version bump, delete newnfs_trimtrailing(), which is no
longer used.

Since LCL_TLSCB is not yet set, these changes should not have any semantic
affect at this time.
2020-08-11 00:26:45 +00:00
Mateusz Guzik
03337743db vfs: clean MNTK_FPLOOKUP if MNT_UNION is set
Elides checking it during lookup.
2020-08-10 11:51:21 +00:00
Mateusz Guzik
ca423b858b devfs: bool -> int
Fixes buildworld after r364069
2020-08-10 11:46:39 +00:00
Mateusz Guzik
7b19bddac8 devfs: save on spurious relocking for devfs_populate
Tested by:	pho
2020-08-10 10:36:43 +00:00
Mateusz Guzik
f8935a96d1 devfs: use cheaper lockmgr entry points
Tested by:	pho
2020-08-10 10:36:10 +00:00
Mateusz Guzik
f9c13ab856 devfs: use vget_prep/vget_finish
Tested by:	pho
2020-08-10 10:35:47 +00:00
Mateusz Guzik
fc9fcee01a nullfs: add missing VOP_STAT handling
Tested by:	pho
2020-08-10 10:31:17 +00:00
Mateusz Guzik
9a14439f2f tmpfs: add VOP_STAT handler 2020-08-07 23:07:47 +00:00
Mateusz Guzik
d292b1940c vfs: remove the obsolete privused argument from vaccess
This brings argument count down to 6, which is passable without the
stack on amd64.
2020-08-05 09:27:03 +00:00
Rick Macklem
cb889ce631 Add optional support for ext_pgs mbufs to the NFS server's read, readlink
and getxattr operations.

This patch optionally enables generation of read, readlink and getxattr replies
in ext_pgs mbufs.  Since neither of ND_EXTPG or ND_TLS are currently ever set,
there is no change in semantics at this time.
It also corrects the message in a couple of panic()s that should never occur.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-31 23:35:49 +00:00
Rick Macklem
ea83d07e82 Add support for ext_pgs mbufs to nfsrvd_readdir() and nfsrvd_readdirplus().
This patch code that optionally (based on ND_TLS, never set yet) generates
readdir replies in ext_pgs mbufs.
To trim the list back, a new function that is ext_pgs aware called
nfsm_trimtrailing() replaces newnfs_trimtrailing().
newnfs_trimtrailing() is no longer used, but will be removed in a future
commit, since its removal does modify the internal kpi between the NFS
modules.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-29 22:58:08 +00:00
Rick Macklem
194d870481 Fix the NFSv4 client so that it checks for support of TimeCreate before
trying to set it.

r362490 added support for setting of the TimeCreate (va_birthtime) attribute,
but it does so without checking to see if the server supports the attribute.
This could result in NFSERR_ATTRNOTSUPP error replies to the Setattr operation.
This patch adds code to check that the server supports TimeCreate before
attempting to do a Setattr of it to avoid these error returns.
2020-07-26 23:13:10 +00:00
Rick Macklem
2de592f6e1 Fix the NFS server so that it sets va_birthtime.
r362490 marked that the NFSv4 attribute TimeCreate (va_birthtime) is supported,
but it did not change the NFS server code to actually do it.
As such, errors could occur when unrolling a tarball onto an NFSv4 mounted
volume, since setting TimeCreate would fail with a NFSERR_ATTRNOTSUPP reply.

This patch fixes the server so that it does TimeCreate and also makes
sure that TimeCreate will not be set for a DS file for a pNFS server.

A separate commit will add a check to the NFSv4 client for support of
the TimeCreate attribute before attempting to set it, to avoid a problem
when mounting a server that does not support the attribute.
The failures will still occur for r362490 or later kernels that do not
have this patch, since they indicate support for the attribute, but do not
actually support the attribute.
2020-07-26 23:03:41 +00:00
Rick Macklem
18a48314ba Add support for ext_pgs mbufs to nfsrv_adj().
This patch uses a slightly different algorithm for nfsrv_adj()
since ext_pgs mbuf lists are not permitted to have m_len == 0 mbufs.
As such, the code now frees mbufs after the adjustment in the list instead
of setting their m_len field to 0.
Since mbuf(s) may be trimmed off the tail of the list, the function now
returns a pointer to the last mbuf in the list.  This saves the caller
from needing to use m_last() to find the last mbuf.
It also implies that it might return a nul list, which required a check for
that in nfsrvd_readlink().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Use of ext_pgs mbufs will not be enabled until the kernel RPC is updated
to handle TLS.
2020-07-26 02:42:09 +00:00
Mateusz Guzik
172ffe702c tmpfs: add support for lockless lookup
Reviewed by:    kib
Tested by:      pho (in a patchset)
Differential Revision:	https://reviews.freebsd.org/D25580
2020-07-25 10:38:44 +00:00
Rick Macklem
cfaafa7908 Add support for ext_pgs mbufs to nfsm_uiombuflist() and nfsm_split().
This patch uses a slightly different algorithm for nfsm_uiombuflist() for
the non-ext_pgs case, where a variable called "mcp" is maintained, pointing to
the current location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.
The patch also deletes come unneeded code.

It also adds support for anonymous page ext_pgs mbufs to nfsm_split().

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.
At this time for this case, use of ext_pgs mbufs cannot be enabled, since
ktls_encrypt() replaces the unencrypted data with encrypted data in place.

Until such time as this can be enabled, there should be no semantic change.
Also, note that this code is only used by the NFS client for a mirrored pNFS
server.
2020-07-24 23:17:09 +00:00
Mark Johnston
cbef26ed16 cuse: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25765
2020-07-23 14:03:37 +00:00
Rick Macklem
9516bcdfb4 Modify writing to mirrored pNFS DSs to prepare for use of ext_pgs mbufs.
This patch modifies writing to mirrored pNFS DSs slightly so that there is
only one m_copym() call for a mirrored pair instead of two of them.
This call replaces the custom nfsm_copym() call, which is no longer needed
and deleted by this patch. The patch does introduce a new nfsm_split()
function that only calls m_split() for the non-ext_pgs case.
The semantics of nfsm_uiombuflist() is changed to include code that nul
pads the generated mbuf list. This was done by nfsm_copym() prior to this patch.

The main reason for this change is that it allows the data to be a list
of ext_pgs mbufs, since the m_copym() is for the entire mbuf list.
This support will be added in a future commit.

This patch only affects writing to mirrored flexible file layout pNFS servers.
2020-07-22 23:33:37 +00:00
Alexander V. Chernikov
e1c05fd290 Transition from rtrequest1_fib() to rib_action().
Remove all variations of rtrequest <rtrequest1_fib, rtrequest_fib,
 in6_rtrequest, rtrequest_fib> and their uses and switch to
 to rib_action(). This is part of the new routing KPI.

Submitted by: Neel Chauhan <neel AT neelc DOT org>
Differential Revision: https://reviews.freebsd.org/D25546
2020-07-21 19:56:13 +00:00
Mark Johnston
39bc40e3d2 ext2fs: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	fsu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25707
2020-07-20 14:28:26 +00:00
Alexander V. Chernikov
725871230d Temporarly revert r363319 to unbreak the build.
Reported by:	CI
Pointy hat to: melifaro
2020-07-19 10:53:15 +00:00
Alexander V. Chernikov
8cee15d9e4 Transition from rtrequest1_fib() to rib_action().
Remove all variations of rtrequest <rtrequest1_fib, rtrequest_fib,
 in6_rtrequest, rtrequest_fib> and their uses and switch to
to rib_action(). This is part of the new routing KPI.

Submitted by:	Neel Chauhan <neel AT neelc DOT org>
Differential Revision:	https://reviews.freebsd.org/D25546
2020-07-19 09:29:27 +00:00
Rick Macklem
7477442fdd Fix the pNFS flexible file layout client for servers with small write size.
The code in nfscl_dofflayout() loops when a flexible file layout server
provides a small write data limit (no extant server is known to do this).
If/when it looped, it erroneously reused the "drpc" argument for the
mirror worker thread, corrupting it.
This patch fixes the problem by only using the calling thread after the
first loop iteration.

Found during testing by simulating a server with a small write size.

Since no extant pNFS server is known to provide a small write size,
this fix it not needed in practice at this time.

MFC after:	2 weeks
2020-07-15 01:26:28 +00:00
Rick Macklem
6722f6e577 Minor code cleanup that removes "nd->nd_bpos = mcp;" in both if and else.
The statement "nd->nd_bpos = mcp;" was in both the if and else. Correct,
but potentially confusing.  This patch fixes this.

There should be no semantics change caused by this commit.
2020-07-13 01:28:45 +00:00
Rick Macklem
3eaf03766e Add support for ext_pgs mbufs to nfsm_uiombuf().
This patch uses a slightly different algorithm for the non-ext_pgs case,
where a variable called "mcp" is maintained, pointing to the current
location that mbuf data can be filled into. This avoids use of
mtod(mp, char *) + mp->m_len to calculate the location, since this does
not work for ext_pgs mbufs and I think it makes the algorithm more readable.
This change should not result in semantic changes for the non-ext_pgs case.

This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-08 02:28:08 +00:00
Rick Macklem
022346fa62 Add support for ext_pgs mbufs to nfsrvd_rephead().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-07 00:42:23 +00:00
Rick Macklem
34fc29e0c9 Add support for ext_pgs mbufs to nfsm_strtom().
Also, add a new function nfsm_add_ext_pgs() which will either add a page
or add a new ext_pgs mbuf with a page to the mbuf list. Used by nfsm_strtom().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-05 21:55:16 +00:00
Mateusz Guzik
11c345b18f devfs: fix a vnode use-after-free in devfs_ioctl
The vnode to be replaced was read with a shared lock, meaning 2 racing threads
can find the same one.

While here clean it up a little bit.
2020-07-04 06:27:28 +00:00
Rick Macklem
dccb580624 Add support for ext_pgs mbufs to nfscl_reqstart() and nfsm_set().
This is another in the series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-04 03:28:13 +00:00
Rick Macklem
606007409c Fix build breakage caused by r362903. Only pmap.h is needed now, but
vm_page.h and vm_pageout.h is needed later, so put them in now.

Pointy hat goes on me.
2020-07-03 05:21:05 +00:00
Rick Macklem
2da1527844 Add support for ext_pgs mbufs to nfsm_build().
This is the first of a series of commits that add support to the NFS client
and server for building RPC messages in ext_pgs mbufs with anonymous pages.
This is useful so that the entire mbuf list does not need to be
copied before calling sosend() when NFS over TLS is enabled.

Since ND_EXTPG is never set yet, there is no semantic change at this time.
2020-07-03 01:19:29 +00:00
Rick Macklem
4476c1def0 Add a boolean argument to nfscl_reqstart() to indicate that ext_pgs mbufs
should be used.

For KERN_TLS (and possibly some other future network interface) the mbuf
list passed into sosend() must be ext_pgs mbufs. The krpc could simply
copy all the mbuf data into ext_pgs mbufs before calling sosend(), but
that would be inefficient for large RPC messages.
This patch adds an argument to nfscl_reqstart() to indicate that it should
fill the RPC message into ext_pgs mbufs.
It also adds fields to "struct nfsrv_descript" needed for building NFS RPC
messages in ext_pgs mbufs, along with new flags for this.

Since the argument is always "false", this commit should not result in any
semantic change. However, this commit prepares the code
for future commits that will add support for building of NFS RPC messages
in ext_pgs mbufs.
2020-06-26 03:11:54 +00:00
Mark Johnston
84242cf68a Call swap_pager_freespace() from vm_object_page_remove().
All vm_object_page_remove() callers, except
linux_invalidate_mapping_pages() in the LinuxKPI, free swap space when
removing a range of pages from an object.  The LinuxKPI case appears to
be an unintentional omission that could result in leaked swap blocks, so
unconditionally free swap space in vm_object_page_remove() to protect
against similar bugs in the future.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25329
2020-06-25 15:21:21 +00:00
Doug Rabson
c07782e10e Add some missing parts for supporting va_birthtime.
Reviewed by:	rmacklem
2020-06-22 08:23:16 +00:00
Thomas Munro
f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
Alan Somers
eea79fde5a Remove vfs_statfs and vnode_mount macros from NFS
These macro definitions are no longer needed as the NFS OSX port is long
dead.  The vfs_statfs macro conflicts with the vfsops field of the same
name.

Submitted by:	shivank@
Reviewed by:	rmacklem
MFC after:	2 weeks
Sponsored by:	Google, Inc. (GSoC 2020)
Differential Revision:	https://reviews.freebsd.org/D25263
2020-06-17 16:20:19 +00:00
Doug Rabson
3900c11481 Add support for the timecreate attribute
This maps to the va_birthtime VFS attribute.
2020-06-14 11:41:57 +00:00
Rick Macklem
1f7104d720 Fix export_args ex_flags field so that is 64bits, the same as mnt_flags.
Since mnt_flags was upgraded to 64bits there has been a quirk in
"struct export_args", since it hold a copy of mnt_flags
in ex_flags, which is an "int" (32bits).
This happens to currently work, since all the flag bits used in ex_flags are
defined in the low order 32bits. However, new export flags cannot be defined.
Also, ex_anon is a "struct xucred", which limits it to 16 additional groups.
This patch revises "struct export_args" to make ex_flags 64bits and replaces
ex_anon with ex_uid, ex_ngroups and ex_groups (which points to a
groups list, so it can be malloc'd up to NGROUPS in size.
This requires that the VFS_CHECKEXP() arguments change, so I also modified the
last "secflavors" argument to be an array pointer, so that the
secflavors could be copied in VFS_CHECKEXP() while the export entry is locked.
(Without this patch VFS_CHECKEXP() returns a pointer to the secflavors
array and then it is used after being unlocked, which is potentially
a problem if the exports entry is changed.
In practice this does not occur when mountd is run with "-S",
but I think it is worth fixing.)

This patch also deleted the vfs_oexport_conv() function, since
do_mount_update() does the conversion, as required by the old vfs_cmount()
calls.

Reviewed by:	kib, freqlabs
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D25088
2020-06-14 00:10:18 +00:00
Ryan Moeller
693d10a291 tmpfs: Preserve alignment of struct fid fields
On 64-bit platforms, the two short fields in `struct tmpfs_fid` are padded to
the 64-bit alignment of the long field.  This pushes the offsets of the
subsequent fields by 4 bytes and makes `struct tmpfs_fid` bigger than
`struct fid`.  `tmpfs_vptofh()` casts a `struct fid *` to `struct tmpfs_fid *`,
causing 4 bytes of adjacent memory to be overwritten when the struct fields are
set.  Through several layers of indirection and embedded structs, the adjacent
memory for one particular call to `tmpfs_vptofh()` happens to be the stack
canary for `nfsrvd_compound()`.  Half of the canary ends up being clobbered,
going unnoticed until eventually the stack check fails when `nfsrvd_compound()`
returns and a panic is triggered.

Instead of duplicating fields of `struct fid` in `struct tmpfs_fid`, narrow the
struct to cover only the unique fields for tmpfs and assert at compile time
that the struct fits in the allotted space.  This way we don't have to
replicate the offsets of `struct fid` fields, we just use them directly.

Reviewed by:	kib, mav, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25077
2020-06-03 09:38:51 +00:00
Alexander V. Chernikov
9d5df78e64 Fix NOINET6 build broken by r361575.
Reported by:	ci, hps
2020-05-28 09:52:28 +00:00
Alexander V. Chernikov
c74ce5cca3 Make NFS address selection use fib4_lookup().
fib4_lookup_nh_ represents pre-epoch generation of fib api,
providing less guarantees over pointer validness and requiring
on-stack data copying.
Switch call to use new fib4_lookup(), allowing to eventually
deprecate old api.

Differential Revision:	https://reviews.freebsd.org/D24977
2020-05-28 07:35:07 +00:00
Conrad Meyer
852c303b61 copystr(9): Move to deprecate (attempt #2)
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro.  Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.

Prior commit message:

Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:		jhb (earlier version)
Discussed with:		brooks (thanks!)
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-25 16:40:48 +00:00
Alexander V. Chernikov
2bbab0af6d Use epoch(9) for rtentries to simplify control plane operations.
Currently the only reason of refcounting rtentries is the need to report
 the rtable operation details immediately after the execution.
Delaying rtentry reclamation allows to stop refcounting and simplify the code.
Additionally, this change allows to reimplement rib_lookup_info(), which
 is used by some of the customers to get the matching prefix along
 with nexthops, in more efficient way.

The change keeps per-vnet rtzone uma zone. It adds nh_vnet field to
 nhop_priv to be able to reliably set curvnet even during vnet teardown.
Rest of the reference counting code will be removed in the D24867 .

Differential Revision:	https://reviews.freebsd.org/D24866
2020-05-23 10:21:02 +00:00
Alan Somers
bfcb817bcd Fix issues with FUSE_ACCESS when default_permissions is disabled
This patch fixes two issues relating to FUSE_ACCESS when the
default_permissions mount option is disabled:

* VOP_ACCESS() calls with VADMIN set should never be sent to a fuse server
  in the form of FUSE_ACCESS operations. The FUSE protocol has no equivalent
  of VADMIN, so we must evaluate such things kernel-side, regardless of the
  default_permissions setting.

* The FUSE protocol only requires FUSE_ACCESS to be sent for two purposes:
  for the access(2) syscall and to check directory permissions for
  searchability during lookup. FreeBSD sends it much more frequently, due to
  differences between our VFS and Linux's, for which FUSE was designed. But
  this patch does eliminate several cases not required by the FUSE protocol:

  * for any FUSE_*XATTR operation
  * when creating a new file
  * when deleting a file
  * when setting timestamps, such as by utimensat(2).

* Additionally, when default_permissions is disabled, this patch removes one
  FUSE_GETATTR operation when deleting a file.

PR:		245689
Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24777
2020-05-22 18:11:17 +00:00
Alan Somers
7096c29e5b Disable nullfs cacheing on top of fusefs
Nullfs cacheing can keep a large number of vnodes active.  That results in
more active FUSE file handles, causing some FUSE servers to use extra
resources.  Disable nullfs cacheing for fusefs, just like we already do for
NFSv4.

PR:		245688
Reported by:	MooseFS FreeBSD Team <freebsd@moosefs.pro>
MFC after:	2 weeks
2020-05-22 18:03:14 +00:00
Ryan Moeller
245bfd34da Deduplicate fsid comparisons
Comparing fsid_t objects requires internal knowledge of the fsid structure
and yet this is duplicated across a number of places in the code.

Simplify by creating a fsidcmp function (macro).

Reviewed by:	mjg, rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24749
2020-05-21 01:55:35 +00:00
Rick Macklem
3d7650f04c Add a function nfsm_set() to initialize "struct nfsrv_descript" for building
mbuf lists.

This function is currently trivial, but will that will change when
support for building NFS messages in ext_pgs mbufs is added.
Adding support for ext_pgs mbufs is needed for KERN_TLS, which will
be used to implement nfs-over-tls.
2020-05-18 00:07:45 +00:00
Fedor Uporov
cd3acfe7f3 Add BE architectures support.
Author of most initial version: pfg (https://reviews.freebsd.org/D23259)

Reviewed by:    pfg
MFC after:      3 months

Differential Revision:    https://reviews.freebsd.org/D24685
2020-05-17 14:52:54 +00:00
Fedor Uporov
4bd6d63dc5 Restrict the max runp and runb return values in case of extents mapping.
This restriction already present in case of indirect mapping, do the same
in case of extents.

PR:		246182
Reported by:	Teran McKinney
MFC after:	2 weeks
2020-05-17 14:10:46 +00:00
Fedor Uporov
86e2d48bf9 Fix incorrect inode link count check in case of rename.
The check was incorrect because the directory inode link count have
min value 2 after dir_nlink extfs feature introduction.
2020-05-17 14:03:13 +00:00
Fedor Uporov
ec81c9cc06 Add inode bitmap tail initialization.
Make ext2fs compatible with changes introduced in e2fsprogs v1.45.2.
Now the tail of inode bitmap is filled with 0xff pattern explicitly during
bitmap initialization phase to avoid e2fsck error like:
"Padding at end of inode bitmap is not set."
2020-05-17 14:00:54 +00:00
John Baldwin
07a34ce381 Remove unused header for DES.
The NFS port doesn't use any of the DES functions.
2020-05-13 18:35:02 +00:00
Ryan Moeller
b9cc3262bc nfs: Remove APPLESTATIC macro
It is no longer useful.

Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24811
2020-05-12 13:23:25 +00:00
Conrad Meyer
051fc58cb3 Revert r360944 and r360946 until reported issues can be resolved
Reported by:	cy
2020-05-12 04:34:26 +00:00
Conrad Meyer
580744621f copystr(9): Move to deprecate [2/2]
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy.

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-11 22:57:21 +00:00
Alan Somers
9d4e48aebf fusefs: better dtrace probes for asynchronous invalidation operations
MFC after:	2 weeks
2020-05-08 22:26:52 +00:00
Ryan Moeller
32033b3d30 Remove APPLEKEXT ifndefs
They are no longer useful.

Reviewed by:	rmacklem
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D24752
2020-05-08 14:39:38 +00:00
Rick Macklem
04d6c514b0 Delete unused function newnfs_trimleading.
The NFS function called newnfs_trimleading() has not been used by the
code in long time. To give you a clue, it still had a K&R style function
declaration.
Delete it, since it is just cruft, as a part of the NFS mbuf handling
cleanup in preparation for adding ext_pgs mbuf support.
The ext_pgs mbuf support for the build/send side is needed by
nfs-over-tls.
2020-05-06 00:44:03 +00:00
Rick Macklem
3973ef1dfc Revert r360514, to avoid unnecessary churn of the sources.
r360514 prepared the NFS code for changes to handle ext_pgs mbufs on
the receive side. However, at this time, KERN_TLS does not pass
ext_pgs mbufs up through soreceive(). As such, as this time, only
the send/build side of the NFS mbuf code needs to handle ext_pgs mbufs.
Revert r360514 since the rather extensive changes required for receive
side ext_pgs mbufs are not yet needed.
This avoids unnecessary churn of the sources.
2020-05-05 00:58:03 +00:00
Rick Macklem
0c9cd5cacd Factor some code out of nfsm_dissct() into separate functions.
Factoring some of the code in nfsm_dissct() out into separate functions
allows these functions to be used elsewhere in the NFS mbuf handling code.
Other uses of these functions will be done in future commits.
It also makes it easier to add support for ext_pgs mbufs, which is needed
for nfs-over-tls under development in base/projects/nfs-over-tls.

Although the algorithm in nfsm_dissct() is somewhat re-written by this
patch, the semantics of nfsm_dissct() should not have changed.
2020-05-01 00:36:14 +00:00
Rick Macklem
5ecf33c6c4 Get rid of uio_XXX macros used for the Mac OS/X port.
The NFS code had a bunch of Mac OS/X accessor functions named uio_XXX
left over from the port to Mac OS/X. Since that port is long forgotten,
replace the calls with the code generated by the FreeBSD macros for these
in nfskpiport.h. This allows the macros to be deleted from nfskpiport.h
and I think makes the code more readable.

This patch should not result in any semantic change.
2020-04-28 02:11:02 +00:00
Mark Johnston
9b22722423 Call pipeselwakeup() after toggling PIPE_EOF.
This ensures that pipe_poll() and the pipe kqueue filters observe
PIPE_EOF and set EV_EOF accordingly.  As a result an extra call to
knote() after setting PIPE_EOF is unnecessary.

Submitted by:	Jan Kokemüller <jan.kokemueller@gmail.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24528
2020-04-27 15:59:07 +00:00
Rick Macklem
e4a458bb1b Remove Mac OS/X macros that did nothing for FreeBSD.
The macros CAST_USER_ADDR_T() and CAST_DOWN() were used for the Mac OS/X
port. The first of these macros was a no-op for FreeBSD and the second
is no longer used.
This patch gets rid of them. It also deletes the "mbuf_t" typedef which
is no longer used in the FreeBSD code from nfskpiport.h

This patch should not change semantics.
2020-04-25 02:18:59 +00:00
Rick Macklem
897d7d45ba Make the NFSv4.n client's recovery from NFSERR_BADSESSION RFC5661 conformant.
RFC5661 specifies that a client's recovery upon receipt of NFSERR_BADSESSION
should first consist of a CreateSession operation using the extant ClientID.
If that fails, then a full recovery beginning with the ExchangeID operation
is to be done.
Without this patch, the FreeBSD client did not attempt the CreateSession
operation with the extant ClientID and went directly to a full recovery
beginning with ExchangeID. I have had this patch several years, but since
no extant NFSv4.n server required the CreateSession with extant ClientID,
I have never committed it.
I an committing it now, since I suspect some future NFSv4.n server will
require this and it should not negatively impact recovery for extant NFSv4.n
servers, since they should all return NFSERR_STATECLIENTID for this first
CreateSession.

The patched client has been tested for recovery against both the FreeBSD
and Linux NFSv4.n servers and no problems have been observed.

MFC after:	1 month
2020-04-22 21:00:14 +00:00
Edward Tomasz Napierala
d499502db7 Silence down a warning which should really be a debug message.
MFC after:	2 weeks
Sponsored by:	DARPA
2020-04-21 13:57:51 +00:00
Rick Macklem
ae070589d3 Replace all instances of the typedef mbuf_t with "struct mbuf *".
The typedef mbuf_t was used for the Mac OS/X port of the code long ago.
Since this port is no longer used and the use of mbuf_t obscures what
the code does (and is not consistent with style(9)), it is no longer needed.
This patch replaces all instances of mbuf_t with "struct mbuf *", so that
it is no longer used.

This patch should not result in any semantic change.
2020-04-17 21:17:51 +00:00
Rick Macklem
82164bdd76 Add a sanity check for nes_numsecflavor to the NFS server.
Ryan Moeller reported crashes in the NFS server that appear to be
caused by stack corruption in nfsrv_compound(). It appears that
the stack got corrupted just after a NFSv4.1 Lookup that crosses
a server mount point.
Although it is just a "theory" at this point, the most obvious way
the stack could get corrupted would be if nfsvno_checkexp() somehow
acquires an export with a bogus nes_numsecflavor value. This would
cause the copying of the secflavors to run off the end of the array,
which is allocated on the stack below where the corruption occurs.

This sanity check is simple to do and would stop the stack corruption
if the theory is correct. Otherwise, doing the sanity check seems to
be a reasonable safety belt to add to the code.

Reported by:	freqlabs
MFC after:	2 weeks
2020-04-17 02:21:46 +00:00
Rick Macklem
0bda1ddd33 Fix the NFSv4.2 extended attribute support for remove extended attrbute.
I missed the "atomic" field of the RemoveExtendedAttribute operation's
reply when I implemented it. It worked between FreeBSD client and server,
since it was missed for both, but it did not conform to RFC 8276.
This patch adds the field for both client and server.

Thanks go to Frank for doing interoperability testing of the extended
attribute support against patches for Linux.

Submitted by:	Frank van der Linden <fllinden@amazon.com>
Reported by:	Frank van der Linden <fllinden@amazon.com>
2020-04-15 21:27:52 +00:00
Rick Macklem
fb8ed4c5f8 Fix the NFSv2 extended attribute support to handle 0 length attributes.
I did not realize that zero length attributes are allowed, but they are.
This patch fixes the NFSv4.2 client and server to handle zero length
extended attributes correctly.

Submitted by:	Frank van der Linden <fllinden@amazon.com> (earlier version)
Reported by:	Frank van der Linden <fllinder@amazon.com>
2020-04-14 22:57:21 +00:00
Rick Macklem
9897e357de Re-organize the NFS file handle affinity code for the NFS server.
The file handle affinity code was configured to be used by both the
old and new NFS servers. This no longer makes sense, since there is
only one NFS server.
This patch copies a majority of the code in sys/nfs/nfs_fha.c and
sys/nfs/nfs_fha.h into sys/fs/nfsserver/nfs_fha_new.c and
sys/fs/nfsserver/nfs_fha_new.h, so that the files in sys/nfs can be
deleted. The code is simplified by deleting the function callback pointers
used to call functions in either the old or new NFS server and they were
replaced by calls to the functions.

As well as a cleanup, this re-organization simplifies the changes
required for handling of external page mbufs, which is required for KERN_TLS.

This patch should not result in a semantic change to file handle affinity.
2020-04-14 00:01:26 +00:00
Rick Macklem
66ea9219a2 Delete the mbuf macros that were used for the Mac OS/X port.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since r359757, r359780, r359785, r359810, r359811 have removed all uses
of these macros, this patch deleted the macros from the .h files.

My eventual goal is deleting nfskpiport.h, but that will take some more
editting to replace uses of the remaining macros.
2020-04-13 00:07:37 +00:00
Rick Macklem
e3e7c612f3 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.

This is the final patch of this series and the macros should now be
able to be deleted from the .h files in a future commit.
2020-04-11 23:37:58 +00:00
Rick Macklem
9f6624d317 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
2020-04-11 20:57:15 +00:00
Rick Macklem
3133bbf7a4 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
This conversion will be committed one file at a time.
2020-04-10 22:42:14 +00:00
Rick Macklem
28e8046b2e Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
This conversion will be committed one file at a time.
2020-04-10 21:25:35 +00:00
Rick Macklem
c948a17a52 Replace mbuf macros with the code they would generate in the NFS code.
When the code was ported to Mac OS/X, mbuf handling functions were
converted to using the Mac OS/X accessor functions. For FreeBSD, they
are a simple set of macros in sys/fs/nfs/nfskpiport.h.
Since porting to Mac OS/X is no longer a consideration, replacement of
these macros with the code generated by them makes the code more
readable.
When support for external page mbufs is added as needed by the KERN_TLS,
the patch becomes simpler if done without the macros.

This patch should not result in any semantic change.
This conversion will be committed one file at a time.
2020-04-09 23:11:19 +00:00
Rick Macklem
8de97f394e Remove the old NFS lock device driver that uses Giant.
This NFS lock device driver was replaced by the kernel NLM around FreeBSD7 and
has not normally been used since then.
To use it, the kernel had to be built without "options NFSLOCKD" and
the nfslockd.ko had to be deleted as well.
Since it uses Giant and is no longer used, this patch removes it.

With this device driver removed, there is now a lot of unused code
in the userland rpc.lockd. That will be removed on a future commit.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22933
2020-04-09 14:44:46 +00:00
Rick Macklem
b0b7d978b6 Fix an interoperability issue w.r.t. the Linux client and the NFSv4 server.
Luoqi Chen reported a problem on freebsd-fs@ where a Linux NFSv4 client
was able to open and write to a file when the file's permissions were
not set to allow the owner write access.

Since NFS servers check file permissions on every write RPC, it is standard
practice to allow the owner of the file to do writes, regardless of
file permissions. This provides POSIX like behaviour, since POSIX only
checks permissions upon open(2).
The traditional way NFS clients handle this is to check access via the
Access operation/RPC and use that to determine if an open(2) on the
client is allowed.

It appears that, for NFSv4, the Linux client expects the NFSv4 Open (not a
POSIX open) operation to fail with NFSERR_ACCES if the file is not being
created and file permissions do not allow owner access, unlike NFSv3.
Since both the Linux and OpenSolaris NFSv4 servers seem to exhibit this
behaviour, this patch changes the FreeBSD NFSv4 server to do the same.
A sysctl called vfs.nfsd.v4openaccess can be set to 0 to return the
NFSv4 server to its previous behaviour.

Since both the Linux and FreeBSD NFSv4 clients seem to exhibit correct
behaviour with the access check for file owner in Open enabled, it is enabled
by default.

Reported by:	luoqi.chen@gmail.com
MFC after:	2 weeks
2020-04-08 01:12:54 +00:00
Rick Macklem
76fd19b0a2 Fix noisy NFSv4 server printf.
Peter reported that his dmesg was getting cluttered with
nfsrv_cache_session: no session
messages when he rebooted his NFS server and they did not seem useful.
He was correct, in that these messages are "normal" and expected when
NFSv4.1 or NFSv4.2 are mounted and the server is rebooted.
This patch silences the printf() during the grace period after a reboot.
It also adds the client IP address to the printf(), so that the message
is more useful if/when it occurs. If this happens outside of the
server's grace period, it does indicate something is not working correctly.
Instead of adding yet another nd_XXX argument, the arguments for
nfsrv_cache_session() were simplified to take a "struct nfsrv_descript *".

Reported by:	pen@lysator.liu.se
MFC after:	2 weeks
2020-04-06 23:21:39 +00:00
John Baldwin
59838c1a19 Retire procfs-based process debugging.
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging.  ptrace() has a supserset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code.  This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR:		244939 (exp-run)
Reviewed by:	kib, mjg (earlier version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23837
2020-04-01 19:22:09 +00:00
Hans Petter Selasky
98029019b6 Fine grain locking inside the cuse(3) kernel module.
Implement one mutex per cuse(3) server instance which also cover the
clients belonging to the given server instance.

This should significantly reduce the mutex congestion inside the
cuse(3) kernel module when multiple servers are in use.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-03-30 18:25:43 +00:00
Alan Somers
9338f18965 fusefs: add a dtrace probe that fires after mounting is complete
This probe is useful for showing the protocol options negotiated with a FUSE
server.

MFC after:	2 weeks
2020-03-30 14:03:35 +00:00
Mark Johnston
355b3b7fd7 Simplify td_ucred handling in newnfs_connect().
No functional change intended.

MFC after:	1 week
2020-03-26 15:02:56 +00:00
John Baldwin
8d8a74e69e Mark procfs-based process debugging as deprecated for FreeBSD 13.
Attempting to use ioctls on /proc/<pid>/mem to control a process will
trigger warnings on the console.  The <sys/pioctl.h> include file will
also now emit a compile-time warning when used from userland.

Reviewed by:	emaste
MFC after:	1 week
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23822
2020-03-17 18:44:03 +00:00
Edward Tomasz Napierala
fb48a42f03 Make autofs(5) timeout messages include affected process name and PID.
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2020-03-16 16:17:58 +00:00
Alan Somers
b0ecfb42d1 fusefs: avoid cache corruption with buggy fuse servers
The FUSE protocol allows the client (kernel) to cache a file's size, if the
server (userspace daemon) allows it. A well-behaved daemon obviously should
not change a file's size while a client has it cached. But a buggy daemon
might. If the kernel ever detects that that has happened, then it should
invalidate the entire cache for that file. Previously, we would not only
cache stale data, but in the case of a file extension while we had the size
cached, we accidentally extended the cache with zeros.

PR:		244178
Reported by:	Ben RUBSON <ben.rubson@gmx.com>
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D24012
2020-03-11 04:29:45 +00:00
Konstantin Belousov
c6d3d601c9 Preallocate pipe buffers on pipe creation.
Return ENOMEM if one of the buffer cannot be created even with the
minimal size.  This should avoid subsequent spurious ENOMEM errors
from write(2) when buffer cannot be allocated on the fly, after we
reported that the pipe was create succesfully.

Reported by:	Keno Fischer <keno@juliacomputing.com>
Reviewed by:	markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23993
2020-03-09 21:55:26 +00:00
Alan Somers
d970778e6f fusefs: fix fsync for files with multiple open handles
We were reusing a structure for multiple operations, but failing to
reinitialize one member.  The result is that a server that cares about FUSE
file handle IDs would see one correct FUSE_FSYNC operation, and one with the
FHID unset.

PR:		244431
Reported by:	Agata <chogata@gmail.com>
MFC after:	2 weeks
2020-03-09 01:57:21 +00:00
Chuck Silvers
f15ccf8836 Add a new "mntfs" pseudo file system which provides private device vnodes for
file systems to safely access their disk devices, and adapt FFS to use it.
Also add a new BO_NOBUFS flag to allow enforcing that file systems using
mntfs vnodes do not accidentally use the original devfs vnode to create buffers.

Reviewed by:	kib, mckusick
Approved by:	imp (mentor)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D23787
2020-03-06 18:41:37 +00:00
Mateusz Guzik
625adeaccd nullfs: don't pre lock exclusive in nullfs_root
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23955
2020-03-04 19:52:00 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Pawel Biernacki
d3d10ed299 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (10 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23629
2020-02-24 10:37:56 +00:00
Pawel Biernacki
ef06a80cdb Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (8 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Approved by:	kib (mentor, blanket)
Differential Revision:	https://reviews.freebsd.org/D23627
2020-02-24 10:33:51 +00:00
Konstantin Belousov
0ff51c98d1 Fix NFS client deadlock when read reports truncated node.
If node attribute returned in the reply for read rpc indicate
truncation, and it happens that the vnode is exclusively locked,
update of the node attributes would try to shrink vnode size.  Since
during the read some vnode pages were busied by the reading thread,
vnode_pager_setsize() deadlocks waiting for the busy state owned by
the caller.

Use a thread-local flag to indicate that NFS read owns some (s)busy
pages states and postpone the call to vnode_pager_setsize() until the
thread relinguishes the ownership.

Diagnosed by:	rlibby
Tested by:	pho, rlibby
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-02-22 20:50:30 +00:00
Fedor Uporov
3767ed5b11 Add a EXT2FS-specific implementation for lseek(SEEK_DATA).
The lseek(SEEK_DATA) optimization logic could be simply borrowed from ufs side.
See, https://reviews.freebsd.org/D19599.

Reviewed by:    pfg
MFC after:      1 week

Differential Revision:    https://reviews.freebsd.org/D23605
2020-02-18 16:39:57 +00:00
Pawel Biernacki
e0d69c5a88 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (1 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked). Use it in
preparation for a general review of all nodes.
This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Reviewed by:	kib, trasz
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D23640
2020-02-15 18:48:38 +00:00
Mateusz Guzik
074ad60a4c vfs: make write suspension mandatory
At the time opt-in was introduced adding yourself as a writer was esrializing
across the mount point. Nowadays it is fully per-cpu, the only impact being
a small single-threaded hit on top of what's there right now.

Vast majority of the overhead stems from the call to VOP_GETWRITEMOUNT which
has is done regardless.

Should someone want to microoptimize this single-threaded they can coalesce
looking the mount up with adding a write to it.
2020-02-15 13:00:39 +00:00
Konstantin Belousov
c1e84733ac tmpfs: add nomtime mount option,
which disables tracking mtime updates due to writes through the shared
mapped areas backed by tmpfs files.  This removes periodic scans which
downgrades rw mapped pages to ro to note the writes.

Suggested by:	mjg
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23432
2020-02-04 19:05:58 +00:00
Konstantin Belousov
b66352b787 tmpfs_mount update: simplify, cache the value of VFS_TO_TMPFS() calculation.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-02-04 18:52:25 +00:00
Mateusz Guzik
2abdae33b1 tmpfs: inline tmpfs_update
It was generated to be just a jumping off point to tmpfs_itimes.

While here provide a dedicated variant for getattr since we normally don't
expect to need to the update from that caller.
2020-02-03 17:06:21 +00:00
Mateusz Guzik
f1fa1ba3d0 Fix up various vnode-related asserts which did not dump the used vnode 2020-02-03 14:25:32 +00:00
Kyle Evans
6a5abb1ee5 Provide O_SEARCH
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.

This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.

This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23247
2020-02-02 16:34:57 +00:00
Kyle Evans
bd11e674ec pseudofs: don't do VEXEC check in VOP_CACHEDLOOKUP
VOP_CACHEDLOOKUP should assume that the appropriate VEXEC check has been
done in the caller (vfs_cache_lookup), so it does not belong here.
2020-02-02 15:36:12 +00:00
Mateusz Guzik
10a15df653 vfs: remove the never set VDESC_VPP_WILLRELE flag 2020-02-02 09:35:48 +00:00
Mateusz Guzik
45757984f8 vfs: consistently use size_t for buflen around VOP_VPTOCNP 2020-02-01 20:34:43 +00:00
Konstantin Belousov
dc1d2cc648 Fix a bug in r357199.
Around a generic call to null_nodeget(), there is nothing that would
prevent the unmount of the nullfs mp until we process to the
insmntque1() point.  Calculate the VV_ROOT flag after insmntque1() to
not access mp->mnt_data before we have an exclusively locked vnode
from this mount point on the mp vnode list.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-01-30 19:34:37 +00:00
Mateusz Guzik
3cfabd81a1 vfs: remove the never set VDESC_NOMAP_VPP flag 2020-01-30 08:56:22 +00:00
Konstantin Belousov
5fc9e11c42 Save lower root vnode in nullfs mnt data instead of upper.
Nullfs needs to know the root vnode of the lower fs during the
operation.  Currently it caches the upper vnode of it, which is also
the root of the nullfs mount.  On unmount, nullfs calls vflush() with
rootrefs == 1, and aborts non-forced unmount if there are any more
vnodes instantiated during vflush().  This means that the reference to
the root vnode after failed non-forced unmount could be lost and
nullm_rootvp points to the freed memory.

Fix it by storing the reference for lower vnode instead, which is kept
intact during vflush().  nullfs_root() now instantiates the upper
vnode of lower root.  Care about VV_ROOT flag in null_nodeget().

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-01-28 11:29:06 +00:00
Alex Richardson
162ae9c834 Allow bootstrapping makefs on older FreeBSD hosts and Linux/macOS
In order to do so we need to install the msdosfs headers to the bootstrap
sysroot and avoid includes of kernel headers that may not exist on every
host (e.g. sys/lockmgr.h). This change should allow bootstrapping of makefs
on FreeBSD 11+ as well as Linux and macOS.

We also have to avoid using the IO_SYNC macro since that may not be
available. In makefs it is only used to switch between calling
bwrite() and bdwrite() which both call the same function. Therefore we
can simply always call bwrite().

For our CheriBSD builds we always bootstrap makefs by setting
LOCAL_XTOOL_DIRS='lib/libnetbsd usr.sbin/makefs' and use the makefs binary
from the build tree to create a bootable disk image.

Reviewed By:	brooks
Differential Revision: https://reviews.freebsd.org/D23201
2020-01-27 12:02:41 +00:00
Rick Macklem
60a09a94cf Fix a crash in the NFSv4 server.
The PR reported a crash that occurred when a file was removed while
client(s) were actively doing lock operations on it.
Since nfsvno_getvp() will return NULL when the file does not exist,
the bug was obvious and easy to fix via this patch. It is a little
surprising that this wasn't found sooner, but I guess the above
case rarely occurs.

Tested by:	iron.udjin@gmail.com
PR:		242768
Reported by:	iron.udjin@gmail.com
MFC after:	2 weeks
2020-01-26 17:59:05 +00:00
Jeff Roberson
d6e13f3b4d Don't hold the object lock while calling getpages.
The vnode pager does not want the object lock held.  Moving this out allows
further object lock scope reduction in callers.  While here add some missing
paging in progress calls and an assert.  The object handle is now protected
explicitly with pip.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23033
2020-01-19 23:47:32 +00:00
Mateusz Guzik
d3cc535474 vfs: provide F_ISUNIONSTACK as a kludge for libc
Prior to introduction of this op libc's readdir would call fstatfs(2), in
effect unnecessarily copying kilobytes of data just to check fs name and a
mount flag.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D23162
2020-01-17 14:42:25 +00:00
Mateusz Guzik
e2fa68513e unionfs: use MNTK_NOMSYNC 2020-01-16 22:45:08 +00:00
Mateusz Guzik
2a829749d3 tmpfs: add missing CLTFLAG_MPSAFE annotation 2020-01-15 01:32:11 +00:00
Mateusz Guzik
7493134e08 nfs: add missing CLTFLAG_MPSAFE annotations 2020-01-15 01:31:57 +00:00
Mateusz Guzik
388820fbef fusefs: add missing CLTFLAG_MPSAFE annotation 2020-01-15 01:31:28 +00:00
Eric van Gyzen
cf64777f50 Add missing comma in nfsv4_errstr
Reported by:	Coverity
CID:		1412243
Sponsored by:	Dell EMC Isilon
2020-01-13 21:49:27 +00:00
Mateusz Guzik
cc3593fbd9 vfs: rework vnode list management
The current notion of an active vnode is eliminated.

Vnodes transition between 0<->1 hold counts all the time and the
associated traversal between different lists induces significant
scalability problems in certain workloads.

Introduce a global list containing all allocated vnodes. They get
unlinked only when UMA reclaims memory and are only requeued when
hold count reaches 0.

Sample result from an incremental make -s -j 104 bzImage on tmpfs:
stock:   118.55s user 3649.73s system 7479% cpu 50.382 total
patched: 122.38s user 1780.45s system 6242% cpu 30.480 total

Reviewed by:	jeff
Tested by:	pho (in a larger patch, previous version)
Differential Revision:	https://reviews.freebsd.org/D22997
2020-01-13 02:37:25 +00:00
Mateusz Guzik
57083d2576 vfs: add per-mount vnode lazy list and use it for deferred inactive + msync
This obviates the need to scan the entire active list looking for vnodes
of interest.

msync is handled by adding all vnodes with write count to the lazy list.

deferred inactive directly adds vnodes as it sets the VI_DEFINACT flag.

Vnodes get dequeued from the list when their hold count reaches 0.

Newly added MNT_VNODE_FOREACH_LAZY* macros support filtering so that
spurious locking is avoided in the common case.

Reviewed by:	jeff
Tested by:	pho (in a larger patch, previous version)
Differential Revision:	https://reviews.freebsd.org/D22995
2020-01-13 02:34:02 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Mateusz Guzik
4a20fe31c3 unionfs: fix up VOP_UNLOCK use after flags stopped being supported
For the most part the code was passing the LK_RELEASE flag.
The 2 cases which did not use the VOP_UNLOCK_FLAGS macro.

This fixes a panic when stacking unionfs on top of e.g., tmpfs when
debug is enabled.

Note there are latent bugs which prevent unionfs from working with debug
regardless of this change.

PR:		243064
Reported by:	Mason Loring Bliss
2020-01-03 22:12:25 +00:00
Mateusz Guzik
d2203b48a5 msdos: vgone unconstructed vnode before vputing it
Otherwise someone else may race to start using it. Race window
was opened by r351748 ("vfs: implement usecount implying holdcnt").

Noted by:	kib
2020-01-01 22:50:23 +00:00
Mateusz Guzik
f342b91c76 msdosfs: add a missing MNT_VNODE_FOREACH_ALL_ABORT to msdosfs_sync 2020-01-01 22:47:00 +00:00
Mark Johnston
9f5632e6c8 Remove page locking for queue operations.
With the previous reviews, the page lock is no longer required in order
to perform queue operations on a page.  It is also no longer needed in
the page queue scans.  This change effectively eliminates remaining uses
of the page lock and also the false sharing caused by multiple pages
sharing a page lock.

Reviewed by:	jeff
Tested by:	pho
Sponsored by:	Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D22885
2019-12-28 19:04:00 +00:00
Rick Macklem
57fd4aa7e4 Change NFSv4.1 and NFSv4.2 error strings to start with lower case letter.
r356084 added error strings for NFSv4.1 and NFSv4.2, with the first
character capitalized. Since the other error strings were not capitalized
and these strings would usually be imbedded in an error, I decided to
make the first characters lower cased.
No real effect but more consistent.
2019-12-26 21:06:34 +00:00
Rick Macklem
8f2940cec7 Add NFSv4.1 and NFSv4.2 errors to nfsv4_errstr.h.
nfsv4_errstr.h only had strings for NFSv4.0 errors. This patch adds the
errors for NFSv4.1 and NFSv4.2. At this time, this file is not used by
any sources in the tree, so the change is not significant.
I do plan on using nfsv4_errstr.h in a future patch to mount_nfs.c.
Since I am doing this patch so that "minor version mismatch" will be
recognized, I made that string less abbreviated.
2019-12-25 22:25:30 +00:00
Rick Macklem
05dcd5d2c8 Fix nfsmount() so that it will return NFSERR_MINORVERMISMATCH.
If nfsrpc_getdirpath() returns NFSERR_MINORVERMISMATCH, it would erroneously
get mapped to EIO. This was not particularily harmful, but would make it
hard for sysadmins to diagnose why an NFSv4 mount is failing.

mount_nfs.c still needs to be fixed so that it does not report
NFSERR_MINORVERMISMATCH as an unknown error 10021.

MFC after:	1 week
2019-12-25 01:15:38 +00:00
Doug Moore
f9f4c60aa7 Including <sys/tmpfs.h> into non-kernel software leads to a
compilation error because, without _KERNEL defined, the macro
TMPFS_VALIDATE_DIR is invoked, but never defined. User-level software
that includes sys/tmpfs.h must define _KERNEL to make the definition
of TMPFS_VALIDATE_DIR visible.

This change puts all the inline functions that, directly or
indirectly, invoke MPASS into the scope of the _KERNEL block, allowing
many user-space includers of <sys/tmpfs.h> to stop defining _KERNEL.

Reviewed by:	alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22874
2019-12-19 16:39:52 +00:00
Mateusz Guzik
6fa079fc3f vfs: flatten vop vectors
This eliminates the following loop from all VOP calls:

while(vop != NULL && \
    vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;

Reviewed by:	jeff
Tesetd by:	pho
Differential Revision:	https://reviews.freebsd.org/D22738
2019-12-16 00:06:22 +00:00
Jeff Roberson
a808177864 Add a deferred free mechanism for freeing swap space that does not require
an exclusive object lock.

Previously swap space was freed on a best effort basis when a page that
had valid swap was dirtied, thus invalidating the swap copy.  This may be
done inconsistently and requires the object lock which is not always
convenient.

Instead, track when swap space is present.  The first dirty is responsible
for deleting space or setting PGA_SWAP_FREE which will trigger background
scans to free the swap space.

Simplify the locking in vm_fault_dirty() now that we can reliably identify
the first dirty.

Discussed with:	alc, kib, markj
Differential Revision:	https://reviews.freebsd.org/D22654
2019-12-15 03:15:06 +00:00
Rick Macklem
f808cf7294 Silence some "might not be initialized" warnings for riscv64.
None of these case were actually using the variable(s) uninitialized, but
I figured that silencing the warnings via initializing them made sense.

Some of these predated r355677.
2019-12-13 21:38:08 +00:00
Rick Macklem
bf6ac05aa3 Add some more initializations to quiet riscv build.
The one case in nfs_copy_file_range() was a legitimate case, although
it would probably never occur in practice.
2019-12-13 01:34:25 +00:00
Rick Macklem
95bf2e523b Fix the build for MAC not defined and a couple of might not be initialized.
r355677 broke the build for the not MAC defined case and a couple of
might not be initialized warnings were generated for riscv. Others seem
to be erroneous.

Hopefully there won't be too many more build errors.

Pointy hat goes on me.
2019-12-13 00:45:14 +00:00
Rick Macklem
c057a37818 Add support for NFSv4.2 to the NFS client and server.
This patch adds support for NFSv4.2 (RFC-7862) and Extended Attributes
(RFC-8276) to the NFS client and server.
NFSv4.2 is comprised of several optional features that can be supported
in addition to NFSv4.1. This patch adds the following optional features:
   - posix_fadvise(POSIX_FADV_WILLNEED/POSIX_FADV_DONTNEED)
   - posix_fallocate()
   - intra server file range copying via the copy_file_range(2) syscall
     --> Avoiding data tranfer over the wire to/from the NFS client.
   - lseek(SEEK_DATA/SEEK_HOLE)
   - Extended attribute syscalls for "user" namespace attributes as defined
     by RFC-8276.

Although this patch is fairly large, it should not affect support for
the other versions of NFS. However it does add two new sysctls that allow
a sysadmin to limit which minor versions of NFSv4 a server supports, allowing
a sysadmin to disable NFSv4.2.

Unfortunately, when the NFS stats structure was last revised, it was assumed
that there would be no additional operations added beyond what was
specified in RFC-7862. However RFC-8276 did add additional operations,
forcing the NFS stats structure to revised again. It now has extra unused
entries in all arrays, so that future extensions to NFSv4.2 can be
accomodated without revising this structure again.

A future commit will update nfsstat(1) to report counts for the new NFSv4.2
specific operations/procedures.

This patch affects the internal interface between the nfscommon, nfscl and
nfsd modules and, as such, they all must be upgraded simultaneously.
I will do a version bump (although arguably not needed), due to this.

This code has survived a "make universe" but has not been built with a
recent GCC. If you encounter build problems, please email me.

Relnotes:	yes
2019-12-12 23:22:55 +00:00
Mateusz Guzik
c8b29d1212 vfs: locking primitives which elide ->v_vnlock and shared locking disablement
Both of these features are not needed by many consumers and result in avoidable
reads which in turn puts them on profiles due to cache-line ping ponging.

On top of that the current lockgmr entry point is slower than necessary
single-threaded. As an attempted clean up preparing for other changes,
provide new routines which don't support any of the aforementioned features.

With these patches in place vop_stdlock and vop_stdunlock disappear from
flamegraphs during -j 104 buildkernel.

Reviewed by:	jeff (previous version)
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22665
2019-12-11 23:11:21 +00:00
Mateusz Guzik
abd80ddb94 vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.

v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.

Reviewed by:	kib, jeff
Differential Revision:	https://reviews.freebsd.org/D22715
2019-12-08 21:30:04 +00:00
Rick Macklem
a95cd06e9a Delete an unused external declaration.
Since nfsv4_opflag is no longer used in nfs_clcomsubs.c, delete the
external declaration of it. Found during NFSv4.2 code merge.

MFC after:	2 weeks
2019-12-08 16:59:36 +00:00
Rick Macklem
8e1906f700 Fix kernel handling of a NFSERR_MINORVERSMISMATCH NFSv4 server reply.
When an NFSv4 server replies NFSERR_MINORVERSMISMATCH, it does not generate
a status result for the first operation in the compound. Without this
patch, this will result in a bogus EBADXDR error return.
Returning EBADXDR is relatively harmless, but a correct reply of
NFSERR_MINORVERSMISMATCH is needed by the pNFS client to select the correct
minor version to use for a File Layout DS now that there can be NFSv4.2
DS servers.

mount_nfs.c still needs to be fixed for this, although how the mount fails
is only useful to help sysadmins isolate why a mount fails.

Found during testing of the NFSv4.2 client and server.

MFC after:	2 weeks
2019-12-08 00:06:00 +00:00
Rick Macklem
238da71f91 Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-07 23:13:51 +00:00
Rick Macklem
394dae30b1 Set the XATTRSUPPORT attribute bit for NFSv4.2, always cleared for now.
Since r355472 added code which clears the XATTRSUPPORT bit for non-NFSv4.2
mounts, it is now safe to set it.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
This commit completes updates to nfsproto.h required by the NFSv4.2.
2019-12-07 01:10:38 +00:00
Rick Macklem
2096ce0339 Add a couple of definitions for NFSv4.2 and update macros to use them.
This patch adds code to macros to clear attribute bits not supported
by NFSv4.2. For now, these bits are never set anyhow, but this prepares
the code for the addition of NFSv4.2 support in a future commit.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-06 23:51:11 +00:00
Rick Macklem
8f9259a550 Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-06 01:53:02 +00:00
Mateusz Guzik
1e0006e49c nullfs: locklessly check for entries in null_hashget
During random sampling over poudriere -j 104 over 10% of calls returned NULL.
2019-12-05 13:41:22 +00:00
Konstantin Belousov
a51c8071a3 Stop using per-mount tmpfs zones.
Requested and reviewed by:	jeff
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D22643
2019-12-05 00:03:17 +00:00
Rick Macklem
348d9b567e Add some definitions for NFSv4.2 which will be used by subsequent commits.
This is a preliminary commit of NFSv4.2 definitions that will be used by
subsequent commits which adds NFSv4.2 support to the NFS client and server.

There will be a series of these preliminary commits that will prepare for
a major commit of the NFSv4.2 client/server changes currently found in
subversion under projects/nfsv42/sys.
2019-12-04 23:24:40 +00:00
Mateusz Guzik
ba08feecbf tmpfs: use proper macros for permission values in tmpfs_access
While here group them in one var to prevent overy long lines. Perhaps a
general macro of the same sort should be introduced.

Requested by:	kib
2019-12-01 00:34:49 +00:00
Kyle Evans
1b50b999f9 tty: implement TIOCNOTTY
Generally, it's preferred that an application fork/setsid if it doesn't want
to keep its controlling TTY, but it could be that a debugger is trying to
steal it instead -- so it would hook in, drop the controlling TTY, then do
some magic to set things up again. In this case, TIOCNOTTY is quite handy
and still respected by at least OpenBSD, NetBSD, and Linux as far as I can
tell.

I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as
long as it doesn't impose a major burden.

Reviewed by:	bcr (manpages), kib
Differential Revision:	https://reviews.freebsd.org/D22572
2019-11-30 20:10:50 +00:00
Mateusz Guzik
a02cab334c devfs: introduce a per-dev lock to protect ->si_devsw
This allows bumping threadcount without taking the global devmtx lock.

In particular this eliminates contention on said lock while using bhyve
with multiple vms.

Reviewed by:	kib
Tested by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22548
2019-11-30 16:46:19 +00:00
Mateusz Guzik
0f4b850e85 tmpfs: add fast path to tmpfs_access for common case lookup
VEXEC consists of vast majority of all calls and almost all targets have
at least 0111.
2019-11-30 16:41:47 +00:00
Konstantin Belousov
9698d99230 In nfs_lock(), recheck vp->v_data after lock before accessing it.
We might race with reclaim, and then this is no longer a nfs vnode, in
which case we do not need to handle deferred vnode_pager_setsize()
either.

Reported by:	rk@ronald.org
PR:	 242184
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2019-11-29 13:55:56 +00:00
Rick Macklem
e1cda5eea6 Fix two races while handling nfsuserd daemon start/stop.
A crash was reported where the nr_client field was NULL during an upcall
to the nfsuserd daemon. Since nr_client == NULL only occurs when the
nfsuserd daemon is being shut down, it appeared to be caused by a race
between doing an upcall and the daemon shutting down.
By inspection two races were identified:
1 - The nfsrv_nfsuserd variable is used to indicate whether or not the
    daemon is running. However it did not handle the intermediate phase
    where the daemon is starting or stopping.

    This was fixed by making nfsrv_nfsuserd tri-state and having the
    functions that are called during start/stop to obey the intermediate
    state.

2 - nfsrv_nfsuserd was checked to see that the daemon was running at
    the beginning of an upcall, but nothing prevented the daemon from
    being shut down while an upcall was still in progress.
    This race probably caused the crash.

    The patch fixes this by adding a count of upcalls in progress and
    having the shut down function delay until this count goes to zero
    before getting rid of nr_client and related data used by an upcall.

Tested by:	avg (Panzura QA)
Reported by:	avg
Reviewed by:	avg
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D22377
2019-11-28 23:34:23 +00:00
Konstantin Belousov
dbe257d253 tmpfs: resolve deadlock between rename and unmount.
Top-level kern_renameat() increases the writecount on the mount point,
which, together with tmpfs unmount suspending the mount, already
ensures that unmount cannot proceed while rename unlocks and relocks
all operated vnodes.

Remove vfs_busy() call from tmpfs_rename() which was done while
holding a vnode lock, creating the deadlock.  The only intent of the
busy operation seems to be the prevention of unmount, which is already
ensured.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-11-24 19:06:38 +00:00
Rick Macklem
14eff785e8 Fix the pNFS server's reporting of SpaceUsed (va_bytes).
The pNFS server currently reports SpaceUsed (va_bytes) for the metadata
file. This in not correct, since the metadata file is always empty and,
as such, va_bytes is just the allocation for the empty file.
This patch adds va_bytes to the list of attributes acquired from the
DS for a file, so that it includes the allocated data size and is updated
when the file is written.
For files created on a pNFS server before this patch is applied, the
va_bytes value is estimated by rounding va_size up to a multiple of
BLKDEV_IOSIZE. Once the file is written after this patch has been
applied to the metadata server, the va_bytes returned for the file
will be correct.

This patch only affects a pNFS metadata server.

Found during testing of the NFSv4.2 pNFS server for the Allocate operation.
(Not yet in head/current.)

MFC after:	2 weeks
2019-11-22 00:22:55 +00:00
Mateusz Guzik
1fccb43c39 vfs: change si_usecount management to count used vnodes
Currently si_usecount is effectively a sum of usecounts from all associated
vnodes. This is maintained by special-casing for VCHR every time usecount is
modified. Apart from complicating the code a little bit, it has a scalability
impact since it forces a read from a cacheline shared with said count.

There are no consumers of the feature in the ports tree. In head there are only
2: revoke and devfs_close. Both can get away with a weaker requirement than the
exact usecount, namely just the count of active vnodes. Changing the meaning to
the latter means we only need to modify it on 0<->1 transitions, avoiding the
check plenty of times (and entirely in something like vrefact).

Reviewed by:	kib, jeff
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22202
2019-11-20 12:05:59 +00:00
Jeff Roberson
639676877b Simplify anonymous memory handling with an OBJ_ANON flag. This eliminates
reudundant complicated checks and additional locking required only for
anonymous memory.  Introduce vm_object_allocate_anon() to create these
objects.  DEFAULT and SWAP objects now have the correct settings for
non-anonymous consumers and so individual consumers need not modify the
default flags to create super-pages and avoid ONEMAPPING/NOSPLIT.

Reviewed by:	alc, dougm, kib, markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22119
2019-11-19 23:19:43 +00:00
Jeff Roberson
67d0e29304 Replace OBJ_MIGHTBEDIRTY with a system using atomics. Remove the TMPFS_DIRTY
flag and use the same system.

This enables further fault locking improvements by allowing more faults to
proceed with a shared lock.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22116
2019-10-29 21:06:34 +00:00
Mateusz Guzik
2e310f6f72 pseudofs: hashed vncache
Vast majority of uses the cache are just checking if there is an entry
present on process exit (and evicting it if so). Both checking and
eviction process are very expensive and put the lock protecting it high
up on the profile during poudriere -j 104.

Convert the linked list into a hash. This allows to almost always avoid
taking the lock in the first place (and consequently almost removes it
from the profile). Note only one lock is preserved as a split did not
meaningfully impact contention.

Should the cache be used for something it will still run into contention
issues. The code needs a rewrite, but should someone want to tidy it up
further the following can be done:

1) per-chain locks (or at least an array)
2) hashing by something else than just pid

Sponsored by:	The FreeBSD Foundation
2019-10-22 22:52:53 +00:00
Konstantin Belousov
c6ba06d86c Fix interface between nfsclient and vnode pager.
Make the nfsclient always call vnode_pager_setsize() with the vnode
exclusively locked.  This ensures that page fault always can find the
backing page if the object size check succeeded.  Set VV_VMSIZEVNLOCK
flag on NFS nodes.

The main offender breaking the interface in nfsclient is
nfs_loadattrcache(), which is used whenever server responded with
updated attributes, which can happen on non-changing operations as
well.  Also, iod threads only have buffers locked (and even that is
LK_KERNPROC), but they still may call nfs_loadattrcache() on RPC
response.

Instead of immediately calling vnode_pager_setsize() if server
response indicated changed file size, but the vnode is not exclusively
locked, set a new node flag NVNSETSZSKIP.  When the vnode exclusively
locked, or when we can temporary upgrade the lock to exclusive, call
vnode_pager_setsize(), by providing the nfsclient VOP_LOCK() implementation.

Tested by:	pho
Discussed with:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D21883
2019-10-22 16:17:38 +00:00
Jeff Roberson
0012f373e4 (4/6) Protect page valid with the busy lock.
Atomics are used for page busy and valid state when the shared busy is
held.  The details of the locking protocol and valid and dirty
synchronization are in the updated vm_page.h comments.

Reviewed by:    kib, markj
Tested by:      pho
Sponsored by:   Netflix, Intel
Differential Revision:        https://reviews.freebsd.org/D21594
2019-10-15 03:45:41 +00:00
Jeff Roberson
63e9755548 (1/6) Replace busy checks with acquires where it is trival to do so.
This is the first in a series of patches that promotes the page busy field
to a first class lock that no longer requires the object lock for
consistency.

Reviewed by:	kib, markj
Tested by:	pho
Sponsored by:	Netflix, Intel
Differential Revision:	https://reviews.freebsd.org/D21548
2019-10-15 03:35:11 +00:00
Mateusz Guzik
9c04e4c01e tmpfs: use MNTK_NOMSYNC
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22009
2019-10-13 15:42:41 +00:00
Mateusz Guzik
48c426f226 pseudofs: use MNTK_NOMSYNC
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22009
2019-10-13 15:42:25 +00:00
Mateusz Guzik
be4cd6912f nullfs: use MNTK_NOMSYNC
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22009
2019-10-13 15:42:04 +00:00
Mateusz Guzik
4c9ba39aea devfs: use MNTK_NOMSYNC
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22009
2019-10-13 15:41:47 +00:00
Konstantin Belousov
e3fdd051f9 devfs_vptocnp(): correct the component name when node is not at top.
Node' cdp.si_name is the full path as provided by make_dev(9), it
should not be returned by VOP_VPTOCNP() when only the last component
is requested.  Use the dirent entry instead.

With this note, handling of VDIR and VCHR nodes only differs in
handling of root vnode, which simplifies and unifies the logic.

Reported by:	Li, Zhichao1 <Zhichao_Li1@Dell.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-11 18:41:24 +00:00
Konstantin Belousov
53fcc6c960 Plug the rest of undef behavior places that were missed in r337456.
There are three more places in msdosfs_fat.c which might shift one
into the sign bit.  While there, fix formatting of KASSERTs.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-11 18:37:02 +00:00
Doug Moore
2288078c5e Define macro VM_MAP_ENTRY_FOREACH for enumerating the entries in a vm_map.
In case the implementation ever changes from using a chain of next pointers,
then changing the macro definition will be necessary, but changing all the
files that iterate over vm_map entries will not.

Drop a counter in vm_object.c that would have an effect only if the
vm_map entry count was wrong.

Discussed with: alc
Reviewed by: markj
Tested by: pho (earlier version)
Differential Revision:	https://reviews.freebsd.org/D21882
2019-10-08 07:14:21 +00:00
Mateusz Guzik
d511f93e45 nfsclient: add root vnode caching
See r353150.

Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D21646
2019-10-06 22:17:29 +00:00
Mateusz Guzik
7682d0be2b tmpfs: add root vnode caching
See r353150.

Sponsored by:   The FreeBSD Foundation
Differential Revision:  https://reviews.freebsd.org/D21646
2019-10-06 22:17:11 +00:00
Mateusz Guzik
559ac49d41 devfs: add root vnode caching
See r353150.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21646
2019-10-06 22:16:55 +00:00
Mateusz Guzik
dfa8dae493 devfs: plug redundant bwillwrite avoidance
vn_write already checks for vnode type to see if bwillwrite should be called.

This effectively reverts r244643.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21905
2019-10-05 17:44:33 +00:00
Konstantin Belousov
c5dac63c15 tmpfs_readdir(): unlock the locked node.
During readdir() we guarantee that the tn_dir.tn_parent does not go
away, but it might be replaced by a parallel rename.  Read tn_parent
only once, then use the cached value.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-03 19:55:05 +00:00
Konstantin Belousov
f7e69a6fa0 tmpfs_rename: style.
Reformat multi-line comments to follow style.
Also fix some typos.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-03 19:51:56 +00:00
Konstantin Belousov
d60ac9d561 Remove unnecessary vm/vm_page.h and vm/vm_pager.h includes from
tmpfs/tmpfs_vnodes.c.

Submitted by:	ota@j.email.ne.jp
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21881
2019-10-03 08:25:09 +00:00
Rick Macklem
ee7201a725 Replace all mtx_assert() calls for n_mtx and ncl_iod_mutex with macros.
To be consistent with replacing the mtx_lock()/mtx_unlock() calls on
the NFS node mutex (n_mtx) and ncl_iod_mutex, this patch replaces
all mtx_assert() calls on these mutexes with macros as well.
This will simplify changing these locks to sx locks in a future commit.
However, this change may be delayed indefinitely, since it appears there
is a deadlock when vnode_pager_setsize() is called to shrink the size
and the NFS node lock is held.
There is no semantic change as a result of this commit.

Suggested by:	kib
MFC after:	1 week
2019-09-26 02:54:45 +00:00
Rick Macklem
b662b41e62 Replace all mtx_lock()/mtx_unlock() on the iod lock with macros.
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called and the iod lock is held when the NFS node lock
is acquired, the iod mutex will need to be changed to an sx lock as well.
To simply the future commit that changes both the NFS node lock and iod lock
to sx locks, this commit replaces all mtx_lock()/mtx_unlock() calls on the
iod lock with macros.
There is no semantic change as a result of this commit.

I don't know when the future commit will happen and be MFC'd, so I have
set the MFC on this commit to one week so that it can be MFC'd at the same
time.

Suggested by:	kib
MFC after:	1 week
2019-09-24 23:38:10 +00:00
Rick Macklem
5d85e12f44 Replace all mtx_lock()/mtx_unlock() on n_mtx with the macros.
For a long time, some places in the NFS code have locked/unlocked the
NFS node lock with the macros NFSLOCKNODE()/NFSUNLOCKNODE() whereas
others have simply used mtx_lock()/mtx_unlock().
Since the NFS node mutex needs to change to an sx lock so it can be held when
vnode_pager_setsize() is called, replace all occurrences of mtx_lock/mtx_unlock
with the macros to simply making the change to an sx lock in future commit.
There is no semantic change as a result of this commit.

I am not sure if the change to an sx lock will be MFC'd soon, so I put
an MFC of 1 week on this commit so that it could be MFC'd with that commit.

Suggested by:	kib
MFC after:	1 week
2019-09-24 01:58:54 +00:00
Kyle Evans
5fdac75222 msdosfs: do not deget unlinked denodes
When a file is unlinked, the denode is not reclaimed until the last
reference is dropped, but the directory entry is immediately up for reuse.
This is a problem later when createde goes to grab a denode for the newly
created entry -- we search the hash and find a dead denode, then return that
without even bumping the reference count and the data later gets truncated
when the the last reference to the unlinked file is dropped.

This manifested itself as a broken in-place strip(1) on msdosfs. elfcopy
will do a sequence incredibly roughly like this:

open("/mnt/foo", ...) => fd 3
mmap()
unlink("/mnt/foo")
open("/mnt/foo", ...) => fd 4
write(4, ...)
close(4)
close(3)

and the resulting file would be truncated, but the write succeeded, as long
as a reference to the unlinked file had not been closed.

Some archaeology indicates that this bug has likely existed since msdosfs
was converted to use vfs_hash instead of a home rolled hash implementation
in r143570. Prior to that point, the hashget implementation would do a
refcnt check while searching and explicitly only return a denode with
de_refcnt != 0. vfs_hash did not yet have the callback that it does today,
so this slipped away and did not come back when it later grew that
functionality.

The comment indicating that we want to skip these denodes has been updated
to reflect where this is actually done. My repo-diving session seems to
indicate that the refcnt check was likely never actually below the comment,
to be pedantic, but instead a detail wrapped up in the hashget
implementation since the beginning of its inclusion into FreeBSD.

This bug was the cause behind the issue addressed in r352557.

Reported by:	jhibbits
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21731
2019-09-20 20:47:10 +00:00
Konstantin Belousov
6fd583583b Further refine r352393, only call vnode_pager_setsize() outside the
node lock when shrinking.

This is similar to r252528, applied to the above commit.

Apparently there is a race which makes necessary at least to keep the
n_size and pager size consistent when extending.  Current suspect is
that iod threads perform vnode_pager_setsize() without taking the
vnode lock, which corrupts the file content.

Reported and tested by:	Masachika ISHIZUKA <ish@amail.plala.or.jp>
Discussed with:	rmacklem (related issues)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-09-17 18:41:39 +00:00
Mateusz Guzik
4cace859c2 vfs: convert struct mount counters to per-cpu
There are 3 counters modified all the time in this structure - one for
keeping the structure alive, one for preventing unmount and one for
tracking active writers. Exact values of these counters are very rarely
needed, which makes them a prime candidate for conversion to a per-cpu
scheme, resulting in much better performance.

Sample benchmark performing fstatfs (modifying 2 out of 3 counters) on
a 104-way 2 socket Skylake system:
before:   852393 ops/s
after:  76682077 ops/s

Reviewed by:	kib, jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21637
2019-09-16 21:37:47 +00:00
Alan Somers
320c848ff6 Fix an off-by-one error from r351961
That revision addressed a Coverity CID that could lead to a buffer overflow,
but it had an off-by-one error in the buffer size check.

Reported by:	Coverity
Coverity CID:	1405530
MFC after:	3 days
MFC-With:	351961
Sponsored by:	The FreeBSD Foundation
2019-09-16 16:41:01 +00:00
Alan Somers
42767f76af fusefs: fix some minor issues with fuse_vnode_setparent
* When unparenting a vnode, actually clear the flag. AFAIK this is basically
  a no-op because we only unparent a vnode when reclaiming it or when
  unlinking.

* There's no need to call fuse_vnode_setparent during reclaim, because we're
  about to free the vnode data anyway.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21630
2019-09-16 14:51:49 +00:00
Konstantin Belousov
1246ee664b nfscl_loadattrcache: fix rest of the cases to not call
vnode_pager_setsize() under the node mutex.

r248567 moved some calls of vnode_pager_setsize() after the node lock
is unlocked, do the rest now.

Reported and tested by:	peterj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-09-16 13:26:27 +00:00
Edward Tomasz Napierala
cf38985293 Make pseudofs(9) create directory entries in order, instead
of the reverse.

This fixes Linux sysctl(8) binary - it assumes the first two
directory entries are always "." and "..". There might be other
Linux apps affected by this.

NB it might be a good idea to rewrite it using queue(3).

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21550
2019-09-14 19:16:07 +00:00
Conrad Meyer
aaa3852435 buf: Add B_INVALONERR flag to discard data
Setting the B_INVALONERR flag before a synchronous write causes the buf
cache to forcibly invalidate contents if the write fails (BIO_ERROR).

This is intended to be used to allow layers above the buffer cache to make
more informed decisions about when discarding dirty buffers without
successful write is acceptable.

As a proof of concept, use in msdosfs to handle failures to mark the on-disk
'dirty' bit during rw mount or ro->rw update.

Extending this to other filesystems is left as future work.

PR:		210316
Reviewed by:	kib (with objections)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D21539
2019-09-11 21:24:14 +00:00
Alan Somers
6c0c362075 fusefs: Fix iosize for FUSE_WRITE in 7.8 compat mode
When communicating with a FUSE server that implements version 7.8 (or older)
of the FUSE protocol, the FUSE_WRITE request structure is 16 bytes shorter
than normal. The protocol version check wasn't applied universally, leading
to an extra 16 bytes being sent to such servers. The extra bytes were
allocated and bzero()d, so there was no information disclosure.

Reviewed by:	emaste
MFC after:	3 days
MFC-With:	r350665
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21557
2019-09-11 19:29:40 +00:00
Mark Johnston
fee2a2fa39 Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
Ed Maste
4f0372f8cb msdosfsmount.h: fix ifdef comment 2019-09-09 18:35:17 +00:00
Alan Somers
16f8783452 Coverity fixes in fusefs(5)
CID 1404532 fixes a signed vs unsigned comparison error in fuse_vnop_bmap.
It could potentially have resulted in VOP_BMAP reporting too many
consecutive blocks.

CID 1404364 is much worse. It was an array access by an untrusted,
user-provided variable. It could potentially have resulted in a malicious
file system crashing the kernel or worse.

Reported by:	Coverity
Reviewed by:	emaste
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21466
2019-09-06 19:40:11 +00:00
Conrad Meyer
454409a372 msdosfs: Remove redundant brelse() after r294954
Same automation.

No functional change.
2019-09-06 08:08:10 +00:00
Conrad Meyer
dadc0f97d0 cd9660: Remove redundant brelse() after r294954
Same automation.

No functional change.
2019-09-06 08:07:36 +00:00
Conrad Meyer
fe8b34563d ext2fs: Remove redundant brelse() after r294954
Coccinelle:

@ rule1 @
 identifier __error;
@@
 ...
 int __error;
 ...

@ rule2 depends on rule1 @
 identifier rule1.__error;
 identifier __bp;
@@

 __error =
(
 bread
|
 bread_gb
|
 breadn
|
 breadn_flags
)
 (..., &__bp);
 if (
(
 __error
|
 __error != 0
)
 ) {
 ...
- brelse(__bp);
 ...
 }

No functional change.
2019-09-06 08:07:12 +00:00
Rick Macklem
4ce21f37fd Delete the unused "nd" argument for nfsrv_proxyds().
The "nd" argument for nfsrv_proxyds() is no longer used by the function.
This patch deletes it. This allows a subsequent patch to delete the "nd"
argument from nfsvno_getattr(), since it's only use of "nd" was to pass it
to nfsrv_proxyds().
Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion
over why it might need "nd".

This patch is trivial and does not have any semantic effect.
2019-09-05 22:25:19 +00:00
Conrad Meyer
a6935d085c Remove long-dead BUF_ASSERT_{,UN}HELD assertions
These were fully neutered in r177676 (2008), but not removed at the time for
unclear reasons.  They're totally dead code, so go ahead and yank them now.

No functional change.
2019-09-05 21:43:33 +00:00
Conrad Meyer
f80cbeb292 msdosfs: Drop an unneeded brelse in bread error condition
After r294954, it is an invariant that bread returns non-NULL bp if and only
if the routine succeeded.  On error, it handles any buffer cleanup
internally.  So the brelse(NULL) here was just redundant.

No functional change.

Discussed with:	kib (extracted from a larger differential)
2019-09-05 21:30:52 +00:00
Rick Macklem
2e67077700 Delete the unused "nd" argument for nfsrv_checkdsattr().
The "nd" argument for nfsrv_checkdsattr() is no longer used by the function.
This patch deletes it. This allows subsequent patches to delete the "nd"
argument from nfsrv_proxyds(), since it's only use of "nd" was to pass it
to nfsrv_checkdsattr(). The same will then be true for nfsvno_getattr(),
which passes "nd" to nfsrv_proxyds().
Getting rid of the "nd" argument from nfsvno_getattr() avoids confusion
over why it might need "nd".

This patch is trivial and does not have any semantic effect.
Found by inspection while working on the NFSv4.2 server.
2019-09-04 22:37:28 +00:00
Kyle Evans
f99c5e8d28 pseudofs: make readdir work without a pid again
Specifically, the following was broken:

$ mount -t procfs procfs /proc
$ ls -l /proc

r351741 reworked readdir slightly to avoid pfs_node/pidhash LOR, but
inadvertently regressed pid == NO_PID; new pfs_lookup_proc() fails for the
obvious reasons, and later pfs_visible_proc doesn't capture the
pid == NO_PID -> return 1 aspect of pfs_visible. We can infact skip this
whole block if we're operating on a directory w/ NO_PID, as it's always
visible.

Reported by:	trasz
Reviewed by:	mjg
Differential Revision:	https://reviews.freebsd.org/D21518
2019-09-04 14:20:39 +00:00