Commit Graph

3560 Commits

Author SHA1 Message Date
Rick Macklem
0596f343f8 Don't set ND_NOMOREDATA for a failed Setattr operation (NFSv4).
The NFSv4 Setattr operation always has reply data even when it fails,
so don't set the ND_NOMOREDATA for it. This would only affect unusual
cases where Setattr fails and the RPC code wants to parse the rest of
the compound. Detected during recent development related to the pNFS server.

MFC after:	2 weeks
2017-04-21 23:01:32 +00:00
Rick Macklem
40f8ff4800 Don't create a backchannel for a DS connection.
An NFSv4.1 client connection to a Data Server (DS) should not have a
backchannel. This patch fixes the NFSv4.1/pNFS client to not do a backchannel
for this case.
Found during recent testing with the pNFS server under development.

MFC after:	2 weeks
2017-04-21 22:38:26 +00:00
Conrad Meyer
ef3c43b4e3 fuse: Implement FOPEN_KEEP_CACHE flag
Implement FUSE open flag FOPEN_KEEP_CACHE.  Without this flag, cached file
contents should be invalidated on open.  Apparently, fusefs-encfs relies
upon this behavior.

PR:		218636
Submitted by:	Ben RUBSON <ben.rubson at gmail.com>
2017-04-21 22:00:22 +00:00
Rick Macklem
e96af29419 Add checks for failed operations to the NFSv4 client function nfscl_mtofh().
The nfscl_mtofh() function didn't check for failed operations and, as such,
would have returned EBADRPC for these cases, due to parsing failure.
This patch adds checks, so that it returns with ND_NOMOREDATA set.
This is needed for future use in the pNFS server and acts as a safety
belt in the meantime.

MFC after:	2 weeks
2017-04-21 21:43:00 +00:00
Rick Macklem
8c1d0d9ce5 Set default uid/gid to nobody/nogroup for NFSv4 mapping.
The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon.
However, they were 0 until the nfsuserd(8) was run. Since it is
possible to use NFSv4 without running the nfsuserd(8) daemon, set them
to nobody/nogroup initially.
Without this patch, the values would be set by the nfsuserd(8) daemon
and left changed even if the nfsuserd(8) daemon was killed. The default
values of 0 meant that setting a group to "wheel" would fail even when
done by root.
It also adds a definition of GID_NOGROUP to sys/conf.h.

Discussed on:	freebsd-current@
MFC after:	2 weeks
2017-04-21 20:08:10 +00:00
Rick Macklem
b843ada7aa Revert r317240. I didn't realize there were defined constants for
uid/gid values in sys/conf.h. I will do another commit using those.
2017-04-21 11:48:12 +00:00
Rick Macklem
1350db1780 Set default uid/gid to nobody/nogroup for NFSv4 mapping.
The default uid/gid for NFSv4 are set by the nfsuserd(8) daemon.
However, they were 0 until the nfsuserd(8) was run. Since it is
possible to use NFSv4 without running the nfsuserd(8) daemon, set them
to nobody/nogroup initially.
Without this patch, the values would be set by the nfsuserd(8) daemon
and left changed even if the nfsuserd(8) daemon was killed. Also, the default
values of 0 meant that setting a group to "wheel" would fail even when
done by root and this patch fixes this issue.

MFC after:	2 weeks
2017-04-21 01:50:41 +00:00
Rick Macklem
dedec68c32 Fix the setting of atime for Linux client NFSv4 mounts.
The FreeBSD NFSv4 server did not set the attribute bit for TimeAccess in
the reply to an Open with exclusive_create, as required by the RFCs.
(This is required since the FreeBSD NFS server stores the create_verifier
 in the va_atime attribute.)
As such, the Linux NFSv4 client did not set the TimeAccess (atime) in
the Setattr done in an RPC after the one with the Open/exclusive_create.
This patch fixes the server to set the TimeAccess bit in the reply.

I believe that storing the create_verifier in an extended attribute for
file systems that support extended attributes might be a good idea,
but I will wait for a discussion of this on the freebsd-fs@ email list
before considering committing a patch to do this.

Reported by:	jim@ks.uiuc.edu
Suggested by:	dfr
MFC after:	2 weeks
2017-04-21 00:17:47 +00:00
Gleb Smirnoff
83c9dea1ba - Remove 'struct vmmeter' from 'struct pcpu', leaving only global vmmeter
in place.  To do per-cpu stats, convert all fields that previously were
  maintained in the vmmeters that sit in pcpus to counter(9).
- Since some vmmeter stats may be touched at very early stages of boot,
  before we have set up UMA and we can do counter_u64_alloc(), provide an
  early counter mechanism:
  o Leave one spare uint64_t in struct pcpu, named pc_early_dummy_counter.
  o Point counter(9) fields of vmmeter to pcpu[0].pc_early_dummy_counter,
    so that at early stages of boot, before counters are allocated we already
    point to a counter that can be safely written to.
  o For sparc64 that required a whole dummy pcpu[MAXCPU] array.

Further related changes:
- Don't include vmmeter.h into pcpu.h.
- vm.stats.vm.v_swappgsout and vm.stats.vm.v_swappgsin changed to 64-bit,
  to match kernel representation.
- struct vmmeter hidden under _KERNEL, and only vmstat(1) is an exclusion.

This is based on benno@'s 4-year old patch:
https://lists.freebsd.org/pipermail/freebsd-arch/2013-July/014471.html

Reviewed by:	kib, gallatin, marius, lidl
Differential Revision:	https://reviews.freebsd.org/D10156
2017-04-17 17:34:47 +00:00
Gleb Smirnoff
ca148cda3b Two more files missed in r317055: these files need sys/vmmeter.h, but now
they got it implicitly included via sys/pcpu.h.
2017-04-17 17:20:48 +00:00
Gleb Smirnoff
9ed01c32e0 All these files need sys/vmmeter.h, but now they got it implicitly
included via sys/pcpu.h.
2017-04-17 17:07:00 +00:00
Rick Macklem
6d4377c1ae Remove unused "cred" argument to ncl_flush().
The "cred" argument of ncl_flush() is unused and it was confusing to have
the code passing in NULL for this argument in some cases. This patch deletes
this argument.
There is no semantic change because of this patch.

MFC after:	2 weeks
2017-04-14 13:25:45 +00:00
Rick Macklem
037a2012e9 Add an NFSv4.1 mount option for "use one openowner".
Some NFSv4.1 servers such as AmazonEFS can only support a small fixed number
of open_owner4s. This patch adds a mount option called "oneopenown" that
can be used for NFSv4.1 mounts to make the client do all Opens with the
same open_owner4 string. This option can only be used with NFSv4.1 and
may not work correctly when Delegations are is use.

Reported by:	cperciva
Tested by:	cperciva
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D8988
2017-04-13 21:54:19 +00:00
Rick Macklem
60287c0c03 Add call to svcpool_close() for the NFSv4 callback pool (svcpool_nfscbd).
A function called svcpool_close() was added to the server side krpc by
r313735, so that a pool could be closed without destroying the data structures.
This little patch adds a call to it for the callback pool (svcpool_nfscbd),
so that the nfscbd daemon can be killed/restarted and continue to work
correctly.

MFC after:	2 weeks
2017-04-13 20:16:29 +00:00
Rick Macklem
0649fcaea5 Fix the NFS client for "text file modified, process killed" mmap'd case.
When an mmap'd text file is written and then executed immediately
afterwards, it was possible that the modify time would change after the
text file was executing, resulting in the process executing the file
being killed. This was usually only observed when the file system's
times were set to higher resolution, but could have occurred for any
time resolution.
This was reported on a recent email list discussion.
This patch adds a VOP_SET_TEXT() to the NFS client which flushed all
dirty pages to the NFS server and then makes sure that n_mtime is up
to date to avoid this from occurring.
Thanks go to kib@ and pho@ for their help with developing this patch.

Tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
2017-04-12 21:37:12 +00:00
Rick Macklem
7d9b62e1a1 Don't throw away Open state when a NFSv4.1 client recovery fails.
If the ExchangeID/CreateSession operations done by an NFSv4.1 client
after the server crashes/reboots fails, it is possible that some process/thread
is waiting for an open_owner lock. If the client state is free'd, this
can cause a crash.
This would not normally happen, but has been observed on a mount of the
AmazonEFS service.

Reported by:	cperciva
Tested by:	cperciva
PR:		216086
MFC after:	2 weeks
2017-04-11 22:47:02 +00:00
Rick Macklem
9e507a6a00 During a server crash recovery, fix the NFSv4.1 client for a NFSERR_BADSESSION
during recovery.

If the NFSv4.1 client gets a NFSv4.1 NFSERR_BADSESSION reply to an Open/Lock
operation while recovering from the server crash/reboot, allow the opens
to be retained for a subsequent recovery attempt. Since NFSv4.1 servers
should only reply NFSERR_BADSESSION after a crash/reboot that has lost
state, this case should almost never happen.
However, for the AmazonEFS file service, this has been observed when
the client does a fresh TCP connection for RPCs.

Reported by:	cperciva
Tested by:	cperciva
PR:		216088
MFC after:	2 weeks
2017-04-11 20:28:15 +00:00
Konstantin Belousov
6627d919de Remove debugging printf.
Instead, issue a diagnostic and return appropriate error if
ncl_flush() was unable to clean buffer queue after the specified
number or retries.

Reviewed by:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-04-11 08:29:12 +00:00
Rick Macklem
fb55679151 Set initial values for nfsstatfs in the NFSv4 client.
The AmazonEFS NFSv4.1 server does not support the FILES_FREE and FILES_TOTAL
attributes. As such, an NFSv4.1 mount to the server would return garbage
for these values. This patch initializes the fields of the nfsstatfs structure,
so that "df" and friends will at least return consistent bogus values.
This patch should have effect when mounting other NFSv4.1 servers.

Reported by:	cperciva
MFC after:	2 weeks
2017-04-10 21:49:35 +00:00
Rick Macklem
8efffe80ba Avoid starvation of the server crash recovery thread for the NFSv4 client.
This patch gives a requestor of the exclusive lock on the client state
in the NFSv4 client priority over shared lock requestors. This avoids
the server crash recovery thread being starved out by other threads doing
RPCs.

Tested by:	cperciva
PR:		216087
MFC after:	2 weeks
2017-04-10 01:28:01 +00:00
Rick Macklem
4e7dcfab2d Fix the NFSv4 client hndling of a stale write verifier in the Commit operation.
When the NFSv4 client Commit operation encountered a stale write verifier,
it erroneously mapped that to EIO. This could have caused recently written
data to be lost when a server crashes/reboots between an UNSTABLE write
and the subsequent commit. This patch fixes this.
The bug was only for the NFSv4 client and did not affect NFSv3.

Tested by:	cperciva
PR:		215887
MFC after:	2 weeks
2017-04-09 21:50:21 +00:00
Rick Macklem
2242bc81f2 Fix the NFSv4.1 client for NFSERR_BADSESSION recovery via ReclaimComplete.
For the ReclaimComplete operation, the RPC layer should not loop on
NFSERR_BADSESSION. If it does, the recovery thread (nfscl) can get stuck
looping and will not do a recovery.
This patch fixes it so it does not loop. This bug only affects NFSv4.1 and
only when a server reboots.

Tested by:	cperciva
PR:		215886
MFC after:	2 weeks
2017-04-09 21:06:21 +00:00
Rick Macklem
83a37350bf Fix parsing failure for NFSv4 Setattr operation for failed case.
If an operation that preceeds a Setattr in an NFSv4 compound fails,
there is no bitmap of attributes to parse. Without this patch, the
parsing would fail and return EBADRPC instead of the correct failure
error. This could break recovery from a server crash/reboot.

Tested by:	cperciva
PR:		215883
MFC after:	2 weeks
2017-04-09 12:32:22 +00:00
Conrad Meyer
8844e55d4a smbfs: Fix an indentation level
Based on the change in r242386, it seems clear that scred was intended to
be released in all paths at exit.

No functional change.  This line's indent was just the result of a bad copy
paste from the previous free() in an early exit path.

Reported by:	PVS-Studio
Sponsored by:	Dell EMC Isilon
2017-04-06 17:31:58 +00:00
Konstantin Belousov
a046da7e90 Remove spl*() calls from the nfsclient code. Style adjustments in the
related lines in ncl_writebp().

Reviewed by:	rmacklem
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-04-06 12:44:34 +00:00
Konstantin Belousov
ea52525928 Make nfs pageout coherent with the dirty state of the buffers.
Write out the dirty pages using VOP_WRITE() instead of directly
calling ncl_writerpc(). The state of the buffers now reflects the
write, fixing some hard to diagnose consistency and write order
issues.  The change also allowed to remove remapping of paged out
pages into kernel space and related allocation of the phys buffer.

Reviewed by:	markj, rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D10241
2017-04-05 17:26:20 +00:00
Konstantin Belousov
b73cd4d344 Handle nfs IO_ASYNC write requests asynchronously.
Reviewed by:	markj, rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
X-Differential revision:	https://reviews.freebsd.org/D10241
2017-04-05 17:20:31 +00:00
Konstantin Belousov
48fe926362 Handle possible vnode reclamation after ncl_vinvalbuf() call.
ncl_vinvalbuf() might need to upgrade vnode lock, allowing the vnode
to be reclaimed by other thread.  Handle the situation, indicated by
the returned error zero and VI_DOOMED iflag set, converting it into
EBADF.  Handle all calls, even where the vnode is exclusively locked
right now.

Reviewed by:	markj, rmacklem
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
X-Differential revision:	https://reviews.freebsd.org/D10241
2017-04-05 17:11:39 +00:00
Pedro F. Giffuni
ac506a8f5a ext2fs: Initial support for Extended Attributes.
Currently read-only.

Submitted by:	Fedor Uporov
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D10151
2017-04-01 01:00:36 +00:00
Eric Badger
99b14d9f1b remove procfs ctl interface
This interface has no in-tree consumers and has been more or less
non-functional for several releases.

Remove manpage note that the procfs special file 'mem' is grouped to
kmem. This hasn't been true since r81107.

Remove procfs' README file. It is an out of date duplication of the manpage
(quoth the README: "since the bsd kernel is single-processor...").

Reviewed by:	vangyzen, bcr (manpage)
Approved by:	des (procfs maintainer), vangyzen (mentor)
Differential Revision:	https://reviews.freebsd.org/D9802
2017-03-05 03:05:24 +00:00
Warner Losh
fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Konstantin Belousov
aca4bb9112 Do not leak mount references for dying threads.
Thread might create a condition for delayed SU cleanup, which creates
a reference to the mount point in td_su, but exit without returning
through userret(), e.g. when terminating due to single-threading or
process exit.  In this case, td_su reference is not dropped and mount
point cannot be freed.

Handle the situation by clearing td_su also in the thread destructor
and in exit1().  softdep_ast_cleanup() has to receive the thread as
argument, since e.g. thread destructor is executed in different
context.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-02-25 10:38:18 +00:00
Edward Tomasz Napierala
1a2dbd1aa3 Simplify devfs_fsync() by removing it. This might also be a minor
optimization, as vn_isdisk() needs to lock a global mutex.

Reviewed by:	imp
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9628
2017-02-20 16:18:33 +00:00
Konstantin Belousov
ecc6c515ab Apply noexec mount option for mmap(PROT_EXEC).
Right now the noexec mount option disallows image activators to try
execve the files on the mount point.  Also, after r127187, noexec
also limits max_prot map entries permissions for mappings of files
from such mounts, but not the actual mapping permissions.

As result, the API behaviour is inconsistent.  The files from noexec
mount can be mapped with PROT_EXEC, but if mprotect(2) drops execution
permission, it cannot be re-enabled later.  Make this consistent
logically and aligned with behaviour of other systems, by disallowing
PROT_EXEC for mmap(2).

Note that this change only ensures aligned results from mmap(2) and
mprotect(2), it does not prevent actual code execution from files
coming from noexec mount.  Such files can always be read into
anonymous executable memory and executed from there.

Reported by:	shamaz.mazum@gmail.com
PR:	217062
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-19 20:51:04 +00:00
Pedro F. Giffuni
b624457280 ext2fs: Remove unused assignment.
Value is re-assigned a few lines later without being read.

Found by:	Clang static analyzer
MFC after:	1 week
2017-02-17 20:56:43 +00:00
Eric van Gyzen
8144690af4 Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.

Suggested by:	glebius, emaste
Reviewed by:	gnn
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9625
2017-02-16 20:47:41 +00:00
Konstantin Belousov
bd1623def1 Do not access memory past the buffer end.
Do not accept and silently truncate too long hostname.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-16 06:36:16 +00:00
Konstantin Belousov
599009e261 Do not allocate char[MNAMELEN] on stack in nfsclient.
Right now this is not critical, but will be after planned increase of
MNAMELEN from 88 to 1k.

Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-16 06:34:20 +00:00
Konstantin Belousov
cf53034fc7 Minor style fixes.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-02-16 06:31:36 +00:00
Ed Maste
1dc349ab95 prefix UFS symbols with UFS_ to reduce namespace pollution
Specifically:
  ROOTINO -> UFS_ROOTINO
  WINO -> UFS_WINO
  NXADDR -> UFS_NXADDR
  NDADDR -> UFS_NDADDR
  NIADDR -> UFS_NIADDR
  MAXSYMLINKLEN_UFS[12] -> UFS[12]_MAXSYMLINKLEN (for consistency)

Also prefix ext2's and nandfs's NDADDR and NIADDR with EXT2_ and NANDFS_

Reviewed by:	kib, mckusick
Obtained from:	NetBSD
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D9536
2017-02-15 19:50:26 +00:00
Edward Tomasz Napierala
a3b4ab9116 Change the "devfs_fsync: vop_stdfsync failed" from panic to a printf.
It's not a proper fix, but should be better than what we have now.
Since it got broken some six months ago it results in an incredibly
annoying and trivially reproducible panic every time eg an USB disk
gets disconnected.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-02-15 16:52:21 +00:00
Andriy Gapon
90f90687b3 add svcpool_close to handle killed nfsd threads
This patch adds a new function to the server krpc called
svcpool_close().  It is similar to svcpool_destroy(), but does not free
the data structures, so that the pool can be used again.

This function is then used instead of svcpool_destroy(),
svcpool_create() when the nfsd threads are killed.

PR:		204340
Reported by:	Panzura
Approved by:	rmacklem
Obtained from:	rmacklem
MFC after:	1 week
2017-02-14 17:49:08 +00:00
Baptiste Daroussin
b4b4b5304b Revert crap accidentally committed 2017-01-28 16:31:23 +00:00
Baptiste Daroussin
814aaaa7da Revert r312923 a better approach will be taken later 2017-01-28 16:30:14 +00:00
Konstantin Belousov
538ee0d74e Remove mistakenly merged field.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 20:03:26 +00:00
Konstantin Belousov
00ac6a98d8 Add mount option for tmpfs(5) to not use namecache.
The option "nonc" disables using of namecache for the created mount,
by default namecache is used.  The rationale for the option is that
namecache duplicates the information which is already kept in memory
by tmpfs.  Since it believed that namecache scales better than tmpfs,
or will scale better, do not enable the option by default.  On the
other hand, smaller machines may benefit from lesser namecache
pressure.

Discussed with:	mjg
Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:46:49 +00:00
Konstantin Belousov
08c053e71c Implement VOP_VPTOCNP() for tmpfs.
For directories, node->tn_spec.tn_dir.tn_parent pointer to the parent
is used.  For non-directories, the implementation is naive, all
directory nodes are scanned to find a dirent linking the specified
node.  This can be significantly improved by maintaining tn_parent for
all nodes, later.

Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:29:13 +00:00
Konstantin Belousov
b4ba3b6459 VNON nodes cannot exist.
Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:25:42 +00:00
Konstantin Belousov
64c250439f Refcount tmpfs nodes and mount structures.
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer.  In this case, tmpfs_alloc_vp() accesses freed memory.

Introduce the reference count on the nodes.  The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node, add one transient reference.  Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list.  Both nodes and tmpfs_mounts are removed when
refcount goes to zero.

Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() finished.  The
tmpfs_alloc_vp() in these cases returns error either due to node
removal (tn_nlinks == 0) or because of insmntque1(9) error.

Tested by:	pho (as part of larger patch)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-01-19 19:15:21 +00:00
Konstantin Belousov
1c07d69bc2 Make tmpfs directory cursor available outside tmpfs_subr.c.
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-19 18:38:58 +00:00