Commit Graph

3646 Commits

Author SHA1 Message Date
Rick Macklem
c36e087097 Remove 0 filling from nfsm_uiombuflist().
nfsm_uiombuflist() zero filled the mbuf list to a multiple of 4 bytes
as required for XDR. Unfortunately that modified an mbuf list after
it was m_copym()'d and was broken. This patch removes the zero filling code.
Since nfsm_uiombuflist() is not yet used in head/current, this has no
effect on users.
The function will be used by a future commit of code that adds Flex
File Layout support.
2017-09-24 19:43:31 +00:00
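
The XDR rule mentioned in c36e087097 (opaque data is padded with zero bytes to a multiple of 4 bytes) comes down to a small piece of arithmetic. The helpers below are a minimal sketch of that arithmetic only; the names xdr_pad_len() and xdr_padded_len() are hypothetical and are not taken from the NFS sources.

```c
#include <stddef.h>

/*
 * Number of zero bytes XDR requires after `len` bytes of opaque data
 * so that the total is a multiple of 4.  Hypothetical helper name.
 */
static size_t
xdr_pad_len(size_t len)
{
	return ((4 - (len & 3)) & 3);
}

/* Length on the wire once the data is padded to a 4-byte boundary. */
static size_t
xdr_padded_len(size_t len)
{
	return (len + xdr_pad_len(len));
}
```

For example, 5 bytes of data need 3 pad bytes and occupy 8 bytes on the wire, while 8 bytes of data need no padding.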
John Baldwin
e1d15b892a Only handle _PC_MAX_CANON, _PC_MAX_INPUT, and _PC_VDISABLE for TTY devices.
Move handling of these three pathconf() variables out of vop_stdpathconf()
and into devfs_pathconf() as TTY devices can only be devfs files.  In
addition, only return settings for these three variables for devfs devices
whose device switch has the D_TTY flag set.

Discussed with:	bde, kib
Sponsored by:	Chelsio Communications
2017-09-21 23:05:32 +00:00
Rick Macklem
6b43e06029 Add a few definitions for Flex File Layout for pNFS.
These definitions will be used by a future commit.
2017-09-21 00:41:12 +00:00
Rick Macklem
0f29b8292d Make the nfsrpc_layoutget() function a static.
Make the NFSv4 pNFS client function nfsrpc_layoutget() a static, since it
is only used in sys/fs/nfsclient/nfs_clrpcops.c.
This prepares the code for future patches that add Flex File layout
support.
2017-09-19 23:28:22 +00:00
Rick Macklem
2742a21091 Add a new function called nfsm_uiombuflist(), similar to nfsm_uiombuf().
This patch adds a new function called nfsm_uiombuflist(), which is
similar to nfsm_uiombuf(), but does not use the fields in
struct nfsrv_descript. This new function will be used by the pNFS client
for writing to mirrors using Flex Files layout.
The function is not yet called anywhere.
Also, get rid of #ifndef APPLE, which is ancient cruft left over from
the Mac OSX port of the NFSv4 client.
2017-09-19 21:31:36 +00:00
Rick Macklem
b0932afacc Simplify nfsrpc_layoutreturn() args.
Simplify nfsrpc_layoutreturn() args. in preparation for the addition
of Flex File layout support, since File layout uses a 0 length field.
Flex Files does use a longer field, but that will be added in a
subsequent commit.
2017-09-19 20:45:25 +00:00
Rick Macklem
ab118d04be Simplify nfsrpc_layoutcommit() args.
Simplify nfsrpc_layoutcommit() args. in preparation for the addition
of Flex File layout support, since it also uses a 0 length field.
2017-09-19 20:18:41 +00:00
Rick Macklem
ccf038250a Fix bogus FREAD with NFSV4OPEN_ACCESSREAD. No functional change.
The code in nfscl_doflayoutio() bogusly used FREAD instead of
NFSV4OPEN_ACCESSREAD. Since both happen to be defined as "1", this
worked and the patch doesn't result in a functional change.
Found by inspection during development of Flex File Layout support.

MFC after:	2 weeks
2017-09-17 22:18:01 +00:00
Konstantin Belousov
4eeec01fee Style.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-08-28 21:04:56 +00:00
Konstantin Belousov
fbcbbe78dc Verify that the BPB media descriptor and FAT ID match.
FAT specification requires that for valid FAT, FAT cluster 0 has a
specific value derived from the BPB media descriptor.  The lowest
(little-endian) byte must be equal to bpb.bpbMedia, other bits in the
cluster number must be all 1's.  Implement the check to reduce the
chance of a randomly corrupted FAT passing the mount attempt.

Submitted by:	Siva Mahadevan <smahadevan@freebsdfoundation.org>
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12124
2017-08-28 20:52:32 +00:00
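
The rule described in fbcbbe78dc can be sketched as a small predicate. This is an illustration under stated assumptions, not the msdosfs code: the function name fat_id_matches() is hypothetical, and the caller is assumed to pass the FAT entry mask for the FAT type (0xfff for FAT12, 0xffff for FAT16, 0x0fffffff for FAT32).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Check FAT entry 0 against the BPB media descriptor: the low byte
 * must equal the media byte and every other bit inside the entry
 * mask must be 1.  Hypothetical helper, not the kernel function.
 */
static bool
fat_id_matches(uint32_t fat0, uint32_t fat_mask, uint8_t media)
{
	fat0 &= fat_mask;		/* ignore bits outside the entry */
	if ((fat0 & 0xff) != media)
		return (false);
	/* With the low byte forced to 1s, the entry must equal the mask. */
	return ((fat0 | 0xff) == (fat_mask | 0xff));
}
```

With a media descriptor of 0xf8, a FAT16 entry 0 of 0xfff8 passes, while 0x00f8 (upper bits clear) and 0xfff0 (wrong low byte) are rejected.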
Konstantin Belousov
e5cffdd34b Do not drop NFS vnode lock when performing consistency checks.
Currently several paths in the NFS client upgrade the shared vnode
lock to exclusive, which might cause temporary dropping of the lock.
This action appears to be fatal for nullfs mounts over NFS. If the
operation is performed over nullfs vnode, then bypassed down to NFS
VOP, and the lock is dropped, other thread might reclaim the upper
nullfs vnode.  Since on reclaim the nullfs vnode lock and NFS vnode
lock are split, the original lock state of the nullfs vnode is not
restored.  As a result, VFS operations receive an unlocked vnode after a
VOP call.

Stop upgrading the vnode lock when we check the consistency or flush
buffers as a result of a detected inconsistency.  Instead, allocate a new
lockmgr lock for each NFS node, which is locked exclusive instead of
the vnode lock upgrade.  In other words, other parallel
modifications of the vnode are excluded by either vnode lock conflict
or exclusivity of the new lock when the vnode lock is shared.

Also revert r316529 because now the vnode cannot be reclaimed during
ncl_vinvalbuf().

In collaboration with:	pho
Reviewed by:	rmacklem
Reported and tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D12083
2017-08-20 10:08:45 +00:00
Mark Johnston
a95435cfed Bump the maximum file name length in pseudofs filesystems to 48.
The previous limit of 24 was somewhat restrictive, and with this change
ceil(log2(sizeof(struct pfs_node))) is the same as before in both the ILP32
and LP64 models, so the malloc zone used for allocations of struct pfs_node
is the same as before.

Approved by:	des
2017-08-03 21:35:53 +00:00
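
The argument in a95435cfed is that doubling the name buffer does not change ceil(log2(sizeof(struct pfs_node))), so allocations stay in the same size class. A minimal sketch of that calculation follows, assuming power-of-two size classes as the commit message implies; ceil_log2() is a hypothetical helper and the sizes in the example are illustrative, not the real sizes of struct pfs_node.

```c
#include <stddef.h>

/*
 * Smallest k such that (1 << k) >= n, i.e. ceil(log2(n)) for n >= 1.
 * Hypothetical helper used only to illustrate the size-class argument.
 */
static unsigned
ceil_log2(size_t n)
{
	unsigned bits = 0;
	size_t p = 1;

	while (p < n) {
		p <<= 1;
		bits++;
	}
	return (bits);
}
```

For instance, illustrative sizes of 200 and 224 bytes both give 8, so both would be served from a 256-byte class, whereas 257 bytes would move to the next one.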
Dmitry Chagin
77d3337c9f Implement proper Linux /dev/fd and /proc/self/fd behavior by adding
Linux specific things to the native fdescfs file system.

Unlike FreeBSD, the Linux fdescfs is a directory containing symbolic
links to the actual files which the process has open.
A readlink(2) call on such a file returns a full path in the case of a regular file
or a string in a special format (type:[inode], anon_inode:<file-type>, etc.).
As in FreeBSD, opening a file in the Linux fdescfs directory is
equivalent to duplicating the corresponding file descriptor.

Here we have mutually exclusive requirements:
- in the case of a readlink(2) call, the fdescfs lookup() method should return a VLNK
vnode, otherwise our kern_readlink() fails with an EINVAL error;
- in all other calls, the fdescfs lookup() method should return a non-VLNK vnode.

For this, a new vnode v_flag VV_READLINK was added, which is set if fdescfs has been
mounted with the linrdlnk option, and kern_readlinkat() was modified to handle it properly.

For now, for Linux ABI compatibility, mount the fdescfs volume with the linrdlnk option:

    mount -t fdescfs -o linrdlnk null /compat/linux/dev/fd

Reviewed by:	kib@
MFC after:	1 week
Relnotes:	yes
2017-08-01 03:40:19 +00:00
Alexander Motin
e9c9826673 Improve FHA locality control for NFS read/write requests.
This change adds two new tunables, allowing serialization to be controlled for
read and write NFS requests separately.  It does not change the default
behavior since there are too many factors to consider, but gives additional
space for further experiments and tuning.

The main motivation for this change is very low write speed in case of ZFS
with sync=always or when NFS clients request synchronous operation, when
every separate request has to be written/flushed to ZIL, and requests are
processed one at a time.  Setting vfs.nfsd.fha.write=0 in that case allows
to increase ZIL throughput by several times by coalescing writes and cache
flushes.  There is a worry that doing it may increase data fragmentation
on disks, but I suppose it should not happen for pool with SLOG.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2017-07-31 15:23:19 +00:00
Rick Macklem
47cbff34fa Add kernel support for the NFS client forced dismount "umount -N" option.
When an NFS mount is hung against an unresponsive NFS server, the "umount -f"
option can be used to dismount the mount. Unfortunately, "umount -f" gets
hung as well if a "umount" without "-f" has already been done. Usually,
this is because of a vnode lock being held by the "umount" for the mounted-on
vnode.
This patch adds kernel code so that a new "-N" option can be added to "umount",
allowing it to avoid getting hung for this case.
It adds two flags. One indicates that a forced dismount is about to happen
and the other is used, along with setting mnt_data == NULL, to handshake
with the nfs_unmount() VFS call.
It includes a slight change to the interface used between the client and
common NFS modules, so I bumped __FreeBSD_version to ensure both modules are
rebuilt.

Tested by:	pho
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D11735
2017-07-29 19:52:47 +00:00
Rick Macklem
23e148e9a4 Fix possible crash for the NFSv4.1 pNFS client.
If the nfsrpc_createlayoutrpc() call in nfsrpc_getcreatelayout() fails,
the code used nfhpp when it might be set NULL. This patch checks for
the error cases (laystat != 0) and avoids using nfhpp for the failure case.
This would only affect NFSv4.1 mounts with the "pnfs" option.
Found while testing the "umount -N" patch not yet in head.

MFC after:	2 weeks
2017-07-29 02:25:49 +00:00
Rick Macklem
16f300fa4a Replace the checks for MNTK_UNMOUNTF with a macro that does the same thing.
This patch defines a macro that checks for MNTK_UNMOUNTF and replaces
explicit checks with this macro. It has no effect on semantics, but
prepares the code for a future patch where there will also be a
NFS specific flag for "forced dismount about to occur".

Suggested by:	kib
MFC after:	2 weeks
2017-07-27 20:55:31 +00:00
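
The refactoring in 16f300fa4a replaces open-coded flag tests with a single macro so that a second condition can later be added in one place. The sketch below shows the general shape only. Every name in it (struct demo_mount, DEMO_MNTK_UNMOUNTF, DEMO_FORCEDISM) and the flag value are hypothetical stand-ins, not the definitions used in the kernel.

```c
/* Hypothetical stand-in for the kernel flag; the value is illustrative. */
#define	DEMO_MNTK_UNMOUNTF	0x00000001u

/* Hypothetical stand-in for the one struct mount field that matters here. */
struct demo_mount {
	unsigned int	mnt_kern_flag;
};

/*
 * One macro wraps the flag test, so callers need not change when a
 * further "forced dismount about to occur" condition is added later.
 */
#define	DEMO_FORCEDISM(mp)						\
	(((mp)->mnt_kern_flag & DEMO_MNTK_UNMOUNTF) != 0)
```

Call sites then read `if (DEMO_FORCEDISM(mp))` instead of testing the flag bit directly.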
Konstantin Belousov
555b7bb4c8 Mark pages after EOF as clean after pageout.
Suppose that a file on NFS has a partially filled last page, and this
page is dirty.  The NFS VOP_PAGEOUT() method only marks the page clean
up to the block of the last written byte, leaving other blocks dirty.
Also, any page which erroneously exists in the vnode vm_object past EOF
is left marked as dirty.

With the introduction of the buf-cache coherent pager, each pass of
syncer over the object with such page results in creation of B_DELWRI
buffer due to VOP_WRITE() call.  This buffer is noted on next syncer
pass, which results in, e.g., a visible manifestation of shutdown never
finishing vnode sync.  Note that before the buf-cache coherency commit, a
dirty page might be left never synced to the server if partial writes
occur.

Fix this by clearing dirty bits after EOF.  Only blocks of the partial
page which are completely after EOF are marked clean, to avoid
possible user data loss.

Reported by:	mav
Reviewed by:	alc, markj
Tested by:	mav, pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D11697
2017-07-26 20:07:05 +00:00
Konstantin Belousov
cc2c26223b Move rtvals initialization out of the region protected by NFS node
lock.

Noted by:	alc
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
X-Differential revision:	https://reviews.freebsd.org/D11697
2017-07-26 20:01:31 +00:00
Dmitry Chagin
f80dc46c3d Replace unnecessary _KERNEL by double-include protection.
MFC after:	2 weeks
2017-07-25 06:59:35 +00:00
Rick Macklem
f8181b5e0e r320062 introduced a bug when doing NFSv4.1 mounts against some non-FreeBSD servers.
r320062 used nm_rsize, nm_wsize to set the maximum request/response sizes for
the NFSv4.1 session. If rsize,wsize are not specified as options, the
value of nm_rsize, nm_wsize is 0 at session creation, resulting in
values for request/response that are too small.
This patch fixes the problem. A workaround is to specify rsize=N,wsize=N
mount options explicitly, so they are set before session creation.
This bug only affects NFSv4.1 mounts against some non-FreeBSD servers.

MFC after:	1 week
2017-07-21 00:14:43 +00:00
Rick Macklem
06ea10c60b Revert r321308. I'll commit a better fix soon.
2017-07-20 23:59:47 +00:00
Rick Macklem
a9d104fd89 r320062 introduced a bug when doing NFSv4.1 mounts against some non-FreeBSD servers.
r320062 used nm_rsize, nm_wsize to set the maximum request/response sizes for
the NFSv4.1 session. If rsize,wsize are not specified as options, the
value of nm_rsize, nm_wsize is 0 at session creation, resulting in
values for request/response that are too small.
This patch fixes the problem. A workaround is to specify rsize=N,wsize=N
mount options explicitly, so they are set before session creation.
This bug only affects NFSv4.1 mounts against some non-FreeBSD servers.

MFC after:	1 week
2017-07-20 23:15:50 +00:00
Edward Tomasz Napierala
1d2fef9b9a Rename vfs.nfsd.enable_uidtostring to vfs.nfs.enable_uidtostring.
It applies to both NFS client and NFS server, and is useful for both.
This is different from vfs.nfsd.enable_stringtouid, which is specific
to server side.

Reviewed by:	rmacklem@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-07-19 09:59:32 +00:00
John Baldwin
15a88f8158 Consistently use vop_stdpathconf() for default pathconf values.
Update filesystems not currently using vop_stdpathconf() in pathconf
VOPs to use vop_stdpathconf() for any configuration variables that do
not have filesystem-specific values.  vop_stdpathconf() is used for
variables that have system-wide settings as well as providing default
values for some values based on system limits.  Filesystems can still
explicitly override individual settings.

PR:		219851
Reported by:	cem
Reviewed by:	cem, kib, ngie
MFC after:	1 month
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D11541
2017-07-11 21:55:20 +00:00
Pedro F. Giffuni
12b4678b2a Remove stale comments.
There's no real advantage in using memcpy here.

Discussed with:	bde (long ago)
2017-07-09 15:19:28 +00:00
Dmitry Chagin
d499e3f58e Style(9). Whitespace.
MFC after:	3 weeks
2017-07-09 14:18:22 +00:00
Dmitry Chagin
16e3859b47 Eliminate the bogus casts.
MFC after:	3 weeks
2017-07-09 14:15:51 +00:00
Dmitry Chagin
46d186a9b4 Don't initialize error in declaration.
MFC after:	3 weeks
2017-07-08 21:15:46 +00:00
Dmitry Chagin
18a9ea872a Eliminate the bogus cast.
MFC after:	3 weeks
2017-07-08 21:13:25 +00:00
Dmitry Chagin
11fc6c6dac Eliminate the bogus cast.
MFC after:	3 weeks
2017-07-08 21:12:00 +00:00
Dmitry Chagin
b9d3485fb4 Don't take a lock around atomic operation.
MFC after:	3 weeks
2017-07-08 21:08:22 +00:00
Dmitry Chagin
073b14b469 Remove init from declaration, collapse two int vars declarations into single.
MFC after:	3 weeks
2017-07-08 21:05:28 +00:00
Dmitry Chagin
1901d0d8d3 Remove init from declaration.
MFC after:	3 weeks
2017-07-08 21:04:09 +00:00
Dmitry Chagin
a15cf51f0a Style(9). Add blank line after {.
MFC after:	3 weeks
2017-07-08 21:02:40 +00:00
Rick Macklem
25d694a6fa Add support for AF_LOCAL socket upcalls to the nfsuserd daemon.
This patch adds support for AF_LOCAL socket upcalls to an nfsuserd daemon
that supports them. A future patch to the nfsuserd daemon will use AF_LOCAL
sockets to avoid a problem when using upcalls to 127.0.0.1 if jails are
in use.

Suggested by:	dfr
PR:		205193
2017-07-06 00:53:12 +00:00
Pedro F. Giffuni
3d851dbe07 ext2fs: be more verbose about unsupported ext2fs features.
It is useful to know exactly what features may be lacking when trying to
mount ext4 filesystems.

Submitted by:	Fedor Uporov
Differential Revision:	https://reviews.freebsd.org/D11208
2017-07-02 20:47:25 +00:00
Rick Macklem
ad6eb97601 Fix an NFSv3 client case that probably never happens.
If an NFSv3 server were to reply with weak cache consistency attributes,
but not post operation attributes, the client would use garbage attributes
from memory. This was spotted during work on the code for the NFSv4.1 client.
I have never seen evidence that this happens and it wouldn't make sense
for an NFSv3 server to do this, so this patch is basically "theoretical",
but does fix the problem if a server were to do the above.

PR:		219552
MFC after:	2 weeks
2017-06-28 21:37:08 +00:00
Conrad Meyer
bb751fbbc7 Complete support for IO_APPEND flag in fuse
This finishes what r245164 started and makes open(..., O_APPEND) work again
after r299753.

- Pass ioflags, incl. IO_APPEND, down to the direct write backend (r245164
  added it to only the bio backend).
- (r299753 changed the WRONLY backend from bio to direct.)

PR:		220185
Reported by:	Ben RUBSON <ben.rubson at gmail.com>
Reviewed by:	bapt@, rmacklem@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11348
2017-06-28 13:56:15 +00:00
Enji Cooper
6bfe453238 Fix LINT, broken by a -Wformat warning in r320329 with PFS_DELEN being
changed from %d to a long-width type.

Use uintmax_t casting and %ju to futureproof the format string against
potential changes with either the #define or the implementation-specific
definition for offsetof(..).
2017-06-27 17:01:46 +00:00
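
The technique named in 6bfe453238 (cast to uintmax_t and print with %ju) works for any unsigned integer width, which is why it survives changes to the type of the expression being printed. A minimal sketch, where format_size() is a hypothetical helper rather than code from pseudofs:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size value without depending on its exact width:
 * widen it to uintmax_t and use the matching %ju conversion.
 * Returns what snprintf() returns.  Hypothetical helper name.
 */
static int
format_size(char *buf, size_t buflen, size_t value)
{
	return (snprintf(buf, buflen, "%ju", (uintmax_t)value));
}
```

The same call compiles without a -Wformat warning whether the argument started out as an int constant or as an offsetof() expression of type size_t.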
Edward Tomasz Napierala
3c264086aa Revert part of r320359, as suggested by rmacklem@. That case is only used
for nfsuserd -manage-gids and shouldn't depend on sysctl.

MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-06-27 15:14:06 +00:00
Pedro F. Giffuni
26f36b55b6 ext2fs: Support e2di_uid_high and e2di_gid_high.
The fields exist on all versions of the filesystem and using them is a mount
option on linux. For FreeBSD, the corresponding i_uid and i_gid are always
long enough so use them by default.

Reviewed by:	Fedor Uporov
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D11354
2017-06-27 15:07:19 +00:00
Edward Tomasz Napierala
6a3450e178 Add vfs.nfsd.nfsd_enable_uidtostring, which works just like
vfs.nfsd.nfsd_enable_stringtouid, but in reverse - when set to 1,
it forces the NFSv4 server to return numeric UIDs and GIDs instead
of "user@domain" strings. This helps with clients that can't
translate returned identifiers, eg when rerooting.

The same can be achieved by just never running nfsuserd(8),
but the sysctl is useful to toggle the behaviour back and forth
without rebooting.

Reviewed by:	rmacklem (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11326
2017-06-26 13:11:21 +00:00
Rick Macklem
81b07aac10 Add support to the NFSv4.1/pNFS client for commits through the DS.
A NFSv4.1/pNFS server using File Layout can specify that Commit operations
are to be done against the DS instead of MDS. Since no extant pNFS
server did this, the code was untested and "#ifdef notyet".
The FreeBSD pNFS server I am developing does specify that Commits be done
through the DS, so the code has been enabled/tested.
This patch should only affect the case of a pNFS server that specifies
Commits through the DS.

PR:		219551
MFC after:	2 weeks
2017-06-26 00:43:04 +00:00
Dmitry Chagin
0a7c8d302a PFS_DELEN is the sum of the permanent part of the struct dirent and
fixed size for the name buffer PFS_NAMELEN.
When r318736 was committed (ino64 project) the size of the permanent part
of the struct dirent was changed, so calculate PFS_DELEN properly.
2017-06-25 15:21:51 +00:00
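
One way to express "permanent part of struct dirent plus a fixed name buffer", as described in 0a7c8d302a, is with offsetof() so that the value follows any change to the structure layout. This is a sketch under assumptions: the DEMO_ names are hypothetical, and the value 24 is the name length limit that a later commit in this log calls the previous limit.

```c
#include <dirent.h>
#include <stddef.h>

/* Hypothetical stand-in for the fixed name buffer size (assumed 24). */
#define	DEMO_NAMELEN	24

/*
 * Everything in struct dirent that precedes d_name, plus the fixed
 * name buffer.  Derived from the layout instead of a hard-coded
 * number, so it tracks changes such as a wider inode field.
 */
#define	DEMO_DELEN	(offsetof(struct dirent, d_name) + DEMO_NAMELEN)
```

Because offsetof() yields a size_t, any format string that prints this value needs a matching conversion instead of %d.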
Rick Macklem
a351e99ce6 Add two new compound RPCs to the NFSv4.1/pNFS client.
When the NFSv4.1 client is doing pNFS, it needs to get an Open and
a Layout for every file it will be doing I/O on. The current code
does two separate RPCs to get these. This patch adds two new compounds
that do both the Open and LayoutGet in the same RPC, reducing the
RPC count.
It also factors out the code that sets up and parses the LayoutGet operation
into separate functions, so that the code doesn't get duplicated for
these new RPCs.
This patch is fairly large, but should only affect the NFSv4.1 client
when the "pnfs" option is specified.

PR:		219550
MFC after:	2 weeks
2017-06-24 20:01:21 +00:00
Pedro F. Giffuni
a821bdcfd9 ext2fs: add dir_nlink feature support.
ext4 on linux has always supported more than 32000 directories through
the dir_nlink feature, but FreeBSD was unable to catch up on this feature.
As part of the 64 bit inode changes nlink_t has been extended and this
feature is now possible.

Submitted by:	Fedor Uporov
Differential Revision:	https://reviews.freebsd.org/D11210
2017-06-22 02:43:32 +00:00
Ed Maste
1f7d7cd76a msdosfs: reformat a comment to reduce NetBSD diffs 2017-06-22 01:11:20 +00:00
Rick Macklem
6d7963ecd4 Ensure that the credentials field of the NFSv4 client open structure is
initialized.

bdrewery@ has reported panics "newnfs_copycred: negative nfsc_ngroups".
The only way I can see that this occurs is that the credentials field of
the open structure gets used before being filled in.
I am not sure quite how this happens, but for the file create case, the
code is serialized via the vnode lock on the directory. If, somehow, a
link to the same file gets created just after file creation, this might
occur.

This patch ensures that the credentials field is initialized to a reasonable
set of credentials before the structure is linked into any list, so
this should ensure it is initialized before use.
I am committing the patch now, since bdrewery@ notes that the panics
are intermittent and it may be months before he knows if the patch fixes
his problem.

Reported by:	bdrewery
MFC after:	2 weeks
2017-06-22 00:17:15 +00:00
Pedro F. Giffuni
aee33af1f1 Attempt to treat "metadata" as a collectively singular noun.
Or at least more consistent.

Input from:	matteo, ian
2017-06-20 20:22:34 +00:00