volume limits. In particular:
- Assert that usemap_alloc() and usemap_free() cluster number argument
is valid.
- In chainlength(), return 0 if cluster start is after the max cluster.
- In chainlength(), cut the calculated cluster chain length at the max
cluster.
- For true paranoia, after the pm_inusemap is calculated in
fillinusemap(), reset all bits in the array for clusters after the
max cluster, as in-use.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
delegations enabled and the Linux NFSv4.1 client was reported in
reviews.freebsd.org/D7891.
I believe that the FreeBSD server behaviour conforms to the RFC and that
the Linux client has a bug. Therefore, I do not think the proposed patch
is appropriate. When nfsrv_writedelegifpos is non-zero, the FreeBSD
server will issue a write delegation for a read open if possible.
The Linux client then erroneously assumes that the credentials used for
the read open can write the file.
This patch reverses the default value for nfsrv_writedelegifpos to 0 so
that the default behaviour is Linux compatible and adds a sysctl that can
be used to set nfsrv_writedelegifpos.
This change should only affect users that are mounting a FreeBSD server
with delegations enabled (they are not enabled by default) with a Linux
NFSv4.1 client mount.
Reported by: fatih.acar@gandi.net
Tested by: fatih.acar@gandi.net
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D7891
The old behavior depended on the FAT version and on what files were in the
root directory. "mount_msdosfs -o shortnames" is still supported.
Reviewed by: wblock, cem
Discussed with: trasz, adrian, imp
MFC after: 4 weeks
X-MFC-Notes: Don't MFC the removal of findwin95
Differential Revision: https://reviews.freebsd.org/D8018
The lower vnode is already referenced and nodeget is supposed to consume
the reference. Thus the extra vref call was causing a leak.
Reported by: pho
Reviewed by: kib
MFC after: 1 week
The previous code was forcing an expensive walk in vop_stdvptocnp,
which was causing performance issues on highly contended zfs.
No objections: kib
MFC after: 2 weeks
Standard VOP_FSYNC() implementation just syncs data buffers, and due
to this, is the correct and efficient implementation for msdosfs or
any other filesystem which uses bufer cache trivially. Provide
globally visible wrapper vop_stdfdatasync_buf() for future consumption
by other filesystems.
Reviewed by: mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D7471
the patch in D1626 plus changes so that it includes counts for
NFSv4.1 (and the draft of NFSv4.2).
Also, make all the counts uint64_t and add a vers field at the
beginning, so that future revisions can easily be implemented.
There is code in place to handle the old vesion of the nfsstats
structure for backwards binary compatibility.
Subsequent commits will update nfsstat(8) to use the new fields.
Submitted by: will (earlier version)
Reviewed by: ken
MFC after: 1 month
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D1626
The offset of the directory file, passed to getdirentries(2) syscall,
is user-controllable. The value of the offset must not be asserted,
instead the invalid value should be checked and rejected if invalid.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
These are currently unused in our implementation and some even appear to
have not been implemented yet on linux but it is good to keep them for
reference.
Obtained from: NetBSD (CVS Rev. 1.41)
MFC after: 1 month
Owning Giant in the init/uninit is accidental due to the moment where
VFS modules initialization is performed, and is not enforced by the
VFS interface. The Giant lock does not prevent a parallel execution
of the code, it is VFS which implements the proper protocol.
Approved by: des (pseudofs maintainer)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
and getboottimebin(9) KPI. Change consumers of boottime to use the
KPI. The variables were renamed to avoid shadowing issues with local
variables of the same name.
Issue is that boottime* should be adjusted from tc_windup(), which
requires them to be members of the timehands structure. As a
preparation, this commit only introduces the interface.
Some uses of boottime were found doubtful, e.g. NLM uses boottime to
identify the system boot instance. Arguably the identity should not
change on the leap second adjustment, but the commit is about the
timekeeping code and the consumers were kept bug-to-bug compatible.
Tested by: pho (as part of the bigger patch)
Reviewed by: jhb (same)
Discussed with: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
X-Differential revision: https://reviews.freebsd.org/D7302
Devfs' file layer ioctl is now just a thin shim around the vnode layer.
Reviewed by: kib
Sponsored by: EMC / Isilon Storage Division
Differential Revision: https://reviews.freebsd.org/D7286
framework allowing to set the suspension policy for the dynamic block.
Extend the currently possible policies of stopping on interruptible
sleeps and ignoring such sleeps by two more: do not suspend at
interruptible sleeps, but interrupt them with either EINTR or ERESTART.
Reviewed by: jilles
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Approved by: re (gjb)
executed by inactive methods, must be repeated on reclaim. In
particular, unlink and free sillyrenamed vnode both on inactivation
and reclaim.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Approved by: re (gjb)
If strlen(hostp) was zero, the stack array 'nam' would never be initialized
before being strdup()ed. Fix this by initializing it to the empty string.
It's possible some external condition makes this case impossible, in which
case, an assertion instead of this workaround is appropriate.
Introduced in r299848.
Reported by: Coverity
CID: 1355336
Sponsored by: EMC / Isilon Storage Division
Ext2/3/4 manages generation numbers differently than UFS so adopt
some rules that should work well. When allocating a new inode,
make sure we generate a "good" random value specifically avoiding
zero.
Don't interfere with the numbers that are already generated in
the filesystem: ext2fs doesn't have the backwards compatibility
issues where there were no generation numbers.
Reviewed by: kevlo
MFC after: 1 week
on a fuse mounted file system, it will crash. Although it may be
possible to make this work correctly, this patch avoids the crash
in the meantime.
I removed the MPASS(), since panicing for the FIFO case didn't make
a lot of sense when it returns an error for the others.
PR: 195000
Submitted by: henry.hu.sh@gmail.com (earlier version)
MFC after: 2 weeks
When "cp" of a file with read-only (mode 0444) to a fuse mounted
file system was attempted it would fail with EACCES. This was because
fuse would attempt to open the file WRONLY and the open would fail.
This patch changes the fuse_vnop_open() to test for an extant read-write
open and use that, if it is available.
This makes the "cp" of a read-only file to the fuse mounted file system
work ok.
There are simpler ways to fix this than adding the fuse_filehandle_validrw()
function, but this function is useful for future patches related to
exporting a fuse filesystem via NFS.
MFC after: 2 weeks
eg an NFSv4 root over WiFi: boot from md_root (small rootfs image
preloaded by loader(8)), setup WiFi, and then reroot into the actual
root, over NFS.
Note that it's currently limited to NFSv4, and due to problems with
nfsuserd(8) it requres a workaround on the server side: one needs
to set the vfs.nfsd.enable_stringtouid=1 sysctl and not run nfsuserd(8)
on either the server or the client side.
Reviewed by: rmacklem@
MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D6347
When I/O on a file under fuse is switched from buffered to DIRECT_IO,
it was possible to read stale (before a recent modification) data from
the buffer cache. This patch invalidates the buffer cache for the
file to fix this.
PR: 194293
MFC after: 2 weeks
When a file is opened write-only and a partial block was written,
buffered I/O would try and read the whole block in. This would
result in a hung thread, since there was no open (fuse filehandle)
that allowed reading. This patch avoids the problem by forcing
DIRECT_IO for this case.
It also sets DIRECT_IO when the file system specifies the FN_DIRECTIO
flag in its reply to the open.
Tested by: nishida@asusa.net, freebsd@moosefs.com
PR: 194293, 206238
MFC after: 2 weeks
Trivial use-after-free where stp was freed too soon in the non-error path.
To fix, simply move its release to the end of the routine.
Reported by: Coverity
CID: 1006105
Sponsored by: EMC / Isilon Storage Division
consequence, the nfs client override of VOP_LOCK1() is no longer
needed.
Reviewed and tested by: rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
When support for NFSv4.1 was added to the NFS server, it broke
the server rpc count stats, since newnfsstats.srvrpccnt[] doesn't
have entries for the new NFSv4.1 operations.
Without this patch, the code was incrementing bogus entries in
newnfsstats for the new NFSv4.1 operations.
This patch is an interim fix. The nfsstats structure needs to be
updated and that will come in a future commit.
Reported by: cem
MFC after: 2 weeks
It was reported via email that under certain heavy RPC loads
long delays before the exports would be updated was observed
when using "mountd -S". This patch reverses the priority between
the exclusive lock request to suspend the nfsd threads and the
shared lock request for performing RPCs.
As such, when mountd attempts to suspend the nfsd threads, it
gets priority over outstanding RPC requests to do this.
I suspect that the case reported was an artificial test load,
but this patch did fix the problem for the reporter.
Reported and Tested by: josephlai@qnap.com
MFC after: 2 weeks
This is only allowed by root and only used by the nfs daemon, which
should not provide an incorrect value. However, it's still good
practice to validate data provided by userland.
PR: 206626
Reported by: CTurt <cturt@hardenedbsd.org>
Reviewed by: rmacklem
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D6201
In win2unixfn() we expand Windows 95 style long names. In some cases that
requires moving the data in the nbp->nb_buf buffer backwards to make room. That
code failed to check for overflows, leading to a stack overflow in win2unixfn().
We now check for this event, and mark the entire conversion as failed in that
case. This means we present the 8 character, dos style, name instead.
PR: 204643
Differential Revision: https://reviews.freebsd.org/D6015
It was reported via email that a Linux client couldn't do a Kerberized
NFS mount when only "sec=krb5" was specified for the exports. The Linux
client attempted a mount via krb5i and the server replied NFSERR_SERVERFAULT.
Although NFSERR_WRONGSEC isn't listed as an error for SetClientID, I
think it is the correct reply, so this patch enables that.
I do not know if this fixes the mount attempt, but adding "krb5i" to the
list of allowed security flavours does allow the mount to work.
Reported by: joef@spectralogic.com
MFC after: 2 weeks
h_levels_num, as most data structs in ext2fs, is unsigned so
the index that addresses it has to be unsigned as well.
To get to overflow here we would probably be considering a
degenerate case though.
MFC after: 5 days
The ordering of acquisition of the state and session mutexes was
reversed in two cases executed when an NFSv4.1 client created/freed
a session. Since clients will typically do this only when mounting
and dismounting, the likelyhood of causing a deadlock was low but possible.
This can only occur for NFSv4.1 mounts, since the others do not
use sessions.
This was detected while testing the pNFS server/client where the
client crashed during dismounting.
The patch also reorders the unlocks, although that isn't necessary
for correct operation.
MFC after: 2 weeks
rounddown2 tends to produce longer lines than the original code
and when the code has a high indentation level it was not really
advantageous to do the replacement.
This tries to strike a balance between readability using the macros
and flexibility of having the expressions, so not everything is
converted.
the NFS server would leave the newly created vnode locked. This could
result in a file system that would not unmount and processes wedged,
waiting for the file to be unlocked.
Since this VOP_SETATTR() never fails for most file systems, this bug
doesn't normally manifest itself. I found it during testing of an
exported GlusterFS file system, which can fail.
This patch adds the vput() and changes the error to the correct NFS one.
MFC after: 2 weeks
the old and new NFS clients. He did a good job of isolating the problem
which was caused by the new NFS client not setting the post write mtime
correctly. The new NFS client code was cloned from the old client, but
was incorrect, because the mtime in the nfs vnode's cache wasn't yet
updated. This patch fixes this problem. The patch also adds missing mutex
locking.
Reported and tested by: bde
MFC after: 2 weeks
for limiting disk (actually filesystem) IO.
Note that in some cases these limits are not quite precise. It's ok,
as long as it's within some reasonable bounds.
Testing - and review of the code, in particular the VFS and VM parts - is
very welcome.
MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5080
the functions free the buffer and set the pointer to NULL. Also
remove useless call to brelse(9) on the error path.
PR: 208275
Submitted by: Fabian Keil <fk@fabiankeil.de>
MFC after: 2 weeks
strlens is somewhat suboptimal, but it's a temporary measure that will
be replaced with red-black trees later on.
PR: 204417
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5266
192.168.1.1, with share "share". This commit fixes a problem
where "mkdir /net/192.168.1.1/share/meh" would return spurious
error instead of creating the directory if the target filesystem
wasn't mounted yet; subsequent attempts would work correctly.
The failure scenario is kind of complicated to explain, but it all
boils down to calling VOP_MKDIR() for the target filesystem (NFS)
with wrong dvp - the autofs vnode instead of the filesystem root
mounted over it.
Reviewed by: kib@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D5442
pfs_visible(). The recursion does not cause deadlock because the sx
implementation does not prefer exclusive waiters over the shared, but
this is an implementation detail.
Reported by: pho, Matthew Bryan <matthew.bryan@isilon.com>
Reviewed by: jhb
Tested by: pho
Approved by: des (pseudofs maintainer)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
filesystem to the nullfs mount.
MNTK_NO_IOPF must be present on the nullfs struct mount so that struct
file fo_read and fo_write fops operate in the mode requested by the
lower mount.
MNTK_UNMAPPED_BUFS allows VOP_GETPAGES() to use unmapped buffers. It
does not matter for VOP_GETPAGES() calls from vm_fault() since handle
of the vm_object always points to the lower vnode. But it may be
useful for other situations where VOP_GETPAGES() is used.
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
This adopts the same change as r291936 for UFS.
Directly clear IN_ACCESS or IN_UPDATE when user supplied the time, and
copy the value into the inode.
This keeps the behaviour cleaner and is consistent with UFS.
Reviewed by: bde
MFC after: 1 month (only 10)
unlinked. Otherwise the vnode stays cached, causing leak. This is
similar to r292961 for regular files.
Reported and tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Sync with r84642 from UFS:
The panics are inappropriate because the IN_RENAME flag only fixes a
few of the huge number of race conditions that can result in the
source path becoming invalid even prior to the VOP_RENAME() call.
Found accidentally while checking an issue from PVS Static Analysis.
MFC after: 3 days
Cleanup some checks for NULL. Most of these were always unnecessary and
starting with r294954 brelse() doesn't need any NULL checks at all.
For now keep the checks somewhat consistent with NetBSD in case we want to
merge the cleanups to older versions.
It is otherwise left dangling, and callers that request cookies always free
the cookie buffer, even when VOP_READDIR(9) returns an error. This results
in a double free if tmpfs_readdir() returns an error to the NFS server or
the Linux getdents(2) emulation code.
Reported by: pho
MFC after: 1 week
Security: double free of malloc(9)-backed memory
Sponsored by: EMC / Isilon Storage Division
This is ongoing work from Damjan Jovanovic to improve ext4 read support
with sparse files:
Keep track of the first and last block in each extent as it descends down
the extent tree, thus being able to work out that some blocks are sparse
earlier. This solves an issue on r293680.
In ext4_bmapext() start supporting the runb parameter, which appears to be
the number of adjacent blocks prior to the block being converted in the
same way that runp is the number of blocks after, speding up random access
to mmaped files.
PR: 206652
CID 1018688 is a false positive.
The initialization is done by calling vn_start_write(... &mp, flags).
mp is only an output parameter unless (flags & V_MNTREF), and fdesc
doesn't put V_MNTREF in flags.
Pointed out by: bde