Otherwise the writing thread might wait on the sbusy state of pages that
it busied itself, similarly to nfs_read(). We also need to clear the
NVNSETSZSKIP flag possibly set by ncl_pager_setsize(), so as not to undo
the extension done by the write.
Reported by: bdrewery
Reviewed by: rmacklem
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28306
Otherwise it is dereferenced one extra time at unmount, if it survives
long enough. One way to hold the reference on such a node is to keep it
open.
tmpfs_vptocnp() now needs to account for the possibility that an unlocked
node was removed from the list.
Reported by: danfe
Tested by: danfe, pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Add KASSERTs to nfsm_trimtrailing() to confirm the sanity of
the arguments for the M_EXTPG case.
Suggested by: kib
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28053
Although TMPFS_UNLOCK() is done in both paths later, unlocking a mutex
that is not locked provides a different failure mode.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
We have the d_off field in struct dirent for providing the seek offset
of the next directory entry. Several filesystems were not initializing
the field, which ends up being copied out to userland.
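A minimal userland sketch of the usual remedy, zeroing the whole entry
before filling it so that no field or padding byte carries stale stack
contents. struct demo_dirent and demo_fill_dirent() are hypothetical
stand-ins for illustration, not the real struct dirent layout or any
filesystem's code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct dirent; not the real layout. */
struct demo_dirent {
	uint64_t d_fileno;
	int64_t  d_off;		/* seek offset of the next entry */
	uint16_t d_reclen;
	uint8_t  d_type;
	uint8_t  d_namlen;
	char     d_name[32];
};

/*
 * Fill one entry.  Zeroing the whole structure first guarantees that
 * d_off, any padding, and the unused tail of d_name are all defined
 * before the entry is copied anywhere.
 */
static void
demo_fill_dirent(struct demo_dirent *de, uint64_t ino, int64_t next_off,
    uint8_t type, const char *name)
{
	size_t len = strlen(name);

	assert(len < sizeof(de->d_name));
	memset(de, 0, sizeof(*de));
	de->d_fileno = ino;
	de->d_off = next_off;
	de->d_reclen = (uint16_t)sizeof(*de);
	de->d_type = type;
	de->d_namlen = (uint8_t)len;
	memcpy(de->d_name, name, len);
}
```

The sketch zeroes unconditionally; whether a given filesystem zeroes the
structure or assigns d_off explicitly is a per-filesystem choice.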
Reported by: Syed Faraz Abrar <faraz@elttam.com>
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27792
Must lock the vnode before accessing the fufh table. Also, check for
invalid parameters earlier. Bug introduced by r346170.
MFC after: 2 weeks
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27936
In particular, do not assume that vn_start_write() returns the same mp
as it was passed in, or that it never returns an error.
Also be more careful to return NULL vp and mp when an error occurs, to
make wrong control flow easier to catch.
Stop checking for NULL mp before calling vn_finished_write(); NULL mp
is handled transparently by the function.
Reviewed by: rmacklem
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27881
Commit 774a36851e0e fixed the NFS server so that it could handle
ERELOOKUP returns from VOP calls by redoing the operation/RPC.
However, for NFSv4.0, redoing an Open would increment
the open_owner's seqid multiple times, breaking the protocol.
This patch sets a new flag called ND_ERELOOKUP on the RPC when
a redo is in progress. Then the code that increments the seqid
avoids the seqid increment/check when the flag is set, since
it indicates this has already been done for the Open.
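The idea can be modelled in userland roughly as follows. This is only a
sketch: every demo_* name, DEMO_ERELOOKUP and DEMO_ND_REDO are
hypothetical stand-ins, and the real server code is structured
differently:

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ERELOOKUP	1	/* hypothetical "redo the operation" error */
#define DEMO_ND_REDO	0x1	/* models the ND_ERELOOKUP flag */

struct demo_owner { uint32_t seqid; };
struct demo_rpc { int flags; };

/* Increment the seqid only on the first pass through the operation. */
static void
demo_check_seqid(struct demo_rpc *nd, struct demo_owner *own)
{
	if ((nd->flags & DEMO_ND_REDO) == 0)
		own->seqid++;
}

/* Hypothetical Open that asks to be redone *fail_count times. */
static int
demo_open(struct demo_rpc *nd, struct demo_owner *own, int *fail_count)
{
	demo_check_seqid(nd, own);
	if (*fail_count > 0) {
		(*fail_count)--;
		return (DEMO_ERELOOKUP);
	}
	return (0);
}

/* Redo the operation until it stops returning DEMO_ERELOOKUP. */
static int
demo_dispatch(struct demo_owner *own, int fail_count)
{
	struct demo_rpc nd = { 0 };
	int error;

	assert(fail_count >= 0);
	for (;;) {
		error = demo_open(&nd, own, &fail_count);
		if (error != DEMO_ERELOOKUP)
			break;
		nd.flags |= DEMO_ND_REDO;	/* mark the redo in progress */
	}
	return (error);
}
```

However many times the operation is redone, the seqid advances once.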
r367672 modified UFS such that certain VOPs, such as
VOP_CREATE() will intermittently return ERELOOKUP.
When this happens, the entire system call, or NFS
operation in the case of the NFS server, must be redone.
This patch adds that support to the NFS server by rolling
back the state of the NFS request arguments and NFS
reply arguments mbuf lists to the condition they were
in before the operation and then redoing the operation.
Tested by: pho
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D27875
This updates the FUSE protocol to 7.28, though most of the new features
are optional and are not yet implemented.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27818
FUSE_LSEEK reports holes on fuse file systems, and is used for example
by bsdtar.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27804
Before r332974 the old code would sometimes cause a rare lock order
reversal against pagequeue, which looked roughly like this:
witness_checkorder()
__mtx_lock_flags()
vm_page_alloc()
uma_small_alloc()
keg_alloc_slab()
keg_fetch_slab()
zone_fetch_slab()
zone_import()
zone_alloc_bucket()
uma_zalloc_arg()
bucket_alloc()
uma_zfree_arg()
free()
devfs_metoo()
devfs_populate_loop()
devfs_populate()
devfs_rioctl()
VOP_IOCTL_APV()
VOP_IOCTL()
vn_ioctl()
fo_ioctl()
kern_ioctl()
sys_ioctl()
Since r332974 the original problem no longer exists, but it still
makes sense to move things out of the - often congested - lock.
Reviewed By: kib, markj
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27334
The original fusefs GSoC project seems to have envisioned exchanging two
types of messages with FUSE servers. Perhaps vectored and non-vectored?
But in practice only one type has ever been used. Delete the other type.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27770
This was missed in r340856 / commit
6d2e2df764199f0a15fd743e79599391959cc17d. Three bytes from the kernel
stack may be leaked when reading directory entries.
Reported by: Syed Faraz Abrar <faraz@elttam.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
When using NFS-over-TLS, an NFS client can optionally provide an X.509
certificate to the server during the TLS handshake. For some situations,
such as different NFS servers or different certificates being mapped
to different user credentials on the NFS server, there may be a need
for different mounts to provide different certificates.
This new mount option called "tlscertname" may be used to specify that a
non-default certificate be provided. This alternate certificate will
be stored in /etc/rpc.tlsclntd in a file with a name based on what is
provided by this mount option.
The argument is a void * so there's no need to cast it to caddr_t.
Update documentation to match function declaration.
Reviewed by: freqlabs
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27093
Replace MAXPHYS by the runtime variable maxphys. It is initialized from
MAXPHYS by default, but can also be adjusted with the tunable kern.maxphys.
Make the b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allows several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save a significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
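The need for the extra page follows from simple arithmetic: a buffer of
maxphys bytes that does not start on a page boundary touches one page
more than an aligned one. A small userland sketch, assuming a 4096-byte
page and a hypothetical maxphys of 1 MiB (neither value is taken from
this change; demo_pages_spanned() is illustrative only):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE	4096u	/* assumed page size */

/* Number of pages touched by the byte range [addr, addr + len). */
static size_t
demo_pages_spanned(uintptr_t addr, size_t len)
{
	uintptr_t first, last;

	if (len == 0)
		return (0);
	first = addr / DEMO_PAGE_SIZE;
	last = (addr + len - 1) / DEMO_PAGE_SIZE;
	assert(last >= first);
	return ((size_t)(last - first + 1));
}
```

With these assumptions, demo_pages_spanned(0x10000, 1048576) is 256, which
is the atop() of the size, while demo_pages_spanned(0x10200, 1048576) is
257.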
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except the place which initializes maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embedded structures,
get a private MAXPHYS-like constant; their conversion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis were either submitted by, or based on changes by, mav.
Suggested by: mav (*)
Reviewed by: imp, mav, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
Normal bypass expects a locked vnode, which is not true for
VOP_READ_PGCACHE(). Ensure liveness of the lower vnode by taking the
upper vnode interlock, which is also taken by null_reclaim() when
setting v_data to NULL.
Reported and tested by: pho
Reviewed by: markj, mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27327
This also eliminates unsafe use of VFS_SYNC(MNT_WAIT).
Requested by: mckusick
Discussed with: imp
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D27269
FreeBSD's NFS exporter has long exported some unused statistics fields.
Revision r366992 removed them from nfsstat. This revision renames those
fields in the kernel's exported structures to make it clear to other
consumers that they are unused.
Reported by: emaste
Reviewed by: emaste
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D27258
No functional change intended.
Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).
__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037
from a Linux binary. Should come in handy for AppImages.
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26959
Add a pseudofs node flag 'PFS_AUTODRAIN', which automatically emits sbuf
contents to the caller when the sbuf buffer fills. This is only
permissible if the corresponding PFS node fill function can sleep
whenever it appends to the sbuf.
linprocfs' /proc/self/maps node happens to meet this requirement.
Streaming out the file as it is composed avoids truncating the output
and also avoids preallocating a very large buffer.
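The streaming behaviour can be modelled with a small fixed-size buffer
that hands its contents to a drain callback whenever it fills. This is a
userland sketch only: struct demo_buf, demo_buf_cat() and the other
demo_* names are hypothetical and are not the sbuf(9) or pseudofs
interfaces:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int demo_drain_fn(void *arg, const char *data, size_t len);

struct demo_buf {
	char		buf[16];	/* deliberately tiny */
	size_t		len;
	demo_drain_fn	*drain;
	void		*drain_arg;
};

static void
demo_buf_init(struct demo_buf *b, demo_drain_fn *fn, void *arg)
{
	b->len = 0;
	b->drain = fn;
	b->drain_arg = arg;
}

/* Emit whatever is buffered to the drain callback and reset. */
static int
demo_buf_flush(struct demo_buf *b)
{
	int error = 0;

	assert(b->drain != NULL);
	if (b->len > 0) {
		error = b->drain(b->drain_arg, b->buf, b->len);
		b->len = 0;
	}
	return (error);
}

/* Append a string, draining automatically each time the buffer fills. */
static int
demo_buf_cat(struct demo_buf *b, const char *s)
{
	size_t chunk, n = strlen(s), room;
	int error;

	while (n > 0) {
		room = sizeof(b->buf) - b->len;
		if (room == 0) {
			if ((error = demo_buf_flush(b)) != 0)
				return (error);
			room = sizeof(b->buf);
		}
		chunk = n < room ? n : room;
		memcpy(b->buf + b->len, s, chunk);
		b->len += chunk;
		s += chunk;
		n -= chunk;
	}
	return (0);
}

/* Example drain target that collects the streamed output. */
struct demo_sink {
	char	out[128];
	size_t	len;
	int	calls;
};

static int
demo_sink_drain(void *arg, const char *data, size_t len)
{
	struct demo_sink *sk = arg;

	if (len > sizeof(sk->out) - sk->len)
		return (-1);
	memcpy(sk->out + sk->len, data, len);
	sk->len += len;
	sk->calls++;
	return (0);
}
```

Because the drain may run in the middle of any append, the producer must
be able to tolerate whatever the drain does (in the kernel case, sleep)
at every append, which mirrors the restriction stated above.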
Reviewed by: markj; earlier version: emaste, kib, trasz
Differential Revision: https://reviews.freebsd.org/D27047
instead of mount_nullfs(8).
Obviously you'd need to force mount(8) to not call
mount_nullfs(8) to make use of it.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26934
Foundation copyrights, approved by emaste@. It does not include
files which carry other people's copyrights; if you're one
of those people, feel free to make a similar change.
Reviewed by: emaste, imp, gbe (manpages)
Differential Revision: https://reviews.freebsd.org/D26980
module by name and not only by the version information, so that
"kldstat -q -m cuse" works.
Found by: Goran Mekic <meka@tilda.center>
MFC after: 1 week
Sponsored by: Mellanox Technologies // NVIDIA Networking
If the lower VOP relocked the lower vnode, it is possible that the nullfs
vnode was reclaimed in the meantime. In this case the nullfs vnode no longer
shares its lock with the lower vnode, which breaks the locking protocol.
Check for the condition and acquire the nullfs vnode lock if detected.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Use of dead_vnodeops would result in a panic instead of returning the intended
EOPNOTSUPP error.
While here make sure to abort, not just try to return a partial result.
The former allows the regular lookup to restart from scratch, while the latter
makes it stuck with an unusable vnode.
Reported by: kevans