in6_selectsrc() may call fib6_lookup() in some cases, which requires
epoch. Wrap in6_selectsrc* calls into epoch inside its users.
Mark it as requiring epoch by adding NET_EPOCH_ASSERT().
MFC after: 1 weeek
Differential Revision: https://reviews.freebsd.org/D28647
This allows d_off to be used with lseek to position the file so that
getdirentries(2) will return the next entry. It is not used by
readdir(3).
PR: 253411
Reported by: John Millikin <jmillikin@gmail.com>
Reviewed by: cem
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28605
Apply VOP_VPUT_PAIR() to the end of vnode operations after the
VOP_MKNOD(), VOP_MKDIR(), VOP_LINK(), VOP_SYMLINK(), VOP_CREATE().
Reviewed by: chs, mckusick
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Generic bypass cannot understand the rules of liveness for the VOP.
Reviewed by: chs, mckusick
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Most future operations on the returned file descriptor will fail
anyway, and application should be ready to handle that failures. Not
forcing it to understand the transient failure mode on open, which is
implementation-specific, should make us less special without loss of
reporting of errors.
Suggested by: chs
Reviewed by: chs, mckusick
Tested by: pho
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
This could happen when failing due to disappearing source file.
Reviewed By: kib
Tested by: pho
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27338
We would unlock fvp here, only to unlock it again below,
just before "bad".
Reviewed By: kib
Tested by: pho
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27339
vfs_cache_lookup() has already done the appropriate VEXEC check, therefore
we must not re-check in VOP_CACHEDLOOKUP.
This fixes O_SEARCH semantics on tmpfs and removes a redundant descent into
VOP_ACCESS() in the common case.
Reported-by: arichardson (via CheriBSD Jenkins CI)
Reviewed-by: kib
MFC-after: 3 days
Differential Revision: https://reviews.freebsd.org/D28401
We provide these for compat with other queue.h headers since some software
assumes it exists (e.g. the libevent contrib code), but we are not
encouraging their use (NULL should be used instead).
This fixes the following warning (which should arguable be an error since
it results in a function call to an undefined function):
.../contrib/libevent/buffer.c:495:16: warning: implicit declaration of function 'LIST_END' is invalid in C99 [-Wimplicit-function-declaration]
cbent != LIST_END(&buffer->callbacks);
^
.../contrib/libevent/buffer.c:495:13: warning: comparison between pointer and integer ('struct evbuffer_cb_entry *' and 'int') [-Wpointer-integer-compare]
cbent != LIST_END(&buffer->callbacks);
~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed By: jhb
Differential Revision: https://reviews.freebsd.org/D27151
Otherwise writing thread might wait on sbusy state of the pages which were
busied by itself, similarly to nfs_read(). But also we need to clear
NVNSETSZKSIP flag possibly set by ncl_pager_setsize(), to not undo
extension done by write.
Reported by: bdrewery
Reviewed by: rmacklem
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28306
Otherwise it is dereferenced one extra time at unmount, if it survives
long enough. One way to hold the reference on such node is to keep it
open.
tmpfs_vptocnp() now needs to account for the possibility that unlocked
node was removed from the list.
Reported by: danfe
Tested by: danfe, pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Add KASSERTS to nfsm_trimtrailing() to confirm the sanity of
the arguments for the M_EXTPG case.
Suggested by: kib
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28053
Despite TMPFS_UNLOCK() is done in both paths later, unlocking not locked
mutex provides different failure mode.
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
We have the d_off field in struct dirent for providing the seek offset
of the next directory entry. Several filesystems were not initializing
the field, which ends up being copied out to userland.
Reported by: Syed Faraz Abrar <faraz@elttam.com>
Reviewed by: kib
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27792
Must lock the vnode before accessing the fufh table. Also, check for
invalid parameters earlier. Bug introduced by r346170.
MFC after: 2 weeks
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27936
In particular, do not assume that vn_start_write() returns the same mp
as it was passed in, or never returns error.
Also be more accurate to return NULL vp and mp when error occured, to
catch wrong control flow easier.
Stop checking for NULL mp before calling vn_finished_write(), NULL mp
is handled transparently by the function.
Reviewed by: rmacklem
Tested by: pho
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D27881
Commit 774a36851e fixed the NFS server so that it could handle
ERELOOKUP returns from VOP calls by redoing the operation/RPC.
However, for NFSv4.0, redoing an Open would increment
the open_owner's seqid multiple times, breaking the protocol.
This patch sets a new flag called ND_ERELOOKUP on the RPC when
a redo is in progress. Then the code that increments the seqid
avoids the seqid increment/check when the flag is set, since
it indicates this has already been done for the Open.
r367672 modified UFS such that certain VOPs, such as
VOP_CREATE() will intermittently return ERELOOKUP.
When this happens, the entire system call, or NFS
operation in the case of the NFS server, must be redone.
This patch adds that support to the NFS server by rolling
back the state of the NFS request arguments and NFS
reply arguments mbuf lists to the condition they were
in before the operation and then redoing the operation.
Tested by: pho
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D27875
This updates the FUSE protocol to 7.28, though most of the new features
are optional and are not yet implemented.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27818
FUSE_LSEEK reports holes on fuse file systems, and is used for example
by bsdtar.
MFC after: 2 weeks
Relnotes: yes
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27804
Before r332974 the old code would sometimes cause a rare lock order
reversal against pagequeue, which looked roughly like this:
witness_checkorder()
__mtx_lock-flags()
vm_page_alloc()
uma_small_alloc()
keg_alloc_slab()
keg_fetch-slab()
zone_fetch-slab()
zone_import()
zone_alloc_bucket()
uma_zalloc_arg()
bucket_alloc()
uma_zfree_arg()
free()
devfs_metoo()
devfs_populate_loop()
devfs_populate()
devfs_rioctl()
VOP_IOCTL_APV()
VOP_IOCTL()
vn_ioctl()
fo_ioctl()
kern_ioctl()
sys_ioctl()
Since r332974 the original problem no longer exists, but it still
makes sense to move things out of the - often congested - lock.
Reviewed By: kib, markj
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27334
The original fusefs GSoC project seems to have envisioned exchanging two
types of messages with FUSE servers. Perhaps vectored and non-vectored?
But in practice only one type has ever been used. Delete the other type.
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D27770
This was missed in r340856 / commit
6d2e2df764. Three bytes from the kernel
stack may be leaked when reading directory entries.
Reported by: Syed Faraz Abrar <faraz@elttam.com>
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
When using NFS-over-TLS, an NFS client can optionally provide an X.509
certificate to the server during the TLS handshake. For some situations,
such as different NFS servers or different certificates being mapped
to different user credentials on the NFS server, there may be a need
for different mounts to provide different certificates.
This new mount option called "tlscertname" may be used to specify a
non-default certificate be provided. This alernate certificate will
be stored in /etc/rpc.tlsclntd in a file with a name based on what is
provided by this mount option.
The argument is a void * so there's no need to cast it to caddr_t.
Update documentation to match function decleration.
Reviewed by: freqlabs
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D27093
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.
Make b_pages[] array in struct buf flexible. Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*). Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.
Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys. Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight. Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.
Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.
Suggested by: mav (*)
Reviewed by: imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27225
Normal bypass expects locked vnode, which is not true for
VOP_READ_PGCACHE(). Ensure liveness of the lower vnode by taking the
upper vnode interlock, which is also taked by null_reclaim() when
setting v_data to NULL.
Reported and tested by: pho
Reviewed by: markj, mjg
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D27327
This also eliminates unsafe use of VFS_SYNC(MNT_WAIT).
Requested by: mckusick
Discussed with: imp
Tested by: pho (previous version)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D27269
FreeBSD's NFS exporter has long exported some unused statistics fields.
Revision r366992 removed them from nfsstat. This revision renames those
fields in the kernel's exported structures to make it clear to other
consumers that they are unused.
Reported by: emaste
Reviewed by: emaste
Sponsored by: Axcient
Differential Revision: https://reviews.freebsd.org/D27258
No functional change intended.
Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).
__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.
Reviewed by: kib
Discussed with: markj, mjg
Differential Revision: https://reviews.freebsd.org/D27037
from a Linux binary. Should come handy for AppImages.
Reviewed by: asomers
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D26959