When an mmap'd text file is written and then executed immediately
afterwards, it was possible that the modify time would change after the
text file was executing, resulting in the process executing the file
being killed. This was usually only observed when the file system's
times were set to higher resolution, but could have occurred for any
time resolution.
This was reported on a recent email list discussion.
This patch adds a VOP_SET_TEXT() to the NFS client which flushed all
dirty pages to the NFS server and then makes sure that n_mtime is up
to date to avoid this from occurring.
Thanks go to kib@ and pho@ for their help with developing this patch.
Tested by: pho
Reviewed by: kib
MFC after: 2 weeks
If the ExchangeID/CreateSession operations done by an NFSv4.1 client
after the server crashes/reboots fails, it is possible that some process/thread
is waiting for an open_owner lock. If the client state is free'd, this
can cause a crash.
This would not normally happen, but has been observed on a mount of the
AmazonEFS service.
Reported by: cperciva
Tested by: cperciva
PR: 216086
MFC after: 2 weeks
during recovery.
If the NFSv4.1 client gets a NFSv4.1 NFSERR_BADSESSION reply to an Open/Lock
operation while recovering from the server crash/reboot, allow the opens
to be retained for a subsequent recovery attempt. Since NFSv4.1 servers
should only reply NFSERR_BADSESSION after a crash/reboot that has lost
state, this case should almost never happen.
However, for the AmazonEFS file service, this has been observed when
the client does a fresh TCP connection for RPCs.
Reported by: cperciva
Tested by: cperciva
PR: 216088
MFC after: 2 weeks
Instead, issue a diagnostic and return appropriate error if
ncl_flush() was unable to clean buffer queue after the specified
number or retries.
Reviewed by: rmacklem
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
The AmazonEFS NFSv4.1 server does not support the FILES_FREE and FILES_TOTAL
attributes. As such, an NFSv4.1 mount to the server would return garbage
for these values. This patch initializes the fields of the nfsstatfs structure,
so that "df" and friends will at least return consistent bogus values.
This patch should have effect when mounting other NFSv4.1 servers.
Reported by: cperciva
MFC after: 2 weeks
This patch gives a requestor of the exclusive lock on the client state
in the NFSv4 client priority over shared lock requestors. This avoids
the server crash recovery thread being starved out by other threads doing
RPCs.
Tested by: cperciva
PR: 216087
MFC after: 2 weeks
When the NFSv4 client Commit operation encountered a stale write verifier,
it erroneously mapped that to EIO. This could have caused recently written
data to be lost when a server crashes/reboots between an UNSTABLE write
and the subsequent commit. This patch fixes this.
The bug was only for the NFSv4 client and did not affect NFSv3.
Tested by: cperciva
PR: 215887
MFC after: 2 weeks
For the ReclaimComplete operation, the RPC layer should not loop on
NFSERR_BADSESSION. If it does, the recovery thread (nfscl) can get stuck
looping and will not do a recovery.
This patch fixes it so it does not loop. This bug only affects NFSv4.1 and
only when a server reboots.
Tested by: cperciva
PR: 215886
MFC after: 2 weeks
If an operation that preceeds a Setattr in an NFSv4 compound fails,
there is no bitmap of attributes to parse. Without this patch, the
parsing would fail and return EBADRPC instead of the correct failure
error. This could break recovery from a server crash/reboot.
Tested by: cperciva
PR: 215883
MFC after: 2 weeks
Based on the change in r242386, it seems clear that scred was intended to
be released in all paths at exit.
No functional change. This line's indent was just the result of a bad copy
paste from the previous free() in an early exit path.
Reported by: PVS-Studio
Sponsored by: Dell EMC Isilon
Write out the dirty pages using VOP_WRITE() instead of directly
calling ncl_writerpc(). The state of the buffers now reflects the
write, fixing some hard to diagnose consistency and write order
issues. The change also allowed to remove remapping of paged out
pages into kernel space and related allocation of the phys buffer.
Reviewed by: markj, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D10241
ncl_vinvalbuf() might need to upgrade vnode lock, allowing the vnode
to be reclaimed by other thread. Handle the situation, indicated by
the returned error zero and VI_DOOMED iflag set, converting it into
EBADF. Handle all calls, even where the vnode is exclusively locked
right now.
Reviewed by: markj, rmacklem
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
X-Differential revision: https://reviews.freebsd.org/D10241
This interface has no in-tree consumers and has been more or less
non-functional for several releases.
Remove manpage note that the procfs special file 'mem' is grouped to
kmem. This hasn't been true since r81107.
Remove procfs' README file. It is an out of date duplication of the manpage
(quoth the README: "since the bsd kernel is single-processor...").
Reviewed by: vangyzen, bcr (manpage)
Approved by: des (procfs maintainer), vangyzen (mentor)
Differential Revision: https://reviews.freebsd.org/D9802
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
Thread might create a condition for delayed SU cleanup, which creates
a reference to the mount point in td_su, but exit without returning
through userret(), e.g. when terminating due to single-threading or
process exit. In this case, td_su reference is not dropped and mount
point cannot be freed.
Handle the situation by clearing td_su also in the thread destructor
and in exit1(). softdep_ast_cleanup() has to receive the thread as
argument, since e.g. thread destructor is executed in different
context.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Right now the noexec mount option disallows image activators to try
execve the files on the mount point. Also, after r127187, noexec
also limits max_prot map entries permissions for mappings of files
from such mounts, but not the actual mapping permissions.
As result, the API behaviour is inconsistent. The files from noexec
mount can be mapped with PROT_EXEC, but if mprotect(2) drops execution
permission, it cannot be re-enabled later. Make this consistent
logically and aligned with behaviour of other systems, by disallowing
PROT_EXEC for mmap(2).
Note that this change only ensures aligned results from mmap(2) and
mprotect(2), it does not prevent actual code execution from files
coming from noexec mount. Such files can always be read into
anonymous executable memory and executed from there.
Reported by: shamaz.mazum@gmail.com
PR: 217062
Reviewed by: alc
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.
Suggested by: glebius, emaste
Reviewed by: gnn
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9625
Right now this is not critical, but will be after planned increase of
MNAMELEN from 88 to 1k.
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
It's not a proper fix, but should be better than what we have now.
Since it got broken some six months ago it results in an incredibly
annoying and trivially reproducible panic every time eg an USB disk
gets disconnected.
MFC after: 2 weeks
Sponsored by: DARPA, AFRL
This patch adds a new function to the server krpc called
svcpool_close(). It is similar to svcpool_destroy(), but does not free
the data structures, so that the pool can be used again.
This function is then used instead of svcpool_destroy(),
svcpool_create() when the nfsd threads are killed.
PR: 204340
Reported by: Panzura
Approved by: rmacklem
Obtained from: rmacklem
MFC after: 1 week
The option "nonc" disables using of namecache for the created mount,
by default namecache is used. The rationale for the option is that
namecache duplicates the information which is already kept in memory
by tmpfs. Since it believed that namecache scales better than tmpfs,
or will scale better, do not enable the option by default. On the
other hand, smaller machines may benefit from lesser namecache
pressure.
Discussed with: mjg
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
For directories, node->tn_spec.tn_dir.tn_parent pointer to the parent
is used. For non-directories, the implementation is naive, all
directory nodes are scanned to find a dirent linking the specified
node. This can be significantly improved by maintaining tn_parent for
all nodes, later.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
On dotdot lookup and fhtovp operations, it is possible for the file
represented by tmpfs node to be removed after the thread calculated
the pointer. In this case, tmpfs_alloc_vp() accesses freed memory.
Introduce the reference count on the nodes. The allnodes list from
tmpfs mount owns 1 reference, and threads performing unlocked
operations on the node, add one transient reference. Similarly, since
struct tmpfs_mount maintains the list where nodes are enlisted,
refcount it by one reference from struct mount and one reference from
each node on the list. Both nodes and tmpfs_mounts are removed when
refcount goes to zero.
Note that this means that nodes and tmpfs_mounts might survive some
time after the node is deleted or tmpfs_unmount() finished. The
tmpfs_alloc_vp() in these cases returns error either due to node
removal (tn_nlinks == 0) or because of insmntque1(9) error.
Tested by: pho (as part of larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Remove TMPFS_ASSERT_ELOCKED(). Its claims are already stated by other
asserts nearby and by VFS guarantees.
Change TMPFS_ASSERT_LOCKED() and one inlined place to use
ASSERT_VOP_(E)LOCKED() instead of hand-rolled imprecise asserts.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Edit comments which explain no longer relevant details, and add
locking annotations to the struct tmpfs_node members.
Tested by: pho (as part of the larger patch)
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
If some process' nodes were accessed using procfs and the process
cannot exit properly at the time modunload event is reported to the
pseudofs-backed filesystem, the assertion in pfs_vncache_unload() is
triggered. Assertion is correct, the cache should be cleaned.
Approved by: des (pseudofs maintainer)
Reported and tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Inums in cd9660 refer to byte offsets on the media. DVD and BD media
can have entries above 4GB, especially with multi-session images.
PR: 190655
Reported by: Thomas Schmitt <scdbackup at gmx.net>