Commit Graph

108 Commits

Author SHA1 Message Date
Mateusz Guzik
d292b1940c vfs: remove the obsolete privused argument from vaccess
This brings argument count down to 6, which is passable without the
stack on amd64.
2020-08-05 09:27:03 +00:00
Kyle Evans
bd11e674ec pseudofs: don't do VEXEC check in VOP_CACHEDLOOKUP
VOP_CACHEDLOOKUP should assume that the appropriate VEXEC check has been
done in the caller (vfs_cache_lookup), so it does not belong here.
2020-02-02 15:36:12 +00:00
Mateusz Guzik
45757984f8 vfs: consistently use size_t for buflen around VOP_VPTOCNP 2020-02-01 20:34:43 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Mateusz Guzik
6fa079fc3f vfs: flatten vop vectors
This eliminates the following loop from all VOP calls:

while(vop != NULL && \
    vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;

Reviewed by:	jeff
Tesetd by:	pho
Differential Revision:	https://reviews.freebsd.org/D22738
2019-12-16 00:06:22 +00:00
Mateusz Guzik
abd80ddb94 vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.

v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.

Reviewed by:	kib, jeff
Differential Revision:	https://reviews.freebsd.org/D22715
2019-12-08 21:30:04 +00:00
Kyle Evans
f99c5e8d28 pseudofs: make readdir work without a pid again
Specifically, the following was broken:

$ mount -t procfs procfs /proc
$ ls -l /proc

r351741 reworked readdir slightly to avoid pfs_node/pidhash LOR, but
inadvertently regressed pid == NO_PID; new pfs_lookup_proc() fails for the
obvious reasons, and later pfs_visible_proc doesn't capture the
pid == NO_PID -> return 1 aspect of pfs_visible. We can infact skip this
whole block if we're operating on a directory w/ NO_PID, as it's always
visible.

Reported by:	trasz
Reviewed by:	mjg
Differential Revision:	https://reviews.freebsd.org/D21518
2019-09-04 14:20:39 +00:00
Mateusz Guzik
f5791174df pseudofs: fix a LOR pfs_node vs pidhash (sleepable after non-sleepable)
Sponsored by:	The FreeBSD Foundation
2019-09-03 12:54:51 +00:00
Johannes Lundberg
1e363d64a5 pseudofs: Ignore unsupported commands in vop_setattr.
Users of pseudofs (e.g. lindebugfs), should be able to receive
input from command line via commands like "echo 1 > /path/to/file".
Currently this fails because sh tries to truncate the file first and
vop_setattr returns not supported error for this. This patch simply
ignores the error and returns 0 instead.

Reviewed by:	imp (mentor), asomers
Approved by:	imp (mentor), asomers
MFC after:	1 week
Differential Revision: D20451
2019-05-28 20:54:59 +00:00
Mark Johnston
6d2e2df764 Ensure that directory entry padding bytes are zeroed.
Directory entries must be padded to maintain alignment; in many
filesystems the padding was not initialized, resulting in stack
memory being copied out to userspace.  With the ino64 work there
are also some explicit pad fields in struct dirent.  Add a subroutine
to clear these bytes and use it in the in-tree filesystems.  The
NFS client is omitted for now as it was fixed separately in r340787.

Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2018-11-23 22:24:59 +00:00
Mateusz Guzik
53011553fa proc: convert pfind & friends to use pidhash locks and other cleanup
pfind_locked is retired as it relied on allproc which unnecessarily
restricts locking of the hash.

Sponsored by:	The FreeBSD Foundation
2018-11-21 20:15:56 +00:00
Konstantin Belousov
1c4ca77890 Add d_off support for multiple filesystems.
The d_off field has been added to the dirent structure recently.
Currently filesystems don't support this feature.  Support has been
added and tested for zfs, ufs, ext2fs, fdescfs, msdosfs and unionfs.
A stub implementation is available for cd9660, nandfs, udf and
pseudofs but hasn't been tested.

Motivation for this feature: our usecase is for a userspace nfs server
(nfs-ganesha) with zfs.  At the moment we cache direntry offsets by
calling lseek once per entry, with this patch we can get the offset
directly from getdirentries(2) calls which provides a significant
speedup.

Submitted by:	Jack Halford <jack@gandi.net>
Reviewed by:	mckusick, pfg, rmacklem (previous versions)
Sponsored by:	Gandi.net
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17917
2018-11-14 14:18:35 +00:00
Pedro F. Giffuni
d63027b668 sys/fs: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:15:37 +00:00
Enji Cooper
6bfe453238 Fix LINT, broken by a -Wformat warning in r320329 with PFS_DELEN being
changed from %d to a long-width type.

Use uintmax_t casting and %ju to futureproof the format string against
potential changes with either the #define or the implementation-specific
definition for offsetof(..).
2017-06-27 17:01:46 +00:00
Konstantin Belousov
ffc161df9f Do not perform unneccessary shared recursion on the allproc_lock in
pfs_visible().  The recursion does not cause deadlock because the sx
implementation does not prefer exclusive waiters over the shared, but
this is an implementation detail.

Reported by:	pho, Matthew Bryan <matthew.bryan@isilon.com>
Reviewed by:	jhb
Tested by:	pho
Approved by:	des (pseudofs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2016-03-11 11:51:38 +00:00
Konstantin Belousov
587430f254 Redo r258088 to avoid relying on signed arithmetic overflow, since
compiler interprets this as an undefined behaviour.  Instead, ensure
that the sum of uio_offset and uio_resid is below OFF_MAX using the
operation which cannot overflow.

Reported and tested by:	pho
Discussed with:	bde
Approved by:	des (pseudofs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-11-20 19:41:00 +00:00
Konstantin Belousov
5ba4de79a7 Remove useless comparisions of assigned offset and resid with the
sources from uio.  Both uio_offset and offset, and uio_resid and resid
have the same types for some time.

Add check for buflen overflow by comparing the buflen with both offset
and resid (vs. comparing with offset only, as it is currently done).

Reported and tested by:	pho
Approved by:	des (pseudofs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2013-11-13 08:55:09 +00:00
Konstantin Belousov
31452ff75e Apply inlined vn_vget_ino() algorithm for ".." lookup in pseudofs.
Reported and tested by:	pho
MFC after:	2 weeks
2012-03-05 11:38:02 +00:00
Konstantin Belousov
526d0bd547 Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with:	bde, das (previous versions)
MFC after:	1 month
2012-02-21 01:05:12 +00:00
Ulrich Spörlein
9a14aa017b Convert files to UTF-8 2012-01-15 13:23:18 +00:00
Jaakko Heinonen
d467c9472a r222004 changed sbuf_finish() to not clear the buffer error status. As a
consequence sbuf_len() will return -1 for buffers which had the error
status set prior to sbuf_finish() call. This causes a problem in
pfs_read() which purposely uses a fixed size sbuf to discard bytes which
are not needed to fulfill the read request.

Work around the problem by using the full buffer length when
sbuf_finish() indicates an overflow. An overflowed sbuf with fixed size
is always full.

PR:		kern/163076
Approved by:	des
MFC after:	2 weeks
2012-01-06 10:12:59 +00:00
Jaakko Heinonen
9cb24e3c98 Check the return value of sbuf_finish() in pfs_readlink() and return
ENAMETOOLONG if the buffer overflowed.

Approved by:	des
MFC after:	2 weeks
2012-01-06 09:17:34 +00:00
Konstantin Belousov
f82360acf2 Existing VOP_VPTOCNP() interface has a fatal flow that is critical for
nullfs.  The problem is that resulting vnode is only required to be
held on return from the successfull call to vop, instead of being
referenced.

Nullfs VOP_INACTIVE() method reclaims the vnode, which in combination
with the VOP_VPTOCNP() interface means that the directory vnode
returned from VOP_VPTOCNP() is reclaimed in advance, causing
vn_fullpath() to error with EBADF or like.

Change the interface for VOP_VPTOCNP(), now the dvp must be
referenced. Convert all in-tree implementations of VOP_VPTOCNP(),
which is trivial, because vhold(9) and vref(9) are similar in the
locking prerequisites. Out-of-tree fs implementation of VOP_VPTOCNP(),
if any, should have no trouble with the fix.

Tested by:	pho
Reviewed by:	mckusick
MFC after:	3 weeks (subject of re approval)
2011-11-19 07:50:49 +00:00
Konstantin Belousov
1fb5311e00 Fix build, use %d for int value formatting. 2011-11-16 18:41:59 +00:00
Peter Holm
50546f8ffe Handle invalid large values for getdirentries(2) data buffer size.
In collaboration with:	kib
Reviewed by:	des
Reported by:	The iknowthis syscall fuzzer.
MFC after:	1 week
2011-11-16 10:11:55 +00:00
Peter Holm
3c93d4433f Removed extra PRELE() call.
MFC after:	1 week
2011-11-15 09:23:21 +00:00
Konstantin Belousov
5673e3cb08 The cache_enter(9) function shall not be called for doomed dvp.
Assert this.

In the reported panic, vdestroy() fired the assertion "vp has namecache
for ..", because pseudofs may end up doing cache_enter() with reclaimed
dvp, after dotdot lookup temporary unlocked dvp.
Similar problem exists in ufs_lookup() for "." lookup, when vnode
lock needs to be upgraded.

Verify that dvp is not reclaimed before calling cache_enter().

Reported and tested by:	pho
Reviewed by:	kan
MFC after:	2 weeks
2010-04-20 10:19:27 +00:00
Jaakko Heinonen
e48fbf26e8 Truncate read request rather than returning EIO if the request is
larger than MAXPHYS + 1. This fixes a problem with cat(1) when it
uses a large I/O buffer.

Reported by:	Fernando Apesteguía
Suggested by:	jilles
Reviewed by:	des
Approved by:	trasz (mentor)
2010-01-22 08:45:12 +00:00
Konstantin Belousov
34f83c86f6 Remove spurious pfs_unlock().
PR:	kern/137310
Reviewed by:	des
MFC after:	3 days
2009-08-31 09:26:04 +00:00
Konstantin Belousov
9f80ce043d Change the type of uio_resid member of struct uio from int to ssize_t.
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.

Tested by:	pho
Reviewed by:	jhb, bde (as part of the review of the bigger patch)
2009-06-25 18:46:30 +00:00
Konstantin Belousov
c4df27d5c8 VOP_IOCTL takes unlocked vnode as an argument. Due to this, v_data may
be NULL or derefenced memory may become free at arbitrary moment.

Lock the vnode in cd9660, devfs and pseudofs implementation of VOP_IOCTL
to prevent reclaim; check whether the vnode was already reclaimed after
the lock is granted.

Reported by:	georg at dts su
Reviewed by:	des (pseudofs)
MFC after:	2 weeks
2009-06-10 13:57:36 +00:00
Konstantin Belousov
b00098d164 Unlock the pseudofs vnode before calling fill method for pfs_readlink().
The fill code may need to lock another vnode, e.g. procfs file
implementation.

Reviewed by:	des
Tested by:	pho
MFC after:	2 weeks
2009-05-31 15:01:50 +00:00
Dag-Erling Smørgrav
26088b9d4d Use a temporary variable to avoid a duplicate strlen().
Submitted by:	kib
MFC after:	1 week
2009-05-28 10:24:26 +00:00
Dag-Erling Smørgrav
4970b8ae0a Remove spurious locking in pfs_write().
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-08 09:02:42 +00:00
Dag-Erling Smørgrav
59fc1b068f Fix an inverted KASSERT. Add similar assertions in other similar places.
Reported by:	Andrew Brampton <me@bramp.net>
MFC after:	1 week
2009-04-07 16:13:10 +00:00
Dag-Erling Smørgrav
655fcdaa00 Fix a logic bug that caused the pfs_attr method to be called only for
PFS_PROCDEP nodes.

Submitted by:	Andrew Brampton <brampton@gmail.com>
MFC after:	2 weeks
2009-02-16 15:17:26 +00:00
Joe Marcus Clarke
e7f54c1b71 Add a VOP_VPTOCNP implementation for pseudofs which covers file systems
such as procfs and linprocfs.

This implementation's locking was enhanced by kib.

Reviewed by:	kib
		des
Approved by:	des
		kib
Tested by:	pho
2008-12-30 21:49:39 +00:00
Konstantin Belousov
505d02eebe Drop the pseudofs vnode lock around call to pfs_read handler. The handler
may need to lock arbitrary vnodes, causing either lock order reversal or
recursive vnode lock acquisition.

Tested by:	pho
Approved by:	des
MFC after:	2 weeks
2008-12-29 12:12:23 +00:00
Edward Tomasz Napierala
15bc6b2bd8 Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by:	rwatson (mentor)
2008-10-28 13:44:11 +00:00
Konstantin Belousov
caf8aec886 fdescfs, devfs, mqueuefs, nfs, portalfs, pseudofs, tmpfs and xfs
initialize the vattr structure in VOP_GETATTR() with VATTR_NULL(),
vattr_null() or by zeroing it. Remove these to allow preinitialization
of fields work in vn_stat(). This is needed to get birthtime initialized
correctly.

Submitted by:   Jaakko Heinonen <jh saunalahti fi>
Discussed on:   freebsd-fs
MFC after:	1 month
2008-09-20 19:50:52 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
John Baldwin
c1f7cf23b1 Use the correct pid when checking to see whether or not the /proc/<pid>
directory itself (rather than any of its contents) is visible to the
current thread.

MFC after:	1 week
PR:		kern/90063
Submitted by:	john of 8192.net
Approved by:	re (kensmith)
2007-10-05 17:37:25 +00:00
Bruce A. Mah
5cca41595d Fix off-by-one error (introduced in r1.60) that had the effect of
disallowing a read of exactly MAXPHYS bytes.

Reviewed by:	des, rdivacky
MFC after:	1 week
Sponsored by:	nCircle Network Security
2007-06-07 15:04:30 +00:00
Dag-Erling Smørgrav
8edf8ae133 Avoid "unused variable" warning when building without PSEUDOFS_TRACE. 2007-04-15 20:35:18 +00:00
Dag-Erling Smørgrav
388596dffc Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE. 2007-04-15 17:10:01 +00:00
Dag-Erling Smørgrav
f61bc4ea5e Further pseudofs improvements:
The pfs_info mutex is only needed to lock pi_unrhdr.  Everything else
in struct pfs_info is modified only while Giant is held (during
vfs_init() / vfs_uninit()); add assertions to that effect.

Simplify pfs_destroy somewhat.

Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the
assertions which were added in the previous commit to ensure they were
consistent.

Assert that Giant is held while the vnode cache is initialized and
destroyed.  Also assert that the cache is empty when it is destroyed.

Rename the vnode cache mutex for consistency.

Fix a long-standing bug in pfs_getattr(): it would uncritically return
the node's pn_fileno as st_ino.  This would result in st_ino being 0
if the node had not previously been visited by readdir(), and also in
an incorrect st_ino for process directories and any files contained
therein.  Correct this by abstracting the fileno manipulations
previously done in pfs_readdir() into a new function, pfs_fileno(),
which is used by both pfs_getattr() and pfs_readdir().
2007-04-14 14:08:30 +00:00
Alexander Leidinger
85646f7eb1 Correctly calculate a buffer length. It was off by one so a read() returned
one byte less than needed.

This is a RELENG_x_y candidate, since it fixes a problem with Oracle 10.

Noticed by:	Dmitry Ganenko <dima@apk-inform.com>
Testcase by:	Dmitry Ganenko <dima@apk-inform.com>
Reviewed by:	des
Submitted by:	rdivacky
Sponsored by:	Google SoC 2006
MFC after:	1 week
2006-06-27 20:21:38 +00:00
John Baldwin
06ad42b2f7 Close some races between procfs/ptrace and exit(2):
- Reorder the events in exit(2) slightly so that we trigger the S_EXIT
  stop event earlier.  After we have signalled that, we set P_WEXIT and
  then wait for any processes with a hold on the vmspace via PHOLD to
  release it.  PHOLD now KASSERT()'s that P_WEXIT is clear when it is
  invoked, and PRELE now does a wakeup if P_WEXIT is set and p_lock drops
  to zero.
- Change proc_rwmem() to require that the processing read from has its
  vmspace held via PHOLD by the caller and get rid of all the junk to
  screw around with the vmspace reference count as we no longer need it.
- In ptrace() and pseudofs(), treat a process with P_WEXIT set as if it
  doesn't exist.
- Only do one PHOLD in kern_ptrace() now, and do it earlier so it covers
  FIX_SSTEP() (since on alpha at least this can end up calling proc_rwmem()
  to clear an earlier single-step simualted via a breakpoint).  We only
  do one to avoid races.  Also, by making the EINVAL error for unknown
  requests be part of the default: case in the switch, the various
  switch cases can now just break out to return which removes a _lot_ of
  duplicated PRELE and proc unlocks, etc.  Also, it fixes at least one bug
  where a LWP ptrace command could return EINVAL with the proc lock still
  held.
- Changed the locking for ptrace_single_step(), ptrace_set_pc(), and
  ptrace_clear_single_step() to always be called with the proc lock
  held (it was a mixed bag previously).  Alpha and arm have to drop
  the lock while the mess around with breakpoints, but other archs
  avoid extra lock release/acquires in ptrace().  I did have to fix a
  couple of other consumers in kern_kse and a few other places to
  hold the proc lock and PHOLD.

Tested by:	ps (1 mostly, but some bits of 2-4 as well)
MFC after:	1 week
2006-02-22 18:57:50 +00:00