The behavior is the same as in capability mode, it does not actually
return EINVAL for absolute lookups:
openat(AT_FDCWD,"/tmp/test",O_RDONLY|O_DIRECTORY,00) = 3 (0x3)
openat(3,"../../",O_RDONLY|0x800000,00) ERR#93 'Capabilities insufficient'
openat(3,"/etc/passwd",O_RDONLY|0x800000,00) ERR#93 'Capabilities insufficient'
Fixes: 1f305be43 ("Document {O,AT}_RESOLVE_BENEATH...")
Reviewed by: kib, pauamma (manpages), emaste
Sponsored by: https://www.patreon.com/valpackett
Pull Request: https://github.com/freebsd/freebsd-src/pull/680
Differential Revision: https://reviews.freebsd.org/D38675
Problem is that open(O_PATH) on nullfs -o nocache is broken then,
because there is no reference on the vnode after the open syscall exits.
Reported and tested by: ambrisko
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
It reopens the passed file descriptor, checking the file backing vnode'
current access rights against open mode. In particular, this flag allows
to convert file descriptor opened with O_PATH, into operable file
descriptor, assuming permissions allow that.
Reviewed by: markj
Tested by: Andrew Walker <awalker@ixsystems.com>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D30148
if VREAD access is checked as allowed during open
Requested by: wulf
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29323
by only keeping hold count on the vnode, instead of the use count.
Reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D29323
with the reasoning that the flags did not worked properly, and were not
shipped in a release.
O_RESOLVE_BENEATH is kept as useful.
Reviewed by: markj
Tested by: arichardson, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D28907
POSIX O_DSYNC means that writes include an implicit fdatasync(2), just
as O_SYNC implies fsync(2).
VOP_WRITE() functions that understand the new IO_DATASYNC flag can act
accordingly, but we'll still pass down IO_SYNC so that file systems that
don't understand it will continue to provide the stronger O_SYNC
behaviour.
Flag also applies to fcntl(2).
Reviewed by: kib, delphij
Differential Revision: https://reviews.freebsd.org/D25090
- varios "new sentence, new line" warnings
- varios "sections out of conventional order" warnings
- varios "unusual Xr order" warnings
- varios "missing section argument" warnings
- varios "no blank before trailing delimiter" warnings
- varios "normalizing date format" warnings
MFC after: 1 month
EINTEGRITY was previously documented as a UFS-specific error for
mount(2). This documents EINTEGRITY as a filesystem-independent error
that may be reported by the backing store of a filesystem.
While here, document EIO as a filesystem-independent error for both
mount(2) and posix_fadvise(2). EIO was previously only documented for
UFS for mount(2).
Reviewed by: mckusick
Suggested by: mckusick
MFC after: 2 weeks
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D24168
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.
This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.
This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.
[0] https://pubs.opengroup.org/onlinepubs/9699919799/
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D23247
While FreeBSD's implementation of these expect an int inside of libc, that's an
implementation detail that we can hide from the user as it's the natural
promotion of the current mode_t type and before it is used in the kernel, it's
converted back to the narrower type that's the current definition of mode_t. As
such, documenting int is at best confusing and at worst misleading. Instead add
a note that these args are variadic and as such calling conventions may differ
from non-variadic arguments.
promoted to ints).
- `mode_t` is `uint16_t` (`sys/sys/_types.h`)
- `openat` takes variadic args
- variadic args cannot be 16-bit, and indeed the code uses int
- the manpage currently kinda implies the argument is 16-bit by saying `mode_t`
Prompted by Rust things: https://github.com/tailhook/openat/issues/21
Submitted by: Greg V at unrelenting
Differential Revision: https://reviews.freebsd.org/D21816
The man page claims that with O_FSYNC (aka O_SYNC) the kernel will not cache
written data. However, that's not true. Nor does POSIX require it.
Perhaps it was true when that section of the man page was written in r69336
(I haven't checked). But it's not true now. Now the effect is simply that
writes are sent to disk immediately and synchronously, but they're still
cached.
See also: https://pubs.opengroup.org/onlinepubs/9699919799/
See also: ffs_write in sys/ufs/ffs/ffs_vnops.c
Reviewed by: cem
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D20641
paths.
It was decided that committing the code and drafting of the man page
update is better than allowing the code to rot until wordsmithing
happens.
Reviewed by: jilles (previous version)
Discussed with: brooks, jilles, emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D17714
Flags prevent open(2) and *at(2) vfs syscalls name lookup from
escaping the starting directory. Supposedly the interface is similar
to the same proposed Linux flags.
Reviewed by: jilles (code, previous version of manpages), 0mp (manpages)
Discussed with: allanjude, emaste, jonathan
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D17547
After r308212 Capsicum permits .. lookups in capability mode, as long as
path component traversal does not escape the directory corresponding to
the provided file descriptor.
We should add a description of the vfs.lookup_cap_dotdot and
vfs.lookup_cap_dotdot_nonlocal sysctls, perhaps as a cross-reference to
capsicum(4). I intend to look at that soon.
Reviewed by: bjk, cem, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D12343
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.
Submitted by: Jan Schaumann <jschauma@stevens.edu>
Pull Request: https://github.com/freebsd/freebsd/pull/96
We return [EMLINK] instead of [ELOOP] when trying to open a symlink with
O_NOFOLLOW, so that the original case of [ELOOP] can be distinguished. Code
like cmp -h and xz takes advantage of this.
PR: 214633
Reviewed by: kib, imp
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D8586
According to POSIX open() must return ENOTDIR when the path name does
not refer to a path name. Change vn_open() to respect this flag. This
also simplifies the Linuxolator a bit.
On FreeBSD, this is the default behaviour. According to the spec, we may
give this flag a value of zero, but I'd rather not do this. If we define
it to a non-zero value, we can always change default behaviour without
changing the ABI. This is very unlikely to happen, though.
- O_NONBLOCK flag has to be set, if it is not set, open(2) will wait for
another process opening the fifo for reading,
- Use O_WRONLY which implies that the file has to be opened _only_ for write.
This is quite tricky situation, because we allow to open a file with
O_RDONLY|O_TRUNC. O_TRUNC modifies a file, but we actually don't open
it for writing. EISDIR is also returned when we try to open a directory
O_RDONLY|O_TRUNC, which is correct.
POSIX says that "The result of using O_TRUNC with O_RDONLY is undefined.",
we choose to accept it (Solaris did the same), that's why "to be modified"
seems more accurate to me.
other systems it prevents a tty from becoming a controlling tty on the
open. O_SYNC is the POSIX name for O_FSYNC.
The Markup Police may need to tweak my references to standards.