kernel for FreeBSD 9.0:
Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *. With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.
Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.
In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.
Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.
Approved by: re (bz)
Submitted by: jonathan
Sponsored by: Google Inc
and vop_reclaim() methods. They seems to be unused, and the reported
situation is normal for the forced unmount.
MFC after: 1 week
X-MFC-note: keep prtactive symbol in vfs_subr.c
1. Use unsigned rather than signed lengths
2. Bound messages to/from Venus to VC_MAXMSGSIZE
3. Bound messages to/from general user processes to VC_MAXDATASIZE
4. Update comment regarding data limits for pioctl
Without (1) and (3), it may be possible for unprivileged user processes to
read sensitive portions of kernel memory. This issue is only present if
the Coda kernel module is loaded and venus (the userspace Coda daemon) is
running and has /coda mounted.
As Coda is considered experimental and production use is warned against in
the coda(4) man page, and because Coda must be explicitly configured for a
configuration to be vulnerable, we won't be issuing a security advisory.
However, if you are using Coda, then you are advised to apply these fixes.
Reported by: Dan J. Rosenberg <drosenberg at vsecurity.com>
Obtained from: NetBSD (Christos Zoulas)
Security: Kernel memory disclosure; no advisory as feature experimental
MFC after: 3 days
coda.h:
- CodaFid typdef -> struct CodaFid throughout.
- Use unsigned int instead of unsigned long for venus_dirent and other
cosmetic fixes.
- Introduce cuid_t and cgid_t and use instead of uid_t and gid_t in RPCs.
- Synchronize comments and macros.
- Use u_int32_t instead of unsigned long for coda_out_hdr.
With these changes, a 64-bit Coda kernel module now works with
coda6_client, whereas previous userspace and kernel versions of RPCs
differed sufficiently to prevent using the file system. This has been
verified only with casual testing, but /coda is now usable for at least
basic operations on amd64.
MFC after: 1 week
Note that this does not actually enable full-range i/o requests for
64 architectures, and is done now to update KBI only.
Tested by: pho
Reviewed by: jhb, bde (as part of the review of the bigger patch)
the VFS. Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.
In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.
While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.
VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled. Bump __FreeBSD_version in order to signal such
situation.
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.
Approved by: rwatson (mentor)
When I changed kern_conf.c three months ago I made device unit numbers
equal to (unneeded) device minor numbers. We used to require
bitshifting, because there were eight bits in the middle that were
reserved for a device major number. Not very long after I turned
dev2unit(), minor(), unit2minor() and minor2unit() into macro's.
The unit2minor() and minor2unit() macro's were no-ops.
We'd better not remove these four macro's from the kernel, because there
is a lot of (external) code that may still depend on them. For now it's
harmless to remove all invocations of unit2minor() and minor2unit().
Reviewed by: kib
macros. The only semantic change was the need to add a vc_opened field
to struct vcomm since we can no longer use the request queue returning
to an uninitialized state to hold whether or not the device is open.
MFC after: 1 month
fundamentally fairly confused about how signals work and when it is
appropriate for upcalls to be interrupted. In particular, we should
be exempting certain upcalls from interruption, we should not always
eventually time out sleeping on a upcall, and we should not be
interrupting the sleep for certain signals that we currently are
(including SIGINFO). This code needs to be reworked in the style of
NFS interruptible mounts.
MFC after: 1 month
access cache improvements:
- Flush just access control state on CODA_PURGEUSER, not the full
namecache for /coda.
- When replacing a fid on a cnode as a result of, e.g.,
reintegration after offline operation, we no longer need to
purge the namecache entries associated with its vnode.
MFC after: 1 month
modeled on the access cache found in NFS, smbfs, and the Linux coda
module. This is a positive access cache of a single entry per file,
tracking recently granted rights, but unlike NFS and smbfs,
supporting explicit invalidation by the distributed file system.
For each cnode, maintain a C_ACCCACHE flag indicating the validity
of the cache, and a cached uid and mode tracking recently granted
positive access control decisions.
Prefer the cache to venus_access() in VOP_ACCESS() if it is valid,
and when we must fall back to venus_access(), update the cache.
Allow Venus to clear the access cache, either the whole cache on
CODA_FLUSH, or just entries for a specific uid on CODA_PURGEUSER.
Unlike the Coda module on Linux, we don't flush all entries on a
user purge using a generation number, we instead walk present
cnodes and clear only entries for the specific user, meaning it is
somewhat more expensive but won't hit all users.
Since the Coda module is agressive about not keeping around
unopened cnodes, the utility of the cache is somewhat limited for
files, but works will for directories. We should make Coda less
agressive about GCing cnodes in VOP_INACTIVE() in order to improve
the effectiveness of in-kernel caching of attributes and access
rights.
MFC after: 1 month
VFS namecache, as is done by the Coda module on Linux. Unlike the Coda
namecache, the global VFS namecache isn't tagged by credential, so use
ore conservative flushing behavior (for now) when CODA_PURGEUSER is
issued by Venus.
This improves overall integration with the FreeBSD VFS, including
allowing __getcwd() to work better, procfs/procstat monitoring, and so
on. This improves shell behavior in many cases, and improves ".."
handling. It may lead to some slowdown until we've implemented a
specific access cache, which should net improve performance, but in the
mean time, lookup access control now always goes to Venus, whereas
previously it didn't.
MFC after: 1 month
tree, restyle everything but coda.h (which is more explicitly shared
across systems) into a closer approximation to style(9).
Remove a few more unused function prototypes.
Add or clarify some comments.
MFC after: 1 month
- Rename print_vattr to coda_print_vattr and make static, rename
print_cred to coda_print_cred.
- Remove unused coda_vop_nop.
- Add XXX comment because coda_readdir forwards to the cache vnode's
readdir rather than venus_readdir, and annotate venus_readdir as
unused.
- Rename vc_nb_* to vc_*.
- Use d_open_t, d_close_t, d_read_t, d_write_t, d_ioctl_t and d_poll_t
for prototyping vc_* as that is the intent, don't use our own
definitions.
- Rename coda_nb_statfs to coda_statfs, rename NB_SFS_SIZ to
CODA_SFS_SIZ.
- Replace one more OBE reference to NetBSD with a reference to FreeBSD.
- Tidy up a little vertical whitespace here and there.
- Annotate coda_nc_zapvnode as unused.
- Remove unused vcodattach.
- Annotate VM_INTR as unused.
- Annotate that coda_fhtovp is unused and doesn't match the FreeBSD
prototype, so isn't hooked up to vfs_fhtovp. If we want NFS export of
Coda to work someday, this needs to be fixed.
- Remove unused getNewVnode.
- Remove unused coda_vget, coda_init, coda_quotactl prototypes.
MFC after: 1 month
the mountpoint for a specific device. This was implemented incorrectly,
a bad idea in a fundamental sense, and also never used, so presumably
a long-idle debugging function.
MFC after: 1 month
for vop_bmap; delete the existing stub that returned either EINVAL
or EOPNOTSUPP, and had unreachable calls to VOP_BMAP on the cache
vnode.
MFC after: 1 month
then later to FreeBSD. Update various NetBSD-related comments: in some
cases delete them because they don't appply, in others update to say
FreeBSD as they still apply but in FreeBSD (and might for that matter
no longer apply on NetBSD), and flag one case where I'm not sure
whether it applies.
MFC after: 1 month
locks of those vnodes. Probably, Coda should do the same lock sharing/
pass-through that is done for nullfs, but in the mean time this ensures
that locks are adequately held to prevent corruption of data structures
in the cache file system.
Assuming most operations came from the top layer of Coda and weren't
performed directly on the cache vnodes, in practice this corruption was
relatively unlikely as the Coda vnode locks were ensuring exclusive
access for most consumers.
This causes WITNESS to squeal like a pig immediately when Coda is used,
rather than waiting until file close; I noticed these problems because
of the lack of said squealing.
MFC after: 1 month
vget() calls using inode numbers to query the root of /coda, which is not
needed since we now cache the root vnode with the mountpoint.
MFC after: 1 month
to files, such as ktrace output, under CODA_VERBOSE. Otherwise, each
such call to VOP_WRITE() results in a kernel printf.
MFC after: 3 days
Obtained from: NetBSD
- Don't specify vnode operations for mknod, lease, and advlock--let them
fall through to vop_default.
- Implement vop_default with &default_vnodeops, rather than with VOP_PANIC,
so that unimplemented vnode operations are handled in more sensible ways
than panicking, such as EOPNOTSUPP on ACL queries generated by bsdtar,
or mknod.
MFC after: 3 days
fill out all fields, just fill out the ones the file system knows
about. Among other things, this causes the outpuf of "mount" and
"df" to make quite a bit more sense as /dev/cfs0 is specified as the
mountfrom name.
MFC after: 3 days
vnodes during coda_unmount() in order to detect errant use of them
after the vnode references may no longer be valid.
No need to clear the VV_ROOT flag on mi_rootvp flag (especially after
the vnode reference is no longer valid) as this isn't done on other
file systems.
MFC after: 3 days
and then release it when it is closed: we rely on the caller to keep the
vnode around with a valid reference. This avoids vrele() destroying the
vnode vop_close() is being called from during a call to vop_close(), and
a crash due to lockmgr recursing the vnode lock when a Coda unmount
occurs.
MFC after: 3 days
Move all extern variable definitions to associated .h files, move some
extern variable definitions between include files to place them more
appropriately.
MFC after: 3 days
Coda vnode derived from it, in the style of nullfs. This allows files
in the Coda file system to be memory-mapped, such as with execve(2) or
mmap(2).
MFC after: 3 days
Reported by: Rune <u+openafsdev-sr55 at chalmers dot se>
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.
KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.
Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.
Manpage and FreeBSD_version will be updated through further commits.
As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.
Tested by: Diego Sardina <siarodx at gmail dot com>,
Andrea Di Pasquale <whyx dot it at gmail dot com>