that use it. Specifically, vop_stdlock uses the lock pointed to by
vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to
reference vp->v_lock. Filesystems that wish to use the default
do not need to allocate a lock at the front of their node structure
(as some still did) or do a lockinit. They can simply start using
vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks,
but still use the vop_stdlock functions (such as nullfs) can simply
replace vp->v_vnlock with a pointer to the lock that they wish to
have used for the vnode. Such filesystems are responsible for
setting the vp->v_vnlock back to the default in their vop_reclaim
routine (e.g., vp->v_vnlock = &vp->v_lock).
In theory, this set of changes cleans up the existing filesystem
lock interface and should have no function change to the existing
locking scheme.
Sponsored by: DARPA & NAI Labs.
- Begin moving scheduler specific functionality into sched_4bsd.c
- Replace direct manipulation of scheduler data with hooks provided by the
new api.
- Remove KSE specific state modifications and single runq assumptions from
kern_switch.c
Reviewed by: -arch
the locking of the proc lock after the goto to done1 to avoid locking
the lock in an error case just so we can turn around and unlock it.
- Move the exec_setregs() stuff out from under the proc lock and after
the p_args stuff. This allows exec_setregs() to be able to sleep or
write things out to userland, etc. which ia64 does.
Tested by: peter
vcanrecycle to check a free vnode's availability. If it is
available, vcanrecycle returns an error code of zero and the
vnode in question locked. The getnewvnode routine then used
to call vn_start_write with the V_NOWAIT flag. If the filesystem
was suspended while taking a snapshot, the vn_start_write would
fail but getnewvnode would fail to unlock the vnode, instead
leaving it locked on the freelist. The result would be that the
vnode would be locked forever and would eventually hang the
system with a race to the root when it was attempted to recycle
it. This fix moves the vn_start_write check into vcanrecycle
where it will properly unlock the vnode if it is unavailable
for recycling due to filesystem suspension.
Sponsored by: DARPA & NAI Labs.
on the _file() theme that do not follow symlinks. Sync to MAC tree.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
sched_lock. This means that we no longer access p_limit in mi_switch()
and the p_limit pointer can be protected by the proc lock.
- Remove PRS_ZOMBIE check from CPU limit test in mi_switch(). PRS_ZOMBIE
processes don't call mi_switch(), and even if they did there is no longer
the danger of p_limit being NULL (which is what the original zombie check
was added for).
- When we bump the current processes soft CPU limit in ast(), just bump the
private p_cpulimit instead of the shared rlimit. This fixes an XXX for
some value of fix. There is still a (probably benign) bug in that this
code doesn't check that the new soft limit exceeds the hard limit.
Inspired by: bde (2)
in specific situations. The owner thread must be blocked, and the
borrower can not proceed back to user space with the borrowed KSE.
The borrower will return the KSE on the next context switch where
teh owner wants it back. This removes a lot of possible
race conditions and deadlocks. It is consceivable that the
borrower should inherit the priority of the owner too.
that's another discussion and would be simple to do.
Also, as part of this, the "preallocatd spare thread" is attached to the
thread doing a syscall rather than the KSE. This removes the need to lock
the scheduler when we want to access it, as it's now "at hand".
DDB now shows a lot mor info for threaded proceses though it may need
some optimisation to squeeze it all back into 80 chars again.
(possible JKH project)
Upcalls are now "bound" threads, but "KSE Lending" now means that
other completing syscalls can be completed using that KSE before the upcall
finally makes it back to the UTS. (getting threads OUT OF THE KERNEL is
one of the highest priorities in the KSE system.) The upcall when it happens
will present all the completed syscalls to the KSE for selection.
configuration device hierarchy. Device arrival, departure and not
matched are presently reported. This will be the basis for devd, which
I still need to polish a little more before I commit it. If you don't
use /dev/devctl, it will be a noop.
o Allow the bus_debug variable to be set via the bus.debug tunable.
o Return pnpinfo and location info via the devinfo interface to userland.
devinfo(8) needs to be updated to print it.
revision 1.218. This bug caused a "struct file" reference to be
leaked if VOP_ADVLOCK(), vn_start_write(), or mac_check_vnode_write()
failed during the open operation.
PR: kern/43739
Reported by: Arne Woerner <woerner@mediabase-gmbh.de>
Don't use snprintf where strlcpy() will do the job.
Also, a NUL is '\0' not 0 in our style (C doesn't care), so spell it like.
Remove useless {} and () in the general area of this change.
checks from the MAC tree: allow policies to perform access control
for the ability of a process to send and receive data via a socket.
At some point, we might also pass in additional address information
if an explicit address is requested on send.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
seperate entry points for each occasion:
mac_check_vnode_mmap() Check at initial mapping
mac_check_vnode_mprotect() Check at mapping protection change
mac_check_vnode_mmap_downgrade() Determine if a mapping downgrade
should take place following
subject relabel.
Implement mmap() and mprotect() entry points for labeled vnode
policies. These entry points are currently not hooked up to the
VM system in the base tree. These changes improve the consistency
of the access control interface and offer more flexibility regarding
limiting access to vnode mmaping.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
flags so that we can call malloc with M_NOWAIT if necessary, avoiding
potential sleeps while holding mutexes in the TCP syncache code.
Similar to the existing support for mbuf label allocation: if we can't
allocate all the necessary label store in each policy, we back out
the label allocation and fail the socket creation. Sync from MAC tree.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
devfs VOP symlink creation by introducing a new entry point to determine
the label of the devfs_dirent prior to allocation of a vnode for the
symlink.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
point that instruments the creation of hard links. Policy implementations
to follow.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
to mbuf label initialization, that functionality was never merged to
the main tree. Go ahead and merge that functionality now. Note that
this requires policy modules to accept the case where the label
element may be destroyed even if init has not succeeded on it (in
the event that policy failed the init). This will shortly also
apply to sockets.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
order used in mac_policy.h and elsewhere. Sort order is basically
"by operation category", then "alphabetically by object". Sync to
MAC tree.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
externalization, and cred label life cycle events to entirely above
devfs and vnode events. Sync from MAC tree.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
entry points to better match the entry point ordering in mac_policy.h.
Big diff, no functional change; merge from the MAC tree.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
- If a policy isn't registered when a policy module unloads, silently
succeed.
- Hold the policy list lock across more of the validity tests to avoid
races.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
NB: But it will enable it in all kernels not having options "NO_GEOM"
Put the GEOM related options into the intended order.
Add "options NO_GEOM" to all kernel configs apart from NOTES.
In some order of controlled fashion, the NO_GEOM options will be
removed, architecture by architecture in the coming days.
There are currently three known issues which may force people to
need the NO_GEOM option:
boot0cfg/fdisk:
Tries to update the MBR while it is being used to control
slices. GEOM does not allow this as a direct operation.
SCSI floppy drives:
Appearantly the scsi-da driver return "EBUSY" if no media
is inserted. This is wrong, it should return ENXIO.
PC98:
It is unclear if GEOM correctly recognizes all variants of
PC98 disklabels. (Help Wanted! I have neither docs nor HW)
These issues are all being worked.
Sponsored by: DARPA & NAI Labs.
- Change mpo_init_foo(obj, label) and mpo_destroy_foo(obj, label) policy
entry points to mpo_init_foo_label(label) and
mpo_destroy_foo_label(label). This will permit the use of the same
entry points for holding temporary type-specific label during
internalization and externalization, as well as for caching purposes.
- Because of this, break out mpo_{init,destroy}_socket() and
mpo_{init,destroy}_mount() into seperate entry points for socket
main/peer labels and mount main/fs labels.
- Since the prototype for label initialization is the same across almost
all entry points, implement these entry points using common
implementations for Biba, MLS, and Test, reducing the number of
almost identical looking functions.
This simplifies policy implementation, as well as preparing us for the
merge of the new flexible userland API for managing labels on objects.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
treat it as an invalid partition.
This fixes a bug where ``dumpon <device>'' will configure the dump
device at a random offset on the disk if <device> isn't a valid
partition.
Reviewed by: phk