"refreshing" the label on the vnode before use, just get the label
right from inception. For single-label file systems, set the label
in the generic VFS getnewvnode() code; for multi-label file systems,
leave the labeling up to the file system. With UFS1/2, this means
reading the extended attribute during vfs_vget() as the inode is
pulled off disk, rather than hitting the extended attributes
frequently during operations later, improving performance. This
also corrects sematics for shared vnode locks, which were not
previously present in the system. This chances the cache
coherrency properties WRT out-of-band access to label data, but in
an acceptable form. With UFS1, there is a small race condition
during automatic extended attribute start -- this is not present
with UFS2, and occurs because EAs aren't available at vnode
inception. We'll introduce a work around for this shortly.
Approved by: re
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
kernel access control.
Modify procfs so that (when mounted multilabel) it exports process MAC
labels as the vnode labels of procfs vnodes associated with processes.
Approved by: des
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
not the calling process. While we're here, also unstaticize procfs_doprocfile() and
procfs_docurproc() so linprocfs can call them directly instead of duplicating them.
Submitted by: Dominic Mitchell <dom@semantico.com>
Until now, the ptrace syscall was implemented as a wrapper that called
various functions in procfs depending on which ptrace operation was
requested. Most of these functions were themselves wrappers around
procfs_{read,write}_{,db,fp}regs(), with only some extra error checks,
which weren't necessary in the ptrace case anyway.
This commit moves procfs_rwmem() from procfs_mem.c into sys_process.c
(renaming it to proc_rwmem() in the process), and implements ptrace()
directly in terms of procfs_{read,write}_{,db,fp}regs() instead of
having it fake up a struct uio and then call procfs_do{,db,fp}regs().
It also moves the prototypes for procfs_{read,write}_{,db,fp}regs()
and proc_rwmem() from proc.h to ptrace.h, and marks all procfs files
except procfs_machdep.c as "optional procfs" instead of "standard".
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
and so special-casing was introduced to provide extra procfs privilege
to the kmem group. With the advent of non-setgid kmem ps, this code
is no longer required, and in fact, can is potentially harmful as it
allocates privilege to a gid that is increasingly less meaningful.
Knowledge of specific gid's in kernel is also generally bad precedent,
as the kernel security policy doesn't distinguish gid's specifically,
only uid 0.
This commit removes reference to kmem in procfs, both in terms of
access control decisions, and the applying of gid kmem to the
/proc/*/mem file, simplifying the associated code considerably.
Processes are still permitted to access the mem file based on
the debugging policy, so ps -e still works fine for normal
processes and use.
Reviewed by: tmm
Obtained from: TrustedBSD Project
There's no excuse to have code in synthetic filestores that allows direct
references to the textvp anymore.
Feature requested by: msmith
Feature agreed to by: warner
Move requested by: phk
Move agreed to by: bde
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
p_trespass(struct proc *p1, struct proc *p2)
which returns zero or an errno depending on the legality of p1 trespassing
on p2.
Replace kern_sig.c:CANSIGNAL() with call to p_trespass() and one
extra signal related check.
Replace procfs.h:CHECKIO() macros with calls to p_trespass().
Only show command lines to process which can trespass on the target
process.
This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.
Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail
still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for
jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no privisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
1:
s/suser/suser_xxx/
2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with
later.
There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
is enough to satisfy things like StarOffice. This is a hack, but doing
it properly would be a LOT of work, and would require extensive grovelling
around in the user address space to find the argv[].
Obtained from: Mostly from Andrzej Bialecki <abial@nask.pl>.
bits. We used a private, wrong, version of `struct dirent' to help
break getdirentries(), and we use a silly check that the size of this
struct is a power of 2 to help break mount() if getdirentries() would
not work. This fix just changes the struct to match `struct dirent'
(except for the name length).
reading/writing of mem and regs). Also have to check for the requesting
process being group KMEM -- this is a bit of a hack, but ps et al need it.
Reviewed by: davidg
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.
Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
to information from a single process causes hangs. Specifically, this
fixes problems (hangs) with concurrent ps commands, when the system is under
heavy memory load.
Reviewed by: davidg
with multiple entries as follows:
start address, end address, resident pages in range, private pages
in range, RW/RO, COW or not, (vnode/device/swap/default).
Implement a "variable" directory structure. Files that do not make
sense for the given process do not "appear" and cannot be opened.
For example, "system" processes do not have "file", "regs" or "fpregs",
because they do not have a user area.
"attempt" to fill in the user area of a given process when it is being
accessed via /proc/pid/mem (the user struct is just after
VM_MAXUSER_ADDRESS in the process address space.)
Dont do IO to the U area while it's swapped, hold it in place if possible.
Lock off access to the "ctl" file if it's done a setuid like the other
pseudo-files in there.
it 1138 times (:-() in casts and a few more times in declarations.
This change is null for the i386.
The type has to be `typedef int vop_t(void *)' and not `typedef
int vop_t()' because `gcc -Wstrict-prototypes' warns about the
latter. Since vnode op functions are called with args of different
(struct pointer) types, neither of these function types is any use
for type checking of the arg, so it would be preferable not to use
the complete function type, especially since using the complete
type requires adding 1138 casts to avoid compiler warnings and
another 40+ casts to reverse the function pointer conversions before
calling the functions.
Should anybody out there wonder about this vendetta against global
variables, it is basically to make it more visible what our interfaces
in the kernel really are.
I'm almost convinced we should have a
#define PUBLIC /* public interface */
and use it in the #includes...
regular user could panic the machine with a simple "tail /proc/curproc/mem"
command. The problem was twofold: both kernfs and procfs didn't fill in
the mnt_stat statfs struct (which would later lead to an integer divide
fault in the vnode pager), and kernfs bogusly paniced if a bmap was
attempted.
Reviewed by: John Dyson