This is a seriously beefed-up chroot kind of thing. The process
is jailed along the same lines as chroot does it, but with
additional tough restrictions imposed on what the superuser can do.
As far as I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison. This is what
it was developed for, in fact: "real virtual servers".
Each prison has an IP number associated with it, which all IP
communications will be coerced to use, and each prison has its own
hostname.
Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of Apache
and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail
still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail; don't ask me for them.
The IP number should be an alias on one of the interfaces.
Mount a /proc in each jail; it will make ps more usable.
/proc/<pid>/status tells the hostname of the prison for
jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no provisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
1:
s/suser/suser_xxx/
2:
Add new function: suser(struct proc *), prototyped in <sys/proc.h>.
3:
s/suser_xxx(\([a-zA-Z0-9_]*\)->p_ucred, \&\1->p_acflag)/suser(\1)/
The remaining suser_xxx() calls will be scrutinized and dealt with
later.
There may be some unneeded #include <sys/cred.h>, but they are left
as an exercise for Bruce.
More changes to the suser() API will come along with the "jail" code.
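The three steps above amount to wrapping the old two-argument check in a one-argument call. A minimal sketch of that relationship follows. The struct definitions are stand-ins reduced to the fields used here, and both the ASU value and the body of suser_xxx() are assumptions made for illustration, not the committed kernel code.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in types: only the fields this sketch needs, not the kernel's. */
struct ucred { unsigned int cr_uid; };
struct proc  { struct ucred *p_ucred; unsigned short p_acflag; };

#define ASU 0x02	/* assumed value of the "used superuser privilege" flag */

/*
 * Old-style check (renamed in step 1): the credential and the
 * accounting flags are passed separately.  Assumed behaviour:
 * uid 0 succeeds and records the use of privilege.
 */
int
suser_xxx(const struct ucred *cred, unsigned short *acflag)
{
	if (cred->cr_uid == 0) {
		if (acflag != NULL)
			*acflag |= ASU;
		return 0;
	}
	return EPERM;
}

/* New-style check (step 2): the caller just hands over the process. */
int
suser(struct proc *p)
{
	return suser_xxx(p->p_ucred, &p->p_acflag);
}
```

With this shape, a call such as suser_xxx(p->p_ucred, &p->p_acflag) collapses to suser(p), which is what the substitution in step 3 does mechanically.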
unallocated parts of the last page when the file ended on a frag
but not a page boundary.
Delimited by tags PRE_MATT_MMAP_EOF and POST_MATT_MMAP_EOF,
in files alpha/alpha/pmap.c i386/i386/pmap.c nfs/nfs_bio.c vm/pmap.h
vm/vm_page.c vm/vm_page.h vm/vnode_pager.c miscfs/specfs/spec_vnops.c
ufs/ufs/ufs_readwrite.c kern/vfs_bio.c
Submitted by: Matt Dillon <dillon@freebsd.org>
Reviewed by: Alan Cox <alc@freebsd.org>
in my tree for 12+ months, and I just noticed that NetBSD have (I think,
I've just seen the commit, not the change) just zapped it there.
It wasn't in the options files or LINT either.
include of <sys/queue.h> in the !KERNEL case. The prerequisites
for <ufs/ufs/quota.h> were broken in Lite2 by converting some of
the kernel declarations to use queue macros without including
<sys/queue.h>. <sys/queue.h> was included in applications in
/usr/src instead. We polluted this file instead of merging the
changes in the applications.
Include <sys/queue.h> in the KERNEL case, and forward-declare all
structs that are used in prototypes, so that this file is almost
self-sufficient even in the kernel.
Obtained from: mostly from NetBSD
so that non-sloppy applications can call it without using disgusting
casts to avoid warnings. The 4th arg is sort of varargs -- it must
sometimes represent a filename, sometimes a struct pointer, and is
sometimes unused. The arg type is still caddr_t in the kernel.
Obtained from: mostly from NetBSD
lives in ext2_vnops.c for ext2fs. Also remove cast from comparison.
Bruce pointed out that it was bogus since we'd force a signed
comparison when we really wanted an unsigned comparison.
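The effect of such a cast can be shown with a generic example. This is not the ext2fs code; the function names and the 32-bit types are chosen only for illustration, and the conversion of an out-of-range value to int32_t is implementation-defined (it wraps on the usual two's-complement compilers).

```c
#include <stdint.h>

/* Unsigned comparison: 0xFFFFFFFF is the largest possible value. */
int
at_least_unsigned(uint32_t value, uint32_t limit)
{
	return value >= limit;
}

/*
 * Same test with casts to a signed type: on two's-complement
 * compilers 0xFFFFFFFF becomes -1 and compares below any
 * positive limit.
 */
int
at_least_signed(uint32_t value, uint32_t limit)
{
	return (int32_t)value >= (int32_t)limit;
}
```

Called with value 0xFFFFFFFF and limit 10, the unsigned version reports that the limit is reached while the signed version does not, which is the kind of silent change a stray cast can introduce.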
to write all the dirty blocks. If some of those blocks have dependencies,
they will be re-marked dirty when the I/O completes. On systems with
really fast I/O systems, it is possible to get in an infinite loop trying
to flush the buffers, because the I/O finishes before we can get all the
dirty buffers off the v_dirtyblkhd list and into the I/O queue. (The
previous algorithm looped over the v_dirtyblkhd list writing out buffers
until the list emptied.) So, now we mark each buffer that we try to
write so that we can distinguish the ones that are being re-marked dirty
from those that we have not yet tried to flush. Once we have tried to
push every buffer once, we then push any associated metadata that is
causing the remaining buffers to be redirtied.
Submitted by: Matthew Dillon <dillon@apollo.backplane.com>
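The marking scheme described above can be sketched as a small user-space simulation. Everything here is hypothetical: struct buf is a stand-in with three flags, the "I/O" completes instantly, and push_metadata() simply clears all dependencies. It is meant to show why the mark bounds the first pass, not how the kernel implements it.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer on the vnode's dirty list. */
struct buf {
	int dirty;	/* needs writing */
	int scanned;	/* we already tried to write it */
	int has_dep;	/* completion re-marks it dirty until metadata is pushed */
};

/* Simulated write: completes at once; dependent buffers come back dirty. */
static void
write_buf(struct buf *bp, int *nwrites)
{
	(*nwrites)++;
	bp->dirty = bp->has_dep;
}

/* Simulated metadata push: resolves every outstanding dependency. */
static void
push_metadata(struct buf *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		bufs[i].has_dep = 0;
}

/* Rescan from the head, as a list walk would after each write. */
static struct buf *
next_untried(struct buf *bufs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (bufs[i].dirty && !bufs[i].scanned)
			return &bufs[i];
	return NULL;
}

/* Flush all buffers; returns the number of writes issued. */
int
flush_dirty(struct buf *bufs, size_t n)
{
	struct buf *bp;
	size_t i;
	int nwrites = 0, redirtied = 0;

	for (i = 0; i < n; i++)
		bufs[i].scanned = 0;

	/* Pass 1: try each dirty buffer once; the mark stops re-queueing. */
	while ((bp = next_untried(bufs, n)) != NULL) {
		bp->scanned = 1;
		write_buf(bp, &nwrites);
	}

	/* Anything still dirty was re-marked because of a dependency. */
	for (i = 0; i < n; i++)
		redirtied |= bufs[i].dirty;
	if (redirtied) {
		push_metadata(bufs, n);
		for (i = 0; i < n; i++)
			if (bufs[i].dirty)
				write_buf(&bufs[i], &nwrites);
	}
	return nwrites;
}
```

Without the scanned test in next_untried(), a buffer with has_dep set would be picked again immediately after every write, which is the infinite loop the commit message describes.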
Specifically, the test was in the wrong place, lacked a cast, didn't
unlock the node, and exited to bad rather than abortit. Now we don't
allow renaming of a file with LINK_MAX references. Moved the test
earlier in the code, closer to where ip is obtained, as that
is the style of the rest of the function.
Didn't fix the problems Bruce pointed out in the rename man page to
include EMLINK, nor address his complaints about how the whole idea of
incrementing the link count during a rename is potentially asking for
trouble.
Also didn't try to correct potential problem Terry pointed out with
decrements not being similarly protected against underflow.
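The kind of guard discussed above can be sketched as follows. The names are hypothetical: DEMO_LINK_MAX and struct demo_inode stand in for the real limit and inode, and the real rename path also has to unlock the node and jump to its abort label, which this sketch leaves out.

```c
#include <errno.h>

#define DEMO_LINK_MAX 32767	/* hypothetical limit for this sketch */

/* Stand-in inode: only the link count matters here. */
struct demo_inode {
	unsigned short i_nlink;
};

/*
 * Take one more reference on the inode.  Returns 0 on success, or
 * EMLINK, leaving the count untouched, if the limit is already reached.
 */
int
demo_link_ref(struct demo_inode *ip)
{
	if (ip->i_nlink >= DEMO_LINK_MAX)
		return EMLINK;
	ip->i_nlink++;
	return 0;
}
```

Doing the check before any state is changed keeps the error path trivial, which is one reason to place the test close to where the inode is first obtained.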
turns out to not be useful to unwind the dependencies and continue in
the face of a fatal error.
Also changed the log() to a printf() in softdep_error() so that it will
be output in the case of an impending panic.
Submitted by: Kirk McKusick <mckusick@mckusick.com>
changes to the VM system to support the new swapper, VM bug
fixes, several VM optimizations, and some additional revamping of the
VM code. The specific bug fixes will be documented with additional
forced commits. This commit is somewhat rough with regard to code
cleanup issues.
Reviewed by: "John S. Dyson" <root@dyson.iquest.net>, "David Greenman" <dg@root.com>
MNT_WAIT when we mean boolean `true' or check for that value not being
passed. There was no problem in practice because MNT_WAIT had the
magic value of 1.
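Why mixing the two conventions only works by accident can be shown with two hypothetical helpers. The DEMO_ constants are illustrative: the log confirms only that MNT_WAIT was 1, and the value used for the no-wait mode is an assumption.

```c
#define DEMO_MNT_WAIT	1	/* matches the value stated in the log */
#define DEMO_MNT_NOWAIT	2	/* assumed value, for illustration only */

/* Interface taking a plain boolean: any nonzero value means "wait". */
int
waits_as_boolean(int wait)
{
	return wait != 0;
}

/* Interface taking a waitfor mode: only DEMO_MNT_WAIT means "wait". */
int
waits_as_mode(int waitfor)
{
	return waitfor == DEMO_MNT_WAIT;
}
```

Passing DEMO_MNT_WAIT to the boolean interface, or a boolean true to the mode interface, happens to give the right answer because both are 1. Passing DEMO_MNT_NOWAIT to the boolean interface would be read as "wait", so the two kinds of argument should not be interchanged.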
I/O requests must be marked P_SYSTEM because if it isn't and the system
decides to swap it or (god forbid) kill it, the system stands a good
chance of locking up.
may be revoked, so vnop routines must be careful about accessing
the vnode if they may have blocked.
Fixed marking for update after successfully reading or writing 0
bytes. In this case, POSIX.1 specifies marking if and only if the
requested count is nonzero, but rev.1.86 never marked.
basically do an on-the-fly defragmentation of the FFS filesystem, changing
file block allocations to make them contiguous. Thanks to Kirk McKusick
for providing hints on what needed to be done to get this working.