freebsd-dev/sys/miscfs/devfs
Poul-Henning Kamp 75c1354190 This Implements the mumbled about "Jail" feature.
This is a seriously beefed up chroot kind of thing.  The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.

For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact:  "real virtual servers".

Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.

Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.

It generally does what one would expect, but setting up a jail
still takes a little knowledge.

A few notes:

   I have no scripts for setting up a jail, don't ask me for them.

   The IP number should be an alias on one of the interfaces.

   mount a /proc in each jail, it will make ps more useable.

   /proc/<pid>/status tells the hostname of the prison for
   jailed processes.

   Quotas are only sensible if you have a mountpoint per prison.

   There are no provisions for stopping resource-hogging.

   Some "#ifdef INET" and similar may be missing (send patches!)

If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!

Tools, comments, patches & documentation most welcome.

Have fun...

Sponsored by:   http://www.rndassociates.com/
Run for almost a year by:       http://www.servetheweb.com/
1999-04-28 11:38:52 +00:00
devfs_proto.h changes to make devfs more 'normal' 1996-11-21 07:19:00 +00:00
devfs_tree.c Fix warnings in preparation for adding -Wall -Wcast-qual to the 1999-01-27 22:42:27 +00:00
devfs_vfsops.c Fix warnings in preparation for adding -Wall -Wcast-qual to the 1999-01-27 22:42:27 +00:00
devfs_vnops.c This Implements the mumbled about "Jail" feature. 1999-04-28 11:38:52 +00:00
devfsdefs.h Delete stray extern declaration for non-existing variables. 1998-11-09 07:03:04 +00:00
README Seventy-odd "its" / "it's" typos in comments fixed as per kern/6108. 1998-04-17 22:37:19 +00:00
reproto.sh Fix the reproto.sh script that was broken after my KNFification. 1996-04-07 01:15:03 +00:00

this file is: /sys/miscfs/devfs/README

to enable: add
options	DEVFS

to your config file.
Expect it to be of little use for a while,
as the only devices that register themselves are the floppy,
the pcaudio stuff, speaker, null, mem, zero, io and kmem.

it works like this:

There is a tree of nodes that describe the layout of the DEVFS as seen by
the drivers.. they add nodes to this tree. This is called the 'back' layer
for reasons that will become obvious in a second. Think of it as a
BLUEPRINT of the DEVFS tree. Each back node has associated with it 
a "devnode" struct, that holds information about the device
(or directory) and a pointer to the vnode if one has been associated 
with that node. The back node itself can be considered to be 
a directory entry, and contains the default name of the device,
and a link to the directory that holds it. The back node is sometimes
referred to in the code as the dev_name. The devnode can be considered
the inode.
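
The directory-entry/inode split described above can be sketched roughly as
follows. This is only an illustration: the field names, the node type
constants and the back_add() helper are assumptions made for this sketch,
not the definitions found in devfsdefs.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified types; NOT the real devfsdefs.h layout. */
enum { DEV_DIR, DEV_CDEV };

struct devnode {                /* plays the role of the inode */
    int   type;                 /* DEV_DIR or DEV_CDEV */
    void *vnode;                /* associated vnode, NULL if none yet */
};

struct dev_name {               /* plays the role of the directory entry */
    char             name[32];  /* default name of the device */
    struct dev_name *parent;    /* directory that holds this entry */
    struct devnode  *dnp;       /* info about the device or directory */
};

/* Add a node to the back tree, roughly as a driver registration would. */
static struct dev_name *
back_add(struct dev_name *parent, const char *name, int type)
{
    struct dev_name *dn;
    struct devnode *dnp;

    assert(name != NULL);
    dn = calloc(1, sizeof(*dn));
    dnp = calloc(1, sizeof(*dnp));
    if (dn == NULL || dnp == NULL) {
        free(dn);
        free(dnp);
        return NULL;
    }
    /* calloc zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(dn->name, name, sizeof(dn->name) - 1);
    dn->parent = parent;
    dnp->type = type;
    dn->dnp = dnp;              /* no vnode is attached at this point */
    return dn;
}
```

For example, calling back_add(NULL, "", DEV_DIR) for a root directory and then
back_add(root, "fd0", DEV_CDEV) yields an entry named "fd0" whose parent is the
root entry and whose devnode has no vnode yet.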

When you mount the devfs somewhere (you can mount it multiple times in
multiple places), a front layer is created that contains a tree of 'front'
nodes.

Think of this as a transparency, laid over the top of the blueprint
(or possibly a photocopy).

The front and back nodes are identical in type, but the back nodes
are reserved for kernel use only, and are protected from the user.
The back plane has a mount structure and all that stuff, but it is in
fact not really mounted. (and is thus not reachable via namei).
Internal kernel routines can open devices in this plane
even if the external devfs has not been mounted yet :)
(e.g. to find the root device)

To start with there is a 1:1 relationship between the front nodes
and the backing nodes, however once the front plane has been created
the nodes can be moved around within that plane (or deleted).
Think of this as the ability to revise a transparency...
the blueprint is untouched.

There is a "devnode" struct associated with each front node also.
Front nodes that refer to devices use the same "devnode" struct that is used
by their associated backing node, so that multiple front nodes that
point to the same device will use the same "devnode" struct, and through
that, the same vnode, ops, modification times, flags, owner and group.
Front nodes representing directories and symlinks have their own
"devnode" structs, and may therefore differ (they have different vnodes).
For example, if you have two devfs trees mounted, you can change the
directories in one (remove or rename nodes) without changing the other.

Multiple mountings are like multiple transparencies,
each showing through to the original blueprint.
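
The sharing rule above (device front nodes reuse the backing devnode,
directory front nodes get their own) might look something like this. The
types and the front_dup() helper are hypothetical, chosen for this sketch
only; symlinks are left out for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types for illustration only. */
enum node_type { NODE_DIR, NODE_DEVICE };

struct devnode {
    enum node_type type;
    int            uid;
    int            mode;
};

struct node {
    struct devnode *dnp;    /* the "inode" for this node */
    struct node    *back;   /* backing node (NULL for back nodes) */
};

/* Create the front node for one back node when a plane is built. */
static struct node *
front_dup(struct node *back)
{
    struct node *front;

    assert(back != NULL && back->dnp != NULL);
    front = calloc(1, sizeof(*front));
    if (front == NULL)
        return NULL;
    front->back = back;
    if (back->dnp->type == NODE_DEVICE) {
        /* devices: every plane shares the backing devnode */
        front->dnp = back->dnp;
    } else {
        /* directories: each plane gets a private copy */
        front->dnp = malloc(sizeof(*front->dnp));
        if (front->dnp == NULL) {
            free(front);
            return NULL;
        }
        *front->dnp = *back->dnp;
    }
    return front;
}
```

With two planes built from the same back tree, changing the mode through one
device front node is visible through the other, while changing a directory's
mode in one plane leaves the other plane and the blueprint untouched.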

Information that is to be shared between these mounts is stored
in the 'backing' node for that object.  Once you have erased a 'front'
object, there is no memory of where the backing object was, and
except for the possibility of searching the entire backing tree
for the node with the correct major/minor/type, I don't see that
it is easily recovered.. Particularly as there will eventually be
(I hope) devices that go direct from the backing node to the driver
without going via the cdevsw table.. they may not even have
major/minor numbers.

I see 'mount -u' as a possible solution to recovering a broken dev tree.
(though umount+mount would do the same)

Because non device nodes (directories and symlinks) have their own
"devnode" structs on each layer, these may have different
flags, owners, and contents on each layer.
e.g. if you have a chroot tree like erf.tfs.com has, you
may want different permissions or owners on the chroot mount of the DEVFS
than you want in the real one. You might also want to delete some sensitive
devices from the chroot tree.

Directories also have backing nodes but there is nothing to stop
the user from removing a front node from the directory front node.
(except permissions of course).  This is because the front directory
nodes keep their own records as to which front nodes are members
of that directory and do not refer to their original backing node
for this information.

The front nodes (including directories) may be moved to other
directories; however, this does not break the linkage between the
backing nodes and the front nodes. The backing node never moves. If
a driver decides to remove a device from the backing tree, the FS
code follows the links to all the front nodes linked to that backing
node, and deletes them, no matter where they've been moved to.
(active vnodes are redirected to point to the deadfs).
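
A minimal sketch of that removal path, assuming each backing node keeps a
singly linked list of its front nodes. The struct layout, the dir_id field
and both helper names are invented for this example, and the redirection of
active vnodes to the deadfs is omitted.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types for illustration only. */
struct front_node {
    struct front_node *next;    /* next front node for the same back node */
    int                dir_id;  /* directory currently holding this node */
};

struct back_node {
    struct front_node *fronts;  /* all front nodes linked to this node */
};

/* Link a new front node (one per mounted plane) to its backing node. */
static struct front_node *
front_attach(struct back_node *back, int dir_id)
{
    struct front_node *fn;

    assert(back != NULL);
    fn = malloc(sizeof(*fn));
    if (fn == NULL)
        return NULL;
    fn->dir_id = dir_id;
    fn->next = back->fronts;
    back->fronts = fn;
    return fn;
}

/*
 * The driver removes the device: follow the links and delete every
 * front node. Lookup is by pointer, so where a front node has been
 * moved to (its dir_id) is irrelevant. Returns the number deleted.
 */
static int
back_remove(struct back_node *back)
{
    int n = 0;

    assert(back != NULL);
    while (back->fronts != NULL) {
        struct front_node *fn = back->fronts;

        back->fronts = fn->next;
        free(fn);
        n++;
    }
    return n;
}
```

Because the walk never consults names or directories, a front node that the
user has moved elsewhere in its plane is still found and deleted.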

If a directory has been moved, and a new backing node is inserted
into its own back node, the new front node will appear in that front
directory, even though it's been moved, because the directory that
gets the front node is found via the links and not by name.
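
To illustrate why a renamed or moved directory still receives new entries,
here is a small sketch for a single plane. The back directory holds a
pointer to its front directory, so insertion never looks at the front
directory's name. All names and sizes here are assumptions for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_KIDS 8
#define NAME_LEN 16

/* Hypothetical, simplified types for illustration only. */
struct front_dir {
    char name[NAME_LEN];            /* user may rename this freely */
    char kids[MAX_KIDS][NAME_LEN];  /* entries visible in this plane */
    int  nkids;
};

struct back_dir {
    struct front_dir *front;        /* link to the front node (one plane) */
};

/*
 * A driver adds a device under a back directory. The matching front
 * directory is found via the link, not by name. Returns 0 on success,
 * -1 if there is no front directory or it is full.
 */
static int
back_insert(struct back_dir *bd, const char *devname)
{
    struct front_dir *fd;

    assert(bd != NULL && devname != NULL);
    fd = bd->front;
    if (fd == NULL || fd->nkids == MAX_KIDS)
        return -1;
    strncpy(fd->kids[fd->nkids], devname, NAME_LEN - 1);
    fd->kids[fd->nkids][NAME_LEN - 1] = '\0';
    fd->nkids++;
    return 0;
}
```

If the user renames the front directory before the driver calls
back_insert(), the new entry still lands in that directory, since only the
pointer is followed.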

A mount -u might be considered to be a request to 'refresh' the
plane that corresponds to the mount being updated. That would have the
effect of 're-propagating' through any backing nodes that find they
have no front nodes in that plane.


NOTES FOR RELEASE 1.2
1/ this is very preliminary
2/ the routines have been greatly simplified since release 1.1
(I guess the break did me good :)
3/ many features are not present yet,
e.g. symlinks, a comprehensive registration interface (there is only a
crude one), and the ability to unlink and mv nodes.
4/ I'm pretty sure my use of vnodes is bad and it may be 'losing'
them, or alternatively, corrupting things.. I need a vnode specialist
to look at this.