kernel access control.
Invoke a MAC framework entry point to authorize reception of an
incoming mbuf by the BPF descriptor, permitting MAC policies to
limit the visibility of packets delivered to particular BPF
descriptors.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
kernel access control.
Instrument BPF so that MAC labels are properly maintained on BPF
descriptors. MAC framework entry points are invoked at BPF
instantiation and allocation, permitting the MAC framework to
derive the BPF descriptor label from the credential authorizing
the device open. Also enter the MAC framework to label mbufs
created using the BPF device.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, NAI Labs
be done internally.
Ensure that no one can fsetown() to a dying process/pgrp. We need
to check the process for P_WEXIT to see if it's exiting. Process
groups are already safe because there is no such thing as a pgrp
zombie, therefore the proctree lock completely protects the pgrp
from having sigio structures associated with it after it runs
funsetownlst.
Add sigio lock to witness list under proctree and allproc, but over
proc and pgrp.
Seigo Tanimura helped with this.
Turn the sigio sx into a mutex.
Sigio lock is really only needed to protect interrupts from dereferencing
the sigio pointer in an object when the sigio itself is being destroyed.
In order to do this in the most unintrusive manner change pgsigio's
sigio * argument into a **, that way we can lock internally to the
function.
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.
Tested on: i386, alpha, sparc64
select/poll, and therefore with pthreads. I doubt there is any way
to make this 100% semantically identical to the way it behaves in
unthreaded programs with blocking reads, but the solution here
should do the right thing for all reasonable usage patterns.
The basic idea is to schedule a callout for the read timeout when a
select/poll is done. When the callout fires, it ends the select if
it is still in progress, or marks the state as "timed out" if the
select has already ended for some other reason. Additional logic in
bpfread then does the right thing in the case where the timeout has
fired.
Note, I co-opted the bd_state member of the bpf_d structure. It has
been present in the structure since the initial import of 4.4-lite,
but as far as I can tell it has never been used.
PR: kern/22063 and bin/31649
MFC after: 3 days
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.
Sorry john! (your next MFC will be a doosie!)
Reviewed by: peter@freebsd.org, dillon@freebsd.org
X-MFC after: ha ha ha ha
valid) if BPF is missing.
The netgraph_bpf node forced bpf to be present, reflect that in the
options.
Stop doing a 'count bpf' - we provide stubs.
Since a handful of drivers still refer to "bpf.h", provide a more accurate
indication that the API is present always. (eg: netinet6)
This is because calls with M_WAIT (now M_TRYWAIT) may not wait
forever when nothing is available for allocation, and may end up
returning NULL. Hopefully we now communicate more of the right thing
to developers and make it very clear that it's necessary to check whether
calls with M_(TRY)WAIT also resulted in a failed allocation.
M_TRYWAIT basically means "try harder, block if necessary, but don't
necessarily wait forever." The time spent blocking is tunable with
the kern.ipc.mbuf_wait sysctl.
M_WAIT is now deprecated but still defined for the next little while.
* Fix a typo in a comment in mbuf.h
* Fix some code that was actually passing the mbuf subsystem's M_WAIT to
malloc(). Made it pass M_WAITOK instead. If we were ever to redefine the
value of the M_WAIT flag, this could have became a big problem.
and had no data available returned 0. Now it returns -1 with errno
set to EWOULDBLOCK (== EAGAIN) as it should. This fix makes the bpf
device usable in threaded programs.
Reviewed by: bde
which hides the 'hole' in the minor bits.
Introduce unit2minor() to do the reverse operation.
Fix some some make_dev() calls which didn't use UID_* or GID_* macros.
Kill the v_hashchain alias macro, it hides the real relationship.
Introduce experimental SI_CHEAPCLONE flag set it on cloned bpfs.
cloning infrastructure standard in kern_conf. Modules are now
the same with or without devfs support.
If you need to detect if devfs is present, in modules or elsewhere,
check the integer variable "devfs_present".
This happily removes an ugly hack from kern/vfs_conf.c.
This forces a rename of the eventhandler and the standard clone
helper function.
Include <sys/eventhandler.h> in <sys/conf.h>: it's a helper #include
like <sys/queue.h>
Remove all #includes of opt_devfs.h they no longer matter.
Remove old DEVFS support fields from dev_t.
Make uid, gid & mode members of dev_t and set them in make_dev().
Use correct uid, gid & mode in make_dev in disk minilayer.
Add support for registering alias names for a dev_t using the
new function make_dev_alias(). These will show up as symlinks
in DEVFS.
Use makedev() rather than make_dev() for MFSs magic devices to prevent
DEVFS from noticing this abuse.
Add a field for DEVFS inode number in dev_t.
Add new DEVFS in fs/devfs.
Add devfs cloning to:
disk minilayer (ie: ad(4), sd(4), cd(4) etc etc)
md(4), tun(4), bpf(4), fd(4)
If DEVFS add -d flag to /sbin/inits args to make it mount devfs.
Add commented out DEVFS to GENERIC
possible for a panic to occur if BPF is in use on the interface at the
time of the call to if_detach. This happens because BPF maintains pointers
to the struct ifnet describing the interface, which is freed by if_detach.
To correct this problem, a new call, bpfdetach, is introduced. bpfdetach
locates BPF descriptor references to the interface, and NULLs them. Other
BPF code is modified so that discovery of a NULL interface results in
ENXIO (already implemented for some calls). Processes blocked on a BPF
call will also be woken up so that they can receive ENXIO.
Interface drivers that invoke bpfattach and if_detach must be modified to
also call bpfattach(ifp) before calling if_detach(ifp). This is relevant
for buses that support hot removal, such as pccard and usb. Patches to
all effected devices will not be committed, only to if_wi.c, due to
testing limitations. To reproduce the crash, load up tcpdump on you
favorite pccard ethernet card, and then eject the card. As some pccard
drivers do not invoke if_detach(ifp), this bug will not manifest itself
for those drivers.
Reviewed by: wes
not the current BPF device should report locally generated packets or not.
This allows sniffing applications to see only packets that are not generated
locally, which can be useful for debugging bridging problems, or other
situations where MAC addresses are not sufficient to identify locally
sourced packets. Default to true for this flag, so as to provide existing
behavior by default.
Introduce two new ioctls, BIOCGSEESENT and BIOCSSEESENT, which may be used
to manipulate this flag from userland, given appropriate privilege.
Modify bpf.4 to document these two new ioctl arguments.
Reviewed by: asmodai
|for high speed networks (even at 100Mbit/s this corresponds to 1/300th
|of a second). The default buffer size is 4KB, but libpcap and ipfilter
|both override this (using the BIOCSBLEN ioctl) and allocate 32KB.
|
|The following patch adds an sysctl for bpf_maxbufsize, similar to the
|one for bpf_bufsize that you added back in December 1995. I choose to
|make the default for this limit 512KB (the value suggested by NFR).
Submitted by: se
Reviewed by: phk
completion' flag. If set, the interface output routine will assume that
the packet already has a valid link-level source address. This defaults
to off (the address is overwritten)
PR: kern/10680
Submitted by: "Christopher N . Harrell" <cnh@mindspring.net>
Obtained from: NetBSD
have been there in the first place. A GENERIC kernel shrinks almost 1k.
Add a slightly different safetybelt under nostop for tty drivers.
Add some missing FreeBSD tags
The cdevsw_add() function now finds the major number(s) in the
struct cdevsw passed to it. cdevsw_add_generic() is no longer
needed, cdevsw_add() does the same thing.
cdevsw_add() will print an message if the d_maj field looks bogus.
Remove nblkdev and nchrdev variables. Most places they were used
bogusly. Instead check a dev_t for validity by seeing if devsw()
or bdevsw() returns NULL.
Move bdevsw() and devsw() functions to kern/kern_conf.c
Bump __FreeBSD_version to 400006
This commit removes:
72 bogus makedev() calls
26 bogus SYSINIT functions
if_xe.c bogusly accessed cdevsw[], author/maintainer please fix.
I4b and vinum not changed. Patches emailed to authors. LINT
probably broken until they catch up.
Reformat and initialize correctly all "struct cdevsw".
Initialize the d_maj and d_bmaj fields.
The d_reset field was not removed, although it is never used.
I used a program to do most of this, so all the files now use the
same consistent format. Please keep it that way.
Vinum and i4b not modified, patches emailed to respective authors.
This is a seriously beefed up chroot kind of thing. The process
is jailed along the same lines as a chroot does it, but with
additional tough restrictions imposed on what the superuser can do.
For all I know, it is safe to hand over the root bit inside a
prison to the customer living in that prison, this is what
it was developed for in fact: "real virtual servers".
Each prison has an ip number associated with it, which all IP
communications will be coerced to use and each prison has its own
hostname.
Needless to say, you need more RAM this way, but the advantage is
that each customer can run their own particular version of apache
and not stomp on the toes of their neighbors.
It generally does what one would expect, but setting up a jail
still takes a little knowledge.
A few notes:
I have no scripts for setting up a jail, don't ask me for them.
The IP number should be an alias on one of the interfaces.
mount a /proc in each jail, it will make ps more useable.
/proc/<pid>/status tells the hostname of the prison for
jailed processes.
Quotas are only sensible if you have a mountpoint per prison.
There are no privisions for stopping resource-hogging.
Some "#ifdef INET" and similar may be missing (send patches!)
If somebody wants to take it from here and develop it into
more of a "virtual machine" they should be most welcome!
Tools, comments, patches & documentation most welcome.
Have fun...
Sponsored by: http://www.rndassociates.com/
Run for almost a year by: http://www.servetheweb.com/
that doesn't have it. This is achieved by having minimal do-nothing stubs
enabled when there are no bpfilter devices configured.
Driver modules should be built with BPF enabled for maximum
convenience (but can be built without it for maximum performance).
by bde, a few other tweaks to get the patch to apply cleanly again and
some improvements to the comments.
This change closes some fairly minor security holes associated with
F_SETOWN, fixes a few bugs, and removes some limitations that F_SETOWN
had on tty devices. For more details, see the description on the PR.
Because this patch increases the size of the proc and pgrp structures,
it is necessary to re-install the includes and recompile libkvm,
the vinum lkm, fstat, gcore, gdb, ipfilter, ps, top, and w.
PR: kern/7899
Reviewed by: bde, elvind