Commit Graph

54 Commits

Author SHA1 Message Date
Robert Watson
800c940832 Add a new priv(9) kernel interface for checking the availability of
privilege for threads and credentials.  Unlike the existing suser(9)
interface, priv(9) exposes a named privilege identifier to the privilege
checking code, allowing more complex policies regarding the granting of
privilege to be expressed.  Two interfaces are provided, replacing the
existing suser(9) interface:

suser(td)                 ->   priv_check(td, priv)
suser_cred(cred, flags)   ->   priv_check_cred(cred, priv, flags)

A comprehensive list of currently available kernel privileges may be
found in priv.h.  New privileges are easily added as required, but the
comments on adding privileges found in priv.h and priv(9) should be read
before doing so.

The new privilege interface exposed sufficient information to the
privilege checking routine that it will now be possible for jail to
determine whether a particular privilege is granted in the check routine,
rather than relying on hints from the calling context via the
SUSER_ALLOWJAIL flag.  For now, the flag is maintained, but a new jail
check function, prison_priv_check(), is exposed from kern_jail.c and used
by the privilege check routine to determine if the privilege is permitted
in jail.  As a result, a centralized list of privileges permitted in jail
is now present in kern_jail.c.

The MAC Framework is now also able to instrument privilege checks, both
to deny privileges otherwise granted (mac_priv_check()), and to grant
privileges otherwise denied (mac_priv_grant()), permitting MAC Policy
modules to implement privilege models, as well as control a much broader
range of system behavior in order to constrain processes running with
root privilege.

The suser() and suser_cred() functions remain implemented, now in terms
of priv_check() and the PRIV_ROOT privilege, for use during the transition
and possibly continuing use by third party kernel modules that have not
been updated.  The PRIV_DRIVER privilege exists to allow device drivers to
check privilege without adopting a more specific privilege identifier.

This change does not modify the actual security policy, rather, it
modifies the interface for privilege checks so changes to the security
policy become more feasible.

Sponsored by:		nCircle Network Security, Inc.
Obtained from:		TrustedBSD Project
Discussed on:		arch@
Reviewed (at least in part) by:	mlaier, jmg, pjd, bde, ceri,
			Alex Lyashkov <umka at sevcity dot net>,
			Skip Ford <skip dot ford at verizon dot net>,
			Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:37:19 +00:00
Robert Watson
aed5570872 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
Robert Watson
5702e0965e Declare security and security.bsd sysctl hierarchies in sysctl.h along
with other commonly used sysctl name spaces, rather than declaring them
all over the place.

MFC after:	1 month
Sponsored by:	nCircle Network Security, Inc.
2006-09-17 20:00:36 +00:00
Christian S.J. Peron
453f7d5369 Push Giant down in jails. Pass the MPSAFE flag to NDINIT, and keep track
of whether or not Giant was picked up by the filesystem. Add VFS_LOCK_GIANT
macros around vrele as it's possible that this can call in the VOP_INACTIVE
filesystem specific code. Also while we are here, remove the Giant assertion.
from the sysctl handler,  we do not actually require Giant here so we
shouldn't assert it. Doing so will just complicate things when Giant is removed
from the sysctl framework.
2005-09-28 00:30:56 +00:00
Pawel Jakub Dawidek
06a137780b Actually only protect mount-point if security.jail.enforce_statfs is set to 2.
If we don't return statistics about requested file systems, system tools
may not work correctly or at all.

Approved by:	re (scottl)
2005-06-23 22:13:29 +00:00
Pawel Jakub Dawidek
820a0de9a9 Rename sysctl security.jail.getfsstatroot_only to security.jail.enforce_statfs
and extend its functionality:

value	policy
0	show all mount-points without any restrictions
1	show only mount-points below jail's chroot and show only part of the
	mount-point's path (if jail's chroot directory is /jails/foo and
	mount-point is /jails/foo/usr/home only /usr/home will be shown)
2	show only mount-point where jail's chroot directory is placed.

Default value is 2.

Discussed with:	rwatson
2005-06-09 18:49:19 +00:00
Jeff Roberson
22fdc83f93 - Use taskqueue_thread rather than taskqueue_swi since our task is going
to vrele, which may vop lock.  This is not safe in a software interrupt
   context.
2005-04-05 08:51:45 +00:00
John Baldwin
2945387fee Drop a bogus mp_fixme(). Adding a lock would do nothing to reduce userland
races regarding changing of jail-related sysctls.
2005-03-31 22:47:57 +00:00
Colin Percival
79653046d8 Add a new sysctl, "security.jail.chflags_allowed", which controls the
behaviour of chflags within a jail.  If set to 0 (the default), then a
jailed root user is treated as an unprivileged user; if set to 1, then
a jailed root user is treated the same as an unjailed root user.

This is necessary to allow "make installworld" to work inside a jail,
since it attempts to manipulate the system immutable flag on certain
files.

Discussed with:	csjp, rwatson
MFC after:	2 weeks
2005-02-08 21:31:11 +00:00
Warner Losh
9454b2d864 /* -> /*- for copyright notices, minor format tweaks as necessary 2005-01-06 23:35:40 +00:00
Pawel Jakub Dawidek
46e3b1cbe7 Add two missing includes and remove two uneeded.
This is quite serious fix, because even with MAC framework compiled in,
MAC entry points in those two files were simply ignored.
2004-06-27 09:03:22 +00:00
Pawel Jakub Dawidek
2ff8a3496f Fix sysctl name: security.jail.getfsstate_getfsstatroot_only ->
security.jail.getfsstatroot_only.

Approved by:	rwatson
2004-05-20 05:28:44 +00:00
Bosko Milekic
5a59cefcd1 Give jail(8) the feature to allow raw sockets from within a
jail, which is less restrictive but allows for more flexible
jail usage (for those who are willing to make the sacrifice).
The default is off, but allowing raw sockets within jails can
now be accomplished by tuning security.jail.allow_raw_sockets
to 1.

Turning this on will allow you to use things like ping(8)
or traceroute(8) from within a jail.

The patch being committed is not identical to the patch
in the PR.  The committed version is more friendly to
APIs which pjd is working on, so it should integrate
into his work quite nicely.  This change has also been
presented and addressed on the freebsd-hackers mailing
list.

Submitted by: Christian S.J. Peron <maneo@bsdpro.com>
PR: kern/65800
2004-04-26 19:46:52 +00:00
Pawel Jakub Dawidek
7f4704c01d Remove sysctl security.jail.list_allowed.
This functionality was a misfeature, sysctl was added and turned off by
default just to check if nobody complains.

Reviewed by:	rwatson
2004-03-15 12:10:34 +00:00
Jacques Vidrine
57f22bd4af Rework jail_attach(2) so that an already jailed process cannot hop
to another jail.

Submitted by:	rwatson
2004-02-19 21:03:20 +00:00
Pawel Jakub Dawidek
461167c289 Added sysctl security.jail.jailed.
It returns 1 is process is inside of jail and 0 if it is not.
Information if we are in jail or not is not a secret, there is plenty of
ways to discover it. Many people are using own hack to check this and
this will be a legal way from now on.

It will be great if our starting scripts will take advantage of this sysctl
to allow clean "boot" inside jail.

Approved by:	rwatson, scottl (mentor)
2004-02-19 14:29:14 +00:00
Robert Watson
679a106075 By default, don't allow processes in a jail to list the set of
jails in the system.  Previous behavior (allowed) may be restored
by setting security.jail.list_allowed=1.
2004-02-14 19:19:47 +00:00
Robert Watson
7e440242e5 Fix mismerge in last commit: check that cred->cr_prison is NULL
before dereferencing the prison pointer.
2004-02-14 18:52:43 +00:00
Robert Watson
f08df373a3 By default, when a process in jail calls getfsstat(), only return the
data for the file system on which the jail's root vnode is located.
Previous behavior (show data for all mountpoints) can be restored
by setting security.jail.getfsstatroot_only to 0.  Note: this also
has the effect of hiding other mounts inside a jail, such as /dev,
/tmp, and /proc, but errs on the side of leaking less information.
2004-02-14 18:31:11 +00:00
Robert Watson
b3059e09f6 Defer the vrele() on a jail's root vnode reference from prison_free()
to a new prison_complete() task run by a task queue.  This removes
a requirement for grabbing Giant in crfree().  Embed the 'struct task'
in 'struct prison' so that we don't have to allocate memory from
prison_free() (which means we also defer the FREE()).

With this change, I believe grabbing Giant from crfree() can now be
removed, but need to check the uidinfo code paths.

To avoid header pollution, move the definition of 'struct task'
to _task.h, and recursively include from taskqueue.h and jail.h; much
preferably to all files including jail.h picking up a requirement to
include taskqueue.h.

Bumped into by:	sam
Reviewed by:	bde, tjr
2004-01-23 20:44:26 +00:00
David E. O'Brien
677b542ea2 Use __FBSDID(). 2003-06-11 00:56:59 +00:00
Mike Barcroft
9ddb795450 style(9) 2003-04-28 18:32:19 +00:00
John Baldwin
69c4ee54ff - The prison mutex cannot possibly protect pointers to the prison it
protects, so don't bother locking it while we assign it to a ucred's
  cr_prison.
- Fully construct the new credential for a process before assigning it to
  p_ucred.
2003-04-17 22:26:53 +00:00
Mike Barcroft
fd7a8150fb o In struct prison, add an allprison linked list of prisons (protected
by allprison_mtx), a unique prison/jail identifier field, two path
  fields (pr_path for reporting and pr_root vnode instance) to store
  the chroot() point of each jail.
o Add jail_attach(2) to allow a process to bind to an existing jail.
o Add change_root() to perform the chroot operation on a specified
  vnode.
o Generalize change_dir() to accept a vnode, and move namei() calls
  to callers of change_dir().
o Add a new sysctl (security.jail.list) which is a group of
  struct xprison instances that represent a snapshot of active jails.

Reviewed by:	rwatson, tjr
2003-04-09 02:55:18 +00:00
Warner Losh
a163d034fa Back out M_* changes, per decision of the TRB.
Approved by: trb
2003-02-19 05:47:46 +00:00
Alfred Perlstein
44956c9863 Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0.
Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.
2003-01-21 08:56:16 +00:00
Maxime Henrion
894db7b01f Don't forget to destroy the mutex if an error occurs
in the jail() system call.

Submitted by:	Pawel Jakub Dawidek <nick@garage.freebsd.pl>
2002-12-20 14:32:20 +00:00
Alfred Perlstein
b80521fee5 remove syscallarg().
Suggested by: peter
2002-12-14 02:07:32 +00:00
Robert Drehmel
e80fb43467 Use strlcpy() instead of strncpy() to copy NUL terminated strings
for safety and consistency.
2002-10-17 20:03:38 +00:00
Ian Dowse
f2f2285a6a The jail syscall calls chroot, which is not mpsafe, so put back a
mtx_lock(&Giant) around that call.

Reviewed by:	arr
2002-07-01 20:46:01 +00:00
Andrew R. Reiter
4e77f68011 - Alleviate jail() from having the burden of acquiring Giant by simply
removing.  We can do this since we no longer need Giant to safely
  execute jail().

Reviewed by:	rwatson, jhb
2002-06-26 00:29:01 +00:00
John Baldwin
6008862bc2 Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on:	i386, alpha, sparc64
2002-04-04 21:03:38 +00:00
John Baldwin
44731cab3b Change the suser() API to take advantage of td_ucred as well as do a
general cleanup of the API.  The entire API now consists of two functions
similar to the pre-KSE API.  The suser() function takes a thread pointer
as its only argument.  The td_ucred member of this thread must be valid
so the only valid thread pointers are curthread and a few kernel threads
such as thread0.  The suser_cred() function takes a pointer to a struct
ucred as its first argument and an integer flag as its second argument.
The flag is currently only used for the PRISON_ROOT flag.

Discussed on:	smp@
2002-04-01 21:31:13 +00:00
Robert Drehmel
ad1ff0997e Make getcredhostname() take a buffer and the buffer's size
as arguments.  The correct hostname is copied into the buffer
while having the prison's lock acquired in a jailed process'
case.

Reviewed by:	jhb, rwatson
2002-02-27 16:43:20 +00:00
Robert Drehmel
9484d0c0e8 Add a function which returns the correct hostname for a given
credential.

Reviewed by:	phk
2002-02-27 14:58:32 +00:00
Andrew R. Reiter
d0615c64a5 - Attempt to help declutter kern. sysctl by moving security out from
beneath it.

Reviewed by: rwatson
2002-01-16 06:55:30 +00:00
Andrew R. Reiter
83aee5a8d5 - Move _jail sysctl node underneath _kern_security in order to standardize
where our security related sysctl tuneables are located.  Also, this
  will help if/when we move _security node out from under _kern as to help
  make _kern less cluttered.

Approved by:	rwatson
Review by:	rwatson
2001-12-12 05:23:20 +00:00
Robert Watson
011376308f o Introduce pr_mtx into struct prison, providing protection for the
mutable contents of struct prison (hostname, securelevel, refcount,
  pr_linux, ...)
o Generally introduce mtx_lock()/mtx_unlock() calls throughout kern/
  so as to enforce these protections, in particular, in kern_mib.c
  protection sysctl access to the hostname and securelevel, as well as
  kern_prot.c access to the securelevel for access control purposes.
o Rewrite linux emulator abstractions for accessing per-jail linux
  mib entries (osname, osrelease, osversion) so that they don't return
  a pointer to the text in the struct linux_prison, rather, a copy
  to an array passed into the calls.  Likewise, update linprocfs to
  use these primitives.
o Update in_pcb.c to always use prison_getip() rather than directly
  accessing struct prison.

Reviewed by:	jhb
2001-12-03 16:12:27 +00:00
Robert Watson
fc5d29ef7d o Move suser() calls in kern/ to using suser_xxx() with an explicit
credential selection, rather than reference via a thread or process
  pointer.  This is part of a gradual migration to suser() accepting
  a struct ucred instead of a struct proc, simplifying the reference
  and locking semantics of suser().

Obtained from:	TrustedBSD Project
2001-11-01 20:56:57 +00:00
John Baldwin
a2f2b3afcd - Catch up to the new ucred API.
- Add proc locking to the jail() syscall.  This mostly involved shuffling
  a few things around so that blockable things like malloc and copyin
  were performed before acquiring the lock and checking the existing
  ucred and then updating the ucred as one "atomic" change under the proc
  lock.
2001-10-11 23:39:43 +00:00
Robert Watson
567931c8f6 o Initialize per-jail securelevel from global securelevel as part of
jail creation.

Obtained from:	TrustedBSD Project
2001-09-26 20:37:15 +00:00
Julian Elischer
b40ce4165d KSE Milestone 2
Note ALL MODULES MUST BE RECOMPILED
make the kernel aware that there are smaller units of scheduling than the
process. (but only allow one thread per process at this time).
This is functionally equivalent to teh previousl -current except
that there is a thread associated with each process.

Sorry john! (your next MFC will be a doosie!)

Reviewed by: peter@freebsd.org, dillon@freebsd.org

X-MFC after:    ha ha ha ha
2001-09-12 08:38:13 +00:00
Matthew Dillon
116734c4d1 Pushdown Giant for acct(), kqueue(), kevent(), execve(), fork(),
vfork(), rfork(), jail().
2001-09-01 03:04:31 +00:00
Robert Watson
fd6aaf7fe1 Anton kindly pointed out (and fixed) a bug in the Jail handling of the
bind() call on IPv4 sockets:

  Currently, if one tries to bind a socket using INADDR_LOOPBACK inside a
  jail, it will fail because prison_ip() does not take this possibility
  into account.  On the other hand, when one tries to connect(), for
  example, to localhost, prison_remote_ip() will silently convert
  INADDR_LOOPBACK to the jail's IP address.  Therefore, it is desirable to
  make bind() to do this implicit conversion as well.

  Apart from this, the patch also replaces 0x7f000001 in
  prison_remote_ip() to a more correct INADDR_LOOPBACK.

This is a 4.4-RELEASE "during the freeze, thanks" MFC candidate.

Submitted by:	Anton Berezin <tobez@FreeBSD.org>
Discussed with at some point:	phk
MFC after:	3 days
2001-08-03 18:21:06 +00:00
Robert Watson
91421ba234 o Move per-process jail pointer (p->pr_prison) to inside of the subject
credential structure, ucred (cr->cr_prison).
o Allow jail inheritence to be a function of credential inheritence.
o Abstract prison structure reference counting behind pr_hold() and
  pr_free(), invoked by the similarly named credential reference
  management functions, removing this code from per-ABI fork/exit code.
o Modify various jail() functions to use struct ucred arguments instead
  of struct proc arguments.
o Introduce jailed() function to determine if a credential is jailed,
  rather than directly checking pointers all over the place.
o Convert PRISON_CHECK() macro to prison_check() function.
o Move jail() function prototypes to jail.h.
o Emulate the P_JAILED flag in fill_kinfo_proc() and no longer set the
  flag in the process flags field itself.
o Eliminate that "const" qualifier from suser/p_can/etc to reflect
  mutex use.

Notes:

o Some further cleanup of the linux/jail code is still required.
o It's now possible to consider resolving some of the process vs
  credential based permission checking confusion in the socket code.
o Mutex protection of struct prison is still not present, and is
  required to protect the reference count plus some fields in the
  structure.

Reviewed by:	freebsd-arch
Obtained from:	TrustedBSD Project
2001-02-21 06:39:57 +00:00
David Malone
7cc0979fd6 Convert more malloc+bzero to malloc+M_ZERO.
Submitted by:	josh@zipperup.org
Submitted by:	Robert Drehmel <robd@gmx.net>
2000-12-08 21:51:06 +00:00
Robert Watson
cb1f0db9db o Deny access to System V IPC from within jail by default, as in the
current implementation, jail neither virtualizes the Sys V IPC namespace,
  nor provides inter-jail protections on IPC objects.
o Support for System V IPC can be enabled by setting jail.sysvipc_allowed=1
  using sysctl.
o This is not the "real fix" which involves virtualizing the System V
  IPC namespace, but prevents processes within jail from influencing those
  outside of jail when not approved by the administrator.

Reported by:	Paulo Fragoso <paulo@nlink.com.br>
2000-10-31 01:34:00 +00:00
Robert Watson
7cadc2663e o Modify jail to limit creation of sockets to UNIX domain sockets,
TCP/IP (v4) sockets, and routing sockets.  Previously, interaction
  with IPv6 was not well-defined, and might be inappropriate for some
  environments.  Similarly, sysctl MIB entries providing interface
  information also give out only addresses from those protocol domains.

  For the time being, this functionality is enabled by default, and
  toggleable using the sysctl variable jail.socket_unixiproute_only.
  In the future, protocol domains will be able to determine whether or
  not they are ``jail aware''.

o Further limitations on process use of getpriority() and setpriority()
  by jailed processes.  Addresses problem described in kern/17878.

Reviewed by:	phk, jmg
2000-06-04 04:28:31 +00:00
Robert Watson
83f1e257e0 Yet-another-update: rename ``kern.prison'' to a new sysctl root entry,
``jail'', and move the set_hostname_allowed sysctl there, as well as
fixing a bug in the sysctl that resulted in jails being over-limited
(preventing them from reading as well as writing the hostname).  Also,
correct some formatting issues, courtesy bde :-).

Reviewed by:	phk
Approved by:	jkh
2000-02-12 13:41:56 +00:00
Poul-Henning Kamp
978f8d9300 Add a version number field to the jail(2) argument so that future changes
can be handled intelligently.
1999-09-19 08:36:03 +00:00