Commit Graph

210 Commits

Author SHA1 Message Date
jamie
1c11f552d6 Make it easier for filesystems to count themselves as jail-enabled,
by doing most of the work in a new function prison_add_vfs in kern_jail.c
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of.  This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.

Reviewed by:	kib
Differential Revision:	D14681
2018-05-04 20:54:27 +00:00
avg
3cd5284f93 call racct_proc_ucred_changed() under the proc lock
The lock is required to ensure that the switch to the new credentials
and the transfer of the process's accounting data from the old
credentials to the new ones is done atomically.  Otherwise, some updates
may be applied to the new credentials and then additionally transferred
from the old credentials if the updates happen after proc_set_cred() and
before racct_proc_ucred_changed().

The problem is especially pronounced for RACCT_RSS because
- there is a strict accounting for this resource (it's reclaimable)
- it's updated asynchronously by the vm daemon
- it's updated by setting an absolute value instead of applying a delta

I had to remove a call to rctl_proc_ucred_changed() from
racct_proc_ucred_changed() and make all callers of latter call the
former as well.  The reason is that rctl_proc_ucred_changed, as it is
implemented now, cannot be called while holding the proc lock, so the
lock is dropped after calling racct_proc_ucred_changed.  Additionally,
I've added calls to crhold / crfree around the rctl call, because
without the proc lock there is no gurantee that the new credentials,
owned by the process, will stay stable.  That does not eliminate a
possibility that the credentials passed to the rctl will get stale.
Ideally, rctl_proc_ucred_changed should be able to work under the proc
lock.

Many thanks to kib for pointing out the above problems.

PR:		222027
Discussed with:	kib
No comment:	trasz
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D15048
2018-04-20 13:08:04 +00:00
brooks
9d79658aab Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
jamie
783e904fc9 Represent boolean jail options as an array of structures containing the
flag and both the regular and "no" names, instead of two different string
arrays whose indices need to match the flag's bit position.  This makes
them similar to the say "jailsys" options are represented.

Loop through either kind of option array with a structure pointer rather
then an integer index.
2018-03-20 23:08:42 +00:00
pfg
cc22a86800 sys/kern: adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-27 15:20:12 +00:00
allanjude
86ca7d2af5 Jails: Optionally prevent jailed root from binding to privileged ports
You may now optionally specify allow.noreserved_ports to prevent root
inside a jail from using privileged ports (less than 1024)

PR:		217728
Submitted by:	Matt Miller <mattm916@pulsar.neomailbox.ch>
Reviewed by:	jamie, cem, smh
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D10202
2017-06-06 02:15:00 +00:00
vangyzen
c7c348accc Use inet_ntoa_r() instead of inet_ntoa() throughout the kernel
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.

Suggested by:	glebius, emaste
Reviewed by:	gnn
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D9625
2017-02-16 20:47:41 +00:00
stevek
12aa5d6087 Move IPv4-specific jail functions to new file netinet/in_jail.c
_prison_check_ip4 renamed to prison_check_ip4_locked

Move IPv6-specific jail functions to new file netinet6/in6_jail.c
_prison_check_ip6 renamed to prison_check_ip6_locked

Add appropriate prototypes to sys/sys/jail.h

Adjust kern_jail.c to call prison_check_ip4_locked and
prison_check_ip6_locked accordingly.

Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that
need to be built when INET and INET6, respectively, are configured in the
kernel configuration file.

Reviewed by:	jtl
Approved by:	sjg (mentor)
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D6799
2016-08-09 02:16:21 +00:00
jamie
aad828cb49 Fix a vnode leak when giving a child jail a too-long path when
debug.disablefullpath=1.
2016-06-09 21:59:11 +00:00
jamie
786e18e298 Re-order some jail parameter reading to prevent a vnode leak. 2016-06-09 20:43:14 +00:00
jamie
91dc125e2d Clean up some logic in jail error messages, replacing a missing test and
a redundant test with a single correct test.
2016-06-09 20:39:57 +00:00
jamie
debe558799 Make sure the OSD methods for jail set and remove can't run concurrently,
by holding allprison_lock exclusively (even if only for a moment before
downgrading) on all paths that call PR_METHOD_REMOVE.  Since they may run
on a downgraded lock, it's still possible for them to run concurrently
with PR_METHOD_GET, which will need to use the prison lock.
2016-06-09 16:41:41 +00:00
jamie
f9e8138137 Mark jail(2), and the sysctls that it (and only it) uses as deprecated.
jail(8) has long used jail_set(2), and those sysctl only cause confusion.
2016-05-30 05:21:24 +00:00
pfg
28823d0656 sys/kern: spelling fixes in comments.
No functional change.
2016-04-29 22:15:33 +00:00
jamie
c39e007779 Delay revmoing the last jail reference in prison_proc_free, and instead
put it off into the pr_task.  This is similar to prison_free, and in fact
uses the same task even though they do something slightly different.

This resolves a LOR between the process lock and allprison_lock, which
came about in r298565.

PR:		48471
2016-04-27 02:25:21 +00:00
jamie
6043e5f56b Use crcopysafe in jail_attach. 2016-04-26 21:19:12 +00:00
jamie
f29978c971 Pass the current/new jail to PR_METHOD_CHECK, which pushes the call
until after the jail is found or created.  This requires unlocking the
jail for the call and re-locking it afterward, but that works because
nothing in the jail has been changed yet, and other processes won't
change the important fields as long as allprison_lock remains held.

Keep better track of name vs namelc in kern_jail_set.  Name should
always be the hierarchical name (relative to the caller), and namelc
the last component.

PR:		48471
MFC after:	5 days
2016-04-25 04:27:58 +00:00
jamie
2003e1acf2 Add a new jail OSD method, PR_METHOD_REMOVE. It's called when a jail is
removed from the user perspective, i.e. when the last pr_uref goes away,
even though the jail mail still exist in the dying state.  It will also
be called if either PR_METHOD_CREATE or PR_METHOD_SET fail.

PR:		48471
MFC after:	 5 days
2016-04-25 04:24:00 +00:00
jamie
b67e183366 Remove the PR_REMOVE flag, which was meant as a temporary marker for
a jail that might be seen mid-removal.  It hasn't been doing the right
thing since at least the ability to resurrect dying jails, and such
resurrection also makes it unnecessary.
2016-04-25 03:58:08 +00:00
pfg
a7d40a88c9 kernel: use our nitems() macro when it is available through param.h.
No functional change, only trivial cases are done in this sweep,

Discussed in:	freebsd-current
2016-04-19 23:48:27 +00:00
jamie
b78d6a91e2 Fix jail name checking that disallowed anything that starts with '0'.
The intention was to just limit leading zeroes on numeric names.  That
check is now improved to also catch the leading spaces and '+' that
strtoul can pass through.

PR:		204897
MFC after:	3 days
2015-12-15 17:25:00 +00:00
trasz
ae1d62cec8 Speed up rctl operation with large rulesets, by holding the lock
during iteration instead of relocking it for each traversed rule.

Reviewed by:	mjg@
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D4110
2015-11-15 12:10:51 +00:00
araujo
61ced0e48d Add support to the jail framework to be able to mount linsysfs(5) and
linprocfs(5).

Differential Revision:	D2846
Submitted by:		Nikolai Lifanov <lifanov@mail.lifanov.com>
Reviewed by:		jamie
2015-07-19 08:52:35 +00:00
mjg
c71e9ab863 Move chdir/chroot-related fdp manipulation to kern_descrip.c
Prefix exported functions with pwd_.

Deduplicate some code by adding a helper for setting fd_cdir.

Reviewed by:	kib
2015-07-11 16:19:11 +00:00
bz
ab1516b90e Initialise pr_enforce_statfs from the "default" sysctl value and
not from the compile time constant.  The sysctl value is seeded
from the compile time constant.

MFC after:	2 weeks
2015-06-17 13:15:54 +00:00
trasz
802017a04b Add kern.racct.enable tunable and RACCT_DISABLED config option.
The point of this is to be able to add RACCT (with RACCT_DISABLED)
to GENERIC, to avoid having to rebuild the kernel to use rctl(8).

Differential Revision:	https://reviews.freebsd.org/D2369
Reviewed by:	kib@
MFC after:	1 month
Relnotes:	yes
Sponsored by:	The FreeBSD Foundation
2015-04-29 10:23:02 +00:00
glebius
baf4ea8ca8 Do not include if_var.h and in6_var.h into kern_jail.c. It is now possible
after r280444.

Sponsored by:	Nginx, Inc.
2015-03-24 16:46:40 +00:00
mjg
054f9cab59 cred: add proc_set_cred helper
The goal here is to provide one place altering process credentials.

This eases debugging and opens up posibilities to do additional work when such
an action is performed.
2015-03-16 00:10:03 +00:00
ian
0b0c9cce2f Format the line properly (wrap before column 80). 2015-02-28 17:44:31 +00:00
ian
57c4ae8cb3 Export the new osreldate and osrelease jail parms in jail_get(2). 2015-02-28 17:32:31 +00:00
ian
1df855e5be Allow the kern.osrelease and kern.osreldate sysctl values to be set in a
jail's creation parameters.  This allows the kernel version to be reliably
spoofed within the jail whether examined directly with sysctl or
indirectly with the uname -r and -K options.

The values can only be set at jail creation time, to eliminate the need
for any locking when accessing the values via sysctl.

The overridden values are inherited by nested jails (unless the config for
the nested jails also overrides the values).

There is no sanity or range checking, other than disallowing an empty
release string or a zero release date, by design.  The system
administrator is trusted to set sane values.  Setting values that are
newer than the actual running kernel will likely cause compatibility
problems.

Differential Revision:	https://reviews.freebsd.org/D1948
Relnotes:	yes
2015-02-27 16:28:55 +00:00
jamie
c7d0935d11 Add allow.mount.fdescfs jail flag.
PR:		192951
Submitted by:	ruben@verweg.com
MFC after:	3 days
2015-01-28 21:08:09 +00:00
jamie
15f3ae0c52 Remove the prison flags PR_IP4_DISABLE and PR_IP6_DISABLE, which have been
write-only for as long as they've existed.
2015-01-14 04:50:28 +00:00
jamie
20db074137 Don't set prison's pr_ip4s or pr_ip6s to -1.
PR:		196474
MFC after:	3 days
2015-01-14 03:52:41 +00:00
trasz
76b4ee1f43 Avoid unlocking unlocked mutex in RCTL jail code. Specific test case
is attached to PR.

PR:		193457
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2014-09-09 16:05:33 +00:00
glebius
80e85e32a5 Remove AppleTalk support.
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.

Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 06:29:43 +00:00
glebius
d494babace Remove IPX support.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.

Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
2014-03-14 02:58:48 +00:00
jamie
64b15ec174 Back out r261266 pending security buy-in.
r261266:
  Add a jail parameter, allow.kmem, which lets jailed processes access
  /dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
  This in conjunction with changing the drm driver's permission check from
  PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.
2014-01-31 17:39:51 +00:00
jamie
223bb594b0 Add a jail parameter, allow.kmem, which lets jailed processes access
/dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
This in conjunction with changing the drm driver's permission check from
PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.

Submitted by:	netchild
MFC after:	1 week
2014-01-29 13:41:13 +00:00
ae
89e2f99130 Fix copy/paste typo.
MFC after:	1 week
2013-12-17 16:45:19 +00:00
peter
ac40be45fb jail_v0.ip_number was always in host byte order. This was handled
in one of the many layers of indirection and shims through stable/7
in jail_handle_ips().  When it was cleaned up and unified through
kern_jail() for 8.x, the byte order swap was lost.

This only matters for ancient binaries that call jail(2) themselves
internally.
2013-11-28 19:40:33 +00:00
glebius
93563d8269 prison_check_ip4() can take const arguments. 2013-11-01 10:01:57 +00:00
glebius
ff6e113f1b The r48589 promised to remove implicit inclusion of if_var.h soon. Prepare
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h

Sponsored by:	Netflix
Sponsored by:	Nginx, Inc.
2013-10-26 17:58:36 +00:00
jamie
d13d69ef17 Keep PRIV_KMEM_READ permitted inside jails as it is on the outside. 2013-09-06 17:32:29 +00:00
delphij
b93cf73204 Allow tmpfs be mounted inside jail. 2013-08-23 22:52:20 +00:00
jamie
7941fefd80 Refine the "nojail" rc keyword, adding "nojailvnet" for files that don't
apply to most jails but do apply to vnet jails.  This includes adding
a new sysctl "security.jail.vnet" to identify vnet jails.

PR:		conf/149050
Submitted by:	mdodd
MFC after:	3 days
2013-05-19 04:10:34 +00:00
mjg
5d2ad328b1 prison_racct_detach can be called for not fully initialized jail, so make it check that the jail has racct before doing anything
PR:		kern/174436
Reviewed by:	trasz
MFC after:	3 days
2012-12-18 18:34:36 +00:00
kib
560aa751e0 Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by:	attilio
Tested by:	pho
2012-10-22 17:50:54 +00:00
trasz
b2747e472e Fix use-after-free in kern_jail_set() triggered e.g. by attempts
to clear "persist" flag from empty persistent jail, like this:

jail -c persist=1
jail -n 1 -m persist=0

Submitted by:	Mateusz Guzik <mjguzik at gmail dot com>
MFC after:	2 weeks
2012-05-22 19:43:20 +00:00
trasz
a25d879040 Don't leak locks in prison_racct_modify().
Submitted by:	Mateusz Guzik <mjguzik at gmail dot com>
MFC after:	2 weeks
2012-05-22 17:30:02 +00:00