can see the dmesg buffer (this is the current behavior). When false (the
new default), dmesg will be unavailable to jailed users, whether root or
not.
The security.bsd.unprivileged_read_msgbuf sysctl still works as before,
controlling system-wide whether non-root users can see the buffer.
PR: 211580
Submitted by: bz
Approved by: re@ (kib@)
MFC after: 3 days
It's not supposed to be legal for two jails to contain the same IP address,
unless both jails contain only that one address. This is the behavior
documented in jail(8), and is there to prevent confusion when multiple
jails are listening on IADDR_ANY.
VIMAGE jails (now the default for GENERIC kernels) test this correctly,
but non-VIMAGE jails have been performing an incomplete test when nested
jails are used.
Approved by: re@ (kib@)
MFC after: 5 days
jails since FreeBSD 7.
Along with the system call, put the various security.jail.allow_foo and
security.jail.foo_allowed sysctls partly under COMPAT_FREEBSD11 (or
BURN_BRIDGES). These sysctls had two disparate uses: on the system side,
they were global permissions for jails created via jail(2) which lacked
fine-grained permission controls; inside a jail, they're read-only
descriptions of what the current jail is allowed to do. The first use
is obsolete along with jail(2), but keep them for the second-read-only use.
Differential Revision: D14791
It allows locking or unlocking physical pages in memory within a jail
This allows running elasticsearch with "bootstrap.memory_lock" inside a jail
Reviewed by: jamie@
Differential Revision: https://reviews.freebsd.org/D16342
to FOREACH_PROC_IN_SYSTEM() to have a single pattern to look for.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Differential Revision: https://reviews.freebsd.org/D15916
by doing most of the work in a new function prison_add_vfs in kern_jail.c
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of. This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.
Reviewed by: kib
Differential Revision: D14681
The lock is required to ensure that the switch to the new credentials
and the transfer of the process's accounting data from the old
credentials to the new ones is done atomically. Otherwise, some updates
may be applied to the new credentials and then additionally transferred
from the old credentials if the updates happen after proc_set_cred() and
before racct_proc_ucred_changed().
The problem is especially pronounced for RACCT_RSS because
- there is a strict accounting for this resource (it's reclaimable)
- it's updated asynchronously by the vm daemon
- it's updated by setting an absolute value instead of applying a delta
I had to remove a call to rctl_proc_ucred_changed() from
racct_proc_ucred_changed() and make all callers of latter call the
former as well. The reason is that rctl_proc_ucred_changed, as it is
implemented now, cannot be called while holding the proc lock, so the
lock is dropped after calling racct_proc_ucred_changed. Additionally,
I've added calls to crhold / crfree around the rctl call, because
without the proc lock there is no gurantee that the new credentials,
owned by the process, will stay stable. That does not eliminate a
possibility that the credentials passed to the rctl will get stale.
Ideally, rctl_proc_ucred_changed should be able to work under the proc
lock.
Many thanks to kib for pointing out the above problems.
PR: 222027
Discussed with: kib
No comment: trasz
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D15048
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.
Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c. A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.
Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.
Reviewed by: kib, cem, jhb, jtl
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D14941
flag and both the regular and "no" names, instead of two different string
arrays whose indices need to match the flag's bit position. This makes
them similar to the say "jailsys" options are represented.
Loop through either kind of option array with a structure pointer rather
then an integer index.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
You may now optionally specify allow.noreserved_ports to prevent root
inside a jail from using privileged ports (less than 1024)
PR: 217728
Submitted by: Matt Miller <mattm916@pulsar.neomailbox.ch>
Reviewed by: jamie, cem, smh
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D10202
inet_ntoa() cannot be used safely in a multithreaded environment
because it uses a static local buffer. Instead, use inet_ntoa_r()
with a buffer on the caller's stack.
Suggested by: glebius, emaste
Reviewed by: gnn
MFC after: 2 weeks
Sponsored by: Dell EMC
Differential Revision: https://reviews.freebsd.org/D9625
_prison_check_ip4 renamed to prison_check_ip4_locked
Move IPv6-specific jail functions to new file netinet6/in6_jail.c
_prison_check_ip6 renamed to prison_check_ip6_locked
Add appropriate prototypes to sys/sys/jail.h
Adjust kern_jail.c to call prison_check_ip4_locked and
prison_check_ip6_locked accordingly.
Add netinet/in_jail.c and netinet6/in6_jail.c to the list of files that
need to be built when INET and INET6, respectively, are configured in the
kernel configuration file.
Reviewed by: jtl
Approved by: sjg (mentor)
Sponsored by: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D6799
by holding allprison_lock exclusively (even if only for a moment before
downgrading) on all paths that call PR_METHOD_REMOVE. Since they may run
on a downgraded lock, it's still possible for them to run concurrently
with PR_METHOD_GET, which will need to use the prison lock.
put it off into the pr_task. This is similar to prison_free, and in fact
uses the same task even though they do something slightly different.
This resolves a LOR between the process lock and allprison_lock, which
came about in r298565.
PR: 48471
until after the jail is found or created. This requires unlocking the
jail for the call and re-locking it afterward, but that works because
nothing in the jail has been changed yet, and other processes won't
change the important fields as long as allprison_lock remains held.
Keep better track of name vs namelc in kern_jail_set. Name should
always be the hierarchical name (relative to the caller), and namelc
the last component.
PR: 48471
MFC after: 5 days
removed from the user perspective, i.e. when the last pr_uref goes away,
even though the jail mail still exist in the dying state. It will also
be called if either PR_METHOD_CREATE or PR_METHOD_SET fail.
PR: 48471
MFC after: 5 days
a jail that might be seen mid-removal. It hasn't been doing the right
thing since at least the ability to resurrect dying jails, and such
resurrection also makes it unnecessary.
The intention was to just limit leading zeroes on numeric names. That
check is now improved to also catch the leading spaces and '+' that
strtoul can pass through.
PR: 204897
MFC after: 3 days
during iteration instead of relocking it for each traversed rule.
Reviewed by: mjg@
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D4110
The point of this is to be able to add RACCT (with RACCT_DISABLED)
to GENERIC, to avoid having to rebuild the kernel to use rctl(8).
Differential Revision: https://reviews.freebsd.org/D2369
Reviewed by: kib@
MFC after: 1 month
Relnotes: yes
Sponsored by: The FreeBSD Foundation
The goal here is to provide one place altering process credentials.
This eases debugging and opens up posibilities to do additional work when such
an action is performed.
jail's creation parameters. This allows the kernel version to be reliably
spoofed within the jail whether examined directly with sysctl or
indirectly with the uname -r and -K options.
The values can only be set at jail creation time, to eliminate the need
for any locking when accessing the values via sysctl.
The overridden values are inherited by nested jails (unless the config for
the nested jails also overrides the values).
There is no sanity or range checking, other than disallowing an empty
release string or a zero release date, by design. The system
administrator is trusted to set sane values. Setting values that are
newer than the actual running kernel will likely cause compatibility
problems.
Differential Revision: https://reviews.freebsd.org/D1948
Relnotes: yes
AppleTalk was a network transport protocol for Apple Macintosh devices
in 80s and then 90s. Starting with Mac OS X in 2000 the AppleTalk was
a legacy protocol and primary networking protocol is TCP/IP. The last
Mac OS X release to support AppleTalk happened in 2009. The same year
routing equipment vendors (namely Cisco) end their support.
Thus, AppleTalk won't be supported in FreeBSD 11.0-RELEASE.
IPX was a network transport protocol in Novell's NetWare network operating
system from late 80s and then 90s. The NetWare itself switched to TCP/IP
as default transport in 1998. Later, in this century the Novell Open
Enterprise Server became successor of Novell NetWare. The last release
that claimed to still support IPX was OES 2 in 2007. Routing equipment
vendors (e.g. Cisco) discontinued support for IPX in 2011.
Thus, IPX won't be supported in FreeBSD 11.0-RELEASE.
r261266:
Add a jail parameter, allow.kmem, which lets jailed processes access
/dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
This in conjunction with changing the drm driver's permission check from
PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.
/dev/kmem and related devices (i.e. grants PRIV_IO and PRIV_KMEM_WRITE).
This in conjunction with changing the drm driver's permission check from
PRIV_DRIVER to PRIV_KMEM_WRITE will allow a jailed Xorg server.
Submitted by: netchild
MFC after: 1 week
in one of the many layers of indirection and shims through stable/7
in jail_handle_ips(). When it was cleaned up and unified through
kern_jail() for 8.x, the byte order swap was lost.
This only matters for ancient binaries that call jail(2) themselves
internally.
to this event, adding if_var.h to files that do need it. Also, include
all includes that now are included due to implicit pollution via if_var.h
Sponsored by: Netflix
Sponsored by: Nginx, Inc.
apply to most jails but do apply to vnet jails. This includes adding
a new sysctl "security.jail.vnet" to identify vnet jails.
PR: conf/149050
Submitted by: mdodd
MFC after: 3 days
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
allow.mount.zfs:
allow mounting the zfs filesystem inside a jail
This way the permssions for mounting all current VFCF_JAIL filesystems
inside a jail are controlled wia allow.mount.* jail parameters.
Update sysctl descriptions.
Update jail(8) and zfs(8) manpages.
TODO: document the connection of allow.mount.* and VFCF_JAIL for kernel
developers
MFC after: 10 days
a new jail parameter node with the following parameters:
allow.mount.devfs:
allow mounting the devfs filesystem inside a jail
allow.mount.nullfs:
allow mounting the nullfs filesystem inside a jail
Both parameters are disabled by default (equals the behavior before
devfs and nullfs in jails). Administrators have to explicitly allow
mounting devfs and nullfs for each jail. The value "-1" of the
devfs_ruleset parameter is removed in favor of the new allow setting.
Reviewed by: jamie
Suggested by: pjd
MFC after: 2 weeks
A new jail(8) option "devfs_ruleset" defines the ruleset enforcement for
mounting devfs inside jails. A value of -1 disables mounting devfs in
jails, a value of zero means no restrictions. Nested jails can only
have mounting devfs disabled or inherit parent's enforcement as jails are
not allowed to view or manipulate devfs(8) rules.
Utilizes new functions introduced in r231265.
Reviewed by: jamie
MFC after: 1 month
This function updates path string to vnode's full global path and checks
the size of the new path string against the pathlen argument.
In vfs_domount(), sys_unmount() and kern_jail_set() this new function
is used to update the supplied path argument to the respective global path.
Unbreaks jailed zfs(8) with enforce_statfs set to 1.
Reviewed by: kib
MFC after: 1 month
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.
Reviewed by: rwatson
Approved by: re (bz)
but not removed; decrement it instead when the child jail actually
goes away. This avoids letting the counter go below zero in the case
where dying (pr_uref==0) jails are "resurrected", and an associated
KASSERT panic.
Submitted by: Steven Hartland
Approved by: re (bz)
MFC after: 1 week
32 bits to 64 bits and eliminates the unused mnt_xflag field. The
existing mnt_flag field is completely out of bits, so this update
gives us room to expand. Note that the f_flags field in the statfs
structure is already 64 bits, so the expanded mnt_flag field can
be exported without having to make any changes in the statfs structure.
Approved by: re (bz)
to be assigned to a non-default FIB instance.
You may need to recompile world or ports due to the change of struct ifnet.
Submitted by: cjsp
Submitted by: Alexander V. Chernikov (melifaro ipfw.ru)
(original versions)
Reviewed by: julian
Reviewed by: Alexander V. Chernikov (melifaro ipfw.ru)
MFC after: 2 weeks
X-MFC: use spare in struct ifnet
structure, which acts as a proxy between them. This makes jail rules
persistent, i.e. they can be added before jail gets created, and they
don't disappear when the jail gets destroyed.
on the set of rules it maintains and the current resource usage. It also
privides userland API to manage that ruleset.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)
and per-loginclass resource accounting information, to be used by the new
resource limits code. It's connected to the build, but the code that
actually calls the new functions will come later.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib (earlier version)
it possible for the kernel to track login class the process is assigned to,
which is required for RCTL. This change also make setusercontext(3) call
setloginclass(2) and makes it possible to retrieve current login class using
id(1).
Reviewed by: kib (as part of a larger patch)
Add and export constants of array sizes of jail parameters as compiled into
the kernel.
This is the least intrusive way to allow kvm to read the (sparse) arrays
independent of the options the kernel was compiled with.
Reviewed by: jhb (originally)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Sponsored by: CK Software GmbH
for upcoming 64-bit PowerPC and MIPS support. This renames the COMPAT_IA32
option to COMPAT_FREEBSD32, removes some IA32-specific code from MI parts
of the kernel and enhances the freebsd32 compatibility code to support
big-endian platforms.
Reviewed by: kib, jhb
whether to use source address selection (default) or the primary
jail address for unbound outgoing connections.
This is intended to be used by people upgrading from single-IP
jails to multi-IP jails but not having to change firewall rules,
application ACLs, ... but to force their connections (unless
otherwise changed) to the primry jail IP they had been used for
years, as well as for people prefering to implement similar policies.
Note that for IPv6, if configured incorrectly, this might lead to
scope violations, which single-IPv6 jails could as well, as by the
design of jails. [1]
Reviewed by: jamie, hrs (ipv6 part)
Pointed out by: hrs [1]
MFC After: 2 weeks
Asked for by: Jase Thew (bazerka beardz.net)
- name some columns more closely to the user space variables,
as we do for host.* or allow.* (in the listing) already.
- print pr_childmax (children.max).
- prefix hex values with 0x.
MFC after: 3 weeks
Fix some wrong usages.
Note: this does not affect generated binaries as this argument is not used.
PR: 137213
Submitted by: Eygene Ryabinkin (initial version)
MFC after: 1 month
if (jailed(cred))
left. If you are running with a vnet (virtual network stack) those will
return true and defer you to classic IP-jails handling and thus things
will be "denied" or returned with an error.
Work around this problem by introducing another "jailed()" function,
jailed_without_vnet(), that also takes vnets into account, and permits
the calls, should the jail from the given cred have its own virtual
network stack.
We cannot change the classic jailed() call to do that, as it is used
outside the network stack as well.
Discussed with: julian, zec, jamie, rwatson (back in Sept)
MFC after: 5 days
leaves behind an orphaned vnet. This change ensures that such vnets get
released.
This change affects only options VIMAGE builds.
Submitted by: jamie
Discussed with: bz
Approved by: re (rwatson), julian (mentor)
MFC after: 3 days