Ensure MAC modules are inserted in order that they are registered.
Reviewed by: markj
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39589
Some 32bit apps may need to be able to use
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL
MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL
Therefore compat32 support is required.
Obtained from: Juniper Networks, Inc.
Ensure veriexec opens the file before doing any read operations.
When the MAC_VERIEXEC_CHECK_PATH_SYSCALL syscall is requested, veriexec
needs to open the file before calling mac_veriexec_check_vp. This is to
ensure any set up is done by the file system. Most file systems do not
explicitly need an open, but some (e.g. virtfs) require initialization
of access tokens (file identifiers, etc.) before doing any read or write
operations.
The evaluate_fingerprint() function needs to ensure it has an open file
for reading in order to evaluate the fingerprint. The ideal solution is
to have a hook after the VOP_OPEN call in vn_open. For now, we open the
file for reading, envaluate the fingerprint, and close the file. While
this leaves a potential hole that could possibly be taken advantage of
by a dedicated aversary, this code path is not typically visited often
in our use cases, as we primarily encounter verified mounts and not
individual files. This should be considered a temporary workaround until
discussions about the post-open hook have concluded and the hook becomes
available.
Add MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL and
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL to mac_veriexec_syscall so we can
fetch and check label contents in an unconstrained manner.
Add a check for PRIV_VERIEXEC_CONTROL to do ioctl on /dev/veriexec
Make it clear that trusted process cannot be debugged. Attempts to debug
a trusted process already fail, but the failure path is very obscure.
Add an explicit check for VERIEXEC_TRUSTED in
mac_veriexec_proc_check_debug.
We need mac_veriexec_priv_check to not block PRIV_KMEM_WRITE if
mac_priv_gant() says it is ok.
Reviewed by: sjg
Obtained from: Juniper Networks, Inc.
Allow other MAC modules to override some veriexec checks.
We need two new privileges:
PRIV_VERIEXEC_DIRECT process wants to override 'indirect' flag
on interpreter
PRIV_VERIEXEC_NOVERIFY typically associated with PRIV_VERIEXEC_DIRECT
allow override of O_VERIFY
We also need to check for PRIV_VERIEXEC_NOVERIFY override
for FINGERPRINT_NODEV and FINGERPRINT_NOENTRY.
This will only happen if parent had PRIV_VERIEXEC_DIRECT override.
This allows for MAC modules to selectively allow some applications to
run without verification.
Needless to say, this is extremely dangerous and should only be used
sparingly and carefully.
Obtained from: Juniper Networks, Inc.
Reviewers: sjg
Subscribers: imp, dab
Differential Revision: https://reviews.freebsd.org/D39537
Currently, sysctls which enable KDB in some way are flagged with
CTLFLAG_SECURE, meaning that you can't modify them if securelevel > 0.
This is so that KDB cannot be used to lower a running system's
securelevel, see commit 3d7618d8bf. However, the newer mac_ddb(4)
restricts DDB operations which could be abused to lower securelevel
while retaining some ability to gather useful debugging information.
To enable the use of KDB (specifically, DDB) on systems with a raised
securelevel, change the KDB sysctl policy: rather than relying on
CTLFLAG_SECURE, add a check of the current securelevel to kdb_trap().
If the securelevel is raised, only pass control to the backend if MAC
specifically grants access; otherwise simply check to see if mac_ddb
vetoes the request, as before.
Add a new secure sysctl, debug.kdb.enter_securelevel, to override this
behaviour. That is, the sysctl lets one enter a KDB backend even with a
raised securelevel, so long as it is set before the securelevel is
raised.
Reviewed by: mhorne, stevek
MFC after: 1 month
Sponsored by: Juniper Networks
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D37122
It got disabled in 2003:
commit acb18acfec
Author: Poul-Henning Kamp <phk@FreeBSD.org>
Date: Sun Feb 23 18:09:05 2003 +0000
Bracket the kern.vnode sysctl in #ifdef notyet because it results
in massive locking issues on diskless systems.
It is also not clear that this sysctl is non-dangerous in its
requirements for locked down memory on large RAM systems.
There does not seem to be practical use for it and the disabled routine
does not work anyway.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D39127
Make it clear we're checking to see if the target is a verified file and
prevent its replacement if so.
Sponsored by: Netflix
Reviewed by: rpokala
Differential Revision: https://reviews.freebsd.org/D39079
Functions implemented :
- mac_veriexec_vnode_check_unlink: Unlink on a file has been
requested and requires validation. This function prohibits the
deleting a protected file (or deleting one of these hard links, if
any).
- mac_veriexec_vnode_check_rename_from: Rename the file has been
requested and must be validated. This function controls the renaming
of protected file
- mac_veriexec_vnode_check_rename_to: File overwrite rename has been
requested and must be validated. This function prevent overwriting of
a file protected (overwriting by mv command).
The 3 fonctions together aim to control the 'removal' (via unlink) and
the 'mv' on files protected by veriexec. The intention is to reach the
functional level of NetBSD veriexec.
Add sysctl node security.mac.veriexec.unlink to toggle control on
syscall unlink.
Add tunable kernel variable security.mac.veriexec.block_unlink to toggle
unlink protection. Add the corresponding read-only sysctl.
[ tidied up commit message, trailing whitespace, long lines, { placement ]
Reviewed by: sjg, imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/613
Summary:
Port the MAC modules to use the IfAPI APIs as part of this.
Sponsored by: Juniper Networks, Inc.
Reviewed by: glebius
Differential Revision: https://reviews.freebsd.org/D38197
2449b9e5fe introduced API changes
that require ensuring that loadable MAC modules use the matching API.
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
o Assert that every protosw has pr_attach. Now this structure is
only for socket protocols declarations and nothing else.
o Merge struct pr_usrreqs into struct protosw. This was suggested
in 1996 by wollman@ (see 7b187005d1), and later reiterated
in 2006 by rwatson@ (see 6fbb9cf860).
o Make struct domain hold a variable sized array of protosw pointers.
For most protocols these pointers are initialized statically.
Those domains that may have loadable protocols have spacers. IPv4
and IPv6 have 8 spacers each (andre@ dff3237ee5).
o For inetsw and inet6sw leave a comment noting that many protosw
entries very likely are dead code.
o Refactor pf_proto_[un]register() into protosw_[un]register().
o Isolate pr_*_notsupp() methods into uipc_domain.c
Reviewed by: melifaro
Differential revision: https://reviews.freebsd.org/D36232
The validator always returned true due to an incorrect check.
Reviewed by: mhorne, imp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36125
Make most AST handlers dynamically registered. This allows to have
subsystem-specific handler source located in the subsystem files,
instead of making subr_trap.c aware of it. For instance, signal
delivery code on return to userspace is now moved to kern_sig.c.
Also, it allows to have some handlers designated as the cleanup (kclear)
type, which are called both at AST and on thread/process exit. For
instance, ast(), exit1(), and NFS server no longer need to be aware
about UFS softdep processing.
The dynamic registration also allows third-party modules to register AST
handlers if needed. There is one caveat with loadable modules: the
code does not make any effort to ensure that the module is not unloaded
before all threads processed through AST handler in it. In fact, this
is already present behavior for hwpmc.ko and ufs.ko. I do not think it
is worth the efforts and the runtime overhead to try to fix it.
Reviewed by: markj
Tested by: emaste (arm64), pho
Discussed with: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D35888
This fixes the build of at least i386 MINIMAL which was failing with
the error:
sys/security/mac_ddb/mac_ddb.c:200:15: error: use of undeclared identifier 'vnet'; did you mean 'int'?
if ((void *)vnet == (void *)addr)
^~~~
int
Sponsored by: DARPA
These global objects are easy to validate, so provide the helper
functions to do so and include these commands in the allow lists.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35372
Generally, access to the kernel debugger is considered to be unsafe from
a security perspective since it presents an unrestricted interface to
inspect or modify the system state, including sensitive data such as
signing keys.
However, having some access to debugger functionality on production
systems may be useful in determining the cause of a panic or hang.
Therefore, it is desirable to have an optional policy which allows
limited use of ddb(4) while disabling the functionality which could
reveal system secrets.
This loadable MAC module allows for the use of some ddb(4) commands
while preventing the execution of others. The commands have been broadly
grouped into three categories:
- Those which are 'safe' and will not emit sensitive data (e.g. trace).
Generally, these commands are deterministic and don't accept
arguments.
- Those which are definitively unsafe (e.g. examine <addr>, search
<addr> <value>)
- Commands which may be safe to execute depending on the arguments
provided (e.g. show thread <addr>).
Safe commands have been flagged as such with the DB_CMD_MEMSAFE flag.
Commands requiring extra validation can provide a function to do so.
For example, 'show thread <addr>' can be used as long as addr can be
checked against the system's list of process structures.
The policy also prevents debugger backends other than ddb(4) from
executing, for example gdb(4).
Reviewed by: markj, pauamma_gundo.com (manpages)
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35371
Add three simple hooks to the debugger allowing for a loaded MAC policy
to intervene if desired:
1. Before invoking the kdb backend
2. Before ddb command registration
3. Before ddb command execution
We extend struct db_command with a private pointer and two flag bits
reserved for policy use.
Reviewed by: markj
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D35370
Writes to sysctls flagged with CTLFLAG_SECURE are blocked if the appropriate secure level is set. mac_veriexec does not behave this way, it blocks such sysctls in read-only mode as well.
This change aims to make mac_veriexec behave like secure levels, as it was meant by the original commit ed377cf41.
Reviewed by: sjg
Differential revision: https://reviews.freebsd.org/D34327
Obtained from: Stormshield
Some filesystems do not fill out certain optional vattr fields. To
ensure that they do not get copied out to userspace uninitialized, use
VATTR_NULL to provide default values.
Reported by: KMSAN
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
With the mac_priority(4) realtime policy active, users and processes in
the realtime group may promote existing threads and processes to
realtime scheduling priority. Extend the privileges granted to
PRIV_SCHED_SETPOLICY which allows explicit creation of new realtime
threads.
One use case of this is when the pthread scheduling policy is set to
SCHED_RR or SCHED_FIFO via pthread_attr_setschedpolicy(...) before
calling pthread_create(...). I ran into this when testing audio software
with realtime threads, particularly audio/ardour6.
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33393
Add an idletime user group that allows non-root users to run processes
with idle scheduling priority. Privileges are granted by a MAC policy in
the mac_priority module. For this purpose, the kernel privilege
PRIV_SCHED_IDPRIO was added to sys/priv.h (kernel module ABI change).
Deprecate the system wide sysctl(8) knob
security.bsd.unprivileged_idprio which lets any user run idle priority
processes, regardless of context. While the knob is still working, it is
marked as deprecated in the description and in the man pages.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338
The privilege allows the holder to assign idle priority type to thread
or process.
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33338
This is a MAC policy module that grants scheduling privileges based on
group membership. Users or processes in the group realtime (gid 47) are
allowed to run threads and processes with realtime scheduling priority.
For timing-sensitive, low-latency software like audio/jack, running with
realtime priority helps to avoid stutter and gaps.
PR: 239125
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D33191
fspacectl(2) is a system call to provide space management support to
userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the
deallocation. vn_deallocate(9) is a public KPI for kmods' use.
The purpose of proposing a new system call, a KPI and a VOP call is to
allow bhyve or other hypervisor monitors to emulate the behavior of SCSI
UNMAP/NVMe DEALLOCATE on a plain file.
fspacectl(2) comprises of cmd and flags parameters to specify the
space management operation to be performed. Currently cmd has to be
SPACECTL_DEALLOC, and flags has to be 0.
fo_fspacectl is added to fileops.
VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation
of VOP_DEALLOCATE(9) is provided.
Sponsored by: The FreeBSD Foundation
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D28347
mac_veriexec sets its version to 1, but the mac_veriexec_shaX modules which depend on it expect MAC_VERIEXEC_VERSION = 2.
Be consistent and use MAC_VERIEXEC_VERSION everywhere.
This unbreaks loading of mac_veriexec modules at boot time.
Authored by: Kornel Duleba <mindal@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield
Differential Revision: https://reviews.freebsd.org/D31268
When packet is a SYN packet, we don't need to modify any existing PCB.
Normally SYN arrives on a listening socket, we either create a syncache
entry or generate syncookie, but we don't modify anything with the
listening socket or associated PCB. Thus create a new PCB lookup
mode - rlock if listening. This removes the primary contention point
under SYN flood - the listening socket PCB.
Sidenote: when SYN arrives on a synchronized connection, we still
don't need write access to PCB to send a challenge ACK or just to
drop. There is only one exclusion - tcptw recycling. However,
existing entanglement of tcp_input + stacks doesn't allow to make
this change small. Consider this patch as first approach to the problem.
Reviewed by: rrs
Differential revision: https://reviews.freebsd.org/D29576
place -- in the VOP rather than vn_setexttr() -- and that it is for historic
reasons. We might wish to relocate it in due course, but this way at least
we document the asymmetry.
pipes get stated all thet time and this avoidably contributed to contention.
The pipe lock is only held to accomodate MAC and to check the type.
Since normally there is no probe for pipe stat depessimize this by having the
flag.
The pipe_state field gets modified with locks held all the time and it's not
feasible to convert them to use atomic store. Move the type flag away to a
separate variable as a simple cleanup and to provide stable field to read.
Use short for both fields to avoid growing the struct.
While here short-circuit MAC for pipe_poll as well.