we move towards netinet as a pseudo-object for the MAC Framework.
Rename 'mac_create_mbuf_linklayer' to 'mac_mbuf_create_linklayer' to
reflect general object-first ordering preference.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
This change introduces audit_proc_coredump() which is called by coredump(9)
to create an audit record for the coredump event. When a process
dumps a core, it could be security relevant. It could be an indicator that
a stack within the process has been overflowed with an incorrectly constructed
malicious payload or a number of other events.
The record that is generated looks like this:
header,111,10,process dumped core,0,Thu Oct 25 19:36:29 2007, + 179 msec
argument,0,0xb,signal
path,/usr/home/csjp/test.core
subject,csjp,csjp,staff,csjp,staff,1101,1095,50457,10.37.129.2
return,success,1
trailer,111
- We allocate a completely new record to make sure we arent clobbering
the audit data associated with the syscall that produced the core
(assuming the core is being generated in response to SIGABRT and not
an invalid memory access).
- Shuffle around expand_name() so we can use the coredump name at the very
beginning of the coredump call. Make sure we free the storage referenced
by "name" if we need to bail out early.
- Audit both successful and failed coredump creation efforts
Obtained from: TrustedBSD Project
Reviewed by: rwatson
MFC after: 1 month
primary object type, and then by secondarily by method name. This sorts
entry points relating to particular objects, such as pipes, sockets, and
vnodes together.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
from mac_vfs.c to mac_process.c to join other functions that setup up
process labels for specific purposes. Unlike the two proc create calls,
this call is intended to run after creation when a process registers as
the NFS daemon, so remains an _associate_ call..
Obtained from: TrustedBSD Project
than mac_<policy>_whatever, as this shortens the names and makes the code
a bit easier to read.
When dealing with label structures, name variables 'mb', 'ml', 'mm rather
than the longer 'mac_biba', 'mac_lomac', and 'mac_mls', likewise making
the code a little easier to read.
Obtained from: TrustedBSD Project
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:
mac_<object>_<method/action>
mac_<object>_check_<method/action>
The previous naming scheme was inconsistent and mostly
reversed from the new scheme. Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier. Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods. Also simplify, slightly,
some entry point names.
All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.
Sponsored by: SPARTA (original patches against Mac OS X)
Obtained from: TrustedBSD Project, Apple Computer
state is stored in an extended subject token now. Make sure
that we are using the extended data. This fixes the termID
for process tokens.
Obtained from: TrustedBSD Project
Discussed with: rwatson
MFC after: 1 week
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.
on multiple different audit pipes. The old method used cv_signal()
which would result in only one thread being woken up after we
appended a record to it's queue. This resulted in un-timely wake-ups
when processing audit records real-time.
- Assign PSOCK priority to threads that have been sleeping on a read(2).
This is the same priority threads are woken up with when they select(2)
or poll(2). This yields fairness between various forms of sleep on
the audit pipes.
Obtained from: TrustedBSD Project
Discussed with: rwatson
MFC after: 1 week
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
previously the sched_lock. These bugs have existed for some time.
- Allow swapout to try each thread in a process individually and then
swapin the whole process if any of these fail. This allows us to move
most scheduler related swap flags into td_flags.
- Keep ki_sflag for backwards compat but change all in source tools to
use the new and more correct location of P_INMEM.
Reported by: pho
Reviewed by: attilio, kib
Approved by: re (kensmith)
point to mac_check_vnode_unlink(), reflecting UNIX naming conventions.
This is the first of several commits to synchronize the MAC Framework
in FreeBSD 7.0 with the MAC Framework as it will appear in Mac OS X
Leopard.
Reveiwed by: csjp, Samy Bahra <sbahra at gwu dot edu>
Submitted by: Jacques Vidrine <nectar at apple dot com>
Obtained from: Apple Computer, Inc.
Sponsored by: SPARTA, SPAWAR
Approved by: re (bmah)
previously conditionally acquired Giant based on debug.mpsafenet. As that
has now been removed, they are no longer required. Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.
While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option. Clean up some related gotos for
consistency.
Reviewed by: bz, csjp
Tested by: kris
Approved by: re (kensmith)
vnode label for a check rather than the directory vnode label a second
time.
MFC after: 3 days
Submitted by: Zhouyi ZHOU <zhouzhouyi at FreeBSD dot org>
Reviewed by: csjp
Sponsored by: Google Summer of Code 2007
Approved by: re (bmah)
- Sort copyrights by date.
- Re-wrap, and in some cases, fix comments.
- Fix tabbing, white space, remove extra blank lines.
- Remove commented out debugging printfs.
Approved by: re (kensmith)
and replace with software-testable sysctl node (security.audit) that
can be used to detect kernel audit support.
Obtained from: TrustedBSD Project
Approved by: re (kensmith)
- In audit_bsm.c, make sure all the arguments: ARG_AUID, ARG_ASID, ARG_AMASK,
and ARG_TERMID{_ADDR} are valid before auditing their arguments. (This is done
for both setaudit and setaudit_addr.
- Audit the arguments passed to setaudit_addr(2)
- AF_INET6 does not equate to AU_IPv6. Change this in au_to_in_addr_ex() so the
audit token is created with the correct type. This fixes the processing of the
in_addr_ex token in users pace.
- Change the size of the token (as generated by the kernel) from 5*4 bytes to
4*4 bytes (the correct size of an ip6 address)
- Correct regression from ucred work which resulted in getaudit() not returning
E2BIG if the subject had an ip6 termid
- Correct slight regression in getaudit(2) which resulted in the size of a pointer
being passed instead of the size of the structure. (This resulted in invalid
auditinfo data being returned via getaudit(2))
Reviewed by: rwatson
Approved by: re@ (kensmith)
Obtained from: TrustedBSD Project
MFC after: 1 month
mpo_check_proc_setaudit_addr to be used when controlling use of
setaudit_addr(), rather than mpo_check_proc_setaudit(), which takes a
different argument type.
Reviewed by: csjp
Approved by: re (kensmith)
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.
Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.
We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths. Do, however, move those prototypes to priv.h.
Reviewed by: csjp
Obtained from: TrustedBSD Project
embedded storage in struct ucred. This allows audit state to be cached
with the thread, avoiding locking operations with each system call, and
makes it available in asynchronous execution contexts, such as deep in
the network stack or VFS.
Reviewed by: csjp
Approved by: re (kensmith)
Obtained from: TrustedBSD Project
- Use thread_lock() rather than sched_lock for per-thread scheduling
sychronization.
- Use the per-process spinlock rather than the sched_lock for per-process
scheduling synchronization.
Tested by: kris, current@
Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
argument from being file descriptor index into the pointer to struct file:
part 2. Convert calls missed in the first big commit.
Noted by: rwatson
Pointy hat to: kib
remove associated comments.
Slip audit_file_rotate_wait assignment in audit_rotate_vnode() before
the drop of the global audit mutex.
Obtained from: TrustedBSD Project
inline it when needed already, and the symbol is also required outside of
audit.c. This silences a new gcc warning on the topic of using __inline__
instead of __inline.
MFC after: 3 days
where similar data structures exist to support devfs and the MAC
Framework, but are named differently.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
variable name conventions for arguments passed into the framework --
for example, name network interfaces 'ifp', sockets 'so', mounts 'mp',
mbufs 'm', processes 'p', etc, wherever possible. Previously there
was significant variation in this regard.
Normalize copyright lists to ranges where sensible.
labels: the mount label (label of the mountpoint) and the fs label (label
of the file system). In practice, policies appear to only ever use one,
and the distinction is not helpful.
Combine mnt_mntlabel and mnt_fslabel into a single mnt_label, and
eliminate extra machinery required to maintain the additional label.
Update policies to reflect removal of extra entry points and label.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
the introduction of priv(9) and MAC Framework entry points for privilege
checking/granting. These entry points exactly aligned with privileges and
provided no additional security context:
- mac_check_sysarch_ioperm()
- mac_check_kld_unload()
- mac_check_settime()
- mac_check_system_nfsd()
Add mpo_priv_check() implementations to Biba and LOMAC policies, which,
for each privilege, determine if they can be granted to processes
considered unprivileged by those two policies. These mostly, but not
entirely, align with the set of privileges granted in jails.
Obtained from: TrustedBSD Project
- Redistribute counter declarations to where they are used, rather than at
the file header, so it's more clear where we do (and don't) have
counters.
- Add many more counters, one per policy entry point, so that many
individual access controls and object life cycle events are tracked.
- Perform counter increments for label destruction explicitly in entry
point functions rather than in LABEL_DESTROY().
- Use LABEL_INIT() instead of SLOT_SET() directly in label init functions
to be symmetric with destruction.
- Align counter names more carefully with entry point names.
- More constant and variable name normalization.
Obtained from: TrustedBSD Project
- Add a more detailed comment describing the mac_test policy.
- Add COUNTER_DECL() and COUNTER_INC() macros to declare and manage
various test counters, reducing the verbosity of the test policy
quite a bit.
- Add LABEL_CHECK() macro to abbreviate normal validation of labels.
Unlike the previous check macros, this checks for a NULL label and
doesn't test NULL labels. This means that optionally passed labels
will now be handled automatically, although in the case of optional
credentials, NULL-checks are still required.
- Add LABEL_DESTROY() macro to abbreviate the handling of label
validation and tear-down.
- Add LABEL_NOTFREE() macro to abbreviate check for non-free labels.
- Normalize the names of counters, magic values.
- Remove unused policy "enabled" flag.
Obtained from: TrustedBSD Project
calls. Add MAC Framework entry points and MAC policy entry points for
audit(), auditctl(), auditon(), setaudit(), aud setauid().
MAC Framework entry points are only added for audit system calls where
additional argument context may be useful for policy decision-making; other
audit system calls without arguments may be controlled via the priv(9)
entry points.
Update various policy modules to implement audit-related checks, and in
some cases, other missing system-related checks.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, Inc.
tokens. Currently, we do not support the set{get}audit_addr(2) system
calls which allows processes like sshd to set extended or ip6
information for subject tokens.
The approach that was taken was to change the process audit state
slightly to use an extended terminal ID in the kernel. This allows
us to store both IPv4 IPv6 addresses. In the case that an IPv4 address
is in use, we convert the terminal ID from an struct auditinfo_addr to
a struct auditinfo.
If getaudit(2) is called when the subject is bound to an ip6 address,
we return E2BIG.
- Change the internal audit record to store an extended terminal ID
- Introduce ARG_TERMID_ADDR
- Change the kaudit <-> BSM conversion process so that we are using
the appropriate subject token. If the address associated with the
subject is IPv4, we use the standard subject32 token. If the subject
has an IPv6 address associated with them, we use an extended subject32
token.
- Fix a couple of endian issues where we do a couple of byte swaps when
we shouldn't be. IP addresses are already in the correct byte order,
so reading the ip6 address 4 bytes at a time and swapping them results
in in-correct address data. It should be noted that the same issue was
found in the openbsm library and it has been changed there too on the
vendor branch
- Change A_GETPINFO to use the appropriate structures
- Implement A_GETPINFO_ADDR which basically does what A_GETPINFO does,
but can also handle ip6 addresses
- Adjust get{set}audit(2) syscalls to convert the data
auditinfo <-> auditinfo_addr
- Fully implement set{get}audit_addr(2)
NOTE: This adds the ability for processes to correctly set extended subject
information. The appropriate userspace utilities still need to be updated.
MFC after: 1 month
Reviewed by: rwatson
Obtained from: TrustedBSD
and flags with an sxlock. This leads to a significant and measurable
performance improvement as a result of access to shared locking for
frequent lookup operations, reduced general overhead, and reduced overhead
in the event of contention. All of these are imported for threaded
applications where simultaneous access to a shared file descriptor array
occurs frequently. Kris has reported 2x-4x transaction rate improvements
on 8-core MySQL benchmarks; smaller improvements can be expected for many
workloads as a result of reduced overhead.
- Generally eliminate the distinction between "fast" and regular
acquisisition of the filedesc lock; the plan is that they will now all
be fast. Change all locking instances to either shared or exclusive
locks.
- Correct a bug (pointed out by kib) in fdfree() where previously msleep()
was called without the mutex held; sx_sleep() is now always called with
the sxlock held exclusively.
- Universally hold the struct file lock over changes to struct file,
rather than the filedesc lock or no lock. Always update the f_ops
field last. A further memory barrier is required here in the future
(discussed with jhb).
- Improve locking and reference management in linux_at(), which fails to
properly acquire vnode references before using vnode pointers. Annotate
improper use of vn_fullpath(), which will be replaced at a future date.
In fcntl(), we conservatively acquire an exclusive lock, even though in
some cases a shared lock may be sufficient, which should be revisited.
The dropping of the filedesc lock in fdgrowtable() is no longer required
as the sxlock can be held over the sleep operation; we should consider
removing that (pointed out by attilio).
Tested by: kris
Discussed with: jhb, kris, attilio, jeff
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.
Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.
it isn't used in the access control decision. This became visible to
Coverity with the change to a function call retrieving label values.
Coverity CID: 1723
LABEL_TO_SLOT() macro used by policy modules to query and set label data
in struct label. Instead of using a union, store an intptr_t, simplifying
the API.
Update policies: in most cases this required only small tweaks to current
wrapper macros. In two cases, a single wrapper macros had to be split into
separate get and set macros.
Move struct label definition from _label.h to mac_internal.h and remove
_label.h. With this change, policies may now treat struct label * as
opaque, allowing us to change the layout of struct label without breaking
the policy module ABI. For example, we could make the maximum number of
policies with labels modifiable at boot-time rather than just at
compile-time.
Obtained from: TrustedBSD Project
Don't perform a nested include of _label.h in mac.h, as mac.h now
describes only the user API to MAC, and _label.h defines the in-kernel
representation of MAC labels.
Remove mac.h includes from policies and MAC framework components that do
not use userspace MAC API definitions.
Add _KERNEL inclusion checks to mac_internal.h and mac_policy.h, as these
are kernel-only include files
Obtained from: TrustedBSD Project
(due to an early reset or the like), remember to unlock the socket lock.
This will not occur in 7-CURRENT, but could in theory occur in 6-STABLE.
MFC after: 1 week
been introduced to the MAC framework:
mpo_associate_nfsd_label
mpo_create_mbuf_from_firewall
mpo_check_system_nfsd
mpo_check_vnode_mmap_downgrade
mpo_check_vnode_mprotect
mpo_init_syncache_label
mpo_destroy_syncache_label
mpo_init_syncache_from_inpcb
mpo_create_mbuf_from_syncache
MFC after: 2 weeks [1]
[1] The syncache related entry points will NOT be MFCed as the changes in
the syncache subsystem are not present in RELENG_6 yet.
exclusive access if there is at least one thread waiting for it to
become available. This may significantly reduce overhead by reducing
the number of unnecessary wakeups issued whenever the framework becomes
idle.
Annotate that we still signal the CV more than necessary and should
fix this.
Obtained from: TrustedBSD Project
Reviewed by: csjp
Tested by: csjp
manipulation is visible to the subject process. Remove XXX comments
suggesting this.
Convert one XXX on a difference from Darwin into a note: it's not a
bug, it's a feature.
Obtained from: TrustedBSD Project
- Replace XXX with Note: in several cases where observations are made about
future functionality rather than problems or bugs.
- Remove an XXX comment about byte order and au_to_ip() -- IP headers must
be submitted in network byte order. Add a comment to this effect.
- Mention that we don't implement select/poll for /dev/audit.
Obtained from: TrustedBSD Project
kernel<->policy ABI version. Add a comment to the definition describing
it and listing known versions. Modify MAC_POLICY_SET() to reference the
current kernel version by name rather than by number.
Staticize mac_late, which is used only in mac_framework.c.
Obtained from: TrustedBSD Project
mac_framework.c Contains basic MAC Framework functions, policy
registration, sysinits, etc.
mac_syscalls.c Contains implementations of various MAC system calls,
including ENOSYS stubs when compiling without options
MAC.
Obtained from: TrustedBSD Project
consumes and implements, as well as the location of the framework and
policy modules.
Refactor MAC Framework versioning a bit so that the current ABI version can
be exported via a read-only sysctl.
Further update comments relating to locking/synchronization.
Update copyright to take into account these and other recent changes.
Obtained from: TrustedBSD Project
Framework and security modules, to src/sys/security/mac/mac_policy.h,
completing the removal of kernel-only MAC Framework include files from
src/sys/sys. Update the MAC Framework and MAC policy modules. Delete
the old mac_policy.h.
Third party policy modules will need similar updating.
Obtained from: TrustedBSD Project
subsystems will be a property of policy modules, which may require
access control check entry points to be invoked even when not actively
enforcing (i.e., to track information flow without providing
protection).
Obtained from: TrustedBSD Project
Suggested by: Christopher dot Vance at sparta dot com
than from the slab, but don't.
Document mac_mbuf_to_label(), mac_copy_mbuf_tag().
Clean up white space/wrapping for other comments.
Obtained from: TrustedBSD Project
Exapnd comments on System V IPC labeling methods, which could use improved
consistency with respect to other object types.
Obtained from: TrustedBSD Project
the ifnet itself. The stack copy has been made while holding the mutex
protecting ifnet labels, so copying from the ifnet copy could result in
an inconsistent version being copied out.
Reported by: Todd.Miller@sparta.com
Obtained from: TrustedBSD Project
MFC after: 3 weeks
kernel. This LOR snuck in with some of the recent syncache changes. To
fix this, the inpcb handling was changed:
- Hang a MAC label off the syncache object
- When the syncache entry is initially created, we pickup the PCB lock
is held because we extract information from it while initializing the
syncache entry. While we do this, copy the MAC label associated with
the PCB and use it for the syncache entry.
- When the packet is transmitted, copy the label from the syncache entry
to the mbuf so it can be processed by security policies which analyze
mbuf labels.
This change required that the MAC framework be extended to support the
label copy operations from the PCB to the syncache entry, and then from
the syncache entry to the mbuf.
These functions really should be referencing the syncache structure instead
of the label. However, due to some of the complexities associated with
exposing this syncache structure we operate directly on it's label pointer.
This should be OK since we aren't making any access control decisions within
this code directly, we are merely allocating and copying label storage so
we can properly initialize mbuf labels for any packets the syncache code
might create.
This also has a nice side effect of caching. Prior to this change, the
PCB would be looked up/locked for each packet transmitted. Now the label
is cached at the time the syncache entry is initialized.
Submitted by: andre [1]
Discussed with: rwatson
[1] andre submitted the tcp_syncache.c changes
specific privilege names to a broad range of privileges. These may
require some future tweaking.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
privilege for threads and credentials. Unlike the existing suser(9)
interface, priv(9) exposes a named privilege identifier to the privilege
checking code, allowing more complex policies regarding the granting of
privilege to be expressed. Two interfaces are provided, replacing the
existing suser(9) interface:
suser(td) -> priv_check(td, priv)
suser_cred(cred, flags) -> priv_check_cred(cred, priv, flags)
A comprehensive list of currently available kernel privileges may be
found in priv.h. New privileges are easily added as required, but the
comments on adding privileges found in priv.h and priv(9) should be read
before doing so.
The new privilege interface exposed sufficient information to the
privilege checking routine that it will now be possible for jail to
determine whether a particular privilege is granted in the check routine,
rather than relying on hints from the calling context via the
SUSER_ALLOWJAIL flag. For now, the flag is maintained, but a new jail
check function, prison_priv_check(), is exposed from kern_jail.c and used
by the privilege check routine to determine if the privilege is permitted
in jail. As a result, a centralized list of privileges permitted in jail
is now present in kern_jail.c.
The MAC Framework is now also able to instrument privilege checks, both
to deny privileges otherwise granted (mac_priv_check()), and to grant
privileges otherwise denied (mac_priv_grant()), permitting MAC Policy
modules to implement privilege models, as well as control a much broader
range of system behavior in order to constrain processes running with
root privilege.
The suser() and suser_cred() functions remain implemented, now in terms
of priv_check() and the PRIV_ROOT privilege, for use during the transition
and possibly continuing use by third party kernel modules that have not
been updated. The PRIV_DRIVER privilege exists to allow device drivers to
check privilege without adopting a more specific privilege identifier.
This change does not modify the actual security policy, rather, it
modifies the interface for privilege checks so changes to the security
policy become more feasible.
Sponsored by: nCircle Network Security, Inc.
Obtained from: TrustedBSD Project
Discussed on: arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
Alex Lyashkov <umka at sevcity dot net>,
Skip Ford <skip dot ford at verizon dot net>,
Antoine Brodin <antoine dot brodin at laposte dot net>
sockaddr_storage. This structure is defined in RFC 2553 and is a more
semantically correct structure for holding IP and IP6 sockaddr information.
struct sockaddr is not big enough to hold all the required information for
IP6, resulting in truncated addresses et al when auditing IP6 sockaddr
information.
We also need to assume that the sa->sa_len has been validated before the call to
audit_arg_sockaddr() is made, otherwise it could result in a buffer overflow.
This is being done to accommodate auditing of network related arguments (like
connect, bind et al) that will be added soon.
Discussed with: rwatson
Obtained from: TrustedBSD Project
MFC after: 2 weeks
begun with a repo-copy of mac.h to mac_framework.h. sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.
This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA
not trust jails enough to execute audit related system calls. An example of
this is with su(1), or login(1) within prisons. So, if the syscall request
comes from a jail return ENOSYS. This will cause these utilities to operate
as if audit is not present in the kernel.
Looking forward, this problem will be remedied by allowing non privileged
users to maintain and their own audit streams, but the details on exactly how
this will be implemented needs to be worked out.
This change should fix situations when options AUDIT has been compiled into
the kernel, and utilities like su(1), or login(1) fail due to audit system
call failures within jails.
This is a RELENG_6 candidate.
Reported by: Christian Brueffer
Discussed with: rwatson
MFC after: 3 days
written to the audit trail file:
- audit_record_write() now returns void, and all file system specific
error handling occurs inside this function. This pushes error handling
complexity out of the record demux routine that hands off to both the
trail and audit pipes, and makes trail behavior more consistent with
pipes as a record destination.
- Rate limit kernel printfs associated with running low on space. Rate
limit audit triggers for low space. Rate limit printfs for fail stop
events. Rate limit audit worker write error printfs.
- Document in detail the types of limits and space checks we perform, and
combine common cases.
This improves the audit subsystems tolerance to low space conditions by
avoiding toasting the console with printfs are waking up the audit daemon
continuously.
MFC after: 3 days
Obtained from: TrustedBSD Project
other problems while labels were first being added to various kernel
objects. They have outlived their usefulness.
MFC after: 1 month
Suggested by: Christopher dot Vance at SPARTA dot com
Obtained from: TrustedBSD Project
when allocating the record in the first place, allocate the final buffer
when closing the BSM record. At that point, more size information is
available, so a sufficiently large buffer can be allocated.
This allows the kernel to generate audit records in excess of
MAXAUDITDATA bytes, but is consistent with Solaris's behavior. This only
comes up when auditing command line arguments, in which case we presume
the administrator really does want the data as they have specified the
policy flag to gather them.
Obtained from: TrustedBSD Project
MFC after: 3 days
with other commonly used sysctl name spaces, rather than declaring them
all over the place.
MFC after: 1 month
Sponsored by: nCircle Network Security, Inc.
audit pipes. If the kernel record was not selected for the trail or the pipe,
any user supplied record attached to it would be tossed away, resulting in
otherwise selected events being lost.
- Introduce two new masks: AR_PRESELECT_USER_TRAIL AR_PRESELECT_USER_PIPE,
currently we have AR_PRESELECT_TRAIL and AR_PRESELECT_PIPE, which tells
the audit worker that we are interested in the kernel record, with
the additional masks we can determine if either the pipe or trail is
interested in seeing the kernel or user record.
- In audit(2), we unconditionally set the AR_PRESELECT_USER_TRAIL and
AR_PRESELECT_USER_PIPE masks under the assumption that userspace has
done the preselection [1].
Currently, there is work being done that allows the kernel to parse and
preselect user supplied records, so in the future preselection could occur
in either layer. But there is still a few details to work out here.
[1] At some point we need to teach au_preselect(3) about the interests of
all the individual audit pipes.
This is a RELENG_6 candidate.
Reviewed by: rwatson
Obtained from: TrustedBSD Project
MFC after: 1 week
exists to allow the mandatory access control policy to properly initialize
mbufs generated by the firewall. An example where this might happen is keep
alive packets, or ICMP error packets in response to other packets.
This takes care of kernel panics associated with un-initialize mbuf labels
when the firewall generates packets.
[1] I modified this patch from it's original version, the initial patch
introduced a number of entry points which were programmatically
equivalent. So I introduced only one. Instead, we should leverage
mac_create_mbuf_netlayer() which is used for similar situations,
an example being icmp_error()
This will minimize the impact associated with the MFC
Submitted by: mlaier [1]
MFC after: 1 week
This is a RELENG_6 candidate
Add the argument auditing functions for argv and env.
Add kernel-specific versions of the tokenizer functions for the
arg and env represented as a char array.
Implement the AUDIT_ARGV and AUDIT_ARGE audit policy commands to
enable/disable argv/env auditing.
Call the argument auditing from the exec system calls.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
is loaded. This problem stems from the fact that the policy is not properly
initializing the mac label associated with the NFS daemon.
Obtained from: TrustedBSD Project
Discussed with: rwatson
audit record size at run-time, which can be used by the user
process to size the user space buffer it reads into from the audit
pipe.
Perforce change: 105098
Obtained from: TrustedBSD Project
progress the kernel audit code in CVS is considered authoritative.
This will ease $P4$-related merging issues during the CVS loopback.
Obtained from: TrustedBSD Project
we will initialize the label to biba/low for files that have been created
through an NFS RPC. This is a safe default given the default nature of our
NFS implementation, there is not a whole lot of data integrity there by
default. This also fixes kernel panics associated with file creation over NFS
while creating files on filesystems which have multilabel enabled with BIBA
enabled.
MFC after: 2 weeks
Discussed with: rwatson
- Correct audit_arg_socketaddr() argument name from so to sa.
- Assert arguments are non-NULL to many argument capture functions
rather than testing them. This may trip some bugs.
- Assert the process lock is held when auditing process
information.
- Test currecord in several more places.
- Test validity of more arguments with kasserts, such as flag
values when auditing vnode information.
Perforce change: 98825
Obtained from: TrustedBSD Project
whether we have an IPv6 address. Write the term ID as 4 or
16 bytes depending on address type. This change matches the recent
OpenBSM change, and what Solaris does.
Obtained from: TrustedBSD Project
Approved by: rwatson (mentor)
get a consistent snapshot, as well as get consistent values (i.e.,
that p_comm is properly nul-terminated).
Perforce CID: 98824
Obtained from: TrustedBSD Project
process was sucessfully audited. Otherwise, generate the PID
token. This change covers the pid < 0 cases, and pid lookup
failure cases.
Submitted by: wsalamon
Obtained from: TrustedBSD Project
global audit trail configuration. This allows applications consuming
audit trails to specify parameters for which audit records are of
interest, including selecting records not required by the global trail.
Allowing application interest specification without changing the global
configuration allows intrusion detection systems to run without
interfering with global auditing or each other (if multiple are
present). To implement this:
- Kernel audit records now carry a flag to indicate whether they have
been selected by the global trail or by the audit pipe subsystem,
set during record commit, so that this information is available
after BSM conversion when delivering the BSM to the trail and audit
pipes in the audit worker thread asynchronously. Preselection by
either record target will cause the record to be kept.
- Similar changes to preselection when the audit record is created
when the system call is entering: consult both the global trail and
pipes.
- au_preselect() now accepts the class in order to avoid repeatedly
looking up the mask for each preselection test.
- Define a series of ioctls that allow applications to specify whether
they want to track the global trail, or program their own
preselection parameters: they may specify their own flags and naflags
masks, similar to the global masks of the same name, as well as a set
of per-auid masks. They also set a per-pipe mode specifying whether
they track the global trail, or user their own -- the door is left
open for future additional modes. A new ioctl is defined to allow a
user process to flush the current audit pipe queue, which can be used
after reprogramming pre-selection to make sure that only records of
interest are received in future reads.
- Audit pipe data structures are extended to hold the additional fields
necessary to support preselection. By default, audit pipes track the
global trail, so "praudit /dev/auditpipe" will track the global audit
trail even though praudit doesn't program the audit pipe selection
model.
- Comment about the complexities of potentially adding partial read
support to audit pipes.
By using a set of ioctls, applications can select which records are of
interest, and toggle the preselection mode.
Obtained from: TrustedBSD Project
knowledge of user vs. kernel audit records into
audit_worker_process_record(). This largely confines vnode
knowledge to audit_record_write(), but avoids that logic knowing
about BSM as opposed to byte streams. This will allow us to
improve our ability to support real-time audit stream processing
by audit pipe consumers while auditing is disabled, but this
support is not yet complete.
Obtained from: TrustedBSD Project
Break out logic to call audit_record_write() and handle error
conditions into audit_worker_process_record(). This will be the
future home of some logic now present in audit_record_write()
also.
Obtained from: TrustedBSD Project
worker.
Rename audit_commit_cv to audit_watermark_cv, since it is there to
wake up threads waiting on hitting the low watermark. Describe
properly in comment.
Obtained from: TrustedBSD Project
src/sys/security/audit:
- Clarify and clean up AUR_ types to match Solaris.
- Clean up use of host vs. network byte order for IP addresses.
- Remove combined user/kernel implementations of some token creation
calls, such as au_to_file(), header calls, etc.
Obtained from: TrustedBSD Project
pointer prototypes from it into their own typedefs. No functional or
ABI change. This allows policies to declare their own function
prototypes based on a common definition from mac_policy.h rather than
duplicating these definitions.
Obtained from: SEDarwin, SPARTA
MFC after: 1 month
subject: ranges of uid, ranges of gid, jail id
objects: ranges of uid, ranges of gid, filesystem,
object is suid, object is sgid, object matches subject uid/gid
object type
We can also negate individual conditions. The ruleset language is
a superset of the previous language, so old rules should continue
to work.
These changes require a change to the API between libugidfw and the
mac_bsdextended module. Add a version number, so we can tell if
we're running mismatched versions.
Update man pages to reflect changes, add extra test cases to
test_ugidfw.c and add a shell script that checks that the the
module seems to do what we expect.
Suggestions from: rwatson, trhodes
Reviewed by: trhodes
MFC after: 2 months
credential: mac_associate_nfsd_label()
This entry point can be utilized by various Mandatory Access Control policies
so they can properly initialize the label of files which get created
as a result of an NFS operation. This work will be useful for fixing kernel
panics associated with accessing un-initialized or invalid vnode labels.
The implementation of these entry points will come shortly.
Obtained from: TrustedBSD
Requested by: mdodd
MFC after: 3 weeks
branch:
Integrate audit.c to audit_worker.c, so as to migrate the worker
thread implementation to its own .c file.
Populate audit_worker.c using parts now removed from audit.c:
- Move audit rotation global variables.
- Move audit_record_write(), audit_worker_rotate(),
audit_worker_drain(), audit_worker(), audit_rotate_vnode().
- Create audit_worker_init() from relevant parts of audit_init(),
which now calls this routine.
- Recreate audit_free(), which wraps uma_zfree() so that
audit_record_zone can be static to audit.c.
- Unstaticize various types and variables relating to the audit
record queue so that audit_worker can get to them. We may want
to wrap these in accessor methods at some point.
- Move AUDIT_PRINTF() to audit_private.h.
Addition of audit_worker.c to kernel configuration, missed in
earlier submit.
Obtained from: TrustedBSD Project
Add ioctls to audit pipes in order to allow querying of the current
record queue state, setting of the queue limit, and querying of pipe
statistics.
Obtained from: TrustedBSD Project
Change send_trigger() prototype to return an int, so that user
space callers can tell if the message was successfully placed
in the trigger queue. This isn't quite the same as it being
successfully received, but is close enough that we can generate
a more useful warning message in audit(8).
Obtained from: TrustedBSD Project
vnode and a mode and checks if a given access mode is permitted.
This centralises the mac_bsdextended_enabled check and the GETATTR
calls and makes the implementation of the mac policy methods simple.
This should make it easier for us to match vnodes on more complex
attributes than just uid and gid in the future, but for now there
should be no functional change.
Approved/Reviewed by: rwatson, trhodes
MFC after: 1 month
- Include audit_internal.h to get definition of internal audit record
structures, as it's no longer in audit.h. Forward declare au_record
in audit_private.h as not all audit_private.h consumers care about
it.
- Remove __APPLE__ compatibility bits that are subsumed by configure
for user space.
- Don't expose in6_addr internals (non-portable, but also cleaner
looking).
- Avoid nested include of audit.h in audit_private.h.
Obtained from: TrustedBSD Project
be called without any vnode locks held. Remove calls to vn_start_write() and
vn_finished_write() in vnode_pager_putpages() and add these calls before the
vnode lock is obtained to most of the callers that don't already have them.
In the future, we may want to acquire the lock early in the function and
hold it across calls to vn_rdwr(), etc, to avoid multiple acquires.
Spotted by: kris (bugmagnet)
Obtained from: TrustedBSD Project
applications to insert a "tee" in the live audit event stream. Records
are inserted into a per-clone queue so that user processes can pull
discreet records out of the queue. Unlike delivery to disk, audit pipes
are "lossy", dropping records in low memory conditions or when the
process falls behind real-time events. This mechanism is appropriate
for use by live monitoring systems, host-based intrusion detection, etc,
and avoids applications having to dig through active on-disk trails that
are owned by the audit daemon.
Obtained from: TrustedBSD Project
initialization routines into a ctor, tear-down to a dtor, cleaning
up, etc. This will allow audit records to be allocated from
per-cpu caches.
On recent FreeBSD, dropping the audit_mtx around freeing to UMA is
no longer required (at one point it was possible to acquire Giant
on that path), so a mutex-free thread-local drain is no longer
required.
Obtained from: TrustedBSD Project
This should not happen, but with this assert, brueffer and I would
not have spent 45 minutes trying to figure out why he wasn't
seeing audit records with the audit version in CVS.
Obtained from: TrustedBSD Project
an incompatible conversion from a 64-bit pointer to a 32-bit integer on
64-bit platforms. We will investigate whether Solaris uses a 64-bit
token here, or a new record here, in order to avoid truncating user
pointers that are 64-bit. However, in the mean time, truncation is fine
as these are rarely/never used fields in audit records.
Obtained from: TrustedBSD Project
- td_ar to struct thread, which holds the in-progress audit record during
a system call.
- p_au to struct proc, which holds per-process audit state, such as the
audit identifier, audit terminal, and process audit masks.
In the earlier implementation, td_ar was added to the zero'd section of
struct thread. In order to facilitate merging to RELENG_6, it has been
moved to the end of the data structure, requiring explicit
initalization in the thread constructor.
Much help from: wsalamon
Obtained from: TrustedBSD Project
- Management of audit state on processes.
- Audit system calls to configure process and system audit state.
- Reliable audit record queue implementation, audit_worker kernel
thread to asynchronously store records on disk.
- Audit event argument.
- Internal audit data structure -> BSM audit trail conversion library.
- Audit event pre-selection.
- Audit pseudo-device permitting kernel->user upcalls to notify auditd
of kernel audit events.
Much work by: wsalamon
Obtained from: TrustedBSD Project, Apple Computer, Inc.
security.mac.biba.interfaces_equal
If non-zero, all network interfaces be created with the label:
biba/equal(equal-equal)
This is useful where programs which initialize network interfaces
do not have any labeling support. This includes dhclient and ppp. A
long term solution is to add labeling support into dhclient(8)
and ppp(8), and remove this variable.
It should be noted that this behavior is different then setting the:
security.mac.biba.trust_all_interfaces
sysctl variable, as this will create interfaces with a biba/high label.
Lower integrity processes are not able to write to the interface in this
event. The security.mac.biba.interfaces_equal will override
trust_all_interfaces.
The security.mac.biba.interfaces_equal variable will be set to zero
or disabled by default.
MFC after: 2 weeks
- Prefer '_' to ' ', as it results in more easily parsed results in
memory monitoring tools such as vmstat.
- Remove punctuation that is incompatible with using memory type names
as file names, such as '/' characters.
- Disambiguate some collisions by adding subsystem prefixes to some
memory types.
- Generally prefer lower case to upper case.
- If the same type is defined in multiple architecture directories,
attempt to use the same name in additional cases.
Not all instances were caught in this change, so more work is required to
finish this conversion. Similar changes are required for UMA zone names.
framework. This makes Giant protection around MAC operations which inter-
act with VFS conditional, based on the MPSAFE status of the file system.
Affected the following syscalls:
o __mac_get_fd
o __mac_get_file
o __mac_get_link
o __mac_set_fd
o __mac_set_file
o __mac_set_link
-Drop Giant all together in __mac_set_proc because the
mac_cred_mmapped_drop_perms_recurse routine no longer requires it.
-Move conditional Giant aquisitions to after label allocation routines.
-Move the conditional release of Giant to before label de-allocation
routines.
Discussed with: rwatson
provided access to the root file system before the start of the
init process. This was used briefly by SEBSD before it knew about
preloading data in the loader, and using that method to gain
access to data earlier results in fewer inconsistencies in the
approach. Policy modules still have access to the root file system
creation event through the mac_create_mount() entry point.
Removed now, and will be removed from RELENG_6, in order to gain
third party policy dependencies on the entry point for the lifetime
of the 6.x branch.
MFC after: 3 days
Submitted by: Chris Vance <Christopher dot Vance at SPARTA dot com>
Sponsored by: SPARTA
entry points that will be inserted over the life-time of the 6.x branch,
including for:
- New struct file labeling (void * already added to struct file), events,
access control checks.
- Additional struct mount access control checks, internalization/
externalization.
- mac_check_cap()
- System call enter/exit check and event.
- Socket and vnode ioctl entry points.
MFC after: 3 days
process that caused the clone event to take place for the device driver
creating the device. This allows cloned device drivers to adapt the
device node based on security aspects of the process, such as the uid,
gid, and MAC label.
- Add a cred reference to struct cdev, so that when a device node is
instantiated as a vnode, the cloning credential can be exposed to
MAC.
- Add make_dev_cred(), a version of make_dev() that additionally
accepts the credential to stick in the struct cdev. Implement it and
make_dev() in terms of a back-end make_dev_credv().
- Add a new event handler, dev_clone_cred, which can be registered to
receive the credential instead of dev_clone, if desired.
- Modify the MAC entry point mac_create_devfs_device() to accept an
optional credential pointer (may be NULL), so that MAC policies can
inspect and act on the label or other elements of the credential
when initializing the skeleton device protections.
- Modify tty_pty.c to register clone_dev_cred and invoke make_dev_cred(),
so that the pty clone credential is exposed to the MAC Framework.
While currently primarily focussed on MAC policies, this change is also
a prerequisite for changes to allow ptys to be instantiated with the UID
of the process looking up the pty. This requires further changes to the
pty driver -- in particular, to immediately recycle pty nodes on last
close so that the credential-related state can be recreated on next
lookup.
Submitted by: Andrew Reisse <andrew.reisse@sparta.com>
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
MFC after: 1 week
MFC note: Merge to 6.x, but not 5.x for ABI reasons
redundant with respect to existing mbuf copy label routines. Expose
a new mac_copy_mbuf() routine at the top end of the Framework and
use that; use the existing mpo_copy_mbuf_label() routine on the
bottom end.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, SPAWAR
Approved by: re (scottl)
which is invoked from socket() and socketpair(), permitting MAC
policy modules to control the creation of sockets by domain, type, and
protocol.
Obtained from: TrustedBSD Project
Sponsored by: SPARTA, SPAWAR
Approved by: re (scottl)
Requested by: SCC
points to convert _sema() to _sem() for consistency purposes with
respect to the other semaphore-related entry points:
mac_init_sysv_sema() -> mac_init_sysv_sem()
mac_destroy_sysv_sem() -> mac_destroy_sysv_sem()
mac_create_sysv_sema() -> mac_create_sysv_sem()
mac_cleanup_sysv_sema() -> mac_cleanup_sysv_sem()
Congruent changes are made to the policy interface to support this.
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
access to POSIX Semaphores:
mac_init_posix_sem() Initialize label for POSIX semaphore
mac_create_posix_sem() Create POSIX semaphore
mac_destroy_posix_sem() Destroy POSIX semaphore
mac_check_posix_sem_destroy() Check whether semaphore may be destroyed
mac_check_posix_sem_getvalue() Check whether semaphore may be queried
mac_check_possix_sem_open() Check whether semaphore may be opened
mac_check_posix_sem_post() Check whether semaphore may be posted to
mac_check_posix_sem_unlink() Check whether semaphore may be unlinked
mac_check_posix_sem_wait() Check whether may wait on semaphore
Update Biba, MLS, Stub, and Test policies to implement these entry points.
For information flow policies, most semaphore operations are effectively
read/write.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Sponsored by: DARPA, McAfee, SPARTA
Obtained from: TrustedBSD Project
- Introduce a global mutex, mac_bsdextended_mtx, to protect the rule
array and hold this mutex over use and modification of the rule array
and rules.
- Re-order and clean up sysctl_rule so that copyin/copyout/update happen
in the right order (suggested by: jhb done by rwatson).
mac_check_proc_wait(), which control the ability to wait4() specific
processes. This permits MAC policies to limit information flow from
children that have changed label, although has to be handled carefully
due to common programming expectations regarding the behavior of
wait4(). The cr_seeotheruids() check in p_canwait() is #if 0'd for
this reason.
The mac_stub and mac_test policies are updated to reflect these new
entry points.
Sponsored by: SPAWAR, SPARTA
Obtained from: TrustedBSD Project
control socket poll() (select()), fstat(), and accept() operations,
required for some policies:
poll() mac_check_socket_poll()
fstat() mac_check_socket_stat()
accept() mac_check_socket_accept()
Update mac_stub and mac_test policies to be aware of these entry points.
While here, add missing entry point implementations for:
mac_stub.c stub_check_socket_receive()
mac_stub.c stub_check_socket_send()
mac_test.c mac_test_check_socket_send()
mac_test.c mac_test_check_socket_visible()
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
of the socket label to thread-local storage, and replace it with
conditional acquisition based on debug.mpsafenet. Acquire the socket
lock around the copy operation.
In mac_set_fd(), replace the unconditional acquisition of Giant with
the conditional acquisition of Giant based on debug.mpsafenet. The socket
lock is acquired in mac_socket_label_set() so doesn't have to be
acquired here.
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR, SPARTA
of system calls to manipulate elements of the process credential,
including:
setuid() mac_check_proc_setuid()
seteuid() mac_check_proc_seteuid()
setgid() mac_check_proc_setgid()
setegid() mac_check_proc_setegid()
setgroups() mac_check_proc_setgroups()
setreuid() mac_check_proc_setreuid()
setregid() mac_check_proc_setregid()
setresuid() mac_check_proc_setresuid()
setresgid() mac_check_rpoc_setresgid()
MAC checks are performed before other existing security checks; both
current credential and intended modifications are passed as arguments
to the entry points. The mac_test and mac_stub policies are updated.
Submitted by: Samy Al Bahra <samy@kerneled.org>
Obtained from: TrustedBSD Project
MAP_SHARED so that the entry point gets executed un-conditionally.
This may be useful for security policies which want to perform access
control checks around run-time linking.
-add the mmap(2) flags argument to the check_vnode_mmap entry point
so that we can make access control decisions based on the type of
mapped object.
-update any dependent API around this parameter addition such as
function prototype modifications, entry point parameter additions
and the inclusion of sys/mman.h header file.
-Change the MLS, BIBA and LOMAC security policies so that subject
domination routines are not executed unless the type of mapping is
shared. This is done to maintain compatibility between the old
vm_mmap_vnode(9) and these policies.
Reviewed by: rwatson
MFC after: 1 month
security.mac.portacl.autoport_exempt
This sysctl exempts to bind port '0' as long as IP_PORTRANGELOW hasn't
been set on the socket. This is quite useful as it allows applications
to use automatic binding without adding overly broad rules for the
binding of port 0. This sysctl defaults to enabled.
This is a slight variation on the patch submitted by the contributor.
MFC after: 2 weeks
Submitted by: Michal Mertl <mime at traveller dot cz>
the sx lock was used previously because we might sleep allocating
additional memory by using auto-extending sbufs. However, we no longer
do this, instead retaining the user-submitted rule string, so mutexes
can be used instead. Annotate the reason for not using the sbuf-related
rule-to-string code with a comment.
Switch to using TAILQ_CONCAT() instead of manual list copying, as it's
O(1), reducing the rule replacement step under the mutex from O(2N) to
O(2).
Remove now uneeded vnode-related includes.
MFC after: 2 weeks
MAC policies to perform object life cycle operations and access
control checks.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, SPAWAR, McAfee Research
objects and operations:
- System V IPC message, message queue, semaphore, and shared memory
segment init, destroy, cleanup, create operations.
- System V IPC message, message queue, seamphore, and shared memory
segment access control entry points, including rights to attach,
destroy, and manipulate these IPC objects.
Submitted by: Dandekar Hrishikesh <rishi_dandekar at sbcglobal dot net>
Obtained from: TrustedBSD Project
Sponsored by: DARPA, SPAWAR, McAfee Research
changes associated with adding System V IPC support. This will prevent
old modules from being used with the new kernel, and new modules from
being used with the old kernel.
for modules linked into the kernel or loaded very early, panics will
result otherwise, as the CV code it calls will panic due to its use
of a mutex before it is initialized.
as well as document the properties of the mac_policy_conf structure.
Warn about the ABI risks in changing the structure without careful
consideration.
Obtained from: TrustedBSD Project
Sponsored by: SPAWAR
right bits rather than piggy-backing on the V* rights defined in
vnode.h. The mac_bsdextended bits are given the same values as the V*
bits to make the new kernel module binary compatible with the old
version of libugidfw that uses V* bits. This avoids leaking kernel
API/ABI to user management tools, and in particular should remove the
need for libugidfw to include vnode.h.
Requested by: phk
rule only in place of all rules match. This is similar to how ipfw(8) works.
Provide a sysctl, mac_bsdextended_firstmatch_enabled, to enable this
feature.
Reviewed by: re (jhb)
Aprroved by: re (jhb)
so that they know whether the allocation is supposed to be able to sleep
or not.
* Allow uma_zone constructors and initialation functions to return either
success or error. Almost all of the ones in the tree currently return
success unconditionally, but mbuf is a notable exception: the packet
zone constructor wants to be able to fail if it cannot suballocate an
mbuf cluster, and the mbuf allocators want to be able to fail in general
in a MAC kernel if the MAC mbuf initializer fails. This fixes the
panics people are seeing when they run out of memory for mbuf clusters.
* Allow debug.nosleepwithlocks on WITNESS to be disabled, without changing
the default.
Both bmilekic and jeff have reviewed the changes made to make failable
zone allocations work.
accurately represents the intention of the 'single' label element in
Biba and MLS labels. It also approximates the use of 'effective' in
traditional UNIX credentials, and avoids confusion with 'singlelabel'
in the context of file systems.
Inspired by: trhodes
for unknown events.
A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".
network interfaces. This global mutex will protect all ifnet labels.
Acquire the mutex across various MAC activities on interfaces, such
as security checks, propagating interface labels to mbufs generated
from the interface, retrieving and setting the interface label.
Introduce mpo_copy_ifnet_label MAC policy entry point to copy the
value of an interface label from one label to another. Use this
to avoid performing a label externalize while holding mac_ifnet_mtx;
copy the label to a temporary ifnet label and then externalize that.
Implement mpo_copy_ifnet_label for various MAC policies that
implement interface labeling using generic label copying routines.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
SOCK_LOCK(so):
- Hold socket lock over calls to MAC entry points reading or
manipulating socket labels.
- Assert socket lock in MAC entry point implementations.
- When externalizing the socket label, first make a thread-local
copy while holding the socket lock, then release the socket lock
to externalize to userspace.
lookup for the label tag fails, return NULL rather than something close
to NULL. This scenario occurs if mbuf header labeling is optional and
a policy requiring labeling is loaded, resulting in some mbufs having
labels and others not. Previously, 0x14 would be returned because the
NULL from m_tag_find() was not treated specially.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
test the label pointer for NULL before testing the label slot for
permitted values. When loading mac_test dynamically with conditional
mbuf labels, the label pointer may be NULL if the mbuf was
instantiated while labels were not required on mbufs by any policy.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
synchronization protecting against dynamic load and unload of MAC
policies, and instead simply blocks load and unload. In a static
configuration, this allows you to avoid the synchronization costs
associated with introducing dynamicism.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
Assert the BPF descriptor lock in the MAC calls referencing live
BPF descriptors.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
are employed in entry points later in the same include file.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Air Force Research Laboratory, McAfee Research
struct vattr in mac_policy.h. This permits policies not
implementing entry points using these types to compile without
including include files with these types.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Air Force Research Laboratory
to a new mac_inet.c. This code is now conditionally compiled based
on inet support being compiled into the kernel.
Move socket related MAC Framework entry points from mac_net.c to a new
mac_socket.c.
To do this, some additional _enforce MIB variables are now non-static.
In addition, mbuf_to_label() is now mac_mbuf_to_label() and non-static.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, McAfee Research
Now I believe it is done in the right way.
Removed some XXMAC cases, we now assume 'high' integrity level for all
sysctls, except those with CTLFLAG_ANYBODY flag set. No more magic.
Reviewed by: rwatson
Approved by: rwatson, scottl (mentor)
Tested with: LINT (compilation), mac_biba(4) (functionality)
to use the "year1-year3" format, as opposed to "year1, year2, year3".
This seems to make lawyers more happy, but also prevents the
lines from getting excessively long as the years start to add up.
Suggested by: imp
would allocate two 'struct pipe's from the pipe zone, and malloc a
mutex.
- Create a new "struct pipepair" object holding the two 'struct
pipe' instances, struct mutex, and struct label reference. Pipe
structures now have a back-pointer to the pipe pair, and a
'pipe_present' flag to indicate whether the half has been
closed.
- Perform mutex init/destroy in zone init/destroy, avoiding
reallocating the mutex for each pipe. Perform most pipe structure
setup in zone constructor.
- VM memory mappings for pageable buffers are still done outside of
the UMA zone.
- Change MAC API to speak 'struct pipepair' instead of 'struct pipe',
update many policies. MAC labels are also handled outside of the
UMA zone for now. Label-only policy modules don't have to be
recompiled, but if a module is recompiled, its pipe entry points
will need to be updated. If a module actually reached into the
pipe structures (unlikely), that would also need to be modified.
These changes substantially simplify failure handling in the pipe
code as there are many fewer possible failure modes.
On half-close, pipes no longer free the 'struct pipe' for the closed
half until a full-close takes place. However, VM mapped buffers
are still released on half-close.
Some code refactoring is now possible to clean up some of the back
references, etc; this patch attempts not to change the structure
of most of the pipe implementation, only allocation/free code
paths, so as to avoid introducing bugs (hopefully).
This cuts about 8%-9% off the cost of sequential pipe allocation
and free in system call tests on UP and SMP in my micro-benchmarks.
May or may not make a difference in macro-benchmarks, but doing
less work is good.
Reviewed by: juli, tjr
Testing help: dwhite, fenestro, scottl, et al
wait, rather than the socket label. This avoids reaching up to
the socket layer during connection close, which requires locking
changes. To do this, introduce MAC Framework entry point
mac_create_mbuf_from_inpcb(), which is called from tcp_twrespond()
instead of calling mac_create_mbuf_from_socket() or
mac_create_mbuf_netlayer(). Introduce MAC Policy entry point
mpo_create_mbuf_from_inpcb(), and implementations for various
policies, which generally just copy label data from the inpcb to
the mbuf. Assert the inpcb lock in the entry point since we
require consistency for the inpcb label reference.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
only turned up when running mac_test side by side with a transitioning
policy such as SEBSD. Make the NULL testing match
mac_test_execve_will_transition(), which already tested the vnode
label pointer for NULL.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
and the mpo_create_cred() MAC policy entry point to
mpo_copy_cred_label(). This is more consistent with similar entry
points for creation and label copying, as mac_create_cred() was
called from crdup() as opposed to during process creation. For
a number of policies, this removes the requirement for special
handling when copying credential labels, and improves consistency.
Approved by: re (scottl)
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
the MAC label referenced from 'struct socket' in the IPv4 and
IPv6-based protocols. This permits MAC labels to be checked during
network delivery operations without dereferencing inp->inp_socket
to get to so->so_label, which will eventually avoid our having to
grab the socket lock during delivery at the network layer.
This change introduces 'struct inpcb' as a labeled object to the
MAC Framework, along with the normal circus of entry points:
initialization, creation from socket, destruction, as well as a
delivery access control check.
For most policies, the inpcb label will simply be a cache of the
socket label, so a new protocol switch method is introduced,
pr_sosetlabel() to notify protocols that the socket layer label
has been updated so that the cache can be updated while holding
appropriate locks. Most protocols implement this using
pru_sosetlabel_null(), but IPv4/IPv6 protocols using inpcbs use
the the worker function in_pcbsosetlabel(), which calls into the
MAC Framework to perform a cache update.
Biba, LOMAC, and MLS implement these entry points, as do the stub
policy, and test policy.
Reviewed by: sam, bms
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
system calls, and prefer these calls over getsockopt()/setsockopt()
for ABI reasons. When addressing UNIX domain sockets, these calls
retrieve and modify the socket label, not the label of the
rendezvous vnode.
- Create mac_copy_socket_label() entry point based on
mac_copy_pipe_label() entry point, intended to copy the socket
label into temporary storage that doesn't require a socket lock
to be held (currently Giant).
- Implement mac_copy_socket_label() for various policies.
- Expose socket label allocation, free, internalize, externalize
entry points as non-static from mac_net.c.
- Use mac_socket_label_set() in __mac_set_fd().
MAC-aware applications may now use mac_get_fd(), mac_set_fd(), and
mac_get_peer() to retrieve and set various socket labels without
directly invoking the getsockopt() interface.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
SO_PEERLABEL. This provides an interface to query the label of a
socket peer without embedding implementation details of mac_t in
the application. Previously, sizeof(*mac_t) had to be specified
by an application when performing getsockopt().
Document mac_get_peer(3), and expand documentation of the other
mac_get(3) functions. Note that it's possible to get EINVAL back
from mac_get_fd(3) when pointing it at an inappropriate object.
NOTE: mac_get_fd() and mac_set_fd() support for sockets will
follow shortly, so the documentation is slightly ahead of the
code.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
mac_setsockopt_label() into mac_socket_label_set(); make it non-static
so that it can be invoked from kern_mac.c for mac_set_fd().
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
This fixes a dependency of mac_label.c on namespace pollution in
<vm/uma.h>.
Similarly for SYSCTL_DECL() although I had no problems with it. This
probably makes some includes of <sys/sysctl.h> bogus.
Giant and is also MPSAFE.
Push Giant further down into __mac_get_fd() and __mac_set_fd(),
grabbing it only for constrained regions dealing with VFS, and
dropping it entirely for operations related to labeling of pipes.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
extra argument to the devfs MAC policy entry points was accidentally
merged from the MAC branch during my earlier commit to these policies,
and is not scheduled to be merged just yet.
in various kernel objects to represent security data, we embed a
(struct label *) pointer, which now references labels allocated using
a UMA zone (mac_label.c). This allows the size and shape of struct
label to be varied without changing the size and shape of these kernel
objects, which become part of the frozen ABI with 5-STABLE. This opens
the door for boot-time selection of the number of label slots, and hence
changes to the bound on the number of simultaneous labeled policies
at boot-time instead of compile-time. This also makes it easier to
embed label references in new objects as required for locking/caching
with fine-grained network stack locking, such as inpcb structures.
This change also moves us further in the direction of hiding the
structure of kernel objects from MAC policy modules, not to mention
dramatically reducing the number of '&' symbols appearing in both the
MAC Framework and MAC policy modules, and improving readability.
While this results in minimal performance change with MAC enabled, it
will observably shrink the size of a number of critical kernel data
structures for the !MAC case, and should have a small (but measurable)
performance benefit (i.e., struct vnode, struct socket) do to memory
conservation and reduced cost of zeroing memory.
NOTE: Users of MAC must recompile their kernel and all MAC modules as a
result of this change. Because this is an API change, third party
MAC modules will also need to be updated to make less use of the '&'
symbol.
Suggestions from: bmilekic
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
successfully initialized in the label as a socket peer label, not a
socket label. For current policy modules, this didn't make a
difference, but if a policy module had label data in the peer label
that was to be GC'd in a different way than the normal socket label,
it might have been a problem.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
if_xname, if_dname, and if_dunit. if_xname is the name of the interface
and if_dname/unit are the driver name and instance.
This change paves the way for interface renaming and enhanced pseudo
device creation and configuration symantics.
Approved By: re (in principle)
Reviewed By: njl, imp
Tested On: i386, amd64, sparc64
Obtained From: NetBSD (if_xname)
type, rather than "object_label" as the first argument. This reduces
complexity a little for the consumer, and also makes it easier for
use to rename the underlying entry points in struct mac_policy_obj.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Include src/sys/security/mac/mac_internal.h in kern_mac.c.
Remove redundant defines from the include: SYSCTL_DECL(), debug macros,
composition macros.
Unstaticize various bits now exposed to the remainder of the kernel:
mac_init_label(), mac_destroy_label().
Remove all the functions now implemented in mac_process/mac_vfs/mac_net/
mac_pipe. Also remove debug counters, sysctls exporting debug
counters, enforcement flags, sysctls exporting enforcement flags.
Leave module declaration, sysctl nodes, mactemp malloc type, system
calls.
This should conclude MAC/LINT/NOTES breakage from the break-out process,
but I'm running builds now to make sure I caught everything.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Unstaticize mac_late.
Remove ea_warn_once, now in mac_vfs.c.
Unstaticisize mac_policy_list, mac_static_policy_list, use
struct mac_policy_list_head instead of LIST_HEAD() directly.
Unstaticize and un-inline MAC policy locking functions so they can
be referenced from mac_*.c.
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories
Extended attribute transaction warning flag if transactions aren't
supported on the EA implementation being used.
Debug fallback flag to permit a less conservative fallback if reading
an on-disk label fails.
Enforce_fs toggle to enforce file systme access control.
Debugging counters for file system objects: mounts, vnodes, devfs_dirents.
Object initialization, destruction, copying, internalization,
externalization, relabeling for file system objects.
Life cycle operations for devfs entries.
Generic extended attribute label implementation for use by UFS, UFS2 in
multilabel mode.
Generic single-level label implementation for use by all file systems
when in singlelabel mode.
Exec-time transition based on file label entry points.
Vnode operation access control checks (many).
Mount operation access control checks (few).
Obtained from: TrustedBSD Project
Sponsored by: DARPA, Network Associates Laboratories