wrapper macros that allow trace points and arguments to be declared
using a single macro rather than several. This means a lot less
repetition and vertical space for each trace point.
Use these macros when defining privilege and MAC Framework trace points.
Reviewed by: jb
MFC after: 1 week
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge).
OpenBSM history for imported revision below for reference.
MFC after: 1 month
Sponsored by: Apple, Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 beta 1
- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added. It is configured in
audit_control(5) with the expire-after parameter. If there is no
expire-after parameter in audit_control(5), the default, then the audit
trail files are not expired and removed. See audit_control(5) for
more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
partitions, rotate automatically at 2mb, and set the default policy to
cnt,argv rather than cnt so that execve(2) arguments are captured if
AUE_EXECVE events are audited. These may provide more usable defaults for
many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge). Hook up bsm_domain.c and bsm_socket_type.c to the libbsm
build along with man pages, add audit_bsm_domain.c and
audit_bsm_socket_type.c to the kernel environment.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 alpha 5
- Stub libauditd(3) man page added.
- All BSM error number constants with BSM_ERRNO_.
- Interfaces to convert between local and BSM socket types and protocol
families have been added: au_bsm_to_domain(3), au_bsm_to_socket_type(3),
au_domain_to_bsm(3), and au_socket_type_to_bsm(3), along with definitions
of constants in audit_domain.h and audit_socket_type.h. This improves
interoperability by converting local constant spaces, which vary by OS, to
and from Solaris constants (where available) or OpenBSM constants for
protocol domains not present in Solaris (a fair number). These routines
should be used when generating and interpreting extended socket tokens.
- Fix build warnings with full gcc warnings enabled on most supported
platforms.
- Don't compile error strings into bsm_errno.c when building it in the kernel
environment.
- When started by launchd, use the label com.apple.auditd rather than
org.trustedbsd.auditd.
they label, derive that information implicitly from the set of label
initializers in their policy operations set. This avoids a possible
class of programmer errors, while retaining the structure that
allows us to avoid allocating labels for objects that don't need
them. As before, we regenerate a global mask of labeled objects
each time a policy is loaded or unloaded, stored in mac_labeled.
Discussed with: csjp
Suggested by: Jacques Vidrine <nectar at apple.com>
Obtained from: TrustedBSD Project
Sponsored by: Apple, Inc.
lock in order to avoid the lock acquire hit if the pipe list is very
likely empty.
Obtained from: TrustedBSD Project
MFC after: 3 weeks
Sponsored by: Apple, Inc.
record is active on the current thread--historically we may always
have wanted to enter the audit code if auditing was enabled, but now
we just commit the audit record so don't need to enter if there isn't
one.
Obtained from: TrustedBSD Project
Sponsored by: Apple, Inc.
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge). Add libauditd build parts and add to auditd's linkage;
force libbsm to build before libauditd.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
OpenBSM 1.1 alpha 4
- With the addition of BSM error number mapping, we also need to map the
local error number passed to audit_submit(3) to a BSM error number,
rather than have the caller perform that conversion.
- Reallocate user audit events to avoid collisions with Solaris; adopt a
more formal allocation scheme, and add some events allocated in Solaris
that will be of immediate use on other platforms.
- Add an event for Calife.
- Add au_strerror(3), which allows generating strings for BSM errors
directly, rather than requiring applications to map to the local error
space, which might not be able to entirely represent the BSM error
number space.
- Major auditd rewrite for launchd(8) support. Add libauditd library
that is shared between launchd and auditd.
- Add AUDIT_TRIGGER_INITIALIZE trigger (sent via 'audit -i') for
(re)starting auditing under launchd(8) on Mac OS X.
- Add 'current' symlink to active audit trail.
- Add crash recovery of previous audit trail file when detected on audit
startup that it has not been properly terminated.
- Add the event AUE_audit_recovery to indicated when an audit trail file
has been recovered from not being properly terminated. This event is
stored in the new audit trail file and includes the path of recovered
audit trail file.
- Mac OS X and FreeBSD dependent code in auditd.c is separated into
auditd_darwin.c and auditd_fbsd.c files.
- Add an event for the posix_spawn(2) and fsgetpath(2) Mac OS X system
calls.
- For Mac OS X, we use ASL(3) instead of syslog(3) for logging.
- Add support for NOTICE level logging.
OpenBSM 1.1 alpha 3
- Add two new functions, au_bsm_to_errno() and au_errno_to_bsm(), to map
between BSM error numbers (largely the Solaris definitions) and local
errno(2) values for 32-bit and 64-bit return tokens. This is required
as operating systems don't agree on some of the values of more recent
error numbers.
- Fix a bug how au_to_exec_args(3) and au_to_exec_env(3) calculates the
total size for the token. This buge.
- Deprecated Darwin constants, such as TRAILER_PAD_MAGIC, removed.
mac_proc_vm_revoke_recurse() requests a read lock on the vm map at the start
but does not handle failure by vm_map_lock_upgrade() when it seeks to modify
the vm map. At present, this works because all lock request on a vm map are
implemented as exclusive locks. Thus, vm_map_lock_upgrade() is a no-op that
always reports success. However, that is about to change, and
proc_vm_revoke_recurse() will require substantial modifications to handle
vm_map_lock_upgrade() failures. For the time being, I am changing
mac_proc_vm_revoke_recurse() to request a write lock on the vm map at the
start.
Approved by: rwatson
MFC after: 3 months
contrib/openbsm (svn merge) and sys/{bsm,security/audit} (manual merge).
- Add OpenBSM contrib tree to include paths for audit(8) and auditd(8).
- Merge support for new tokens, fixes to existing token generation to
audit_bsm_token.c.
- Synchronize bsm includes and definitions.
OpenBSM history for imported revisions below for reference.
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
--
OpenBSM 1.1 alpha 2
- Include files in OpenBSM are now broken out into two parts: library builds
required solely for user space, and system includes, which may also be
required for use in the kernels of systems integrating OpenBSM. Submitted
by Stacey Son.
- Configure option --with-native-includes allows forcing the use of native
include for system includes, rather than the versions bundled with OpenBSM.
This is intended specifically for platforms that ship OpenBSM, have adapted
versions of the system includes in a kernel source tree, and will use the
OpenBSM build infrastructure with an unmodified OpenBSM distribution,
allowing the customized system includes to be used with the OpenBSM build.
Submitted by Stacey Son.
- Various strcpy()'s/strcat()'s have been changed to strlcpy()'s/strlcat()'s
or asprintf(). Added compat/strlcpy.h for Linux.
- Remove compatibility defines for old Darwin token constant names; now only
BSM token names are provided and used.
- Add support for extended header tokens, which contain space for information
on the host generating the record.
- Add support for setting extended host information in the kernel, which is
used for setting host information in extended header tokens. The
audit_control file now supports a "host" parameter which can be used by
auditd to set the information; if not present, the kernel parameters won't
be set and auditd uses unextended headers for records that it generates.
OpenBSM 1.1 alpha 1
- Add option to auditreduce(1) which allows users to invert sense of
matching, such that BSM records that do not match, are selected.
- Fix bug in audit_write() where we commit an incomplete record in the
event there is an error writing the subject token. This was submitted
by Diego Giagio.
- Build support for Mac OS X 10.5.1 submitted by Eric Hall.
- Fix a bug which resulted in host XML attributes not being arguments so
that const strings can be passed as arguments to tokens. This patch was
submitted by Xin LI.
- Modify the -m option so users can select more then one audit event.
- For Mac OS X, added Mach IPC support for audit trigger messages.
- Fixed a bug in getacna() which resulted in a locking problem on Mac OS X.
- Added LOG_PERROR flag to openlog when -d option is used with auditd.
- AUE events added for Mac OS X Leopard system calls.
by getaudit(2). Some applications such has su, id will interpret E2BIG as
requiring the use of getaudit_addr(2) to pull extended audit state (ip6)
from the kernel.
This change un-breaks the ABI when auditing has been activated on a system
and the users are logged in via ip6.
This is a RELENG_7_1 candidate.
MFC after: 1 day
Discussed with: rwatson
Bring in updated jail support from bz_jail branch.
This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..
SCTP support was updated and supports IPv6 in jails as well.
Cpuset support permits jails to be bound to specific processor
sets after creation.
Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.
DDB 'show jails' command was added to aid debugging.
Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.
Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.
Bump __FreeBSD_version for the afore mentioned and in kernel changes.
Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
on cluster machines as well as all the testers and people
who provided feedback the last months on freebsd-jail and
other channels.
- My employer, CK Software GmbH, for the support so I could work on this.
Reviewed by: (see above)
MFC after: 3 months (this is just so that I get the mail)
X-MFC Before: 7.2-RELEASE if possible
pointer in a local thread. While this is unlikely to significantly
improve performance given modern compiler behavior, it makes the code
more readable and reduces diffs to the Mac OS X version of the same
code (which stores things in creds in the same way, but where the
cred for a thread is reached quite differently).
Discussed with: sson
MFC after: 1 month
Sponsored by: Apple Inc.
Obtained from: TrustedBSD Project
queue length variables as well, avoiding storing the limit in a larger
type than the length.
Submitted by: sson
Sponsored by: Apple Inc.
MFC after: 1 week
regular header tokens. The extended header tokens contain an IP
or IPv6 address which makes it possible to identify which host an
audit record came from when audit records are centralized.
If the host information has not been specified, the system will
default to the old style headers. Otherwise, audit records that
are created as a result of system calls will contain host information.
This implemented has been designed to be consistent with the Solaris
implementation. Host information is set/retrieved using the A_GETKAUDIT
and A_SETKAUDIT auditon(2) commands. These commands require that a
pointer to a auditinfo_addr_t object is passed. Currently only IP and
IPv6 address families are supported.
The users pace bits associated with this change will follow in an
openbsm import.
Reviewed by: rwatson, (sson, wsalamon (older version))
MFC after: 1 month
record queue, so move the offset field from the per-record
audit_pipe_entry structure to the audit_pipe structure.
Now that we support reading more than one record at a time, add a
new summary field to audit_pipe, ap_qbyteslen, which tracks the
total number of bytes present in a pipe, and return that (minus
the current offset) via FIONREAD and kqueue's data variable for
the pending byte count rather than the number of bytes remaining
in only the first record.
Add a number of asserts to confirm that these counts and offsets
following the expected rules.
MFC after: 2 months
Sponsored by: Apple, Inc.
read(2), which meant that records longer than the buffer passed to read(2)
were dropped. Instead take the approach of allowing partial reads to be
continued across multiple system calls more in the style of streaming
character device.
This means retaining a record on the per-pipe queue in a partially read
state, so maintain a current offset into the record. Keep the record on
the queue during a read, so add a new lock, ap_sx, to serialize removal
of records from the queue by either read(2) or ioctl(2) requesting a pipe
flush. Modify the kqueue handler to return bytes left in the current
record rather than simply the size of the current record.
It is now possible to use praudit, which used the standard FILE * buffer
sizes, to track much larger record sizes from /dev/auditpipe, such as
very long command lines to execve(2).
MFC after: 2 months
Sponsored by: Apple, Inc.
pipe has overflowed, drop the newest, rather than oldest, record. This
makes overflow drop behavior consistent with memory allocation failure
leading to drop, avoids touching the consumer end of the queue from a
producer, and lowers the CPU overhead of dropping a record by dropping
before memory allocation and copying.
Obtained from: Apple, Inc.
MFC after: 2 months
protecting the list of audit pipes, and a per-pipe mutex protecting the
queue.
Likewise, replace the single global condition variable used to signal
delivery of a record to one or more pipes, and add a per-pipe condition
variable to avoid spurious wakeups when event subscriptions differ
across multiple pipes.
This slightly increases the cost of delivering to audit pipes, but should
reduce lock contention in the presence of multiple readers as only the
per-pipe lock is required to read from a pipe, as well as avoid
overheading when different pipes are used in different ways.
MFC after: 2 months
Sponsored by: Apple, Inc.
mutex, as it's rarely changed but frequently accessed read-only from
multiple threads, so a potentially significant source of contention.
MFC after: 1 month
Sponsored by: Apple, Inc.
access control checks in mac_bsdextended are not in the same
namespace as the MBI_ flags used in ugidfw policies, so add an
explicit conversion routine to get from one to the other.
Obtained from: TrustedBSD Project
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.
Approved by: rwatson (mentor)
memory mappings when the MAC label on a process changes, to
mac_proc_vm_revoke(),
It now also acquires its own credential reference directly from the
affected process rather than accepting one passed by the the caller,
simplifying the API and consumer code.
Obtained from: TrustedBSD Project
that they operate directly on credentials: mac_proc_create_swapper(),
mac_proc_create_init(), and mac_proc_associate_nfsd(). Update policies.
Obtained from: TrustedBSD Project
be a no-op request, and why this might have to change if we want to allow
leaving a partition someday.
Obtained from: TrustedBSD Project
MFC after: 3 days
control logic and policy registration remaining in that file, and access
control checks broken out into other files by class of check.
Obtained from: TrustedBSD Project
fragment reassembly queues.
This allows policies to label reassembly queues, perform access
control checks when matching fragments to a queue, update a queue
label when fragments are matched, and label the resulting
reassembled datagram.
Obtained from: TrustedBSD Project
When I changed kern_conf.c three months ago I made device unit numbers
equal to (unneeded) device minor numbers. We used to require
bitshifting, because there were eight bits in the middle that were
reserved for a device major number. Not very long after I turned
dev2unit(), minor(), unit2minor() and minor2unit() into macro's.
The unit2minor() and minor2unit() macro's were no-ops.
We'd better not remove these four macro's from the kernel, because there
is a lot of (external) code that may still depend on them. For now it's
harmless to remove all invocations of unit2minor() and minor2unit().
Reviewed by: kib
years by the priv_check(9) interface and just very few places are left.
Note that compatibility stub with older FreeBSD version
(all above the 8 limit though) are left in order to reduce diffs against
old versions. It is responsibility of the maintainers for any module, if
they think it is the case, to axe out such cases.
This patch breaks KPI so __FreeBSD_version will be bumped into a later
commit.
This patch needs to be credited 50-50 with rwatson@ as he found time to
explain me how the priv_check() works in detail and to review patches.
Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reviewed by: rwatson
appropriate even if Solaris doesn't document it (E2BIG) or use it
(EOVERFLOW).
Submitted by: nectar at apple dot com
Sponsored by: Apple, Inc.
MFC after: 3 days
(1) Abstract interpreter vnode labeling in execve(2) and mac_execve(2)
so that the general exec code isn't aware of the details of
allocating, copying, and freeing labels, rather, simply passes in
a void pointer to start and stop functions that will be used by
the framework. This change will be MFC'd.
(2) Introduce a new flags field to the MAC_POLICY_SET(9) interface
allowing policies to declare which types of objects require label
allocation, initialization, and destruction, and define a set of
flags covering various supported object types (MPC_OBJECT_PROC,
MPC_OBJECT_VNODE, MPC_OBJECT_INPCB, ...). This change reduces the
overhead of compiling the MAC Framework into the kernel if policies
aren't loaded, or if policies require labels on only a small number
or even no object types. Each time a policy is loaded or unloaded,
we recalculate a mask of labeled object types across all policies
present in the system. Eliminate MAC_ALWAYS_LABEL_MBUF option as it
is no longer required.
MFC after: 1 week ((1) only)
Reviewed by: csjp
Obtained from: TrustedBSD Project
Sponsored by: Apple, Inc.
space provided by its argument structure, return EOVERFLOW instead of
E2BIG. The latter is documented in Solaris's man page, but the
former is implemented. In either case, the caller should use
getaudit_addr(2) to return the IPv6 address.
Submitted by: sson
Obtained from: Apple, Inc.
MFC after: 3 days
It is possible that the audit pipe(s) have different preselection configs
then the global preselection mask.
Spotted by: Vincenzo Iozzo
MFC after: 2 weeks
return success if the passed vnode pointer is NULL (rather than
panicking). This can occur if either audit or accounting are
disabled while the policy is running.
Since the swapoff control has no real relevance to this policy,
which is concerned about intent to write rather than water under the
bridge, remove it.
PR: kern/126100
Reported by: Alan Amesbury <amesbury at umn dot edu>
MFC after: 3 days
processes are not producing absolute pathname tokens. It is required
that audited pathnames are generated relative to the global root mount
point. This modification changes our implementation of audit_canon_path(9)
and introduces a new function: vn_fullpath_global(9) which performs a
vnode -> pathname translation relative to the global mount point based
on the contents of the name cache. Much like vn_fullpath,
vn_fullpath_global is a wrapper function which called vn_fullpath1.
Further, the string parsing routines have been converted to use the
sbuf(9) framework. This change also removes the conditional acquisition
of Giant, since the vn_fullpath1 method will not dip into file system
dependent code.
The vnode locking was modified to use vhold()/vdrop() instead the vref()
and vrele(). This will modify the hold count instead of modifying the
user count. This makes more sense since it's the kernel that requires
the reference to the vnode. This also makes sure that the vnode does not
get recycled we hold the reference to it. [1]
Discussed with: rwatson
Reviewed by: kib [1]
MFC after: 2 weeks
semaphores. Specifically, semaphores are now represented as new file
descriptor type that is set to close on exec. This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely. It is also suggested as one possible implementation
in the spec and at least one other OS (OS X) uses this approach.
Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
the sem_unlink() operation. Prior to this patch, if a semaphore's name
was removed, valid handles from sem_open() would get EINVAL errors from
sem_getvalue(), sem_post(), etc. This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
process exited or exec'd. They were only cleaned up if the process
did an explicit sem_destroy(). This could result in a leak of semaphore
objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
'struct ksem' of an unnamed semaphore (created via sem_init)) and had
write access to the semaphore based on UID/GID checks, then that other
process could manipulate the semaphore via sem_destroy(), sem_post(),
sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the proces
creating the semaphore was not honored. Thus if your umask denied group
read/write access but the explicit mode in the sem_init() call allowed
it, the semaphore would be readable/writable by other users in the
same group, for example. This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
then it might have deregistered one or more of the semaphore system
calls before it noticed that there was a problem. I'm not sure if
this actually happened as the order that modules are discovered by the
kernel linker depends on how the actual .ko file is linked. One can
make the order deterministic by using a single module with a mod_event
handler that explicitly registers syscalls (and deregisters during
unload after any checks). This also fixes a race where even if the
sem_module unloaded first it would have destroyed locks that the
syscalls might be trying to access if they are still executing when
they are unloaded.
XXX: By the way, deregistering system calls doesn't do any blocking
to drain any threads from the calls.
- Some minor fixes to errno values on error. For example, sem_init()
isn't documented to return ENFILE or EMFILE if we run out of semaphores
the way that sem_open() can. Instead, it should return ENOSPC in that
case.
Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
named semaphores nearly in a similar fashion to the POSIX shared memory
object file descriptors. Kernel semaphores can now also have names
longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
MAC checks for POSIX semaphores accept both a file credential and an
active credential. There is also a new posixsem_check_stat() since it
is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
in src/tools/regression/posixsem.
Reported by: kris (1)
Tested by: kris
Reviewed by: rwatson (lightly)
MFC after: 1 month
same as the global variable defined in ip_input.c. Instead, adopt the name
'q' as found in about 1/2 of uses in ip_input.c, preventing a collision on
the name. This is non-harmful, but means that search and replace on the
global works less well (as in the virtualization work), as well as indexing
tools.
MFC after: 1 week
Reported by: julian
Except for the case where we use the cloner library (clone_create() and
friends), there is no reason to enforce a unique device minor number
policy. There are various drivers in the source tree that allocate unr
pools and such to provide minor numbers, without using them themselves.
Because we still need to support unique device minor numbers for the
cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's
that are used in combination with the cloner library should be marked
with this flag to make the cloning work.
This means drivers can now freely use si_drv0 to store their own flags
and state, making it effectively the same as si_drv1 and si_drv2. We
still keep the minor() and dev2unit() routines around to make drivers
happy.
The NTFS code also used the minor number in its hash table. We should
not do this anymore. If the si_drv0 field would be changed, it would no
longer end up in the same list.
Approved by: philip (mentor)