Commit Graph

801 Commits

Author SHA1 Message Date
Pawel Jakub Dawidek
ceaea52f0c IFp4 @219811:
VFS is now fully MPSAFE, fix compilation.
2012-12-01 08:51:40 +00:00
Pawel Jakub Dawidek
80a044ea46 IFp4 @208452:
Audit handling for missing events:
- AUE_READLINKAT
- AUE_FACCESSAT
- AUE_MKDIRAT
- AUE_MKFIFOAT
- AUE_MKNODAT
- AUE_SYMLINKAT

Sponsored by:	FreeBSD Foundation (auditdistd)
MFC after:	2 weeks
2012-11-30 23:21:55 +00:00
Pawel Jakub Dawidek
499f0f4d55 IFp4 @208451:
Fix path handling for *at() syscalls.

Before the change directory descriptor was totally ignored,
so the relative path argument was appended to current working
directory path and not to the path provided by descriptor, thus
wrong paths were stored in audit logs.

Now that we use directory descriptor in vfs_lookup, move
AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2() calls to the place where
we hold file descriptors table lock, so we are sure paths will
be resolved according to the same directory in audit record and
in actual operation.

Sponsored by:	FreeBSD Foundation (auditdistd)
Reviewed by:	rwatson
MFC after:	2 weeks
2012-11-30 23:18:49 +00:00
Pawel Jakub Dawidek
1d8cd15cf8 IFp4 @208383:
Currently when we discover that trail file is greater than configured
limit we send AUDIT_TRIGGER_ROTATE_KERNEL trigger to the auditd daemon
once. If for some reason auditd didn't rotate trail file it will never
be rotated.

Change it by sending the trigger when trail file size grows by the
configured limit. For example if the limit is 1MB, we will send trigger
on 1MB, 2MB, 3MB, etc.

This is also needed for the auditd change that will be committed soon
where auditd may ignore the trigger - it might be ignored if kernel
requests the trail file to be rotated too quickly (often than once a second)
which would result in overwriting previous trail file.

Sponsored by:	FreeBSD Foundation (auditdistd)
MFC after:	2 weeks
2012-11-30 23:03:51 +00:00
Pawel Jakub Dawidek
6293140411 IFp4 @208382:
Currently on each record write we call VFS_STATFS() to get available space
on the file system as well as VOP_GETATTR() to get trail file size.

We can assume that trail file is only updated by the audit worker, so instead
of asking for file size on every write, get file size on trail switch only
(it should be zero, but it's not expensive) and use global variable audit_size
protected by the audit worker lock to keep track of trail file's size.

This eliminates VOP_GETATTR() call for every write. VFS_STATFS() is satisfied
from in-memory data (mount->mnt_stat), so shouldn't be expensive.

Sponsored by:	FreeBSD Foundation (auditdistd)
MFC after:	2 weeks
2012-11-30 22:59:20 +00:00
Pawel Jakub Dawidek
9658c0582e IFp4 @208381:
For VOP_GETATTR() we just need vnode to be shared-locked.

Sponsored by:	FreeBSD Foundation (auditdistd)
MFC after:	2 weeks
2012-11-30 22:52:35 +00:00
Konstantin Belousov
5050aa86cf Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by:	attilio
Tested by:	pho
2012-10-22 17:50:54 +00:00
Christian Brueffer
e623c22083 Check vplabel for NULL before dereferencing it. Fixes a panic
when running atop with MAC_MLS enabled.

Submitted by:	Richard Kojedzinszky <krichy@tvnetwork.hu>
Reviewed by:	rwatson
MFC after:	1 week
2012-05-03 15:51:34 +00:00
Robert Watson
b4ef8be228 When allocation of labels on files is implicitly disabled due to MAC
policy configuration, avoid leaking resources following failed calls
to get and set MAC labels by file descriptor.

Reported by:	Mateusz Guzik <mjguzik at gmail.com> + clang scan-build
MFC after:	3 days
2012-04-08 11:01:49 +00:00
Alexander V. Chernikov
e4b3229aa5 - Improve BPF locking model.
Interface locks and descriptor locks are converted from mutex(9) to rwlock(9).
This greately improves performance: in most common case we need to acquire 1
reader lock instead of 2 mutexes.

- Remove filter(descriptor) (reader) lock in bpf_mtap[2]
This was suggested by glebius@. We protect filter by requesting interface
writer lock on filter change.

- Cover struct bpf_if under BPF_INTERNAL define. This permits including bpf.h
without including rwlock stuff. However, this is is temporary solution,
struct bpf_if should be made opaque for any external caller.

Found by:       Dmitrij Tejblum <tejblum@yandex-team.ru>
Sponsored by:   Yandex LLC

Reviewed by:    glebius (previous version)
Reviewed by:    silence on -net@
Approved by:    (mentor)

MFC after:      3 weeks
2012-04-06 06:53:58 +00:00
Ed Schouten
7870adb640 Remove direct access to si_name.
Code should just use the devtoname() function to obtain the name of a
character device. Also add const keywords to pieces of code that need it
to build properly.

MFC after:	2 weeks
2012-02-10 12:35:57 +00:00
Ed Schouten
dc15eac046 Use strchr() and strrchr().
It seems strchr() and strrchr() are used more often than index() and
rindex(). Therefore, simply migrate all kernel code to use it.

For the XFS code, remove an empty line to make the code identical to
the code in the Linux kernel.
2012-01-02 12:12:10 +00:00
Attilio Rao
77befd1d23 Revert the approach for skipping lockstat_probe_func call when doing
lock_success/lock_failure, introduced in r228424, by directly skipping
in dtrace_probe.

This mainly helps in avoiding namespace pollution and thus lockstat.h
dependency by systm.h.

As an added bonus, this also helps in MFC case.
Reviewed by:	avg
MFC after:	3 months (or never)
X-MFC:		r228424
2011-12-12 23:29:32 +00:00
Andriy Gapon
7a7ce668ef put sys/systm.h at its proper place or add it if missing
Reported by:	lstewart, tinderbox
Pointyhat to:	avg, attilio
MFC after:	1 week
MFC with:	r228430
2011-12-12 10:05:13 +00:00
Ed Schouten
6472ac3d8a Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
2011-11-07 15:43:11 +00:00
Ed Schouten
d745c852be Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.
This means that their use is restricted to a single C file.
2011-11-07 06:44:47 +00:00
Ed Schouten
a185bd12f3 Get rid of D_PSEUDO.
It seems the D_PSEUDO flag was meant to allow make_dev() to return NULL.
Nowadays we have a different interface for that; make_dev_p(). There's
no need to keep it there.

While there, remove an unneeded D_NEEDMINOR from the gpio driver.

Discussed with:	gonzo@ (gpio)
2011-10-18 08:09:44 +00:00
Christian Brueffer
709ef347b7 Remove two dublicated assignments.
CID:		9870
Found with:	Coverity Prevent(tm)
Confirmed by:	rwatson
MFC after:	1 week
2011-10-08 09:14:18 +00:00
Kip Macy
8451d0dd78 In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
Robert Watson
9b6dd12e5d Correct several issues in the integration of POSIX shared memory objects
and the new setmode and setowner fileops in FreeBSD 9.0:

- Add new MAC Framework entry point mac_posixshm_check_create() to allow
  MAC policies to authorise shared memory use.  Provide a stub policy and
  test policy templates.

- Add missing Biba and MLS implementations of mac_posixshm_check_setmode()
  and mac_posixshm_check_setowner().

- Add 'accmode' argument to mac_posixshm_check_open() -- unlike the
  mac_posixsem_check_open() entry point it was modeled on, the access mode
  is required as shared memory access can be read-only as well as writable;
  this isn't true of POSIX semaphores.

- Implement full range of POSIX shared memory entry points for Biba and MLS.

Sponsored by:   Google Inc.
Obtained from:	TrustedBSD Project
Approved by:    re (kib)
2011-09-02 17:40:39 +00:00
Attilio Rao
6aba400a70 Fix a deficiency in the selinfo interface:
If a selinfo object is recorded (via selrecord()) and then it is
quickly destroyed, with the waiters missing the opportunity to awake,
at the next iteration they will find the selinfo object destroyed,
causing a PF#.

That happens because the selinfo interface has no way to drain the
waiters before to destroy the registered selinfo object. Also this
race is quite rare to get in practice, because it would require a
selrecord(), a poll request by another thread and a quick destruction
of the selrecord()'ed selinfo object.

Fix this by adding the seldrain() routine which should be called
before to destroy the selinfo objects (in order to avoid such case),
and fix the present cases where it might have already been called.
Sometimes, the context is safe enough to prevent this type of race,
like it happens in device drivers which installs selinfo objects on
poll callbacks. There, the destruction of the selinfo object happens
at driver detach time, when all the filedescriptors should be already
closed, thus there cannot be a race.
For this case, mfi(4) device driver can be set as an example, as it
implements a full correct logic for preventing this from happening.

Sponsored by:	Sandvine Incorporated
Reported by:	rstone
Tested by:	pluknet
Reviewed by:	jhb, kib
Approved by:	re (bz)
MFC after:	3 weeks
2011-08-25 15:51:54 +00:00
Konstantin Belousov
9c00bb9190 Add the fo_chown and fo_chmod methods to struct fileops and use them
to implement fchown(2) and fchmod(2) support for several file types
that previously lacked it. Add MAC entries for chown/chmod done on
posix shared memory and (old) in-kernel posix semaphores.

Based on the submission by:	glebius
Reviewed by:	rwatson
Approved by:	re (bz)
2011-08-16 20:07:47 +00:00
Robert Watson
a9d2f8d84f Second-to-last commit implementing Capsicum capabilities in the FreeBSD
kernel for FreeBSD 9.0:

Add a new capability mask argument to fget(9) and friends, allowing system
call code to declare what capabilities are required when an integer file
descriptor is converted into an in-kernel struct file *.  With options
CAPABILITIES compiled into the kernel, this enforces capability
protection; without, this change is effectively a no-op.

Some cases require special handling, such as mmap(2), which must preserve
information about the maximum rights at the time of mapping in the memory
map so that they can later be enforced in mprotect(2) -- this is done by
narrowing the rights in the existing max_protection field used for similar
purposes with file permissions.

In namei(9), we assert that the code is not reached from within capability
mode, as we're not yet ready to enforce namespace capabilities there.
This will follow in a later commit.

Update two capability names: CAP_EVENT and CAP_KEVENT become
CAP_POST_KEVENT and CAP_POLL_KEVENT to more accurately indicate what they
represent.

Approved by:	re (bz)
Submitted by:	jonathan
Sponsored by:	Google Inc
2011-08-11 12:30:23 +00:00
Jonathan Anderson
778b0e42a8 Provide ability to audit cap_rights_t arguments.
We wish to be able to audit capability rights arguments; this code
provides the necessary infrastructure.

This commit does not, of itself, turn on such auditing for any
system call; that should follow shortly.

Approved by: mentor (rwatson), re (Capsicum blanket)
Sponsored by: Google Inc
2011-07-18 12:58:18 +00:00
Alexander Leidinger
d783bbd2d2 - Add a FEATURE for capsicum (security_capabilities).
- Rename mac FEATURE to security_mac.

Discussed with:	rwatson
2011-03-04 09:03:54 +00:00
Robert Watson
25122f5c5f Add ECAPMODE, "Not permitted in capability mode", a new kernel errno
constant to indicate that a system call (or perhaps an operation requested
via a system call) is not permitted for a capability mode process.

Submitted by:	anderson
Sponsored by:	Google, Inc.
Obtained from:	Capsicum Project
MFC after:	1 week
2011-03-01 13:14:28 +00:00
Alexander Leidinger
de5b19526b Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
Alan Cox
17f3095d1a Unless "cnt" exceeds MAX_COMMIT_COUNT, nfsrv_commit() and nfsvno_fsync() are
incorrectly calling vm_object_page_clean().  They are passing the length of
the range rather than the ending offset of the range.

Perform the OFF_TO_IDX() conversion in vm_object_page_clean() rather than the
callers.

Reviewed by:	kib
MFC after:	3 weeks
2011-02-05 21:21:27 +00:00
Matthew D Fleming
123d2cb7e9 sysctl(9) cleanup checkpoint: amd64 GENERIC builds cleanly.
Commit the security directory.
2011-01-12 19:54:14 +00:00
Rebecca Cran
b1ce21c6ef Fix typos.
PR:	bin/148894
Submitted by:	olgeni
2010-11-09 10:59:09 +00:00
Robert Watson
a959b1f02c Add missing DTrace probe invocation to mac_vnode_check_open; the probe
was declared, but never used.

MFC after:	3 days
Sponsored by:	Google, Inc.
2010-10-23 16:59:39 +00:00
Matthew D Fleming
4d369413e1 Replace sbuf_overflowed() with sbuf_error(), which returns any error
code associated with overflow or with the drain function.  While this
function is not expected to be used often, it produces more information
in the form of an errno that sbuf_overflowed() did.
2010-09-10 16:42:16 +00:00
Rui Paulo
79856499bd Add an extra comment to the SDT probes definition. This allows us to get
use '-' in probe names, matching the probe names in Solaris.[1]

Add userland SDT probes definitions to sys/sdt.h.

Sponsored by:	The FreeBSD Foundation
Discussed with:	rwaston [1]
2010-08-22 11:18:57 +00:00
Christian S.J. Peron
24ffe72448 Add a case to make sure that internal audit records get converted
to BSM format for lpathconf(2) events.

MFC after:	2 weeks
2010-05-04 15:29:07 +00:00
Robert Watson
9663e34384 Update device-labeling logic for Biba, LOMAC, and MLS to recognize new-style
pts devices when various policy ptys_equal flags are enabled.

Submitted by:	Estella Mystagic <estella at mystagic.com>
MFC after:	1 week
2010-03-02 15:05:48 +00:00
Christian S.J. Peron
583450efd7 Make sure we convert audit records that were produced as the result of the
closefrom(2) syscall.
2010-01-31 22:31:01 +00:00
Brooks Davis
412f9500e2 Replace the static NGROUPS=NGROUPS_MAX+1=1024 with a dynamic
kern.ngroups+1.  kern.ngroups can range from NGROUPS_MAX=1023 to
INT_MAX-1.  Given that the Windows group limit is 1024, this range
should be sufficient for most applications.

MFC after:	1 month
2010-01-12 07:49:34 +00:00
Edward Tomasz Napierala
3a597bfc7b Make mac_lomac(4) able to interpret NFSv4 access bits.
Reviewed by:	rwatson
2010-01-03 17:19:14 +00:00
Poul-Henning Kamp
0d07b627a3 Having thrown the cat out of the house, add a necessary include. 2009-09-08 13:24:36 +00:00
Poul-Henning Kamp
6778431478 Revert previous commit and add myself to the list of people who should
know better than to commit with a cat in the area.
2009-09-08 13:19:05 +00:00
Poul-Henning Kamp
b34421bf9c Add necessary include. 2009-09-08 13:16:55 +00:00
Robert Watson
9eb3e4639a Correctly audit real gids following changes to the audit record argument
interface.

Approved by:	re (kib)
2009-08-12 10:45:45 +00:00
Robert Watson
791b0ad2bf Eliminate ARG_UPATH[12] arguments to AUDIT_ARG_UPATH() and instead
provide specific macros, AUDIT_ARG_UPATH1() and AUDIT_ARG_UPATH2()
to capture path information for audit records.  This allows us to
move the definitions of ARG_* out of the public audit header file,
as they are an implementation detail of our current kernel-internal
audit record, which may change.

Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-29 07:44:43 +00:00
Robert Watson
b146fc1bf0 Rework vnode argument auditing to follow the same structure, in order
to avoid exposing ARG_ macros/flag values outside of the audit code in
order to name which one of two possible vnodes will be audited for a
system call.

Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:52:24 +00:00
Robert Watson
e4b4bbb665 Audit file descriptors passed to fooat(2) system calls, which are used
instead of the root/current working directory as the starting point for
lookups.  Up to two such descriptors can be audited.  Add audit record
BSM encoding for fooat(2).

Note: due to an error in the OpenBSM 1.1p1 configuration file, a
further change is required to that file in order to fix openat(2)
auditing.

Approved by:	re (kib)
Reviewed by:	rdivacky (fooat(2) portions)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2009-07-28 21:39:58 +00:00
Robert Watson
597df30e62 Import OpenBSM 1.1p1 from vendor branch to 8-CURRENT, populating
contrib/openbsm and a subset also imported into sys/security/audit.
This patch release addresses several minor issues:

- Fixes to AUT_SOCKUNIX token parsing.
- IPv6 support for au_to_me(3).
- Improved robustness in the parsing of audit_control, especially long
  flags/naflags strings and whitespace in all fields.
- Add missing conversion of a number of FreeBSD/Mac OS X errnos to/from BSM
  error number space.

MFC after:	3 weeks
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
Approved by:	re (kib)
2009-07-17 14:02:20 +00:00
Robert Watson
6196f898bb Create audit records for AUE_POSIX_OPENPT, currently w/o arguments.
Approved by:	re (audit argument blanket)
2009-07-02 16:33:38 +00:00
Robert Watson
deedc899fd Fix comment misthink.
Submitted by:	b. f. <bf1783 at googlemail.com>
Approved by:	re (audit argument blanket)
MFC after:	1 week
2009-07-02 09:50:13 +00:00
Robert Watson
2a5658382a Clean up a number of aspects of token generation from audit arguments to
system calls:

- Centralize generation of argument tokens for VM addresses in a macro,
  ADDR_TOKEN(), and properly encode 64-bit addresses in 64-bit arguments.
- Fix up argument numbers across a large number of syscalls so that they
  match the numeric argument into the system call.
- Don't audit the address argument to ioctl(2) or ptrace(2), but do keep
  generating tokens for mmap(2), minherit(2), since they relate to passing
  object access across execve(2).

Approved by:	re (audit argument blanket)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-07-02 09:15:30 +00:00
Robert Watson
03f7b00438 For access(2) and eaccess(2), audit the requested access mode.
Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 22:47:45 +00:00
Robert Watson
9e4c1521d5 Define missing audit argument macro AUDIT_ARG_SOCKET(), and
capture the domain, type, and protocol arguments to socket(2)
and socketpair(2).

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 18:54:49 +00:00
Robert Watson
6d5a61563a When auditing unmount(2), capture FSID arguments as regular text strings
rather than as paths, which would lead to them being treated as relative
pathnames and hence confusingly converted into absolute pathnames.

Capture flags to unmount(2) via an argument token.

Approved by:	re (audit argument blanket)
MFC after:	3 days
2009-07-01 16:56:56 +00:00
Robert Watson
422d786676 Audit the file descriptor number passed to lseek(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 15:37:23 +00:00
Robert Watson
2ef24dde7c udit the 'options' argument to wait4(2).
Approved by:	re (kib)
MFC after:	3 days
2009-07-01 12:36:10 +00:00
Stacey Son
86120afae4 Dynamically allocate the gidset field in audit record.
This fixes a problem created by the recent change that allows a large
number of groups per user.  The gidset field in struct kaudit_record
is now dynamically allocated to the size needed rather than statically
(using NGROUPS).

Approved by:	re@ (kensmith, rwatson), gnn (mentor)
2009-06-29 20:19:19 +00:00
Robert Watson
14961ba789 Replace AUDIT_ARG() with variable argument macros with a set more more
specific macros for each audit argument type.  This makes it easier to
follow call-graphs, especially for automated analysis tools (such as
fxr).

In MFC, we should leave the existing AUDIT_ARG() macros as they may be
used by third-party kernel modules.

Suggested by:	brooks
Approved by:	re (kib)
Obtained from:	TrustedBSD Project
MFC after:	1 week
2009-06-27 13:58:44 +00:00
Konstantin Belousov
3364c323e6 Implement global and per-uid accounting of the anonymous memory. Add
rlimit RLIMIT_SWAP that limits the amount of swap that may be reserved
for the uid.

The accounting information (charge) is associated with either map entry,
or vm object backing the entry, assuming the object is the first one
in the shadow chain and entry does not require COW. Charge is moved
from entry to object on allocation of the object, e.g. during the mmap,
assuming the object is allocated, or on the first page fault on the
entry. It moves back to the entry on forks due to COW setup.

The per-entry granularity of accounting makes the charge process fair
for processes that change uid during lifetime, and decrements charge
for proper uid when region is unmapped.

The interface of vm_pager_allocate(9) is extended by adding struct ucred *,
that is used to charge appropriate uid when allocation if performed by
kernel, e.g. md(4).

Several syscalls, among them is fork(2), may now return ENOMEM when
global or per-uid limits are enforced.

In collaboration with:	pho
Reviewed by:	alc
Approved by:	re (kensmith)
2009-06-23 20:45:22 +00:00
Ed Schouten
fbbbf5d135 Chase the removal of PRIV_TTY_PRISON in the mac(9) modules.
Reported by:	kib
Pointy hat to:	me
2009-06-20 15:54:35 +00:00
Konstantin Belousov
d8b0556c6d Adapt vfs kqfilter to the shared vnode lock used by zfs write vop. Use
vnode interlock to protect the knote fields [1]. The locking assumes
that shared vnode lock is held, thus we get exclusive access to knote
either by exclusive vnode lock protection, or by shared vnode lock +
vnode interlock.

Do not use kl_locked() method to assert either lock ownership or the
fact that curthread does not own the lock. For shared locks, ownership
is not recorded, e.g. VOP_ISLOCKED can return LK_SHARED for the shared
lock not owned by curthread, causing false positives in kqueue subsystem
assertions about knlist lock.

Remove kl_locked method from knlist lock vector, and add two separate
assertion methods kl_assert_locked and kl_assert_unlocked, that are
supposed to use proper asserts. Change knlist_init accordingly.

Add convenience function knlist_init_mtx to reduce number of arguments
for typical knlist initialization.

Submitted by:	jhb [1]
Noted by:	jhb [2]
Reviewed by:	jhb
Tested by:	rnoland
2009-06-10 20:59:32 +00:00
Robert Watson
bcf11e8d00 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
Robert Watson
3ad3d9c5ef Add one further check with mac_policy_count to an mbuf copying case
(limited to netatalk) to avoid MAC label lookup on both mbufs if no
policies are registered.

Obtained from:	TrustedBSD Project
2009-06-03 19:41:12 +00:00
Robert Watson
3de4046939 Continue work to optimize performance of "options MAC" when no MAC policy
modules are loaded by avoiding mbuf label lookups when policies aren't
loaded, pushing further socket locking into MAC policy modules, and
avoiding locking MAC ifnet locks when no policies are loaded:

- Check mac_policies_count before looking for mbuf MAC label m_tags in MAC
  Framework entry points.  We will still pay label lookup costs if MAC
  policies are present but don't require labels (typically a single mbuf
  header field read, but perhaps further indirection if IPSEC or other
  m_tag consumers are in use).

- Further push socket locking for socket-related access control checks and
  events into MAC policies from the MAC Framework, so that sockets are
  only locked if a policy specifically requires a lock to protect a label.
  This resolves lock order issues during sonewconn() and also in local
  domain socket cross-connect where multiple socket locks could not be
  held at once for the purposes of propagatig MAC labels across multiple
  sockets.  Eliminate mac_policy_count check in some entry points where it
  no longer avoids locking.

- Add mac_policy_count checking in some entry points relating to network
  interfaces that otherwise lock a global MAC ifnet lock used to protect
  ifnet labels.

Obtained from:	TrustedBSD Project
2009-06-03 18:46:28 +00:00
Robert Watson
15141acc67 By default, label all network interfaces as biba/equal on attach. This
makes it easier for first-time users to configure and work with biba as
remote acess is still allowed.  Effectively, this means that, by default,
only local security properties, not distributed ones, are enforced.

Obtained from:	TrustedBSD Project
2009-06-03 08:49:44 +00:00
Robert Watson
5f51fb4871 Mark MAC Framework sx and rm locks as NOWITNESS to suppress warnings that
might arise from WITNESS not understanding its locking protocol, which
should be deadlock-free.  Currently these warnings generally don't occur,
but as object locking is pushed into policies for some object types, they
would otherwise occur more often.

Obtained from:	TrustedBSD Project
2009-06-02 22:22:09 +00:00
Robert Watson
f93bfb23dc Add internal 'mac_policy_count' counter to the MAC Framework, which is a
count of the number of registered policies.

Rather than unconditionally locking sockets before passing them into MAC,
lock them in the MAC entry points only if mac_policy_count is non-zero.

This avoids locking overhead for a number of socket system calls when no
policies are registered, eliminating measurable overhead for the MAC
Framework for the socket subsystem when there are no active policies.

Possibly socket locks should be acquired by policies if they are required
for socket labels, which would further avoid locking overhead when there
are policies but they don't require labeling of sockets, or possibly
don't even implement socket controls.

Obtained from:	TrustedBSD Project
2009-06-02 18:26:17 +00:00
Robert Watson
1a109c1cb0 Make the rmlock(9) interface a bit more like the rwlock(9) interface:
- Add rm_init_flags() and accept extended options only for that variation.
- Add a flags space specifically for rm_init_flags(), rather than borrowing
  the lock_init() flag space.
- Define flag RM_RECURSE to use instead of LO_RECURSABLE.
- Define flag RM_NOWITNESS to allow an rmlock to be exempt from WITNESS
  checking; this wasn't possible previously as rm_init() always passed
  LO_WITNESS when initializing an rmlock's struct lock.
- Add RM_SYSINIT_FLAGS().
- Rename embedded mutex in rmlocks to make it more obvious what it is.
- Update consumers.
- Update man page.
2009-05-29 10:52:37 +00:00
Jamie Gritton
0304c73163 Add hierarchical jails. A jail may further virtualize its environment
by creating a child jail, which is visible to that jail and to any
parent jails.  Child jails may be restricted more than their parents,
but never less.  Jail names reflect this hierarchy, being MIB-style
dot-separated strings.

Every thread now points to a jail, the default being prison0, which
contains information about the physical system.  Prison0's root
directory is the same as rootvnode; its hostname is the same as the
global hostname, and its securelevel replaces the global securelevel.
Note that the variable "securelevel" has actually gone away, which
should not cause any problems for code that properly uses
securelevel_gt() and securelevel_ge().

Some jail-related permissions that were kept in global variables and
set via sysctls are now per-jail settings.  The sysctls still exist for
backward compatibility, used only by the now-deprecated jail(2) system
call.

Approved by:	bz (mentor)
2009-05-27 14:11:23 +00:00
Robert Watson
81fee06f9c Convert the MAC Framework from using rwlocks to rmlocks to stabilize
framework registration for non-sleepable entry points.

Obtained from:	TrustedBSD Project
2009-05-27 09:41:58 +00:00
Attilio Rao
dfd233edd5 Remove the thread argument from the FSD (File-System Dependent) parts of
the VFS.  Now all the VFS_* functions and relating parts don't want the
context as long as it always refers to curthread.

In some points, in particular when dealing with VOPs and functions living
in the same namespace (eg. vflush) which still need to be converted,
pass curthread explicitly in order to retain the old behaviour.
Such loose ends will be fixed ASAP.

While here fix a bug: now, UFS_EXTATTR can be compiled alone without the
UFS_EXTATTR_AUTOSTART option.

VFS KPI is heavilly changed by this commit so thirdy parts modules needs
to be recompiled.  Bump __FreeBSD_version in order to signal such
situation.
2009-05-11 15:33:26 +00:00
Robert Watson
fa76567150 Rename MAC Framework-internal macros used to invoke policy entry points:
MAC_BOOLEAN           -> MAC_POLICY_BOOLEAN
  MAC_BOOLEAN_NOSLEEP   -> MAC_POLICY_BOOLEANN_NOSLEEP
  MAC_CHECK             -> MAC_POLICY_CHECK
  MAC_CHECK_NOSLEEP     -> MAC_POLICY_CHECK_NOSLEEP
  MAC_EXTERNALIZE       -> MAC_POLICY_EXTERNALIZE
  MAC_GRANT             -> MAC_POLICY_GRANT
  MAC_GRANT_NOSLEEP     -> MAC_POLICY_GRANT_NOSLEEP
  MAC_INTERNALIZE       -> MAC_POLICY_INTERNALIZE
  MAC_PERFORM           -> MAC_POLICY_PERFORM_CHECK
  MAC_PERFORM_NOSLEEP   -> MAC_POLICY_PERFORM_NOSLEEP

This frees up those macro names for use in wrapping calls into the MAC
Framework from the remainder of the kernel.

Obtained from:	TrustedBSD Project
2009-05-01 21:05:40 +00:00
Robert Watson
2a5058a3ed Temporarily relax the constraints on argument size checking for A_GETCOND;
login(1) isn't quite ready for them yet on 64-bit systems as it continues
to use the conventions of the old version of the API.

Reported by:	stas, Jakub Lach <jakub_lach at mailplus.pl>
2009-04-19 23:28:08 +00:00
Robert Watson
4df4e33572 Merge OpenBSM 1.1 changes to the FreeBSD 8.x kernel:
- Add and use mapping of fcntl(2) commands to new BSM constant space.
- Adopt (int) rather than (long) arguments to a number of auditon(2)
  commands, as has happened in Solaris, and add compatibility code to
  handle the old comments.

Note that BSM_PF_IEEE80211 is partially but not fully removed, as the
userspace OpenBSM 1.1alpha5 code still depends on it.  Once userspace
is updated, I'll GCC the kernel constant.

MFC after:		2 weeks
Sponsored by:		Apple, Inc.
Obtained from:		TrustedBSD Project
Portions submitted by:	sson
2009-04-19 14:53:17 +00:00
Robert Watson
fe69399069 Merge new kernel files from OpenBSM 1.1: audit_fcntl.h and
audit_bsm_fcntl.c contain utility routines to map local fcntl
commands into BSM constants.  Adaptation to the FreeBSD kernel
environment will follow in a future commit.

Sponsored by:	Apple, Inc.
Obtained from:	TrustedBSD Project
MFC after:	2 weeks
2009-04-16 20:17:32 +00:00
Robert Watson
2f106d5e08 Remove D_NEEDGIANT from audit pipes. I'm actually not sure why this was
here, but isn't needed.

MFC after:	2 weeks
Sponsored by:	Apple, Inc.
2009-04-16 11:57:16 +00:00
Edward Tomasz Napierala
6180d3185d Get rid of VSTAT and replace it with VSTAT_PERMS, which is somewhat
better defined.

Approved by:	rwatson (mentor)
2009-03-29 17:45:48 +00:00
Pawel Jakub Dawidek
a3ce3b6d35 - Correct logic in if statement - we want to allocate temporary buffer
when someone is passing new rules, not when he only want to read them.
  Because of this bug, even if the given rules were incorrect, they
  ended up in rule_string.
- Add missing protection for rule_string when coping it.

Reviewed by:	rwatson
MFC after:	1 week
2009-03-14 20:40:06 +00:00
Robert Watson
4020272933 Rework MAC Framework synchronization in a number of ways in order to
improve performance:

- Eliminate custom reference count and condition variable to monitor
  threads entering the framework, as this had both significant overhead
  and behaved badly in the face of contention.

- Replace reference count with two locks: an rwlock and an sx lock,
  which will be read-acquired by threads entering the framework
  depending on whether a give policy entry point is permitted to sleep
  or not.

- Replace previous mutex locking of the reference count for exclusive
  access with write acquiring of both the policy list sx and rw locks,
  which occurs only when policies are attached or detached.

- Do a lockless read of the dynamic policy list head before acquiring
  any locks in order to reduce overhead when no dynamic policies are
  loaded; this a race we can afford to lose.

- For every policy entry point invocation, decide whether sleeping is
  permitted, and if not, use a _NOSLEEP() variant of the composition
  macros, which will use the rwlock instead of the sxlock.  In some
  cases, we decide which to use based on allocation flags passed to the
  MAC Framework entry point.

As with the move to rwlocks/rmlocks in pfil, this may trigger witness
warnings, but these should (generally) be false positives as all
acquisition of the locks is for read with two very narrow exceptions
for policy load/unload, and those code blocks should never acquire
other locks.

Sponsored by:	Google, Inc.
Obtained from:	TrustedBSD Project
Discussed with:	csjp (idea, not specific patch)
2009-03-14 16:06:06 +00:00
Christian S.J. Peron
095b4d2689 Mark the bsdextended rules sysctl as being mpsafe.
Discussed with:	rwatson
2009-03-09 17:42:18 +00:00
Robert Watson
b3f468e253 Add a new thread-private flag, TDP_AUDITREC, to indicate whether or
not there is an audit record hung off of td_ar on the current thread.
Test this flag instead of td_ar when auditing syscall arguments or
checking for an audit record to commit on syscall return.  Under
these circumstances, td_pflags is much more likely to be in the cache
(especially if there is no auditing of the current system call), so
this should help reduce cache misses in the system call return path.

MFC after:      1 week
Reported by:    kris
Obtained from:  TrustedBSD Project
2009-03-09 10:45:58 +00:00
Robert Watson
fefd0ac8a9 Remove 'uio' argument from MAC Framework and MAC policy entry points for
extended attribute get/set; in the case of get an uninitialized user
buffer was passed before the EA was retrieved, making it of relatively
little use; the latter was simply unused by any policies.

Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2009-03-08 12:32:06 +00:00
Robert Watson
c14172e3ae Rename 'ucred' argument to mac_socket_check_bind() to 'cred' to match
other use of the same variable type.

Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2009-03-08 12:22:00 +00:00
Robert Watson
6f6174a762 Improve the consistency of MAC Framework and MAC policy entry point
naming by renaming certain "proc" entry points to "cred" entry points,
reflecting their manipulation of credentials.  For some entry points,
the process was passed into the framework but not into policies; in
these cases, stop passing in the process since we don't need it.

  mac_proc_check_setaudit -> mac_cred_check_setaudit
  mac_proc_check_setaudit_addr -> mac_cred_check_setaudit_addr
  mac_proc_check_setauid -> mac_cred_check_setauid
  mac_proc_check_setegid -> mac_cred_check_setegid
  mac_proc_check_seteuid -> mac_cred_check_seteuid
  mac_proc_check_setgid -> mac_cred_check_setgid
  mac_proc_check_setgroups -> mac_cred_ceck_setgroups
  mac_proc_check_setregid -> mac_cred_check_setregid
  mac_proc_check_setresgid -> mac_cred_check_setresgid
  mac_proc_check_setresuid -> mac_cred_check_setresuid
  mac_proc_check_setreuid -> mac_cred_check_setreuid
  mac_proc_check_setuid -> mac_cred_check_setuid

Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2009-03-08 10:58:37 +00:00
Robert Watson
2087a58ca2 Add static DTrace probes for MAC Framework access control checks and
privilege grants so that dtrace can be more easily used to monitor
the security decisions being generated by the MAC Framework following
policy invocation.

Successful access control checks will be reported by:

  mac_framework:kernel:<entrypoint>:mac_check_ok

Failed access control checks will be reported by:

  mac_framework:kernel:<entrypoint>:mac_check_err

Successful privilege grants will be reported by:

  mac_framework:kernel:priv_grant:mac_grant_ok

Failed privilege grants will be reported by:

  mac_framework:kernel:priv_grant:mac_grant_err

In all cases, the return value (always 0 for _ok, otherwise an errno
for _err) will be reported via arg0 on the probe, and subsequent
arguments will hold entrypoint-specific data, in a style similar to
privilege tracing.

Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2009-03-08 00:50:37 +00:00
Robert Watson
73e416e35d Reduce the verbosity of SDT trace points for DTrace by defining several
wrapper macros that allow trace points and arguments to be declared
using a single macro rather than several.  This means a lot less
repetition and vertical space for each trace point.

Use these macros when defining privilege and MAC Framework trace points.

Reviewed by:	jb
MFC after:	1 week
2009-03-03 17:15:05 +00:00
Robert Watson
06edd2f1e8 Merge OpenBSM 1.1 beta 1 from OpenBSM vendor branch to head, both
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge).

OpenBSM history for imported revision below for reference.

MFC after:      1 month
Sponsored by:   Apple, Inc.
Obtained from:  TrustedBSD Project

OpenBSM 1.1 beta 1

- The filesz parameter in audit_control(5) now accepts suffixes: 'B' for
  Bytes, 'K' for Kilobytes, 'M' for Megabytes, and 'G' for Gigabytes.
  For legacy support no suffix defaults to bytes.
- Audit trail log expiration support added.  It is configured in
  audit_control(5) with the expire-after parameter.  If there is no
  expire-after parameter in audit_control(5), the default, then the audit
  trail files are not expired and removed.  See audit_control(5) for
  more information.
- Change defaults in audit_control: warn at 5% rather than 20% free for audit
  partitions, rotate automatically at 2mb, and set the default policy to
  cnt,argv rather than cnt so that execve(2) arguments are captured if
  AUE_EXECVE events are audited.  These may provide more usable defaults for
  many users.
- Use au_domain_to_bsm(3) and au_socket_type_to_bsm(3) to convert
  au_to_socket_ex(3) arguments to BSM format.
- Fix error encoding AUT_IPC_PERM tokens.
2009-03-02 13:29:18 +00:00
Konstantin Belousov
ad062f5bb8 Use vm_map_entry_t instead of explicit struct vm_map_entry *.
Reviewed by:	alc
2009-02-24 20:27:48 +00:00
Robert Watson
8f8dedbed8 Set the lower bound on queue size for an audit pipe to 1 instead of 0,
as an audit pipe with a queue length of 0 is less useful.

Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
MFC after:	1 week
2009-02-08 15:38:31 +00:00
Robert Watson
f4f93a63f5 Change various routines that are responsible for transforming audit
event IDs based on arguments to return au_event_t rather than int.

Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
MFC after:	1 week
2009-02-08 14:39:35 +00:00
Robert Watson
8b14aeee4b Audit AUE_MAC_EXECVE; currently just the standard AUE_EXECVE arguments
and not the label.

Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
MFC after:	1 week
2009-02-08 14:24:35 +00:00
Robert Watson
4ba1f444c5 Audit the flag argument to the nfssvc(2) system call.
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
2009-02-08 14:04:08 +00:00
Robert Watson
91e832e0c3 Eliminate the local variable 'ape' in audit_pipe_kqread(), as it's only
used for an assertion that we don't really need anymore.

MFC after:	1 week
Reported by:	Christoph Mallon <christoph dot mallon at gmx dot de>
2009-02-04 19:56:37 +00:00
Robert Watson
c7ed8c0a85 Use __FBSDID() for $FreeBSD$ version strings in .c files.
Obtained from:	TrustedBSD Project
MFC after:	3 days
2009-01-24 13:15:45 +00:00
Robert Watson
91ec000612 Begin to add SDT tracing of the MAC Framework: add policy modevent,
register, and unregister hooks that give access to the mac_policy_conf
for the policy.

Obtained from:	TrustedBSD Project
MFC after:	3 days
2009-01-24 10:57:32 +00:00
Robert Watson
07cd9ab013 Update copyright, P4 version number as audit_bsm_token.c reflects changes
in bsm_token.c through #86 from OpenBSM.

MFC after:	1 month
Sponsored by:	Apple, Inc.
Obtained from:	TrustedBSD Project
2009-01-14 12:16:14 +00:00
Robert Watson
c74c7b73a0 Merge OpenBSM alpha 5 from OpenBSM vendor branch to head, both
contrib/openbsm (svn merge) and src/sys/{bsm,security/audit} (manual
merge).  Hook up bsm_domain.c and bsm_socket_type.c to the libbsm
build along with man pages, add audit_bsm_domain.c and
audit_bsm_socket_type.c to the kernel environment.

OpenBSM history for imported revisions below for reference.

MFC after:      1 month
Sponsored by:   Apple Inc.
Obtained from:  TrustedBSD Project

OpenBSM 1.1 alpha 5

- Stub libauditd(3) man page added.
- All BSM error number constants with BSM_ERRNO_.
- Interfaces to convert between local and BSM socket types and protocol
  families have been added: au_bsm_to_domain(3), au_bsm_to_socket_type(3),
  au_domain_to_bsm(3), and au_socket_type_to_bsm(3), along with definitions
  of constants in audit_domain.h and audit_socket_type.h.  This improves
  interoperability by converting local constant spaces, which vary by OS, to
  and from Solaris constants (where available) or OpenBSM constants for
  protocol domains not present in Solaris (a fair number).  These routines
  should be used when generating and interpreting extended socket tokens.
- Fix build warnings with full gcc warnings enabled on most supported
  platforms.
- Don't compile error strings into bsm_errno.c when building it in the kernel
  environment.
- When started by launchd, use the label com.apple.auditd rather than
  org.trustedbsd.auditd.
2009-01-14 10:44:16 +00:00
Robert Watson
9162f64b58 Rather than having MAC policies explicitly declare what object types
they label, derive that information implicitly from the set of label
initializers in their policy operations set.  This avoids a possible
class of programmer errors, while retaining the structure that
allows us to avoid allocating labels for objects that don't need
them.  As before, we regenerate a global mask of labeled objects
each time a policy is loaded or unloaded, stored in mac_labeled.

Discussed with:   csjp
Suggested by:     Jacques Vidrine <nectar at apple.com>
Obtained from:    TrustedBSD Project
Sponsored by:     Apple, Inc.
2009-01-10 10:58:41 +00:00
Robert Watson
dbdcb99498 Use MPC_OBJECT_IP6Q to indicate labeling of struct ip6q rather than
MPC_OBJECT_IPQ; it was already defined, just not used.

Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
2009-01-10 09:17:16 +00:00
Robert Watson
d423f266c4 Do a lockless read of the audit pipe list before grabbing the audit pipe
lock in order to avoid the lock acquire hit if the pipe list is very
likely empty.

Obtained from:	TrustedBSD Project
MFC after:	3 weeks
Sponsored by:	Apple, Inc.
2009-01-06 14:15:38 +00:00
Robert Watson
cd6bbe656e In AUDIT_SYSCALL_EXIT(), invoke audit_syscall_exit() only if an audit
record is active on the current thread--historically we may always
have wanted to enter the audit code if auditing was enabled, but now
we just commit the audit record so don't need to enter if there isn't
one.

Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
2009-01-06 13:59:59 +00:00
Robert Watson
efcde1e8c7 Fix white space botch: use carriage returns rather than tabs. 2008-12-31 23:22:45 +00:00