Commit Graph

745 Commits

Author SHA1 Message Date
Bjoern A. Zeeb
413628a7e3 MFp4:
Bring in updated jail support from bz_jail branch.

This enhances the current jail implementation to permit multiple
addresses per jail. In addtion to IPv4, IPv6 is supported as well.
Due to updated checks it is even possible to have jails without
an IP address at all, which basically gives one a chroot with
restricted process view, no networking,..

SCTP support was updated and supports IPv6 in jails as well.

Cpuset support permits jails to be bound to specific processor
sets after creation.

Jails can have an unrestricted (no duplicate protection, etc.) name
in addition to the hostname. The jail name cannot be changed from
within a jail and is considered to be used for management purposes
or as audit-token in the future.

DDB 'show jails' command was added to aid debugging.

Proper compat support permits 32bit jail binaries to be used on 64bit
systems to manage jails. Also backward compatibility was preserved where
possible: for jail v1 syscalls, as well as with user space management
utilities.

Both jail as well as prison version were updated for the new features.
A gap was intentionally left as the intermediate versions had been
used by various patches floating around the last years.

Bump __FreeBSD_version for the afore mentioned and in kernel changes.

Special thanks to:
- Pawel Jakub Dawidek (pjd) for his multi-IPv4 patches
  and Olivier Houchard (cognet) for initial single-IPv6 patches.
- Jeff Roberson (jeff) and Randall Stewart (rrs) for their
  help, ideas and review on cpuset and SCTP support.
- Robert Watson (rwatson) for lots and lots of help, discussions,
  suggestions and review of most of the patch at various stages.
- John Baldwin (jhb) for his help.
- Simon L. Nielsen (simon) as early adopter testing changes
  on cluster machines as well as all the testers and people
  who provided feedback the last months on freebsd-jail and
  other channels.
- My employer, CK Software GmbH, for the support so I could work on this.

Reviewed by:	(see above)
MFC after:	3 months (this is just so that I get the mail)
X-MFC Before:   7.2-RELEASE if possible
2008-11-29 14:32:14 +00:00
Robert Watson
a760c0b245 Regularize /* FALLTHROUGH */ comments in the BSM event type switch, and
add one that was missing.

MFC after:	3 weeks
Coverity ID:	3960
2008-11-25 11:25:45 +00:00
Robert Watson
e6870c95e3 When repeatedly accessing a thread credential, cache the credential
pointer in a local thread.  While this is unlikely to significantly
improve performance given modern compiler behavior, it makes the code
more readable and reduces diffs to the Mac OS X version of the same
code (which stores things in creds in the same way, but where the
cred for a thread is reached quite differently).

Discussed with: sson
MFC after:      1 month
Sponsored by:   Apple Inc.
Obtained from:	TrustedBSD Project
2008-11-14 01:24:52 +00:00
Robert Watson
618521b1a2 The audit queue limit variables are size_t, so use size_t for the audit
queue length variables as well, avoiding storing the limit in a larger
type than the length.

Submitted by:	sson
Sponsored by:	Apple Inc.
MFC after:	1 week
2008-11-13 00:21:01 +00:00
Robert Watson
4ebff7e0ca Move audit-internal function definitions for getting and setting audit
kinfo state to audit_private.h.
2008-11-11 23:08:20 +00:00
Robert Watson
91721ee9ba Minor style tweaks and change lock name string to use _'s and not spaces
to improve parseability.
2008-11-11 22:59:40 +00:00
Christian S.J. Peron
ffbcef5a42 Add support for extended header BSM tokens. Currently we use the
regular header tokens.  The extended header tokens contain an IP
or IPv6 address which makes it possible to identify which host an
audit record came from when audit records are centralized.

If the host information has not been specified, the system will
default to the old style headers.  Otherwise, audit records that
are created as a result of system calls will contain host information.

This implemented has been designed to be consistent with the Solaris
implementation.  Host information is set/retrieved using the A_GETKAUDIT
and A_SETKAUDIT auditon(2) commands.  These commands require that a
pointer to a auditinfo_addr_t object is passed.  Currently only IP and
IPv6 address families are supported.

The users pace bits associated with this change will follow in an
openbsm import.

Reviewed by:	rwatson, (sson, wsalamon (older version))
MFC after:	1 month
2008-11-11 21:57:03 +00:00
Robert Watson
b713bf6e3a Wrap sx locking of the audit worker sleep lock in macros, update comments.
MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-10 22:06:24 +00:00
John Baldwin
927edcc9ba Use shared vnode locks for auditing vnode arguments as auditing only
does a VOP_GETATTR() which does not require an exclusive lock.

Reviewed by:	csjp, rwatson
2008-11-04 22:31:04 +00:00
John Baldwin
16da60664d Don't lock the vnode around calls to vn_fullpath().
Reviewed by:	csjp, rwatson
2008-11-04 22:30:24 +00:00
Robert Watson
d2f6bb070f Update introductory comment for audit pipes.
MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-02 00:25:48 +00:00
Robert Watson
6e1362b499 Remove stale comment about filtering in audit pipe ioctl routine: we do
support filtering now, although we may want to make it more interesting
in the future.

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-02 00:18:19 +00:00
Robert Watson
e4565e2028 Add comment for per-pipe stats.
MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-01 23:05:49 +00:00
Robert Watson
cff9c52e23 We only allow a partial read of the first record in an audit pipe
record queue, so move the offset field from the per-record
audit_pipe_entry structure to the audit_pipe structure.

Now that we support reading more than one record at a time, add a
new summary field to audit_pipe, ap_qbyteslen, which tracks the
total number of bytes present in a pipe, and return that (minus
the current offset) via FIONREAD and kqueue's data variable for
the pending byte count rather than the number of bytes remaining
in only the first record.

Add a number of asserts to confirm that these counts and offsets
following the expected rules.

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-01 21:56:45 +00:00
Robert Watson
a9275e0bd5 Allow a single read(2) system call on an audit pipe to retrieve data from
more than one audit record at a time in order to improve efficiency.

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-11-01 21:16:09 +00:00
Robert Watson
1a0edb10ca Since there is no longer the opportunity for record truncation, just
return 0 if the truncation counter is queried on an audit pipe.

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-10-31 15:11:01 +00:00
Robert Watson
5a9d15cd4c Historically, /dev/auditpipe has allows only whole records to be read via
read(2), which meant that records longer than the buffer passed to read(2)
were dropped.  Instead take the approach of allowing partial reads to be
continued across multiple system calls more in the style of streaming
character device.

This means retaining a record on the per-pipe queue in a partially read
state, so maintain a current offset into the record.  Keep the record on
the queue during a read, so add a new lock, ap_sx, to serialize removal
of records from the queue by either read(2) or ioctl(2) requesting a pipe
flush.  Modify the kqueue handler to return bytes left in the current
record rather than simply the size of the current record.

It is now possible to use praudit, which used the standard FILE * buffer
sizes, to track much larger record sizes from /dev/auditpipe, such as
very long command lines to execve(2).

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-10-31 14:40:21 +00:00
Robert Watson
1daa6feb45 When we drop an audit record going to and audit pipe because the audit
pipe has overflowed, drop the newest, rather than oldest, record.  This
makes overflow drop behavior consistent with memory allocation failure
leading to drop, avoids touching the consumer end of the queue from a
producer, and lowers the CPU overhead of dropping a record by dropping
before memory allocation and copying.

Obtained from:	Apple, Inc.
MFC after:	2 months
2008-10-30 23:09:19 +00:00
Robert Watson
846f37f1e7 Break out single audit_pipe_mtx into two types of locks: a global rwlock
protecting the list of audit pipes, and a per-pipe mutex protecting the
queue.

Likewise, replace the single global condition variable used to signal
delivery of a record to one or more pipes, and add a per-pipe condition
variable to avoid spurious wakeups when event subscriptions differ
across multiple pipes.

This slightly increases the cost of delivering to audit pipes, but should
reduce lock contention in the presence of multiple readers as only the
per-pipe lock is required to read from a pipe, as well as avoid
overheading when different pipes are used in different ways.

MFC after:	2 months
Sponsored by:	Apple, Inc.
2008-10-30 21:58:39 +00:00
Robert Watson
c211285f25 Protect the event->class lookup database using an rwlock instead of a
mutex, as it's rarely changed but frequently accessed read-only from
multiple threads, so a potentially significant source of contention.

MFC after:	1 month
Sponsored by:	Apple, Inc.
2008-10-30 17:47:57 +00:00
Robert Watson
a1b9471a47 The V* flags passed using an accmode_t to the access() and open()
access control checks in mac_bsdextended are not in the same
namespace as the MBI_ flags used in ugidfw policies, so add an
explicit conversion routine to get from one to the other.

Obtained from:	TrustedBSD Project
2008-10-30 10:13:53 +00:00
Edward Tomasz Napierala
178da2a90e Commit part of accmode_t changes that I missed in previous commit.
Approved by:	rwatson (mentor)
2008-10-28 21:57:32 +00:00
Robert Watson
564f8f0fee Break out strictly credential-related portions of mac_process.c into a
new file, mac_cred.c.

Obtained from:	TrustedBSD Project
2008-10-28 21:53:10 +00:00
Edward Tomasz Napierala
15bc6b2bd8 Introduce accmode_t. This is required for NFSv4 ACLs - it will be neccessary
to add more V* constants, and the variables changed by this patch were often
being assigned to mode_t variables, which is 16 bit.

Approved by:	rwatson (mentor)
2008-10-28 13:44:11 +00:00
Robert Watson
9215889d21 Rename mac_cred_mmapped_drop_perms(), which revokes access to virtual
memory mappings when the MAC label on a process changes, to
mac_proc_vm_revoke(),

It now also acquires its own credential reference directly from the
affected process rather than accepting one passed by the the caller,
simplifying the API and consumer code.

Obtained from:	TrustedBSD Project
2008-10-28 12:49:07 +00:00
Robert Watson
212ab0cfb3 Rename three MAC entry points from _proc_ to _cred_ to reflect the fact
that they operate directly on credentials: mac_proc_create_swapper(),
mac_proc_create_init(), and mac_proc_associate_nfsd().  Update policies.

Obtained from:	TrustedBSD Project
2008-10-28 11:33:06 +00:00
Robert Watson
048e2d5899 Extended comment on why we consider a partition relabel request of "0" to
be a no-op request, and why this might have to change if we want to allow
leaving a partition someday.

Obtained from:	TrustedBSD Project
MFC after:	3 days
2008-10-28 09:16:34 +00:00
Robert Watson
6c6c03be2d Rename label_on_label() to partition_check(), which is far more
suggestive as to its actual function.

Obtained from:	TrustedBSD Project
MFC after:	3 days
2008-10-28 09:12:13 +00:00
Robert Watson
5077415a10 Improve alphabetical sort order of stub entry points. 2008-10-28 08:50:09 +00:00
Robert Watson
168a6ae7a7 When the mac_bsdextended policy is unloaded, free rule memory.
Obtained from:	TrustedBSD Project
MFC after:	3 days
2008-10-27 18:08:12 +00:00
Robert Watson
0ee8da47fb Add TrustedBSD credit to new ugidfw_internal.h file. 2008-10-27 12:12:23 +00:00
Robert Watson
34f6230e25 Break mac_bsdextended.c out into multiple .c files, with the base access
control logic and policy registration remaining in that file, and access
control checks broken out into other files by class of check.

Obtained from:	TrustedBSD Project
2008-10-27 12:09:15 +00:00
Robert Watson
4074f67f5b Copy mac_bsdextended.c to two object-specific files as a prototype for how
modularize MAC policy layout.

Obtained from:	TrustedBSD Project
2008-10-27 11:46:19 +00:00
Robert Watson
048e1287fa Implement MAC policy support for IPv6 fragment reassembly queues,
modeled on IPv4 fragment reassembly queue support.

Obtained from:	TrustedBSD Project
2008-10-26 22:46:37 +00:00
Robert Watson
4b908c8bb4 Add a MAC label, MAC Framework, and MAC policy entry points for IPv6
fragment reassembly queues.

This allows policies to label reassembly queues, perform access
control checks when matching fragments to a queue, update a queue
label when fragments are matched, and label the resulting
reassembled datagram.

Obtained from:	TrustedBSD Project
2008-10-26 22:45:18 +00:00
Dag-Erling Smørgrav
e11e3f187d Fix a number of style issues in the MALLOC / FREE commit. I've tried to
be careful not to fix anything that was already broken; the NFSv4 code is
particularly bad in this respect.
2008-10-23 20:26:15 +00:00
Dag-Erling Smørgrav
1ede983cc9 Retire the MALLOC and FREE macros. They are an abomination unto style(9).
MFC after:	3 months
2008-10-23 15:53:51 +00:00
Bjoern A. Zeeb
7fb179ba7e Add a mac_inpcb_check_visible implementation to all MAC policies
that handle mac_socket_check_visible.

Reviewed by:	rwatson
MFC after:	3 months (set timer; decide then)
2008-10-17 15:11:12 +00:00
Bjoern A. Zeeb
37ee72936b Add mac_inpcb_check_visible MAC Framework entry point, which is similar
to mac_socket_check_visible but operates on the inpcb.

Reviewed by:	rwatson
MFC after:	3 months (set timer, decide then)
2008-10-17 12:54:28 +00:00
Bjoern A. Zeeb
4a5216a6dc Use the label from the socket credential rather than the
solabel which was not set by the mac_partition policy.

Spotted by:	rwatson
Reviewed by:	rwatson
MFC after:	3 days
2008-10-17 08:58:33 +00:00
Ed Schouten
d3ce832719 Remove unit2minor() use from kernel code.
When I changed kern_conf.c three months ago I made device unit numbers
equal to (unneeded) device minor numbers. We used to require
bitshifting, because there were eight bits in the middle that were
reserved for a device major number. Not very long after I turned
dev2unit(), minor(), unit2minor() and minor2unit() into macro's.
The unit2minor() and minor2unit() macro's were no-ops.

We'd better not remove these four macro's from the kernel, because there
is a lot of (external) code that may still depend on them. For now it's
harmless to remove all invocations of unit2minor() and minor2unit().

Reviewed by:	kib
2008-09-26 14:19:52 +00:00
Attilio Rao
cecd8edba5 Remove the suser(9) interface from the kernel. It has been replaced from
years by the priv_check(9) interface and just very few places are left.
Note that compatibility stub with older FreeBSD version
(all above the 8 limit though) are left in order to reduce diffs against
old versions. It is responsibility of the maintainers for any module, if
they think it is the case, to axe out such cases.

This patch breaks KPI so __FreeBSD_version will be bumped into a later
commit.

This patch needs to be credited 50-50 with rwatson@ as he found time to
explain me how the priv_check() works in detail and to review patches.

Tested by:      Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
Reviewed by:    rwatson
2008-09-17 15:49:44 +00:00
Edward Tomasz Napierala
dfa7fd1d70 Remove VSVTX, VSGID and VSUID. This should be a no-op,
as VSVTX == S_ISVTX, VSGID == S_ISGID and VSUID == S_ISUID.

Approved by:	rwatson (mentor)
2008-09-10 13:16:41 +00:00
Dag-Erling Smørgrav
19de3e9011 Unbreak the build.
Pointy hat to:	kevlo
2008-09-04 13:06:36 +00:00
Kevin Lo
f308bddd3f If the process id specified is invalid, the system call returns ESRCH 2008-09-04 10:44:33 +00:00
Attilio Rao
0359a12ead Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread
was always curthread and totally unuseful.

Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>
2008-08-28 15:23:18 +00:00
Robert Watson
3f3978840e More fully audit fexecve(2) and its arguments.
Obtained from:	TrustedBSD Project
Sponsored by:	Google, Inc.
2008-08-25 13:50:01 +00:00
Robert Watson
e08f2b26f4 Use ERANGE instead of EOVERFLOW selected in r182059, this seems more
appropriate even if Solaris doesn't document it (E2BIG) or use it
(EOVERFLOW).

Submitted by:	nectar at apple dot com
Sponsored by:	Apple, Inc.
MFC after:	3 days
2008-08-24 19:55:10 +00:00
Christian S.J. Peron
db8502672e Use sbuf_putc instead of sbuf_cat. This makes more sense, since we are
appending a single character to the buffer.

MFC after:	2 weeks
2008-08-24 03:12:17 +00:00
Robert Watson
6356dba0b4 Introduce two related changes to the TrustedBSD MAC Framework:
(1) Abstract interpreter vnode labeling in execve(2) and mac_execve(2)
    so that the general exec code isn't aware of the details of
    allocating, copying, and freeing labels, rather, simply passes in
    a void pointer to start and stop functions that will be used by
    the framework.  This change will be MFC'd.

(2) Introduce a new flags field to the MAC_POLICY_SET(9) interface
    allowing policies to declare which types of objects require label
    allocation, initialization, and destruction, and define a set of
    flags covering various supported object types (MPC_OBJECT_PROC,
    MPC_OBJECT_VNODE, MPC_OBJECT_INPCB, ...).  This change reduces the
    overhead of compiling the MAC Framework into the kernel if policies
    aren't loaded, or if policies require labels on only a small number
    or even no object types.  Each time a policy is loaded or unloaded,
    we recalculate a mask of labeled object types across all policies
    present in the system.  Eliminate MAC_ALWAYS_LABEL_MBUF option as it
    is no longer required.

MFC after:	1 week ((1) only)
Reviewed by:	csjp
Obtained from:	TrustedBSD Project
Sponsored by:	Apple, Inc.
2008-08-23 15:26:36 +00:00
Robert Watson
cc10282298 When getaudit(2) is unable to fit the terminal IPv6 address into the
space provided by its argument structure, return EOVERFLOW instead of
E2BIG.  The latter is documented in Solaris's man page, but the
former is implemented.  In either case, the caller should use
getaudit_addr(2) to return the IPv6 address.

Submitted by:	sson
Obtained from:	Apple, Inc.
MFC after:	3 days
2008-08-23 14:39:01 +00:00
Christian S.J. Peron
40d288ba0c Make sure we check the preselection masks present for all audit pipes.
It is possible that the audit pipe(s) have different preselection configs
then the global preselection mask.

Spotted by:	Vincenzo Iozzo
MFC after:	2 weeks
2008-08-11 20:14:56 +00:00
Dag-Erling Smørgrav
2616144e43 Add sbuf_new_auto as a shortcut for the very common case of creating a
completely dynamic sbuf.

Obtained from:	Varnish
MFC after:	2 weeks
2008-08-09 11:14:05 +00:00
Robert Watson
95b85ca3a9 Minor style tweaks. 2008-08-02 22:30:51 +00:00
Robert Watson
f7c4bd95ba Rename mac_partition_enabled to partition_enabled to synchronize with
other policies that similarly now avoid the additional mac_ prefix on
variables.

MFC after:	soon
2008-08-02 20:53:59 +00:00
Robert Watson
80794edc05 In mac_bsdextended's auditctl and acct policy access control checks,
return success if the passed vnode pointer is NULL (rather than
panicking).  This can occur if either audit or accounting are
disabled while the policy is running.

Since the swapoff control has no real relevance to this policy,
which is concerned about intent to write rather than water under the
bridge, remove it.

PR:             kern/126100
Reported by:    Alan Amesbury <amesbury at umn dot edu>
MFC after:      3 days
2008-07-31 20:49:12 +00:00
Christian S.J. Peron
dfc714fba1 Currently, BSM audit pathname token generation for chrooted or jailed
processes are not producing absolute pathname tokens.  It is required
that audited pathnames are generated relative to the global root mount
point.  This modification changes our implementation of audit_canon_path(9)
and introduces a new function: vn_fullpath_global(9) which performs a
vnode -> pathname translation relative to the global mount point based
on the contents of the name cache.  Much like vn_fullpath,
vn_fullpath_global is a wrapper function which called vn_fullpath1.

Further, the string parsing routines have been converted to use the
sbuf(9) framework.  This change also removes the conditional acquisition
of Giant, since the vn_fullpath1 method will not dip into file system
dependent code.

The vnode locking was modified to use vhold()/vdrop() instead the vref()
and vrele().  This will modify the hold count instead of modifying the
user count.  This makes more sense since it's the kernel that requires
the reference to the vnode.  This also makes sure that the vnode does not
get recycled we hold the reference to it. [1]

Discussed with:	rwatson
Reviewed by:	kib [1]
MFC after:	2 weeks
2008-07-31 16:57:41 +00:00
Robert Watson
f6d4a8a77b Further synchronization of copyrights, licenses, white space, etc from
Apple and from the OpenBSM vendor tree.

Obtained from:	Apple Inc., TrustedBSD Project
MFC after:	3 days
2008-07-31 09:54:35 +00:00
Robert Watson
33f0efe6b0 Minor white space tweak.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-23 07:42:31 +00:00
Robert Watson
93536b495d If an AUE_SYSCTL_NONADMIN audit event is selected, generate a record
with equivilent content to AUE_SYSCTL.

Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 17:54:32 +00:00
Robert Watson
30d0721b59 Further minor style fixes to audit.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 17:49:30 +00:00
Robert Watson
1814e5b748 Remove unneeded \ at the end of a macro.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 17:08:27 +00:00
Robert Watson
3c4636a7d4 Further minor white space tweaks.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 17:06:49 +00:00
Robert Watson
fc1286c81d Generally avoid <space><tab> as a white space anomoly.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 16:44:48 +00:00
Robert Watson
0c0a142a52 Use #define<tab> rather than #define<space>.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 16:21:59 +00:00
Robert Watson
f1cb603072 Comment fix.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 16:02:21 +00:00
Robert Watson
98ee1b30aa Comment typo fix.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 15:54:10 +00:00
Robert Watson
c2f027ffb8 Minor white space synchronization to Apple version of security audit.
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 15:49:19 +00:00
Robert Watson
bc9a43d698 In preparation to sync Apple and FreeBSD versions of security audit,
pick up the Apple Computer -> Apple change in their copyright and
license templates.

Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 15:29:48 +00:00
Robert Watson
59b622e6b3 Use unsigned int when iterating over groupsets in audit_arg_groupset().
Obtained from:	Apple Inc.
MFC after:	3 days
2008-07-22 15:17:21 +00:00
John Baldwin
6bc1e9cd84 Rework the lifetime management of the kernel implementation of POSIX
semaphores.  Specifically, semaphores are now represented as new file
descriptor type that is set to close on exec.  This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely.  It is also suggested as one possible implementation
in the spec and at least one other OS (OS X) uses this approach.

Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
  the sem_unlink() operation.  Prior to this patch, if a semaphore's name
  was removed, valid handles from sem_open() would get EINVAL errors from
  sem_getvalue(), sem_post(), etc.  This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
  process exited or exec'd.  They were only cleaned up if the process
  did an explicit sem_destroy().  This could result in a leak of semaphore
  objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
  'struct ksem' of an unnamed semaphore (created via sem_init)) and had
  write access to the semaphore based on UID/GID checks, then that other
  process could manipulate the semaphore via sem_destroy(), sem_post(),
  sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the proces
  creating the semaphore was not honored.  Thus if your umask denied group
  read/write access but the explicit mode in the sem_init() call allowed
  it, the semaphore would be readable/writable by other users in the
  same group, for example.  This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
  then it might have deregistered one or more of the semaphore system
  calls before it noticed that there was a problem.  I'm not sure if
  this actually happened as the order that modules are discovered by the
  kernel linker depends on how the actual .ko file is linked.  One can
  make the order deterministic by using a single module with a mod_event
  handler that explicitly registers syscalls (and deregisters during
  unload after any checks).  This also fixes a race where even if the
  sem_module unloaded first it would have destroyed locks that the
  syscalls might be trying to access if they are still executing when
  they are unloaded.

  XXX: By the way, deregistering system calls doesn't do any blocking
  to drain any threads from the calls.
- Some minor fixes to errno values on error.  For example, sem_init()
  isn't documented to return ENFILE or EMFILE if we run out of semaphores
  the way that sem_open() can.  Instead, it should return ENOSPC in that
  case.

Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
  named semaphores nearly in a similar fashion to the POSIX shared memory
  object file descriptors.  Kernel semaphores can now also have names
  longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
  in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
  done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
  MAC checks for POSIX semaphores accept both a file credential and an
  active credential.  There is also a new posixsem_check_stat() since it
  is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
  in src/tools/regression/posixsem.

Reported by:	kris (1)
Tested by:	kris
Reviewed by:	rwatson (lightly)
MFC after:	1 month
2008-06-27 05:39:04 +00:00
John Baldwin
127cc7673d Add missing counter increments for posix shm checks. 2008-06-26 13:49:32 +00:00
John Baldwin
c4f3a35a54 Remove the posixsem_check_destroy() MAC check. It is semantically identical
to doing a MAC check for close(), but no other types of close() (including
close(2) and ksem_close(2)) have MAC checks.

Discussed with:	rwatson
2008-06-23 21:37:53 +00:00
Robert Watson
37f44cb428 The TrustedBSD MAC Framework named struct ipq instances 'ipq', which is the
same as the global variable defined in ip_input.c.  Instead, adopt the name
'q' as found in about 1/2 of uses in ip_input.c, preventing a collision on
the name.  This is non-harmful, but means that search and replace on the
global works less well (as in the virtualization work), as well as indexing
tools.

MFC after:	1 week
Reported by:	julian
2008-06-13 22:14:15 +00:00
Ed Schouten
29d4cb241b Don't enforce unique device minor number policy anymore.
Except for the case where we use the cloner library (clone_create() and
friends), there is no reason to enforce a unique device minor number
policy. There are various drivers in the source tree that allocate unr
pools and such to provide minor numbers, without using them themselves.

Because we still need to support unique device minor numbers for the
cloner library, introduce a new flag called D_NEEDMINOR. All cdevsw's
that are used in combination with the cloner library should be marked
with this flag to make the cloning work.

This means drivers can now freely use si_drv0 to store their own flags
and state, making it effectively the same as si_drv1 and si_drv2. We
still keep the minor() and dev2unit() routines around to make drivers
happy.

The NTFS code also used the minor number in its hash table. We should
not do this anymore. If the si_drv0 field would be changed, it would no
longer end up in the same list.

Approved by:	philip (mentor)
2008-06-11 18:55:19 +00:00
Simon L. B. Nielsen
3bff0167b9 When the file-system containing the audit log file is running low on
disk space a warning is printed.  Make this warning a bit more
informative.

Approved by:	rwatson
2008-06-10 20:05:32 +00:00
Robert Watson
4e95375678 Add an XXX comment regarding a bug I introduced when modifying the behavior
of audit log vnode rotation: on shutdown, we may not properly drain all
pending records, which could lead to lost records during system shutdown.
2008-06-03 11:06:34 +00:00
Christian S.J. Peron
1f84ab0f2a Plug a memory leak which can occur when multiple MAC policies are loaded
which label mbufs.  This leak can occur if one policy successfully allocates
label storage and subsequent allocations from other policies fail.

Spotted by:	rwatson
MFC after:	1 week
2008-05-27 14:18:02 +00:00
Robert Watson
bcbd871a3f Don't use LK_DRAIN before calling VOP_FSYNC() in the two further
panic cases for audit trail failure -- this doesn't contribute
anything, and might arguably be wrong.

MFC after:	1 week
Requested by:	attilio
2008-05-21 13:59:05 +00:00
Robert Watson
bf7baa9eca Don't use LK_DRAIN before calling VOP_FSYNC() in the panic case for
audit trail failure -- this doesn't contribute anything, and might
arguably be wrong.

MFC after:	1 week
Requested by:	attilio
2008-05-21 13:05:06 +00:00
Robert Watson
7d8ab8bafb When testing whether to enter the audit argument gathering code, rather
than checking whether audit is enabled globally, instead check whether
the current thread has an audit record.  This avoids entering the audit
code to collect argument data if auditing is enabled but the current
system call is not of interest to audit.

MFC after:	1 week
Sponsored by:	Apple, Inc.
2008-05-06 00:32:23 +00:00
Robert Watson
fa9e0a18af Fix include guard spelling.
MFC after:	3 days
Submitted by:	diego
2008-04-27 15:51:49 +00:00
Robert Watson
81efe39deb Use logic or, not binary or, when deciding whether or not a system call
exit requires entering the audit code.  The result is much the same,
but they mean different things.

MFC afer:	3 days
Submitted by:	Diego Giagio <dgiagio at gmail dot com>
2008-04-24 12:23:31 +00:00
Robert Watson
1a46aa801e When auditing state from an IPv4 or IPv6 socket, use read locks on the
inpcb rather than write locks.

MFC after:	3 months
2008-04-19 18:37:08 +00:00
Robert Watson
211b72ad2f When propagating a MAC label from an inpcb to an mbuf, allow read and
write locks on the inpcb, not just write locks.

MFC after:	3 months
2008-04-19 18:35:27 +00:00
Robert Watson
8501a69cc9 Convert pcbinfo and inpcb mutexes to rwlocks, and modify macros to
explicitly select write locking for all use of the inpcb mutex.
Update some pcbinfo lock assertions to assert locked rather than
write-locked, although in practice almost all uses of the pcbinfo
rwlock main exclusive, and all instances of inpcb lock acquisition
are exclusive.

This change should introduce (ideally) little functional change.
However, it lays the groundwork for significantly increased
parallelism in the TCP/IP code.

MFC after:	3 months
Tested by:	kris (superset of committered patch)
2008-04-17 21:38:18 +00:00
Robert Watson
dda409d4ec Use __FBSDID() for $FreeBSD$ IDs in the audit code.
MFC after:	3 days
2008-04-13 22:06:56 +00:00
Robert Watson
646a9f8029 Make naming of include guards for MAC Framework include files more
consistent with other kernel include guards (don't start with _SYS).

MFC after:	3 days
2008-04-13 21:45:52 +00:00
Konstantin Belousov
57b4252e45 Add the support for the AT_FDCWD and fd-relative name lookups to the
namei(9).

Based on the submission by rdivacky,
	sponsored by Google Summer of Code 2007
Reviewed by:	rwatson, rdivacky
Tested by:	pho
2008-03-31 12:01:21 +00:00
Robert Watson
237fdd787b In keeping with style(9)'s recommendations on macros, use a ';'
after each SYSINIT() macro invocation.  This makes a number of
lightweight C parsers much happier with the FreeBSD kernel
source, including cflow's prcc and lxr.

MFC after:	1 month
Discussed with:	imp, rink
2008-03-16 10:58:09 +00:00
Robert Watson
d4cafc74ae Remove XXX to remind me to check the free space calculation, which to my
eyes appears right following a check.

MFC after:	3 days
2008-03-10 18:15:02 +00:00
Christian S.J. Peron
e5ad5f4d70 Change auditon(2) so that if somebody supplies an invalid command, it
returns EINVAL. Right now we return 0 or success for invalid commands,
which could be quite problematic in certain conditions.

MFC after:	1 week
Discussed with:	rwatson
2008-03-06 22:57:03 +00:00
Robert Watson
8805ca53e7 Rather than copying out the full audit trigger record, which includes
a queue entry field, just copy out the unsigned int that is the trigger
message.  In practice, auditd always requested sizeof(unsigned int), so
the extra bytes were ignored, but copying them out was not the intent.

MFC after:	1 month
2008-03-02 21:34:17 +00:00
Robert Watson
6cc189913c Add audit_prefixes to two more globally visible functions in the Audit
implementation.

MFC after:	1 month
2008-03-01 11:40:49 +00:00
Robert Watson
fb4ed8c9bf Rename globally exposed symbol send_trigger() to audit_send_trigger().
MFC after:	1 month
2008-03-01 11:04:04 +00:00
Robert Watson
ae87be447c Replace somewhat awkward audit trail rotation scheme, which involved the
global audit mutex and condition variables, with an sx lock which protects
the trail vnode and credential while in use, and is acquired by the system
call code when rotating the trail.  Previously, a "message" would be sent
to the kernel audit worker, which did the rotation, but the new code is
simpler and (hopefully) less error-prone.

Obtained from:	TrustedBSD Project
MFC after:	1 month
2008-02-27 17:12:22 +00:00
Robert Watson
303d3f35fb Rename several audit functions in the global kernel symbol namespace to
have audit_ on the front:

- canon_path -> audit_canon_path
- msgctl_to_event -> audit_msgctl_to_event
- semctl_to_event -> audit_semctl_to_event

MFC after:	1 month
2008-02-25 20:28:00 +00:00
Christian S.J. Peron
c52a508838 Make sure that the termid type is initialized to AU_IPv4 by default.
This makes sure that process tokens credentials with un-initialized
audit contexts are handled correctly.  Currently, when invariants are
enabled, this change fixes a panic by ensuring that we have a valid
termid family.  Also, this fixes token generation for process tokens
making sure that userspace is always getting a valid token.

This is consistent with what Solaris does when an audit context is
un-initialized.

Obtained from:	TrustedBSD Project
MFC after:	1 week
2008-01-28 17:33:46 +00:00
Robert Watson
5ac3b03500 Properly return the error from mls_subject_privileged() in the ifnet
relabel check for MLS rather than returning 0 directly.

This problem didn't result in a vulnerability currently as the central
implementation of ifnet relabeling also checks for UNIX privilege, and
we currently don't guarantee containment for the root user in mac_mls,
but we should be using the MLS definition of privilege as well as the
UNIX definition in anticipation of supporting root containment at some
point.

MFC after:	3 days
Submitted by:	Zhouyi Zhou <zhouzhouyi at gmail dot com>
Sponsored by:	Google SoC 2007
2008-01-28 10:20:18 +00:00
Christian S.J. Peron
0f7e334a95 Fix gratuitous whitespace bug
MFC after:	1 week
Obtained from:	TrustedBSD Project
2008-01-18 19:57:21 +00:00
Christian S.J. Peron
cd109a68ae Add a case for AUE_LISTEN. This removes the following console error message:
"BSM conversion requested for unknown event 43140"

It should be noted that we need to audit the fd argument for this system
call.

Obtained from:	TrustedBSD Project
MFC after:	1 week
2008-01-18 19:50:34 +00:00
Attilio Rao
22db15c06f VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
Attilio Rao
cb05b60a89 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
John Baldwin
8e38aeff17 Add a new file descriptor type for IPC shared memory objects and use it to
implement shm_open(2) and shm_unlink(2) in the kernel:
- Each shared memory file descriptor is associated with a swap-backed vm
  object which provides the backing store.  Each descriptor starts off with
  a size of zero, but the size can be altered via ftruncate(2).  The shared
  memory file descriptors also support fstat(2).  read(2), write(2),
  ioctl(2), select(2), poll(2), and kevent(2) are not supported on shared
  memory file descriptors.
- shm_open(2) and shm_unlink(2) are now implemented as system calls that
  manage shared memory file descriptors.  The virtual namespace that maps
  pathnames to shared memory file descriptors is implemented as a hash
  table where the hash key is generated via the 32-bit Fowler/Noll/Vo hash
  of the pathname.
- As an extension, the constant 'SHM_ANON' may be specified in place of the
  path argument to shm_open(2).  In this case, an unnamed shared memory
  file descriptor will be created similar to the IPC_PRIVATE key for
  shmget(2).  Note that the shared memory object can still be shared among
  processes by sharing the file descriptor via fork(2) or sendmsg(2), but
  it is unnamed.  This effectively serves to implement the getmemfd() idea
  bandied about the lists several times over the years.
- The backing store for shared memory file descriptors are garbage
  collected when they are not referenced by any open file descriptors or
  the shm_open(2) virtual namespace.

Submitted by:	dillon, peter (previous versions)
Submitted by:	rwatson (I based this on his version)
Reviewed by:	alc (suggested converting getmemfd() to shm_open())
2008-01-08 21:58:16 +00:00
Robert Watson
3de213cc00 Add a new 'why' argument to kdb_enter(), and a set of constants to use
for that argument.  This will allow DDB to detect the broad category of
reason why the debugger has been entered, which it can use for the
purposes of deciding which DDB script to run.

Assign approximate why values to all current consumers of the
kdb_enter() interface.
2007-12-25 17:52:02 +00:00
Wojciech A. Koszek
7a9d5a45e7 Change "audit_pipe_preselect" to "audit_pipe_presel" to make it print
with proper alignment in ddb(4) and vmstat(8).

Reviewed by:	rwatson@
2007-12-25 13:23:19 +00:00
Robert Watson
b5f992b93d Fix a MAC label leak for POSIX semaphores, in which per-policy labels
would be properly disposed of, but the global label structure for the
semaphore wouldn't be freed.

MFC after:	3 days
Reported by:	tanyong <tanyong at ercist dot iscas dot ac dot cn>,
		zhouzhouyi
2007-12-17 17:26:32 +00:00
Wojciech A. Koszek
4ce05f7e44 Explicitly initialize 'ret' to 0'. It lets one to build tmpfs from the
latest source tree with older compiler--gcc3.

Approved by:	cognet (mentor)
2007-12-04 20:20:59 +00:00
Robert Watson
1876fb2118 Implement per-object type consistency checks for labels passed to
'internalize' operations rather than using a single common check.

Obtained from:	TrustedBSD Project
2007-10-30 00:01:28 +00:00
Robert Watson
323f4cc31d Replace use of AU_NULL with 0 when no audit classes are in use; this
supports the removal of hard-coded audit class constants in OpenBSM
1.0.  All audit classes are now dynamically configured via the
audit_class database.

Obtained from:	TrustedBSD Project
2007-10-29 18:07:48 +00:00
Robert Watson
f03368334e Canonicalize names of local variables.
Add some missing label checks in mac_test.

Obtained from:	TrustedBSD Project
2007-10-29 15:30:47 +00:00
Robert Watson
eb320b0ee7 Resort TrustedBSD MAC Framework policy entry point implementations and
declarations to match the object, operation sort order in the framework
itself.

Obtained from:	TrustedBSD Project
2007-10-29 13:33:06 +00:00
Robert Watson
f10b1ebc78 Add missing mac_test labeling and sleep checks for the syncache.
Discussed with:	csjp
Obtained from:	TrustedBSD Project
2007-10-28 18:33:31 +00:00
Robert Watson
2a9e17ce8e Garbage collect mac_mbuf_create_multicast_encap TrustedBSD MAC Framework
entry point, which is no longer required now that we don't support
old-style multicast tunnels.  This removes the last mbuf object class
entry point that isn't init/copy/destroy.

Obtained from:	TrustedBSD Project
2007-10-28 17:55:57 +00:00
Robert Watson
a13e21f7bc Continue to move from generic network entry points in the TrustedBSD MAC
Framework by moving from mac_mbuf_create_netlayer() to more specific
entry points for specific network services:

- mac_netinet_firewall_reply() to be used when replying to in-bound TCP
  segments in pf and ipfw (etc).

- Rename mac_netinet_icmp_reply() to mac_netinet_icmp_replyinplace() and
  add mac_netinet_icmp_reply(), reflecting that in some cases we overwrite
  a label in place, but in others we apply the label to a new mbuf.

Obtained from:	TrustedBSD Project
2007-10-28 17:12:48 +00:00
Robert Watson
b9b0dac33b Move towards more explicit support for various network protocol stacks
in the TrustedBSD MAC Framework:

- Add mac_atalk.c and add explicit entry point mac_netatalk_aarp_send()
  for AARP packet labeling, rather than using a generic link layer
  entry point.

- Add mac_inet6.c and add explicit entry point mac_netinet6_nd6_send()
  for ND6 packet labeling, rather than using a generic link layer entry
  point.

- Add expliict entry point mac_netinet_arp_send() for ARP packet
  labeling, and mac_netinet_igmp_send() for IGMP packet labeling,
  rather than using a generic link layer entry point.

- Remove previous genering link layer entry point,
  mac_mbuf_create_linklayer() as it is no longer used.

- Add implementations of new entry points to various policies, largely
  by replicating the existing link layer entry point for them; remove
  old link layer entry point implementation.

- Make MAC_IFNET_LOCK(), MAC_IFNET_UNLOCK(), and mac_ifnet_mtx global
  to the MAC Framework rather than static to mac_net.c as it is now
  needed outside of mac_net.c.

Obtained from:	TrustedBSD Project
2007-10-28 15:55:23 +00:00
Robert Watson
b0f4c777e4 Perform explicit label type checks for externalize entry points, rather than
a generic initialized test.

Obtained from:	TrustedBSD Project
2007-10-28 14:28:33 +00:00
Christian S.J. Peron
4777d3f98a Make sure we are incrementing the read count for each audit pipe read.
MFC after:	1 week
2007-10-27 22:28:01 +00:00
Robert Watson
438aeadf27 Give each posixsem MAC Framework entry point its own counter and test case
in the mac_test policy, rather than sharing a single function for all of
the access control checks.

Obtained from:	TrustedBSD Project
2007-10-27 10:38:57 +00:00
Robert Watson
6683b28d78 Update comment following MAC Framework entry point renaming and
reorganization.

Obtained from:	TrustedBSD Project
2007-10-26 21:16:34 +00:00
Robert Watson
8640764682 Rename 'mac_mbuf_create_from_firewall' to 'mac_netinet_firewall_send' as
we move towards netinet as a pseudo-object for the MAC Framework.

Rename 'mac_create_mbuf_linklayer' to 'mac_mbuf_create_linklayer' to
reflect general object-first ordering preference.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-26 13:18:38 +00:00
Christian S.J. Peron
57274c513c Implement AUE_CORE, which adds process core dump support into the kernel.
This change introduces audit_proc_coredump() which is called by coredump(9)
to create an audit record for the coredump event.  When a process
dumps a core, it could be security relevant.  It could be an indicator that
a stack within the process has been overflowed with an incorrectly constructed
malicious payload or a number of other events.

The record that is generated looks like this:

header,111,10,process dumped core,0,Thu Oct 25 19:36:29 2007, + 179 msec
argument,0,0xb,signal
path,/usr/home/csjp/test.core
subject,csjp,csjp,staff,csjp,staff,1101,1095,50457,10.37.129.2
return,success,1
trailer,111

- We allocate a completely new record to make sure we arent clobbering
  the audit data associated with the syscall that produced the core
  (assuming the core is being generated in response to SIGABRT  and not
  an invalid memory access).
- Shuffle around expand_name() so we can use the coredump name at the very
  beginning of the coredump call.  Make sure we free the storage referenced
  by "name" if we need to bail out early.
- Audit both successful and failed coredump creation efforts

Obtained from:	TrustedBSD Project
Reviewed by:	rwatson
MFC after:	1 month
2007-10-26 01:23:07 +00:00
Robert Watson
179da74eb8 Sort entry points in mac_framework.h and mac_policy.h alphabetically by
primary object type, and then by secondarily by method name.  This sorts
entry points relating to particular objects, such as pipes, sockets, and
vnodes together.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-25 22:45:25 +00:00
Robert Watson
02be6269c3 Normalize TCP syncache-related MAC Framework entry points to match most
other entry points in the form mac_<object>_method().

Discussed with:	csjp
Obtained from:	TrustedBSD Project
2007-10-25 14:37:37 +00:00
Robert Watson
eb2cd5e1df Rename mac_associate_nfsd_label() to mac_proc_associate_nfsd(), and move
from mac_vfs.c to mac_process.c to join other functions that setup up
process labels for specific purposes.  Unlike the two proc create calls,
this call is intended to run after creation when a process registers as
the NFS daemon, so remains an _associate_ call..

Obtained from:	TrustedBSD Project
2007-10-25 12:34:14 +00:00
Robert Watson
3f1a7a9086 Consistently name functions for mac_<policy> as <policy>_whatever rather
than mac_<policy>_whatever, as this shortens the names and makes the code
a bit easier to read.

When dealing with label structures, name variables 'mb', 'ml', 'mm rather
than the longer 'mac_biba', 'mac_lomac', and 'mac_mls', likewise making
the code a little easier to read.

Obtained from:	TrustedBSD Project
2007-10-25 11:31:11 +00:00
Robert Watson
a7f3aac7cb Further MAC Framework cleanup: normalize some local variable names and
clean up some comments.

Obtained from:	TrustedBSD Project
2007-10-25 07:49:47 +00:00
Robert Watson
30d239bc4c Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

  mac_<object>_<method/action>
  mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme.  Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier.  Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods.  Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-24 19:04:04 +00:00
Christian S.J. Peron
088b56a874 Use extended process token. The in kernel process audit
state is stored in an extended subject token now.  Make sure
that we are using the extended data.  This fixes the termID
for process tokens.

Obtained from:	TrustedBSD Project
Discussed with:	rwatson
MFC after:	1 week
2007-10-24 00:05:52 +00:00
Robert Watson
1cb99cfc25 Bump MAC_VERSION to 4 and add an 8.x line in the version table. Version 4
will include significant synchronization to the Mac OS X Leopard version
of the MAC Framework.

Obtained from:	TrustedBSD Project
2007-10-23 14:12:16 +00:00
Robert Watson
fe09513e7d Canonicalize naming of local variables for struct ksem and associated
labels to 'ks' and 'kslabel' to reflect the convention in posix_sem.c.

MFC after:	3 days
Obtained from:	TrustedBSD Project
2007-10-21 11:11:07 +00:00
Julian Elischer
3745c395ec Rename the kthread_xxx (e.g. kthread_create()) calls
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.

I'd LOVE to do this rename in 7.0  so that we can eventually MFC the
new kthread_xxx() calls.
2007-10-20 23:23:23 +00:00
Christian S.J. Peron
24f4142c18 - Change the wakeup logic associated with having multiple sleepers
on multiple different audit pipes.  The old method used cv_signal()
  which would result in only one thread being woken up after we
  appended a record to it's queue.  This resulted in un-timely wake-ups
  when processing audit records real-time.

- Assign PSOCK priority to threads that have been sleeping on a read(2).
  This is the same priority threads are woken up with when they select(2)
  or poll(2).  This yields fairness between various forms of sleep on
  the audit pipes.

Obtained from:	TrustedBSD Project
Discussed with:	rwatson
MFC after:	1 week
2007-10-12 15:09:02 +00:00
Jeff Roberson
b61ce5b0e6 - Move all of the PS_ flags into either p_flag or td_flags.
- p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or
   previously the sched_lock.  These bugs have existed for some time.
 - Allow swapout to try each thread in a process individually and then
   swapin the whole process if any of these fail.  This allows us to move
   most scheduler related swap flags into td_flags.
 - Keep ki_sflag for backwards compat but change all in source tools to
   use the new and more correct location of P_INMEM.

Reported by:	pho
Reviewed by:	attilio, kib
Approved by:	re (kensmith)
2007-09-17 05:31:39 +00:00
Robert Watson
45e0f3d63d Rename mac_check_vnode_delete() MAC Framework and MAC Policy entry
point to mac_check_vnode_unlink(), reflecting UNIX naming conventions.

This is the first of several commits to synchronize the MAC Framework
in FreeBSD 7.0 with the MAC Framework as it will appear in Mac OS X
Leopard.

Reveiwed by:    csjp, Samy Bahra <sbahra at gwu dot edu>
Submitted by:   Jacques Vidrine <nectar at apple dot com>
Obtained from:  Apple Computer, Inc.
Sponsored by:   SPARTA, SPAWAR
Approved by:    re (bmah)
2007-09-10 00:00:18 +00:00
Robert Watson
0bf686c125 Remove the now-unused NET_{LOCK,UNLOCK,ASSERT}_GIANT() macros, which
previously conditionally acquired Giant based on debug.mpsafenet.  As that
has now been removed, they are no longer required.  Removing them
significantly simplifies error-handling in the socket layer, eliminated
quite a bit of unwinding of locking in error cases.

While here clean up the now unneeded opt_net.h, which previously was used
for the NET_WITH_GIANT kernel option.  Clean up some related gotos for
consistency.

Reviewed by:	bz, csjp
Tested by:	kris
Approved by:	re (kensmith)
2007-08-06 14:26:03 +00:00
Robert Watson
7bb9c8a05b When checking labels during a vnode link operation in MLS, use the file
vnode label for a check rather than the directory vnode label a second
time.

MFC after:	3 days
Submitted by:	Zhouyi ZHOU <zhouzhouyi at FreeBSD dot org>
Reviewed by:	csjp
Sponsored by:	Google Summer of Code 2007
Approved by:	re (bmah)
2007-07-23 13:28:54 +00:00
Robert Watson
458f818f47 In preparation for 7.0 privilege cleanup, clean up style:
- Sort copyrights by date.
- Re-wrap, and in some cases, fix comments.
- Fix tabbing, white space, remove extra blank lines.
- Remove commented out debugging printfs.

Approved by:	re (kensmith)
2007-07-05 13:16:04 +00:00
Peter Wemm
343cc83e1b Fix a bunch of warnings due to a missing forward declaration of a struct.
Approved by: re (rwatson)
2007-07-05 06:45:37 +00:00
Robert Watson
536b405093 Remove two boot printfs generated by Audit to announce it's presence,
and replace with software-testable sysctl node (security.audit) that
can be used to detect kernel audit support.

Obtained from:	TrustedBSD Project
Approved by:	re (kensmith)
2007-07-01 20:51:30 +00:00
Christian S.J. Peron
cac465aa7f - Add audit_arg_audinfo_addr() for auditing the arguments for setaudit_addr(2)
- In audit_bsm.c, make sure all the arguments: ARG_AUID, ARG_ASID, ARG_AMASK,
  and ARG_TERMID{_ADDR} are valid before auditing their arguments. (This is done
  for both setaudit and setaudit_addr.
- Audit the arguments passed to setaudit_addr(2)
- AF_INET6 does not equate to AU_IPv6. Change this in au_to_in_addr_ex() so the
  audit token is created with the correct type. This fixes the processing of the
  in_addr_ex token in users pace.
- Change the size of the token (as generated by the kernel) from 5*4 bytes to
  4*4 bytes (the correct size of an ip6 address)
- Correct regression from ucred work which resulted in getaudit() not returning
  E2BIG if the subject had an ip6 termid
- Correct slight regression in getaudit(2) which resulted in the size of a pointer
  being passed instead of the size of the structure. (This resulted in invalid
  auditinfo data being returned via getaudit(2))

Reviewed by:	rwatson
Approved by:	re@ (kensmith)
Obtained from:	TrustedBSD Project
MFC after:	1 month
2007-06-27 17:01:15 +00:00
Robert Watson
f1e8bf6dd4 Add a new MAC framework and policy entry point,
mpo_check_proc_setaudit_addr to be used when controlling use of
setaudit_addr(), rather than mpo_check_proc_setaudit(), which takes a
different argument type.

Reviewed by:	csjp
Approved by:	re (kensmith)
2007-06-26 14:14:01 +00:00
Robert Watson
f640bf4767 In setaudit_addr(), drop the process lock in error cases.
Submitted by:	Peter Holm <peter@holm.cc> (BugMaster)
2007-06-15 15:20:56 +00:00
Robert Watson
3805385e3d Spell statistics more correctly in comments. 2007-06-14 03:02:33 +00:00
Robert Watson
c2259ba44f Include priv.h to pick up suser(9) definitions, missed in an earlier
commit.

Warnings spotted by:	kris
2007-06-13 22:42:43 +00:00
Robert Watson
6a9a600b49 Close a very narrow race that might cause a trigger allocation to be
leaked if a trigger is delivered as the trigger device is closed.

Obtained from:	TrustedBSD Project
2007-06-13 21:17:23 +00:00
Robert Watson
32f9753cfb Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
Robert Watson
3666798f15 Clean up, and sometimes remove, a number of audit-related implementation
comments.

Obtained from:	TrutstedBSD Project
2007-06-11 22:10:54 +00:00
Robert Watson
faef53711b Move per-process audit state from a pointer in the proc structure to
embedded storage in struct ucred.  This allows audit state to be cached
with the thread, avoiding locking operations with each system call, and
makes it available in asynchronous execution contexts, such as deep in
the network stack or VFS.

Reviewed by:	csjp
Approved by:	re (kensmith)
Obtained from:	TrustedBSD Project
2007-06-07 22:27:15 +00:00
Jeff Roberson
982d11f836 Commit 14/14 of sched_lock decomposition.
- Use thread_lock() rather than sched_lock for per-thread scheduling
   sychronization.
 - Use the per-process spinlock rather than the sched_lock for per-process
   scheduling synchronization.

Tested by:      kris, current@
Tested on:      i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc.
Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)
2007-06-05 00:00:57 +00:00