freebsd with flexible iflib nic queues
Go to file
John Baldwin 6bc1e9cd84 Rework the lifetime management of the kernel implementation of POSIX
semaphores.  Specifically, semaphores are now represented as new file
descriptor type that is set to close on exec.  This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely.  It is also suggested as one possible implementation
in the spec and at least one other OS (OS X) uses this approach.

Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
  the sem_unlink() operation.  Prior to this patch, if a semaphore's name
  was removed, valid handles from sem_open() would get EINVAL errors from
  sem_getvalue(), sem_post(), etc.  This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
  process exited or exec'd.  They were only cleaned up if the process
  did an explicit sem_destroy().  This could result in a leak of semaphore
  objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
  'struct ksem' of an unnamed semaphore (created via sem_init)) and had
  write access to the semaphore based on UID/GID checks, then that other
  process could manipulate the semaphore via sem_destroy(), sem_post(),
  sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the proces
  creating the semaphore was not honored.  Thus if your umask denied group
  read/write access but the explicit mode in the sem_init() call allowed
  it, the semaphore would be readable/writable by other users in the
  same group, for example.  This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
  then it might have deregistered one or more of the semaphore system
  calls before it noticed that there was a problem.  I'm not sure if
  this actually happened as the order that modules are discovered by the
  kernel linker depends on how the actual .ko file is linked.  One can
  make the order deterministic by using a single module with a mod_event
  handler that explicitly registers syscalls (and deregisters during
  unload after any checks).  This also fixes a race where even if the
  sem_module unloaded first it would have destroyed locks that the
  syscalls might be trying to access if they are still executing when
  they are unloaded.

  XXX: By the way, deregistering system calls doesn't do any blocking
  to drain any threads from the calls.
- Some minor fixes to errno values on error.  For example, sem_init()
  isn't documented to return ENFILE or EMFILE if we run out of semaphores
  the way that sem_open() can.  Instead, it should return ENOSPC in that
  case.

Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
  named semaphores nearly in a similar fashion to the POSIX shared memory
  object file descriptors.  Kernel semaphores can now also have names
  longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
  in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
  done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
  MAC checks for POSIX semaphores accept both a file credential and an
  active credential.  There is also a new posixsem_check_stat() since it
  is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
  in src/tools/regression/posixsem.

Reported by:	kris (1)
Tested by:	kris
Reviewed by:	rwatson (lightly)
MFC after:	1 month
2008-06-27 05:39:04 +00:00
bin use 'const' for the parameters of the two static functions unalias() and hashalias() 2008-06-07 16:28:20 +00:00
cddl Don't need to include vmem.h anymore. 2008-05-23 22:44:46 +00:00
contrib Bring in the vendor's fix for a bug in strtod() whereby 2008-06-21 19:27:54 +00:00
crypto Fix conflicts after heimdal-1.1 import and add build infrastructure. Import 2008-05-07 13:53:12 +00:00
etc Quiet rc.d/syscons unless it has something to say. 2008-06-24 21:01:56 +00:00
games Months in English are capitalized (even when abbreviated). 2008-06-25 04:56:08 +00:00
gnu Enable GCC stack protection (aka Propolice) for userland: 2008-06-25 21:33:28 +00:00
include Turn execvpe() into an internal libc routine. 2008-06-23 05:22:06 +00:00
kerberos5 Add roken.h to SRCS. This fixes the compilation of slc during a 2008-06-18 21:20:50 +00:00
lib - add description of the MLINK error 2008-06-26 12:15:38 +00:00
libexec Enable GCC stack protection (aka Propolice) for userland: 2008-06-25 21:33:28 +00:00
release Enable GCC stack protection (aka Propolice) for userland: 2008-06-25 21:33:28 +00:00
rescue Enable GCC stack protection (aka Propolice) for userland: 2008-06-25 21:33:28 +00:00
sbin The signature for a pthread function requires that it 2008-06-26 07:05:35 +00:00
secure Fix conflicts after heimdal-1.1 import and add build infrastructure. Import 2008-05-07 13:53:12 +00:00
share Regen properly. 2008-06-25 21:42:23 +00:00
sys Rework the lifetime management of the kernel implementation of POSIX 2008-06-27 05:39:04 +00:00
tools Rework the lifetime management of the kernel implementation of POSIX 2008-06-27 05:39:04 +00:00
usr.bin Rework the lifetime management of the kernel implementation of POSIX 2008-06-27 05:39:04 +00:00
usr.sbin Re-implement the client side of rpc.lockd in the kernel. This implementation 2008-06-26 10:21:54 +00:00
COPYRIGHT Happy new year 2008! 2007-12-31 22:09:19 +00:00
LOCKS Update LOCKS syntax. 2008-06-05 19:47:58 +00:00
MAINTAINERS Update description text 2008-06-06 21:32:01 +00:00
Makefile Back out rev. 1.352 (SVN rev 179842) as phk pointed out that 2008-06-17 11:08:49 +00:00
Makefile.inc1 Enable GCC stack protection (aka Propolice) for userland: 2008-06-25 21:33:28 +00:00
ObsoleteFiles.inc Turn sgtty into a binary-only compatibility interface. 2008-06-14 10:42:18 +00:00
README
UPDATING Note removal of gpt(8). 2008-06-09 21:33:57 +00:00

This is the top level of the FreeBSD source directory.  This file
was last revised on:
$FreeBSD$

For copyright information, please see the file COPYRIGHT in this
directory (additional copyright information also exists for some
sources in this tree - please see the specific source directories for
more information).

The Makefile in this directory supports a number of targets for
building components (or all) of the FreeBSD source tree, the most
commonly used one being ``world'', which rebuilds and installs
everything in the FreeBSD system from the source tree except the
kernel, the kernel-modules and the contents of /etc.  The ``world''
target should only be used in cases where the source tree has not
changed from the currently running version.  See:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html
for more information, including setting make(1) variables.

The ``buildkernel'' and ``installkernel'' targets build and install
the kernel and the modules (see below).  Please see the top of
the Makefile in this directory for more information on the
standard build targets and compile-time flags.

Building a kernel is a somewhat more involved process, documentation
for which can be found at:
   http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html
And in the config(8) man page.
Note: If you want to build and install the kernel with the
``buildkernel'' and ``installkernel'' targets, you might need to build
world before.  More information is available in the handbook.

The sample kernel configuration files reside in the sys/<arch>/conf
sub-directory (assuming that you've installed the kernel sources), the
file named GENERIC being the one used to build your initial installation
kernel.  The file NOTES contains entries and documentation for all possible
devices, not just those commonly used.  It is the successor of the ancient
LINT file, but in contrast to LINT, it is not buildable as a kernel but a
pure reference and documentation file.


Source Roadmap:
---------------
bin		System/user commands.

contrib		Packages contributed by 3rd parties.

crypto		Cryptography stuff (see crypto/README).

etc		Template files for /etc.

games		Amusements.

gnu		Various commands and libraries under the GNU Public License.
		Please see gnu/COPYING* for more information.

include		System include files.

kerberos5	Kerberos5 (Heimdal) package.

lib		System libraries.

libexec		System daemons.

release		Release building Makefile & associated tools.

rescue		Build system for statically linked /rescue utilities.

sbin		System commands.

secure		Cryptographic libraries and commands.

share		Shared resources.

sys		Kernel sources.

tools		Utilities for regression testing and miscellaneous tasks.

usr.bin		User commands.

usr.sbin	System administration commands.


For information on synchronizing your source tree with one or more of
the FreeBSD Project's development branches, please see:

  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/synching.html