Commit Graph

10 Commits

Author SHA1 Message Date
John Baldwin
6bc1e9cd84 Rework the lifetime management of the kernel implementation of POSIX
semaphores.  Specifically, semaphores are now represented as a new file
descriptor type that is set to close on exec.  This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely.  It is also suggested as one possible implementation
in the spec, and at least one other OS (OS X) uses this approach.

Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
  the sem_unlink() operation.  Prior to this patch, if a semaphore's name
  was removed, valid handles from sem_open() would get EINVAL errors from
  sem_getvalue(), sem_post(), etc.  This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
  process exited or exec'd.  They were only cleaned up if the process
  did an explicit sem_destroy().  This could result in a leak of semaphore
  objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
  'struct ksem') of an unnamed semaphore (created via sem_init()) and had
  write access to the semaphore based on UID/GID checks, then that other
  process could manipulate the semaphore via sem_destroy(), sem_post(),
  sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the process
  creating the semaphore was not honored.  Thus if your umask denied group
  read/write access but the explicit mode in the sem_init() call allowed
  it, the semaphore would be readable/writable by other users in the
  same group, for example.  This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
  then it might have deregistered one or more of the semaphore system
  calls before it noticed that there was a problem.  I'm not sure if
  this actually happened, as the order in which modules are discovered by
  the kernel linker depends on how the actual .ko file is linked.  One can
  make the order deterministic by using a single module with a mod_event
  handler that explicitly registers syscalls (and deregisters during
  unload after any checks).  This also fixes a race where even if the
  sem_module unloaded first it would have destroyed locks that the
  syscalls might be trying to access if they are still executing when
  they are unloaded.

  XXX: By the way, deregistering system calls doesn't do any blocking
  to drain any threads from the calls.
- Some minor fixes to errno values on error.  For example, sem_init()
  isn't documented to return ENFILE or EMFILE if we run out of semaphores
  the way that sem_open() can.  Instead, it should return ENOSPC in that
  case.
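
The first fix in the list above (handles staying valid after sem_unlink())
can be exercised from userland with a few POSIX calls.  The following is
only a minimal sketch, not the regression test shipped with this change:
the function names and the semaphore name passed in are hypothetical, and
it assumes a system whose sem_open() implementation behaves as described.

```c
#include <assert.h>
#include <fcntl.h>	/* O_CREAT, O_EXCL */
#include <semaphore.h>

/*
 * Create a named semaphore with value 0, remove its name, then post
 * through the still-open handle.  Returns the semaphore's value after
 * the post, or -1 if any call fails.  "name" is a caller-chosen,
 * hypothetical semaphore name such as "/posixsem_demo".
 */
int
post_after_unlink(const char *name)
{
	sem_t *sem;
	int value = -1;

	(void)sem_unlink(name);	/* discard a stale name from an earlier run */
	sem = sem_open(name, O_CREAT | O_EXCL, 0600, 0);
	if (sem == SEM_FAILED)
		return (-1);

	if (sem_unlink(name) != 0 ||		/* drop the name, keep the handle */
	    sem_post(sem) != 0 ||		/* should not fail with EINVAL */
	    sem_getvalue(sem, &value) != 0)
		value = -1;

	(void)sem_close(sem);
	return (value);
}

/* Example check: one post on a semaphore created with value 0. */
void
check_post_after_unlink(void)
{
	assert(post_after_unlink("/posixsem_demo") == 1);
}
```

With the old behaviour described above, the sem_post() or sem_getvalue()
call would be expected to fail once the name was gone, so the function
would return -1 instead.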

Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
  named semaphores, in a fashion similar to the POSIX shared memory
  object file descriptors.  Kernel semaphores can now also have names
  longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
  in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
  done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
  MAC checks for POSIX semaphores accept both a file credential and an
  active credential.  There is also a new posixsem_check_stat() since it
  is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
  in src/tools/regression/posixsem.
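
For readers unfamiliar with the permission model referred to above, the
interaction between the creator's umask and the UID/GID check can be
sketched in plain C.  This is an illustration only and is not the kernel's
vaccess(): the helper names are hypothetical, and superuser privilege and
supplementary groups are deliberately ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Mode stored on the object: the requested mode with the umask bits cleared. */
mode_t
sketch_effective_mode(mode_t requested, mode_t mask)
{
	return (requested & ~mask & 0777);
}

/*
 * Simplified owner/group/other check.  "want" is a combination of
 * 04 (read) and 02 (write).  Exactly one permission class is consulted:
 * owner if the UID matches, else group if the GID matches, else other.
 */
bool
sketch_access(mode_t mode, uid_t owner, gid_t group, uid_t uid, gid_t gid,
    mode_t want)
{
	mode_t granted;

	if (uid == owner)
		granted = (mode >> 6) & 07;
	else if (gid == group)
		granted = (mode >> 3) & 07;
	else
		granted = mode & 07;
	return ((granted & want) == want);
}

/* Example: a umask of 027 strips group write from a requested mode of 0660. */
void
check_sketch_access(void)
{
	mode_t mode = sketch_effective_mode(0660, 027);

	assert(mode == 0640);
	assert(sketch_access(mode, 1000, 100, 1001, 100, 04));	  /* group read */
	assert(!sketch_access(mode, 1000, 100, 1001, 100, 06)); /* no group write */
	assert(sketch_access(0660, 1000, 100, 1001, 100, 06));	  /* umask ignored */
}
```

The last assertion shows the situation described in the umask bug above:
if the requested mode is used as-is, a member of the same group gains
write access that the creator's umask was meant to deny.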

Reported by:	kris (1)
Tested by:	kris
Reviewed by:	rwatson (lightly)
MFC after:	1 month
2008-06-27 05:39:04 +00:00
Robert Watson
070356d1fb Use ddb(4), not DDB(4) for man page cross-references.
MFC after:	3 days
Reported by:	novel
2008-04-21 17:09:53 +00:00
Robert Watson
b27c1c8db7 Provide more detailed information about each procstat(1) display mode,
including a key to fields in each mode and flag abbreviations.

MFC after:	3 days
X-MFC-note:	POSIX shared memory objects aren't in 7-STABLE yet
2008-04-19 13:40:42 +00:00
Robert Watson
ba8ca9db9c It is a bug that procstat(1) works only on live kernels and not crashdumps;
document in case anyone wants to work on fixing this.

MFC after:	3 days
2008-04-19 12:39:15 +00:00
Joe Marcus Clarke
f280594937 Add support for displaying a process' current working directory, root
directory, and jail directory within procstat.  While this functionality
is available already in fstat, encapsulating it in the kern.proc.filedesc
sysctl makes it accessible without using kvm and thus without needing
elevated permissions.

The new procstat output looks like:

    PID COMM               FD T V FLAGS    REF  OFFSET PRO NAME
  76792 tcsh              cwd v d --------   -       - -   /usr/src
  76792 tcsh             root v d --------   -       - -   /
  76792 tcsh               15 v c rw------  16    9130 -   -
  76792 tcsh               16 v c rw------  16    9130 -   -
  76792 tcsh               17 v c rw------  16    9130 -   -
  76792 tcsh               18 v c rw------  16    9130 -   -
  76792 tcsh               19 v c rw------  16    9130 -   -

I am also bumping __FreeBSD_version for this as this new feature will be
used in at least one port.

Reviewed by:	rwatson
Approved by:	rwatson
2008-02-09 05:16:26 +00:00
David Malone
97ce0ae60f WARNS fixes: mainly constness and avoid comparing signed with
unsigned by making array indices unsigned. Also note one or two
unused parameters.
2008-02-08 11:03:05 +00:00
Robert Watson
87cb56f6df When printing process file descriptor lists, show a type of 'h' for
POSIX shared memory descriptors.
2008-01-20 19:57:33 +00:00
Robert Watson
5a246d2912 Add 'COMM' column to a few more output modes of procstat(1). The only
one it's missing from is the VM display, where there's really not room,
and the file output display is looking quite cramped.
2007-12-10 20:55:43 +00:00
Robert Watson
45c29f5ba2 Display per-thread command line in TDNAME field for -k and -t; if
no per-thread name is available or the name is identical to the
process name, display "-" instead.  Very slightly shrink the COMM
entry to make a bit more room, although this doesn't help with
stack traces much.

Suggested by:	thompsa
2007-12-03 21:21:15 +00:00
Robert Watson
3d91be41d1 Add procstat(1), a process inspection utility. This provides both some
of the missing functionality from procfs(4) and new functionality for
monitoring and debugging specific processes.  procstat(1) operates in
the following modes:

  -b  Display binary information for the process.
  -c  Display command line arguments for the process.
  -f  Display file descriptor information for the process.
  -k  Display the stacks of kernel threads in the process.
  -s  Display security credential information for the process.
  -t  Display thread information for the process.
  -v  Display virtual memory mappings for the process.

Further revision and modes are expected.

Testing, ideas, etc:	cognet, sam, Skip Ford <skip at menantico dot com>
			Wesley Shields <wxs at atarininja dot org>
2007-12-02 23:31:45 +00:00