18 Commits

Author SHA1 Message Date
John Baldwin
013818111a Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to
a device pager (OBJT_DEVICE) object in that it uses fictitious pages to
provide aliases to other memory addresses.  The primary difference is that
it uses an sglist(9) to determine the physical addresses for a given offset
into the object instead of invoking the d_mmap() method in a device driver.

Reviewed by:	alc
Approved by:	re (kensmith)
MFC after:	2 weeks
2009-07-24 13:50:29 +00:00
Robert Watson
e1f323f350 Include param.h instead of types.h before user.h so that the nested
include of param.h can be removed from audit.h.

MFC after:	3 weeks
2008-12-29 18:58:22 +00:00
Wesley Shields
d5db4444c3 Fix a typo.
Approved by:	rwatson
2008-12-19 16:56:49 +00:00
Joe Marcus Clarke
08afefa814 Do not segfault when procstat -f or procstat -v is called on a process not
owned by the current user.  If kinfo_getfile() or kinfo_getvmmap() return
NULL, simply exit, and do not try and derefernce the memory.

Reviewed by:	peter
Approved by:	peter
2008-12-19 06:50:15 +00:00
Peter Wemm
41fbf4374b Update format string for kve_start/end. 2008-12-02 15:08:33 +00:00
Peter Wemm
43151ee6cf Merge user/peter/kinfo branch as of r185547 into head.
This changes struct kinfo_filedesc and kinfo_vmentry such that they are
same on both 32 and 64 bit platforms like i386/amd64 and won't require
sysctl wrapping.

Two new OIDs are assigned.  The old ones are available under
COMPAT_FREEBSD7 - but it isn't that simple.  The superceded interface
was never actually released on 7.x.

The other main change is to pack the data passed to userland via the
sysctl.  kf_structsize and kve_structsize are reduced for the copyout.
If you have a process with 100,000+ sockets open, the unpacked records
require a 132MB+ copyout.  With packing, it is "only" ~35MB.  (Still
seriously unpleasant, but not quite as devastating).  A similar problem
exists for the vmentry structure - have lots and lots of shared libraries
and small mmaps and its copyout gets expensive too.

My immediate problem is valgrind.  It traditionally achieves this
functionality by parsing procfs output, in a packed format.  Secondly, when
tracing 32 bit binaries on amd64 under valgrind, it uses a cross compiled
32 bit binary which ran directly into the differing data structures in 32
vs 64 bit mode.  (valgrind uses this to track file descriptor operations
and this therefore affected every single 32 bit binary)

I've added two utility functions to libutil to unpack the structures into
a fixed record length and to make it a little more convenient to use.
2008-12-02 06:50:26 +00:00
Ed Schouten
bc093719ca Integrate the new MPSAFE TTY layer to the FreeBSD operating system.
The last half year I've been working on a replacement TTY layer for the
FreeBSD kernel. The new TTY layer was designed to improve the following:

- Improved driver model:

  The old TTY layer has a driver model that is not abstract enough to
  make it friendly to use. A good example is the output path, where the
  device drivers directly access the output buffers. This means that an
  in-kernel PPP implementation must always convert network buffers into
  TTY buffers.

  If a PPP implementation would be built on top of the new TTY layer
  (still needs a hooks layer, though), it would allow the PPP
  implementation to directly hand the data to the TTY driver.

- Improved hotplugging:

  With the old TTY layer, it isn't entirely safe to destroy TTY's from
  the system. This implementation has a two-step destructing design,
  where the driver first abandons the TTY. After all threads have left
  the TTY, the TTY layer calls a routine in the driver, which can be
  used to free resources (unit numbers, etc).

  The pts(4) driver also implements this feature, which means
  posix_openpt() will now return PTY's that are created on the fly.

- Improved performance:

  One of the major improvements is the per-TTY mutex, which is expected
  to improve scalability when compared to the old Giant locking.
  Another change is the unbuffered copying to userspace, which is both
  used on TTY device nodes and PTY masters.

Upgrading should be quite straightforward. Unlike previous versions,
existing kernel configuration files do not need to be changed, except
when they reference device drivers that are listed in UPDATING.

Obtained from:		//depot/projects/mpsafetty/...
Approved by:		philip (ex-mentor)
Discussed:		on the lists, at BSDCan, at the DevSummit
Sponsored by:		Snow B.V., the Netherlands
dcons(4) fixed by:	kan
2008-08-20 08:31:58 +00:00
Ed Schouten
d693ac206e Fix a small typo in the procstat(1) manpage: messsage queue.
Approved by:	philip (mentor)
MFC after:	3 days
2008-07-28 08:01:24 +00:00
John Baldwin
6bc1e9cd84 Rework the lifetime management of the kernel implementation of POSIX
semaphores.  Specifically, semaphores are now represented as new file
descriptor type that is set to close on exec.  This removes the need for
all of the manual process reference counting (and fork, exec, and exit
event handlers) as the normal file descriptor operations handle all of
that for us nicely.  It is also suggested as one possible implementation
in the spec and at least one other OS (OS X) uses this approach.

Some bugs that were fixed as a result include:
- References to a named semaphore whose name is removed still work after
  the sem_unlink() operation.  Prior to this patch, if a semaphore's name
  was removed, valid handles from sem_open() would get EINVAL errors from
  sem_getvalue(), sem_post(), etc.  This fixes that.
- Unnamed semaphores created with sem_init() were not cleaned up when a
  process exited or exec'd.  They were only cleaned up if the process
  did an explicit sem_destroy().  This could result in a leak of semaphore
  objects that could never be cleaned up.
- On the other hand, if another process guessed the id (kernel pointer to
  'struct ksem' of an unnamed semaphore (created via sem_init)) and had
  write access to the semaphore based on UID/GID checks, then that other
  process could manipulate the semaphore via sem_destroy(), sem_post(),
  sem_wait(), etc.
- As part of the permission check (UID/GID), the umask of the proces
  creating the semaphore was not honored.  Thus if your umask denied group
  read/write access but the explicit mode in the sem_init() call allowed
  it, the semaphore would be readable/writable by other users in the
  same group, for example.  This includes access via the previous bug.
- If the module refused to unload because there were active semaphores,
  then it might have deregistered one or more of the semaphore system
  calls before it noticed that there was a problem.  I'm not sure if
  this actually happened as the order that modules are discovered by the
  kernel linker depends on how the actual .ko file is linked.  One can
  make the order deterministic by using a single module with a mod_event
  handler that explicitly registers syscalls (and deregisters during
  unload after any checks).  This also fixes a race where even if the
  sem_module unloaded first it would have destroyed locks that the
  syscalls might be trying to access if they are still executing when
  they are unloaded.

  XXX: By the way, deregistering system calls doesn't do any blocking
  to drain any threads from the calls.
- Some minor fixes to errno values on error.  For example, sem_init()
  isn't documented to return ENFILE or EMFILE if we run out of semaphores
  the way that sem_open() can.  Instead, it should return ENOSPC in that
  case.

Other changes:
- Kernel semaphores now use a hash table to manage the namespace of
  named semaphores nearly in a similar fashion to the POSIX shared memory
  object file descriptors.  Kernel semaphores can now also have names
  longer than 14 chars (up to MAXPATHLEN) and can include subdirectories
  in their pathname.
- The UID/GID permission checks for access to a named semaphore are now
  done via vaccess() rather than a home-rolled set of checks.
- Now that kernel semaphores have an associated file object, the various
  MAC checks for POSIX semaphores accept both a file credential and an
  active credential.  There is also a new posixsem_check_stat() since it
  is possible to fstat() a semaphore file descriptor.
- A small set of regression tests (using the ksem API directly) is present
  in src/tools/regression/posixsem.

Reported by:	kris (1)
Tested by:	kris
Reviewed by:	rwatson (lightly)
MFC after:	1 month
2008-06-27 05:39:04 +00:00
Robert Watson
070356d1fb Use ddb(4), not DDB(4) for man page cross-references.
MFC after:	3 days
Reported by:	novel
2008-04-21 17:09:53 +00:00
Robert Watson
b27c1c8db7 Provide more detailed information about each procstat(1) display mode,
including a key to fields in each mode and flag abbreviations.

MFC after:	3 days
X-MFC-note:	POSIX shared memory memory objects aren't in 7-STABLE yet
2008-04-19 13:40:42 +00:00
Robert Watson
ba8ca9db9c It is a bug that procstat(8) works only on live kernels and not crashdumps;
document in case anyone wants to work on fixing this.

MFC after:	3 days
2008-04-19 12:39:15 +00:00
Joe Marcus Clarke
f280594937 Add support for displaying a process' current working directory, root
directory, and jail directory within procstat.  While this functionality
is available already in fstat, encapsulating it in the kern.proc.filedesc
sysctl makes it accessible without using kvm and thus without needing
elevated permissions.

The new procstat output looks like:

  PID COMM               FD T V FLAGS    REF  OFFSET PRO NAME
  76792 tcsh              cwd v d --------   -       - -   /usr/src
  76792 tcsh             root v d --------   -       - -   /
  76792 tcsh               15 v c rw------  16    9130 -   -
  76792 tcsh               16 v c rw------  16    9130 -   -
  76792 tcsh               17 v c rw------  16    9130 -   -
  76792 tcsh               18 v c rw------  16    9130 -   -
  76792 tcsh               19 v c rw------  16    9130 -   -

I am also bumping __FreeBSD_version for this as this new feature will be
used in at least one port.

Reviewed by:	rwatson
Approved by:	rwatson
2008-02-09 05:16:26 +00:00
David Malone
97ce0ae60f WARNS fixes: mainly constness and avoid comparing signed with
unsigned by making array indicies unsigned. Also note one or two
unused parameters.
2008-02-08 11:03:05 +00:00
Robert Watson
87cb56f6df When printing process file descriptor lists, show a type of 'h' for
POSIX shared memory descriptors.
2008-01-20 19:57:33 +00:00
Robert Watson
5a246d2912 Add 'COMM' column to a few more output modes of procstat(1). The only
one it's missing from is the VM display, where there's really not room,
and the file output display is looking quite cramped.
2007-12-10 20:55:43 +00:00
Robert Watson
45c29f5ba2 Display per-thread command line in TDNAME field for -k and -t; if
no per-thread name is available or the name is identical to the
process name, display "-" instead.  Very slightly shrink the COMM
entry to make a bit more room, although this doesn't help with
stack traces much.

Suggested by:	thompsa
2007-12-03 21:21:15 +00:00
Robert Watson
3d91be41d1 Add procstat(1), a process inspection utility. This provides both some
of the missing functionality from procfs(4) and new functionality for
monitoring and debugging specific processes.  procstat(1) operates in
the following modes:

  -b  Display binary information for the process.
  -c  Display command line arguments for the process.
  -f  Display file descriptor information for the process.
  -k  Display the stacks of kernel threads in the process.
  -s  Display security credential information for the process.
  -t  Display thread information for the process.
  -v  Display virtual memory mappings for the process.

Further revision and modes are expected.

Testing, ideas, etc:	cognet, sam, Skip Ford <skip at menantico dot com>
			Wesley Shields <wxs at atarininja dot org>
2007-12-02 23:31:45 +00:00