Robert Watson 2c255e9df6 Moderate rewrite of kernel ktrace code to attempt to generally improve
reliability when tracing fast-moving processes or writing traces to
slow file systems by avoiding unbounded queueuing and dropped records.
Record loss was previously possible when the global pool of records
become depleted as a result of record generation outstripping record
commit, which occurred quickly in many common situations.

These changes partially restore the 4.x model of committing ktrace
records at the point of trace generation (synchronous), but maintain
the 5.x deferred record commit behavior (asynchronous) for situations
where entering VFS and sleeping is not possible (i.e., in the
scheduler).  Records are now queued per-process as opposed to
globally, with processes responsible for committing records from their
own context as required.

- Eliminate the ktrace worker thread and global record queue, as they
  are no longer used.  Keep the global free record list, as records
  are still used.

- Add a per-process record queue, which will hold any asynchronously
  generated records, such as from context switches.  This replaces the
  global queue as the place to submit asynchronous records to.

- When a record is committed asynchronously, simply queue it to the
  process.

- When a record is committed synchronously, first drain any pending
  per-process records in order to maintain ordering as best we can.
  Currently ordering between competing threads is provided via a global
  ktrace_sx, but a per-process flag or lock may be desirable in the
  future.

- When a process returns to user space following a system call, trap,
  signal delivery, etc, flush any pending records.

- When a process exits, flush any pending records.

- Assert on process tear-down that there are no pending records.

- Slightly abstract the notion of being "in ktrace", which is used to
  prevent the recursive generation of records, as well as generating
  traces for ktrace events.

Future work here might look at changing the set of events marked for
synchronous and asynchronous record generation, re-balancing queue
depth, timeliness of commit to disk, and so on.  I.e., performing a
drain every (n) records.

MFC after:	1 month
Discussed with:	jhb
Requested by:	Marc Olzheim <marcolz at stack dot nl>
2005-11-13 13:27:44 +00:00
..
2005-11-11 09:57:32 +00:00
2005-11-13 13:26:37 +00:00
2005-11-09 12:26:37 +00:00
2005-11-11 12:03:28 +00:00