operations to dump a ktrace event out to an output file are now handled
asychronously by a ktrace worker thread. This enables most ktrace events
to not need Giant once p_tracep and p_traceflag are suitably protected by
the new ktrace_lock.
There is a single todo list of pending ktrace requests. The various
ktrace tracepoints allocate a ktrace request object and tack it onto the
end of the queue. The ktrace kernel thread grabs requests off the head of
the queue and processes them using the trace vnode and credentials of the
thread triggering the event.
Since we cannot assume that the user memory referenced when doing a
ktrgenio() will be valid and since we can't access it from the ktrace
worker thread without a bit of hassle anyways, ktrgenio() requests are
still handled synchronously. However, in order to ensure that the requests
from a given thread still maintain relative order to one another, when a
synchronous ktrace event (such as a genio event) is triggered, we still put
the request object on the todo list to synchronize with the worker thread.
The original thread blocks atomically with putting the item on the queue.
When the worker thread comes across an asynchronous request, it wakes up
the original thread and then blocks to ensure it doesn't manage to write a
later event before the original thread has a chance to write out the
synchronous event. When the original thread wakes up, it writes out the
synchronous using its own context and then finally wakes the worker thread
back up. Yuck. The sychronous events aren't pretty but they do work.
Since ktrace events can be triggered in fairly low-level areas (msleep()
and cv_wait() for example) the ktrace code is designed to use very few
locks when posting an event (currently just the ktrace_mtx lock and the
vnode interlock to bump the refcoun on the trace vnode). This also means
that we can't allocate a ktrace request object when an event is triggered.
Instead, ktrace request objects are allocated from a pre-allocated pool
and returned to the pool after a request is serviced.
The size of this pool defaults to 100 objects, which is about 13k on an
i386 kernel. The size of the pool can be adjusted at compile time via the
KTRACE_REQUEST_POOL kernel option, at boot time via the
kern.ktrace_request_pool loader tunable, or at runtime via the
kern.ktrace_request_pool sysctl.
If the pool of request objects is exhausted, then a warning message is
printed to the console. The message is rate-limited in that it is only
printed once until the size of the pool is adjusted via the sysctl.
I have tested all kernel traces but have not tested user traces submitted
by utrace(2), though they should work fine in theory.
Since a ktrace request has several properties (content of event, trace
vnode, details of originating process, credentials for I/O, etc.), I chose
to drop the first argument to the various ktrfoo() functions. Currently
the functions just assume the event is posted from curthread. If there is
a great desire to do so, I suppose I could instead put back the first
argument but this time make it a thread pointer instead of a vnode pointer.
Also, KTRPOINT() now takes a thread as its first argument instead of a
process. This is because the check for a recursive ktrace event is now
per-thread instead of process-wide.
Tested on: i386
Compiles on: sparc64, alpha
lock_object by another pointer (though all of lock_object should be
conditional on LOCK_DEBUG anyways) in exchange for an O(1) TAILQ_REMOVE()
in witness_destroy() (called for every mtx_destroy() and sx_destroy())
instead of an O(n) STAILQ_REMOVE. Since WITNESS is so dog slow as it is,
the speed-up is worth the space cost.
Suggested by: iedowse
being created and destroyed without a single long-term one around to ensure
the witness associated with that group of locks stays alive. The pipe
mutexes are an example of this group. For a dead witness we no longer
clear the witness name. Instead, when looking up the witness for a lock,
if a dead witness' (a witness with a refcount of 0) w_name pointer is
identical to the witness name of the lock then we revive that witness
instead of using a new witness for the lock. This results in far fewer
dead witness objects and also better preserves locking orders over the long
term resulting in more correct lock order checking. Note that we can't
ever derefence w_name of a dead witness since we don't know if the string
it is pointing to has been free()'d or kldunload()'d out from under us.
daadr_t is no larger than a long, and some other relatively harmless
things (*blush*). Overflow for subtracting a daddr_t from a u_long
caused "truncation" of the i/o for attempts to access blocks beyond
the end of the actually cause expansion of the i/o to a preposterous
size.
simple reads (and on IA32, a "pause" instruction for each interation of the
loop) to spin until either the mutex owner field changes, or the lock owner
stops executing.
Suggested by: tanimura
Tested on: i386
(P_CONTINUED) is set when a stopped process receives a SIGCONT and
cleared after it has notified a parent process that has requested
notification via waitpid(2) with WCONTINUED specified in its options
operand. The status value can be checked with the new WIFCONTINUED()
macro.
Reviewed by: jake
set to zero. This field indicates the total space in the external buffer
and therefore should not be modified after the external buffer is added.
Add a comment warning that the mbufs returned by m_split() might be read-only.
Fix M_TRAILINGSPACE() to return zero if !M_WRITABLE(m).
Reviewed by: freebsd-net
Obtained from: Vernier Networks, Inc.
MFC after: 1 week
The uuidgen command, by means of the uuidgen syscall, generates one
or more Universally Unique Identifiers compatible with OSF/DCE 1.1
version 1 UUIDs.
From the Perforce logs (change 11995):
Round of cleanups:
o Give uuidgen() the correct prototype in syscalls.master
o Define struct uuid according to DCE 1.1 in sys/uuid.h
o Use struct uuid instead of uuid_t. The latter is defined
in sys/uuid.h but should not be used in kernel land.
o Add snprintf_uuid(), printf_uuid() and sbuf_printf_uuid()
to kern_uuid.c for use in the kernel (currently geom_gpt.c).
o Rename the non-standard struct uuid in kern/kern_uuid.c
to struct uuid_private and give it a slightly better definition
for better byte-order handling. See below.
o In sys/gpt.h, fix the broken uuid definitions to match the now
compliant struct uuid definition. See below.
o In usr.bin/uuidgen/uuidgen.c catch up with struct uuid change.
A note about byte-order:
The standard failed to provide a non-conflicting and
unambiguous definition for the binary representation. My initial
implementation always wrote the timestamp as a 64-bit little-endian
(2s-complement) integral. The clock sequence was always written
as a 16-bit big-endian (2s-complement) integral. After a good
nights sleep and couple of Pan Galactic Gargle Blasters (not
necessarily in that order :-) I reread the spec and came to the
conclusion that the time fields are always written in the native
by order, provided the the low, mid and hi chopping still occurs.
The spec mentions that you "might need to swap bytes if you talk
to a machine that has a different byte-order". The clock sequence
is always written in big-endian order (as is the IEEE 802 address)
because its division is resulting in bytes, making the ordering
unambiguous.
(UUIDs). On ia64 UUIDs, aka GUIDs, are used by EFI and the firmware
among others. To create GUID Partition Tables (GPTs), we need to
be able to generate UUIDs.
Make kern.ttys export a struct xtty rather than struct tty. Since struct
tty is no longer exposed to userland, remove the dev_t / udev_t hack.
Sponsored by: DARPA, NAI Labs