Commit Graph

150 Commits

Author SHA1 Message Date
kib
560aa751e0 Remove the support for using non-mpsafe filesystem modules.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.

The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.

Conducted and reviewed by:	attilio
Tested by:	pho
2012-10-22 17:50:54 +00:00
jhb
aa85973504 Include the associated wait channel message for context switch ktrace
records.  kdump supports both the old and new messages.

Submitted by:	Andrey Zonov  andrey zonov org
MFC after:	1 week
2012-04-20 15:32:36 +00:00
jhb
5829de48d9 Add new ktrace records for the start and end of VM faults. This gives
a pair of records similar to syscall entry and return that a user can
use to determine how long page faults take.  The new ktrace records are
enabled via the 'p' trace type, and are enabled in the default set of
trace points.

Reviewed by:	kib
MFC after:	2 weeks
2012-04-05 17:13:14 +00:00
kib
80ae8fe82c Fix found places where uio_resid is truncated to int.
Add the sysctl debug.iosize_max_clamp, enabled by default. Setting the
sysctl to zero allows to perform the SSIZE_MAX-sized i/o requests from
the usermode.

Discussed with:	bde, das (previous versions)
MFC after:	1 month
2012-02-21 01:05:12 +00:00
eadler
bc26bd381f - Fix ktrace leakage if error is set
PR:		kern/163098
Submitted by:	Loganaden Velvindron <loganaden@devio.us>
Approved by:	sbruno@
MFC after:	1 month
2011-12-08 03:20:38 +00:00
des
1b405df8ba Revisit the capability failure trace points. The initial implementation
only logged instances where an operation on a file descriptor required
capabilities which the file descriptor did not have.  By adding a type enum
to struct ktr_cap_fail, we can catch other types of capability failures as
well, such as disallowed system calls or attempts to wrap a file descriptor
with more capabilities than it had to begin with.
2011-10-18 07:28:58 +00:00
des
9b8d9b3ed1 Add a new trace point, KTRFAC_CAPFAIL, which traces capability check
failures.  It is included in the default set for ktrace(1) and kdump(1).
2011-10-11 20:37:10 +00:00
kmacy
99851f359e In order to maximize the re-usability of kernel code in user space this
patch modifies makesyscalls.sh to prefix all of the non-compatibility
calls (e.g. not linux_, freebsd32_) with sys_ and updates the kernel
entry points and all places in the code that use them. It also
fixes an additional name space collision between the kernel function
psignal and the libc function of the same name by renaming the kernel
psignal kern_psignal(). By introducing this change now we will ease future
MFCs that change syscalls.

Reviewed by:	rwatson
Approved by:	re (bz)
2011-09-16 13:58:51 +00:00
jhb
96b1d8b6d7 Fix several places to ignore processes that are not yet fully constructed.
MFC after:	1 week
2011-04-06 17:47:22 +00:00
dchagin
de259f37cb Style(9) fix.
Fix indentation in comment, double ';' in variable declaration.

MFC after:	1 Week
2011-03-05 20:54:17 +00:00
dchagin
99b120c8f2 Partially reworked r219042.
The reason for this is a bug at ktrops() where process dereferenced
without having a lock. This might cause a panic if ktrace was runned
with -p flag and the specified process exited between the dropping
a lock and writing sv_flags.

Since it is impossible to acquire sx lock while holding mtx switch
to use asynchronous enqueuerequest() instead of writerequest().

Rename ktr_getrequest_ne() to more understandable name [1].

Requested by:	jhb [1]

MFC after:	1 Week
2011-03-05 20:36:42 +00:00
dchagin
44c082b543 Introduce preliminary support of the show description of the ABI of
traced process by adding two new events which records value of process
sv_flags to the trace file at process creation/execing/exiting time.

MFC after:	1 Month.
2011-02-25 22:05:33 +00:00
dchagin
049c9dd26f ktrace_resize_pool() locking slightly reworked:
1) do not take a lock around the single atomic operation.
2) do not lose the invariant of lock by dropping/acquiring
   ktrace_mtx around free() or malloc().

MFC after:	1 Month.
2011-02-25 22:03:28 +00:00
netchild
cc4128c6b1 Add some FEATURE macros for various features (AUDIT/CAM/IPC/KTR/MAC/NFS/NTP/
PMC/SYSV/...).

No FreeBSD version bump, the userland application to query the features will
be committed last and can serve as an indication of the availablility if
needed.

Sponsored by:   Google Summer of Code 2010
Submitted by:   kibab
Reviewed by:    arch@ (parts by rwatson, trasz, jhb)
X-MFC after:    to be determined in last commit with code from this project
2011-02-25 10:11:01 +00:00
jhb
a41cfdca06 - When disabling ktracing on a process, free any pending requests that
may be left.  This fixes a memory leak that can occur when tracing is
  disabled on a process via disabling tracing of a specific file (or if
  an I/O error occurs with the tracefile) if the process's next system
  call is exit().  The trace disabling code clears p_traceflag, so exit1()
  doesn't do any KTRACE-related cleanup leading to the leak.  I chose to
  make the free'ing of pending records synchronous rather than patching
  exit1().
- Move KTRACE-specific logic out of kern_(exec|exit|fork).c and into
  kern_ktrace.c instead.  Make ktrace_mtx private to kern_ktrace.c as a
  result.

MFC after:	1 month
2010-10-21 19:17:40 +00:00
jhb
faa167a723 Fix a whitespace nit and remove a questioning comment. STAILQ_CONCAT()
does require the STAILQ the existing list is being added to to already
be initialized (it is CONCAT() vs MOVE()).
2010-08-19 16:38:58 +00:00
jhb
d64a4df941 Keep the process locked when calling ktrops() or ktrsetchildren() instead
of dropping the lock only to immediately reacquire it.
2010-08-17 21:34:19 +00:00
gavin
dbc7cd5ae9 Add descriptions to a handful of sysctl nodes.
PR:		kern/148580
Submitted by:	Galimov Albert <wtfcrap mail.ru>
MFC after:	1 week
2010-08-09 14:48:31 +00:00
jhb
fb1e0aa66f - Document layout of KTR_STRUCT payload in a comment.
- Simplify ktrstruct() calling convention by having ktrstruct() use
  strlen() rather than requiring the caller to hand-code the length of
  constant strings.

MFC after:	1 month
2010-07-14 17:38:01 +00:00
jhb
a661f652ad - Fix several off-by-one errors when using MAXCOMLEN. The p_comm[] and
td_name[] arrays are actually MAXCOMLEN + 1 in size and a few places that
  created shadow copies of these arrays were just using MAXCOMLEN.
- Prefer using sizeof() of an array type to explicit constants for the
  array length in a few places.
- Ensure that all of p_comm[] and td_name[] is always zero'd during
  execve() to guard against any possible information leaks.  Previously
  trailing garbage in p_comm[] could be leaked to userland in ktrace
  record headers via td_name[].

Reviewed by:	bde
2009-10-23 15:14:54 +00:00
rwatson
f4934662e5 Move "options MAC" from opt_mac.h to opt_global.h, as it's now in GENERIC
and used in a large number of files, but also because an increasing number
of incorrect uses of MAC calls were sneaking in due to copy-and-paste of
MAC-aware code without the associated opt_mac.h include.

Discussed with:	pjd
2009-06-05 14:55:22 +00:00
rwatson
fba90f2e03 Remove VOP_LEASE and supporting functions. This hasn't been used since
the removal of NQNFS, but was left in in case it was required for NFSv4.
Since our new NFSv4 client and server can't use it for their
requirements, GC the old mechanism, as well as other unused lease-
related code and interfaces.

Due to its impact on kernel programming and binary interfaces, this
change should not be MFC'd.

Proposed by:    jeff
Reviewed by:    jeff
Discussed with: rmacklem, zach loafman @ isilon
2009-04-10 10:52:19 +00:00
jhb
2548f05c6c Add a new type of KTRACE record for sysctl(3) invocations. It uses the
internal sysctl_sysctl_name() handler to map the MIB array to a string
name and logs this name in the trace log.  This can be useful to see
exactly which sysctls a thread is invoking.

MFC after:	1 month
2009-03-11 21:48:36 +00:00
bz
965b4db5a5 Fix a credential reference leak. [1]
Close subtle but relatively unlikely race conditions when
propagating the vnode write error to other active sessions
tracing to the same vnode, without holding a reference on
the vnode anymore. [2]

PR:		kern/126368 [1]
Submitted by:	rwatson [2]
Reviewed by:	kib, rwatson
MFC after:	4 weeks
2008-12-03 15:54:35 +00:00
des
df26e399aa This patch adds a new ktrace(2) record type, KTR_STRUCT, whose payload
consists of the null-terminated name and the contents of any structure
you wish to record.  A new ktrstruct() function constructs and emits a
KTR_STRUCT record.  It is accompanied by convenience macros for struct
stat and struct sockaddr.

In kdump(1), KTR_STRUCT records are handled by a dispatcher function
that runs stringent sanity checks on its contents before handing it
over to individual decoding funtions for each type of structure.
Currently supported structures are struct stat and struct sockaddr for
the AF_INET, AF_INET6 and AF_UNIX families; support for AF_APPLETALK
and AF_IPX is present but disabled, as I am unable to test it properly.

Since 's' was already taken, the letter 't' is used by ktrace(1) to
enable KTR_STRUCT trace points, and in kdump(1) to enable their
decoding.

Derived from patches by Andrew Li <andrew2.li@citi.com>.

PR:		kern/117836
MFC after:	3 weeks
2008-02-23 01:01:49 +00:00
attilio
71b7824213 VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in
conjuction with 'thread' argument passing which is always curthread.
Remove the unuseful extra-argument and pass explicitly curthread to lower
layer functions, when necessary.

KPI results broken by this change, which should affect several ports, so
version bumping and manpage update will be further committed.

Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>
2008-01-13 14:44:15 +00:00
attilio
18d0a0dd51 vn_lock() is currently only used with the 'curthread' passed as argument.
Remove this argument and pass curthread directly to underlying
VOP_LOCK1() VFS method. This modify makes the code cleaner and in
particular remove an annoying dependence helping next lockmgr() cleanup.
KPI results, obviously, changed.

Manpage and FreeBSD_version will be updated through further commits.

As a side note, would be valuable to say that next commits will address
a similar cleanup about VFS methods, in particular vop_lock1 and
vop_unlock.

Tested by:	Diego Sardina <siarodx at gmail dot com>,
		Andrea Di Pasquale <whyx dot it at gmail dot com>
2008-01-10 01:10:58 +00:00
julian
7ee6259be7 A bunch more files that should probably print out a thread name
instead of a process name.
2007-11-14 06:51:33 +00:00
rwatson
60570a92bf Merge first in a series of TrustedBSD MAC Framework KPI changes
from Mac OS X Leopard--rationalize naming for entry points to
the following general forms:

  mac_<object>_<method/action>
  mac_<object>_check_<method/action>

The previous naming scheme was inconsistent and mostly
reversed from the new scheme.  Also, make object types more
consistent and remove spaces from object types that contain
multiple parts ("posix_sem" -> "posixsem") to make mechanical
parsing easier.  Introduce a new "netinet" object type for
certain IPv4/IPv6-related methods.  Also simplify, slightly,
some entry point names.

All MAC policy modules will need to be recompiled, and modules
not updates as part of this commit will need to be modified to
conform to the new KPI.

Sponsored by:	SPARTA (original patches against Mac OS X)
Obtained from:	TrustedBSD Project, Apple Computer
2007-10-24 19:04:04 +00:00
jhb
7ec8dd9926 Partially revert the previous change. I failed to notice that where
ktruserret() is invoked, an unlocked check of  the per-process queue
is performed inline, thus, we don't lock the ktrace_sx on every userret().

Pointy hat to:	jhb
Approved by:	re (kensmith)
Pointy hat recovered from:	rwatson
2007-08-29 21:17:11 +00:00
jhb
af94eb296c Improve the ktrace locking somewhat to reduce overhead:
- Depessimize userret() in kernels where KTRACE is enabled by doing an
  unlocked check of the per-process queue of pending events before
  acquiring any locks.  Previously ktr_userret() unconditionally acquired
  the global ktrace_sx lock on every return to userland for every thread,
  even if ktrace wasn't enabled for the thread.
- Optimize the locking in exit() to first perform an unlocked read of
  p_traceflag to see if ktrace is enabled and only acquire locks and
  teardown ktrace if the test succeeds.  Also, explicitly disable tracing
  before draining any pending events so the pending events actually get
  written out.  The unlocked read is safe because proc lock is acquired
  earlier after single-threading so p_traceflag can't change between then
  and this check (well, it can currently due to a bug in ktrace I will fix
  next, but that race existed prior to this change as well).

Reviewed by:	rwatson
2007-06-13 20:01:42 +00:00
rwatson
00b02345d4 Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in
some cases, move to priv_check() if it was an operation on a thread and
no other flags were present.

Eliminate caller-side jail exception checking (also now-unused); jail
privilege exception code now goes solely in kern_jail.c.

We can't yet eliminate suser() due to some cases in the KAME code where
a privilege check is performed and then used in many different deferred
paths.  Do, however, move those prototypes to priv.h.

Reviewed by:	csjp
Obtained from:	TrustedBSD Project
2007-06-12 00:12:01 +00:00
kib
f13486a222 Revert UF_OPENING workaround for CURRENT.
Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation
argument from being file descriptor index into the pointer to struct file.

Proposed and reviewed by:	jhb
Reviewed by:	daichi (unionfs)
Approved by:	re (kensmith)
2007-05-31 11:51:53 +00:00
rwatson
69938bd196 Further system call comment cleanup:
- Remove also "MP SAFE" after prior "MPSAFE" pass. (suggested by bde)
- Remove extra blank lines in some cases.
- Add extra blank lines in some cases.
- Remove no-op comments consisting solely of the function name, the word
  "syscall", or the system call name.
- Add punctuation.
- Re-wrap some comments.
2007-03-05 13:10:58 +00:00
rwatson
300d4098cf Remove 'MPSAFE' annotations from the comments above most system calls: all
system calls now enter without Giant held, and then in some cases, acquire
Giant explicitly.

Remove a number of other MPSAFE annotations in the credential code and
tweak one or two other adjacent comments.
2007-03-04 22:36:48 +00:00
mpp
ea6456848e Do not do a vn_close for all references to the ktraced file if we are
doing a CLEARFILE option.  Do a vrele instead.  This prevents
a panic later due to v_writecount being negative when the vnode
is taken off the freelist.

Submitted by:	jhb
2007-02-13 00:20:13 +00:00
delphij
2e20bff54b Use FOREACH_PROC_IN_SYSTEM instead of using its unrolled form. 2007-01-17 14:58:53 +00:00
kmacy
af645e118f ktrace_cv is no longer used - remove
Submitted by: Attilio Rao
2006-12-17 00:16:09 +00:00
rwatson
10d0d9cf47 Sweep kernel replacing suser(9) calls with priv(9) calls, assigning
specific privilege names to a broad range of privileges.  These may
require some future tweaking.

Sponsored by:           nCircle Network Security, Inc.
Obtained from:          TrustedBSD Project
Discussed on:           arch@
Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri,
                        Alex Lyashkov <umka at sevcity dot net>,
                        Skip Ford <skip dot ford at verizon dot net>,
                        Antoine Brodin <antoine dot brodin at laposte dot net>
2006-11-06 13:42:10 +00:00
rwatson
7beaaf5cd2 Complete break-out of sys/sys/mac.h into sys/security/mac/mac_framework.h
begun with a repo-copy of mac.h to mac_framework.h.  sys/mac.h now
contains the userspace and user<->kernel API and definitions, with all
in-kernel interfaces moved to mac_framework.h, which is now included
across most of the kernel instead.

This change is the first step in a larger cleanup and sweep of MAC
Framework interfaces in the kernel, and will not be MFC'd.

Obtained from:	TrustedBSD Project
Sponsored by:	SPARTA
2006-10-22 11:52:19 +00:00
jhb
5f5d488d28 Trim an obsolete comment. ktrgenio() stopped doing crazy gymnastics when
ktrace was redone to be mostly synchronous again.
2006-07-31 15:31:43 +00:00
pjd
9de1945c3f Use suser_cred(9) instead of checking cr_uid directly.
Reviewed by:	rwatson
2006-06-27 11:29:38 +00:00
jhb
44f0d9f519 - Conditionalize Giant around VFS operations for ALQ, ktrace, and
generating a coredump as the result of a signal.
- Fix a bug where we could leak a Giant lock if vn_start_write() failed
  in coredump().

Reported by:	jmg (2)
2006-03-28 21:30:22 +00:00
jeff
822d3c8355 - Lock access to vrele() with VFS_LOCK_GIANT() rather than mtx_lock(&Giant).
Sponsored by:	Isilon Systems, Inc.
2006-01-30 08:19:01 +00:00
jhb
0c769dbc8b Fix a vnode reference leak in the ktrace code. We always grab a reference
to the vnode at the start of ktr_writerequest() but were missing the
corresponding vrele() after we finished the write operation.

Reported by:	jasone
2006-01-23 21:45:32 +00:00
rwatson
2fab30d9d4 In ktr_getrequest(), acquire ktrace_mtx earlier -- while the race
currently present is minor and offers no real semantic issues, it also
doesn't make sense since an earlier lockless check has already
occurred.  Also hold the mutex longer, over a manipulation of
per-process ktrace state, which requires synchronization.

MFC after:	1 month
Pointed out by:	jhb
2005-11-14 19:30:09 +00:00
rwatson
2a5785fb21 Moderate rewrite of kernel ktrace code to attempt to generally improve
reliability when tracing fast-moving processes or writing traces to
slow file systems by avoiding unbounded queueuing and dropped records.
Record loss was previously possible when the global pool of records
become depleted as a result of record generation outstripping record
commit, which occurred quickly in many common situations.

These changes partially restore the 4.x model of committing ktrace
records at the point of trace generation (synchronous), but maintain
the 5.x deferred record commit behavior (asynchronous) for situations
where entering VFS and sleeping is not possible (i.e., in the
scheduler).  Records are now queued per-process as opposed to
globally, with processes responsible for committing records from their
own context as required.

- Eliminate the ktrace worker thread and global record queue, as they
  are no longer used.  Keep the global free record list, as records
  are still used.

- Add a per-process record queue, which will hold any asynchronously
  generated records, such as from context switches.  This replaces the
  global queue as the place to submit asynchronous records to.

- When a record is committed asynchronously, simply queue it to the
  process.

- When a record is committed synchronously, first drain any pending
  per-process records in order to maintain ordering as best we can.
  Currently ordering between competing threads is provided via a global
  ktrace_sx, but a per-process flag or lock may be desirable in the
  future.

- When a process returns to user space following a system call, trap,
  signal delivery, etc, flush any pending records.

- When a process exits, flush any pending records.

- Assert on process tear-down that there are no pending records.

- Slightly abstract the notion of being "in ktrace", which is used to
  prevent the recursive generation of records, as well as generating
  traces for ktrace events.

Future work here might look at changing the set of events marked for
synchronous and asynchronous record generation, re-balancing queue
depth, timeliness of commit to disk, and so on.  I.e., performing a
drain every (n) records.

MFC after:	1 month
Discussed with:	jhb
Requested by:	Marc Olzheim <marcolz at stack dot nl>
2005-11-13 13:27:44 +00:00
rwatson
c6854c347f Reuse ktr_unused field in ktr_header structure as ktr_tid; populate
ktr_tid as part of gathering of ktr header data for new ktrace
records.  The continued use of intptr_t is required for file layout
reasons, and cannot be changed to lwpid_t at this point.

MFC after:	1 month
Reviewed by:	davidxu
2005-11-01 14:46:37 +00:00
rwatson
efbbbf570d Replace ktr_buffer pointer in struct ktr_header with a ktr_unused
intptr_t.  The buffer length needs to be written to disk as part
of the trace log, but the kernel pointer for the buffer does not.
Add a new ktr_buffer pointer to the kernel-only ktrace request
structure to hold that pointer.  This frees up an integer in the
ktrace record format that can be used to hold the threadid,
although older ktrace files will have a garbage ktr_buffer field
(or more accurately, a kernel pointer value).

MFC after:		2 weeks
Space requested by:	davidxu
2005-11-01 12:36:19 +00:00
pjd
333a175a13 Close another information leak in ktrace(2): one was able to find active
process groups outside a jail, etc. by using ktrace(2).

OK'ed by:	rwatson
Approved by:	re (scottl)
MFC after:	1 week
2005-06-24 12:05:24 +00:00