Commit Graph

4660 Commits

Author SHA1 Message Date
Jake Burkholder
60a57b73ef ktr changes to improve performance and make writing a userland utility to
dump the trace buffer feasible.
- Remove KTR_EXTEND.  This changes the format of the trace entries when
  activated, making writing a userland tool which is not tied to a specific
  kernel configuration difficult.
- Use get_cyclecount() for timestamps.  nanotime() is much too heavy weight
  and requires recursion protection due to ktr traces occuring as a result
  of ktr traces.  KTR_VERBOSE may still require recursion protection, which
  is now conditional on it.
- Allow KTR_CPU to be overridden by MD code.  This is so that it is possible
  to trace early in startup before pcpu and/or curthread are setup.
- Add a version number for the ktr interface.  A userland tool can check this
  to detect mismatches.
- Use an array for the parameters to make decoding in userland easier.
- Add file and line recording to the non-extended traces now that the extended
  version is no more.

These changes will break gdb macros to decode the extended version of the
trace buffer which are floating around.  Users of these macros should either
use the show ktr command in ddb, or use the userland utility which can be run
on a core dump.

Approved by:	jhb
Tested on:	i386, sparc64
2002-04-01 05:35:26 +00:00
Poul-Henning Kamp
81661c94b6 Here follows the new kernel dumping infrastructure.
Caveats:

The new savecore program is not complete in the sense that it emulates
enough of the old savecores features to do the job, but implements none
of the options yet.

I would appreciate if a userland hacker could help me out getting savecore
to do what we want it to do from a users point of view, compression,
email-notification, space reservation etc etc.  (send me email if
you are interested).

Currently, savecore will scan all devices marked as "swap" or "dump" in
/etc/fstab _or_ any devices specified on the command-line.

All architectures but i386 lack an implementation of dumpsys(), but
looking at the i386 version it should be trivial for anybody familiar
with the platform(s) to provide this function.

Documentation is quite sparse at this time, more to come.

Details:

ATA and SCSI drivers should work as the dump formatting code has been
removed.  The IDA, TWE and AAC have not yet been converted.

Dumpon now opens the device and uses ioctl(DIOCGKERNELDUMP) to set
the device as dumpdev.  To implement the "off" argument, /dev/null
is used as the device.

Savecore will fail if handed any options since they are not (yet)
implemented.  All devices marked "dump" or "swap" in /etc/fstab
will be scanned and dumps found will be saved to diskfiles
named from the MD5 hash of the header record.  The header record
is dumped in readable format in the .info file.  The kernel
is not saved.  Only complete dumps will be saved.

All maintainer rights for this code are disclaimed: feel free to
improve and extend.

Sponsored by:   DARPA, NAI Labs
2002-03-31 22:37:00 +00:00
Poul-Henning Kamp
1f3a74b1b1 Implement the two "GEOM" ioctls DIOCGSECTORSIZE and DIOCGMEDIASIZE for
the non-GEOM code as well.  This simplifies the the kernel-dumping
and disk-management tools as less compatibility cruft will be needed.

Sponsored by:	DARPA and NAI Labs.
2002-03-31 21:17:12 +00:00
Alan Cox
a5c0b1c020 Keep the reference to the file acquired in _aio_aqueue() until the operation
completes.  The reference is released in aio_free_entry().

Submitted by:	tegge
2002-03-31 20:17:56 +00:00
Alfred Perlstein
7b11fea64f Close some holes with p->p_args by NULL'ing out the p->p_args pointer
while holding the proc lock, and by holding the pargs structure when
accessing it from outside of the owner.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-31 10:33:12 +00:00
Poul-Henning Kamp
8d19a26558 Centralize the "bootdev" and "dumpdev" variables. They are still pretty
bogus all things considered, but at least now they don't camouflage as
being MD variables.
2002-03-31 07:15:28 +00:00
Alan Cox
5e20c11f19 Add a local proc *p in exec_new_vmspace() to avoid repeated dereferencing
to obtain it.
2002-03-31 00:05:30 +00:00
Bruce Evans
4f1f485f34 Fixed handling of short reads in readdisklabel() and writedisklabel().
These functions use DEV_STRATEGY() which can easily return a short
count (with no error) for reads near EOF.  EOF happens for "disks" too
small to contain a label sector (mainly for empty slices).  The functions
didn't understand this at all, and looked for labels in the garbage
in the buffer beyond what DEV_STRATEGY() returned.  The recent UMA
changes combined with my local changes and configuration resulted in
the garbage often containing a valid but garbage label left over from
a previous call.

Bugs in EOF handling in -current limited the problem to "disks" with
size precisely LABELSECTOR sectors.  LABELSECTOR happens to be a very
unusual "disk" size since it is only 0 for non-i386 arches that don't
usually have disks with DOS MBRs.
2002-03-30 16:02:43 +00:00
Dan Moschuk
e7876c0943 Nuke CV_DEBUG in favour of INVARIANTS.
Approved by: jhb
2002-03-30 03:52:52 +00:00
Jake Burkholder
b454c6dd29 Style fixes purposefully left out of last commit. I checked the kse tree
and didn't see any changes that this conflicts with.
2002-03-29 16:45:03 +00:00
Jake Burkholder
d0ce9a7e07 Remove abuse of intr_disable/restore in MI code by moving the loop in ast()
back into the calling MD code.  The MD code must ensure no races between
checking the astpening flag and returning to usermode.

Submitted by:	peter (ia64 bits)
Tested on:	alpha (peter, jeff), i386, ia64 (peter), sparc64
2002-03-29 16:35:26 +00:00
Seigo Tanimura
5cf4bcebbf The description of fd_mtx is "filedesc structure." 2002-03-29 11:26:05 +00:00
Matthew N. Dodd
32bc1098b2 Add resource_list_add_next() which returns the RID for the resource added. 2002-03-29 06:42:54 +00:00
Alfred Perlstein
c1508b28c6 To remove nested include of sys/lock.h and sys/mutex.h from sys/proc.h
make the pargs_* functions into non-inlines in kern/kern_proc.c.

Requested by: bde
2002-03-28 18:12:27 +00:00
Poul-Henning Kamp
45609bea17 Get the magnitude of the NTP adjustment right. 2002-03-28 16:02:44 +00:00
Maxime Henrion
daab5e2472 - Properly sync vfs_nmount() with changes that have be already done
in vfs_mount(), in particular revisions 1.215, 1.227 and 1.240.
- flag2 is a low quality variable name, change it to kern_flag.
- strncpy NUL-terminates f_fstypename and f_mntonname since the strings
  have length <= <buffer length> - 1, so the explicit NUL-termination is
  bogus.
- M_ZERO'ing space for fstype and fspath is stupid since we never use the
  space beyond the end of the string.
- Do various style(9) cleanups in both functions.

Submitted by:	bde
Reviewed by:	phk
2002-03-28 13:47:32 +00:00
Alan Cox
cd430164f1 Allow resursion on the pipe mutex because filt_piperead() and filt_pipewrite()
can be called both with and without the pipe mutex held.  (For example,
if called by pipeselwakeup(), it is held.  Whereas, if called by kqueue_scan(),
it is not.)

Reviewed by:	alfred
2002-03-27 21:47:50 +00:00
Alfred Perlstein
8899023f66 Make the reference counting of 'struct pargs' SMP safe.
There is still some locations where the PROC lock should be held
in order to prevent inconsistent views from outside (like the
proc->p_fd fix for kern/vfs_syscalls.c:checkdirs()) that can be
fixed later.

Submitted by: Jonathan Mini <mini@haikugeek.com>
2002-03-27 21:36:18 +00:00
Jeff Roberson
f22a4b62f5 Add a new mtx_init option "MTX_DUPOK" which allows duplicate acquires of locks
with this flag.  Remove the dup_list and dup_ok code from subr_witness.  Now
we just check for the flag instead of doing string compares.

Also, switch the process lock, process group lock, and uma per cpu locks over
to this interface.  The original mechanism did not work well for uma because
per cpu lock names are unique to each zone.

Approved by:	jhb
2002-03-27 09:23:41 +00:00
Matthew Dillon
e6bbfd402d oops, forgot to commit this. td->td_savecrit = 0 replaced by API
call cpu_thread_link().
2002-03-27 08:26:37 +00:00
Jake Burkholder
f2a79bb9b4 Make this compile.
Pointy hat to:	dillon
2002-03-27 06:44:32 +00:00
Matthew Dillon
d74ac6819b Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit().  Cleanup the td_savecrit field by moving it
from MI to MD.  Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections.  This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit.  Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways.  This should
be temporary.

Reviewed by:	core
Approved by:	core
2002-03-27 05:39:23 +00:00
Bruce Evans
c0f7f75fd7 "Fixed" -Wshadow warnings by changing the name of some function parameters
from `index' to `indx'.  The correct fix would be to not support or use
index().
2002-03-27 04:04:17 +00:00
Alan Cox
cb100b25ce Remove an unnecessary and inconsistently used variable from exec_new_vmspace(). 2002-03-26 19:20:04 +00:00
Andrew R. Reiter
dcce8874eb - Fixup a few style nits:
- return error -> return (error);
  - move a declaration to the top of the function.
  - become bug for bug compatible with if (error) lines.

Submitted by: bde
2002-03-26 18:07:10 +00:00
Maxime Henrion
17594b936b As discussed in -arch, add the new nmount(2) system call and the
new vfs_getopt()/vfs_copyopt() API.  This is intended to be used
later, when there will be filesystems implementing the VFS_NMOUNT
operation.  The mount(2) system call will disappear when all
filesystems will be converted to the new API.  Documentation will
be committed in a while.

Reviewed by:	phk
2002-03-26 15:33:44 +00:00
Bruce Evans
237e41fc58 Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h>.
2002-03-26 01:09:51 +00:00
Bruce Evans
ee99e978a3 Added used include of <sys/sx.h>. Don't depend on namespace pollution in
<sys/file.h> or <sys/socketvar.h>.
2002-03-25 21:52:04 +00:00
David E. O'Brien
0beb3ecc6c Commit work-around for panics when mounting FS's that are auto-loaded as
modules (ie. procfs.ko).

When the kernel loads dynamic filesystem module, it looks for any of the
VOP operations specified by the new filesystem that have not been registered
already by the currently known filesystems.  If any of such operations exist,
vfs_add_vnops function calls vfs_opv_recalc function, which rebuilds vop_t
vectors for each filesystem and sets all global pointers like ufs_vnops_p,
devfs_specop_p, etc to the new values and then frees the old pointers.  This
behavior is bad because there might be already active vnodes whose v_op fields
will be left pointing to the random garbage, leading to inevitable crash soon.

Submitted by:	Alexander Kabaev <ak03@gte.com>
2002-03-25 21:30:50 +00:00
Andrew R. Reiter
517f30c2c1 - Recommit the securelevel_gt() calls removed by commits rev. 1.84 of
kern_linker.c and rev. 1.237 of vfs_syscalls.c since these are not the
  source of the recent panics occuring around kldloading file system
  support modules.

Requested by: rwatson
2002-03-25 18:26:34 +00:00
Poul-Henning Kamp
aaead0dfe9 Modernize my email address. 2002-03-25 13:52:45 +00:00
Bruce Evans
70f52b4845 Fixed some style bugs in the removal of __P(()). The main ones were
not removing tabs before "__P((", and not outdenting continuation lines
to preserve non-KNF lining up of code with parentheses.  Switch to KNF
formatting and/or rewrap the whole prototype in some cases.
2002-03-24 05:09:11 +00:00
John Baldwin
d846883bc4 Use td_ucred in several trivial syscalls and remove Giant locking as
appropriate.
2002-03-22 22:32:04 +00:00
John Baldwin
f2ae7368ea Use explicit Giant locks and unlocks for rather than instrumented ones for
code that is still not safe.  suser() reads p_ucred so it still needs
Giant for the time being.  This should allow kern.giant.proc to be set
to 0 for the time being.
2002-03-22 21:02:02 +00:00
Robert Watson
29dc1288b0 Merge from TrustedBSD MAC branch:
Move the network code from using cr_cansee() to check whether a
    socket is visible to a requesting credential to using a new
    function, cr_canseesocket(), which accepts a subject credential
    and object socket.  Implement cr_canseesocket() so that it does a
    prison check, a uid check, and add a comment where shortly a MAC
    hook will go.  This will allow MAC policies to seperately
    instrument the visibility of sockets from the visibility of
    processes.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 19:57:41 +00:00
Alfred Perlstein
db51256707 When "cloning" a pipe's buffer bcopy the data after dropping the pipe's
lock as the data may be paged out and cause a fault.
2002-03-22 16:09:22 +00:00
Robert Watson
7906271f25 In sysctl, req->td is believed always to be non-NULL, so there's no need
to test req->td for NULL values and then do somewhat more bizarre things
relating to securelevel special-casing and suser checks.  Remove the
testing and conditional security checks based on req->td!=NULL, and insert
a KASSERT that td != NULL.  Callers to sysctl must always specify the
thread (be it kernel or otherwise) requesting the operation, or a
number of current sysctls will fail due to assumptions that the thread
exists.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
Discussed with:	bde
2002-03-22 14:58:27 +00:00
Robert Watson
4584bb3945 Since cred never appears to be passed into the securelevel calls as
NULL, turn warning printf's into panic's, since this call has been
restructured such that a NULL cred would result in a page fault anyway.

There appears to be one case where NULL is explicitly passed in in the
sysctl code, and this is believed to be in error, so will be modified.
Securelevels now always require a credential context so that per-jail
securelevels are properly implemented.

Obtained from:	TrustedBSD Project
Sponsored by:	NAI Labs
Discussed with:	bde
2002-03-22 14:49:12 +00:00
Andrew R. Reiter
fe3240e9aa - Back out the commit to make the linker_load_file() securelevel check
made aware in jail environments.  Supposedly something is broken, so
  this should be backed out until further investigation proves otherwise,
  or a proper fix can be provided.
2002-03-22 04:56:09 +00:00
Robert Watson
1b350b4542 Break out the "see_other_uids" policy check from the various
method-based inter-process security checks.  To do this, introduce
a new cr_seeotheruids(u1, u2) function, which encapsulates the
"see_other_uids" logic.  Call out to this policy following the
jail security check for all of {debug,sched,see,signal} inter-process
checks.  This more consistently enforces the check, and makes the
check easy to modify.  Eventually, it may be that this check should
become a MAC policy, loaded via a module.

Obtained from:	TrustedBSD Project
Sponsored by:	DARPA, NAI Labs
2002-03-22 02:28:26 +00:00
Andrew R. Reiter
e85b9ae9ac - Fix a logic error in checking the securelevel that was introduced in the
previous commit.

Pointy hats to: arr, rwatson
2002-03-21 15:27:39 +00:00
Warner Losh
cb9a238a8a Remove last two abuses of cpu_critical_{enter,exit} in the MI code.
Reviewed by: jake, jhb, rwatson
2002-03-21 06:11:09 +00:00
Benno Rice
565ab9395f Add a change mirroring that made to kern/subr_trap.c and others.
This makes kernel builds with DIAGNOSTIC work again.

Apparently forgotten by:	jhb
Might want to be checked by:	jhb
2002-03-21 02:47:51 +00:00
Jeff Roberson
59295dba57 UMA permited us to utilize the 'waitok' flag to soalloc. 2002-03-20 21:23:26 +00:00
John Baldwin
01c04d2de9 Change the way we ensure td_ucred is NULL if DIAGNOSTIC is defined.
Instead of caching the ucred reference, just go ahead and eat the
decerement and increment of the refcount.  Now that Giant is pushed down
into crfree(), we no longer have to get Giant in the common case.  In the
case when we are actually free'ing the ucred, we would normally free it on
the next kernel entry, so the cost there is not new, just in a different
place.  This also removse td_cache_ucred from struct thread.  This is
still only done #ifdef DIAGNOSTIC.

[ missed this file in the previous commit ]

Tested on:	i386, alpha
2002-03-20 21:12:04 +00:00
John Baldwin
c1a513c951 - Push down Giant into crfree() in the case that we actually free a ucred.
- Add a cred_free_thread() function (conditional on DIAGNOSTICS) that drops
  a per-thread ucred reference to be used in debugging code when leaving
  the kernel.
2002-03-20 21:00:50 +00:00
Andrew R. Reiter
c457a4403a - Change a check of securelevel to securelevel_gt() call in order to help
against users within a jail attempting to load kernel modules.
- Add a check of securelevel_gt() to vfs_mount() in order to chop some
  low hanging fruit for the repair of securelevel checking of linking and
  unlinking files from within jails.  There is more to be done here.

Reviewed by: rwatson
2002-03-20 16:03:42 +00:00
Andrew R. Reiter
dca9d05526 - Remove a semi-colon from after SYSINIT that was introduced in rev. 1.163. 2002-03-20 14:46:38 +00:00
Jeff Roberson
586c8b6b29 Add calls to uma_zone_set_max() to restore previously enforced limits. 2002-03-20 05:30:58 +00:00
Jeff Roberson
54d77689ed Backout part of my previous commit; I was wrong about vm_zone's handling of
limits on zones w/o objects.
2002-03-20 04:39:32 +00:00