Commit Graph

8936 Commits

Author SHA1 Message Date
Alexander Leidinger
ef39c05baa MI changes:
 - provide an interface (macros) to the page coloring part of the VM system;
   this allows trying different coloring algorithms without the need to
   touch every file [1]
 - make the page queue tuning values readable: sysctl vm.stats.pagequeue
 - autotuning of the page coloring values based upon the cache size instead
   of options in the kernel config (disabling of the page coloring as a
   kernel option is still possible)

MD changes:
 - detection of the cache size: only IA32 and AMD64 (untested) contain
   cache size detection code; every other arch just comes with a dummy
   function (this results in the use of default values, as was the
   case without the autotuning of the page coloring)
 - print some more info on Intel CPUs (like we do on AMD and Transmeta
   CPUs)

Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue"
and report if the cache* values are zero (= bug in the cache detection code)
or not.

Based upon work by:	Chad David <davidc@acns.ab.ca> [1]
Reviewed by:		alc, arch (in 2004)
Discussed with:		alc, Chad David, arch (in 2004)
2005-12-31 14:39:20 +00:00
Pawel Jakub Dawidek
d362c40d3a Improve memguard a bit:
- Provide tunable vm.memguard.desc, so one can specify memory type without
  changing the code and recompiling the kernel.
- Allow using memguard for kernel modules by providing sysctl
  vm.memguard.desc, which can be changed to the short description of the
  memory type before the module is loaded.
- Move as much memguard code as possible to memguard.c.
- Add sysctl node vm.memguard. and move memguard-specific sysctl there.
- Add malloc_desc2type() function for finding memory type based on its
  short description (ks_shortdesc field).
- Memory type can be changed (via vm.memguard.desc sysctl) only if it
  doesn't exist (will be loaded later) or when no memory is allocated yet.
  If there is allocated memory for the given memory type, return EBUSY.
- Implement two ways of comparing memory types and make the safer, slower
  one the default.
2005-12-30 11:45:07 +00:00
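
The lookup-and-EBUSY rule described in this commit can be sketched in userland C. This is only an illustration under stated assumptions: `struct mem_type`, the `mem_types` table, `desc2type()`, `set_guard_desc()`, and `current_guard_desc()` are hypothetical stand-ins, not the kernel's malloc_desc2type() or the real sysctl handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a malloc type: short description plus usage. */
struct mem_type {
	const char *shortdesc;
	unsigned long inuse;	/* allocations currently outstanding */
};

/* Assumed table of already registered types. */
static struct mem_type mem_types[] = {
	{ "devbuf", 0 },
	{ "temp", 3 },
};

static char guard_desc[32];

/* Find a registered type by its short description, or NULL if absent. */
struct mem_type *
desc2type(const char *desc)
{
	size_t i;

	assert(desc != NULL);
	for (i = 0; i < sizeof(mem_types) / sizeof(mem_types[0]); i++) {
		if (strcmp(mem_types[i].shortdesc, desc) == 0)
			return (&mem_types[i]);
	}
	return (NULL);
}

/*
 * Accept the new description if the type is not registered yet (it may be
 * loaded later) or has no memory allocated; otherwise refuse with EBUSY.
 */
int
set_guard_desc(const char *desc)
{
	struct mem_type *mt;

	assert(desc != NULL);
	if (strlen(desc) >= sizeof(guard_desc))
		return (EINVAL);
	mt = desc2type(desc);
	if (mt != NULL && mt->inuse != 0)
		return (EBUSY);
	strcpy(guard_desc, desc);	/* length was checked above */
	return (0);
}

const char *
current_guard_desc(void)
{
	return (guard_desc);
}
```

In this sketch a description that matches no registered type is accepted, mirroring the "will be loaded later" case in the commit message.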
Pawel Jakub Dawidek
e7736557d6 Print a warning when we miss vinactive() call, because of race in vget().
The race is very real, but conditions needed for triggering it are rather
hard to meet now.
When gjournal is committed (where it is quite easy to trigger) we need
to fix it.

For now, verify if it is really hard to trigger.

Discussed with:	kan
2005-12-29 22:52:09 +00:00
John Baldwin
8963150678 patch(1) and I aren't friends today. Axe a duplicate copy of
the msleep_spin() function definition.

Spotted by:	pjd
2005-12-29 21:15:32 +00:00
John Baldwin
0cb7e6aec8 Add a new function msleep_spin() which is a slightly stripped down version
of msleep().  msleep_spin() doesn't support changing the priority of the
thread while it is asleep nor does it support interruptible sleeps (PCATCH)
or the PDROP flag.  It does support timeouts however.  It differs from
msleep() in that the passed in mutex is a spin mutex.  This means one can
use msleep_spin() and wakeup() with a spin mutex similar to msleep() and
wakeup() with a regular mutex.  Note that the spin mutex in question needs
to come before sched_lock and the sleepq locks in lock order.
2005-12-29 20:57:45 +00:00
John Baldwin
b0e9883e2f Teach WITNESS_SAVE() and WITNESS_RESTORE() to work with spin locks instead
of only sleep locks.
2005-12-29 20:54:25 +00:00
John Baldwin
0a46ed7d56 Fix a deadlock I introduced with the recently added printf to warn about
spin locks that are not in the static order list.  It is not safe to call
printf while holding the witness spin mutex since the console drivers that
back printf may need to use their own spin locks which would try to talk
to witness when they were locked.  Given this, it is possible for one
CPU to lock a console driver lock (such as sio) which then tries to lock
the witness lock while another CPU is doing the printf while holding the
witness lock.  Fix this by moving the printf outside of the witness lock.
All other printf's in witness are already correct.

MFC after:	3 days
2005-12-29 20:53:01 +00:00
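
The shape of this fix (decide under the lock, report after dropping it) can be modeled roughly as follows. This is a single-threaded toy, not witness itself: `order_lock()`, `order_unlock()`, `order_list`, `warn_unlisted()`, and `check_spin_lock()` are all hypothetical names, and the "lock" is just a flag so the ordering rule can be asserted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy lock for a single-threaded model: it only tracks whether it is held. */
static int order_lock_held;

static void
order_lock(void)
{
	assert(!order_lock_held);
	order_lock_held = 1;
}

static void
order_unlock(void)
{
	assert(order_lock_held);
	order_lock_held = 0;
}

/* Hypothetical static order list. */
static const char *order_list[] = { "sched lock", "sio" };

static void
warn_unlisted(const char *name)
{
	/* The output path may take its own locks, so ours must be free. */
	assert(!order_lock_held);
	fprintf(stderr, "spin lock %s not in order list\n", name);
}

/* Returns 1 if the lock name is listed, 0 (after warning) otherwise. */
int
check_spin_lock(const char *name)
{
	size_t i;
	int found = 0;

	assert(name != NULL);
	order_lock();
	for (i = 0; i < sizeof(order_list) / sizeof(order_list[0]); i++) {
		if (strcmp(order_list[i], name) == 0) {
			found = 1;
			break;
		}
	}
	order_unlock();
	if (!found)
		warn_unlisted(name);	/* deferred until after the unlock */
	return (found);
}
```

Calling `warn_unlisted()` from inside the locked region would trip the assertion, which is the toy equivalent of the deadlock the commit describes.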
John Baldwin
42b6a681bc Increment kobj_lookup_misses on a miss rather than decrementing it.
Otherwise, the miss count is actually -kobj_lookup_misses.  Mostly a
pedantic change as KOBJ_STATS isn't on by default.
2005-12-29 18:00:42 +00:00
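
A minimal sketch of why the direction of the update matters, assuming a tiny direct-mapped cache. `struct method_cache`, `slow_lookup()`, and `cache_lookup()` are hypothetical and much simpler than the real kobj method cache.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SIZE 8

/* Hypothetical cache; a key of 0 marks an empty slot. */
struct method_cache {
	int keys[CACHE_SIZE];
	int values[CACHE_SIZE];
	unsigned long hits;
	unsigned long misses;
};

/* Slow path taken on a miss; here just a fixed computation. */
static int
slow_lookup(int key)
{
	return (key * 2);
}

int
cache_lookup(struct method_cache *mc, int key)
{
	size_t slot;

	assert(mc != NULL && key > 0);
	slot = (size_t)key % CACHE_SIZE;
	if (mc->keys[slot] == key) {
		mc->hits++;
		return (mc->values[slot]);
	}
	mc->misses++;	/* count up; a decrement would record -misses */
	mc->keys[slot] = key;
	mc->values[slot] = slow_lookup(key);
	return (mc->values[slot]);
}
```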
David Xu
3357835a46 Add code to report zombie state.
PR: threads/91044
MFC after: 3 days
2005-12-29 13:00:42 +00:00
Alexander Kabaev
3f34977614 Trim trailing whitespace.
2005-12-28 17:13:31 +00:00
Pawel Jakub Dawidek
619f284195 In realloc(9), determine size of the original block based on
UMA_SLAB_MALLOC flag.
In some circumstances (I observed it when I was doing a lot of reallocs)
UMA_SLAB_MALLOC can be set even if us_keg != NULL.

If this is the case we have wonderful, silent data corruption, because less
data is copied to the newly allocated region than should be.

I'm not sure when this bug was introduced, it could be there undetected
for years now, as we don't have a lot of realloc(9) consumers and it was
hard to reproduce it...
...but what I know for sure is that I don't want to know who introduced
the bug :) It took me two or three days to track it down (of course most of
the time I was looking for the bug in my own code).
2005-12-28 01:53:13 +00:00
David Xu
9f8eb3cb52 Use variable i instead of variable cpus as an index to get correct kseq.
2005-12-27 12:02:03 +00:00
Maxim Sobolev
d49b21093c Fix breakage introduced in the previous commit.
2005-12-26 22:32:52 +00:00
Maxim Sobolev
900b28f9f6 Remove kern.elf32.can_exec_dyn sysctl. Instead, extend the Brandinfo
structure with a flags bitfield and set the BI_CAN_EXEC_DYN flag for all
brands that usually allow executing ELF dynamic binaries (aka shared
libraries). When asked to execute an ET_DYN ELF image, check whether this
flag is set once the ELF brand is known, and allow execution if so.

PR:		kern/87615
Submitted by:	Marcin Koziej <creep@desk.pl>
2005-12-26 21:23:57 +00:00
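
The per-brand check might look roughly like the sketch below. Only the flag name BI_CAN_EXEC_DYN and the ET_DYN type come from the commit; the flag's value, the reduced `struct brandinfo`, and `brand_allows_exec()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BI_CAN_EXEC_DYN	0x0001	/* name from the commit; value assumed */

#define ET_EXEC	2	/* ELF e_type: executable file */
#define ET_DYN	3	/* ELF e_type: shared object file */

/* Simplified, hypothetical view of a brand entry. */
struct brandinfo {
	const char *name;
	int flags;
};

/* Decide whether an image of the given e_type may run under this brand. */
bool
brand_allows_exec(const struct brandinfo *bi, int e_type)
{
	assert(bi != NULL);
	if (e_type == ET_EXEC)
		return (true);
	if (e_type == ET_DYN)
		return ((bi->flags & BI_CAN_EXEC_DYN) != 0);
	return (false);
}
```

Keeping the policy in a per-brand flag, rather than in one global sysctl, lets each brand opt in independently.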
Alan Cox
60bb39431a Maintain the lock on the vnode for most of exec_elfN_imgact().
Specifically, it is required for the I/O that may be performed by
elfN_load_section().

Avoid an obscure deadlock in the a.out, elf, and gzip image
activators.  Add a comment describing why the deadlock does not occur
in the common case and how it might occur in less usual circumstances.

Eliminate an unused variable from exec_aout_imgact().

In collaboration with: tegge
2005-12-24 04:57:50 +00:00
David Xu
d7bc12b096 Avoid a kernel panic when attaching to a process which may not have been
stopped by the debugger, e.g. a process that is dumping core. Only access
p_xthread if P_STOPPED_TRACE is set, which means the thread is ready to
exchange signals with the debugger. Print a warning if P_STOPPED_TRACE is
not set, which would indicate a bug in other code.

The patch has been tested by Anish Mistry mistry.7 at osu dot edu, and
is slightly adjusted.
2005-12-24 02:59:29 +00:00
Jeff Roberson
49bdcff518 - Remove an unused include.
Submitted by:	Antoine Brodin <antoine.brodin@laposte.net>
2005-12-23 21:32:40 +00:00
Poul-Henning Kamp
25f6e35a05 Regenerate sysent with new abort2 system call.
Implement abort2(const char *reason, int narg, void **args);

Submitted by:	"Wojciech A. Koszek" <dunstan@freebsd.czest.pl>
2005-12-23 11:58:42 +00:00
Poul-Henning Kamp
5a56b437ec Add abort2() system call.
2005-12-23 11:54:11 +00:00
Poul-Henning Kamp
49091c48d5 Make sbuf_copyin() return the number of bytes copied on success.
Submitted by:	"Wojciech A. Koszek" <dunstan@freebsd.czest.pl>
2005-12-23 11:49:53 +00:00
Scott Long
d2a401cb70 Create the taskqueue_fast handler with INTR_MPSAFE so that it doesn't run
with Giant.

MFC After: 3 days
2005-12-23 06:18:33 +00:00
John Baldwin
b439e431bf Tweak how the MD code calls the fooclock() methods some. Instead of
passing a pointer to an opaque clockframe structure and requiring the
MD code to supply CLKF_FOO() macros to extract needed values out of the
opaque structure, just pass the needed values directly.  In practice this
means passing the pair (usermode, pc) to hardclock() and profclock() and
passing the boolean (usermode) to hardclock_cpu() and hardclock_process().
Other details:
- Axe clockframe and CLKF_FOO() macros on all architectures.  Basically,
  all the archs were taking a trapframe and converting it into a clockframe
  one way or another.  Now they can just extract the PC and usermode values
  directly out of the trapframe and pass it to fooclock().
- Renamed hardclock_process() to hardclock_cpu() as the latter is more
  accurate.
- On Alpha, we now run profclock() at hz (profhz == hz) rather than at
  the slower stathz.
- On Alpha, for the TurboLaser machines that don't have an 8254
  timecounter, call hardclock() directly.  This removes an extra
  conditional check from every clock interrupt on Alpha on the BSP.
  There is probably room for even further pruning here by changing Alpha
  to use the simplified timecounter we use on x86 with the lapic timer
  since we don't get interrupts from the 8254 on Alpha anyway.
- On x86, clkintr() shouldn't ever be called now unless using_lapic_timer
  is false, so add a KASSERT() to that effect and remove a condition
  to slightly optimize the non-lapic case.
- Change the prototype of arm_handler_execute() so that its first arg is a
  trapframe pointer rather than a void pointer for clarity.
- Use KCOUNT macro in profclock() to lookup the kernel profiling bucket.

Tested on:	alpha, amd64, arm, i386, ia64, sparc64
Reviewed by:	bde (mostly)
2005-12-22 22:16:09 +00:00
Alan Cox
373d1a3f8c Maintain the vnode lock throughout elfN_load_file() rather than releasing
it and reacquiring it in vrele().  Consequently, there is no reason to
increase the reference count on the vm object caching the file's pages.
Reviewed by: tegge

Eliminate unused parameters to elfN_load_file().
2005-12-21 18:58:40 +00:00
Alan Cox
ff6f03c7cd Eliminate an unneeded (vm_prot_t) parameter from two functions. Eliminate
unnecessary uses of a local variable.

Reviewed by: tegge
2005-12-20 23:42:18 +00:00
Pawel Jakub Dawidek
c505fe7a0f Reduce Giant scope a bit, as fdrop() is believed to be MPSAFE.
The purpose of this change is consistency (not performance improvement:)),
as it was hard to tell if fdrop() is MPSAFE or not when I saw it sometimes
under the Giant and sometimes without it.

Glanced at by:	ssouhlal, kan
2005-12-20 00:49:59 +00:00
Pawel Jakub Dawidek
ade9b797a0 vfs_mount_alloc() always returns 0, but what we really want is newly
allocated 'struct mount *' pointer, so simplify code a bit and return
the pointer directly.

Reviewed by:	ssouhlal
2005-12-20 00:43:51 +00:00
Pawel Jakub Dawidek
003ba8a000 Use 'td' instead of 'curthread'.
2005-12-19 16:27:13 +00:00
David Xu
a1d4fe69d2 Fix a bug in the slice calculation code: the current code uses hz, but
sched_clock() is called by the stat clock.

Submitted by: taku at tackymt dot homeip dot net
2005-12-19 08:26:09 +00:00
Nate Lawson
bd6b217753 Remove the KTR for hardclock completely. It seems to not be useful.
Requested by:	jhb
2005-12-18 18:11:55 +00:00
Nate Lawson
1335c4df32 Restore KTR_CRITICAL but conditionally compile it in as KTR_SCHED.
Requested by:	scottl, jhb
2005-12-18 18:10:57 +00:00
Marcel Moolenaar
757686b115 Make our ELF64 type definitions match standards. In particular this
means:
o  Remove Elf64_Quarter,
o  Redefine Elf64_Half to be 16-bit,
o  Redefine Elf64_Word to be 32-bit,
o  Add Elf64_Xword and Elf64_Sxword for 64-bit entities,
o  Use Elf_Size in MI code to abstract the difference between
   Elf32_Word and Elf64_Word.
o  Add Elf_Ssize as the signed counterpart of Elf_Size.

MFC after: 2 weeks
2005-12-18 04:52:37 +00:00
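
The resulting widths can be written down as a short sketch using fixed-width types from `<stdint.h>`. The Elf64_* names and widths follow the list above; mapping Elf_Size and Elf_Ssize onto the 64-bit Xword/Sxword types is an assumption about how the MI alias resolves on a 64-bit platform.

```c
#include <assert.h>
#include <stdint.h>

typedef uint16_t Elf64_Half;	/* 16-bit */
typedef uint32_t Elf64_Word;	/* 32-bit */
typedef uint64_t Elf64_Xword;	/* 64-bit unsigned entity */
typedef int64_t  Elf64_Sxword;	/* 64-bit signed entity */

/* Assumed MI aliases for a 64-bit platform. */
typedef Elf64_Xword  Elf_Size;
typedef Elf64_Sxword Elf_Ssize;

/* Compile-time checks (C11) that the widths match the list above. */
static_assert(sizeof(Elf64_Half) == 2, "Elf64_Half must be 16 bits");
static_assert(sizeof(Elf64_Word) == 4, "Elf64_Word must be 32 bits");
static_assert(sizeof(Elf64_Xword) == 8, "Elf64_Xword must be 64 bits");
static_assert(sizeof(Elf_Size) == sizeof(Elf_Ssize),
    "Elf_Ssize is the signed counterpart of Elf_Size");
```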
Alan Cox
044bbbb523 Correct a long-standing problem in elfN_map_insert(): In order to copy a
page to user space, the user space mapping must allow write access.

In collaboration with: tegge@
MFC after: 3 weeks
2005-12-17 19:40:47 +00:00
Nate Lawson
8615fd8696 Clean up unused or poorly utilized KTR values. Remove KTR_FS, KTR_KGDB,
and KTR_IO as they were never used.  Remove KTR_CLK since it was only
used for hardclock firing and use KTR_INTR there instead.  Remove
KTR_CRITICAL since it was only used for crit enter/exit and use
KTR_CONTENTION instead.
2005-12-17 03:57:10 +00:00
John Baldwin
5c8b444153 - Use uintfptr_t rather than int for the kernel profiling index (though it
  really should be a fptrdiff_t if we had that) in profclock().
- Only profile kernel pc's that are >= the kernel lowpc, to avoid
  underflows when computing a profiling index.
- Use the PC_TO_I() macro to compute the kernel profiling index rather than
  doing it inline.

Discussed with:	bde
2005-12-16 22:11:52 +00:00
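
The underflow guard can be illustrated with a simplified model of the profiling state. `prof_pc_t` stands in for uintfptr_t, and `struct prof_state`, its `bucket_bytes` field, and `prof_sample()` are hypothetical; only the name PC_TO_I and the pc/lowpc relationship come from the commit, and the macro body here is an assumed definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t prof_pc_t;	/* stand-in for uintfptr_t */

/* Simplified, hypothetical profiling state. */
struct prof_state {
	prof_pc_t lowpc;		/* lowest profiled address */
	prof_pc_t textsize;		/* bytes of text covered */
	unsigned short *kcount;		/* histogram buckets */
	unsigned bucket_bytes;		/* bytes of text per bucket */
};

/* Offset of pc from lowpc; only meaningful when pc >= lowpc. */
#define PC_TO_I(p, pc)	((prof_pc_t)(pc) - (p)->lowpc)

/* Returns 1 if the sample was counted, 0 if pc is out of range. */
int
prof_sample(struct prof_state *p, prof_pc_t pc)
{
	prof_pc_t i;

	assert(p != NULL && p->bucket_bytes != 0);
	if (pc < p->lowpc)
		return (0);	/* unsigned subtraction would wrap around */
	i = PC_TO_I(p, pc);
	if (i >= p->textsize)
		return (0);
	p->kcount[i / p->bucket_bytes]++;
	return (1);
}
```

With an unsigned index type, skipping the first check would turn a pc just below lowpc into a very large index rather than a negative one.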
John Baldwin
cb49fcd145 Change the addupc_*() functions to use the uintfptr_t type for pc rather
than uintptr_t as that is technically more correct.
2005-12-16 22:08:32 +00:00
Alan Cox
584716b08a Style: The second argument to vm_map_find() should be NULL instead of 0.
2005-12-16 19:14:25 +00:00
Alan Cox
da61b9a69e Use sf_buf_alloc() instead of vm_map_find() on exec_map to create the
ephemeral mappings that are used as the source for three copy
operations from kernel space to user space.  There are two reasons for
making this change: (1) Under heavy load exec_map can fill up causing
vm_map_find() to fail.  When it fails, the nascent process is aborted
(SIGABRT).  Whereas, this reimplementation using sf_buf_alloc()
sleeps.  (2) Although it is possible to sleep on vm_map_find()'s
failure until address space becomes available (see kmem_alloc_wait()),
using sf_buf_alloc() is faster.  Furthermore, the reimplementation
uses a CPU private mapping, avoiding a TLB shootdown on
multiprocessors.

Problem uncovered by: kris@
Reviewed by: tegge@
MFC after: 3 weeks
2005-12-16 18:34:14 +00:00
Xin LI
6ba9ec2d09 In pipe_write(): when uiomove() fails, do not spin on it forever.
Submitted by:	Kostik Belousov <kostikbel at gmail.com> on -current@
Message-ID:	<20051216151016.GE84442@deviant.zoral.local>
MFC After:	3 weeks
2005-12-16 18:32:39 +00:00
David Xu
03f70aec67 Replace selwakeuppri with selwakeup, let scheduler figure out
appropriate thread priority.
2005-12-16 15:01:16 +00:00
Ed Maste
63e6f39011 When using m_dup(9) to copy more than MHLEN bytes of data, don't create an
mbuf chain that starts with a cluster containing just MHLEN bytes.  This
happened because m_dup called m_get or m_getcl depending on the amount of
data to copy, but then always set the size available in the first mbuf to
MHLEN.

Submitted by:	Matt Koivisto <mkoivisto at sandvine dot com>
Approved by:	jmg
Silence from:	rwatson (mentor)
2005-12-14 23:34:26 +00:00
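
A rough model of the corrected behavior: the space available in the first buffer follows the kind of buffer that was actually allocated. Everything here is hypothetical (`SK_MHLEN`, `SK_MCLBYTES`, `struct sk_first`, and both functions), and the sizes and the selection threshold are assumed values, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MHLEN	200	/* assumed; the real MHLEN is platform-dependent */
#define SK_MCLBYTES	2048	/* assumed cluster size */

enum sk_kind { SK_PLAIN, SK_CLUSTER };

struct sk_first {
	enum sk_kind kind;
	size_t space;	/* bytes available in the first buffer */
};

/* Pick the first buffer and record the space that kind really offers. */
struct sk_first
sk_alloc_first(size_t len)
{
	struct sk_first f;

	assert(len > 0);
	if (len > SK_MHLEN) {
		f.kind = SK_CLUSTER;
		f.space = SK_MCLBYTES;	/* the bug left this at MHLEN */
	} else {
		f.kind = SK_PLAIN;
		f.space = SK_MHLEN;
	}
	return (f);
}

/* Number of bytes that land in the first buffer of the copy. */
size_t
sk_first_copy_len(size_t len)
{
	struct sk_first f = sk_alloc_first(len);

	return (len < f.space ? len : f.space);
}
```

In this model, always using `SK_MHLEN` for `space` would put only 200 bytes into a 2048-byte cluster, which is the wasteful chain the commit describes.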
Maxime Henrion
e59898ff36 Fix a bunch of SYSCTL_INT() that should have been SYSCTL_ULONG() to
match the type of the variable they are exporting.

Spotted by:	Thomas Hurst <tom@hur.st>
MFC after:	3 days
2005-12-14 22:27:48 +00:00
Dag-Erling Smørgrav
0430a5e289 Eradicate caddr_t from the VFS API.
2005-12-14 00:49:52 +00:00
John Baldwin
d272fe53a4 Add a new 'show lock' command to ddb. If the argument has a valid lock
class, then it displays various information about the lock and calls a
new function pointer in lock_class (lc_ddb_show) to dump class-specific
information about the lock as well (such as the owner of a mutex or
xlock'ed sx lock).  This is easier than staring at hex dumps of locks to
figure out who owns the lock, etc.  Note that extending lock_class doesn't
affect the ABI for any kernel modules as the only code that deals with
lock_class structures directly is kern_mutex.c, kern_sx.c, and witness.

MFC after:	1 week
2005-12-13 23:14:35 +00:00
David Xu
dd1a6f53ac Stop fiddling with thread priority via msleep, eliminating unnecessary
context switching. This improves performance by about 30% on a UP machine.
2005-12-12 05:04:56 +00:00
Craig Rodrigues
92f44a3f3c Contributions from XFS for FreeBSD project:
- Implement cv_wait_unlock() method which has semantics compatible
  with the sv_wait() method in IRIX.  For cv_wait_unlock(), the lock
  must be held before entering the function, but is not held when the
  function is exited.

- Implement the existing cv_wait() function in terms of cv_wait_unlock().

Submitted by:	kan
Feedback from:	jhb, trhodes, Christoph Hellwig <hch at infradead dot org>
2005-12-12 00:02:22 +00:00
Alan Cox
05406e6f33 Remove unneeded calls to pmap_remove_all(). The given page is not mapped.
Reviewed by: tegge
2005-12-11 22:06:57 +00:00
Andre Oppermann
36ae3fd3c3 Hide the 4k mbuf clusters if the normal clusters are defined to be
4k already.

This unbreaks tinderbox.

Submitted by:	ru
2005-12-10 15:21:04 +00:00
David Xu
3e70c6f047 Fix a compile warning on 64-bit systems.
2005-12-09 13:16:48 +00:00
David Xu
f71a882f15 Add a sysctl to force a process to sigexit if a trap signal is
being held by the current thread or ignored by the current process;
otherwise, it is quite possible that the thread will enter an infinite loop
and lead to an administrator's nightmare.
2005-12-09 08:29:29 +00:00
David Xu
d26b1a1fb9 Register itimers_event_hook as a kernel event handler, so I don't
have to duplicate code to call it in exec() and exit1().
2005-12-09 05:43:26 +00:00