4e32b7b3cc
running thread's id on each cpu. This allows us to add in-kernel adaptive spinning for user-level mutexes. While spinning in user space is possible, it can hardly be implemented efficiently, without wasting cpu cycles, unless the kernel exports the correct thread running state; exporting that state is unlikely to be implemented soon, since its interfaces must first be designed and stabilized. This implementation is transparent to user space and can be disabled dynamically. With this change, the performance of a mutex ping-pong program improves massively on SMP machines. Performance of the mysql super-smack select benchmark increases by about 7% on an Intel dual dual-core2 Xeon machine, which indicates that on systems with many cpus and low system-call overhead (athlon64, opteron, and core-2 are known to be fast), adaptive spinning does help performance. Added sysctls: kern.threads.umtx_dflt_spins: if the sysctl value is non-zero, a zero umutex.m_spincount causes the sysctl value to be used as the spin cycle count. kern.threads.umtx_max_spins: sets the upper limit of the spin cycle count. Tested on: Athlon64 X2 3800+, Dual Xeon 5130 |
||
---|---|---|
.. | ||
autoconf.c | ||
busdma_machdep.c | ||
clock.c | ||
context.S | ||
db_machdep.c | ||
dump_machdep.c | ||
efi.c | ||
elf_machdep.c | ||
emulate.c | ||
exception.S | ||
gdb_machdep.c | ||
genassym.c | ||
in_cksum.c | ||
interrupt.c | ||
locore.S | ||
machdep.c | ||
mca.c | ||
mem.c | ||
mp_machdep.c | ||
nexus.c | ||
pal.S | ||
pmap.c | ||
ptrace_machdep.c | ||
sal.c | ||
sapic.c | ||
setjmp.S | ||
ssc.c | ||
sscdisk.c | ||
support.S | ||
sys_machdep.c | ||
syscall.S | ||
trap.c | ||
uio_machdep.c | ||
uma_machdep.c | ||
unaligned.c | ||
unwind.c | ||
vm_machdep.c |
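
The interaction between the two sysctls described in the commit message can be sketched as a small pure function. This is a minimal illustration, not the committed kernel code: the function name `effective_spincount` and its parameters are hypothetical stand-ins for `umutex.m_spincount`, `kern.threads.umtx_dflt_spins`, and `kern.threads.umtx_max_spins`, and clamping the default value to the maximum as well is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical sketch of spin-count selection.
 *   m_spincount: per-mutex value (stand-in for umutex.m_spincount)
 *   dflt_spins:  stand-in for kern.threads.umtx_dflt_spins
 *   max_spins:   stand-in for kern.threads.umtx_max_spins
 */
static uint32_t
effective_spincount(uint32_t m_spincount, uint32_t dflt_spins,
    uint32_t max_spins)
{
	uint32_t spins = m_spincount;

	/* A zero per-mutex count falls back to a non-zero default. */
	if (spins == 0 && dflt_spins != 0)
		spins = dflt_spins;

	/* Assumption: the upper limit applies to whichever value was chosen. */
	if (spins > max_spins)
		spins = max_spins;

	return (spins);
}
```

Under these assumptions, a result of zero means no spinning at all, which would correspond to setting both the per-mutex count and the default to zero.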