/*-
 * Copyright (c) 1994 John Dyson
 * Copyright (c) 2001 Matt Dillon
 *
 * All Rights Reserved.
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS
 * OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY
 * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
 * GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
 * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
 * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
 * NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
 * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 * from: @(#)vm_machdep.c 7.3 (Berkeley) 5/13/91
 *	Utah $Hdr: vm_machdep.c 1.16.1.1 89/06/23$
 * from: FreeBSD: .../i386/vm_machdep.c,v 1.165 2001/07/04 23:27:04 dillon
 */

#include <sys/cdefs.h>
__FBSDID("$FreeBSD$");

#include <opt_sched.h>

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/kernel.h>
#include <sys/proc.h>
#include <sys/vmmeter.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/sched.h>
#include <sys/sysctl.h>
#include <sys/kthread.h>
#include <sys/unistd.h>

#include <vm/vm.h>
#include <vm/vm_page.h>
#include <vm/vm_phys.h>

static int idlezero_enable_default = 0;
TUNABLE_INT("vm.idlezero_enable", &idlezero_enable_default);
/* Defer setting the enable flag until the kthread is running. */
static int idlezero_enable = 0;
SYSCTL_INT(_vm, OID_AUTO, idlezero_enable, CTLFLAG_RW, &idlezero_enable, 0,
    "Allow the kernel to use idle cpu cycles to zero-out pages");

/*
 * Implement the pre-zeroed page mechanism.
 */

#define ZIDLE_LO(v)	((v) * 2 / 3)
#define ZIDLE_HI(v)	((v) * 4 / 5)

static boolean_t wakeup_needed = FALSE;
static int zero_state;

static int
vm_page_zero_check(void)
{

	if (!idlezero_enable)
		return (0);
	/*
	 * Attempt to maintain approximately 1/2 of our free pages in a
	 * PG_ZERO'd state.  Add some hysteresis to (attempt to) avoid
	 * generally zeroing a page when the system is near steady-state.
	 * Otherwise we might get 'flutter' during disk I/O / IPC or
	 * fast sleeps.  We also do not want to be continuously zeroing
	 * pages because doing so may flush our L1 and L2 caches too much.
	 */
	if (zero_state && vm_page_zero_count >= ZIDLE_LO(cnt.v_free_count))
		return (0);
	if (vm_page_zero_count >= ZIDLE_HI(cnt.v_free_count))
		return (0);
	return (1);
}

static void
vm_page_zero_idle(void)
{

	mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
	zero_state = 0;
	if (vm_phys_zero_pages_idle()) {
		if (vm_page_zero_count >= ZIDLE_HI(cnt.v_free_count))
			zero_state = 1;
	}
}

/* Called by vm_page_free to hint that a new page is available. */
void
vm_page_zero_idle_wakeup(void)
{

	mtx_assert(&vm_page_queue_free_mtx, MA_OWNED);
	if (wakeup_needed && vm_page_zero_check()) {
		wakeup_needed = FALSE;
		wakeup(&zero_state);
	}
}

static void
vm_pagezero(void __unused *arg)
{

	idlezero_enable = idlezero_enable_default;

	mtx_lock(&vm_page_queue_free_mtx);
	for (;;) {
		if (vm_page_zero_check()) {
			vm_page_zero_idle();
#ifndef PREEMPTION
			if (sched_runnable()) {
				thread_lock(curthread);
				mi_switch(SW_VOL | SWT_IDLE, NULL);
				thread_unlock(curthread);
			}
#endif
		} else {
			wakeup_needed = TRUE;
			msleep(&zero_state, &vm_page_queue_free_mtx, 0,
			    "pgzero", hz * 300);
		}
	}
}

static void
pagezero_start(void __unused *arg)
{
	int error;
	struct proc *p;
	struct thread *td;

	error = kproc_create(vm_pagezero, NULL, &p, RFSTOPPED, 0, "pagezero");
	if (error)
		panic("pagezero_start: error %d\n", error);
	td = FIRST_THREAD_IN_PROC(p);
	thread_lock(td);

	/* We're an idle task, don't count us in the load. */
	td->td_flags |= TDF_NOLOAD;
	sched_class(td, PRI_IDLE);
	sched_prio(td, PRI_MAX_IDLE);
	sched_add(td, SRQ_BORING);
	thread_unlock(td);
}
SYSINIT(pagezero, SI_SUB_KTHREAD_VM, SI_ORDER_ANY, pagezero_start, NULL);