freebsd-skq

Author	SHA1	Message	Date
alc	d1bce06c64	Change the management of cached pages (PQ_CACHE) in two fundamental ways: (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock. This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock. Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq. (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed. Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail. Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)	2007-09-25 06:25:06 +00:00
alc	a8415c5a0d	Enable the new physical memory allocator. This allocator uses a binary buddy system with a twist. First and foremost, this allocator is required to support the implementation of superpages. As a side effect, it enables a more robust implementation of contigmalloc(9). Moreover, this reimplementation of contigmalloc(9) eliminates the acquisition of Giant by contigmalloc(..., M_NOWAIT, ...). The twist is that this allocator tries to reduce the number of TLB misses incurred by accesses through a direct map to small, UMA-managed objects and page table pages. Roughly speaking, the physical pages that are allocated for such purposes are clustered together in the physical address space. The performance benefits vary. In the most extreme case, a uniprocessor kernel running on an Opteron, I measured an 18% reduction in system time during a buildworld. This allocator does not implement page coloring. The reason is that superpages have much the same effect. The contiguous physical memory allocation necessary for a superpage is inherently colored. Finally, the one caveat is that this allocator does not effectively support prezeroed pages. I hope this is temporary. On i386, this is a slight pessimization. However, on amd64, the beneficial effects of the direct-map optimization outweigh the ill effects. I speculate that this is true in general of machines with a direct map. Approved by: re	2007-06-16 04:57:06 +00:00
attilio	e9fc4edc44	Optimize vmmeter locking. In particular: - Add an explicative table for locking of struct vmmeter members - Apply new rules for some of those members - Remove some unuseful comments Heavily reviewed by: alc, bde, jeff Approved by: jeff (mentor)	2007-06-10 21:59:14 +00:00
attilio	7dd8ed88a9	Revert VMCNT_* operations introduction. Probabilly, a general approach is not the better solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
jeff	e1996cb960	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
alc	4881bd38e2	Change the free page queue lock from a spin mutex to a default (blocking) mutex. With the demise of Alpha support, there is no longer a reason for it to be a spin mutex.	2007-02-05 06:02:55 +00:00
alc	d108c1d6d1	The return value from vm_pageq_add_new_page() is not used. Eliminate it.	2006-08-25 04:36:19 +00:00
alc	fe4d04c555	Add _vm_stats and _vm_stats_misc to the sysctl declarations in sysctl.h and eliminate their declarations from various source files.	2006-08-21 06:27:28 +00:00
jhb	d4be78a6fa	Move the code to handle the vm.blacklist tunable up a layer into vm_page_startup(). As a result, we now only lookup the tunable once instead of looking it up once for every physical page of memory in the system. This cuts out about a 1 second or so delay in boot on x86 systems. The delay is much larger and more noticable on sun4v apparently. Reported by: kmacy MFC after: 1 week	2006-06-23 16:44:24 +00:00
alc	67908520ec	Add synchronization to vm_pageq_add_new_page() so that it can be called safely after kernel initialization. Remove GIANT_REQUIRED. MFC after: 6 weeks	2006-04-25 17:27:24 +00:00
imp	f1947eff71	Remove leading __ from __(inline\|const\|signed\|volatile). They are obsolete. This should reduce diffs to NetBSD as well.	2006-03-08 06:31:46 +00:00
alc	c5764c76fc	Style: Add blank line after local variable declarations.	2006-01-27 21:06:37 +00:00
alc	d2566b1431	Use the new macros abstracting the page coloring/queues implementation. (There are no functional changes.)	2006-01-27 07:28:51 +00:00
alc	e8d6ec993a	With the recent changes to the implementation of page coloring, the the option PQ_NOOPT is used exclusively by vm_pageq.c. Thus, the include of opt_vmpage.h can be removed from vm_page.h.	2006-01-24 19:24:54 +00:00
netchild	8852f0af37	Convert the PAGE_SIZE check into a CTASSERT. Suggested by: jhb	2006-01-04 19:19:42 +00:00
netchild	fbbb1e238e	Prevent divide by zero, use default values in case one of the divisor's is zero. Tested by: Randy Bush <randy@psg.com>	2006-01-04 18:26:54 +00:00
netchild	507a9b3e93	MI changes: - provide an interface (macros) to the page coloring part of the VM system, this allows to try different coloring algorithms without the need to touch every file [1] - make the page queue tuning values readable: sysctl vm.stats.pagequeue - autotuning of the page coloring values based upon the cache size instead of options in the kernel config (disabling of the page coloring as a kernel option is still possible) MD changes: - detection of the cache size: only IA32 and AMD64 (untested) contains cache size detection code, every other arch just comes with a dummy function (this results in the use of default values like it was the case without the autotuning of the page coloring) - print some more info on Intel CPU's (like we do on AMD and Transmeta CPU's) Note to AMD owners (IA32 and AMD64): please run "sysctl vm.stats.pagequeue" and report if the cache* values are zero (= bug in the cache detection code) or not. Based upon work by: Chad David <davidc@acns.ab.ca> [1] Reviewed by: alc, arch (in 2004) Discussed with: alc, Chad David, arch (in 2004)	2005-12-31 14:39:20 +00:00
alc	2d109601cb	Introduce a procedure, pmap_page_init(), that initializes the vm_page's machine-dependent fields. Use this function in vm_pageq_add_new_page() so that the vm_page's machine-dependent and machine-independent fields are initialized at the same time. Remove code from pmap_init() for initializing the vm_page's machine-dependent fields. Remove stale comments from pmap_init(). Eliminate the Boolean variable pmap_initialized from the alpha, amd64, i386, and ia64 pmap implementations. Its use is no longer required because of the above changes and earlier changes that result in physical memory that is being mapped at initialization time being mapped without pv entries. Tested by: cognet, kensmith, marcel	2005-06-10 03:33:36 +00:00
alc	6224234587	Update some comments to reflect the change from spl-based to lock-based synchronization.	2005-05-28 17:56:18 +00:00
des	9900fad82d	Unbreak the build on 64-bit architectures.	2005-04-16 12:37:16 +00:00
jhb	de9a1fa207	Add a vm.blacklist tunable which can hold a space or comma seperated list of physical addresses. The pages containing these physical addresses will not be added to the free list and thus will effectively be ignored by the VM system. This is mostly useful for the case when one knows of specific physical addresses that have bit errors (such as from a memtest run) so that one can blacklist the bad pages while waiting for the new sticks of RAM to arrive. The physical addresses of any ignored pages are listed in the message buffer as well.	2005-04-15 21:45:02 +00:00
netchild	8ec3bc4102	Remove references to L1 in the comments, according to Alan they are historical leftovers. Approved by: alc	2004-06-07 19:33:05 +00:00
hmp	58fffa8ca6	Correct typo, vm_page_list_find() is called vm_pageq_find() for quite a long time, i.e., since the cleanup of the VM Page-queues code done two years ago. Reviewed by: Alan Cox <alc at freebsd.org>, Matthew Dillon <dillon at backplane.com>	2004-05-30 00:42:38 +00:00
imp	3bc162cfa3	Expand inline the relevant parts of src/COPYRIGHT for Matt Dillon's copyrighted files. Approved by: Matt Dillon	2003-08-12 23:24:05 +00:00
alc	5cfe94b875	Modify vm_pageq_requeue() to handle a PQ_NONE page without dereferencing a NULL pointer; remove some now unused code.	2003-06-26 03:14:40 +00:00
obrien	b0678d7a44	Use __FBSDID().	2003-06-11 23:50:51 +00:00
jake	783ae539c3	- Add vm_paddr_t, a physical address type. This is required for systems where physical addresses larger than virtual addresses, such as i386s with PAE. - Use this to represent physical addresses in the MI vm system and in the i386 pmap code. This also changes the paddr parameter to d_mmap_t. - Fix printf formats to handle physical addresses >4G in the i386 memory detection code, and due to kvtop returning vm_paddr_t instead of u_long. Note that this is a name change only; vm_paddr_t is still the same as vm_offset_t on all currently supported platforms. Sponsored by: DARPA, Network Associates Laboratories Discussed with: re, phk (cdevsw change)	2003-03-25 00:07:06 +00:00
alc	7393ec9f9d	Remove GIANT_REQUIRED from vm_pageq_remove().	2003-02-16 06:36:48 +00:00
alc	9b15115842	o Remove dead and/or unused code.	2002-07-20 05:06:20 +00:00
alc	1ba951badc	o Remove the acquisition and release of Giant from the idle priority thread that pre-zeroes free pages. o Remove GIANT_REQUIRED from some low-level page queue functions. (Instead assertions on the page queue lock are being added to the higher-level functions, like vm_page_wire(), etc.) In collaboration with: peter	2002-07-18 17:40:07 +00:00
jhb	db9aa81e23	Change callers of mtx_init() to pass in an appropriate lock type name. In most cases NULL is passed, but in some cases such as network driver locks (which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used. Tested on: i386, alpha, sparc64	2002-04-04 21:03:38 +00:00
eivind	0799ec54b1	- Remove a number of extra newlines that do not belong here according to style(9) - Minor space adjustment in cases where we have "( ", " )", if(), return(), while(), for(), etc. - Add /* SYMBOL */ after a few #endifs. Reviewed by: alc	2002-03-10 21:52:48 +00:00
alc	42c47fb891	o Create vm_pageq_enqueue() to encapsulate code that is duplicated time and again in vm_page.c and vm_pageq.c. o Delete unusused prototypes. (Mainly a result of the earlier renaming of various functions from vm_page_() to vm_pageq_().)	2002-03-04 18:55:26 +00:00
tegge	5f4060fe3b	Add a page queue, PQ_HOLD, that temporarily owns pages with nonzero hold count that would otherwise be on one of the free queues. This eliminates a panic when broken programs unmap memory that still has pending IO from raw devices. Reviewed by: dillon, alc	2002-02-19 23:19:30 +00:00
dillon	93369f554a	Reorg vm_page.c into vm_page.c, vm_pageq.c, and vm_contig.c (for contigmalloc). Also removed some spl's and added some VM mutexes, but they are not actually used yet, so this commit does not really make any operational changes to the system. vm_page.c relates to vm_page_t manipulation, including high level deactivation, activation, etc... vm_pageq.c relates to finding free pages and aquiring exclusive access to a page queue (exclusivity part not yet implemented). And the world still builds... :-)	2001-07-04 23:27:09 +00:00

35 Commits