freebsd-dev

Author	SHA1	Message	Date
Rafal Jaworowski	ffb5669540	Rework and extend PowerPC headers definitons towards Book-E/e500 CPUs support. Approved by: cognet (mentor) Obtained from: Juniper, Semihalf MFp4: e500	2008-03-03 13:20:52 +00:00
Alan Cox	b8e7fc24fe	Add configuration knobs for the superpage reservation system. Initially, the reservation will only be enabled on amd64.	2007-12-27 16:45:39 +00:00
Alan Cox	7bfda801a8	Change the management of cached pages (PQ_CACHE) in two fundamental ways: (1) Cached pages are no longer kept in the object's resident page splay tree and memq. Instead, they are kept in a separate per-object splay tree of cached pages. However, access to this new per-object splay tree is synchronized by the _free_ page queues lock, not to be confused with the heavily contended page queues lock. Consequently, a cached page can be reclaimed by vm_page_alloc(9) without acquiring the object's lock or the page queues lock. This solves a problem independently reported by tegge@ and Isilon. Specifically, they observed the page daemon consuming a great deal of CPU time because of pages bouncing back and forth between the cache queue (PQ_CACHE) and the inactive queue (PQ_INACTIVE). The source of this problem turned out to be a deadlock avoidance strategy employed when selecting a cached page to reclaim in vm_page_select_cache(). However, the root cause was really that reclaiming a cached page required the acquisition of an object lock while the page queues lock was already held. Thus, this change addresses the problem at its root, by eliminating the need to acquire the object's lock. Moreover, keeping cached pages in the object's primary splay tree and memq was, in effect, optimizing for the uncommon case. Cached pages are reclaimed far, far more often than they are reactivated. Instead, this change makes reclamation cheaper, especially in terms of synchronization overhead, and reactivation more expensive, because reactivated pages will have to be reentered into the object's primary splay tree and memq. (2) Cached pages are now stored alongside free pages in the physical memory allocator's buddy queues, increasing the likelihood that large allocations of contiguous physical memory (i.e., superpages) will succeed. Finally, as a result of this change long-standing restrictions on when and where a cached page can be reclaimed and returned by vm_page_alloc(9) are eliminated. Specifically, calls to vm_page_alloc(9) specifying VM_ALLOC_INTERRUPT can now reclaim and return a formerly cached page. Consequently, a call to malloc(9) specifying M_NOWAIT is less likely to fail. Discussed with: many over the course of the summer, including jeff@, Justin Husted @ Isilon, peter@, tegge@ Tested by: an earlier version by kris@ Approved by: re (kensmith)	2007-09-25 06:25:06 +00:00
Alan Cox	2446e4f02c	Enable the new physical memory allocator. This allocator uses a binary buddy system with a twist. First and foremost, this allocator is required to support the implementation of superpages. As a side effect, it enables a more robust implementation of contigmalloc(9). Moreover, this reimplementation of contigmalloc(9) eliminates the acquisition of Giant by contigmalloc(..., M_NOWAIT, ...). The twist is that this allocator tries to reduce the number of TLB misses incurred by accesses through a direct map to small, UMA-managed objects and page table pages. Roughly speaking, the physical pages that are allocated for such purposes are clustered together in the physical address space. The performance benefits vary. In the most extreme case, a uniprocessor kernel running on an Opteron, I measured an 18% reduction in system time during a buildworld. This allocator does not implement page coloring. The reason is that superpages have much the same effect. The contiguous physical memory allocation necessary for a superpage is inherently colored. Finally, the one caveat is that this allocator does not effectively support prezeroed pages. I hope this is temporary. On i386, this is a slight pessimization. However, on amd64, the beneficial effects of the direct-map optimization outweigh the ill effects. I speculate that this is true in general of machines with a direct map. Approved by: re	2007-06-16 04:57:06 +00:00
Alan Cox	66ab556097	Eliminate some unused definitions that came from NetBSD.	2007-05-28 21:04:22 +00:00
Alan Cox	c155d5d059	Eliminate an unused definition.	2007-05-27 20:34:26 +00:00
Alan Cox	04a18977c8	Define every architecture as either VM_PHYSSEG_DENSE or VM_PHYSSEG_SPARSE depending on whether the physical address space is densely or sparsely populated with memory. The effect of this definition is to determine which of two implementations of vm_page_array and PHYS_TO_VM_PAGE() is used. The legacy implementation is obtained by defining VM_PHYSSEG_DENSE, and a new implementation that trades off time for space is obtained by defining VM_PHYSSEG_SPARSE. For now, all architectures except for ia64 and sparc64 define VM_PHYSSEG_DENSE. Defining VM_PHYSSEG_SPARSE on ia64 allows the entirety of my Itanium 2's memory to be used. Previously, only the first 1 GB could be used. Defining VM_PHYSSEG_SPARSE on sparc64 allows USIIIi-based systems to boot without crashing. This change is a combination of Nathan Whitehorn's patch and my own work in perforce. Discussed with: kmacy, marius, Nathan Whitehorn PR: 112194	2007-05-05 19:50:28 +00:00
Alan Cox	b554f899bd	Eliminate unused definitions. (They came from NetBSD.) Discussed with: cognet, grehan, marcel	2006-08-25 23:51:11 +00:00
Peter Grehan	4daf20b2f1	Increase kernel VA from 256Mb to 512Mb by shifting the segment used for user copyinout down to 12, and keeping segments 13/14 for kernel VA. It would be nice to have more available, but segments lower than this are reserved for either memory or 1:1 mapped device i/o, and seg 15 is OpenFirmware ROM. Also, the effort to keep OpenFirmware available for callbacks limits the use of VA-mapped segments. Fortunately UMA_MD_SMALL_ALLOC takes away a lot of VM pressure. Obtained from: NetBSD	2004-03-02 06:49:21 +00:00
Peter Grehan	7c2779715c	Cleaned up param.h: - culled long-dead #define's - segment register defs moved to sr.h - NPMAPS moved to pmap.h - KERNBASE moved to vmparam.h - removed include of <machine/cpu.h> and fixed src files that relied on this. Modifying segment register code no longer causes gcc rebuilds :-)	2004-02-11 07:27:34 +00:00
Peter Grehan	db55e39aa1	Implement UMA_MD_SMALL_ALLOC, since the BAT registers allow direct addressing of memory. Makes a substantial improvement for apps that stress the limited amount of KVM on PPC (e.g. untarring the ports tree). uma_machdep.c stolen from amd64/ia64.	2004-01-29 00:32:22 +00:00
Benno Rice	f9bac91b18	Bring in NetBSD code used in the PowerPC port. Reviewed by: obrien, dfr Obtained from: NetBSD	2001-06-10 02:39:37 +00:00

12 Commits