freebsd-dev

Author	SHA1	Message	Date
Nathan Whitehorn	c1f4123b05	Add support for memory attributes (pmap_mapdev_attr() and friends) on PowerPC/AIM. This is currently stubbed out on Book-E, since I have no idea how to implement it there.	2010-09-30 18:14:12 +00:00
Nathan Whitehorn	6416b9a85d	Split the SLB mirror cache into two kinds of object, one for kernel maps which are similar to the previous ones, and one for user maps, which are arrays of pointers into the SLB tree. This changes makes user SLB updates atomic, closing a window for memory corruption. While here, rearrange the allocation functions to make context switches faster.	2010-09-16 03:46:17 +00:00
Nathan Whitehorn	95fa3335e1	Replace the SLB backing store splay tree used on 64-bit PowerPC AIM hardware with a lockless sparse tree design. This marginally improves the performance of PMAP and allows copyin()/copyout() to run without acquiring locks when used on wired mappings. Submitted by: mdf	2010-09-16 00:22:25 +00:00
Peter Grehan	33529b98d5	Introduce inheritance into the PowerPC MMU kobj interface. include/mmuvar.h - Change the MMU_DEF macro to also create the class definition as well as define the DATA_SET. Add a macro, MMU_DEF_INHERIT, which has an extra parameter specifying the MMU class to inherit methods from. Update the comments at the start of the header file to describe the new macros. booke/pmap.c aim/mmu_oea.c aim/mmu_oea64.c - Collapse mmu_def_t declaration into updated MMU_DEF macro The MMU_DEF_INHERIT macro will be used in the PS3 MMU implementation to allow it to inherit the stock powerpc64 MMU methods. Reviewed by: nwhitehorn	2010-09-15 00:17:52 +00:00
Nathan Whitehorn	61473c5fd1	Reorder statistics tracking and table lock acquisitions already in place to avoid race conditions updating the PVO statistics.	2010-09-09 16:06:55 +00:00
Nathan Whitehorn	bcb478eb35	Fix a printf specifier on 64-bit systems.	2010-09-08 19:28:43 +00:00
Nathan Whitehorn	0dfddf6e65	Fix a typo in the original import of this code from NetBSD that caused the wrong element of the VSID bitmap array to be examined after a collision, leading to reallocation of in-use VSIDs under some circumstances, with attendant memory corruption. Also add an assert to check for this kind of problem in the future. MFC after: 4 days	2010-09-08 16:58:06 +00:00
Nathan Whitehorn	4982c539ae	Fix an error made in r209975 related to context ID allocation for 64-bit PowerPC CPUs running a 32-bit kernel. This bug could cause in-use VSIDs to be allocated again to another process, causing memory space overlaps and corruption. Reported by: linimon	2010-09-07 23:31:48 +00:00
Nathan Whitehorn	1264a5f4c5	Missed one place the SLB lock should be held in r211967.	2010-08-31 02:07:13 +00:00
Nathan Whitehorn	7eeda62ca9	Avoid a race in the allocation of new segment IDs that could result in memory corruption on heavily loaded SMP systems. MFC after: 2 weeks	2010-08-29 18:17:38 +00:00
Nathan Whitehorn	3b4b38304e	Improve hash coverage for kernel page table entries by modifying the kernel ESID -> VSID map function. This makes ZFS run stably on PowerPC under heavy loads (repeated simultaneous SVN checkouts and updates).	2010-07-31 21:35:15 +00:00
Nathan Whitehorn	c3e289e1ce	MFppc64: Kernel sources for 64-bit PowerPC, along with build-system changes to keep 32-bit kernels compiling (build system changes for 64-bit kernels are coming later). Existing 32-bit PowerPC kernel configurations must be updated after this change to specify their architecture.	2010-07-13 05:32:19 +00:00
Alan Cox	9124d0d6a3	Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)	2010-06-11 15:49:39 +00:00
Alan Cox	ce18658792	Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea_page_exits_quick() and moea_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea_clear_bit(), simplifying moea_clear_modify() and moea_clear_reference(). The last parameter to moea_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@	2010-06-10 16:56:35 +00:00
Alan Cox	2368a37125	Don't set PG_WRITEABLE in pmap_enter() unless the page is managed.	2010-06-05 06:56:06 +00:00
Alan Cox	c46b90e90a	Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea_is_modified() and moea_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib	2010-05-26 18:00:44 +00:00
Alan Cox	567e51e18c	Roughly half of a typical pmap_mincore() implementation is machine- independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page. Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information. Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock. Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed(). Reviewed by: kib (an earlier version)	2010-05-24 14:26:57 +00:00
Alan Cox	9ab6032f73	On entry to pmap_enter(), assert that the page is busy. While I'm here, make the style of assertion used by pmap_enter() consistent across all architectures. On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page. With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running. Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.	2010-05-16 23:45:10 +00:00
Alan Cox	3c4a24406b	Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and pmap_remove_write(). Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.) Switch to a per-processor counter for the total number of pages cached.	2010-05-08 20:34:01 +00:00
Kip Macy	2965a45315	On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib	2010-04-30 00:46:43 +00:00
Alan Cox	7b85f59183	Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!	2010-04-24 17:32:52 +00:00
Nathan Whitehorn	cb8617b275	Revisit locking in the 64-bit AIM PMAP. The PVO head for a page is generally protected by the VM page queue mutex. Instead of extending the table lock to cover the PVO heads, add some asserts that the page queue mutex is in fact held. This fixes several LORs and possible deadlocks. This also adds an optimization to moea64_kextract() useful for direct-mapped quantities, like UMA buffers. Being able to use this from inside UMA removes an additional LOR.	2010-03-20 14:35:24 +00:00
Nathan Whitehorn	5cf13d9573	Fix two small bugs. The PowerPC 970 does not support non-coherent memory access, and reflects this by autonomously writing LPTE_M into PTE entries. As such, we should not panic if LPTE_M changes by itself. While here, fix a harmless typo in moea64_sync_icache().	2010-03-15 00:27:40 +00:00
Nathan Whitehorn	5d7fdd31c8	Fix an obvious lock escape and fix a typo in a comment.	2010-03-04 17:24:31 +00:00
Nathan Whitehorn	9fcd9ccb86	Patch some more concurrency issues here. This expands the page table lock to cover the PVOs, and removes the scratchpage PTEs from the PVOs entirely to avoid the system trying to be helpful and rewriting them.	2010-03-04 06:39:58 +00:00
Nathan Whitehorn	44f06ae57d	Move the OEA64 scratchpage to the end of KVA from the beginning, and set its PVO to map physical address 0 instead of kernelstart. This fixes a situation in which a user process could attempt to return this address via KVM, have it fault while being modified, and then panic the kernel because (a) it is supposed to map a valid address and (b) it lies in the no-fault region between VM_MIN_KERNEL_ADDRESS and virtual_avail. While here, move msgbuf and dpcpu make into regular KVA space for consistency with other implementations.	2010-02-25 03:53:21 +00:00
Nathan Whitehorn	07d5198098	Provide an implementation of pmap_dev_direct_mapped() on OEA64. This is required in order to be able to mmap the running kernel, which is turn required to avoid fstat returning gibberish.	2010-02-25 03:49:17 +00:00
Nathan Whitehorn	9b3829abc7	Use dcbz instead of word stores for page zeroing, providing a factor of 3-4 speedup.	2010-02-24 00:55:55 +00:00
Nathan Whitehorn	83c01b8cf6	Close a race involving the OEA64 scratchpage. When the scratch page's physical address is changed, there is a brief window during which its PTE is invalid. Since moea64_set_scratchpage_pa() does not and cannot hold the page table lock, it was possible for another CPU to insert a new PTE into the scratch page's PTEG slot during this interval, corrupting both mappings. Solve this by creating a new flag, LPTE_LOCKED, such that moea64_pte_insert will avoid claiming locked PTEG slots even if they are invalid. This change also incorporates some additional paranoia added to solve things I thought might be this bug. Reported by: linimon	2010-02-24 00:54:37 +00:00
Nathan Whitehorn	ab73970649	Reduce KVA pressure on OEA64 systems running in bridge mode by mapping UMA segments at their physical addresses instead of into KVA. This emulates the direct mapping behavior of OEA32 in an ad-hoc way. To make this work properly required sharing the entire kernel PMAP with Open Firmware, so ofw_pmap is transformed into a stub on 64-bit CPUs. Also implement some more tweaks to get more mileage out of our limited amount of KVA, principally by extending KVA into segment 16 until the beginning of the first OFW mapping. Reported by: linimon	2010-02-20 16:23:29 +00:00
Nathan Whitehorn	062c8f4c86	Fix a bug where pages being removed from memory entirely no longer have PVOs, and so the modified state of the page can no longer be communicated to the VM layer, causing pages not to be flushed to swap when needed, in turn causing memory corruption. Also make several correctness adjustments to I-Cache synchronization and TLB invalidation for 64-bit Book-S CPUs. Obtained from: projects/ppc64 Discussed with: grehan MFC after: 2 weeks	2010-02-18 15:00:43 +00:00
Martin Blapp	c2ede4b379	Remove extraneous semicolons, no functional changes. Submitted by: Marc Balmer <marc@msys.ch> MFC after: 1 week	2010-01-07 21:01:37 +00:00
Nathan Whitehorn	4603558264	Provide a real fix to the too-many-translations problem when booting from CD on 64-bit hardware to replace existing band-aids. This occurred when the preloaded mdroot required too many mappings for the static buffer. Since we only use the translations buffer once, allocate a dynamic buffer on the stack. This early in the boot process, the call chain is quite short and we can be assured of having sufficient stack space. Reviewed by: grehan	2009-11-12 15:19:09 +00:00
Nathan Whitehorn	c10d3c2cd8	Spell sz correctly. Pointed out by: jmallett	2009-11-09 21:12:28 +00:00
Nathan Whitehorn	f90550c2d2	Increase the size of the OFW translations buffer to handle G5 systems that use many translation regions in firmware, and add bounds checking to prevent buffer overflows in case even the new value is exceeded. Reported by: Jacob Lambert MFC after: 3 days	2009-11-09 14:26:23 +00:00
Nathan Whitehorn	e2ee8728ee	Do not map the trap vectors into the kernel's address space. They are only used in real mode and keeping them mapped only serves to make NULL a valid address, which results in silent NULL pointer deferences. Suggested by: Patrick Kerharo Obtained from: projects/ppc64	2009-10-23 14:27:40 +00:00
Nathan Whitehorn	999987e51a	Add SMP support on U3-based G5 systems. This does not yet work perfectly: at least on my Xserve, getting the decrementer and timebase on APs to tick requires setting up a clock chip over I2C, which is not yet done. While here, correct the 64-bit tlbie function to set the CPU to 64-bit mode correctly. Hardware donated by: grehan	2009-10-23 03:17:02 +00:00
Marcel Moolenaar	1a4fcaebe3	o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argumument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked). The key property of this change is that the I-cache is made coherent after writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent before any writes happen. The difference is key when the I-cache prefetches.	2009-10-21 18:38:02 +00:00
Nathan Whitehorn	f136543e41	Increase the size of the page table on 64-bit PowerPC machines as a bandaid to prevent exhaustion of the primary and secondary hash groups in the event of extreme stress on the PMAP layer (e.g. a forkbomb). This wastes memory, and should be revised to properly handle PTEG spills instead. Suggested by: grehan Approved by: re (kensmith)	2009-07-12 04:07:52 +00:00
Jeff Roberson	50c202c592	Implement a facility for dynamic per-cpu variables. - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_. Requires only one extra instruction more than PCPU_ and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined. - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available. - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent. Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas	2009-06-23 22:42:39 +00:00
Nathan Whitehorn	b40ce02a2f	Factor out platform dependent things unrelated to device drivers into a new platform module. These are probed in early boot, and have the responsibility of determining the layout of physical memory, determining the CPU timebase frequency, and handling the zoo of SMP mechanisms found on PowerPC. Reviewed by: marcel, raj Book-E parts by: raj	2009-05-14 00:34:26 +00:00
Nathan Whitehorn	1c96bdd146	Add support for 64-bit PowerPC CPUs operating in the 64-bit bridge mode provided, for example, on the PowerPC 970 (G5), as well as on related CPUs like the POWER3 and POWER4. This also adds support for various built-in hardware found on Apple G5 hardware (e.g. the IBM CPC925 northbridge). Reviewed by: grehan	2009-04-04 00:22:44 +00:00

1 2

92 Commits