freebsd-dev

Author	SHA1	Message	Date
John Baldwin	d9d8d1449d	Add a new ipi_cpu() function to the MI IPI API that can be used to send an IPI to a specific CPU by its cpuid. Replace calls to ipi_selected() that constructed a mask for a single CPU with calls to ipi_cpu() instead. This will matter more in the future when we transition from cpumask_t to cpuset_t for CPU masks in which case building a CPU mask is more expensive. Submitted by: peter, sbruno Reviewed by: rookie Obtained from: Yahoo! (x86) MFC after: 1 month	2010-08-06 15:36:59 +00:00
Alan Cox	9124d0d6a3	Relax one of the new assertions in pmap_enter() a little. Specifically, allow pmap_enter() to be performed on an unmanaged page that doesn't have VPO_BUSY set. Having VPO_BUSY set really only matters for managed pages. (See, for example, pmap_remove_write().)	2010-06-11 15:49:39 +00:00
Alan Cox	ce18658792	Reduce the scope of the page queues lock and the number of PG_REFERENCED changes in vm_pageout_object_deactivate_pages(). Simplify this function's inner loop using TAILQ_FOREACH(), and shorten some of its overly long lines. Update a stale comment. Assert that PG_REFERENCED may be cleared only if the object containing the page is locked. Add a comment documenting this. Assert that a caller to vm_page_requeue() holds the page queues lock, and assert that the page is on a page queue. Push down the page queues lock into pmap_ts_referenced() and pmap_page_exists_quick(). (As of now, there are no longer any pmap functions that expect to be called with the page queues lock held.) Neither pmap_ts_referenced() nor pmap_page_exists_quick() should ever be passed an unmanaged page. Assert this rather than returning "0" and "FALSE" respectively. ARM: Simplify pmap_page_exists_quick() by switching to TAILQ_FOREACH(). Push down the page queues lock inside of pmap_clearbit(), simplifying pmap_clear_modify(), pmap_clear_reference(), and pmap_remove_write(). Additionally, this allows for avoiding the acquisition of the page queues lock in some cases. PowerPC/AIM: moea_page_exits_quick() and moea_page_wired_mappings() will never be called before pmap initialization is complete. Therefore, the check for moea_initialized can be eliminated. Push down the page queues lock inside of moea_clear_bit(), simplifying moea_clear_modify() and moea_clear_reference(). The last parameter to moea_clear_bit() is never used. Eliminate it. PowerPC/BookE: Simplify mmu_booke_page_exists_quick()'s control flow. Reviewed by: kib@	2010-06-10 16:56:35 +00:00
Alan Cox	b2830a9649	Eliminate a stale comment.	2010-05-31 06:06:10 +00:00
Alan Cox	72dc3eb65b	Simplify the inner loop of pmap_collect(): While iterating over the page's pv list, there is no point in checking whether or not the pv list is empty. Instead, wait until the loop completes.	2010-05-30 18:48:41 +00:00
Alan Cox	a1192299b3	Merge various changes from i386/i386/pmap.c: The remaining, unmerged portions of r175404 Retire PMAP_DIAGNOSTIC. Any useful diagnostics that were conditionally compiled under PMAP_DIAGNOSTIC are now KASSERT()s. (Note: The kernel option DIAGNOSTIC still disables inlining of certain pmap functions.) Eliminate dead code from pmap_enter(). This code implemented an assertion. On i386, an equivalent check is already implemented. However, on amd64, a small change is required to implement an equivalent check. Eliminate \n from a nearby panic string. Use KASSERT() to reimplement pmap_copy()'s two assertions. Merge portions of r177659 To date, we have assumed that the TLB will only set the PG_M bit in a PTE if that PTE has the PG_RW bit set. However, this assumption does not hold on recent processors from Intel. For example, consider a PTE that has the PG_RW bit set but the PG_M bit clear. Suppose this PTE is cached in the TLB and later the PG_RW bit is cleared in the PTE, but the corresponding TLB entry is not (yet) invalidated. Historically, upon a write access using this (stale) TLB entry, the TLB would observe that the PG_RW bit had been cleared and initiate a page fault, aborting the setting of the PG_M bit in the PTE. Now, however, P4- and Core2-family processors will set the PG_M bit before observing that the PG_RW bit is clear and initiating a page fault. In other words, the write does not occur but the PG_M bit is still set. The real impact of this difference is not that great. Specifically, we should no longer assert that any PTE with the PG_M bit set must also have the PG_RW bit set, and we should ignore the state of the PG_M bit unless the PG_RW bit is set. r208609 Defer freeing any page table pages in pmap_remove_all() until after the page queues lock is released. This may reduce the amount of time that the page queues lock is held by pmap_remove_all(). r208645 When I pushed down the page queues lock into pmap_is_modified(), I created an ordering dependence: A pmap operation that clears PG_WRITEABLE and calls vm_page_dirty() must perform the call first. Otherwise, pmap_is_modified() could return FALSE without acquiring the page queues lock because the page is not (currently) writeable, and the caller to pmap_is_modified() might believe that the page's dirty field is clear because it has not seen the effect of the vm_page_dirty() call. When I pushed down the page queues lock into pmap_is_modified(), I overlooked one place where this ordering dependence is violated: pmap_enter(). In a rare situation pmap_enter() can be called to replace a dirty mapping to one page with a mapping to another page. (I say rare because replacements generally occur as a result of a copy-on-write fault, and so the old page is not dirty.) This change delays clearing PG_WRITEABLE until after vm_page_dirty() has been called. Fixing the ordering dependency also makes it easy to introduce a small optimization: When pmap_enter() used to replace a mapping to one page with a mapping to another page, it freed the pv entry for the first mapping and later called the pv entry allocator for the new mapping. Now, pmap_enter() attempts to recycle the old pv entry, saving two calls to the pv entry allocator. There is no point in setting PG_WRITEABLE on unmanaged pages, so don't. Update a comment to reflect this. Tidy up the variable declarations at the start of pmap_enter().	2010-05-30 04:44:32 +00:00
Alan Cox	c46b90e90a	Push down page queues lock acquisition in pmap_enter_object() and pmap_is_referenced(). Eliminate the corresponding page queues lock acquisitions from vm_map_pmap_enter() and mincore(), respectively. In mincore(), this allows some additional cases to complete without ever acquiring the page queues lock. Assert that the page is managed in pmap_is_referenced(). On powerpc/aim, push down the page queues lock acquisition from moea_is_modified() and moea_is_referenced() into moea*_query_bit(). Again, this will allow some additional cases to complete without ever acquiring the page queues lock. Reorder a few statements in vm_page_dontneed() so that a race can't lead to an old reference persisting. This scenario is described in detail by a comment. Correct a spelling error in vm_page_dontneed(). Assert that the object is locked in vm_page_clear_dirty(), and restrict the page queues lock assertion to just those cases in which the page is currently writeable. Add object locking to vnode_pager_generic_putpages(). This was the one and only place where vm_page_clear_dirty() was being called without the object being locked. Eliminate an unnecessary vm_page_lock() around vnode_pager_setsize()'s call to vm_page_clear_dirty(). Change vnode_pager_generic_putpages() to the modern-style of function definition. Also, change the name of one of the parameters to follow virtual memory system naming conventions. Reviewed by: kib	2010-05-26 18:00:44 +00:00
Alan Cox	567e51e18c	Roughly half of a typical pmap_mincore() implementation is machine- independent code. Move this code into mincore(), and eliminate the page queues lock from pmap_mincore(). Push down the page queues lock into pmap_clear_modify(), pmap_clear_reference(), and pmap_is_modified(). Assert that these functions are never passed an unmanaged page. Eliminate an inaccurate comment from powerpc/powerpc/mmu_if.m: Contrary to what the comment says, pmap_mincore() is not simply an optimization. Without a complete pmap_mincore() implementation, mincore() cannot return either MINCORE_MODIFIED or MINCORE_REFERENCED because only the pmap can provide this information. Eliminate the page queues lock from vfs_setdirty_locked_object(), vm_pageout_clean(), vm_object_page_collect_flush(), and vm_object_page_clean(). Generally speaking, these are all accesses to the page's dirty field, which are synchronized by the containing vm object's lock. Reduce the scope of the page queues lock in vm_object_madvise() and vm_page_dontneed(). Reviewed by: kib (an earlier version)	2010-05-24 14:26:57 +00:00
Alan Cox	9ab6032f73	On entry to pmap_enter(), assert that the page is busy. While I'm here, make the style of assertion used by pmap_enter() consistent across all architectures. On entry to pmap_remove_write(), assert that the page is neither unmanaged nor fictitious, since we cannot remove write access to either kind of page. With the push down of the page queues lock, pmap_remove_write() cannot condition its behavior on the state of the PG_WRITEABLE flag if the page is busy. Assert that the object containing the page is locked. This allows us to know that the page will neither become busy nor will PG_WRITEABLE be set on it while pmap_remove_write() is running. Correct a long-standing bug in vm_page_cowsetup(). We cannot possibly do copy-on-write-based zero-copy transmit on unmanaged or fictitious pages, so don't even try. Previously, the call to pmap_remove_write() would have failed silently.	2010-05-16 23:45:10 +00:00
Alan Cox	3c4a24406b	Push down the page queues into vm_page_cache(), vm_page_try_to_cache(), and vm_page_try_to_free(). Consequently, push down the page queues lock into pmap_enter_quick(), pmap_page_wired_mapped(), pmap_remove_all(), and pmap_remove_write(). Push down the page queues lock into Xen's pmap_page_is_mapped(). (I overlooked the Xen pmap in r207702.) Switch to a per-processor counter for the total number of pages cached.	2010-05-08 20:34:01 +00:00
Kip Macy	958d87cd86	merge 194209 in to the i386/xen pmap requested by: alc@	2010-04-30 03:26:12 +00:00
Kip Macy	2965a45315	On Alan's advice, rather than do a wholesale conversion on a single architecture from page queue lock to a hashed array of page locks (based on a patch by Jeff Roberson), I've implemented page lock support in the MI code and have only moved vm_page's hold_count out from under page queue mutex to page lock. This changes pmap_extract_and_hold on all pmaps. Supported by: Bitgravity Inc. Discussed with: alc, jeffr, and kib	2010-04-30 00:46:43 +00:00
Alan Cox	14dd3a29ea	MFi386 r207205 Clearing a page table entry's accessed bit (PG_A) and setting the page's PG_REFERENCED flag in pmap_protect() can't really be justified, so don't do it.	2010-04-27 05:35:35 +00:00
Alan Cox	7b85f59183	Resurrect pmap_is_referenced() and use it in mincore(). Essentially, pmap_ts_referenced() is not always appropriate for checking whether or not pages have been referenced because it clears any reference bits that it encounters. For example, in mincore(), clearing the reference bits has two negative consequences. First, it throws off the activity count calculations performed by the page daemon. Specifically, a page on which mincore() has called pmap_ts_referenced() looks less active to the page daemon than it should. Consequently, the page could be deactivated prematurely by the page daemon. Arguably, this problem could be fixed by having mincore() duplicate the activity count calculation on the page. However, there is a second problem for which that is not a solution. In order to clear a reference on a 4KB page, it may be necessary to demote a 2/4MB page mapping. Thus, a mincore() by one process can have the side effect of demoting a superpage mapping within another process!	2010-04-24 17:32:52 +00:00
John Baldwin	cf684ede27	Make NKPT a kernel option on i386 so that it can be set to a non-default value from kernel config files. Tested by: Charles Sprickman spork of bway net MFC after: 2 weeks	2010-03-10 19:50:52 +00:00
Attilio Rao	3258030144	Introduce the new kernel sub-tree x86 which should contain all the code shared and generalized between our current amd64, i386 and pc98. This is just an initial step that should lead to a more complete effort. For the moment, a very simple porting of cpufreq modules, BIOS calls and the whole MD specific ISA bus part is added to the sub-tree but ideally a lot of code might be added and more shared support should grow. Sponsored by: Sandvine Incorporated Reviewed by: emaste, kib, jhb, imp Discussed on: arch MFC: 3 weeks	2010-02-25 14:13:39 +00:00
Kip Macy	20f72e6f2f	- fix bootstrap for variable KVA_PAGES - remove unused CADDR1 - hold lock across page table update MFC after: 3 days	2010-02-21 01:13:34 +00:00
Ed Schouten	ddc534916d	Allow the pmap code to be built with GCC from FreeBSD 7 again. This patch basically gives us the best of both worlds. Instead of forcing the compiler to emulate GNU-style inline semantics even though we're using ISO C99, it will only use GNU-style inlining when the compiler is configured that way (__GNUC_GNU_INLINE__). Tested by: jhb	2010-02-18 14:28:38 +00:00
Ed Schouten	91bfd816f2	Recommit r193732: Remove __gnu89_inline. Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems. Reviewed by: alc It was backed out, because it prevented us from building kernels using a 7.x compiler. Now that most people use 8.x, there is nothing that holds us back. Even if people run 7.x, they should be able to build a kernel if they run `make kernel-toolchain' or `make buildworld' first.	2010-01-19 15:31:18 +00:00
Alan Cox	418f8af3a9	Eliminate an unused declaration.	2010-01-11 15:51:13 +00:00
Alan Cox	92697f16aa	Eliminate unused declarations.	2010-01-10 21:00:52 +00:00
Bjoern A. Zeeb	23b6a4482b	Unbreak the XEN build after r201751.	2010-01-08 16:56:11 +00:00
Alan Cox	28a5e2a5d7	Make pmap_set_pg() static.	2010-01-07 17:34:45 +00:00
Alan Cox	0d7e26de54	Eliminate unused variables (see r137912).	2010-01-07 04:47:09 +00:00
Kip Macy	72bc4ff74d	- revert pmap_kenter_temporary to taking a physical address - make minidump work	2009-12-10 03:09:35 +00:00
Attilio Rao	385e11118c	i386 has not (yet) any DEV_ATPIC conditional than axe it out from Xen version. No objections by: kmacy	2009-11-27 01:02:17 +00:00
Kip Macy	be7747b449	fixup kernel core dumps on paravirtual guests	2009-11-24 07:17:51 +00:00
Andriy Gapon	6cc16fcb4e	reflect that pg_ps_enabled is a tunable, not just a read-only sysctl Nod from: jhb	2009-11-11 14:21:31 +00:00
Marcel Moolenaar	1a4fcaebe3	o Introduce vm_sync_icache() for making the I-cache coherent with the memory or D-cache, depending on the semantics of the platform. vm_sync_icache() is basically a wrapper around pmap_sync_icache(), that translates the vm_map_t argumument to pmap_t. o Introduce pmap_sync_icache() to all PMAP implementation. For powerpc it replaces the pmap_page_executable() function, added to solve the I-cache problem in uiomove_fromphys(). o In proc_rwmem() call vm_sync_icache() when writing to a page that has execute permissions. This assures that when breakpoints are written, the I-cache will be coherent and the process will actually hit the breakpoint. o This also fixes the Book-E PMAP implementation that was missing necessary locking while trying to deal with the I-cache coherency in pmap_enter() (read: mmu_booke_enter_locked). The key property of this change is that the I-cache is made coherent after writes have been done. Doing it in the PMAP layer when adding or changing a mapping means that the I-cache is made coherent before any writes happen. The difference is key when the I-cache prefetches.	2009-10-21 18:38:02 +00:00
Kip Macy	46aba52a50	make read_eflags and write_eflags accomplish the same effect on PVM as native, simplifying interrupt handling	2009-10-01 22:05:38 +00:00
Kip Macy	7d05808361	fix UP compilation	2009-09-11 23:41:11 +00:00
Jung-uk Kim	3bcdfb9bf8	Consolidate CPUID to CPU family/model macros for amd64 and i386 to reduce unnecessary #ifdef's for shared code between them.	2009-09-10 17:27:36 +00:00
Konstantin Belousov	9411b7675e	As was done in r196643 for i386 and amd64, swap the start/end virtual addresses in pmap_invalidate_cache_range(). Reported by: Vincent Hoffman <vince unsane co uk> Reviewed by: jhb MFC after: 3 days	2009-09-09 19:40:54 +00:00
Adrian Chadd	56af96debe	Delete whitespace not in i386/pmap.c	2009-09-01 12:17:47 +00:00
Adrian Chadd	c0bb2b8058	Migrate to use cpuset_t.	2009-09-01 06:15:50 +00:00
Adrian Chadd	fa52c6106b	Merge in the pat_works work from sys/i386/i386/pmap.c - primarily to reduce diff size.	2009-09-01 05:15:45 +00:00
Adrian Chadd	f8d44dbb74	Fix broken build.	2009-09-01 03:44:25 +00:00
Adrian Chadd	82fe1cca67	Revert previous commit; that was left-over junk in the tree.	2009-08-31 23:35:59 +00:00
Adrian Chadd	0ad6375395	Shuffle pagezero() into the same location as in sys/i386/i386/pmap.c.	2009-08-31 23:30:39 +00:00
Attilio Rao	68a5590836	Port recent IPI enhachements to en: * Introduce the ipi_nmi_handler() function for the Xen infrastructure * Fixup adeguately the ipi sender functions Approved by: re (kib)	2009-08-15 18:37:06 +00:00
Attilio Rao	dc6fbf6545	* Completely Remove the option STOP_NMI from the kernel. This option has proven to have a good effect when entering KDB by using a NMI, but it completely violates all the good rules about interrupts disabled while holding a spinlock in other occasions. This can be the cause of deadlocks on events where a normal IPI_STOP is expected. * Adds an new IPI called IPI_STOP_HARD on all the supported architectures. This IPI is responsible for sending a stop message among CPUs using a privileged channel when disponible. In other cases it just does match a normal IPI_STOP. Right now the IPI_STOP_HARD functionality uses a NMI on ia32 and amd64 architectures, while on the other has a normal IPI_STOP effect. It is responsibility of maintainers to eventually implement an hard stop when necessary and possible. * Use the new IPI facility in order to implement a new userend SMP kernel function called stop_cpus_hard(). That is specular to stop_cpu() but it does use the privileged channel for the stopping facility. * Let KDB use the newly introduced function stop_cpus_hard() and leave stop_cpus() for all the other cases * Disable interrupts on CPU0 when starting the process of APs suspension. * Style cleanup and comments adding This patch should fix the reboot/shutdown deadlocks many users are constantly reporting on mailing lists. Please don't forget to update your config file with the STOP_NMI option removal Reviewed by: jhb Tested by: pho, bz, rink Approved by: re (kib)	2009-08-13 17:09:45 +00:00
Konstantin Belousov	ed190ad06a	Fix XEN build breakage, by implementing pmap_invalidate_cache_range() and using it when appropriate. Merge analogue of the r195836 optimization to XEN. Approved by: re (kensmith)	2009-07-29 19:38:33 +00:00
John Baldwin	013818111a	Add a new type of VM object: OBJT_SG. An OBJT_SG object is very similar to a device pager (OBJT_DEVICE) object in that it uses fictitious pages to provide aliases to other memory addresses. The primary difference is that it uses an sglist(9) to determine the physical addresses for a given offset into the object instead of invoking the d_mmap() method in a device driver. Reviewed by: alc Approved by: re (kensmith) MFC after: 2 weeks	2009-07-24 13:50:29 +00:00
Alan Cox	9861cbc6ca	Change the handling of fictitious pages by pmap_page_set_memattr() on amd64 and i386. Essentially, fictitious pages provide a mechanism for creating aliases for either normal or device-backed pages. Therefore, pmap_page_set_memattr() on a fictitious page needn't update the direct map or flush the cache. Such actions are the responsibility of the "primary" instance of the page or the device driver that "owns" the physical address. For example, these actions are already performed by pmap_mapdev(). The device pager needn't restore the memory attributes on a fictitious page before releasing it. It's now pointless. Add pmap_page_set_memattr() to the Xen pmap. Approved by: re (kib)	2009-07-19 21:40:19 +00:00
Alan Cox	3153e878dd	Add support to the virtual memory system for configuring machine- dependent memory attributes: Rename vm_cache_mode_t to vm_memattr_t. The new name reflects the fact that there are machine-dependent memory attributes that have nothing to do with controlling the cache's behavior. Introduce vm_object_set_memattr() for setting the default memory attributes that will be given to an object's pages. Introduce and use pmap_page_{get,set}_memattr() for getting and setting a page's machine-dependent memory attributes. Add full support for these functions on amd64 and i386 and stubs for them on the other architectures. The function pmap_page_set_memattr() is also responsible for any other machine-dependent aspects of changing a page's memory attributes, such as flushing the cache or updating the direct map. The uses include kmem_alloc_contig(), vm_page_alloc(), and the device pager: kmem_alloc_contig() can now be used to allocate kernel memory with non-default memory attributes on amd64 and i386. vm_page_alloc() and the device pager will set the memory attributes for the real or fictitious page according to the object's default memory attributes. Update the various pmap functions on amd64 and i386 that map pages to incorporate each page's memory attributes in the mapping. Notes: (1) Inherent to this design are safety features that prevent the specification of inconsistent memory attributes by different mappings on amd64 and i386. In addition, the device pager provides a warning when a device driver creates a fictitious page with memory attributes that are inconsistent with the real page that the fictitious page is an alias for. (2) Storing the machine-dependent memory attributes for amd64 and i386 as a dedicated "int" in "struct md_page" represents a compromise between space efficiency and the ease of MFCing these changes to RELENG_7. In collaboration with: jhb Approved by: re (kib)	2009-07-12 23:31:20 +00:00
Alan Cox	0e18ab26d0	PAE adds another level to the i386 page table. This level is a small 4-entry table that must be located within the first 4GB of RAM. This requirement is met by defining an UMA zone with a custom back-end allocator function. This revision makes two changes to this back-end allocator function: (1) It replaces the use of contigmalloc() with the use of kmem_alloc_contig(). This eliminates "double accounting", i.e., accounting by both the UMA zone and malloc tags. (I made the same change for the same reason to the zones supporting jumbo frames a week ago.) (2) It passes through the "wait" parameter, i.e., M_WAITOK, M_ZERO, etc. to kmem_alloc_contig() rather than ignoring it. pmap_init() calls uma_zalloc() with both M_WAITOK and M_ZERO. At the moment, this is harmless only because the default behavior of contigmalloc()/kmem_alloc_contig() is to wait and because pmap_init() doesn't really depend on the memory being zeroed. The back-end allocator function in the Xen pmap is dead code. I am changing it nonetheless because I don't want to leave any "bad examples" in the source tree for someone to copy at a later date. Approved by: re (kib)	2009-07-05 21:40:21 +00:00
Jeff Roberson	50c202c592	Implement a facility for dynamic per-cpu variables. - Modules and kernel code alike may use DPCPU_DEFINE(), DPCPU_GET(), DPCPU_SET(), etc. akin to the statically defined PCPU_. Requires only one extra instruction more than PCPU_ and is virtually the same as __thread for builtin and much faster for shared objects. DPCPU variables can be initialized when defined. - Modules are supported by relocating the module's per-cpu linker set over space reserved in the kernel. Modules may fail to load if there is insufficient space available. - Track space available for modules with a one-off extent allocator. Free may block for memory to allocate space for an extent. Reviewed by: jhb, rwatson, kan, sam, grehan, marius, marcel, stas	2009-06-23 22:42:39 +00:00
Ed Schouten	5942207fb4	Revert my change; reintroduce __gnu89_inline. It turns out our compiler in stable/7 can't build this code anymore. Even though my opinion is that those people should just run `make kernel-toolchain' before building a kernel, I am willing to wait and commit this after we've branched stable/8. Requested by: rwatson	2009-06-08 18:23:43 +00:00
Ed Schouten	032e3d1d19	Remove __gnu89_inline. Now that we use C99 almost everywhere, just use C99-style in the pmap code. Since the pmap code is the only consumer of __gnu89_inline, remove it from cdefs.h as well. Because the flag was only introduced 17 months ago, I don't expect any problems. Reviewed by: alc	2009-06-08 17:27:25 +00:00
Adrian Chadd	c22ca7f04f	Fix the MP IPI code to differentiate between bitmapped IPIs and function IPIs. This attempts to fix the IPI handling code to correctly differentiate between bitmapped IPIs and function IPIs. The Xen IPIs were on low numbers which clashed with the bitmapped IPIs. This commit bumps those IPI numbers up to 240 and above (just like in the i386 code) and fiddles with the ipi_vectors[] logic to call the correct function. This still isn't "right". Specifically, the IPI code may work fine for TLB shootdown events but the rendezvous/lazypmap IPIs are thrown by calling ipi_*() routines which don't set the call_func stuff (function id, addr1, addr2) that the TLB shootdown events are. So the Xen SMP support is still broken. PR: 135069	2009-05-31 08:11:39 +00:00

1 2 3

103 Commits