Commit Graph

31 Commits

Author SHA1 Message Date
marius
100c1f73e9 - For Cheetah- and Zeus-class CPUs don't flush all unlocked entries from
the TLBs in order to get rid of the user mappings but instead traverse
  them and flush only the latter like we also do for the Spitfire-class.
  Also, flushing the unlocked kernel entries can cause instant faults which,
  when the flush is done from within cpu_switch(), are handled with the
  scheduler lock held; this in turn can cause timeouts on the acquisition of
  the lock by other CPUs. This was easily seen with a 16-core V890 but also
  happened occasionally with 2-way machines.
  While at it, move the SPARC64-V support code entirely to zeus.c. This
  causes a little bit of duplication but is less confusing than partially
  using Cheetah-class bits for these.
- For SPARC64-V ensure that 4-Mbyte page entries are stored in the 1024-
  entry, 2-way set associative TLB.
- In {d,i}tlb_get_data_sun4u() turn off the interrupts in order to ensure
  that ASI_{D,I}TLB_DATA_ACCESS_REG actually are read twice back-to-back.

Tested by:      Peter Jeremy (16-core US-IV), Michael Moll (2-way SPARC64-V)
2011-07-02 11:14:54 +00:00
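
A minimal standalone sketch of the traversal-based flush described in the commit
above; the ex_* names, the slot layout and the demap_slot callback are
illustrative stand-ins, not the sys/sparc64 interfaces:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical per-slot view of a TLB entry. */
    struct ex_tlb_slot {
        uint64_t data;      /* TTE data word for this slot */
        bool     locked;    /* locked entries must never be demapped */
        bool     kernel;    /* kernel mappings stay, user mappings go */
    };

    /*
     * Walk every slot and demap only the unlocked user entries, instead of
     * flushing all unlocked entries, which would also evict kernel TTEs and
     * provoke instant faults under the scheduler lock.
     */
    static inline void
    ex_flush_user_mappings(struct ex_tlb_slot *slots, int nslots,
        void (*demap_slot)(int))
    {
        int i;

        for (i = 0; i < nslots; i++) {
            if (slots[i].locked || slots[i].kernel)
                continue;
            demap_slot(i);
        }
    }
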
marius
17a68df815 - Add TTE and context register bits for the additional page sizes supported
by UltraSparc-IV and -IV+ as well as SPARC64 V, VI, VII and VIIIfx CPUs.
- Replace TLB_PCXR_PGSZ_MASK and TLB_SCXR_PGSZ_MASK with TLB_CXR_PGSZ_MASK
  which is simply the complement of TLB_CXR_CTX_MASK instead of trying to
  assemble it from the page size bits which vary across CPUs.
- Add macros for the remainder of the SFSR bits, which are useful for at
  least debugging purposes.
2010-03-17 20:23:14 +00:00
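
The "complement" relationship from the entry above, as a tiny compilable
example; the EX_* constants are made up for illustration, only the derivation
pattern mirrors the commit:

    #include <stdint.h>
    #include <stdio.h>

    /* Example only: a 13-bit context field in the low bits of the context
     * register; the real bit layout is CPU specific. */
    #define EX_CXR_CTX_MASK   ((uint64_t)0x1fff)
    /* Everything that is not a context bit selects page sizes, so the page
     * size mask is simply the complement of the context mask. */
    #define EX_CXR_PGSZ_MASK  (~EX_CXR_CTX_MASK)

    int
    main(void)
    {
        uint64_t cxr = 0xa800000000000123ULL;   /* arbitrary register value */

        printf("ctx  bits: %#jx\n", (uintmax_t)(cxr & EX_CXR_CTX_MASK));
        printf("pgsz bits: %#jx\n", (uintmax_t)(cxr & EX_CXR_PGSZ_MASK));
        return (0);
    }
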
marius
78ee2961aa - Currently the PMAP code is laid out to let the kernel TSB cover the
whole KVA space using one locked 4MB dTLB entry per GB of physical
  memory. On Cheetah-class machines only the dt16 can hold locked
  entries though, which would be completely consumed for the kernel
  TSB on machines with >= 16GB. Therefore limit the KVA space to use
  no more than half of the lockable dTLB slots, given that we also need
  them for other things.
- Add sanity checks which ensure that we don't exhaust the (lockable)
  TLB slots.
2009-01-01 14:01:21 +00:00
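
The arithmetic behind that limit, as a standalone sketch; the 16 lockable dt16
slots come from the entry above, the rest of the names and numbers are
illustrative:

    #include <stdio.h>

    #define EX_DT16_SLOTS        16  /* lockable dTLB slots, Cheetah-class */
    #define EX_TSB_SLOTS_PER_GB  1   /* one locked 4MB dTLB entry per GB */

    int
    main(void)
    {
        unsigned phys_gb = 16;              /* e.g. a 16GB machine */
        unsigned wanted = phys_gb * EX_TSB_SLOTS_PER_GB;
        unsigned limit = EX_DT16_SLOTS / 2; /* keep half for other uses */

        printf("locked slots the kernel TSB would want:  %u\n", wanted);
        printf("slots allowed for the TSB after the cap: %u\n",
            wanted > limit ? limit : wanted);
        return (0);
    }
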
marius
b484b97e77 For Cheetah-class CPUs ensure that the dt512_0 is set to hold 8k pages
for all three contexts and configure the dt512_1 to hold 4MB pages for
them (e.g. for direct mappings).
This might allow for additional optimization by using the faulting
page sizes provided by AA_DMMU_TAG_ACCESS_EXT for bypassing the page
size walker for the dt512 in the superpage support code.

Submitted by:	nwhitehorn (initial patch)
2008-09-08 21:24:25 +00:00
marius
f77ece5c7e - Do as the comment in pmap_bootstrap() suggests and flush all non-locked
TLB entries possibly left over by the firmware and also do so while
  bootstrapping APs.
- Use __FBSDID.

MFC after:	1 month
2008-03-09 15:53:34 +00:00
jake
08ad0d0b12 - Move the routine for flushing all user mappings from the tlb from pmap to
the cpu dependent files.  It will need to be done differently for USIII.
- Simplify the logic for detecting context rollovers.  Instead of dealing
  with it when the next context switch would cause the context numbers to
  rollover, deal with it when they actually do rollover.
- Move some things around in cpu_switch so that we only do 1 membar #Sync
  when switching address space, instead of 2.
- Detect kernel threads by comparing the new vm space to vmspace0, instead
  of checking if the tlb context is 0.
- Remove some debug code.
2003-04-13 21:54:58 +00:00
jake
aadaaa160e - Change the way the direct mapped region is implemented to be generally
useful for accessing more than 1 page of contiguous physical memory, and
  to use 4mb tlb entries instead of 8k.  This requires that the system only
  use the direct mapped addresses when they have the same virtual colour as
  all other mappings of the same page, instead of being able to choose the
  colour and cachability of the mapping.
- Adapt the physical page copying and zeroing functions to account for not
  being able to choose the colour or cachability of the direct mapped
  address.  This adds a lot more cases to handle.  Basically when a page has
  a different colour than its direct mapped address we have a choice between
  bypassing the data cache and using physical addresses directly, which
  requires a cache flush, or mapping it at the right colour, which requires
  a tlb flush.  For now we choose to map the page and do the tlb flush.

This will allow the direct mapped addresses to be used for more things
that don't require normal pmap handling, including mapping the vm_page
structures, the message buffer, temporary mappings for crash dumps, and will
provide greater benefit for implementing uma_small_alloc, due to the much
greater tlb coverage.
2002-12-23 23:39:57 +00:00
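
A sketch of the colour check behind the trade-off in the entry above;
EX_DCACHE_COLORS and the ex_* helpers are assumptions made for the example,
not the pmap code itself:

    #include <stdbool.h>
    #include <stdint.h>

    #define EX_PAGE_SHIFT     13    /* 8k base pages on sparc64 */
    #define EX_DCACHE_COLORS  2     /* illustrative number of colours */

    /* A mapping's virtual colour is a function of its virtual address; the
     * direct map fixes that colour by construction. */
    static inline unsigned
    ex_va_color(uint64_t va)
    {
        return ((unsigned)((va >> EX_PAGE_SHIFT) % EX_DCACHE_COLORS));
    }

    /* If the page's colour matches its direct mapped address, the direct
     * map can be used as-is; otherwise either remap the page at the right
     * colour (tlb flush) or bypass the cache (cache flush). */
    static inline bool
    ex_can_use_dmap(uint64_t dmap_va, unsigned page_color)
    {
        return (ex_va_color(dmap_va) == page_color);
    }
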
jake
37f6204036 Demark sections of code that need special fault handling with labels.
Check if the trapped pc is inside of the demarked sections to implement
fault recovery for copyin etc., instead of pcb_onfault.  Handle recovery
from data access exceptions as well as page faults.

Inspired by:	bde's sys.dif
2002-08-16 00:57:37 +00:00
jake
07153efecf Store the number of itlb and dtlb entries separately; they may be different.
Find the prom node for the boot cpu earlier and store it in the per-cpu
area, so that cache_init can be called earlier.
2002-08-15 05:24:55 +00:00
jake
d1139f54e7 Implement a direct mapped address region, like alpha and ia64. This
basically maps all of physical memory 1:1 to a range of virtual addresses
outside of normal kva.  The advantage of doing this instead of accessing
physical addresses directly is that memory accesses will go through the
data cache, and will participate in the normal cache coherency algorithm
for invalidating lines in our own and in other cpus' data caches.  So
we don't have to flush the cache manually or send IPIs to do so on other
cpus.  Also, since the mappings never change, we don't have to flush them
from the tlb manually.
This makes pmap_copy_page and pmap_zero_page MP safe, allowing the idle
zero proc to run outside of giant.

Inspired by:	ia64
2002-07-27 21:57:38 +00:00
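
The translation itself is just a fixed offset; a minimal model of the idea in
the entry above, with EX_DMAP_BASE a stand-in constant rather than the real
sparc64 value:

    #include <stdint.h>

    /* Stand-in base of the direct mapped region, outside normal KVA. */
    #define EX_DMAP_BASE  0xfffff80000000000ULL

    /* Physical memory is mapped 1:1 at a fixed offset, so converting in
     * either direction never touches the page tables or the TLB. */
    static inline uint64_t
    ex_phys_to_dmap(uint64_t pa)
    {
        return (EX_DMAP_BASE + pa);
    }

    static inline uint64_t
    ex_dmap_to_phys(uint64_t va)
    {
        return (va - EX_DMAP_BASE);
    }
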
jake
dc1ed5c34f Remove the tlb argument to tlb_page_demap (itlb or dtlb), in order to better
match the pmap_invalidate api.
2002-07-26 15:54:04 +00:00
jake
51d015a4b6 Fix bizarre SMP problems. The secondary cpus sometimes start up with junk
in their tlb which the prom doesn't clear out, so we have to do so manually
before mapping the kernel page table or the cpu can hang due to various
conditions which cause undefined behaviour from the tlb.
2002-06-08 07:10:28 +00:00
jake
2a45651b25 Use a contrived 'tlb_entry' structure for passing the mappings for the
kernel text and data from the loader to the kernel, so that the tte format
is not part of the loader->kernel ABI.
2002-05-29 05:49:59 +00:00
jake
e0ffa7e6d3 Redefine the tte accessor macros to take a pointer to a tte, instead of the
value of the tag or data field.
Add macros for getting the page shift, size and mask for the physical page
that a tte maps (which may be one of several sizes).
Use the new cache functions for invalidating single pages.
2002-05-21 00:29:02 +00:00
jake
1166262e26 De-inline the tlb demap functions. These were so big that gcc3.1 refused
to inline them anyway.  ;)
2002-05-20 16:10:17 +00:00
tmm
3ac364813a Fix crashes that would happen when more than one 4MB page was used to
hold the kernel text, data and loader metadata by not using a fixed slot
to store the TSB page(s) into. Enter fake 8k page entries into the kernel
TSB that cover the 4M kernel page(s), so that pmap_kenter() will work
without having to treat these pages as a special case.

Problem reported by:	mjacob, obrien
Problem spotted and 4M page handling proposed by:       jake
2002-04-02 17:50:13 +00:00
jake
f11b0f8cfb Fix a deadlock condition with tlb shootdown ipi delivery. Since ipis are
not blocked by raising the pil, a receiver may be interrupted while holding
a spinlock.  If the sender does not defer interrupts throughout the entire
operation it may be interrupted and try to acquire a spinlock held by a
receiver, leading to a deadlock due to the synchronization used by the
ipi handlers themselves.

Submitted by:	tmm
2002-03-23 04:20:00 +00:00
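
A schematic of the rule the fix above enforces, written as a standalone
program; the ex_* primitives are stubs standing in for the real interrupt and
IPI machinery:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Stubs so the sketch compiles on its own. */
    static uint64_t ex_intr_disable(void) { return (0); }
    static void ex_intr_restore(uint64_t s) { (void)s; }
    static void ex_ipi_send_demap(uint64_t va) { printf("demap %#" PRIx64 "\n", va); }
    static void ex_ipi_wait_done(void) { }

    /* The sender keeps interrupts disabled across the whole send/wait
     * sequence; otherwise it can be interrupted while a receiver spins on
     * a lock the sender holds, and both sides deadlock on the IPI
     * handlers' own synchronization. */
    static void
    ex_tlb_shootdown_page(uint64_t va)
    {
        uint64_t s;

        s = ex_intr_disable();
        ex_ipi_send_demap(va);
        ex_ipi_wait_done();
        ex_intr_restore(s);
    }

    int
    main(void)
    {
        ex_tlb_shootdown_page(0x40000000UL);
        return (0);
    }
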
jake
33d439a7d0 Implement delivery of tlb shootdown ipis. This is currently more fine-grained
than the other implementations; we have complete control over the tlb, so we
only demap specific pages.  We take advantage of the ranged tlb flush api
to send one ipi for a range of pages, and due to the pm_active optimization
we rarely send ipis for demaps from user pmaps.

Remove now unused routines to load the tlb; this is only done once outside
of the tlb fault handlers.
Minor cleanups to the smp startup code.

This boots multi-user with both cpus active on a dual ultra 60 and on a
dual ultra 2.
2002-03-07 06:01:40 +00:00
jake
951cf2831e Modify the tlb demap API to take a pmap instead of a tlb context number.
Due to allocating tlb contexts on the fly, we only ever need to demap the
primary context; non-primary contexts have already been implicitly flushed
by context switching.  All we really need to tell is if it's a kernel demap
or not, and it's easier just to compare against the kernel_pmap which is a
constant.
2002-03-07 05:25:15 +00:00
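
The "kernel demap or not" test from the entry above reduces to a pointer
comparison; a minimal model with made-up ex_* types, not the actual pmap
structures:

    #include <stdbool.h>

    /* Minimal stand-in: only the identity of the pmap matters here. */
    struct ex_pmap {
        int pm_context;
    };

    static struct ex_pmap ex_kernel_pmap_store;
    static struct ex_pmap *const ex_kernel_pmap = &ex_kernel_pmap_store;

    /* With contexts handed out on the fly, only the primary context ever
     * needs an explicit demap, so the API only has to know whether the
     * target is the kernel pmap: a compare against a constant. */
    static inline bool
    ex_demap_is_kernel(const struct ex_pmap *pm)
    {
        return (pm == ex_kernel_pmap);
    }
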
jake
c87ee2427d Dig the information about which tlb slots were used to map the kernel out
of the metadata passed by the loader.
2002-03-04 07:07:10 +00:00
jake
8322761809 Allocate tlb contexts on the fly in cpu_switch, instead of statically 1 to 1
with pmaps.  When the context numbers wrap around we flush all user mappings
from the tlb.  This makes use of the array indexed by cpuid to allow a pmap
to have a different context number on a different cpu.  If the context numbers
are then divided evenly among cpus such that none are shared, we can avoid
sending tlb shootdown ipis in an smp system for non-shared pmaps.  This also
removes a limit of 8192 processes (pmaps) that could be active at any given
time due to running out of tlb contexts.

Inspired by:		the brown book
Crucial bugfix from:	tmm
2002-03-04 05:20:29 +00:00
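
A sketch of the on-the-fly context allocation with rollover handling described
above; EX_MAX_CTX reuses the 8192 figure from the entry, the ex_* names are
placeholders, and the per-CPU bookkeeping is only hinted at in the comment:

    #include <stdint.h>

    #define EX_MAX_CTX  8192U           /* size of the context-number space */

    static uint32_t ex_next_ctx = 1;    /* context 0 stays reserved */

    /* Stand-in for the per-CPU "demap all user TTEs" primitive. */
    static inline void
    ex_flush_all_user_mappings(void)
    {
    }

    /* Hand out the next context number at switch time; when the space
     * wraps, flush all user mappings and start over, so stale TTEs tagged
     * with a recycled context can never match again.  In the real code a
     * pmap keeps an array of context numbers indexed by cpuid, so it may
     * use different contexts on different CPUs. */
    static inline uint32_t
    ex_context_alloc(void)
    {
        if (ex_next_ctx == EX_MAX_CTX) {
            ex_flush_all_user_mappings();
            ex_next_ctx = 1;
        }
        return (ex_next_ctx++);
    }
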
jake
7eea55cfea Modify the tte format to not include the tlb context number and to store the
virtual page number in a much more convenient way; all in one piece.  This
greatly simplifies the comparison for a matching tte, and allows the fault
handlers to be much simpler due to not having to load weird masks.
Rewrite the tlb fault handlers to account for the new format.  These are also
written to allow faults on the user tsb inside of the fault handlers; the
kernel fault handler must be aware of this and not clobber the other's
registers.  The faults do not yet occur due to other support that is needed
(and still under my desk).

Bug fixes from:	tmm
2002-02-25 04:56:50 +00:00
jake
63d6dd6dde Add inlines for demapping a range of pages from the itlb and dtlb. This
will be used to reduce the number of tlb shootdown ipis in an smp system
by sending one ipi for a whole range of pages, instead of one per page.
Munge the context demap operations slightly to support demapping a non-primary
context.
2002-02-23 21:10:06 +00:00
jake
b0c5fb0be3 Use intr_disable/intr_restore instead of TLB_ATOMIC_START/END.
Submitted by:	tmm
2002-02-23 20:59:35 +00:00
jake
a3769d0f6a 1. Certain tlb operations need to be atomic, so disable interrupts for
their duration.  This is still only effective as long as they are
   only used in the static kernel.  Code in modules may cause instruction
   faults which make these break in different ways anyway.
2. Add a load bearing membar #Sync.
3. Add an inline for demapping an entire context.

Submitted by:	tmm (1, 2)
2001-12-29 07:07:35 +00:00
jake
c2c8a8c21a Remove traces that are loud and not that useful. Remove nested include
of ktr.h.
2001-10-20 15:58:31 +00:00
jake
e6fc6fd6cd Add ktr traces. 2001-09-03 22:57:21 +00:00
jake
636a22e19d 1. Add code to demap pages from the tlb for user contexts.
2. Add a context argument to most functions, instead of extracting it from
   the tte.

Submitted by:	tmm (1).
2001-08-10 04:21:44 +00:00
obrien
f202ead276 The author isn't a [UC] Regents. Correct the copyright language. 2001-08-09 02:09:34 +00:00
jake
8117bcdd30 Oops. Last commit to tsb.h should have gone here.
Fix macros for dealing with tte contexts and add macros for sfsr fields.
2001-08-06 02:21:53 +00:00
jake
fb7edc502f Flesh out the sparc64 port considerably. This contains:
- mostly complete kernel pmap support, and tested but currently turned
  off userland pmap support
- low level assembly language trap, context switching and support code
- fully implemented atomic.h and supporting cpufunc.h
- some support for kernel debugging with ddb
- various header tweaks and filling out of machine dependent structures
2001-07-31 06:05:05 +00:00