freebsd-skq

Author	SHA1	Message	Date
gibbs	c5fb44c57d	Rename definition of HYPERVISOR_VIRT_START to avoid conflict with upstream Xen definition found in xen/interface/arch-x86/xen-x86_32.h. Submitted by: Roger Pau Monné Reviewed by: gibbs MFC after: 2 weeks	2013-08-22 20:07:06 +00:00
kib	05a9dff802	Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release(). Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-22 18:12:24 +00:00
attilio	16c7563cf4	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
jeff	de4ecca213	Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division	2013-08-07 06:21:20 +00:00
gibbs	16fc3a3c4b	Adjust i386 Xen PV support for updated Xen interface files. sys/i386/include/xen/xenvar.h: sys/i386/xen/xen_machdep.c: sys/xen/interface/foreign/structs.py: sys/xen/evtchn/evtchn.c: MAX_VIRT_CPUS => XEN_LEGACY_MAX_VCPUS Submitted by: Roger Pau Monné Reviewed by: gibbs	2013-06-17 01:43:07 +00:00
jeff	7ee88fb112	- Add a BIT_FFS() macro and use it to replace cpusetffs_obj() Discussed with: attilio Sponsored by: EMC / Isilon Storage Division	2013-06-13 20:46:03 +00:00
attilio	fdf82ef9cf	o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-05-21 20:38:19 +00:00
kib	7c26a038f9	Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminate the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitely requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitely specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested. In the rework, filesystem metadata is not the subject to maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, is accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not worked, because buffer_map fragmentation does not allow the limit to be reached. By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks	2013-03-19 14:13:12 +00:00
attilio	d500d6361a	MFC	2013-03-17 23:39:52 +00:00
kib	63efc821c3	Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking array of vm_page_t both for source and destination. Starting offsets and total transfer size are specified. The function implements optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures were the direct map is available, no transient mappings are created, for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture. Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement. For sparc64, the existing code has known issues and a stab is added instead, to allow the kernel linking. Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks	2013-03-14 20:18:12 +00:00
attilio	76954ad68a	Merge from vmcontention.	2013-03-09 03:19:53 +00:00
attilio	bf1dc90446	MFC	2013-03-08 00:03:07 +00:00
attilio	d924385016	Fixup XEN pmap to cope with removal of left/right iterators from pages. Sponsored by: EMC / Isilon storage division	2013-03-03 01:26:11 +00:00
attilio	be8012c4e6	Fix-up r247622 by also renaming pv_list iterator into the xen pmap verbatim copy. Sponsored by: EMC / Isilon storage division Reported by: tinderbox	2013-03-03 01:02:57 +00:00
attilio	e98f58faf6	MFC	2013-03-02 14:48:41 +00:00
mav	6cf7cc6e4d	MFcalloutng: Switch eventtimers(9) from using struct bintime to sbintime_t. Even before this not a single driver really supported full dynamic range of struct bintime even in theory, not speaking about practical inexpediency. This change legitimates the status quo and cleans up the code.	2013-02-28 13:46:03 +00:00
attilio	8d28f94790	Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-27 18:12:13 +00:00
attilio	905e648d42	Hide the details for the assertion for VM_OBJECT_LOCK operations. Rename current VM_OBJECT_LOCK_ASSERT(foo, RA_WLOCKED) into VM_OBJECT_ASSERT_WLOCKED(foo) Sponsored by: EMC / Isilon storage division Requested by: alc	2013-02-21 21:54:53 +00:00
attilio	1f1e13ca03	There is no need to use VM_OBJECT_LOCKED() as the assertion won't make the check available in any case if INVARIANTS is switched off. Remove VM_OBJECT_LOCKED().	2013-02-20 10:51:34 +00:00
attilio	658534ed5a	Switch vm_object lock to be a rwlock. * VM_OBJECT_LOCK and VM_OBJECT_UNLOCK are mapped to write operations * VM_OBJECT_SLEEP() is introduced as a general purpose primitve to get a sleep operation using a VM_OBJECT_LOCK() as protection * The approach must bear with vm_pager.h namespace pollution so many files require including directly rwlock.h	2013-02-20 10:38:34 +00:00
jkim	92f5e4e678	Consistently use round_page(x) rather than roundup(x, PAGE_SIZE). There is no functional change.	2013-02-15 22:43:08 +00:00
marius	185d79fdec	Fix !INVARIANTS && !SMP build. MFC after: 3 days	2013-01-03 01:09:50 +00:00
dim	fb83924633	Fix a minor warning in sys/i386/xen/clock.c. MFC after: 3 days	2012-11-12 20:50:11 +00:00
alc	0014523dbd	Replace all uses of the vm page queues lock by a new R/W lock. Unfortunately, this lock cannot be defined as static under Xen because it is (ab)used to serialize queued page table changes. Tested by: sbruno	2012-10-12 23:26:00 +00:00
alc	99dca09c83	MFi386 r241356 Add several asserts. MFC after: 3 days	2012-10-10 17:15:34 +00:00
alc	41b1f8207c	In a few places, like the implementation of ptrace(), a thread may call upon pmap_enter() to create a mapping within a different address space, i.e., not the thread's own address space. On i386, this entails the creation of a temporary mapping to the affected page table page (PTP). In general, pmap_enter() will read from this PTP, allocate a PV entry, and write to this PTP. The trouble comes when the system is short of memory. In order to allocate a new PV entry, an older PV entry has to be reclaimed. Reclaiming a PV entry involves destroying a mapping, which requires access to the affected PTP. Thus, the PTP mapped at the beginning of pmap_enter() is no longer mapped at the end of pmap_enter(), which leads to pmap_enter() modifying the wrong PTP. To address this problem, pmap_pv_reclaim() is changed to use an alternate method of mapping PTPs. Update a related comment. Reported by: pho Diagnosed by: kib MFC after: 5 days	2012-10-08 16:57:05 +00:00
alc	55f6ff40ed	Eliminate a stale comment. It describes another use case for the pmap in Mach that doesn't exist in FreeBSD.	2012-09-28 05:30:59 +00:00
alc	2bc702613a	Simplify pmap_unmapdev(). Since kmem_free() eventually calls pmap_remove(), pmap_unmapdev()'s own direct efforts to destroy the page table entries are redundant, so eliminate them. Don't set PTE_W on the page table entry in pmap_kenter{,_attr}() on MIPS. Setting PTE_W on MIPS is inconsistent with the implementation of this function on other architectures. Moreover, PTE_W should not be set, unless the pmap's wired mapping count is incremented, which pmap_kenter{,_attr}() doesn't do. MFC after: 10 days	2012-09-10 16:11:29 +00:00
alc	b1adba8ac6	Rename {_,}pmap_unwire_pte_hold() to {_,}pmap_unwire_ptp() and update the comment describing them. Both the function names and the comment had grown stale. Quite some time has passed since these pmap implementations last used the page's hold count to track the number of valid mapping within a page table page. Also, returning TRUE from pmap_unwire_ptp() rather than _pmap_unwire_ptp() eliminates a few instructions from callers like pmap_enter_quick_locked() where pmap_unwire_ptp()'s return value is used directly by a conditional statement.	2012-09-05 06:02:54 +00:00
alc	9e07f73116	Eliminate an unnecessary acquisition and release of the page queues lock from pmap_pte(). PT_SET_MA() is not a queued mapping update, but instead an immediate mapping update, so the page queues lock is not required here. Reviewed by: cperciva	2012-08-10 05:47:04 +00:00
alc	93b7cabc81	Various small changes to PV entry management: Constify pc_freemask[]. pmap_pv_reclaim() Eliminate "freemask" because it was a pessimization. Add a comment about the resident count adjustment. free_pv_entry() [i386 only] Merge an optimization from amd64 (r233954). get_pv_entry() Eliminate the move to tail of the pv_chunk on the global pv_chunks list. (The right strategy needs more thought. Moreover, there were unintended differences between the amd64 and i386 implementation.) pmap_remove_pages() Eliminate unnecessary ()'s.	2012-06-04 03:51:08 +00:00
alc	3124f82898	Eliminate code duplication in free_pv_entry() and pmap_remove_pages() by introducing free_pv_chunk().	2012-06-01 04:26:50 +00:00
alc	8be2ae281d	Eliminate some purely stylistic differences among the amd64, i386 native, and i386 xen PV entry allocators.	2012-05-30 04:16:54 +00:00
alc	2af67ba17d	MFi386 pmap r233433 Disable detailed PV entry accounting by default. (A config option for enabling it was already introduced in r233433.)	2012-05-29 16:11:15 +00:00
alc	ef5ae2f2e5	Rename pmap_collect() to pmap_pv_reclaim() and rewrite it such that it no longer uses the active and inactive paging queues. Instead, the pmap now maintains an LRU-ordered list of pv entry pages, and pmap_pv_reclaim() uses this list to select pv entries for reclamation. Note: The old pmap_collect() tried to avoid reclaiming mappings for pages that have either a hold_count or a busy field that is non-zero. However, this isn't necessary for correctness, and the locking in pmap_collect() was insufficient to guarantee that such mappings weren't reclaimed. The new pmap_pv_reclaim() doesn't even try. Tested by: sbruno MFC after: 5 weeks	2012-05-29 15:41:20 +00:00
alc	d65386a32f	Merge r216333 and r216555 from the native pmap When r207410 eliminated the acquisition and release of the page queues lock from pmap_extract_and_hold(), it didn't take into account that pmap_pte_quick() sometimes requires the page queues lock to be held. This change reimplements pmap_extract_and_hold() such that it no longer uses pmap_pte_quick(), and thus never requires the page queues lock. Merge r177525 from the native pmap Prevent the overflow in the calculation of the next page directory. The overflow causes the wraparound with consequent corruption of the (almost) whole address space mapping. Strictly speaking, r177525 is not required by the Xen pmap because the hypervisor steals the uppermost region of the normal kernel address space. I am nonetheless merging it in order to reduce the number of unnecessary differences between the native and Xen pmap implementations. Tested by: sbruno	2011-12-30 18:16:15 +00:00
alc	d8353bd3a2	Fix a bug in the Xen pmap's implementation of pmap_extract_and_hold(): If the page lock acquisition is retried, then the underlying thread is not unpinned. Wrap nearby lines that exceed 80 columns.	2011-12-28 19:59:54 +00:00
alc	32c6bb1579	Eliminate many of the unnecessary differences between the native and paravirtualized pmap implementations for i386. This includes some style fixes to the native pmap and several bug fixes that were not previously applied to the paravirtualized pmap. Tested by: sbruno MFC after: 3 weeks	2011-12-27 23:53:00 +00:00
alc	2d22b480db	The size passed to kmem functions should be in terms of bytes and not pages. Avoid an out-of-bounds array access. Reviewed by: cperciva	2011-12-20 20:29:45 +00:00
alc	559108fd28	The Xen pmap doesn't support superpages. So, there is no point in it initializing structures, like the pv table, that are only used to implement superpages. In fact, some of the unnecessary code in pmap_init() was actually doing harm. It was preventing the kernel from booting on virtual machines with more than 768 MB of memory. Tested by: sbruno	2011-12-20 20:16:12 +00:00
alc	d368df9b98	Eliminate vestiges of page coloring.	2011-12-15 05:07:16 +00:00
ed	0c56cf839d	Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs. The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.	2011-11-07 15:43:11 +00:00
alc	841afea2d9	Eliminate vestiges of page coloring in VM_ALLOC_NOOBJ calls to vm_page_alloc(). While I'm here, for the sake of consistency, always specify the allocation class, such as VM_ALLOC_NORMAL, as the first of the flags.	2011-10-27 16:39:17 +00:00
kib	a9d505a22a	Split the vm_page flags PG_WRITEABLE and PG_REFERENCED into atomic flags field. Updates to the atomic flags are performed using the atomic ops on the containing word, do not require any vm lock to be held, and are non-blocking. The vm_page_aflag_set(9) and vm_page_aflag_clear(9) functions are provided to modify afalgs. Document the changes to flags field to only require the page lock. Introduce vm_page_reference(9) function to provide a stable KPI and KBI for filesystems like tmpfs and zfs which need to mark a page as referenced. Reviewed by: alc, attilio Tested by: marius, flo (sparc64); andreast (powerpc, powerpc64) Approved by: re (bz)	2011-09-06 10:30:11 +00:00
kib	f408aa11a3	- Move the PG_UNMANAGED flag from m->flags to m->oflags, renaming the flag to VPO_UNMANAGED (and also making the flag protected by the vm object lock, instead of vm page queue lock). - Mark the fake pages with both PG_FICTITIOUS (as it is now) and VPO_UNMANAGED. As a consequence, pmap code now can use use just VPO_UNMANAGED to decide whether the page is unmanaged. Reviewed by: alc Tested by: pho (x86, previous version), marius (sparc64), marcel (arm, ia64, powerpc), ray (mips) Sponsored by: The FreeBSD Foundation Approved by: re (bz)	2011-08-09 21:01:36 +00:00
jhb	1545102484	Don't include mptable_pci.c in Xen kernels. It is only meant for systems that truly have an MPTable. The MPTable code in Xen is really a Xen specific CPU enumerator and probably shouldn't be using the mptable name at all.	2011-07-17 01:23:50 +00:00
jhb	53534ecdc0	Fix build with NEW_PCIB defined.	2011-07-16 14:06:02 +00:00
attilio	364d0522f7	With retirement of cpumask_t and usage of cpuset_t for representing a mask of CPUs, pc_other_cpus and pc_cpumask become highly inefficient. Remove them and replace their usage with custom pc_cpuid magic (as, atm, pc_cpumask can be easilly represented by (1 << pc_cpuid) and pc_other_cpus by (all_cpus & ~(1 << pc_cpuid))). This change is not targeted for MFC because of struct pcpu members removal and dependency by cpumask_t retirement. MD review by: marcel, marius, alc Tested by: pluknet MD testing by: marcel, marius, gonzo, andreast	2011-07-04 12:04:52 +00:00
alc	9cb2f04853	When iterating over a paging queue, explicitly check for PG_MARKER, instead of relying on zeroed memory being interpreted as an empty PV list. Reviewed by: kib	2011-07-02 23:42:04 +00:00
alc	21902be08c	Add a new option, OBJPR_NOTMAPPED, to vm_object_page_remove(). Passing this option to vm_object_page_remove() asserts that the specified range of pages is not mapped, or more precisely that none of these pages have any managed mappings. Thus, vm_object_page_remove() need not call pmap_remove_all() on the pages. This change not only saves time by eliminating pointless calls to pmap_remove_all(), but it also eliminates an inconsistency in the use of pmap_remove_all() versus related functions, like pmap_remove_write(). It eliminates harmless but pointless calls to pmap_remove_all() that were being performed on PG_UNMANAGED pages. Update all of the existing assertions on pmap_remove_all() to reflect this change. Reviewed by: kib	2011-06-29 16:40:41 +00:00

1 2 3 4

187 Commits