freebsd-nq

Author	SHA1	Message	Date
Konstantin Belousov	edc8222303	Make kstack_pages a tunable on arm, x86, and powepc. On i386, the initial thread stack is not adjusted by the tunable, the stack is allocated too early to get access to the kernel environment. See TD0_KSTACK_PAGES for the thread0 stack sizing on i386. The tunable was tested on x86 only. From the visual inspection, it seems that it might work on arm and powerpc. The arm USPACE_SVC_STACK_TOP and powerpc USPACE macros seems to be already incorrect for the threads with non-default kstack size. I only changed the macros to use variable instead of constant, since I cannot test. On arm64, mips and sparc64, some static data structures are sized by KSTACK_PAGES, so the tunable is disabled. Sponsored by: The FreeBSD Foundation MFC after: 2 week	2015-08-10 17:18:21 +00:00
Jason A. Harmening	713841afb2	Add two new pmap functions: vm_offset_t pmap_quick_enter_page(vm_page_t m) void pmap_quick_remove_page(vm_offset_t kva) These will create and destroy a temporary, CPU-local KVA mapping of a specified page. Guarantees: --Will not sleep and will not fail. --Safe to call under a non-sleepable lock or from an ithread Restrictions: --Not guaranteed to be safe to call from an interrupt filter or under a spin mutex on all platforms --Current implementation does not guarantee more than one page of mapping space across all platforms. MI code should not make nested calls to pmap_quick_enter_page. --MI code should not perform locking while holding onto a mapping created by pmap_quick_enter_page The idea is to use this in busdma, for bounce buffer copies as well as virtually-indexed cache maintenance on mips and arm. NOTE: the non-i386, non-amd64 implementations of these functions still need review and testing. Reviewed by: kib Approved by: kib (mentor) Differential Revision: http://reviews.freebsd.org/D3013	2015-08-04 19:46:13 +00:00
Justin Hibbits	0936003e3d	Use the correct type for physical addresses. On Book-E, physical addresses are actually 36-bits, not 32-bits. This is currently worked around by ignoring the top bits. However, in some cases, the boot loader configures CCSR to something above the 32-bit mark. This is stage 1 in updating the pmap to handle 36-bit physaddr.	2015-07-04 19:00:38 +00:00
Justin Hibbits	6c8df58287	Unify booke and AIM machdep. Much of the code was common to begin with. There is one nit, which is likely not an issue at all. With the old code, the AIM machdep would __syncicache() the entire kernel core at setup. However, in the unified setup, that seems to hang on the MPC7455, perhaps because it's running later than before. Removing this allows it to boot just fine. Examining the code, the FreeBSD loader already does syncicache of the full kernel, and each module loaded, so this doesn't appear to be an actual problem. Initial code by Nathan Whitehorn.	2015-04-30 01:24:25 +00:00
Ryan Stone	f2c2231e0c	Fix integer truncation bug in malloc(9) A couple of internal functions used by malloc(9) and uma truncated a size_t down to an int. This could cause any number of issues (e.g. indefinite sleeps, memory corruption) if any kernel subsystem tried to allocate 2GB or more through malloc. zfs would attempt such an allocation when run on a system with 2TB or more of RAM. Note to self: When this is MFCed, sparc64 needs the same fix. Differential revision: https://reviews.freebsd.org/D2106 Reviewed by: kib Reported by: Michael Fuckner <michael@fuckner.net> Tested by: Michael Fuckner <michael@fuckner.net> MFC after: 2 weeks	2015-04-01 12:42:26 +00:00
Nathan Whitehorn	1cd30eb6dd	Deallocate any leftover page table entries in the LPAR at boot. This prevents contamination from a previous kernel (e.g. after shutdown -r).	2015-03-13 00:08:58 +00:00
Nathan Whitehorn	d1295abdc0	Move Book-E/AIM dependent bits for setting user PMAP during thread switch out of cpu_switch() and into pmap_activate() where they belong. This also removes all the #ifdef from cpu_switch().	2015-03-04 16:45:31 +00:00
Nathan Whitehorn	827cc9b981	New pmap implementation for 64-bit PowerPC processors. The main focus of this change is to improve concurrency: - Drop global state stored in the shadow overflow page table (and all other global state) - Remove all global locks - Use per-PTE lock bits to allow parallel page insertion - Reconstruct state when requested for evicted PTEs instead of buffering it during overflow This drops total wall time for make buildworld on a 32-thread POWER8 system by a factor of two and system time by a factor of three, providing performance 20% better than similarly clocked Core i7 Xeons per-core. Performance on smaller SMP systems, where PMAP lock contention was not as much of an issue, is nearly unchanged. Tested on: POWER8, POWER5+, G5 UP, G5 SMP (64-bit and 32-bit kernels) Merged from: user/nwhitehorn/ppc64-pmap-rework Looked over by: jhibbits, andreast MFC after: 3 months Relnotes: yes Sponsored by: FreeBSD Foundation	2015-02-24 21:37:20 +00:00
Nathan Whitehorn	c5e8bb4f2e	Provide a tunable (machdep.moea64_bpvo_pool_size) to set the bootstrap PVO pool size. The default errs on the exceedingly large side, so absent any intelligent automatic tuning, at least let the user set it to save RAM on memory-constrained systems. MFC after: 2 weeks	2015-01-19 05:14:07 +00:00
Nathan Whitehorn	bf27800837	Do not remap Open Firmware mappings covered by the direct map. It's pointless and wastes resources. MFC after: 1 week	2015-01-14 02:18:29 +00:00
Mark Johnston	bdb9ab0dd9	Factor out duplicated code from dumpsys() on each architecture into generic code in sys/kern/kern_dump.c. Most dumpsys() implementations are nearly identical and simply redefine a number of constants and helper subroutines; a generic implementation will make it easier to implement features around kernel core dumps. This change does not alter any minidump code and should have no functional impact. PR: 193873 Differential Revision: https://reviews.freebsd.org/D904 Submitted by: Conrad Meyer <conrad.meyer@isilon.com> Reviewed by: jhibbits (earlier version) Sponsored by: EMC / Isilon Storage Division	2015-01-07 01:01:39 +00:00
Nathan Whitehorn	44d29d4762	Allow booting with both a real Open Firmware tree and a flattened version of the Open Firmware, as provided by petitboot, for example. Note that this is not quite complete, since RTAS instantiation still depends on callable firmware. MFC after: 2 weeks	2015-01-01 22:26:12 +00:00
Konstantin Belousov	39ffa8c138	Change pmap_enter(9) interface to take flags parameter and superpage mapping size (currently unused). The flags includes the fault access bits, wired flag as PMAP_ENTER_WIRED, and a new flag PMAP_ENTER_NOSLEEP to indicate that pmap should not sleep. For powerpc aim both 32 and 64 bit, fix implementation to ensure that the requested mapping is created when PMAP_ENTER_NOSLEEP is not specified, in particular, wait for the available memory required to proceed. In collaboration with: alc Tested by: nwhitehorn (ppc aim32 and booke) Sponsored by: The FreeBSD Foundation and EMC / Isilon Storage Division MFC after: 2 weeks	2014-08-08 17:12:03 +00:00
Alan Cox	a695d9b25b	Retire pmap_change_wiring(). We have never used it to wire virtual pages. We continue to use pmap_enter() for that. For unwiring virtual pages, we now use pmap_unwire(), which unwires a range of virtual addresses instead of a single virtual page. Sponsored by: EMC / Isilon Storage Division	2014-08-03 20:40:51 +00:00
Alan Cox	081b8e203b	Simplify the selection of the pvo_head and pvo allocation zone in moea_enter_locked() and moea64_enter(). Eliminate an unused variable from moea64_enter().	2014-08-01 17:09:50 +00:00
Alan Cox	add0359068	Correct a long-standing problem in moea{,64}_pvo_enter() that was revealed by the combination of r268591 and r269134: When we attempt to add the wired attribute to an existing mapping, moea{,64}_pvo_enter() do nothing. (They only set the wired attribute on newly created mappings.) Tested by: andreast	2014-08-01 01:48:41 +00:00
Alan Cox	974524373d	Correct a defect in r268591. In the implementation of the new function pmap_unwire(), the call to MOEA64_PVO_TO_PTE() must be performed before any changes are made to the PVO. Otherwise, MOEA64_PVO_TO_PTE() will panic. Reported by: andreast	2014-07-31 16:17:30 +00:00
Alan Cox	a844c68fc2	Implement pmap_unwire(). See r268327 for the motivation behind this change.	2014-07-13 16:27:57 +00:00
Ed Maste	0fcefb433d	Update NetBSD Foundation copyrights to 2-clause BSD The NetBSD Foundation states "Third parties are encouraged to change the license on any files which have a 4-clause license contributed to the NetBSD Foundation to a 2-clause license." This change removes clauses 3 and 4 from copyright / license blocks that list The NetBSD Foundation as the only copyright holder. Sponsored by: The FreeBSD Foundation	2014-03-18 01:40:25 +00:00
Nathan Whitehorn	fe875ee040	Do not assume a value for #address-cells when parsing the OF translations map. This allows the kernel to get farther with OpenBIOS on 64-bit CPUs.	2013-11-17 18:03:03 +00:00
Justin Hibbits	1eb04b44aa	Fix copy+paste-o, OEA64 uses LPTE, not PTE. X-MFC with: r257941	2013-11-14 07:41:52 +00:00
Justin Hibbits	a470e71336	Add the necessary bits for dumps on ppc64. MFC after: 2 weeks	2013-11-11 03:17:38 +00:00
Nathan Whitehorn	10d06e67fd	Add some extra sanity checking and checks to printf format specifiers.	2013-10-26 18:19:36 +00:00
Alan Cox	deb179bb4c	The pmap function pmap_clear_reference() is no longer used. Remove it. pmap_clear_reference() has had exactly one caller in the kernel for several years, more precisely, since FreeBSD 8. Now, that call no longer exists. Approved by: re (kib) Sponsored by: EMC / Isilon Storage Division	2013-09-20 04:30:18 +00:00
Nathan Whitehorn	1330c354c5	Change VM object lock assertion to match locking higher in the call chain. This repairs a panic observed during pageout on some 64-bit PowerPC systems. Submitted by: grehan Approved by: re (kib) MFC after: 2 weeks Revisit after: 10.0	2013-09-13 01:12:45 +00:00
Nathan Whitehorn	c5915fdc44	Add POWER CPUs to the kernel's knowledge. This does not imply we currently actually run on any machines with POWER CPUs but avoids closing that door unnecessarily. Approved by: re (kib)	2013-09-09 12:51:24 +00:00
Konstantin Belousov	e68c64f0ba	Revert r254501. Instead, reuse the type stability of the struct pmap which is the part of struct vmspace, allocated from UMA_ZONE_NOFREE zone. Initialize the pmap lock in the vmspace zone init function, and remove pmap lock initialization and destruction from pmap_pinit() and pmap_release(). Suggested and reviewed by: alc (previous version) Tested by: pho Sponsored by: The FreeBSD Foundation	2013-08-22 18:12:24 +00:00
Attilio Rao	c7aebda8a1	The soft and hard busy mechanism rely on the vm object lock to work. Unify the 2 concept into a real, minimal, sxlock where the shared acquisition represent the soft busy and the exclusive acquisition represent the hard busy. The old VPO_WANTED mechanism becames the hard-path for this new lock and it becomes per-page rather than per-object. The vm_object lock becames an interlock for this functionality: it can be held in both read or write mode. However, if the vm_object lock is held in read mode while acquiring or releasing the busy state, the thread owner cannot make any assumption on the busy state unless it is also busying it. Also: - Add a new flag to directly shared busy pages while vm_page_alloc and vm_page_grab are being executed. This will be very helpful once these functions happen under a read object lock. - Move the swapping sleep into its own per-object flag The KPI is heavilly changed this is why the version is bumped. It is very likely that some VM ports users will need to change their own code. Sponsored by: EMC / Isilon storage division Discussed with: alc Reviewed by: jeff, kib Tested by: gavin, bapt (older version) Tested by: pho, scottl	2013-08-09 11:11:11 +00:00
Jeff Roberson	5df87b21d3	Replace kernel virtual address space allocation with vmem. This provides transparent layering and better fragmentation. - Normalize functions that allocate memory to use kmem_* - Those that allocate address space are named kva_* - Those that operate on maps are named kmap_* - Implement recursive allocation handling for kmem_arena in vmem. Reviewed by: alc Tested by: pho Sponsored by: EMC / Isilon Storage Division	2013-08-07 06:21:20 +00:00
Nathan Whitehorn	d9675cb0af	Fix check: bitwise and has only one &. MFC after: 1 week	2013-07-12 15:56:30 +00:00
Attilio Rao	9af6d512f5	o Relax locking assertions for vm_page_find_least() o Relax locking assertions for pmap_enter_object() and add them also to architectures that currently don't have any o Introduce VM_OBJECT_LOCK_DOWNGRADE() which is basically a downgrade operation on the per-object rwlock o Use all the mechanisms above to make vm_map_pmap_enter() to work mostl of the times only with readlocks. Sponsored by: EMC / Isilon storage division Reviewed by: alc	2013-05-21 20:38:19 +00:00
Alan Cox	658f180b3b	Relax the object locking assertion in pmap_enter_locked(). Reviewed by: attilio Sponsored by: EMC / Isilon Storage Division	2013-05-17 18:59:00 +00:00
Konstantin Belousov	ee75e7de7b	Implement the concept of the unmapped VMIO buffers, i.e. buffers which do not map the b_pages pages into buffer_map KVA. The use of the unmapped buffers eliminate the need to perform TLB shootdown for mapping on the buffer creation and reuse, greatly reducing the amount of IPIs for shootdown on big-SMP machines and eliminating up to 25-30% of the system time on i/o intensive workloads. The unmapped buffer should be explicitely requested by the GB_UNMAPPED flag by the consumer. For unmapped buffer, no KVA reservation is performed at all. The consumer might request unmapped buffer which does have a KVA reserve, to manually map it without recursing into buffer cache and blocking, with the GB_KVAALLOC flag. When the mapped buffer is requested and unmapped buffer already exists, the cache performs an upgrade, possibly reusing the KVA reservation. Unmapped buffer is translated into unmapped bio in g_vfs_strategy(). Unmapped bio carry a pointer to the vm_page_t array, offset and length instead of the data pointer. The provider which processes the bio should explicitely specify a readiness to accept unmapped bio, otherwise g_down geom thread performs the transient upgrade of the bio request by mapping the pages into the new bio_transient_map KVA submap. The bio_transient_map submap claims up to 10% of the buffer map, and the total buffer_map + bio_transient_map KVA usage stays the same. Still, it could be manually tuned by kern.bio_transient_maxcnt tunable, in the units of the transient mappings. Eventually, the bio_transient_map could be removed after all geom classes and drivers can accept unmapped i/o requests. Unmapped support can be turned off by the vfs.unmapped_buf_allowed tunable, disabling which makes the buffer (or cluster) creation requests to ignore GB_UNMAPPED and GB_KVAALLOC flags. Unmapped buffers are only enabled by default on the architectures where pmap_copy_page() was implemented and tested. In the rework, filesystem metadata is not the subject to maxbufspace limit anymore. Since the metadata buffers are always mapped, the buffers still have to fit into the buffer map, which provides a reasonable (but practically unreachable) upper bound on it. The non-metadata buffer allocations, both mapped and unmapped, is accounted against maxbufspace, as before. Effectively, this means that the maxbufspace is forced on mapped and unmapped buffers separately. The pre-patch bufspace limiting code did not worked, because buffer_map fragmentation does not allow the limit to be reached. By Jeff Roberson request, the getnewbuf() function was split into smaller single-purpose functions. Sponsored by: The FreeBSD Foundation Discussed with: jeff (previous version) Tested by: pho, scottl (previous version), jhb, bf MFC after: 2 weeks	2013-03-19 14:13:12 +00:00
Konstantin Belousov	e8a4a618cf	Add pmap function pmap_copy_pages(), which copies the content of the pages around, taking array of vm_page_t both for source and destination. Starting offsets and total transfer size are specified. The function implements optimal algorithm for copying using the platform-specific optimizations. For instance, on the architectures were the direct map is available, no transient mappings are created, for i386 the per-cpu ephemeral page frame is used. The code was typically borrowed from the pmap_copy_page() for the same architecture. Only i386/amd64, powerpc aim and arm/arm-v6 implementations were tested at the time of commit. High-level code, not committed yet to the tree, ensures that the use of the function is only allowed after explicit enablement. For sparc64, the existing code has known issues and a stab is added instead, to allow the kernel linking. Sponsored by: The FreeBSD Foundation Tested by: pho (i386, amd64), scottl (amd64), ian (arm and arm-v6) MFC after: 2 weeks	2013-03-14 20:18:12 +00:00
Attilio Rao	89f6b8632c	Switch the vm_object mutex to be a rwlock. This will enable in the future further optimizations where the vm_object lock will be held in read mode most of the time the page cache resident pool of pages are accessed for reading purposes. The change is mostly mechanical but few notes are reported: * The KPI changes as follow: - VM_OBJECT_LOCK() -> VM_OBJECT_WLOCK() - VM_OBJECT_TRYLOCK() -> VM_OBJECT_TRYWLOCK() - VM_OBJECT_UNLOCK() -> VM_OBJECT_WUNLOCK() - VM_OBJECT_LOCK_ASSERT(MA_OWNED) -> VM_OBJECT_ASSERT_WLOCKED() (in order to avoid visibility of implementation details) - The read-mode operations are added: VM_OBJECT_RLOCK(), VM_OBJECT_TRYRLOCK(), VM_OBJECT_RUNLOCK(), VM_OBJECT_ASSERT_RLOCKED(), VM_OBJECT_ASSERT_LOCKED() * The vm/vm_pager.h namespace pollution avoidance (forcing requiring sys/mutex.h in consumers directly to cater its inlining functions using VM_OBJECT_LOCK()) imposes that all the vm/vm_pager.h consumers now must include also sys/rwlock.h. * zfs requires a quite convoluted fix to include FreeBSD rwlocks into the compat layer because the name clash between FreeBSD and solaris versions must be avoided. At this purpose zfs redefines the vm_object locking functions directly, isolating the FreeBSD components in specific compat stubs. The KPI results heavilly broken by this commit. Thirdy part ports must be updated accordingly (I can think off-hand of VirtualBox, for example). Sponsored by: EMC / Isilon storage division Reviewed by: jeff Reviewed by: pjd (ZFS specific review) Discussed with: alc Tested by: pho	2013-03-09 02:32:23 +00:00
Attilio Rao	dc1558d1cd	Merge from vmobj-rwlock: VM_OBJECT_LOCKED() macro is only used to implement a custom version of lock assertions right now (which likely spread out thanks to copy and paste). Remove it and implement actual assertions. Sponsored by: EMC / Isilon storage division Reviewed by: alc Tested by: pho	2013-02-27 18:12:13 +00:00
Attilio Rao	590f9303e5	Merge from vmobj-rwlock branch: Remove unused inclusion of vm/vm_pager.h and vm/vnode_pager.h. Sponsored by: EMC / Isilon storage division Tested by: pho Reviewed by: alc	2013-02-26 01:00:11 +00:00
Konstantin Belousov	b32ecf44bc	Flip the semantic of M_NOWAIT to only require the allocation to not sleep, and perform the page allocations with VM_ALLOC_SYSTEM class. Previously, the allocation was also allowed to completely drain the reserve of the free pages, being translated to VM_ALLOC_INTERRUPT request class for vm_page_alloc() and similar functions. Allow the caller of malloc* to request the 'deep drain' semantic by providing M_USE_RESERVE flag, now translated to VM_ALLOC_INTERRUPT class. Previously, it resulted in less aggressive VM_ALLOC_SYSTEM allocation class. Centralize the translation of the M_* malloc(9) flags in the single inline function malloc2vm_flags(). Discussion started by: "Sears, Steven" <Steven.Sears@netapp.com> Reviewed by: alc, mdf (previous version) Tested by: pho (previous version) MFC after: 2 weeks	2012-11-14 20:01:40 +00:00
Alan Cox	e4b8a2fc5a	Eliminate a stale comment. It describes another use case for the pmap in Mach that doesn't exist in FreeBSD.	2012-09-28 05:30:59 +00:00
Alan Cox	8d9e6d9f93	Avoid recursion on the pvh global lock in the aim oea pmap. Correct the return type of the pmap_ts_referenced() implementations. Reported by: jhibbits [1] Tested by: andreast	2012-07-10 22:10:21 +00:00
Rafal Jaworowski	d7c8c7fdfb	Fix physical address type to vm_paddr_t also for powerpc64.	2012-05-25 18:17:26 +00:00
Nathan Whitehorn	ccc4a5c761	Replace the list of PVOs owned by each PMAP with an RB tree. This simplifies range operations like pmap_remove() and pmap_protect() as well as allowing simple operations like pmap_extract() not to involve any global state. This substantially reduces lock coverages for the global table lock and improves concurrency.	2012-05-20 14:33:28 +00:00
Nathan Whitehorn	0b852c03eb	Avoid a lock order reversal in pmap_extract_and_hold() from relocking the page. This PMAP requires an additional lock besides the PMAP lock in pmap_extract_and_hold(), which vm_page_pa_tryrelock() did not release. Suggested by: kib MFC after: 4 days	2012-04-22 17:58:30 +00:00
Nathan Whitehorn	e3c2930d36	We don't need kcopy() in any of the remaining places it is used, so remove it. MFC after: 2 weeks	2012-04-11 22:23:50 +00:00
Nathan Whitehorn	b6aeb1ab97	Only manipulate the PGA_EXECUTABLE flag on managed pages. This is a proxy for whether the page is physical. On dense phys mem systems (32-bit), VM_PHYS_TO_PAGE will not return NULL for device memory pages if device memory is above physical memory even if there is no allocated vm_page. Attempting to use the returned page could then cause either memory corruption or a page fault.	2012-04-11 21:56:55 +00:00
Nathan Whitehorn	348bc07000	Substantially reduce the scope of the locks held in pmap_enter(), which improves concurrency slightly.	2012-04-06 18:18:48 +00:00
Nathan Whitehorn	57bd5cce62	Reduce the frequency that the PowerPC/AIM pmaps invalidate instruction caches, by invalidating kernel icaches only when needed and not flushing user caches for shared pages. Suggested by: kib MFC after: 2 weeks	2012-04-06 16:03:38 +00:00
Nathan Whitehorn	7e55df27cb	More PMAP performance improvements: skip 256 MB segments entirely if they are are not mapped during ranged operations and reduce the scope of the tlbie lock only to the actual tlbie instruction instead of the entire sequence. There are a few more optimization possibilities here as well.	2012-03-28 17:25:29 +00:00
Nathan Whitehorn	a3e9e259b3	Make sure to call vm_page_dirty() before the pmap lock is released to prevent a race where another process could conclude the page was clean. Submitted by: alc	2012-03-27 01:26:00 +00:00
Nathan Whitehorn	5afcb4c91e	More PMAP concurrency improvements: replace the table lock and (almost) all uses of the page queues mutex with a new rwlock that protects the page table and the PV lists. This reduces system time during a parallel buildworld by 35%. Reviewed by: alc	2012-03-27 01:24:18 +00:00

1 2 3

129 Commits