freebsd-dev/sys/amd64/amd64
Konstantin Belousov 56e61f57b0 Eliminate pvh_global_lock from the amd64 pmap.
The only current purpose of the pvh lock was explained in the following message:
On Wed, Jan 09, 2013 at 11:46:13PM -0600, Alan Cox wrote:
> Let me lay out one example for you in detail.  Suppose that we have
> three processors and two of these processors are actively using the same
> pmap.  Now, one of the two processors sharing the pmap performs a
> pmap_remove().  Suppose that one of the removed mappings is to a
> physical page P.  Moreover, suppose that the other processor sharing
> that pmap has this mapping cached with write access in its TLB.  Here's
> where the trouble might begin.  As you might expect, the processor
> performing the pmap_remove() will acquire the fine-grained lock on the
> PV list for page P before destroying the mapping to page P.  Moreover,
> this processor will ensure that the vm_page's dirty field is updated
> before releasing that PV list lock.  However, the TLB shootdown for this
> mapping may not be initiated until after the PV list lock is released.
> The processor performing the pmap_remove() is not problematic, because
> the code being executed by that processor won't presume that the mapping
> is destroyed until the TLB shootdown has completed and pmap_remove() has
> returned.  However, the other processor sharing the pmap could be
> problematic.  Specifically, suppose that the third processor is
> executing the page daemon and concurrently trying to reclaim page P.
> This processor performs a pmap_remove_all() on page P in preparation for
> reclaiming the page.  At this instant, the PV list for page P may
> already be empty but our second processor still has a stale TLB entry
> mapping page P.  So, changes might still occur to the page after the
> page daemon believes that all mappings have been destroyed.  (If the PV
> entry had still existed, then the pmap lock would have ensured that the
> TLB shootdown completed before the pmap_remove_all() finished.)  Note,
> however, the page daemon will know that the page is dirty.  It can't
> possibly mistake a dirty page for a clean one.  However, without the
> current pvh global locking, I don't think anything is stopping the page
> daemon from starting the laundering process before the TLB shootdown has
> completed.
>
> I believe that a similar example could be constructed with a clean page
> P' and a stale read-only TLB entry.  In this case, the page P' could be
> "cached" in the cache/free queues and recycled before the stale TLB
> entry is flushed.

TLBs for addresses with updated PTEs are always flushed before the pmap
lock is unlocked.  On the other hand, the amd64 pmap code does not always
flush TLBs before PV list locks are unlocked, if PTEs were previously
cleared and PV entries removed.

To handle the situation where a thread might notice an empty PV list while
another thread still has access to the page because TLB invalidation
has not finished yet, introduce delayed invalidation (DI).  Compared with
pvh_global_lock, DI does not block a thread that has entered a DI block
while pmap_remove_all() or pmap_remove_write() (the callers of
pmap_delayed_invl_wait()) are executing in parallel.  But _invl_wait()
callers are blocked until all previously noted DI blocks are left,
thus ensuring that the necessary TLB invalidations were performed before
returning from pmap_remove_all() or pmap_remove_write().
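
The generation idea behind delayed invalidation can be sketched in user
space as follows.  This is a simplified model under stated assumptions, not
the code in pmap.c: the names di_tracker, page_model, di_started(),
di_page(), di_finished() and di_must_wait() are hypothetical, the fixed-size
slot array is an arbitrary choice, and the sketch only reports whether a
waiter would still have to wait instead of actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>

#define DI_MAX_BLOCKS 8	/* arbitrary limit chosen for this sketch */

/* Hypothetical tracker of DI blocks that are currently open. */
struct di_tracker {
	unsigned long last_gen;			/* generation of the newest block */
	unsigned long active[DI_MAX_BLOCKS];	/* open generations, 0 = free slot */
};

/* Hypothetical per-page state; 0 means no DI block has touched the page. */
struct page_model {
	unsigned long invl_gen;
};

/* Enter a DI block.  Returns its slot, or -1 if all slots are in use. */
static int
di_started(struct di_tracker *t)
{
	for (int i = 0; i < DI_MAX_BLOCKS; i++) {
		if (t->active[i] == 0) {
			t->active[i] = ++t->last_gen;
			return (i);
		}
	}
	return (-1);
}

/* Note that the open DI block in 'slot' removed a PV entry of page p. */
static void
di_page(const struct di_tracker *t, int slot, struct page_model *p)
{
	assert(slot >= 0 && slot < DI_MAX_BLOCKS && t->active[slot] != 0);
	if (t->active[slot] > p->invl_gen)
		p->invl_gen = t->active[slot];
}

/* Leave a DI block; the caller has completed its TLB invalidations. */
static void
di_finished(struct di_tracker *t, int slot)
{
	assert(slot >= 0 && slot < DI_MAX_BLOCKS && t->active[slot] != 0);
	t->active[slot] = 0;
}

/*
 * Would a waiter on page p still have to wait?  It must, as long as any
 * open DI block is no newer than the generation recorded in the page.
 */
static bool
di_must_wait(const struct di_tracker *t, const struct page_model *p)
{
	for (int i = 0; i < DI_MAX_BLOCKS; i++) {
		if (t->active[i] != 0 && t->active[i] <= p->invl_gen)
			return (true);
	}
	return (false);
}
```

In this model a thread inside a DI block never waits for anyone, while a
caller in the role of pmap_remove_all() would repeat di_must_wait() until it
returns false.  The check is deliberately conservative: an older DI block
that is still open keeps the waiter blocked even if a newer block recorded
the page.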

See the comments for a detailed description of the mechanism, and also for
the explanation of why several pmap methods, most importantly
pmap_enter(), do not need DI protection.

Reviewed by:	alc, jhb (turnstile KPI usage)
Tested by:	pho (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D5747
2016-05-14 23:35:11 +00:00
..
amd64_mem.c sys: use our roundup2/rounddown2() macros when param.h is available. 2016-04-21 19:57:40 +00:00
apic_vector.S hyperv: Deprecate HYPERV option by moving Hyper-V IDT vector into vmbus 2016-04-15 02:20:18 +00:00
atomic.c sys/amd64: Small spelling fixes. 2016-05-03 22:13:04 +00:00
atpic_vector.S
bios.c
bpf_jit_machdep.c Provide includes that are needed in these files, and before were read 2013-10-26 18:18:50 +00:00
bpf_jit_machdep.h
cpu_switch.S Rewrite amd64 PCID implementation to follow an algorithm described in 2015-05-09 19:11:01 +00:00
db_disasm.c ddb: finish converting boolean values. 2015-05-21 15:16:18 +00:00
db_interface.c
db_trace.c Various changes to the registers displayed in DDB for x86. 2015-07-22 01:09:02 +00:00
elf_machdep.c Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall 2016-01-09 20:18:53 +00:00
exception.S sys/amd64: Small spelling fixes. 2016-05-03 22:13:04 +00:00
fpu.c Use ANSI definitions. Wrap long line. 2016-01-19 08:08:08 +00:00
gdb_machdep.c Report the values of x86 segment registers to remote debuggers. 2015-06-12 15:14:08 +00:00
genassym.c Make kstack_pages a tunable on arm, x86, and powepc. On i386, the 2015-08-10 17:18:21 +00:00
in_cksum.c
initcpu.c fix missing variable in r298736 2016-04-28 09:40:24 +00:00
io.c
locore.S xen: add PV/PVH kernel entry point 2014-03-11 10:07:01 +00:00
machdep.c X86: use our nitems() macro when it is avaliable through param.h. 2016-04-19 23:41:46 +00:00
mem.c Revert r263475: TDP_DEVMEMIO no longer needed, since amd64 /dev/kmem 2015-01-12 08:58:07 +00:00
minidump_machdep.c Add 4Kn kernel dump support 2016-04-15 17:45:12 +00:00
mp_machdep.c re-enable AMD Topology extension on certain models if disabled by BIOS 2016-04-12 13:30:39 +00:00
mp_watchdog.c CALLOUT_MPSAFE has lost its meaning since r141428, i.e., for more than ten 2015-05-22 17:05:21 +00:00
mpboot.S sys/amd64: Small spelling fixes. 2016-05-03 22:13:04 +00:00
pmap.c Eliminate pvh_global_lock from the amd64 pmap. 2016-05-14 23:35:11 +00:00
prof_machdep.c
ptrace_machdep.c Disallow a debugger on 64bit system to set fs/gs bases of the 32bit 2015-07-01 16:37:03 +00:00
sigtramp.S
support.S Return dst as the result from memcpy(9) on amd64. 2016-02-24 11:58:15 +00:00
sys_machdep.c Due to invalid use of a signed intermediate value in the bounds checking 2016-03-16 22:33:12 +00:00
trap.c Implement vsyscall hack. Prior to 2.13 glibc uses vsyscall 2016-01-09 20:18:53 +00:00
uio_machdep.c amd64: make uiomove_fromphys functional for pages not mapped by the DMAP 2014-10-24 09:48:58 +00:00
uma_machdep.c Include sys/_task.h into uma_int.h, so that taskqueue.h isn't a 2016-02-09 20:22:35 +00:00
vm_machdep.c Eliminate pvh_global_lock from the amd64 pmap. 2016-05-14 23:35:11 +00:00
xen-locore.S amd64: set the correct LMA values 2015-06-26 07:12:17 +00:00