Improve the performance of the arm64 thread switching code.

The full system memory barrier around a TLB invalidation is stricter than
required. It only needs to wait on accesses to main memory, with just the
weaker store variant before the invalidate. As such, use the dsb ishst,
tlbi, dsb ish sequence already used in pmap.

The tlbi instruction in this sequence is also unnecessarily using a
broadcast invalidate when it only needs to invalidate the local CPU's TLB.
Switch to a non-broadcast variant of this instruction.
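
For reference, the replacement sequence is the following; the comments are
my reading of the ARMv8 barrier semantics, not part of the committed code:

	dsb	ishst		/* order prior stores (page table updates)
				   before the invalidate; store-only, inner
				   shareable domain */
	tlbi	vmalle1		/* invalidate all EL1 TLB entries on this
				   CPU only (no "is" suffix: non-broadcast) */
	dsb	ish		/* wait for the invalidate to complete */
	isb			/* resynchronize the instruction stream so
				   later instructions use the new entries */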

Sponsored by:	DARPA, AFRL
Andrew Turner 2017-08-21 18:12:32 +00:00
parent 8cf606a4d6
commit cbf2160e81

@@ -91,9 +91,9 @@ ENTRY(cpu_throw)
 	isb
 	/* Invalidate the TLB */
-	dsb	sy
-	tlbi	vmalle1is
-	dsb	sy
+	dsb	ishst
+	tlbi	vmalle1
+	dsb	ish
 	isb
 	/* If we are single stepping, enable it */
@@ -192,9 +192,9 @@ ENTRY(cpu_switch)
 	isb
 	/* Invalidate the TLB */
-	dsb	sy
-	tlbi	vmalle1is
-	dsb	sy
+	dsb	ishst
+	tlbi	vmalle1
+	dsb	ish
 	isb
 	/*