87c8f0c0f2
MP lock for the last time. A locked instruction to cpu-private memory is 3x faster than CPUID and 3x faster than a locked instruction to shared memory (the lock itself). Because the cpu executes instructions out of order, instruction serialization is required to ensure that any pending memory ops are properly flushed before the lock is released.
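
A minimal sketch of the release path described above, not the code from this commit. It assumes GCC or Clang, and every name in it (mp_lock_word, cpu_private_word, cpu_serialize_private, mp_lock_try, mp_lock_release) is hypothetical. A thread-local word stands in for cpu-private memory, and `lock; addl $0` is one assumed form of the locked instruction. Non-x86 builds fall back to a compiler-provided full fence.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared lock word: 0 = free, 1 = held. */
static volatile uint32_t mp_lock_word;

/* Stand-in for cpu-private memory (assumption: one word per thread). */
static _Thread_local volatile uint32_t cpu_private_word;

/*
 * Order all earlier loads and stores using a locked instruction whose
 * target is private memory, so the cache line holding the lock is not
 * touched by the locked operation itself.
 */
static inline void cpu_serialize_private(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lock; addl $0,%0"
                         : "+m"(cpu_private_word)
                         :
                         : "memory", "cc");
#else
    /* Fallback for other architectures: a full memory fence. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Try to take the lock; returns 1 on success, 0 if already held. */
static inline int mp_lock_try(void)
{
    return __atomic_exchange_n(&mp_lock_word, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Release: flush pending memory ops first, then clear with a plain store. */
static inline void mp_lock_release(void)
{
    assert(mp_lock_word == 1);  /* caller must hold the lock */
    cpu_serialize_private();
    mp_lock_word = 0;
}
```

The point of the design is that the only locked operation on the release path targets private memory, and the shared lock word is cleared by an ordinary store.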