6a0fd1a51b
'sync' is heavy-handed and unnecessary for this use case. It is a full barrier that orders accesses to all storage types. However, atomic_load_acq_*() is only expected to operate on physical memory, not device memory, so lwsync is sufficient (lwsync orders accesses to memory that is marked Coherency Required and is neither Write Through nor Cache Inhibited). On 32-bit systems this change is a no-op, since powerpc_lwsync() is defined to use sync there, as a workaround for a silicon bug in the Freescale e500 core.
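
For illustration only, a minimal sketch of the load-acquire pattern described above. The names sketch_lwsync() and sketch_load_acq_32() are hypothetical stand-ins, not the kernel's actual definitions, and the non-PowerPC branch is only a compiler barrier so the sketch builds elsewhere (it is not a correct acquire on other weakly ordered CPUs).

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for powerpc_lwsync(); the real macro lives in the
 * kernel headers and may differ from this illustration.
 */
#if defined(__powerpc64__)
/* 64-bit: lightweight sync is enough for ordinary cacheable memory. */
#define sketch_lwsync()	__asm__ __volatile__("lwsync" : : : "memory")
#elif defined(__powerpc__)
/* 32-bit: fall back to sync (the e500 workaround the message mentions). */
#define sketch_lwsync()	__asm__ __volatile__("sync" : : : "memory")
#else
/* Other hosts: compiler barrier only, so the sketch still compiles. */
#define sketch_lwsync()	__asm__ __volatile__("" : : : "memory")
#endif

/* Hypothetical load-acquire: the load first, then the barrier. */
static inline uint32_t
sketch_load_acq_32(volatile uint32_t *p)
{
	uint32_t v;

	v = *p;			/* the load being ordered */
	sketch_lwsync();	/* later loads/stores stay after the load */
	return (v);
}
```

The barrier is placed after the load because acquire semantics only need to keep later accesses from being performed ahead of it; store-to-load ordering, which lwsync does not provide, is not required here.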