e44f4f3547
The IA32 memory model guarantees that all writes are observed in program order. In addition, any access to uncacheable memory flushes the store buffers. As a consequence, the SFENCE instruction is (almost) never needed; in particular, it is not needed to ensure the correct order of updates as seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() so that only a compiler barrier is emitted on x86 there. Other architectures get the right barrier instruction as well.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
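
For illustration only, the ordering pattern described in this message can be sketched in portable C11. The sketch uses atomic_thread_fence(memory_order_release) as a userland analogue of the kernel's atomic_thread_fence_rel(); struct demo_ring, demo_post() and the doorbell field are hypothetical names that do not come from the driver, and the doorbell here is an ordinary volatile field standing in for an uncacheable device register.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor ring; all names are illustrative assumptions. */
struct demo_ring {
	uint64_t desc[8];		/* descriptors a device would read */
	volatile uint32_t doorbell;	/* stand-in for an uncacheable register */
};

/* Publish one descriptor, then ring the doorbell. */
static inline void
demo_post(struct demo_ring *r, uint32_t idx, uint64_t d)
{
	/* 1. Fill in the descriptor in ordinary memory. */
	r->desc[idx % 8] = d;

	/*
	 * 2. Release fence: the descriptor store above may not be
	 * reordered after the doorbell store below.  With common
	 * compilers on x86 this is expected to act as a compiler
	 * barrier only; weakly ordered architectures get a real
	 * barrier instruction.
	 */
	atomic_thread_fence(memory_order_release);

	/* 3. Notify the (simulated) device. */
	r->doorbell = idx + 1;
}
```

A caller would fill in a descriptor with demo_post(&ring, idx, value); the point of the sketch is only the position of the fence between the descriptor store and the doorbell store.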