If structure packed as __packed clang (and probably gcc) generates
code that loads word fields (e.g. tx_pos) byte-by-byte and if it's
modified by VideoCore in the same time as ARM loads the value result
is going to be mixed combination of bytes from previous value and
new one.