This is not strictly required with the current ABI but will be when we
switch to the ARM EABI. The aapcs requires the stack to be 4 byte aligned
at all times and 8 byte aligned when calling a public subroutine where the
current ABI only requires sp to be a multiple of 4.