Improved MAC swap performance for ARM platform.
The improvement was achieved by using neon intrinsics
to save CPU cycles and doing swap for four packets
at a time.
The optimization had 15% - 20% throughput boost
in testpmd MAC swap mode.
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>