047adc1724
In __rte_ring_move_prod_head, move the __atomic_load_n up and out of
the do {} while loop. Upon failure, the compare-exchange already updates
old_head, so another load is costly and not necessary.
This helps latency a little, by about 1~5%.
Test results with the patch (two cores):
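
The hoisted-load pattern described above can be sketched roughly as follows. This is a simplified illustration, not the actual DPDK code: the struct and function names (sketch_headtail, sketch_move_head) are hypothetical, the free-space check against the consumer tail is omitted, and relaxed memory ordering is assumed only to keep the example short. It relies on the GCC/Clang __atomic builtins, where a failed __atomic_compare_exchange_n writes the current value back into the expected operand.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the ring's producer head. */
struct sketch_headtail {
	uint32_t head;
};

/*
 * Reserve n slots by advancing ht->head with a compare-exchange loop.
 * Returns the head value observed just before the successful update.
 */
static uint32_t
sketch_move_head(struct sketch_headtail *ht, uint32_t n)
{
	uint32_t old_head, new_head;

	assert(n > 0);

	/* Single load, hoisted out of the retry loop. */
	old_head = __atomic_load_n(&ht->head, __ATOMIC_RELAXED);
	do {
		new_head = old_head + n;
		/*
		 * On failure the builtin stores the current value of
		 * ht->head into old_head, so no explicit reload is needed
		 * before the next iteration.
		 */
	} while (!__atomic_compare_exchange_n(&ht->head, &old_head, new_head,
			0, __ATOMIC_RELAXED, __ATOMIC_RELAXED));

	return old_head;
}
```

Because old_head is refreshed by the failed compare-exchange itself, each retry recomputes new_head from an up-to-date value without a separate atomic load.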
SP/SC bulk enq/dequeue (size: 8): 5.64
MP/MC bulk enq/dequeue (size: 8): 9.58
SP/SC bulk enq/dequeue (size: 32): 1.98
MP/MC bulk enq/dequeue (size: 32): 2.30
Fixes:
Makefile
meson.build
rte_ring_c11_mem.h
rte_ring_generic.h
rte_ring_version.map
rte_ring.c
rte_ring.h