mempool: flush cache completely on overflow

The cache was still full after flushing. In the opposite direction,
i.e. when getting objects from the cache, the cache is refilled to full
level when it crosses the low watermark (which happens to be zero).
Similarly, the cache should be flushed to empty level when it crosses
the high watermark (which happens to be 1.5 x the size of the cache).
The existing flushing behaviour was suboptimal for real applications,
because crossing the low or high watermark typically happens when the
application is in a state where the number of put/get events are out of
balance, e.g. when absorbing a burst of packets into a QoS queue
(getting more mbufs from the mempool), or when a burst of packets is
trickling out from the QoS queue (putting the mbufs back into the
mempool).
Now, the mempool cache is completely flushed when crossing the flush
threshold, so only the newly put (hot) objects remain in the mempool
cache afterwards.

This bug degraded performance caused by too frequent flushing.

Consider this application scenario:

Either, an lcore thread in the application is in a state of balance,
where it uses the mempool cache within its flush/refill boundaries; in
this situation, the flush method is less important, and this fix is
irrelevant.

Or, an lcore thread in the application is out of balance (either
permanently or temporarily), and mostly gets or puts objects from/to the
mempool. If it mostly puts objects, not flushing all of the objects will
cause more frequent flushing. This is the scenario addressed by this
fix. E.g.:

Cache size=256, flushthresh=384 (1.5x size), initial len=256;
application burst len=32.

If there are "size" objects in the cache after flushing, the cache is
flushed at every 4th burst.

If the cache is flushed completely, the cache is only flushed at every
16th burst.

As you can see, this bug caused the cache to be flushed 4x too
frequently in this example.

And when/if the application thread breaks its pattern of continuously
putting objects, and suddenly starts to get objects instead, it will
either get objects already in the cache, or the get() function will
refill the cache.

The concept of not flushing the cache completely was probably based on
an assumption that it is more likely for an application's lcore thread
to get() after flushing than to put() after flushing.
I strongly disagree with this assumption! If an application thread is
continuously putting so much that it overflows the cache, it is much
more likely to keep putting than it is to start getting. If in doubt,
consider how CPU branch predictors work: When the application has done
something many times consecutively, the branch predictor will expect the
application to do the same again, rather than suddenly do something
else.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This commit is contained in:
Andrew Rybchenko 2022-10-09 16:37:37 +03:00 committed by Thomas Monjalon
parent 459531c958
commit e6e62f6f55

View File

@ -1344,19 +1344,9 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
cache_objs = &cache->objs[cache->len];
cache->len += n;
} else {
unsigned int keep = (n >= cache->size) ? 0 : (cache->size - n);
/*
* If number of object to keep in the cache is positive:
* keep = cache->size - n < cache->flushthresh - n < cache->len
* since cache->flushthresh > cache->size.
* If keep is 0, cache->len cannot be 0 anyway since
* n <= cache->flushthresh and we'd no be here with
* cache->len == 0.
*/
cache_objs = &cache->objs[keep];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len - keep);
cache->len = keep + n;
cache_objs = &cache->objs[0];
rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
cache->len = n;
}
/* Add the objects to the cache. */