If the timer subsystem is not initialized before rte_timer_manage (for
example) is invoked, a pointer to a shared hugepage memory region will
still be null and dereferenced when it is checked for validity; handle
this case.
Fixes: c0749f7096 ("timer: allow management in shared memory")
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Currently, whenever timer library is initialized, the memory
is leaked because there is no telling when primary or secondary
processes get to use the state, and there is no way to
initialize/deinitialize timer library state without race
conditions [1] because the data itself must live in shared memory.
Add a spinlock to the shared mem config to have a way to
exclusively initialize/deinitialize the timer library without
any races, and implement the synchronization mechanism based
on this lock in the timer library.
Also, update the API doc. Note that the behavior of the API
itself did not change - the requirement to call init in every
process was simply not documented explicitly.
[1] See the following email thread:
https://mails.dpdk.org/archives/dev/2019-May/131498.html
Fixes: c0749f7096 ("timer: allow management in shared memory")
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
We had some inconsistencies between functions prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.
git grep -l __rte_experimental |grep \.c$ |while read file; do
sed -i -e '/^__rte_experimental$/d' $file;
sed -i -e 's/ *__rte_experimental//' $file;
sed -i -e 's/__rte_experimental *//' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Since memzones can be reserved from secondary processes as well as
primary processes, if the first call to the timer subsystem init
function occurs in a secondary process, we should allow it to succeed.
Fixes: c0749f7096 ("timer: allow management in shared memory")
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
The rte_timer_alt_manage function should track which is the running
timer and whether or not it was updated by a callback in the priv_timer
structure that corresponds to the running lcore, so that restarting
or stopping the timer from the callback works correctly.
Fixes: c0749f7096 ("timer: allow management in shared memory")
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
A null array is allowed to be passed as one of the parameters to
rte_timer_alt_manage() as a convenience. When that happened, an
anonymous array was created using compound literal syntax, and Coverity
detected that the object was out of scope in later uses of it. Create
an object in the proper scope instead.
Coverity issue: 337919
Fixes: c0749f7096 ("timer: allow management in shared memory")
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
This commit adds an autotest which exercises new timer reset/stop APIs
in a secondary process. Timers are created, and sometimes stopped, in
the secondary process, and their expiration is checked for and handled
in the primary process.
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Add a function to the timer API that allows a caller to traverse a
specified set of timer lists, stopping each timer in each list,
and invoking a callback function.
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Currently, the timer library uses a per-process table of structures to
manage skiplists of timers presumably because timers contain arbitrary
function pointers whose value may not resolve properly in other
processes.
However, if the same callback is used handle all timers, and that
callback is only invoked in one process, then it woud be safe to allow
the data structures to be allocated in shared memory, and to allow
secondary processes to modify the timer lists. This would let timers be
used in more multi-process scenarios.
The library's global variables are wrapped with a struct, and an array
of these structures is created in shared memory. The original APIs
are updated to reference the zeroth entry in the array. This maintains
the original behavior for both primary and secondary processes since
the set intersection of their coremasks should be empty [1]. New APIs
are introduced to enable the allocation/deallocation of other entries
in the array.
New variants of the APIs used to start and stop timers are introduced;
they allow a caller to specify which array entry should be used to
locate the timer list to insert into or delete from.
Finally, a new variant of rte_timer_manage() is introduced, which
allows a caller to specify which array entry should be used to locate
the timer lists to process; it can also process multiple timer lists per
invocation.
[1] https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi-process-limitations
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
rte_timer_manage() adds expired timers to a "run list", and walks the
list, transitioning each timer from the PENDING to the RUNNING state.
If another lcore resets or stops the timer at precisely this
moment, the timer state would instead be set to CONFIG by that other
lcore, which would cause timer_manage() to skip over it. This is
expected behavior.
However, if a timer expires quickly enough, there exists the
following race condition that causes the timer_manage() routine to
misinterpret a timer in CONFIG state, resulting in lost timers:
- Thread A:
- starts a timer with rte_timer_reset()
- the timer is moved to CONFIG state
- the spinlock associated with the appropriate skiplist is acquired
- timer is inserted into the skiplist
- the spinlock is released
- Thread B:
- executes rte_timer_manage()
- find above timer as expired, add it to run list
- walk run list, see above timer still in CONFIG state, unlink it from
run list and continue on
- Thread A:
- move timer to PENDING state
- return from rte_timer_reset()
- timer is now in PENDING state, but not actually linked into a
pending list or a run list and will never get processed further
by rte_timer_manage()
This commit fixes this race condition by only releasing the spinlock
after the timer state has been transitioned from CONFIG to PENDING,
which prevents rte_timer_manage() from seeing an incorrect state.
Fixes: 9b15ba895b ("timer: use a skip list")
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
rte_lcore_has_role() returns 0 if role of lcore matches requested
role. The return value of the API is confusing, and this is a known
problem with a deprecation notice announcing the change to more
intuitive semantics:
Commit 064518f68d ("doc: announce EAL API change to lcore role function")
Implement changes announced in the deprecation notice, and remove it.
Also, fix usages of this API to reflect the change. Control thread patches
expected new behavior and were broken before, now they are fixed as well.
Fixes: d651ee4919 ("eal: set affinity for control threads")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The return value of rte_lcore_has_role is misinterpreted in the timer
reset function. The return values of rte_lcore_has_role will be changed
in a future DPDK release, but this commit fixes this call site until
that happens.
Fixes: 351f463456 ("timer: allow reset on service cores")
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The memzone header is often included without good reason.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
64bit load and store will be an atomic operation on all the
64bit processors.
Change RTE_ARCH_X86_64 to RTE_ARCH_64 to reflect the case.
Fixes: 9b15ba895b ("timer: use a skip list")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The rte_timer_reset function should be able to register timers on service
lcores as they are EAL threads.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Remove rte_pause() definition from rte_common.h and
switchover to architecture specific rte_pause.h
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
For periodic timers, if the lag gets introduced, the current code
added additional delay when the next peridoc timer was initialized
by not taking into account the delay added, with this fix the code
would start the next occurrence of timer keeping in account the
lag added. Corrected the behavior.
Fixes: 9b15ba89 ("timer: use a skip list")
Signed-off-by: Karmarkar Suyash <skarmarkar@sonusnet.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
When timer_cb resets another running timer on the same lcore,
the list of expired timers is chained to the pending-list.
This commit prevents a running timer from being reset
by not its own timer_cb.
Fixes: a4b7a5a45c ("timer: fix race condition")
Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
When timer_set_running_state() fails in rte_timer_manage(),
the failed timer is put back on pending-list.
In this case, another core tries to reset or stop the timer.
It does not need to be on pending-list.
Fixes: a4b7a5a45c ("timer: fix race condition")
Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
This commit fixes incorrect pending-list manipulation
when getting list of expired timers in rte_timer_manage().
When timer_get_prev_entries() sets pending_head on prev,
the pending-list is broken.
The next of pending_head always becomes NULL.
In this depth level, it is not need to manipulate the list.
Fixes: 9b15ba895b ("timer: use a skip list")
Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
Eliminate problematic race condition in rte_timer_manage() that can
lead to corruption of per-lcore pending-lists (implemented as
skip-lists). The race condition occurs when rte_timer_manage() expires
multiple timers on lcore A, while lcore B simultaneously invokes
rte_timer_reset() for one of the expiring timers (other than the first
one).
Lcore A splits its pending-list, creating a local list of expired timers
linked through their sl_next[0] pointers, and sets the first expired
timer to the RUNNING state, all during one list-lock round trip.
Lcore A then unlocks the list-lock to run the first callback, and that
is when A and B can have different interpretations of the subsequent
expired timers' true state. Lcore B sees an expired timer still in the
PENDING state, atomically changes the timer to the CONFIG state, locks
lcore A's list-lock, and reinserts the timer into A's pending-list.
The two lcores try to use the same next-pointers to maintain both lists!
Our solution is to remove expired timers from the pending-list and try
to set them all to the RUNNING state in one atomic step, i.e.,
rte_timer_manage() should perform these two actions within one
ownership of the list-lock.
After splitting the pending-list at the current point in time and trying
to set all expired timers to the RUNNING state, we must put back into
the pending-list any timers that we failed to set to the RUNNING state,
all while still holding the list-lock. It is then safe to release the
lock and run the callback functions for all expired timers that remain
on our local run-list.
Signed-off-by: Robert Sanford <rsanford@akamai.com>
- API rte_timer_reset() should return -1 when the timer is in the
RUNNING or CONFIG state. Instead, it ignores the return value of
internal function __rte_timer_reset() and always returns 0.
We change rte_timer_reset() to return the value returned by
__rte_timer_reset().
- Enhance timer stress test 2 to report how many timer reset
collisions occur, i.e., how many times rte_timer_reset() fails
due to a timer being in the CONFIG state.
Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
In rte_timer_reset_sync(), insert rte_pause() into loop that waits
for rte_timer_reset() to succeed.
Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Allow to setup timers only for EAL (lcore) threads (__lcore_id < MAX_LCORE_ID).
E.g. – dynamically created thread will be able to reset/stop timer for lcore thread,
but it will be not allowed to setup timer for itself or another non-lcore thread.
rte_timer_manage() for non-lcore thread would simply do nothing and return straightway.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
No need for that 'x bit' on source files.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Bug: When a timer is running
- if rte_timer_stop is called, the pending decrement is
skipped (decremented only if the timer is pending) and due
to the update flag the future processing is skipped so the
timer is counted as pending while it is stopped. - the same
applies when rte_timer_reset is called but then the pending
statistics is additionally incremented so the timer is
counted pending twice.
Solution: decrement the pending
statistics after returning from the callback. If
rte_timer_stop was called, it skipped decrementing the
pending statistics. If rte_time_reset was called, the
pending statistics was incremented. If neither was called
and the timer is periodic, the pending statistics is
incremented when it is reloaded
Signed-off-by: Vadim Suraev <vadim.suraev@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Bug: when a periodic timer's callback is running, if another
timer is manipulated, the periodic timer is not reloaded.
Solution: set the update flag only if the modified timer is
in RUNNING state
Signed-off-by: Vadim Suraev <vadim.suraev@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The DPDK dump functions are useful for remote debugging of an
applications. But when application runs as a daemon, stdout
is typically routed to /dev/null.
Instead change all these functions to take a stdio FILE * handle
instead. An application can then use open_memstream() to capture
the output.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In many application there are no timers queued, and the call to
rte_timer_managecan be optimized in that case avoid reading HPET and
lock overhead.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Reviewed-by: Vincent Jardin <vincent.jardin@6wind.com>