Currently, we open the system base frequency file, but never close it,
which results in a memory leak.
Coverity issue: 369693
Fixes: 8a5febaac4f7 ("power: fix P-state base frequency handling")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
The previous fix addressed the incorrect handling of the `base_frequency`
file, but introduced a use-after-free error: all further code paths
lead to an `fclose()` call at the end, so the additional `fclose()` call
right after processing the file was unnecessary.
Coverity issue: 369901
Fixes: 8a5febaac4f7 ("power: fix P-state base frequency handling")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Liang Ma <liangma@liangbit.com>
Acked-by: David Hunt <david.hunt@intel.com>
Currently, when we set the pstate governor to "performance", we check if
it is already set to this value, and if it is, we skip setting it.
However, we never save this value anywhere, so that next time we come
back and request the governor to be set to its original value, the
original value is empty.
Fix it by saving the original pstate governor first. While we're at it,
replace `strlcpy` with `rte_strscpy`.
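
A minimal sketch of the ordering described above, under stated
assumptions: `struct gov_state`, `set_performance()` and
`restore_original()` are hypothetical names, and the `sysfs_gov` field
stands in for the governor file so the sketch runs without touching
sysfs. It is not the library code.

```c
#include <stdio.h>
#include <string.h>

#define GOV_LEN 32

/* Hypothetical per-core state; sysfs_gov plays the role of the
 * governor file contents. */
struct gov_state {
	char sysfs_gov[GOV_LEN];
	char orig_gov[GOV_LEN];
};

static void
set_performance(struct gov_state *st)
{
	/* Save the original value first, even if no change is needed. */
	snprintf(st->orig_gov, sizeof(st->orig_gov), "%s", st->sysfs_gov);

	if (strcmp(st->sysfs_gov, "performance") == 0)
		return;	/* already set, but orig_gov is now valid */

	snprintf(st->sysfs_gov, sizeof(st->sysfs_gov), "performance");
}

static void
restore_original(struct gov_state *st)
{
	/* Safe only because set_performance() always filled orig_gov. */
	snprintf(st->sysfs_gov, sizeof(st->sysfs_gov), "%s", st->orig_gov);
}
```

If the save were done after the early return, restoring on a core that
was already in "performance" would write an empty string.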
Fixes: e6c6dc0f96c8 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Previous fix for base frequency handling in pstate mode introduced a
couple of issues:
- When base_frequency file does not exist, it simply bails out because
of what appears to be accidental addition of FOPEN_OR_ERR_RET. This is
incorrect, as absence of this file is not fatal and is in fact
expected on kernel versions earlier than 5.3
- When base_frequency file does exist, it gets opened, but never gets
closed, resulting in a resource leak
Both issues also manifest themselves as Coverity defects (dead code, and
a resource leak), so this fix addresses both.
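
The two behaviours listed above can be sketched as follows. This is an
illustration only: `read_optional_base_freq()` is a hypothetical helper,
and any `fopen()` failure is treated as "file absent" for brevity.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper. A missing file is not an error: *base_khz is
 * set to 0 and 0 is returned. -1 is returned only when the file exists
 * but cannot be read or parsed. */
static int
read_optional_base_freq(const char *path, unsigned long *base_khz)
{
	char buf[32];
	char *end;
	int ret = -1;
	FILE *f;

	*base_khz = 0;
	f = fopen(path, "r");
	if (f == NULL)
		return 0;	/* file absent: not fatal, nothing to close */

	if (fgets(buf, sizeof(buf), f) == NULL)
		goto out;

	*base_khz = strtoul(buf, &end, 10);
	if (end == buf) {
		*base_khz = 0;
		goto out;	/* no digits parsed */
	}
	ret = 0;
out:
	fclose(f);	/* the only fclose() for this handle */
	return ret;
}
```

Routing every path after a successful open through one label keeps the
handle from being leaked or closed twice.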
Coverity issue: 369693, 369694
Bugzilla ID: 668
Fixes: 4db9587bbf72 ("power: check sysfs base frequency")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Some kernels may show an incorrect value for base frequency in
sysfs (e.g. 15 GHz). This throws off the SST-BF algorithm for
high and low priority cores. So if base_frequency is greater
than the max turbo frequency, ignore it and handle the core as a
normal core.
Known Kernel version with issue: Linux 5.8.7
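
A sketch of the sanity check described above. The function name and
the convention that 0 means "no usable base frequency" are assumptions
made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: returns the base frequency to use, or 0 when
 * the sysfs value is implausible and the core should be treated as a
 * normal (non SST-BF) core. */
static uint32_t
sanitize_base_freq(uint32_t base_khz, uint32_t max_turbo_khz)
{
	if (base_khz > max_turbo_khz)
		return 0;	/* e.g. 15 GHz reported: ignore it */
	return base_khz;
}
```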
Signed-off-by: David Hunt <david.hunt@intel.com>
This is causing build error, like:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/482121104
Also remove the '@internal' marker from the doxygen comment, since a
public API should not be internal.
Remove the experimental tag from 'rte_power_guest_channel_send_msg()'.
Fixes: 4d3892dcd77b ("power: make channel message functions public")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The rte_power_guest_channel.h file did not include its dependent
headers, so add them.
Fixes: 5f443cc0f905 ("power: create guest channel public header file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Add a simple on/off switch that enables saving power when no
packets are arriving. It counts the number of empty polls and, when
that number reaches a certain threshold, enters an
architecture-defined optimized power state that waits until either
a TSC timestamp expires or packets arrive.
This API mandates a core-to-single-queue mapping (that is, multiple
queues per device are supported, but they have to be polled on different
cores).
This design uses PMD RX callbacks.
1. UMWAIT/UMONITOR:
When a certain threshold of empty polls is reached, the core will go
into a power optimized sleep while waiting on an address of next RX
descriptor to be written to.
2. TPAUSE/Pause instruction
This method uses the pause (or TPAUSE, if available) instruction to
avoid busy polling.
3. Frequency scaling
Reuse existing DPDK power library to scale up/down core frequency
depending on traffic volume.
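
The empty-poll counting common to the three methods above can be
sketched as below. `struct queue_state`, its fields and
`rx_poll_update()` are hypothetical; a real RX callback would act on
the return value by sleeping, pausing or scaling frequency.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue bookkeeping. */
struct queue_state {
	uint64_t empty_polls;	/* consecutive polls that returned 0 */
	uint64_t threshold;	/* empty polls needed before saving power */
};

/* Call after every RX burst. Returns true when the caller should
 * enter the power-saving path for this queue. */
static bool
rx_poll_update(struct queue_state *q, uint16_t nb_rx)
{
	if (nb_rx != 0) {
		q->empty_polls = 0;	/* traffic seen: start over */
		return false;
	}
	q->empty_polls++;
	return q->empty_polls >= q->threshold;
}
```

Resetting the counter on any non-empty burst means only a run of
consecutive empty polls triggers the power-saving path.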
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Reorganise the inclusion of the new public header file and
remove unneeded includes.
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Adjust meson.build so that 'ninja install' copies the new header
file into the installation directory.
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Rename the #defines to have an RTE_POWER_ prefix
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Rename the public structs to have an rte_power_ prefix.
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Move the 2 public functions into rte_power_guest_channel.h
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In preparation for making the header file public, we first rename
channel_commands.h as rte_power_guest_channel.h.
Fixes: 210c383e247b ("power: packet format for vm power management")
Fixes: cd0d5547e873 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reasons for a build not being supported generally start with a lowercase
letter because they are printed as the second part of a line.
Other changes:
- "linux" should be "Linux" with a capital letter.
- ARCH_X86_64 may be simply x86_64.
- aarch64 is preferred over arm64.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Replace master lcore with main lcore and
replace slave lcore with worker lcore.
Keep the old functions and macros but mark them as deprecated
for this release.
The "--master-lcore" command line option is also deprecated
and any usage will print a warning and use "--main-lcore"
as replacement.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
During power initialization, the pstate cpufreq API does
not set the initial curr_idx of pstate_power_info
to the index corresponding to the current frequency.
Without this, the idx is always 0, which causes the
check below to pass and the function to return without setting the
initial min/max frequency to the system max frequency. This leads to
incorrect frequency settings when power_pstate_cpufreq_set_freq()
is called in the apps.
    set_freq_internal(struct pstate_power_info *pi, uint32_t idx)
    {
        ...
        /* Check if it is the same as current */
        if (idx == pi->curr_idx)
            return 0;
        ...
    }
scenario 1:
If the system starts with scaling min/max of 1000/1000 and we want to
set this to 2200/2200, the max frequency gets updated but not the min.
scenario 2:
If the system starts with scaling min/max of 2200/1000 and we want to set
it to 2200/2200, neither the max nor the min frequency is updated. Since
there is no change in max that is ok, but min is also ignored, which is
fixed by this change.
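
A sketch of initializing the index from the current frequency, so that
the "same as current" check compares against a real value. The struct
layout and `init_curr_idx()` are hypothetical and only mirror the idea,
not the actual pstate_power_info definition.

```c
#include <stdint.h>

#define MAX_FREQS 8

/* Hypothetical, reduced view of the per-core frequency table. */
struct freq_info {
	uint32_t freqs[MAX_FREQS];	/* available frequencies, in kHz */
	uint32_t nb_freqs;
	uint32_t curr_idx;
};

/* Look up the current frequency in the table and record its index.
 * Returns 0 on success, -1 if the frequency is not in the table. */
static int
init_curr_idx(struct freq_info *pi, uint32_t cur_khz)
{
	uint32_t i;

	for (i = 0; i < pi->nb_freqs; i++) {
		if (pi->freqs[i] == cur_khz) {
			pi->curr_idx = i;
			return 0;
		}
	}
	return -1;
}
```

With curr_idx left at 0, a request for index 0 on a core actually
running at a lower frequency would be skipped as "no change".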
Fixes: e6c6dc0f ("power: add p-state driver compatibility")
Cc: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for power in use state update.
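
A sketch of an in-use state update with the GCC/clang `__atomic`
builtins. The state names and the two helpers are hypothetical, and the
memory orderings shown are one reasonable choice, not necessarily the
ones used in the library.

```c
#include <stdint.h>

/* Hypothetical in-use states for one core. */
enum { POWER_IDLE = 0, POWER_ONGOING, POWER_USED };

/* Try to move IDLE -> ONGOING. Returns non-zero if this caller won. */
static int
try_claim(uint32_t *state)
{
	uint32_t expected = POWER_IDLE;

	return __atomic_compare_exchange_n(state, &expected, POWER_ONGOING,
			0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

/* Publish the final state once initialization is done. */
static void
mark_used(uint32_t *state)
{
	__atomic_store_n(state, POWER_USED, __ATOMIC_RELEASE);
}
```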
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Hunt <david.hunt@intel.com>
A decision was made [1] to no longer support Make in DPDK. This patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Start a new release cycle with empty release notes.
The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Anything coming from sysfs has a newline at the end. Cut it off before
comparing the strings.
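
A sketch of the newline handling, assuming the value has already been
read into a writable buffer. `strip_newline()` and `value_matches()`
are hypothetical helper names.

```c
#include <string.h>

/* Cut the string at the first newline, if there is one. */
static void
strip_newline(char *s)
{
	s[strcspn(s, "\n")] = '\0';
}

/* Compare a value read from sysfs against an expected string. */
static int
value_matches(char *sysfs_val, const char *expected)
{
	strip_newline(sysfs_val);
	return strcmp(sysfs_val, expected) == 0;
}
```

`strcspn()` returns the string length when no newline is present, so
the helper is also safe on values that are already stripped.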
Fixes: 20ab67608a39 ("power: add environment capability probing")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Tested-by: Lihong Ma <lihongx.ma@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Currently, there is no way to know if the power management env is
supported without trying to initialize it. The init API also does
not distinguish between failure due to some error and failure due to
power management not being available on the platform in the first
place.
Thus, add an API that provides capability of probing support for a
specific power management API.
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
There is a common macro __rte_unused, avoiding warnings,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a macro __rte_always_inline, forcing functions to be inlined,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This change reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API, i.e. warning
external applications about experimental API usage.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Should be passing errno rather than ret, which could be negative.
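
A sketch of the distinction, using plain POSIX `read()`. `read_msg()`
is a hypothetical wrapper, not the guest channel function: the point is
only that the failed call returns -1 while the cause is in errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical wrapper: returns the number of bytes read, or a
 * negative errno value with a message written to err. */
static int
read_msg(int fd, void *buf, size_t len, char *err, size_t err_len)
{
	ssize_t ret = read(fd, buf, len);

	if (ret < 0) {
		int saved = errno;	/* ret is just -1, not an error code */

		snprintf(err, err_len, "read failed: %s", strerror(saved));
		return -saved;
	}
	return (int)ret;
}
```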
Coverity issue: 350362
Fixes: 9dc843eb273b ("power: extend guest channel API for reading")
Cc: stable@dpdk.org
Signed-off-by: David Hunt <david.hunt@intel.com>
Calling pstate's or acpi's rte_power_freq_up() when on the highest
non-turbo frequency results in an error, if turbo is enabled in the BIOS,
but disabled via the power library.
The error is in the form of a return code and a RTE_LOG() entry
on the ERR level.
According to the API documentation, the frequency is scaled up
"according to the available frequencies". In case turbo is disabled,
that frequency is not available. This patch's rte_power_freq_up()
behaviour is also consistent with how rte_power_freq_max() is
implemented (i.e. the highest non-turbo frequency is set, in case
turbo is disabled).
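
A sketch of the intended behaviour, under loudly stated assumptions:
frequencies are indexed from highest to lowest, index 0 is the turbo
frequency when turbo is available, and the struct and return values
(1 = changed, 0 = no change) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical per-core view. */
struct core_freq {
	uint32_t curr_idx;	/* 0 is the highest frequency */
	int turbo_available;	/* turbo enabled in the BIOS */
	int turbo_enable;	/* turbo enabled via the library */
};

/* Step one frequency up. Being at the highest allowed frequency
 * already is reported as "no change", not as an error. */
static int
freq_up(struct core_freq *c)
{
	/* With turbo available but disabled, index 1 is the ceiling. */
	uint32_t top = (c->turbo_available && !c->turbo_enable) ? 1 : 0;

	if (c->curr_idx <= top)
		return 0;
	c->curr_idx--;
	return 1;
}
```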
Fixes: 445c6528b55f ("power: common interface for guest and host")
Fixes: e6c6dc0f96c8 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
Merge all versions in linker version script files to DPDK_20.0.
This commit was generated by running the following command:
:~/DPDK$ buildtools/update-abi.sh 20.0
Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Since the library versioning for both stable and experimental ABIs is
now managed globally, the LIBABIVER and version variables no longer
serve any useful purpose, and can be removed.
The replacement in Makefiles was done using the following regex:
^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)?
(LIBABIVER := numbers, optionally preceded by a comment and optionally
succeeded by an empty line)
The replacement for meson files was done using the following regex:
^(#.*\n)?version\s*=\s*\d+\n(\s*\n)?
(version = numbers, optionally preceded by a comment and optionally
succeeded by an empty line)
[David]: those variables are manually removed for the files:
- drivers/common/qat/Makefile
- lib/librte_eal/meson.build
[David]: the LIBABIVER is restored for the external ethtool example
library.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Fix these as they are user visible. Found with codespell.
Fixes: af75078fece3 ("first public release")
Fixes: c2361bab70c5 ("eal: compute IOVA mode based on PA availability")
Fixes: 0880c40113ef ("drivers: advertise kmod dependencies in pmdinfo")
Fixes: 56b6ef874f80 ("efd: new Elastic Flow Distributor library")
Fixes: 5a5f3178d4a8 ("power: return error when environment already set")
Cc: stable@dpdk.org
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Fix these as they are user visible. Found with codespell.
Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Fixes: 0cbce3a167f1 ("vfio: skip DMA map failure if already mapped")
Fixes: 445c6528b55f ("power: common interface for guest and host")
Fixes: e6c6dc0f96c8 ("power: add p-state driver compatibility")
Fixes: 8f972312b8f4 ("vhost: support vhost-user")
Cc: stable@dpdk.org
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Add new packet type and commands for capabilities query.
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
Extend incoming packet reading API with new packet
type which carries CPU frequencies.
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
Add a new experimental API, rte_power_guest_channel_receive_msg(),
which makes it possible to receive messages sent to the guest.
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
Currently, 0 is used to indicate a slot that is not connected.
This is not consistent with the Linux documentation, which identifies 0
as a valid (connected) slot, so change the value used for this indication.
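
A sketch of the idea, assuming the slots hold file descriptors and
that -1 is chosen as the "not connected" value. Both are assumptions
for illustration, as are all names below.

```c
#define MAX_SLOTS 4

/* Hypothetical slot table. Zero-initialized storage would make every
 * slot look like descriptor 0, which is a valid descriptor. */
static int slots[MAX_SLOTS];

static void
slots_init(void)
{
	unsigned int i;

	for (i = 0; i < MAX_SLOTS; i++)
		slots[i] = -1;	/* explicit "not connected" marker */
}

static int
slot_is_connected(unsigned int i)
{
	return i < MAX_SLOTS && slots[i] >= 0;
}
```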
Fixes: cd0d5547 ("power: vm communication channels in guest")
Cc: stable@dpdk.org
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
For each library where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Putting a '__attribute__((deprecated))' in the middle of a function
prototype does not produce the expected result with gcc (while clang
is fine with this syntax).
$ cat deprecated.c
void * __attribute__((deprecated)) incorrect() { return 0; }
__attribute__((deprecated)) void *correct(void) { return 0; }
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
$ gcc -o deprecated.o -c deprecated.c
deprecated.c: In function ‘main’:
deprecated.c:3:1: warning: ‘correct’ is deprecated (declared at
deprecated.c:2) [-Wdeprecated-declarations]
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
^
Move the tag on a separate line and make it the first thing of function
prototypes.
This is not perfect but we will trust reviewers to catch the other not
so easy to detect patterns.
sed -i \
-e '/^\([^#].*\)\?__rte_experimental */{' \
-e 's//\1/; s/ *$//; i\' \
-e __rte_experimental \
-e '/^$/d}' \
$(git grep -l __rte_experimental -- '*.h')
Special mention for rte_mbuf_data_addr_default():
There is either a bug or a (not yet understood) issue with gcc.
gcc won't drop this inline when unused and rte_mbuf_data_addr_default()
calls rte_mbuf_buf_addr() which itself is experimental.
This results in a build warning when not accepting experimental apis
from sources just including rte_mbuf.h.
For this specific case, we hide the call to rte_mbuf_buf_addr() under
the ALLOW_EXPERIMENTAL_API flag.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
We had some inconsistencies between functions prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.
git grep -l __rte_experimental |grep \.c$ |while read file; do
sed -i -e '/^__rte_experimental$/d' $file;
sed -i -e 's/ *__rte_experimental//' $file;
sed -i -e 's/__rte_experimental *//' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The ACPI and PState CPU frequency scaling drivers used the
__rte_cache_aligned attribute without including rte_memory.h, which
turns what looks like the declaration of a cache line-aligned struct
into a non-aligned struct declaration and the definition of an
instance of the struct.
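
The parsing difference can be reproduced with stand-in names.
`my_cache_aligned` and `not_a_macro` are hypothetical; the first is a
macro expanding to an alignment attribute (GCC/clang syntax), the
second is an identifier with no macro definition in scope.

```c
#define CACHE_LINE 64
#define my_cache_aligned __attribute__((aligned(CACHE_LINE)))

/* Macro is defined: the trailing token is an attribute, so the struct
 * type itself is cache line-aligned and no object is created. */
struct aligned_info {
	int x;
} my_cache_aligned;

/* No macro in scope: the trailing token is parsed as a declarator.
 * The struct is not aligned, and a global object named not_a_macro
 * is defined. */
struct plain_info {
	int x;
} not_a_macro;
```

Both forms compile without a diagnostic, which is why the missing
include went unnoticed.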
Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Fixes: 445c6528b5 ("power: common interface for guest and host")
Cc: stable@dpdk.org
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Fix the resource leak.
Coverity issue: 337668
Fixes: b60fd5f8b1ce8f0a2c ("power: add bit for high frequency cores")
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
As part of the documentation update on the changes made to the power
library for 19.05, information on SST-BF was added. This patch updates
the comment to clarify that a priority core is an SST-BF high
frequency core.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
A previous change removed the limit of 64 cores by
moving away from 64-bit masks to char arrays. However
this left a buffer overrun issue, where the max channels
was defined as 64, and max cores was defined as 256. These
should all be consistently set to RTE_MAX_LCORE.
The #defines being removed are CHANNEL_CMDS_MAX_CPUS,
CHANNEL_CMDS_MAX_CHANNELS, POWER_MGR_MAX_CPUS, and
CHANNEL_CMDS_MAX_VM_CHANNELS, and are being replaced
with RTE_MAX_LCORE for consistency and simplicity.
Coverity issue: 337672, 337673, 337678
Fixes: fd73630e95c1 ("examples/power: change 64-bit masks to arrays")
Cc: stable@dpdk.org
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch will ensure the correct max frequency of a core is set in
the lcore_power_info struct when disabling turbo, while using the
intel pstate driver.
Fixes: e6c6dc0f96c8 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org
Signed-off-by: Lee Daly <lee.daly@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
Set all power environment related function pointers to NULL
when the environment is unset.
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
On an attempt to call set_env in an already initialized state, notify
the user by returning an error indicating that the operation cannot be
performed.
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Due to the lack of thread safety in the existing solution,
use a spinlock for atomic
modification of power environment related data.
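
A sketch of lock-protected environment updates. To keep it
self-contained, a C11 `atomic_flag` spin loop is swapped in for the
library's spinlock, and the enum values and function names are
hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical environment identifiers. */
enum env { ENV_NOT_SET = 0, ENV_ACPI, ENV_KVM_VM };

static atomic_flag env_lock = ATOMIC_FLAG_INIT;
static enum env current_env = ENV_NOT_SET;

static void
env_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&env_lock,
			memory_order_acquire))
		;
}

static void
env_lock_release(void)
{
	atomic_flag_clear_explicit(&env_lock, memory_order_release);
}

/* Returns 0 on success, -1 if an environment is already set. */
static int
set_env(enum env e)
{
	int ret = 0;

	env_lock_acquire();
	if (current_env != ENV_NOT_SET)
		ret = -1;
	else
		current_env = e;
	env_lock_release();
	return ret;
}

static void
unset_env(void)
{
	env_lock_acquire();
	current_env = ENV_NOT_SET;
	env_lock_release();
}
```

Checking and updating the state inside the same critical section is
what prevents two threads from both seeing "not set" and both
proceeding.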
Fixes: 445c6528b5 ("power: common interface for guest and host")
Cc: stable@dpdk.org
Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>