Commit Graph

82 Commits

Author SHA1 Message Date
Stephen Hemminger
cb056611a8 eal: rename lcore master and slave
Replace master lcore with main lcore and
replace slave lcore with worker lcore.

Keep the old functions and macros but mark them as deprecated
for this release.

The "--master-lcore" command line option is also deprecated
and any usage will print a warning and use "--main-lcore"
as replacement.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-10-20 13:17:08 +02:00
Bruce Richardson
63b3907833 build: remove library name from version map file name
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2020-10-19 22:13:59 +02:00
Reshma Pattan
a3f9cca718 power: fix current frequency index
During power initialization the pstate cpufreq API does
not set the initial curr_idx of pstate_power_info
to the index of the current frequency.

Without this the idx is always 0, which causes the
check below to pass and the function to return without setting the
initial min/max frequency to the system max frequency. This leads to
incorrect frequency settings when power_pstate_cpufreq_set_freq()
is called in the apps.

set_freq_internal(struct pstate_power_info *pi, uint32_t idx)
{
...

 /* Check if it is the same as current */
        if (idx == pi->curr_idx)
                return 0;
...
}

scenario 1:
If system has starting scaling min/max: 1000/1000, and want to
set this to 2200/2200, the max frequency gets updated but not min.

scenario 2:
If system has starting scaling min/max: 2200/1000, and want to set
to 2200/2200, the max, min frequency was not updated. Since no change
in max that should be ok, but min was also ignored, which will be fixed
now with the new changes.

Fixes: e6c6dc0f ("power: add p-state driver compatibility")
Cc: stable@dpdk.org

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
2020-10-07 14:51:52 +02:00
Phil Yang
e623c943eb power: use C11 atomics for power state
Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for power in use state update.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Hunt <david.hunt@intel.com>
2020-09-25 15:42:29 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Thomas Monjalon
4f86c0ba19 version: 20.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-08-12 11:32:16 +02:00
Anatoly Burakov
8b7b02f945 power: fix environment detection
Anything coming from sysfs has a newline at the end. Cut it off before
comparing the strings.

Fixes: 20ab67608a ("power: add environment capability probing")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Tested-by: Lihong Ma <lihongx.ma@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-22 01:35:39 +02:00
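Note on 8b7b02f945: the newline issue can be shown without sysfs. The sketch below assumes the value has already been read into a string; toy_strip_newline() and toy_driver_matches() are hypothetical helpers, not library functions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Cut the string at the first newline, if there is one. */
static void toy_strip_newline(char *s)
{
	assert(s != NULL);
	s[strcspn(s, "\n")] = '\0';
}

/* Compare a raw sysfs-style value ("name\n") with an expected name. */
static int toy_driver_matches(const char *raw, const char *expected)
{
	char buf[64];

	assert(raw != NULL && expected != NULL);
	snprintf(buf, sizeof(buf), "%s", raw);
	toy_strip_newline(buf);
	return strcmp(buf, expected) == 0;
}
```

Without the strip step, strcmp("intel_pstate\n", "intel_pstate") is non-zero, so the comparison never matches.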
Anatoly Burakov
20ab67608a power: add environment capability probing
Currently, there is no way to know if the power management env is
supported without trying to initialize it. The init API also does
not distinguish between failure due to some error and failure due to
power management not being available on the platform in the first
place.

Thus, add an API that provides capability of probing support for a
specific power management API.

Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-07-11 13:31:16 +02:00
Thomas Monjalon
f2fc83b40f replace unused attributes
There is a common macro __rte_unused, avoiding warnings,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:30:58 +02:00
Thomas Monjalon
33011cb3df replace always-inline attributes
There is a macro __rte_always_inline, forcing functions to be inlined,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:16:46 +02:00
Pavan Nikhilesh
acec04c4b2 build: disable experimental API check internally
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This change reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-04-14 16:22:34 +02:00
David Hunt
19802aaf7c power: fix error log on guest message polling
Should be passing errno rather than ret, which could be negative.

Coverity issue: 350362
Fixes: 9dc843eb27 ("power: extend guest channel API for reading")
Cc: stable@dpdk.org

Signed-off-by: David Hunt <david.hunt@intel.com>
2019-11-26 00:29:24 +01:00
Mattias Rönnblom
388c4c03ec power: handle frequency increase with turbo disabled
Calling pstate's or acpi's rte_power_freq_up() when on the highest
non-turbo frequency results in an error, if turbo is enabled in the BIOS,
but disabled via the power library.
The error is in the form of a return code and a RTE_LOG() entry
on the ERR level.

According to the API documentation, the frequency is scaled up
"according to the available frequencies". In case turbo is disabled,
that frequency is not available. This patch's rte_power_freq_up()
behaviour is also consistent with how rte_power_freq_max() is
implemented (i.e. the highest non-turbo frequency is set, in case
turbo is disabled).

Fixes: 445c6528b5 ("power: common interface for guest and host")
Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
2019-11-21 00:52:31 +01:00
Pawel Modrak
85ff364f3b build: align symbols with global ABI version
Merge all versions in linker version script files to DPDK_20.0.

This commit was generated by running the following command:

:~/DPDK$ buildtools/update-abi.sh 20.0

Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Anatoly Burakov
fbaf943887 build: remove individual library versions
Since the library versioning for both stable and experimental ABI's is
now managed globally, the LIBABIVER and version variables no longer
serve any useful purpose, and can be removed.

The replacement in Makefiles was done using the following regex:

	^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)?

(LIBABIVER := numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

The replacement for meson files was done using the following regex:

	^(#.*\n)?version\s*=\s*\d+\n(\s*\n)?

(version = numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

[David]: those variables are manually removed for the files:
- drivers/common/qat/Makefile
- lib/librte_eal/meson.build
[David]: the LIBABIVER is restored for the external ethtool example
library.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Kevin Traynor
baf023a8ed lib: fix doxygen typos
Fix these as they are user visible. Found with codespell.

Fixes: af75078fec ("first public release")
Fixes: c2361bab70 ("eal: compute IOVA mode based on PA availability")
Fixes: 0880c40113 ("drivers: advertise kmod dependencies in pmdinfo")
Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")
Fixes: 5a5f3178d4 ("power: return error when environment already set")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-19 22:03:38 +01:00
Kevin Traynor
0411d61fa9 lib: fix log typos
Fix these as they are user visible. Found with codespell.

Fixes: bacaa27540 ("eal: add channel for multi-process communication")
Fixes: f05e26051c ("eal: add IPC asynchronous request")
Fixes: 0cbce3a167 ("vfio: skip DMA map failure if already mapped")
Fixes: 445c6528b5 ("power: common interface for guest and host")
Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Fixes: 8f972312b8 ("vhost: support vhost-user")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-19 22:03:27 +01:00
Marcin Hajkowski
8c00828da8 power: add packet type for capabilities
Add new packet type and commands for capabilities query.

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
2019-10-27 21:12:04 +01:00
Marcin Hajkowski
04a8cb8ee9 power: extend guest channel for frequency query
Extend incoming packet reading API with new packet
type which carries CPU frequencies.

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
2019-10-27 20:57:05 +01:00
Marcin Hajkowski
9dc843eb27 power: extend guest channel API for reading
Add a new experimental API, rte_power_guest_channel_receive_msg,
which makes it possible to receive messages sent to the guest.

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Lee Daly <lee.daly@intel.com>
2019-10-27 19:27:36 +01:00
Marcin Hajkowski
b4b2f84a59 power: fix socket indicator value
Currently 0 is used to indicate a slot that is not connected.
This is inconsistent with the Linux documentation, which identifies 0 as
a valid (connected) slot, so the indicator value is changed.

Fixes: cd0d5547 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-27 19:26:35 +01:00
Bruce Richardson
759a5fb18e lib: add reasons for components being disabled
For each library where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-07-02 23:21:05 +02:00
David Marchand
18218713bf enforce experimental tag at beginning of declarations
Putting a '__attribute__((deprecated))' in the middle of a function
prototype does not result in the expected result with gcc (while clang
is fine with this syntax).

$ cat deprecated.c
void * __attribute__((deprecated)) incorrect() { return 0; }
__attribute__((deprecated)) void *correct(void) { return 0; }
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
$ gcc -o deprecated.o -c deprecated.c
deprecated.c: In function ‘main’:
deprecated.c:3:1: warning: ‘correct’ is deprecated (declared at
deprecated.c:2) [-Wdeprecated-declarations]
 int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
 ^

Move the tag on a separate line and make it the first thing of function
prototypes.
This is not perfect but we will trust reviewers to catch the other not
so easy to detect patterns.

sed -i \
     -e '/^\([^#].*\)\?__rte_experimental */{' \
     -e 's//\1/; s/ *$//; i\' \
     -e __rte_experimental \
     -e '/^$/d}' \
     $(git grep -l __rte_experimental -- '*.h')

Special mention for rte_mbuf_data_addr_default():

There is either a bug or a (not yet understood) issue with gcc.
gcc won't drop this inline when unused and rte_mbuf_data_addr_default()
calls rte_mbuf_buf_addr() which itself is experimental.
This results in a build warning when not accepting experimental apis
from sources just including rte_mbuf.h.

For this specific case, we hide the call to rte_mbuf_buf_addr() under
the ALLOW_EXPERIMENTAL_API flag.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2019-06-29 19:04:48 +02:00
David Marchand
cfe3aeb170 remove experimental tags from all symbol definitions
We had some inconsistencies between functions prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.

git grep -l __rte_experimental |grep \.c$ |while read file; do
	sed -i -e '/^__rte_experimental$/d' $file;
	sed -i -e 's/  *__rte_experimental//' $file;
	sed -i -e 's/__rte_experimental  *//' $file;
done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2019-06-29 19:04:43 +02:00
Mattias Rönnblom
7727ad9107 power: fix cache line alignment
The ACPI and PState CPU frequency scaling drivers used the
__rte_cache_aligned attribute without including rte_memory.h, which
turns what looks like the declaration of a cache line-aligned struct
into a non-aligned struct declaration and the definition of an
instance of the struct.

Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Fixes: 445c6528b5 ("power: common interface for guest and host")
Cc: stable@dpdk.org

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
2019-05-09 21:07:55 +02:00
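Note on 7727ad9107: the effect described above can be reproduced with a stand-in macro. In this sketch toy_cache_aligned is a hypothetical name playing the role of __rte_cache_aligned, and the alignment of 64 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The macro is not defined yet, so toy_cache_aligned is read as a plain
 * identifier: this declares an unaligned struct type AND a variable
 * named toy_cache_aligned.
 */
struct toy_info_before {
	char flag;
} toy_cache_aligned;

/* Compiles only because the name above really is a variable. */
static size_t toy_stray_instance_size(void)
{
	return sizeof(toy_cache_aligned);
}

/* Stand-in for what the missing header would have provided. */
#define toy_cache_aligned __attribute__((aligned(64)))

/* With the macro visible, the same syntax aligns the type instead. */
struct toy_info_after {
	char flag;
} toy_cache_aligned;

static_assert(_Alignof(struct toy_info_after) == 64,
	"attribute applies once the macro is defined");
```

Both declarations compile without a diagnostic, which is why the missing include went unnoticed.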
Liang Ma
3d45c3b0f5 power: fix resource leak
Fix the resource leaking issue

Coverity issue: 337668
Fixes: b60fd5f8b1 ("power: add bit for high frequency cores")

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2019-05-09 21:07:55 +02:00
David Hunt
8255e7c40e power: clarify comment about SST-BF priority core
As part of the documentation update on the changes made to the power
library for 19.05, information on SST-BF was added. This patch updates
the comment to clarify that a priority core is an SST-BF high
frequency core.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2019-05-03 22:15:56 +02:00
John McNamara
8bd5f07c7a doc: fix spelling reported by aspell in comments
Fix spelling errors in the doxygen docs.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
2019-05-03 00:38:14 +02:00
David Hunt
751227a08d power: fix buffer overruns
A previous change removed the limit of 64 cores by
moving away from 64-bit masks to char arrays. However
this left a buffer overrun issue, where the max channels
was defined as 64, and max cores was defined as 256. These
should all be consistently set to RTE_MAX_LCORE.

The #defines being removed are CHANNEL_CMDS_MAX_CPUS,
CHANNEL_CMDS_MAX_CHANNELS, POWER_MGR_MAX_CPUS, and
CHANNEL_CMDS_MAX_VM_CHANNELS, and are being replaced
with RTE_MAX_LCORE for consistency and simplicity.

Coverity issue: 337672, 337673, 337678
Fixes: fd73630e95 ("examples/power: change 64-bit masks to arrays")
Cc: stable@dpdk.org

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-23 00:15:10 +02:00
Lee Daly
a0d15e43e4 power: fix max frequency after turbo disabling
This patch will ensure the correct max frequency of a core is set in
the lcore_power_info struct when disabling turbo, while using the
intel pstate driver.

Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org

Signed-off-by: Lee Daly <lee.daly@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
2019-04-23 00:15:09 +02:00
Marcin Hajkowski
dae173f892 power: reset function pointers on unset environment
Set all power environment related function pointers to NULL
when the environment is unset.

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-22 22:45:51 +02:00
Marcin Hajkowski
5a5f3178d4 power: return error when environment already set
When set_env is attempted in an already initialized state, notify the
user by returning an error indicating that the operation cannot be performed.

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-22 22:44:55 +02:00
Marcin Hajkowski
92433e3473 power: fix thread-safety environment modification
Due to the lack of thread safety in the existing solution,
use a spinlock for atomic
modification of power environment related data.

Fixes: 445c6528b5 ("power: common interface for guest and host")
Cc: stable@dpdk.org

Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-22 22:44:48 +02:00
Bruce Richardson
adf93ca564 build: increase readability via shortcut variables
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-04-17 18:09:52 +02:00
Pallantla Poornima
1ea7c4aefc power: remove unused variable
Variable pfi_str is removed since it is unused.

Fixes: 450f079131 ("power: add traffic pattern aware power control")
Cc: stable@dpdk.org

Signed-off-by: Pallantla Poornima <pallantlax.poornima@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: David Hunt <david.hunt@intel.com>
2019-04-05 10:40:56 +02:00
Bruce Richardson
6723c0fc72 replace snprintf with strlcpy
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed.  The function changes in this patch were
auto-generated via command:

  spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place

and then the files edited using awk to add in the missing header:

  gawk -i inplace '/include <rte_/ && ! seen { \
  	print "#include <rte_string_fns.h>"; seen=1} {print}'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2019-04-04 22:46:05 +02:00
David Hunt
b60fd5f8b1 power: add bit for high frequency cores
This patch adds a new bit in the capabilities mask that's returned by
rte_power_get_capabilities(), allowing application to query which cores
have the higher frequencies, and can then pin the workloads accordingly.

Returned Bits:
 0 - Turbo Boost enabled
 1 - Higher core base_frequency

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-02 02:22:08 +02:00
David Hunt
08a710642d power: fix governor storage to trim newlines
Currently the Power Library stores the governor name with an embedded
newline read from the scaling_governor sysfs file. This patch strips
it out.

Fixes: 445c6528b5 ("power: common interface for guest and host")
Cc: stable@dpdk.org

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-04-01 22:23:44 +02:00
Kevin Traynor
e1e4dafbc7 power: fix frequency list buffer validation
The frequency list buffer was already validated in
power_acpi_cpufreq_freqs(), so the newly added check was redundant.
To keep consistency with power_pstate_cpufreq_freqs(), remove the
original check and update the log message.

Fixes: 2e6ccdb4e0 ("power: fix frequency list to handle null buffer")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2019-03-29 14:58:27 +01:00
Liang Ma
7c06d9258a power: fix file descriptor leak
Coverity issue: 328528
Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Reviewed-by: Lei Yao <lei.a.yao@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
2019-01-17 19:20:02 +01:00
David Hunt
ad514edf71 power: fix frequency list return code
The power_pstate_cpufreq_freqs() function was returning -1 in an
unsigned int, causing buffer over-runs when the results were being
processed. This function should be returning zero for all error
conditions, similar to its ACPI counterpart, power_acpi_cpufreq_freqs().

Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")

Signed-off-by: David Hunt <david.hunt@intel.com>
2019-01-15 02:40:41 +01:00
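Note on ad514edf71: a sketch of why returning -1 from a function with an unsigned return type is dangerous for callers that treat the result as an element count. All names and the frequency table are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frequency table, in kHz. */
static const uint32_t toy_available[] = { 2300000, 2200000, 1000000 };
#define TOY_NUM_AVAILABLE \
	((uint32_t)(sizeof(toy_available) / sizeof(toy_available[0])))

/* Buggy variant: -1 is converted to UINT32_MAX on return. */
static uint32_t toy_freqs_buggy(uint32_t *freqs, uint32_t cap)
{
	if (freqs == NULL || cap < TOY_NUM_AVAILABLE)
		return -1;
	memcpy(freqs, toy_available, sizeof(toy_available));
	return TOY_NUM_AVAILABLE;
}

/* Fixed variant: every error path reports zero entries. */
static uint32_t toy_freqs_fixed(uint32_t *freqs, uint32_t cap)
{
	if (freqs == NULL || cap < TOY_NUM_AVAILABLE)
		return 0;
	memcpy(freqs, toy_available, sizeof(toy_available));
	return TOY_NUM_AVAILABLE;
}

/* A caller that trusts the count; safe only with the fixed variant. */
static uint32_t toy_last_freq(uint32_t *freqs, uint32_t cap)
{
	uint32_t n = toy_freqs_fixed(freqs, cap);

	assert(n <= cap);
	return n ? freqs[n - 1] : 0;
}
```

With the buggy variant the caller would see a count of UINT32_MAX and index far past the end of its buffer.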
David Hunt
2e6ccdb4e0 power: fix frequency list to handle null buffer
This patch fixes a segfault in the case where a null buffer is passed
to the following functions:
   power_acpi_cpufreq_freqs()
   power_pstate_cpufreq_freqs()

Fixes: 445c6528b5 ("power: common interface for guest and host")

Signed-off-by: David Hunt <david.hunt@intel.com>
2019-01-15 02:40:41 +01:00
David Hunt
de394915df power: fix error handling on setting governor
In the power_set_governor_*() functions, we are using fputs() on the /sys
filesystem. However, we also need to call fflush() to ensure that
the write completes successfully. Otherwise the attempt to set the
power governor fails and the function returns as if it has
succeeded. This patch adds an fflush to ensure that the
write succeeds, otherwise returns an error.

Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")

Signed-off-by: David Hunt <david.hunt@intel.com>
2019-01-15 02:40:40 +01:00
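Note on de394915df: a sketch of the write-then-flush pattern described above, using only standard stdio calls. toy_write_value() is a hypothetical helper, not the library function.

```c
#include <assert.h>
#include <stdio.h>

/* Write a value to a file and report failure if any step fails. */
static int toy_write_value(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	/* fputs() may only fill the stdio buffer, so success here proves little. */
	ok = fputs(value, f) >= 0;

	/* fflush() forces the actual write, so a rejected value shows up here. */
	if (fflush(f) != 0)
		ok = 0;
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

On Linux, writing to /dev/full is one way to see the difference: fputs() succeeds while fflush() fails.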
Liang Ma
e6c6dc0f96 power: add p-state driver compatibility
Previously, in order to use the power library, it was necessary
for the user to disable the intel_pstate driver by adding
“intel_pstate=disable” to the kernel command line for the system,
which causes the acpi_cpufreq driver to be loaded in its place.

This patch adds the ability for the power library to use the intel_pstate
driver.

It adds a new suite of functions behind the current power library API,
and will seamlessly set up the user facing API function pointers to
the relevant functions depending on whether the system is running with
acpi_cpufreq kernel driver, intel_pstate kernel driver or in a guest,
using kvm. The library API and ABI are unchanged.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-12-21 01:33:59 +01:00
Thomas Monjalon
c5f21bdae4 fix indentation in symbol maps
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
2018-11-26 20:16:46 +01:00
David Hunt
31259a3376 power: fix traffic aware build
1. %ld to PRId64 for 32-bit builds
2. Fix dependency on librte_timer

Fixes: 450f079131 ("power: add traffic pattern aware power control")

Signed-off-by: David Hunt <david.hunt@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 14:51:36 +02:00
David Hunt
757bf2e7cf lib/power: add changes for host commands/policies
This patch does a couple of things:
  * Adds a new message type for removing policies (PKT_POLICY_REMOVE)
    Used when we want to remove a previously created policy.
  * Adds a core_type bool to the channel packet struct to specify whether
    the type of core we want to control is virtual or physical.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-10-26 10:48:15 +02:00
Liang Ma
450f079131 power: add traffic pattern aware power control
1. Abstract

For packet processing workloads such as DPDK, polling is continuous.
This means CPU cores always show 100% busy independent of how much work
those cores are doing. It is critical to accurately determine how busy
a core is, for the following reasons:

   * No indication of overload conditions.

   * User does not know how much real load is on a system, resulting
     in wasted energy as no power management is utilized.

Compared to the original l3fwd-power design, instead of going to sleep
after detecting an empty poll, the new mechanism just lowers the core
frequency. As a result, the application does not stop polling the device,
which leads to improved handling of bursts of traffic.

When the system becomes busy, the empty poll mechanism can also increase the
core frequency (including turbo) to do best effort for intensive traffic.
This gives us more flexible and balanced traffic awareness over the
standard l3fwd-power application.

2. Proposed solution

The proposed solution focuses on how many times empty polls are executed.
A lower number of empty polls means the current core is busy
processing its workload, so a higher frequency is needed. A high
empty poll number indicates the current core is not doing any real work,
so we can lower the frequency to save power.

In the current implementation, each core has 1 empty-poll counter, which
assumes 1 core is dedicated to 1 queue. This will need to be expanded in the
future to support multiple queues per core.

2.1 Power state definition:

	LOW:  Not currently used, reserved for future use.

	MED:  the frequency is used to process modest traffic workload.

	HIGH: the frequency is used to process busy traffic workload.

2.2 There are two phases to establish the power management system:

	a.Initialization/Training phase. The training phase is necessary
	  in order to figure out the system polling baseline numbers from
	  idle to busy. The highest poll count will be during idle, where
	  all polls are empty. These poll counts will be different between
	  systems due to the many possible processor micro-arch, cache
	  and device configurations, hence the training phase.
	  In the training phase, traffic is blocked so the training
	  algorithm can average the empty-poll numbers for the LOW, MED and
	  HIGH  power states in order to create a baseline.
	  Each core's counters are collected every 10ms, and the training
	  phase takes 2 seconds.
	  Training is disabled in the default configuration, in which case
	  the default parameters are applied. The sample app can still trigger
	  training if needed. Once the training phase has been executed once on
	  a system, the application can then be started with the relevant
	  thresholds provided on the command line, allowing the application
	  to start passing traffic immediately.

	b.Normal phase. Traffic starts immediately based on the default
	  thresholds, or based on the user supplied thresholds via the
	  command line parameters. The run-time poll counts are compared with
	  the baseline and the decision will be taken to move to MED power
	  state or HIGH power state. The counters are calculated every 10ms.

3. Proposed  API

1.  rte_power_empty_poll_stat_init(struct ep_params **eptr,
		uint8_t *freq_tlb, struct ep_policy *policy);
which is used to initialize the power management system.
 
2.  rte_power_empty_poll_stat_free(void);
which is used to free the resources held by the power management system.
 
3.  rte_power_empty_poll_stat_update(unsigned int lcore_id);
which is used to update specific core empty poll counter, not thread safe
 
4.  rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt);
which is used to update specific core valid poll counter, not thread safe
 
5.  rte_power_empty_poll_stat_fetch(unsigned int lcore_id);
which is used to get specific core empty poll counter.
 
6.  rte_power_poll_stat_fetch(unsigned int lcore_id);
which is used to get specific core valid poll counter.

7.  rte_empty_poll_detection(struct rte_timer *tim, void *arg);
which is used to detect empty poll state changes then take action.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Reviewed-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-10-26 01:55:07 +02:00
David Hunt
b89168ef15 examples/vm_power: add branch ratio policy type
Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
2018-07-20 23:59:42 +02:00
Radu Nicolau
185109906b power: add get capabilities API
New API added, rte_power_get_capabilities(), that allows the
application to query the power and performance capabilities
of the CPU cores.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-07-12 19:15:14 +02:00