Commit Graph

39 Commits

Author SHA1 Message Date
Liang Ma
e6c6dc0f96 power: add p-state driver compatibility
Previously, in order to use the power library, it was necessary
for the user to disable the intel_pstate driver by adding
“intel_pstate=disable” to the kernel command line for the system,
which causes the acpi_cpufreq driver to be loaded in its place.

This patch adds the ability for the power library use the intel-pstate
driver.

It adds a new suite of functions behind the current power library API,
and will seamlessly set up the user facing API function pointers to
the relevant functions depending on whether the system is running with
acpi_cpufreq kernel driver, intel_pstate kernel driver or in a guest,
using kvm. The library API and ABI is unchanged.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-12-21 01:33:59 +01:00
Thomas Monjalon
c5f21bdae4 fix indentation in symbol maps
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
2018-11-26 20:16:46 +01:00
David Hunt
31259a3376 power: fix traffic aware build
1. %ld to PRId64 for 32-bit builds
2. Fix dependency on librte_timer

Fixes: 450f079131 ("power: add traffic pattern aware power control")

Signed-off-by: David Hunt <david.hunt@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 14:51:36 +02:00
David Hunt
757bf2e7cf lib/power: add changes for host commands/policies
This patch does a couple of things:
  * Adds a new message type for removing policies (PKT_POLICY_REMOVE)
    Used when we want to remove a previously created policy.
  * Adds a core_type bool to the channel packet struct to specify whether
    the type of core we want to control is virtual or physical.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-10-26 10:48:15 +02:00
Liang Ma
450f079131 power: add traffic pattern aware power control
1. Abstract

For packet processing workloads such as DPDK polling is continuous.
This means CPU cores always show 100% busy independent of how much work
those cores are doing. It is critical to accurately determine how busy
a core is hugely important for the following reasons:

   * No indication of overload conditions.

   * User does not know how much real load is on a system, resulting
     in wasted energy as no power management is utilized.

Compared to the original l3fwd-power design, instead of going to sleep
after detecting an empty poll, the new mechanism just lowers the core
frequency. As a result, the application does not stop polling the device,
which leads to improved handling of bursts of traffic.

When the system become busy, the empty poll mechanism can also increase the
core frequency (including turbo) to do best effort for intensive traffic.
This gives us more flexible and balanced traffic awareness over the
standard l3fwd-power application.

2. Proposed solution

The proposed solution focuses on how many times empty polls are executed.
The less the number of empty polls, means current core is busy with
processing workload, therefore, the higher frequency is needed. The high
empty poll number indicates the current core not doing any real work
therefore, we can lower the frequency to safe power.

In the current implementation, each core has 1 empty-poll counter which
assume 1 core is dedicated to 1 queue. This will need to be expanded in the
future to support multiple queues per core.

2.1 Power state definition:

	LOW:  Not currently used, reserved for future use.

	MED:  the frequency is used to process modest traffic workload.

	HIGH: the frequency is used to process busy traffic workload.

2.2 There are two phases to establish the power management system:

	a.Initialization/Training phase. The training phase is necessary
	  in order to figure out the system polling baseline numbers from
	  idle to busy. The highest poll count will be during idle, where
	  all polls are empty. These poll counts will be different between
	  systems due to the many possible processor micro-arch, cache
	  and device configurations, hence the training phase.
	  In the training phase, traffic is blocked so the training
	  algorithm can average the empty-poll numbers for the LOW, MED and
	  HIGH  power states in order to create a baseline.
	  The core's counter are collected every 10ms, and the Training
	  phase will take 2 seconds.
	  Training is disabled as default configuration. The default
	  parameter is applied. Sample App still can trigger training
	  if that's needed. Once the training phase has been executed once on
	  a system, the application can then be started with the relevant
	  thresholds provided on the command line, allowing the application
	  to start passing start traffic immediately

	b.Normal phase. Traffic starts immediately based on the default
	  thresholds, or based on the user supplied thresholds via the
	  command line parameters. The run-time poll counts are compared with
	  the baseline and the decision will be taken to move to MED power
	  state or HIGH power state. The counters are calculated every 10ms.

3. Proposed  API

1.  rte_power_empty_poll_stat_init(struct ep_params **eptr,
		uint8_t *freq_tlb, struct ep_policy *policy);
which is used to initialize the power management system.
 
2.  rte_power_empty_poll_stat_free(void);
which is used to free the resource hold by power management system.
 
3.  rte_power_empty_poll_stat_update(unsigned int lcore_id);
which is used to update specific core empty poll counter, not thread safe
 
4.  rte_power_poll_stat_update(unsigned int lcore_id, uint8_t nb_pkt);
which is used to update specific core valid poll counter, not thread safe
 
5.  rte_power_empty_poll_stat_fetch(unsigned int lcore_id);
which is used to get specific core empty poll counter.
 
6.  rte_power_poll_stat_fetch(unsigned int lcore_id);
which is used to get specific core valid poll counter.

7.  rte_empty_poll_detection(struct rte_timer *tim, void *arg);
which is used to detect empty poll state changes then take action.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Reviewed-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-10-26 01:55:07 +02:00
David Hunt
b89168ef15 examples/vm_power: add branch ratio policy type
Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
2018-07-20 23:59:42 +02:00
Radu Nicolau
185109906b power: add get capabilities API
New API added, rte_power_get_capabilities(), that allows the
application to query the power and performance capabilities
of the CPU cores.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-07-12 19:15:14 +02:00
Bruce Richardson
6c9457c279 build: replace license text with SPDX tag
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
2018-01-30 21:58:59 +01:00
Bruce Richardson
5b9656b157 lib: build with meson
Add non-EAL libraries to DPDK build. The compat lib is a special case,
along with the previously-added EAL, but all other libs can be build using
the same set of commands, where the individual meson.build files only need
to specify their dependencies, source files, header files and ABI versions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
2018-01-30 17:49:16 +01:00
Marko Kovacevic
5e1f62fb1a power: clean KVM files
rename private header file rte_power_kvm_vm.c
to power_kvm_vm.c. This prevents the private
functions from leaking into the documentation.
Change any private functions from
rte_<function_name> to just <function_name>.
Reserve the rte_ for public functions

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-01-12 00:37:07 +01:00
Marko Kovacevic
d69dc8476d power: clean ACPI files
Rename private header file rte_power_acpi_cpufreq.c
to power_acpi_cpufreq.c.This prevents the private
functions from leaking into the documentation.
Change any private functions from rte_<function_name>
to just <function_name>.Reserve the rte_ for public functions.

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-01-12 00:37:07 +01:00
Marko Kovacevic
08272999bd power: clean common header
Rename private header file rte_power_common.h
to power_common.h to prevent private functions
from leaking into the documentation.

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-01-12 00:37:07 +01:00
Marko Kovacevic
969e82ad81 power: changed unsigned to unsigned int
Since this patch-set attempts to clean up the power library,
and there are many instances of "unsigned" caught by checkpatch,
it was decided to clean these up first rather than have them included
in the later patches in the patch set. And would also minimise this
type of error being caught by checkpatch in the future

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2018-01-12 00:37:07 +01:00
Thierry Herbelot
8f87ba7015 fix typos
Repeated occurrences of 'the'.

The change was obtained using the following command:

  sed -i "s;the the ;the ;" `git grep -l "the "`

Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
2018-01-11 18:26:46 +01:00
Bruce Richardson
369991d997 lib: use SPDX tag for Intel copyright files
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-04 22:41:39 +01:00
Pavel Shirshov
e32cb57973 lib: fix typos
Signed-off-by: Pavel Shirshov <pavel.shirshov@gmail.com>
2017-11-13 06:26:17 +01:00
Olivier Matz
cbc12b0a96 mk: do not generate LDLIBS from directory dependencies
The list of libraries in LDLIBS was generated from the DEPDIRS-xyz
variable. This is valid when the subdirectory name match the library
name, but it's not always the case, especially for PMDs.

The patches removes this feature and explicitly adds the proper
libraries in LDLIBS.

Some DEPDIRS-xyz variables become useless, remove them.

Reported-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
2017-10-24 02:14:57 +02:00
David Hunt
db0dc9b32a power: add send channel msg function to map file
Adding new wrapper function to existing private (but unused 'till now)
function with an rte_power_ prefix.

The plan is to clean up all the header files in the next release so
that only the intended public functions are in the map file and only
the relevant headers have the rte_ prefix so that only they are
included in the documentation.

Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 00:46:11 +01:00
David Hunt
a8cb5f6571 power: add extra msg type for policies
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 00:42:50 +01:00
David Hunt
3b3db6b8bc power: add turbo functions to map file
Fixes: 94608a0f7f ("power: add per-core turbo boost API")

Signed-off-by: David Hunt <david.hunt@intel.com>
2017-10-03 10:46:12 +02:00
David Hunt
94608a0f7f power: add per-core turbo boost API
Adds a new set of APIs to allow per-core turbo
enable-disable.

Signed-off-by: David Hunt <david.hunt@intel.com>
2017-09-22 16:35:12 +02:00
Olivier Matz
feb9f680cd mk: optimize directory dependencies
Before this patch, the management of dependencies between directories
had several issues:

- the generation of .depdirs, done at configuration is slow: it can take
  more than one minute on some slow targets (usually ~10s on a standard
  PC without -j).

- for instance, it is possible to express a dependency like:
  - app/foo depends on lib/librte_foo
  - and lib/librte_foo depends on app/bar
  But this won't work because the directories are traversed with a
  depth-first algorithm, so we have to choose between doing 'app' before
  or after 'lib'.

- the script depdirs-rule.sh is too complex.

- we cannot use "make -d" for debug, because the output of make is used for
  the generation of .depdirs.

This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.

After this commit, "make config" is almost immediate.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-03-27 23:28:43 +02:00
Marvin Liu
fcccf6f1d5 examples/vm_power_manager: remove dependency on internal header
Macro CHANNEL_CMDS_MAX_CPUS stand for the maximum number of cores
controlled by virtual channels. This macro only be used in the example,
so remove it from library to example header file.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
2016-07-11 17:23:32 +02:00
Daniel Mrzyglod
f87a9ddc4a power: fix error messages
Function strerror(errno) has built strings only for non-negative errno values.
for negative values of errno it describe error as "Unknown error -errno"
to be more descriptive i put string "channel not found" taken from header.

The negative argument will be interpreted as a very large unsigned value.

Coverity issue: 13266
Coverity issue: 13269
Fixes: 445c6528b5 ("power: common interface for guest and host")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
2016-05-16 14:17:41 +02:00
Thomas Monjalon
50810f095a config: remove useless explicit includes of generated header
The file rte_config.h is automatically generated and included.
No need to #include it.

The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-02-10 22:43:38 +01:00
Aaron Conole
2fa09f338b power: remove duplicate definition
The CHANNEL_CMDS_MAX_VM_CHANNELS is duplicated in the channel_commands
header file. This commit removes that duplication.

Fixes: 210c383e24 ("power: packet format for vm power management")

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-12-07 04:57:16 +01:00
Thomas Monjalon
9e46f6c5d8 doc: fix doxygen warnings
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-06-19 12:11:53 +02:00
Neil Horman
133b75923b mk: add library version extension
To differentiate libraries that break ABI, we add a library version number
suffix to the library, which must be incremented when a given libraries ABI is
broken.  This patch enforces that addition, sets the initial abi soname
extension to 1 for each library and creates a symlink to the base SONAME so that
the test applications will link properly.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2015-02-03 16:56:58 +01:00
Neil Horman
9d41beed24 lib: provide initial versioning
Add linker version script files to each DPDK library to put a stake in the
ground from which we can start cleaning up API's

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2015-02-03 16:56:58 +01:00
Pablo de Lara
d10096c692 power: fix link with application
rte_power_freq_min function did not include "extern" keyword,
causing linking errors.

Fixes: 445c6528b5 ("power: common interface for guest and host")

Reported-by: Ildar Mustafin <imustafin@bk.ru>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:30:36 +01:00
Alan Carew
75bf9c8e82 power: integration of vm power management
librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
210c383e24 power: packet format for vm power management
Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
445c6528b5 power: common interface for guest and host
Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call rte_power_set_env() can explicidly set the environment, if not
called then auto-detection takes place.

rte_power_kvm_vm is subset of the librte_power APIs, the following is supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other unsupported APIs return -ENOTSUP

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
cd0d5547e8 power: vm communication channels in guest
Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host based monitor. The packet formatted is
specified in channel_commands.h
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num> where each lcore
in a DPDK application has exclusive to a device/channel.
Each channel is opened in non-blocking mode, after a successful open a test
packet is send to the host to ensure the host side is monitoring.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Stephen Hemminger
6f41fe75e2 eal: deprecate rte_snprintf
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.

Leave the tests for the deprecated function in place.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-27 02:31:24 +02:00
Bruce Richardson
3031749c2d remove trailing whitespaces
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-11 00:29:34 +02:00
Bruce Richardson
e9d48c0072 update Intel copyright years to 2014
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:14 +01:00
Intel
1c1d4d7a92 doc: whitespace changes in licenses
Signed-off-by: Intel
2013-10-09 14:51:55 +02:00
Intel
d7937e2e3d power: initial import
Signed-off-by: Intel
2013-09-17 14:09:21 +02:00