This patch adds a fifo channel to the vm_power_manager app through which
we can send commands and polices. Intended for sending JSON strings.
The fifo is at /tmp/powermonitor/fifo
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The changes here are minimal, as the guest app functionality is not
changing at all, but there is a new element in the channel_packet
struct that needs to have a default set (channel_packet->core_type).
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Previously the vm_power_manager app required to have some vms defined, so
the call to get_all_vm() always set the noVms variable. Now we're accepting
policies from the host OS (without any VMs defined), so it is now valid to
have zero VMs. This patch initialises the relevant variables to zero just
in case the call to get_all_vms() does not find any, so could return with
the variables uninitialised.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Allow vm_power_manager to run without requiring qemu to be present
on the machine. This will be required for instances where the JSON
interface is used for commands and polices, without any VMs present.
A use case for this is a container enviromnent.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
For different workloads and poll loops, the theshold
may be different for when you want to scale up and down.
This patch allows changing of the default branch ratio
by using the -b command line argument (or --branch-ratio=)
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add new command line arguments to the guest app to make
testing and validation of the policy usage easier.
These arguments are mainly around setting up the power
management policy that is sent from the guest vm to
to the vm_power_manager in the host
New command line parameters:
-n or --vm-name
sets the name of the vm to be used by the host OS.
-b or --busy-hours
sets the list of hours that are predicted to be busy
-q or --quiet-hours
sets the list of hours that are predicted to be quiet
-l or --vcpu-list
sets the list of vcpus to monitor
-p or --port-list
sets the list of posts to monitor when using a
workload policy.
-o or --policy
sets the default policy type
TIME
WORKLOAD
TRAFFIC
BRANCH_RATIO
The format of the hours or list paramers is a comma-separated
list of integers, which can take the form of
a. x e.g. --vcpu-list=1
b. x,y e.g. --quiet-hours=3,4
c. x-y e.g. --busy-hours=9-12
d. combination of above (e.g. --busy-hours=4,5-7,9)
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Change the app to now require three cores, as the third core
will be used to run the oob montoring thread.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit on
64 cores, allocing the aray at run-time based on the number of
cores found in the system.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
This patch introduces the out-of-band (oob) core monitoring
functions.
The functions are similar to the channel manager functions.
There are function to add and remove cores from the
list of cores being monitored. There is a function to initialise
the monitor setup, run the monitor thread, and exit the monitor.
The monitor thread runs in it's own lcore, and is separate
functionality to the channel monitor which is epoll based.
THis thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).
The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
The makefile has a switch for x86 and non-x86 platforms,
and compiles stub function for non-x86 platforms.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add in the '-l' command line parameter (also --core-list)
So the user can now pass --corelist=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).
This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.
Also , as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
For reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
The basic operations for ports enumeration should not be
considered as experimental in DPDK 18.05.
The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
It uses the function the rte_eth_find_next_owned_by() to get
only ownerless ports. Its API can be considered stable.
So the flag experimental is removed from rte_eth_find_next_owned_by().
The flag experimental is removed from rte_eth_dev_count_avail()
which is the new name of the old function rte_eth_dev_count().
The flag experimental is set to rte_eth_dev_count_total()
in the .c file for consistency with the declaration in the .h file.
A lot of internal applications are fixed to not allow experimental API.
Fixes: 8728ccf376 ("fix ethdev ports enumeration")
Fixes: d9a42a69fe ("ethdev: deprecate port count function")
Fixes: e70e26861e ("net/mvpp2: fix build")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: David Marchand <david.marchand@6wind.com>
A number of example apps are not supported by the meson build system yet,
but to allow future testing with "-Dexamples=all" we add in a placeholder
meson.build file indicating that the apps should not be built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application iterates over ports by its own mean.
The most common pattern is to request the port count and
assume ports with index in the range [0..count[ can be used.
In order to fix this common mistake in all external applications,
the function rte_eth_dev_count is deprecated, while introducing
the new functions rte_eth_dev_count_avail and rte_eth_dev_count_total.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application assume a valid port index is in the range [0..count[.
There are three consequences when using such wrong design:
- new ports having an index higher than the port count won't be valid
- old ports being detached (RTE_ETH_DEV_UNUSED) can be valid
Such mistake will be less common with growing hotplug awareness.
All applications and examples inside this repository - except testpmd -
must be fixed to use the function rte_eth_dev_is_valid_port.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application iterates over ports by its own mean.
The most common pattern is to request the port count and
assume ports with index in the range [0..count[ can be used.
There are three consequences when using such wrong design:
- new ports having an index higher than the port count won't be seen
- old ports being detached (RTE_ETH_DEV_UNUSED) can be seen as ghosts
- failsafe sub-devices (RTE_ETH_DEV_DEFERRED) will be seen by the application
Such mistake will be less common with growing hotplug awareness.
All applications and examples inside this repository - except testpmd -
must be fixed to use the iterator RTE_ETH_FOREACH_DEV.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Current code only sets mac address of first VF. Fix code so that it
continues through the loop and sets the mac address of each VF.
Fixes: c9a4779135 ("examples/vm_power_mgr: set MAC address of VF")
Cc: stable@dpdk.org
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Increase the default RX/TX ring sizes to 1024/1024 to
accommodate for NICs with higher throughput (25G, 40G etc)
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Change the example app Makefiles to query if DPDK is installed and
registered using pkg-config. If so, build directly using pkg-config info,
otherwise fall back to using the original build system with RTE_SDK and
RTE_TARGET
This commit changes the makefiles for the basic examples, i.e. those which
do not have multiple subdirectories underneath the main examples dir.
Examples not covered are:
* ethtool
* multi_process
* performance-thread
* quota_watermark
* netmap_compat
* server_node_efd
* vm_power_manager
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
Reorder the text in the makefiles, so that the app name and the source
files are listed first. This then will allow them to be shared later in a
combined makefile building with pkg-config and RTE_SDK-based build system.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
A delay in the loop waiting for the virtio-serial commands in the
vm_power_manager app was causing a lag in the response time. This was
set to 1 second, and has now been changed to 1ms.
Fixes: f14791a812 ("examples/vm_power_mgr: add policy to channels")
Signed-off-by: David Hunt <david.hunt@intel.com>
Remove variable declaration from within for loop.
Fixes: f14791a812 ("examples/vm_power_mgr: add policy to channels")
Signed-off-by: David Hunt <david.hunt@intel.com>
We need to set vf mac from the host, so that they will be in sync on the
guest and the host. Otherwise, we'll have a random mac on the guest, and
a 00:00:00:00:00:00 mac on the host.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Here we're adding an example of setting up a policy, and allowing the
vm_cli_guest app to send it to the host using the cli command
"send_policy now"
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
We need to initialise the port's we're monitoring to be able to see
the throughput.
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add extra commands to guest cli to allow enable/disable of
per-core turbo. Includes messages to vm_power_mgr in host.
Signed-off-by: David Hunt <david.hunt@intel.com>
Add extra commands to command line to allow enable/disable of
per-core turbo.
When a core has turbo enabled, calling for max frequency will allow it to
go to a turbo frequency (P0n).
When a core has turbo disabled, calling for max frequency will allow it to
go to the maximum non-turbo frequency (P1), but not beyond.
Signed-off-by: David Hunt <david.hunt@intel.com>
Macro CHANNEL_CMDS_MAX_CPUS stand for the maximum number of cores
controlled by virtual channels. This macro only be used in the example,
so remove it from library to example header file.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
In add_vm: The string buffer may not have a null terminator if the source
string's length is equal to the buffer size
Coverity issue: 30691
Fixes: e8ae9b6625 ("examples/vm_power: channel manager and monitor in host")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
vm_power_manager utilize libvirt API virDomainGetVcpuPinInfo to
retrieve domU vcpu information. This API is implemented from version 0.9.3.
Suse11 SP3 32bit default libvirt version is 0.8.8.
examples/vm_power_manager/channel_manager.c:
channel_manager.c:117:3: error: implicit declaration of function
'virDomainGetVcpuPinInfo'
Check and skip it from examples or raise an error when trying to compile
without libvirt or with a too old libvirt.
Fixes: e8ae9b662 ("examples/vm_power: channel manager and monitor in host")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The file rte_config.h is automatically generated and included.
No need to #include it.
The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
virNodeGetCPUMap introduced in libvirt 1.0. In some linux distributions
like Ubuntu12/14 and Fedora18, libvirt version is older than 1.0. So this
sample will not build pass.
Replace "virNodeGetCPUMap" with another libvirt API "virNodeGetInfo".
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Host cpu mapping structure can only support 64 cpus. When run vm_power sample
on platform with more than 64 cpus, will generate improper physical core mask.
After limited supported host cpus to 64 will fix this issue.
Fixes: e9f64db946 ("examples/vm_power: show warning when more than 64 cores")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
When using VM power manager app on systems with more than 64 cores,
app could not run even though user does not use cores 64 or higher.
The problem happens only in that case, in which case it will result
in an undefined behaviour.
Thefere, this patch allows the user to run the app on a system with more
than 64 cores, warning the user not to use cores higher than 64 in the VM(s).
Add new known issue where VM power manager app may not work
in a system with more than 64 cores, in release notes.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
rte_free handles getting passed a NULL pointer.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Fix a typo: cmdline_parse_token_string_t was used in place of
cmdline_parse_num_string_t.
Seen with clang-3.5.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The check for NULL is in the wrong position in the "if" error leg. The
pointer should be checked for NULL before checking what the value of
what the pointer points to is.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The length of the path to a unix socket is not PATH_MAX but instead is
UNIX_PATH_MAX which is generally just over 100 bytes in size. It's not
actually defined in sys/un.h on linux - despite the man page referencing
it, so calculate the size in the case where it's not defined.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]