A previous change removed the limit of 64 cores by
moving away from 64-bit masks to char arrays. However
this left a buffer overrun issue, where the max channels
was defined as 64, and max cores was defined as 256. These
should all be consistently set to RTE_MAX_LCORE.
The #defines being removed are CHANNEL_CMDS_MAX_CPUS,
CHANNEL_CMDS_MAX_CHANNELS, POWER_MGR_MAX_CPUS, and
CHANNEL_CMDS_MAX_VM_CHANNELS, and are being replaced
with RTE_MAX_LCORE for consistency and simplicity.
Coverity issue: 337672, 337673, 337678
Fixes: fd73630e95c1 ("examples/power: change 64-bit masks to arrays")
Cc: stable@dpdk.org
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Removed dependency to mac_list from policies:
* BRANCH_RATIO,
* WORKLOAD,
* TIME
in function update_policy.
Fixes: 1b897991473f ("power: update error handling")
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Tested-by: Yufeng Mo <yufengx.mo@intel.com>
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed. The function changes in this patch were
auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place
and then the files edited using awk to add in the missing header:
gawk -i inplace '/include <rte_/ && ! seen { \
print "#include <rte_string_fns.h>"; seen=1} {print}'
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Most examples have in their makefiles a default RTE_TARGET directory to be
used in case RTE_TARGET is not set. Rather than just using a hard-coded
default, we can instead detect what the build directory is relative to
RTE_SDK directory.
This fixes a potential issue for anyone who continues to build using
"make install T=x86_64-native-linuxapp-gcc" and skips setting RTE_TARGET
explicitly, instead relying on the fact that they were building in a
directory which corresponded to the example default path - which was
changed to "x86_64-native-linux-gcc" by commit 218c4e68c1d9 ("mk: use
linux and freebsd in config names").
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Update for handling negative returned status from functions
call.
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Rather than using linuxapp and bsdapp everywhere, we can change things to
use the, more readable, terms "linux" and "freebsd" in our build configs.
Rather than renaming the configs we can just duplicate the existing ones
with the new names using symlinks, and use the new names exclusively
internally. ["make showconfigs" also only shows the new names to keep the
list short] The result is that backward compatibility is kept fully but any
new builds or development can be done using the newer names, i.e. both
"make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc"
work.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch fixes a bug introduced in the 64-core limitation
enhancement where the core_id is inadvertently converted from
virtual to physical even though it may already be a physical
core_id.
We should be using the core_type field, and only converting via
hypervisor when core_type is set to CORE_TYPE_VIRTUAL
Fixes: 5776b7a371d1 ("examples/power: allow VM to use lcores over 63")
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Lei Yao <lei.a.yao@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
The vm_power_manager starts by setting the environment to acpi
using rte_power_set_env(PM_ENV_ACPI_CPUFREQ). This causes a problem
starting vm_power_manager when the system is using the intel_pstate
driver. The env should be set to none, or not called at all, because
the library now auto-detects the environment to be either acpi or
intel_pstate. This patch sets the environment to none so that the
library can successfully auto-detect.
Fixes: e6c6dc0f96c8 ("power: add p-state driver compatibility")
Signed-off-by: David Hunt <david.hunt@intel.com>
Increase the number of addressable cores from 64 to 256. Also remove the
warning that incresing this number beyond 64 will cause problems (because
of the previous use of uint64_t masks). Now this number can be increased
significantly without causing problems.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Extending the functionality to allow vms to power manage cores beyond 63.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Since we're moving to allowing greater than 64 cores, the mask functions
that use uint64_t to perform functions on a masked set of cores are no
longer needed, so removing them.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
vm_power_manager currently makes use of uint64_t masks to keep track of
cores in use, limiting use of the app to only being able to manage the
first 64 cores in a multi-core system. Many modern systems have core
counts greater than 64, so this limitation needs to be removed.
This patch converts the relevant 64-bit masks to character arrays.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
The vm_power_manager app was not respecting the POWER_MGR_MAX_CPUS
during initialisation, so if there were more CPUs than this value (64),
it would lead to buffer overruns of there were more then 64 cores in
the system.
Added in a check during init and un-init to only initialise up to
lcore_id 63.
This raises the question as to why not simply increase the value of
POWER_MGR_MAX_CPUS. Well, it's not that simple, as many of the APIs take
a uint64_t as a parameter for the core mask, and this will not work for
cores greater than 63. So some work needs to be done in the future to
remove this limitation. For now we'll fix the memory corruption.
Also, the patch that this fixes says "allow greater than 64 cores" but
that's not across the entire application, it's only for the out-of-band
monitoring. I'll add a notice for an API change in the next release to
clean this up, i.e. depricate any API calls that use masks.
Fixes: 6453b9284b64 ("examples/vm_power: allow greater than 64 cores")
Cc: stable@dpdk.org
Signed-off-by: David Hunt <david.hunt@intel.com>
Add meson.build in vm_power_manager and the guest_cli subdirectory.
Building can be achieved by going to the build directory, and using
meson configure -Dexamples=vm_power_manager,vm_power_manager/guest_cli
Then, when ninja is invoked, it will build dpdk-vm_power_manger and
dpdk-guest_cli
Work still needs to be done on the meson build system to handles the case
where the target list of example apps is defined as 'all'. That will come
in a future patch.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some messages appearing several times a second, removing as they are
unnecessary. Other less severe messages change from INFO to DEBUG
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add JSON string handling to vm_power_manager for JSON strings received
through the fifo. The format of the JSON strings are detailed in the
next patch, the vm_power_manager user guide documentation updates.
This patch introduces a new dependency on Jansson, a C library for
encoding, decoding and manipulating JSON data. To compile the sample app
you now need to have installed libjansson4 and libjansson-dev (these may
be named slightly differently depending on your Operating System)
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Now that we're handling host policies, containers and virtual machines,
we'll rename MAX_VMS to MAX_CLIENTS, and increase from 4 to 64
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch adds a fifo channel to the vm_power_manager app through which
we can send commands and polices. Intended for sending JSON strings.
The fifo is at /tmp/powermonitor/fifo
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The changes here are minimal, as the guest app functionality is not
changing at all, but there is a new element in the channel_packet
struct that needs to have a default set (channel_packet->core_type).
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Previously the vm_power_manager app required to have some vms defined, so
the call to get_all_vm() always set the noVms variable. Now we're accepting
policies from the host OS (without any VMs defined), so it is now valid to
have zero VMs. This patch initialises the relevant variables to zero just
in case the call to get_all_vms() does not find any, so could return with
the variables uninitialised.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Allow vm_power_manager to run without requiring qemu to be present
on the machine. This will be required for instances where the JSON
interface is used for commands and polices, without any VMs present.
A use case for this is a container enviromnent.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
For different workloads and poll loops, the theshold
may be different for when you want to scale up and down.
This patch allows changing of the default branch ratio
by using the -b command line argument (or --branch-ratio=)
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add new command line arguments to the guest app to make
testing and validation of the policy usage easier.
These arguments are mainly around setting up the power
management policy that is sent from the guest vm to
to the vm_power_manager in the host
New command line parameters:
-n or --vm-name
sets the name of the vm to be used by the host OS.
-b or --busy-hours
sets the list of hours that are predicted to be busy
-q or --quiet-hours
sets the list of hours that are predicted to be quiet
-l or --vcpu-list
sets the list of vcpus to monitor
-p or --port-list
sets the list of posts to monitor when using a
workload policy.
-o or --policy
sets the default policy type
TIME
WORKLOAD
TRAFFIC
BRANCH_RATIO
The format of the hours or list paramers is a comma-separated
list of integers, which can take the form of
a. x e.g. --vcpu-list=1
b. x,y e.g. --quiet-hours=3,4
c. x-y e.g. --busy-hours=9-12
d. combination of above (e.g. --busy-hours=4,5-7,9)
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add the capability for the vm_power_manager to receive
a policy of type BRANCH_RATIO. This will add any vcpus
in the policy to the oob monitoring thread.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Change the app to now require three cores, as the third core
will be used to run the oob montoring thread.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
To facilitate more info per core, change the global_cpu_mask
from a uint64_t to an array. This also removes the limit on
64 cores, allocing the aray at run-time based on the number of
cores found in the system.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
This patch introduces the out-of-band (oob) core monitoring
functions.
The functions are similar to the channel manager functions.
There are function to add and remove cores from the
list of cores being monitored. There is a function to initialise
the monitor setup, run the monitor thread, and exit the monitor.
The monitor thread runs in it's own lcore, and is separate
functionality to the channel monitor which is epoll based.
THis thread is timer based. It loops through all monitored cores,
calculates the branch ratio, scales up or down the core, then
sleeps for an interval (~250 uS).
The method it uses to read the branch counters is a pread on the
/dev/cpu/x/msr file, so the 'msr' kernel module needs to be loaded.
Also, since the msr.h file has been made unavailable in recent
kernels, we have #defines for the relevant MSRs included in the
code.
The makefile has a switch for x86 and non-x86 platforms,
and compiles stub function for non-x86 platforms.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
Add in the '-l' command line parameter (also --core-list)
So the user can now pass --corelist=4,6,8-10 and it will
expand out to 4,6,8,9,10 using the parse function provided
in parse.c (parse_set).
This list of cores is then used to enable out-of-band monitoring
to scale up and down these cores based on the ratio of branch
hits versus branch misses. The ratio will be low when a poll
loop is spinning with no packets being received, so the frequency
will be scaled down.
Also , as part of this change, we introduce a core_info struct
which keeps information on each core in the system, and whether
we're doing out of band monitoring on them.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
If we don't pass any ports to the app, we don't need to create
any mempools, and we don't need to init any ports.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b717d ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc20 ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
For reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
The basic operations for ports enumeration should not be
considered as experimental in DPDK 18.05.
The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
It uses the function the rte_eth_find_next_owned_by() to get
only ownerless ports. Its API can be considered stable.
So the flag experimental is removed from rte_eth_find_next_owned_by().
The flag experimental is removed from rte_eth_dev_count_avail()
which is the new name of the old function rte_eth_dev_count().
The flag experimental is set to rte_eth_dev_count_total()
in the .c file for consistency with the declaration in the .h file.
A lot of internal applications are fixed to not allow experimental API.
Fixes: 8728ccf37615 ("fix ethdev ports enumeration")
Fixes: d9a42a69febf ("ethdev: deprecate port count function")
Fixes: e70e26861eaf ("net/mvpp2: fix build")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: David Marchand <david.marchand@6wind.com>
A number of example apps are not supported by the meson build system yet,
but to allow future testing with "-Dexamples=all" we add in a placeholder
meson.build file indicating that the apps should not be built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application iterates over ports by its own mean.
The most common pattern is to request the port count and
assume ports with index in the range [0..count[ can be used.
In order to fix this common mistake in all external applications,
the function rte_eth_dev_count is deprecated, while introducing
the new functions rte_eth_dev_count_avail and rte_eth_dev_count_total.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application assume a valid port index is in the range [0..count[.
There are three consequences when using such wrong design:
- new ports having an index higher than the port count won't be valid
- old ports being detached (RTE_ETH_DEV_UNUSED) can be valid
Such mistake will be less common with growing hotplug awareness.
All applications and examples inside this repository - except testpmd -
must be fixed to use the function rte_eth_dev_is_valid_port.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Some DPDK applications wrongly assume these requirements:
- no hotplug, i.e. ports are never detached
- all allocated ports are available to the application
Such application iterates over ports by its own mean.
The most common pattern is to request the port count and
assume ports with index in the range [0..count[ can be used.
There are three consequences when using such wrong design:
- new ports having an index higher than the port count won't be seen
- old ports being detached (RTE_ETH_DEV_UNUSED) can be seen as ghosts
- failsafe sub-devices (RTE_ETH_DEV_DEFERRED) will be seen by the application
Such mistake will be less common with growing hotplug awareness.
All applications and examples inside this repository - except testpmd -
must be fixed to use the iterator RTE_ETH_FOREACH_DEV.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Current code only sets mac address of first VF. Fix code so that it
continues through the loop and sets the mac address of each VF.
Fixes: c9a4779135c9 ("examples/vm_power_mgr: set MAC address of VF")
Cc: stable@dpdk.org
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Increase the default RX/TX ring sizes to 1024/1024 to
accommodate for NICs with higher throughput (25G, 40G etc)
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Change the example app Makefiles to query if DPDK is installed and
registered using pkg-config. If so, build directly using pkg-config info,
otherwise fall back to using the original build system with RTE_SDK and
RTE_TARGET
This commit changes the makefiles for the basic examples, i.e. those which
do not have multiple subdirectories underneath the main examples dir.
Examples not covered are:
* ethtool
* multi_process
* performance-thread
* quota_watermark
* netmap_compat
* server_node_efd
* vm_power_manager
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
Reorder the text in the makefiles, so that the app name and the source
files are listed first. This then will allow them to be shared later in a
combined makefile building with pkg-config and RTE_SDK-based build system.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
A delay in the loop waiting for the virtio-serial commands in the
vm_power_manager app was causing a lag in the response time. This was
set to 1 second, and has now been changed to 1ms.
Fixes: f14791a8126e ("examples/vm_power_mgr: add policy to channels")
Signed-off-by: David Hunt <david.hunt@intel.com>
Remove variable declaration from within for loop.
Fixes: f14791a8126e ("examples/vm_power_mgr: add policy to channels")
Signed-off-by: David Hunt <david.hunt@intel.com>
We need to set vf mac from the host, so that they will be in sync on the
guest and the host. Otherwise, we'll have a random mac on the guest, and
a 00:00:00:00:00:00 mac on the host.
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Here we're adding an example of setting up a policy, and allowing the
vm_cli_guest app to send it to the host using the cli command
"send_policy now"
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
We need to initialise the port's we're monitoring to be able to see
the throughput.
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>