numam-dpdk

Author	SHA1	Message	Date
Mattias Rönnblom	388c4c03ec	power: handle frequency increase with turbo disabled Calling pstate's or acpi's rte_power_freq_up() when on the highest non-turbo frequency results in an error, if turbo is enabled in the BIOS, but disabled via the power library. The error is in the form of a return code and a RTE_LOG() entry on the ERR level. According to the API documentation, the frequency is scaled up "according to the available frequencies". In case turbo is disabled, that frequency is not available. This patch's rte_power_freq_up() behaviour is also consistent with how rte_power_freq_max() is implemented (i.e. the highest non-turbo frequency is set, in case turbo is disabled). Fixes: `445c6528b5` ("power: common interface for guest and host") Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Cc: stable@dpdk.org Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Reviewed-by: Liang Ma <liang.j.ma@intel.com>	2019-11-21 00:52:31 +01:00
Ferruh Yigit	d965af9e8a	kni: increase kernel version requirement for VA A build error was reported related to the selected 'get_user_pages_remote()' kernel API: .../kernel/linux/kni/kni_dev.h:113:8: error: too few arguments to function ‘get_user_pages_remote’ ret = get_user_pages_remote(tsk, tsk->mm, iova, 1 ^~~~~~~~~~~~~~~~~~~~~ Currently three versions of 'get_user_pages_remote()' are supported, based on kernel version < 4.9, = 4.9, > 4.9. These version based checks do not work well with distro kernels, which is the cause of the reported build error. The error was reported with kernel version 4.8, but that kernel uses the API defined in > 4.9. To take control of this, and possibly more, related build errors, increase the minimum supported kernel version for iova=va with KNI to kernel version 4.9. This leaves us with a single version of the kernel API, which is more manageable. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-21 00:18:02 +01:00
Ruifeng Wang	bb2a9973c0	bpf/arm: fix clang build Clang has different prototype for __builtin___clear_cache(). It requires 'char *' parameters while gcc requires 'void *'. Clang version 8.0 was used. Warning messages during build: ../lib/librte_bpf/bpf_jit_arm64.c:1438:26: warning: incompatible pointer types passing 'uint32_t *' (aka 'unsigned int *') to parameter of type 'char *' [-Wincompatible-pointer-types] __builtin___clear_cache(ctx.ins, ctx.ins + ctx.idx); ^~~~~~~ ../lib/librte_bpf/bpf_jit_arm64.c:1438:35: warning: incompatible pointer types passing 'uint32_t *' (aka 'unsigned int *') to parameter of type 'char *' [-Wincompatible-pointer-types] __builtin___clear_cache(ctx.ins, ctx.ins + ctx.idx); ^~~~~~~~~~~~~~~~~ Fixes: `f3e5167724` ("bpf/arm: add prologue and epilogue") Cc: jerinj@marvell.com Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-11-21 00:30:39 +01:00
Pawel Modrak	85ff364f3b	build: align symbols with global ABI version Merge all versions in linker version script files to DPDK_20.0. This commit was generated by running the following command: :~/DPDK$ buildtools/update-abi.sh 20.0 Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	4ab92c53ed	distributor: rename v2.0 ABI to _single suffix The original ABI versioning was slightly misleading in that the DPDK 2.0 ABI was really a single mode for the distributor, and is used as such throughout the distributor code. Fix this by renaming all _v20 API's to _single API's, and remove symbol versioning. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	6e5b516761	distributor: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	c381a8d554	lpm: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	f2fb215843	timer: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Anatoly Burakov	fbaf943887	build: remove individual library versions Since the library versioning for both stable and experimental ABI's is now managed globally, the LIBABIVER and version variables no longer serve any useful purpose, and can be removed. The replacement in Makefiles was done using the following regex: ^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)? (LIBABIVER := numbers, optionally preceded by a comment and optionally succeeded by an empty line) The replacement for meson files was done using the following regex: ^(#.*\n)?version\s*=\s*\d+\n(\s*\n)? (version = numbers, optionally preceded by a comment and optionally succeeded by an empty line) [David]: those variables are manually removed for the files: - drivers/common/qat/Makefile - lib/librte_eal/meson.build [David]: the LIBABIVER is restored for the external ethtool example library. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	cba806e07d	build: change ABI versioning to global As per new ABI policy [1], all of the libraries are now versioned using one global ABI version. Stable libraries use the MAJOR.MINOR ABI version for their shared objects, while experimental libraries use the 0.MAJORMINOR convention for their versioning. Experimental library versioning is managed globally. Changes in this patch implement the necessary steps to enable that. The CONFIG_RTE_MAJOR_ABI option was introduced to permit multiple DPDK versions installed side by side. The problem is now addressed through the new ABI policy, and thus can be removed. [David] For external libraries relying on Makefile, LIBABIVER is preserved to avoid using DPDK global ABI version. [1] https://doc.dpdk.org/guides/contributing/abi_policy.html Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Andrew Rybchenko	6e18704b7e	ethdev: avoid undefined behaviour on configuration copy memcpy() source and destination areas must not overlap and equal pointers is the case which is really met, so handle it. Fixes: `68b931bff2` ("ethdev: eliminate interim variable") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
Jakub Grajciar	43b815d881	net/memif: support zero-copy slave Zero-copy slave support for memif PMD. Slave interface exposes DPDK memory to master interface. Only single file segments are supported (EAL option --single-file-segments). Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
Stephen Hemminger	6c373e935a	net: constify pointer to IPv6 header The function rte_ipv6_get_next_ext does not modify the header that is passed in. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
Andrew Rybchenko	5031897506	ethdev: improve message about not disabled offload Avoid use of "failed" in the message about not requested but enabled offload, since it is not a failure. Fixes: `1daa338058` ("ethdev: validate offloads set by PMD") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-11-20 17:36:06 +01:00
Andrew Rybchenko	9e954d3194	ethdev: decrease verbosity of not disabled offload logs Right now a PMD decides if it is critical that an offload cannot be disabled (i.e. not requested, but still enabled). If the PMD treats it as OK, we should not spam logs with corresponding messages by default. Default log level in ethdev is INFO, so change the message level to DEBUG. Fixes: `1daa338058` ("ethdev: validate offloads set by PMD") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-11-20 17:36:06 +01:00
Pavan Nikhilesh	7988d03229	ethdev: fix log line feed Fix missing new line token at the end of log. Fixes: `5d30897295` ("ethdev: add mbuf RSS update as an offload") Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:05 +01:00
Xueming Li	90f538b863	malloc: fix realloc padded element size When resizing memory into the next element, the original element size grows. If the original element has padding, the real inner element size did not grow as well, which causes a trailer verification failure when malloc debug is enabled. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-11-20 14:08:39 +01:00
Xueming Li	a029a06036	malloc: fix realloc copy size In rte_realloc, if the old element has padding and new memory needs to be allocated, the padding size was not deducted, so too much data was copied to the new data area. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-11-20 14:08:39 +01:00
Kevin Traynor	baf023a8ed	lib: fix doxygen typos Fix these as they are user visible. Found with codespell. Fixes: `af75078fec` ("first public release") Fixes: `c2361bab70` ("eal: compute IOVA mode based on PA availability") Fixes: `0880c40113` ("drivers: advertise kmod dependencies in pmdinfo") Fixes: `56b6ef874f` ("efd: new Elastic Flow Distributor library") Fixes: `5a5f3178d4` ("power: return error when environment already set") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-19 22:03:38 +01:00
Kevin Traynor	0411d61fa9	lib: fix log typos Fix these as they are user visible. Found with codespell. Fixes: `bacaa27540` ("eal: add channel for multi-process communication") Fixes: `f05e26051c` ("eal: add IPC asynchronous request") Fixes: `0cbce3a167` ("vfio: skip DMA map failure if already mapped") Fixes: `445c6528b5` ("power: common interface for guest and host") Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Fixes: `8f972312b8` ("vhost: support vhost-user") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-19 22:03:27 +01:00
Michael Pfeiffer	b8a0415008	kni: reduce interface name size The name in rte_kni_device_info is passed to the kernel, which allows interface names with at most 16 bytes (IFNAMSIZ). rte_kni_alloc with a longer name currently trigger a kernel BUG in alloc_netdev_mqs in net/core/dev.c. Reduce RTE_KNI_NAMESIZE to prevent this situation. Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-19 22:00:32 +01:00
Anatoly Burakov	84191ddeb5	mempool: remove check for bad IOVA when populating Currently, mempool will check if IOVA is bad for a segment, and reject the IOVA if hugepages are also enabled. This check is wrong because now that we have external memory segments, they are allowed to have their IOVA's to be invalid. This check also doesn't make much sense in the first place, because the following code can handle bad IOVA's perfectly well (and in fact, this check is not triggering a failure when --no-huge option is enabled), so there is not much sense to check for this in the first place. Fixes: `950e8fb4e1` ("mem: allow registering external memory areas") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Bo Chen <box.c.chen@intel.com>	2019-11-19 21:41:43 +01:00
Anatoly Burakov	2a7fd3ef38	mempool: use actual IOVA addresses when populating Currently, when mempool is being populated, we get IOVA address of every segment using rte_mem_virt2iova(). This works for internal memory, but does not really work for external memory, and does not work on platforms which return RTE_BAD_IOVA as a result of this call (such as FreeBSD). Moreover, even when it works, the function in question will do unnecessary pagewalks in IOVA as PA mode, as it falls back to rte_mem_virt2phy() instead of just doing a lookup in internal memseg table. To fix it, replace the call to first attempt to look through the internal memseg table (this takes care of internal and external memory), and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA translation via memseg table. Fixes: `66cc45e293` ("mem: replace memseg with memseg lists") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Bo Chen <box.c.chen@intel.com>	2019-11-19 21:41:43 +01:00
Vamsi Attunuru	a0dede62a5	eal/linux: remove KNI restriction on IOVA Now that KNI supports VA (with kernel versions starting 4.6.0), we can accept IOVA as VA, but KNI must be configured for this. Pass iova_mode when creating KNI netdevs. So far, IOVA detection policy forced IOVA as PA when KNI is loaded, whatever the buses IOVA requirements were. We can now use IOVA as VA, but this comes with a cost in KNI. When no constraint is expressed by the buses, keep the current behavior of choosing PA. Note: this change supposes that dpdk is built on the same kernel than the target system kernel; no objection has been expressed on this topic. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-18 16:00:51 +01:00
Vamsi Attunuru	e73831dc6c	kni: support userspace VA Patch adds support for kernel module to work in IOVA = VA mode by providing address translation routines to convert userspace VA to kernel VA. KNI performance using PA is not changed by this patch. But comparing KNI using PA to KNI using VA, the latter will have lower performance due to the cost of the added translation. This translation is implemented only with kernel versions starting 4.6.0. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-18 16:00:51 +01:00
Zhike Wang	1407b0752e	vhost: fix vring requests validation broken if no FD When VHOST_USER_VRING_NOFD_MASK is set, the fd_num is 0, so validate_msg_fds() will return error. In this case, the negotiation of vring message between vhost user front end and back end would fail, and as a result, vhost user link could NOT be up. How to reproduce: 1.Run dpdk testpmd insides VM, which locates at host with ovs+dpdk. 2.Notice that inside ovs there are endless logs regarding failure to handle VHOST_USER_SET_VRING_CALL, and link of vm could NOT be up. Fixes: `bf472259dd` ("vhost: fix possible denial of service by leaking FDs") Cc: stable@dpdk.org Signed-off-by: Zhike Wang <wangzk320@163.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-11-15 14:25:48 +01:00
Stephen Hemminger	08a234788e	cmdline: remove unnecessary #ifdef The #ifdef to conditionally include <sys/socket.h> on BSD is unnecessary. It is harmless to include the header on other OS's. An extra include is better than an #ifdef. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-11-12 18:35:17 +01:00
Maxime Coquelin	bf472259dd	vhost: fix possible denial of service by leaking FDs A malicious Vhost-user master could send in loop hand-crafted vhost-user messages containing more file descriptors the vhost-user slave expects. Doing so causes the application using the vhost-user library to run out of FDs. This issue has been assigned CVE-2019-14818 Fixes: `8f972312b8` ("vhost: support vhost-user") Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-12 12:21:20 +01:00
Maxime Coquelin	612e17cf6d	vhost: fix possible denial of service on SET_VRING_NUM vhost_user_set_vring_num() performs multiple allocations without checking whether data were previously allocated. It may cause a denial of service because of the memory leaks that happen if a malicious vhost-user master keeps sending VHOST_USER_SET_VRING_NUM request until the slave runs out of memory. This issue has been assigned CVE-2019-14818 Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-12 12:21:17 +01:00
Matan Azrad	eccc5d3237	ethdev: fix last item detection on RSS flow expand There is a rte_flow API which expands a RSS flow pattern to multiple patterns according to the RSS hash types in the RSS action configuration. As part of the expansion, detection of the last item of the flow uses the "next proto" field of the last configured item in the pattern list. Wrongly, the mask of this field was not considered in order to validate the field. Ignore "next proto" fields when their corresponded masks invalidate them. Fixes: `fc2dd8dd49` ("ethdev: fix expand RSS flows") Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-12 01:55:26 +01:00
Dekel Peled	dc258e4ab9	ethdev: add maximum LRO packet size This patch implements API for configuration and validation of max size for LRO aggregated packet. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-12 01:43:01 +01:00
Jerin Jacob	6f26f8a0ec	eventdev: reserve space in main structs for extension The struct rte_eventdev and rte_eventdev_data are supposed to be used internally only, but there is a chance that increasing their size would break ABI for some applications. In order to allow smooth addition of features without breaking ABI compatibility, some space is reserved. Signed-off-by: Jerin Jacob <jerinj@marvell.com>	2019-11-12 03:36:32 +01:00
Thomas Monjalon	436b3a6b6e	ethdev: reserve space in main structs for extension In order to allow smooth addition of features without breaking ABI compatibility, some space is reserved in several core structs of ethdev API. The struct rte_eth_dev and rte_eth_dev_data are supposed to be used internally only, but there is a chance that increasing their size would break ABI for some applications. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-11 17:02:29 +01:00
Pavan Nikhilesh	1daa338058	ethdev: validate offloads set by PMD Some PMDs cannot work when certain offloads are enable/disabled, as a workaround PMDs auto enable/disable offloads internally and expose it through dev->data->dev_conf.rxmode.offloads. After device specific dev_configure is called compare the requested offloads to the offloads exposed by the PMD and, if the PMD failed to enable a given offload then log it and return -EINVAL from rte_eth_dev_configure, else if the PMD failed to disable a given offload log and continue with rte_eth_dev_configure. Suggested-by: Andrew Rybchenko <arybchenko@solarflare.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-11 16:15:37 +01:00
Pavan Nikhilesh	5d30897295	ethdev: add mbuf RSS update as an offload Add new Rx offload flag `DEV_RX_OFFLOAD_RSS_HASH` which can be used to enable/disable PMDs write to `rte_mbuf::hash::rss`. PMDs notify the validity of `rte_mbuf::hash::rss` to the application by enabling `PKT_RX_RSS_HASH` flag in `rte_mbuf::ol_flags`. Also update testpmd rx_offload command to include RSS_HASH Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-11 16:15:36 +01:00
Pavan Nikhilesh	5d4813acda	ethdev: add packet type range function Add `rte_eth_dev_set_ptypes` function that will allow the application to inform the PMD about reduced range of packet types to handle. Based on the ptypes set PMDs can optimize their Rx path. - If application doesn’t want any ptype information it can call `rte_eth_dev_set_ptypes(ethdev_id, RTE_PTYPE_UNKNOWN, NULL, 0)` and PMD may skip packet type processing and set rte_mbuf::packet_type to RTE_PTYPE_UNKNOWN. - If application doesn’t call `rte_eth_dev_set_ptypes` PMD can return `rte_mbuf::packet_type` with `rte_eth_dev_get_supported_ptypes`. - If application is interested only in L2/L3 layer, it can inform the PMD to update `rte_mbuf::packet_type` with L2/L3 ptype by calling `rte_eth_dev_set_ptypes(ethdev_id, RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK, NULL, 0)`. Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-11 16:15:36 +01:00
Xiaoyu Min	fc2dd8dd49	ethdev: fix expand RSS flows rte_flow_expand_rss expands rte_flow item list based on the RSS types. In another word, some additional rules are added if the user specified items are not complete enough according to the RSS type, for example: ... pattern eth / end actions rss type tcp end ... User only provides item eth but want to do RSS on tcp traffic. The pattern is not complete enough to filter TCP traffic only. This will be a problem for some HWs. So some PMDs use rte_flow_expand_rss to expand above user provided flow to: ... pattern eth / end actions rss types tcp ... pattern eth / ipv4 / tcp / end actions rss types tcp ... ... pattern eth / ipv6 / tcp / end actions rss types tcp ... in order to filter TCP traffic only and do RSS correctly. However the current expansion cannot handle pattern as below, which provides ethertype or ip next proto instead of providing an item: ... pattern eth type is 0x86DD / end actions rss types tcp ... rte_flow_expand_rss will expand above flow to: ... pattern eth type is 0x86DD / ipv4 / tcp end ... which has conflicting values: 0x86DD vs. ipv4 and some HWs will refuse to create flow. This patch will fix above by checking the last item's spec and to expand RSS flows correctly. Currently only support to complete item list based on ether type or ip next proto. Fixes: `4ed05fcd44` ("ethdev: add flow API to expand RSS flows") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	aa74c383d4	vhost: fix batch enqueue only handle few packets After enqueue function finished, packet index has been increased. Batch enqueue function should retrieve mbuf structure pointed by that index. Fixes: `0294211bb6` ("vhost: optimize packed ring enqueue") Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	4da3dd4885	vhost: fix dirty page logging missing Packets data are directly copied when doing batch enqueue, add missed dirty page logging after memory copy. Fixes: `ef861692c3` ("vhost: add packed ring batch enqueue") Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	bc42ca1787	vhost: fix virtqueue not accessible Log feature is disabled in vhost user, so that log address was invalid when checking. Check whether log address is valid can work around it. Log address should also be translated in packed ring virtqueue. Fixes: `fbda9f1459` ("vhost: translate incoming log address to GPA") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Jin Yu	201e748267	vhost: fix build dependency on hash lib Compile librte_vhost/vhost_crypto.c needs the rte_hash.h So we need the librte_hash to be compiled before vhost. Add the DEPDIRs to make sure this. Bugzilla ID: 356 Fixes: `939066d965` ("vhost/crypto: add public function implementation") Cc: stable@dpdk.org Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	3939255eed	vhost: do not limit packed ring size Virtio spec only set rule that packed ring maximum size is up to 2^15 entries. Should not limit packed ring size to power of two. Fixes: `708e14d8b9` ("vhost: advertize packed ring layout support") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Flavia Musatescu	5b1f399c5c	net: fix IPv4 IHL and VHL define Fix the RTE_IPV4_VHL_DEF macro that represents the value for the IPv4 VHL and Minimum IHL header fields according to rfc791. Fixes: `2318d8d545` ("net: define IPv4 IHL and VHL") Cc: stable@dpdk.org Reported-by: David Harton <dharton@cisco.com> Signed-off-by: Flavia Musatescu <flavia.musatescu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-08 23:15:05 +01:00
Viacheslav Ovsiienko	9bf26e1318	ethdev: move egress metadata to dynamic field The dynamic mbuf fields were introduced by [1]. The egress metadata is a good candidate to be moved from the statically allocated field tx_metadata to a dynamic one. Because mbufs are used in half-duplex fashion only, it is safe to share this dynamic field with ingress metadata. The shared dynamic field contains either egress (if the application is going to transmit the mbuf with tx_burst) or ingress (if the mbuf is received with rx_burst) metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the data. The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to accessing the data. The availability of dynamic mbuf metadata field can be checked with rte_flow_dynf_metadata_avail() routine. DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed. The metadata support in PMDs is engaged on dynamic field registration. Metadata feature is getting complex. We might have some set of actions and items that might be supported by PMDs in multiple combinations; the supported values and masks are to be queried by performing trials (with rte_flow_validate). [1] http://patches.dpdk.org/patch/62040/ Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:05 +01:00
Viacheslav Ovsiienko	e02ecc1324	ethdev: extend flow metadata Currently, metadata can be set on egress path via mbuf tx_metadata field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata. This patch extends the metadata feature usability. 1) RTE_FLOW_ACTION_TYPE_SET_META When supporting multiple tables, Tx metadata can also be set by a rule and matched by another rule. This new action allows metadata to be set as a result of flow match. 2) Metadata on ingress There's also need to support metadata on ingress. Metadata can be set by SET_META action and matched by META item like Tx. The final value set by the action will be delivered to application via metadata dynamic field of mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. PKT_RX_DYNF_METADATA flag will be set along with the data. The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to using the SET_META action. The availability of dynamic mbuf metadata field can be checked with rte_flow_dynf_metadata_avail() routine. If application is going to engage the metadata feature it registers the metadata dynamic fields, then PMD checks the metadata field availability and handles the appropriate fields in datapath. For loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to the other path depending on hardware capability. MARK and METADATA look similar and might operate in a similar way, but do not interact. Initially, two metadata related actions were proposed: - RTE_FLOW_ACTION_TYPE_FLAG - RTE_FLOW_ACTION_TYPE_MARK These actions set the special flag in the packet metadata, MARK action stores some specified value in the metadata storage, and, on packet reception, the PMD puts the flag and value to the mbuf and applications can see the packet was treated inside the flow engine according to the appropriate RTE flow(s). 
MARK and FLAG are like some kind of gateway to transfer some per-packet information from the flow engine to the application via receiving datapath. Also, there is the item of type RTE_FLOW_ITEM_TYPE_MARK provided. It allows us to extend the flow match pattern with the capability to match the metadata values set by MARK/FLAG actions on other flows. From the datapath point of view, the MARK and FLAG are related to the receiving side only. It would be useful to have the same gateway on the transmitting side, and the feature of type RTE_FLOW_ITEM_TYPE_META was proposed. The application can fill the field in mbuf and this value will be transferred to some field in the packet metadata inside the flow engine. It did not matter whether these metadata fields are shared because MARK and META items belonged to different domains (receiving and transmitting) and could be vendor-specific. So far, so good, DPDK proposes some entities to control metadata inside the flow engine and gateways to exchange these values on a per-packet basis via datapaths. As we can see, the MARK and META means are not symmetric: there is no action which would allow us to set META value on the transmitting path. So, the action of type: - RTE_FLOW_ACTION_TYPE_SET_META was proposed. Next, applications raise new requirements for packet metadata. The flow engines are getting more complex, internal switches are introduced, multiple ports might be supported within the same flow engine namespace. From the DPDK point of view, it means the packets might be sent on one eth_dev port and received on the other one, and the packet path inside the flow engine entirely belongs to the same hardware device. The simplest example is SR-IOV with PF, VFs and the representors. And there is a brilliant opportunity to provide some out-of-band channel to transfer some extra data from one port to another one, besides the packet data itself. And applications would like to use this opportunity. 
Applications are expected to probe (with rte_flow_validate) to detect which metadata features (FLAG, MARK, META) are actually supported by the PMD and the underlying hardware. Support might depend on PMD configuration, system software, hardware settings, etc., and should be detected at run time. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:04 +01:00
Haiyue Wang	8dedb54699	ethdev: enhance burst mode information API Change the type of burst mode information from bit field to free string data, so that each PMD can describe the Rx/Tx burst functions flexibly. Fixes: `eb5902504a` ("ethdev: add API for getting burst mode information") Fixes: `6b6609f68c` ("net/i40e: support Rx/Tx burst mode info") Fixes: `e9a10e6c21` ("net/ice: support Rx/Tx burst mode info") Fixes: `7fe108edcf` ("app/testpmd: show Rx/Tx burst mode description") Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Ray Kinsella <ray.kinsella@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-08 23:15:04 +01:00
Viacheslav Ovsiienko	9a2f44c762	ethdev: add flow tag A tag is a transient data which can be used during flow match. This can be used to store match result from a previous table so that the same pattern need not be matched again on the next table. Even if outer header is decapsulated on the previous match, the match result can be kept. Some device expose internal registers of its flow processing pipeline and those registers are quite useful for stateful connection tracking as it keeps status of flow matching. Multiple tags are supported by specifying index. Example testpmd commands are: flow create 0 ingress pattern ... / end actions set_tag index 2 value 0xaa00bb mask 0xffff00ff / set_tag index 3 value 0x123456 mask 0xffffff / vxlan_decap / jump group 1 / end flow create 0 ingress pattern ... / end actions set_tag index 2 value 0xcc00 mask 0xff00 / set_tag index 3 value 0x123456 mask 0xffffff / vxlan_decap / jump group 1 / end flow create 0 ingress group 1 pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff / eth ... / end actions ... jump group 2 / end flow create 0 ingress group 1 pattern tag index is 2 value spec 0xcc00 value mask 0xff00 / tag index is 3 value spec 0x123456 value mask 0xffffff / eth ... / end actions ... / end flow create 0 ingress group 2 pattern tag index is 3 value spec 0x123456 value mask 0xffffff / eth ... / end actions ... / end Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:04 +01:00
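The masked set/match behaviour in the testpmd commands above can be modelled in plain C. This is only an illustrative sketch, not DPDK code: `struct tag_regs`, `TAG_COUNT`, `tag_set()` and `tag_match()` are hypothetical names, and the assumption that `set_tag` updates only the bits selected by its mask (leaving the others untouched) is ours, not something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of tag registers; real devices differ. */
#define TAG_COUNT 4

/* Hypothetical per-packet tag storage, indexed like "index N" in testpmd. */
struct tag_regs {
    uint32_t reg[TAG_COUNT];
};

/*
 * Model of "set_tag index I value V mask M".
 * Assumption: only the bits selected by the mask are written.
 */
static void tag_set(struct tag_regs *t, unsigned int index,
                    uint32_t value, uint32_t mask)
{
    t->reg[index] = (t->reg[index] & ~mask) | (value & mask);
}

/*
 * Model of "tag index is I value spec S value mask M".
 * The stored tag matches when it equals the spec on all masked bits.
 */
static bool tag_match(const struct tag_regs *t, unsigned int index,
                      uint32_t spec, uint32_t mask)
{
    return (t->reg[index] & mask) == (spec & mask);
}
```

With this model, a packet hitting the first group 0 rule (tag 2 set to 0xaa00bb, tag 3 set to 0x123456) would satisfy the group 1 rule that matches tag 2 against 0xaa00bb, and would not satisfy the one that matches tag 2 against 0xcc00, without either rule re-matching the decapsulated outer header.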
Thomas Monjalon	2ed50762bd	ethdev: remove deprecated port count function The function rte_eth_dev_count() was marked as deprecated in DPDK 18.05 in commit `d9a42a69fe` ("ethdev: deprecate port count function"). It was planned to be removed after 19.11 LTS release, but given we must not break ABI between 19.11 and 20.11, it is removed now. Note the ABI version is not dumped in this commit because other changes already did. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-08 23:15:04 +01:00
Ori Kam	cf5516696d	ethdev: add hairpin queue This commit introduces the hairpin queue type. A hairpin queue is built from an Rx queue bound to a Tx queue. It is used to offload traffic coming from the wire and redirect it back to the wire. There are 3 new functions: - rte_eth_dev_hairpin_capability_get - rte_eth_rx_hairpin_queue_setup - rte_eth_tx_hairpin_queue_setup In order to use the queue, an rte_flow must be created with a queue / RSS action that targets one or more of the Rx queues. Signed-off-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-08 23:15:04 +01:00
Ori Kam	f9adec46d4	ethdev: move queue state defines to private file The queue state defines are internal to the DPDK. This commit moves them to a private header file. Signed-off-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-08 23:15:04 +01:00

5774 Commits