numam-dpdk

Author	SHA1	Message	Date
Xueming Li	2808a12cc0	malloc: fix memory element size in case of padding This patch fixes wrong inner memory element size when joining two elements. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-11-26 16:24:08 +01:00
Anatoly Burakov	bdc993fa3d	mem: clarify documentation of virt2iova behaviour It may not be immediately clear that rte_mem_virt2iova does not actually check the internal memseg table, and will instead either return VA (in IOVA as VA mode), or will fall back to kernel page table walk (in IOVA as PA mode). Add a note to API documentation indicating the above. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-11-26 00:30:51 +01:00
David Hunt	19802aaf7c	power: fix error log on guest message polling Should be passing errno rather than ret, which could be negative. Coverity issue: 350362 Fixes: `9dc843eb27` ("power: extend guest channel API for reading") Cc: stable@dpdk.org Signed-off-by: David Hunt <david.hunt@intel.com>	2019-11-26 00:29:24 +01:00
Stephen Hemminger	06710448c9	remove blank lines at end of file Remove trailing blank lines. They serve no purpose and are just editor leftovers. These can cause git to complain about whitespace errors during merges. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-26 00:12:08 +01:00
Shahaf Shuler	39a19ae03d	mbuf: extend mbuf pool private structure With the API and ABI freeze ahead, it will be good to reserve some bits on the private structure for future use. Otherwise we will potentially need to maintain two different private structure during 2020 period. There is already one use case for those reserved bits[1] The reserved field should be set to 0 by the user. [1] https://patches.dpdk.org/patch/63077/ Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-11-25 22:44:46 +01:00
Mattias Rönnblom	388c4c03ec	power: handle frequency increase with turbo disabled Calling pstate's or acpi's rte_power_freq_up() when on the highest non-turbo frequency results in an error, if turbo is enabled in the BIOS, but disabled via the power library. The error is in the form of a return code and a RTE_LOG() entry on the ERR level. According to the API documentation, the frequency is scaled up "according to the available frequencies". In case turbo is disabled, that frequency is not available. This patch's rte_power_freq_up() behaviour is also consistent with how rte_power_freq_max() is implemented (i.e. the highest non-turbo frequency is set, in case turbo is disabled). Fixes: `445c6528b5` ("power: common interface for guest and host") Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Cc: stable@dpdk.org Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Reviewed-by: Liang Ma <liang.j.ma@intel.com>	2019-11-21 00:52:31 +01:00
Ferruh Yigit	d965af9e8a	kni: increase kernel version requirement for VA A build error reported related to the selected 'get_user_pages_remote()' kernel API: .../kernel/linux/kni/kni_dev.h:113:8: error: too few arguments to function ‘get_user_pages_remote’ ret = get_user_pages_remote(tsk, tsk->mm, iova, 1 ^~~~~~~~~~~~~~~~~~~~~ Currently there are three versions of the 'get_user_pages_remote()' supported, based on kernel version < 4.9, = 4.9, > 4.9. These version based checks are not working fine with the distro kernels which is the cause of reported build error. The error reported by the kernel version 4.8, but it is using API defined in > 4.9. To be able to take control of this, and possible more, related build error, increasing the minimum supported kernel version for iova=va with KNI to kernel version 4.9. This leaves us with single version of the kernel API and more manageable. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-21 00:18:02 +01:00
Ruifeng Wang	bb2a9973c0	bpf/arm: fix clang build Clang has different prototype for __builtin___clear_cache(). It requires 'char *' parameters while gcc requires 'void *'. Clang version 8.0 was used. Warning messages during build: ../lib/librte_bpf/bpf_jit_arm64.c:1438:26: warning: incompatible pointer types passing 'uint32_t *' (aka 'unsigned int *') to parameter of type 'char *' [-Wincompatible-pointer-types] __builtin___clear_cache(ctx.ins, ctx.ins + ctx.idx); ^~~~~~~ ../lib/librte_bpf/bpf_jit_arm64.c:1438:35: warning: incompatible pointer types passing 'uint32_t *' (aka 'unsigned int *') to parameter of type 'char *' [-Wincompatible-pointer-types] __builtin___clear_cache(ctx.ins, ctx.ins + ctx.idx); ^~~~~~~~~~~~~~~~~ Fixes: `f3e5167724` ("bpf/arm: add prologue and epilogue") Cc: jerinj@marvell.com Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-11-21 00:30:39 +01:00
Pawel Modrak	85ff364f3b	build: align symbols with global ABI version Merge all versions in linker version script files to DPDK_20.0. This commit was generated by running the following command: :~/DPDK$ buildtools/update-abi.sh 20.0 Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	4ab92c53ed	distributor: rename v2.0 ABI to _single suffix The original ABI versioning was slightly misleading in that the DPDK 2.0 ABI was really a single mode for the distributor, and is used as such throughout the distributor code. Fix this by renaming all _v20 API's to _single API's, and remove symbol versioning. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	6e5b516761	distributor: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	c381a8d554	lpm: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	f2fb215843	timer: remove deprecated code Remove code for old ABI versions ahead of ABI version bump. Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Anatoly Burakov	fbaf943887	build: remove individual library versions Since the library versioning for both stable and experimental ABI's is now managed globally, the LIBABIVER and version variables no longer serve any useful purpose, and can be removed. The replacement in Makefiles was done using the following regex: ^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)? (LIBABIVER := numbers, optionally preceded by a comment and optionally succeeded by an empty line) The replacement for meson files was done using the following regex: ^(#.*\n)?version\s*=\s*\d+\n(\s*\n)? (version = numbers, optionally preceded by a comment and optionally succeeded by an empty line) [David]: those variables are manually removed for the files: - drivers/common/qat/Makefile - lib/librte_eal/meson.build [David]: the LIBABIVER is restored for the external ethtool example library. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
Marcin Baran	cba806e07d	build: change ABI versioning to global As per new ABI policy [1], all of the libraries are now versioned using one global ABI version. Stable libraries use the MAJOR.MINOR ABI version for their shared objects, while experimental libraries use the 0.MAJORMINOR convention for their versioning. Experimental library versioning is managed globally. Changes in this patch implement the necessary steps to enable that. The CONFIG_RTE_MAJOR_ABI option was introduced to permit multiple DPDK versions installed side by side. The problem is now addressed through the new ABI policy, and thus can be removed. [David] For external libraries relying on Makefile, LIBABIVER is preserved to avoid using DPDK global ABI version. [1] https://doc.dpdk.org/guides/contributing/abi_policy.html Signed-off-by: Marcin Baran <marcinx.baran@intel.com> Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-20 23:05:39 +01:00
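The versioning convention described in the entry above can be sketched in C. This is only an illustration: the helper names are hypothetical, and reading "0.MAJORMINOR" as the major and minor numbers concatenated after "0." is an assumption drawn from the wording of the commit message.

```c
#include <stdio.h>

/*
 * Hypothetical helpers (not part of DPDK): build the shared-object
 * version suffix from one global ABI version.
 */

/* Stable libraries: MAJOR.MINOR, e.g. 20.0 gives "20.0". */
static int abi_stable_version(char *buf, size_t len, int major, int minor)
{
	return snprintf(buf, len, "%d.%d", major, minor);
}

/* Experimental libraries: 0.MAJORMINOR, e.g. 20.0 gives "0.200". */
static int abi_experimental_version(char *buf, size_t len, int major, int minor)
{
	return snprintf(buf, len, "0.%d%d", major, minor);
}
```

Both helpers return the `snprintf()` result, so a caller can detect truncation by comparing it against the buffer length.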
Andrew Rybchenko	6e18704b7e	ethdev: avoid undefined behaviour on configuration copy memcpy() source and destination areas must not overlap and equal pointers is the case which is really met, so handle it. Fixes: `68b931bff2` ("ethdev: eliminate interim variable") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
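The hazard named in the entry above is that `memcpy()` has undefined behaviour when source and destination overlap, which includes the case where both pointers are equal. A minimal sketch of the guard, assuming a hypothetical `struct port_conf` that stands in for the real configuration structure:

```c
#include <string.h>

/* Hypothetical stand-in for a device configuration structure. */
struct port_conf {
	unsigned int rx_offloads;
	unsigned int tx_offloads;
};

/* Copy the configuration only when the two pointers differ. */
static void copy_conf(struct port_conf *dst, const struct port_conf *src)
{
	if (dst != src)
		memcpy(dst, src, sizeof(*dst));
}
```

Skipping the copy is safe here because equal pointers mean the destination already holds the desired contents.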
Jakub Grajciar	43b815d881	net/memif: support zero-copy slave Zero-copy slave support for memif PMD. Slave interface exposes DPDK memory to master interface. Only single file segments are supported (EAL option --single-file-segments). Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
Stephen Hemminger	6c373e935a	net: constify pointer to IPv6 header The function rte_ipv6_get_next_ext does not modify the header that is passed in. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:06 +01:00
Andrew Rybchenko	5031897506	ethdev: improve message about not disabled offload Avoid usage of "failed" in the message about not requested but enabled offload, since it is not a failure. Fixes: `1daa338058` ("ethdev: validate offloads set by PMD") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-11-20 17:36:06 +01:00
Andrew Rybchenko	9e954d3194	ethdev: decrease verbosity of not disabled offload logs Right now a PMD decides if it is critical that an offload cannot be disabled (i.e. not requested, but still enabled). If PMD treats it as OK, we should not spam logs with corresponding messages by default. Default log level in ethdev is INFO, so change the message level to DEBUG. Fixes: `1daa338058` ("ethdev: validate offloads set by PMD") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-11-20 17:36:06 +01:00
Pavan Nikhilesh	7988d03229	ethdev: fix log line feed Fix missing new line token at the end of log. Fixes: `5d30897295` ("ethdev: add mbuf RSS update as an offload") Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-20 17:36:05 +01:00
Xueming Li	90f538b863	malloc: fix realloc padded element size When resizing memory into the next element, the original element size grows. If the original element has padding, the real inner element size did not grow as well, which causes a trailer verification failure when malloc debug is enabled. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-11-20 14:08:39 +01:00
Xueming Li	a029a06036	malloc: fix realloc copy size In rte_realloc, if the old element has pad and need to allocate a new memory, the padding size was not deducted, so more data was copied to new data area. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-11-20 14:08:39 +01:00
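The copy-size rule in the entry above can be illustrated with a small sketch. The element layout, the `ELEM_OVERHEAD` constant, and the function name are assumptions made for illustration and do not reproduce the real malloc internals; the point is only that padding is deducted before deciding how many bytes to copy.

```c
#include <stddef.h>

/* Hypothetical per-element bookkeeping overhead, in bytes. */
#define ELEM_OVERHEAD 64

/* Hypothetical element descriptor: total size and leading padding. */
struct elem {
	size_t size; /* whole element, including overhead and padding */
	size_t pad;  /* alignment padding placed before the user data */
};

/*
 * Bytes to copy into a newly allocated area: the old user data size
 * (with padding deducted), capped by the new requested size.
 */
static size_t realloc_copy_size(const struct elem *e, size_t new_size)
{
	size_t old_data = e->size - e->pad - ELEM_OVERHEAD;

	return old_data < new_size ? old_data : new_size;
}
```

Without the `e->pad` term, the computed size would exceed the old data area by the padding amount, which is the over-copy the commit describes.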
Kevin Traynor	baf023a8ed	lib: fix doxygen typos Fix these as they are user visible. Found with codespell. Fixes: `af75078fec` ("first public release") Fixes: `c2361bab70` ("eal: compute IOVA mode based on PA availability") Fixes: `0880c40113` ("drivers: advertise kmod dependencies in pmdinfo") Fixes: `56b6ef874f` ("efd: new Elastic Flow Distributor library") Fixes: `5a5f3178d4` ("power: return error when environment already set") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-19 22:03:38 +01:00
Kevin Traynor	0411d61fa9	lib: fix log typos Fix these as they are user visible. Found with codespell. Fixes: `bacaa27540` ("eal: add channel for multi-process communication") Fixes: `f05e26051c` ("eal: add IPC asynchronous request") Fixes: `0cbce3a167` ("vfio: skip DMA map failure if already mapped") Fixes: `445c6528b5` ("power: common interface for guest and host") Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Fixes: `8f972312b8` ("vhost: support vhost-user") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-11-19 22:03:27 +01:00
Michael Pfeiffer	b8a0415008	kni: reduce interface name size The name in rte_kni_device_info is passed to the kernel, which allows interface names with at most 16 bytes (IFNAMSIZ). rte_kni_alloc with a longer name currently trigger a kernel BUG in alloc_netdev_mqs in net/core/dev.c. Reduce RTE_KNI_NAMESIZE to prevent this situation. Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-19 22:00:32 +01:00
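An application could also guard against over-long names before calling into KNI. The sketch below is an assumption-laden illustration: `SKETCH_IFNAMSIZ` mirrors the 16-byte limit quoted in the entry above (terminating NUL included), and `kni_name_fits()` is a hypothetical helper, not a DPDK function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the 16-byte kernel limit (IFNAMSIZ) quoted above. */
#define SKETCH_IFNAMSIZ 16

/*
 * Return true when the name, including its terminating NUL,
 * fits in SKETCH_IFNAMSIZ bytes.
 */
static bool kni_name_fits(const char *name)
{
	if (name == NULL)
		return false;
	for (size_t i = 0; i < SKETCH_IFNAMSIZ; i++) {
		if (name[i] == '\0')
			return true;
	}
	return false;
}
```

A 15-character name passes, while a 16-character name is rejected because its terminator would be the 17th byte.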
Anatoly Burakov	84191ddeb5	mempool: remove check for bad IOVA when populating Currently, mempool will check if IOVA is bad for a segment, and reject the IOVA if hugepages are also enabled. This check is wrong because now that we have external memory segments, they are allowed to have their IOVA's to be invalid. This check also doesn't make much sense in the first place, because the following code can handle bad IOVA's perfectly well (and in fact, this check is not triggering a failure when --no-huge option is enabled), so there is not much sense to check for this in the first place. Fixes: `950e8fb4e1` ("mem: allow registering external memory areas") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Bo Chen <box.c.chen@intel.com>	2019-11-19 21:41:43 +01:00
Anatoly Burakov	2a7fd3ef38	mempool: use actual IOVA addresses when populating Currently, when mempool is being populated, we get IOVA address of every segment using rte_mem_virt2iova(). This works for internal memory, but does not really work for external memory, and does not work on platforms which return RTE_BAD_IOVA as a result of this call (such as FreeBSD). Moreover, even when it works, the function in question will do unnecessary pagewalks in IOVA as PA mode, as it falls back to rte_mem_virt2phy() instead of just doing a lookup in internal memseg table. To fix it, replace the call to first attempt to look through the internal memseg table (this takes care of internal and external memory), and fall back to rte_mem_virt2iova() when unable to perform VA->IOVA translation via memseg table. Fixes: `66cc45e293` ("mem: replace memseg with memseg lists") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Bo Chen <box.c.chen@intel.com>	2019-11-19 21:41:43 +01:00
Vamsi Attunuru	a0dede62a5	eal/linux: remove KNI restriction on IOVA Now that KNI supports VA (with kernel versions starting 4.6.0), we can accept IOVA as VA, but KNI must be configured for this. Pass iova_mode when creating KNI netdevs. So far, IOVA detection policy forced IOVA as PA when KNI is loaded, whatever the buses IOVA requirements were. We can now use IOVA as VA, but this comes with a cost in KNI. When no constraint is expressed by the buses, keep the current behavior of choosing PA. Note: this change supposes that dpdk is built on the same kernel as the target system kernel; no objection has been expressed on this topic. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-18 16:00:51 +01:00
Vamsi Attunuru	e73831dc6c	kni: support userspace VA Patch adds support for kernel module to work in IOVA = VA mode by providing address translation routines to convert userspace VA to kernel VA. KNI performance using PA is not changed by this patch. But comparing KNI using PA to KNI using VA, the latter will have lower performance due to the cost of the added translation. This translation is implemented only with kernel versions starting 4.6.0. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-18 16:00:51 +01:00
Zhike Wang	1407b0752e	vhost: fix vring requests validation broken if no FD When VHOST_USER_VRING_NOFD_MASK is set, the fd_num is 0, so validate_msg_fds() will return error. In this case, the negotiation of vring message between vhost user front end and back end would fail, and as a result, vhost user link could NOT be up. How to reproduce: 1.Run dpdk testpmd insides VM, which locates at host with ovs+dpdk. 2.Notice that inside ovs there are endless logs regarding failure to handle VHOST_USER_SET_VRING_CALL, and link of vm could NOT be up. Fixes: `bf472259dd` ("vhost: fix possible denial of service by leaking FDs") Cc: stable@dpdk.org Signed-off-by: Zhike Wang <wangzk320@163.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-11-15 14:25:48 +01:00
Stephen Hemminger	08a234788e	cmdline: remove unnecessary #ifdef The #ifdef to conditionally include <sys/socket.h> on BSD is unnecessary. It is harmless to include the header on other OS's. An extra include is better than an #ifdef. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-11-12 18:35:17 +01:00
Maxime Coquelin	bf472259dd	vhost: fix possible denial of service by leaking FDs A malicious Vhost-user master could send in loop hand-crafted vhost-user messages containing more file descriptors the vhost-user slave expects. Doing so causes the application using the vhost-user library to run out of FDs. This issue has been assigned CVE-2019-14818 Fixes: `8f972312b8` ("vhost: support vhost-user") Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-12 12:21:20 +01:00
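The kind of check this fix calls for can be sketched as follows: compare the number of received descriptors with the number the request type expects, and close everything on a mismatch so that nothing leaks. The function name and signature are hypothetical and are not claimed to match the vhost library.

```c
#include <unistd.h>

/*
 * Hypothetical validator: accept a message only when it carries exactly
 * the expected number of file descriptors. On mismatch, close every
 * descriptor that was received and mark its slot as -1.
 */
static int validate_msg_fd_count(int *fds, int fd_num, int expected)
{
	if (fd_num == expected)
		return 0;

	for (int i = 0; i < fd_num; i++) {
		if (fds[i] >= 0) {
			close(fds[i]);
			fds[i] = -1;
		}
	}
	return -1;
}
```

Under this scheme, a request that legitimately carries no descriptor has to be validated with `expected` set to 0; otherwise it would be rejected.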
Maxime Coquelin	612e17cf6d	vhost: fix possible denial of service on SET_VRING_NUM vhost_user_set_vring_num() performs multiple allocations without checking whether data were previously allocated. It may cause a denial of service because of the memory leaks that happen if a malicious vhost-user master keeps sending VHOST_USER_SET_VRING_NUM request until the slave runs out of memory. This issue has been assigned CVE-2019-14818 Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-12 12:21:17 +01:00
Matan Azrad	eccc5d3237	ethdev: fix last item detection on RSS flow expand There is a rte_flow API which expands a RSS flow pattern to multiple patterns according to the RSS hash types in the RSS action configuration. As part of the expansion, detection of the last item of the flow uses the "next proto" field of the last configured item in the pattern list. Wrongly, the mask of this field was not considered in order to validate the field. Ignore "next proto" fields when their corresponded masks invalidate them. Fixes: `fc2dd8dd49` ("ethdev: fix expand RSS flows") Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-12 01:55:26 +01:00
Dekel Peled	dc258e4ab9	ethdev: add maximum LRO packet size This patch implements API for configuration and validation of max size for LRO aggregated packet. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-12 01:43:01 +01:00
Jerin Jacob	6f26f8a0ec	eventdev: reserve space in main structs for extension The struct rte_eventdev and rte_eventdev_data are supposed to be used internally only, but there is a chance that increasing their size would break ABI for some applications. In order to allow smooth addition of features without breaking ABI compatibility, some space is reserved. Signed-off-by: Jerin Jacob <jerinj@marvell.com>	2019-11-12 03:36:32 +01:00
Thomas Monjalon	436b3a6b6e	ethdev: reserve space in main structs for extension In order to allow smooth addition of features without breaking ABI compatibility, some space is reserved in several core structs of ethdev API. The struct rte_eth_dev and rte_eth_dev_data are supposed to be used internally only, but there is a chance that increasing their size would break ABI for some applications. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-11 17:02:29 +01:00
Pavan Nikhilesh	1daa338058	ethdev: validate offloads set by PMD Some PMDs cannot work when certain offloads are enable/disabled, as a workaround PMDs auto enable/disable offloads internally and expose it through dev->data->dev_conf.rxmode.offloads. After device specific dev_configure is called compare the requested offloads to the offloads exposed by the PMD and, if the PMD failed to enable a given offload then log it and return -EINVAL from rte_eth_dev_configure, else if the PMD failed to disable a given offload log and continue with rte_eth_dev_configure. Suggested-by: Andrew Rybchenko <arybchenko@solarflare.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-11 16:15:37 +01:00
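The comparison described in the entry above reduces to two bitmask differences. The sketch below is a self-contained illustration with a hypothetical function name and plain `stderr` logging; it does not use the ethdev API.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical check run after the driver's configure step.
 * requested: offload bits the application asked for.
 * actual:    offload bits the driver reports as enabled.
 */
static int check_offloads(uint64_t requested, uint64_t actual)
{
	uint64_t not_enabled = requested & ~actual;  /* asked for, missing */
	uint64_t not_disabled = actual & ~requested; /* enabled, not asked for */

	if (not_disabled != 0)
		fprintf(stderr, "offloads 0x%" PRIx64 " enabled but not requested\n",
			not_disabled);

	if (not_enabled != 0) {
		fprintf(stderr, "offloads 0x%" PRIx64 " requested but not enabled\n",
			not_enabled);
		return -EINVAL;
	}
	return 0;
}
```

Only the "requested but not enabled" case is treated as an error; the opposite case is logged and configuration continues, matching the behaviour the commit message describes.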
Pavan Nikhilesh	5d30897295	ethdev: add mbuf RSS update as an offload Add new Rx offload flag `DEV_RX_OFFLOAD_RSS_HASH` which can be used to enable/disable PMD writes to `rte_mbuf::hash::rss`. PMDs notify the validity of `rte_mbuf::hash::rss` to the application by enabling the `PKT_RX_RSS_HASH` flag in `rte_mbuf::ol_flags`. Also update testpmd rx_offload command to include RSS_HASH. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-11 16:15:36 +01:00
Pavan Nikhilesh	5d4813acda	ethdev: add packet type range function Add `rte_eth_dev_set_ptypes` function that will allow the application to inform the PMD about reduced range of packet types to handle. Based on the ptypes set PMDs can optimize their Rx path. -If application doesn’t want any ptype information it can call `rte_eth_dev_set_ptypes(ethdev_id, RTE_PTYPE_UNKNOWN, NULL, 0)` and PMD may skip packet type processing and set rte_mbuf::packet_type to RTE_PTYPE_UNKNOWN. -If application doesn’t call `rte_eth_dev_set_ptypes` PMD can return `rte_mbuf::packet_type` with `rte_eth_dev_get_supported_ptypes`. -If application is interested only in L2/L3 layer, it can inform the PMD to update `rte_mbuf::packet_type` with L2/L3 ptype by calling `rte_eth_dev_set_ptypes(ethdev_id, RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK, NULL, 0)`. Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-11 16:15:36 +01:00
Xiaoyu Min	fc2dd8dd49	ethdev: fix expand RSS flows rte_flow_expand_rss expands rte_flow item list based on the RSS types. In other words, some additional rules are added if the user specified items are not complete enough according to the RSS type, for example: ... pattern eth / end actions rss type tcp end ... User only provides item eth but wants to do RSS on tcp traffic. The pattern is not complete enough to filter TCP traffic only. This will be a problem for some HWs. So some PMDs use rte_flow_expand_rss to expand above user provided flow to: ... pattern eth / end actions rss types tcp ... pattern eth / ipv4 / tcp / end actions rss types tcp ... ... pattern eth / ipv6 / tcp / end actions rss types tcp ... in order to filter TCP traffic only and do RSS correctly. However the current expansion cannot handle pattern as below, which provides ethertype or ip next proto instead of providing an item: ... pattern eth type is 0x86DD / end actions rss types tcp ... rte_flow_expand_rss will expand above flow to: ... pattern eth type is 0x86DD / ipv4 / tcp end ... which has conflicting values: 0x86DD vs. ipv4 and some HWs will refuse to create flow. This patch will fix above by checking the last item's spec and to expand RSS flows correctly. Currently only support to complete item list based on ether type or ip next proto. Fixes: `4ed05fcd44` ("ethdev: add flow API to expand RSS flows") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	aa74c383d4	vhost: fix batch enqueue only handle few packets After enqueue function finished, packet index has been increased. Batch enqueue function should retrieve mbuf structure pointed by that index. Fixes: `0294211bb6` ("vhost: optimize packed ring enqueue") Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	4da3dd4885	vhost: fix dirty page logging missing Packets data are directly copied when doing batch enqueue, add missed dirty page logging after memory copy. Fixes: `ef861692c3` ("vhost: add packed ring batch enqueue") Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	bc42ca1787	vhost: fix virtqueue not accessible Log feature is disabled in vhost user, so that log address was invalid when checking. Check whether log address is valid can work around it. Log address should also be translated in packed ring virtqueue. Fixes: `fbda9f1459` ("vhost: translate incoming log address to GPA") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Jin Yu	201e748267	vhost: fix build dependency on hash lib Compile librte_vhost/vhost_crypto.c needs the rte_hash.h So we need the librte_hash to be compiled before vhost. Add the DEPDIRs to make sure this. Bugzilla ID: 356 Fixes: `939066d965` ("vhost/crypto: add public function implementation") Cc: stable@dpdk.org Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
Marvin Liu	3939255eed	vhost: do not limit packed ring size Virtio spec only set rule that packed ring maximum size is up to 2^15 entries. Should not limit packed ring size to power of two. Fixes: `708e14d8b9` ("vhost: advertize packed ring layout support") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-11-08 23:15:05 +01:00
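The relaxed rule from the entry above can be sketched as a size validator. The function names are hypothetical, rejecting a zero size is an added assumption, and the power-of-two requirement for the non-packed (split) layout is kept as the contrasting case.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound quoted in the commit message: 2^15 entries. */
#define PACKED_RING_MAX (1u << 15)

static bool is_power_of_two(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical validator: a packed ring only needs to stay within the
 * maximum, while a split ring keeps the power-of-two requirement.
 */
static bool ring_size_ok(uint32_t n, bool packed)
{
	if (packed)
		return n != 0 && n <= PACKED_RING_MAX;
	return is_power_of_two(n);
}
```

With this check, a size such as 1000 is accepted for a packed ring but still rejected for a split ring.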
Flavia Musatescu	5b1f399c5c	net: fix IPv4 IHL and VHL define Fix the RTE_IPV4_VHL_DEF macro that represents the value for the IPv4 VHL and Minimum IHL header fields according to rfc791. Fixes: `2318d8d545` ("net: define IPv4 IHL and VHL") Cc: stable@dpdk.org Reported-by: David Harton <dharton@cisco.com> Signed-off-by: Flavia Musatescu <flavia.musatescu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-11-08 23:15:05 +01:00
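For reference, RFC 791 places the version in the upper four bits of the first header byte and the header length (IHL, counted in 32-bit words) in the lower four bits. The sketch below uses `SKETCH_`-prefixed names, which are hypothetical and chosen to avoid implying they are the DPDK macros.

```c
#include <stdint.h>

#define SKETCH_IP_VERSION   4 /* IPv4 */
#define SKETCH_IPV4_MIN_IHL 5 /* minimum header: 5 words of 32 bits */

/* Version in the high nibble, IHL in the low nibble. */
#define SKETCH_IPV4_VHL_DEF ((SKETCH_IP_VERSION << 4) | SKETCH_IPV4_MIN_IHL)

/* Extract the IP version from a version/IHL byte. */
static uint8_t vhl_version(uint8_t vhl)
{
	return (uint8_t)(vhl >> 4);
}

/* Header length in bytes: the IHL field counts 4-byte words. */
static uint8_t vhl_header_bytes(uint8_t vhl)
{
	return (uint8_t)((vhl & 0x0f) * 4);
}
```

The default value works out to 0x45: version 4 with a 20-byte header.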
Viacheslav Ovsiienko	9bf26e1318	ethdev: move egress metadata to dynamic field The dynamic mbuf fields were introduced by [1]. The egress metadata is good candidate to be moved from statically allocated field tx_metadata to dynamic one. Because mbufs are used in half-duplex fashion only, it is safe to share this dynamic field with ingress metadata. The shared dynamic field contains either egress (if application going to transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst) metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the data. The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to accessing the data. The availability of dynamic mbuf metadata field can be checked with rte_flow_dynf_metadata_avail() routine. DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed. The metadata support in PMDs is engaged on dynamic field registration. Metadata feature is getting complex. We might have some set of actions and items that might be supported by PMDs in multiple combinations, the supported values and masks are the subjects to query by performing trials (with rte_flow_validate). [1] http://patches.dpdk.org/patch/62040/ Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:05 +01:00
Viacheslav Ovsiienko	e02ecc1324	ethdev: extend flow metadata Currently, metadata can be set on egress path via mbuf tx_metadata field with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata. This patch extends the metadata feature usability. 1) RTE_FLOW_ACTION_TYPE_SET_META When supporting multiple tables, Tx metadata can also be set by a rule and matched by another rule. This new action allows metadata to be set as a result of flow match. 2) Metadata on ingress There's also need to support metadata on ingress. Metadata can be set by SET_META action and matched by META item like Tx. The final value set by the action will be delivered to application via metadata dynamic field of mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. PKT_RX_DYNF_METADATA flag will be set along with the data. The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to using the SET_META action. The availability of dynamic mbuf metadata field can be checked with rte_flow_dynf_metadata_avail() routine. If application is going to engage the metadata feature it registers the metadata dynamic fields, then PMD checks the metadata field availability and handles the appropriate fields in datapath. For loopback/hairpin packet, metadata set on Rx/Tx may or may not be propagated to the other path depending on hardware capability. MARK and METADATA look similar and might operate in similar way, but not interacting. Initially, there were proposed two metadata related actions: - RTE_FLOW_ACTION_TYPE_FLAG - RTE_FLOW_ACTION_TYPE_MARK These actions set the special flag in the packet metadata, MARK action stores some specified value in the metadata storage, and, on the packet receiving PMD puts the flag and value to the mbuf and applications can see the packet was treated inside flow engine according to the appropriate RTE flow(s).
MARK and FLAG are a kind of gateway to transfer some per-packet information from the flow engine to the application via the receiving datapath. Also, there is the item of type RTE_FLOW_ITEM_TYPE_MARK provided. It allows us to extend the flow match pattern with the capability to match the metadata values set by MARK/FLAG actions on other flows. From the datapath point of view, MARK and FLAG are related to the receiving side only. It would be useful to have the same gateway on the transmitting side, and so the item of type RTE_FLOW_ITEM_TYPE_META was proposed. The application can fill the field in mbuf and this value will be transferred to some field in the packet metadata inside the flow engine. It did not matter whether these metadata fields were shared, because the MARK and META items belonged to different domains (receiving and transmitting) and could be vendor-specific. So far, so good: DPDK proposes some entities to control metadata inside the flow engine and gateways to exchange these values on a per-packet basis via datapaths. As we can see, the MARK and META means are not symmetric: there is no action which would allow us to set the META value on the transmitting path. So, the action of type: - RTE_FLOW_ACTION_TYPE_SET_META was proposed. Next, applications raise new requirements for packet metadata. The flow engines are getting more complex, internal switches are introduced, multiple ports might be supported within the same flow engine namespace. From the DPDK point of view, it means the packets might be sent on one eth_dev port and received on another one, and the packet path inside the flow engine entirely belongs to the same hardware device. The simplest example is SR-IOV with PF, VFs and the representors. And there is a brilliant opportunity to provide some out-of-band channel to transfer some extra data from one port to another one, besides the packet data itself. And applications would like to use this opportunity.
The application is expected to use trials (with rte_flow_validate) to detect which metadata features (FLAG, MARK, META) are actually supported by the PMD and underlying hardware. It might depend on PMD configuration, system software, hardware settings, etc., and should be detected at run time. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:04 +01:00
Haiyue Wang	8dedb54699	ethdev: enhance burst mode information API Change the type of burst mode information from bit field to free string data, so that each PMD can describe the Rx/Tx burst functions flexibly. Fixes: `eb5902504a` ("ethdev: add API for getting burst mode information") Fixes: `6b6609f68c` ("net/i40e: support Rx/Tx burst mode info") Fixes: `e9a10e6c21` ("net/ice: support Rx/Tx burst mode info") Fixes: `7fe108edcf` ("app/testpmd: show Rx/Tx burst mode description") Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Ray Kinsella <ray.kinsella@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-08 23:15:04 +01:00
Viacheslav Ovsiienko	9a2f44c762	ethdev: add flow tag A tag is a transient data which can be used during flow match. This can be used to store match result from a previous table so that the same pattern need not be matched again on the next table. Even if outer header is decapsulated on the previous match, the match result can be kept. Some device expose internal registers of its flow processing pipeline and those registers are quite useful for stateful connection tracking as it keeps status of flow matching. Multiple tags are supported by specifying index. Example testpmd commands are: flow create 0 ingress pattern ... / end actions set_tag index 2 value 0xaa00bb mask 0xffff00ff / set_tag index 3 value 0x123456 mask 0xffffff / vxlan_decap / jump group 1 / end flow create 0 ingress pattern ... / end actions set_tag index 2 value 0xcc00 mask 0xff00 / set_tag index 3 value 0x123456 mask 0xffffff / vxlan_decap / jump group 1 / end flow create 0 ingress group 1 pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff / eth ... / end actions ... jump group 2 / end flow create 0 ingress group 1 pattern tag index is 2 value spec 0xcc00 value mask 0xff00 / tag index is 3 value spec 0x123456 value mask 0xffffff / eth ... / end actions ... / end flow create 0 ingress group 2 pattern tag index is 3 value spec 0x123456 value mask 0xffffff / eth ... / end actions ... / end Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:04 +01:00
Thomas Monjalon	2ed50762bd	ethdev: remove deprecated port count function The function rte_eth_dev_count() was marked as deprecated in DPDK 18.05 in commit `d9a42a69fe` ("ethdev: deprecate port count function"). It was planned to be removed after 19.11 LTS release, but given we must not break ABI between 19.11 and 20.11, it is removed now. Note the ABI version is not dumped in this commit because other changes already did. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-08 23:15:04 +01:00
Ori Kam	cf5516696d	ethdev: add hairpin queue This commit introduces the hairpin queue type. The hairpin queue is built from an Rx queue bound to a Tx queue. It is used to offload traffic coming from the wire and redirect it back to the wire. There are 3 new functions: - rte_eth_dev_hairpin_capability_get - rte_eth_rx_hairpin_queue_setup - rte_eth_tx_hairpin_queue_setup In order to use the queue, there is a need to create rte_flow with queue / RSS action that targets one or more of the Rx queues. Signed-off-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-08 23:15:04 +01:00
Ori Kam	f9adec46d4	ethdev: move queue state defines to private file The queue state defines are internal to the DPDK. This commit moves them to a private header file. Signed-off-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-08 23:15:04 +01:00
Bruce Richardson	ff962da373	lib: check experimental symbols with meson Call check-experimental-syms.sh script as part of the meson build to ensure that all functions are correctly tagged. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-11-09 21:17:12 +01:00
Hemant Agrawal	0f56ca1aae	ipsec: remove redundant replay window size The rte_security lib has introduced replay_win_sz, so it can be removed from the rte_ipsec lib. The relevant tests and apps are also updated to reflect the usage. Note that the esn and anti-replay fields were earlier used only for the ipsec library; they were enabling libipsec by default. With this change, esn and anti-replay settings will not automatically enable libipsec. Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-11-08 13:51:16 +01:00
Hemant Agrawal	d5411b9a3d	security: add anti replay window size At present the ipsec xform is missing the important step of configuring the anti replay window size. The newly added field will also help to enable or disable the anti replay checking, if available in offload, by means of a non-zero or zero value. Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-11-08 13:51:16 +01:00
Thomas Monjalon	14cba9ee22	cmdline: replace FreeBSD ifdef for IP address parsing The constants like AF_INET are in sys/socket.h in FreeBSD. The #ifdef macro __FreeBSD__ is replaced with RTE_EXEC_ENV_FREEBSD in order to be consistent across DPDK files, and allow to grep for EXEC_ENV among other benefits. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-11-08 15:34:10 +01:00
Andrzej Ostruszka	57e20572ac	eventdev: fix possible use of uninitialized var Fix the logic for the case of event queue allowing all schedule types. Compiler warning pointing to this error (with LTO enabled): error: ‘sched_type’ may be used uninitialized in this function [-Werror=maybe-uninitialized] if ((ret < 0 && ret != -EOVERFLOW) \|\| Fixes: `6750b21bd6` ("eventdev: add default software timer adapter") Cc: stable@dpdk.org Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-11-08 15:17:24 +01:00
Andrzej Ostruszka	909dd291f0	lib: annotate versioned functions Every implementation of a particular version of given symbol needs to be marked in its declaration as such (using `__vsym` macro). This patch fixes this and also clarifies the documentation about that. Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2019-11-08 15:15:30 +01:00
Andrzej Ostruszka	519e6548f7	doc: fix description of versioning macros This patch fixes documentation of versioning macros so that they are aligned with their implementation (no underscore is added by macros). Fixes: `f1ef9794f9` ("doc: add ABI guidelines") Cc: stable@dpdk.org Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2019-11-08 15:15:09 +01:00
Rahul R Shah	9a643edb2b	port: fix build dependency The port library should be built after eventdev library. Fixes: `5d92c4e592` ("port: add eventdev port type") Cc: stable@dpdk.org Signed-off-by: Rahul R Shah <rahul.r.shah@intel.com> Tested-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-11-07 17:46:43 +01:00
Anatoly Burakov	47c45a4df6	vfio: fix DMA mapping of external heaps Currently, externally created heaps are supposed to be automatically mapped for VFIO DMA by EAL, however they only do so if, at the time of heap creation, VFIO is initialized and has at least one device available. If no devices are available at the time of heap creation (or if devices were available, but were since hot-unplugged, thus dropping all VFIO container mappings), then VFIO mapping code would have skipped over externally allocated heaps. The fix is two-fold. First, we allow externally allocated memory segments to be marked as "heap" segments. This allows us to distinguish between external memory segments that were created via heap API, from those that were created via rte_extmem_register() API. Then, we fix the VFIO code to only skip non-heap external segments. Also, since external heaps are not guaranteed to have valid IOVA addresses, we will skip those which have invalid IOVA addresses as well. Fixes: `0f526d674f` ("malloc: separate creating memseg list and malloc heap") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Rajesh Ravi <rajesh.ravi@broadcom.com> Acked-by: David Marchand <david.marchand@redhat.com>	2019-11-07 17:46:43 +01:00
Anatoly Burakov	b14d192ca1	vfio: remove deprecated DMA mapping functions The rte_vfio_dma_map/unmap API's have been marked as deprecated in release 19.05. Remove them. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-11-07 17:46:43 +01:00
Anatoly Burakov	9362945d7e	vfio: fix DMA mapping with default container When requesting DMA mapping to default container, we are meant to supply the RTE_VFIO_DEFAULT_CONTAINER_FD value, however this is not handled correctly by get_vfio_cfg_by_container_fd(), because it only looks at actual fd values and does not check for this special case. Fix it to return default container if the fd requested is the special RTE_VFIO_DEFAULT_CONTAINER_FD value. Fixes: `4106d89a18` ("vfio: allow DMA map to the default container") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-11-07 17:46:43 +01:00
Olivier Matz	b32037f7ef	mempool: use specific macro for object alignment For consistency, RTE_MEMPOOL_ALIGN should be used in place of RTE_CACHE_LINE_SIZE. They have the same value, because the only arch that was defining a specific value for it has been removed from DPDK. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-11-06 11:34:19 +01:00
Olivier Matz	84626a0d61	mempool: prevent objects from being across pages When populating a mempool, ensure that objects are not located across several pages, except if user did not request IOVA-contiguous objects. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-06 11:34:19 +01:00
Olivier Matz	23bdcedcd8	mempool: introduce helpers for populate and required size Introduce new functions that can be used by mempool drivers to calculate required memory size and to populate mempool. For now, these helpers just replace the *_default() functions without change. They will be enhanced in next commit. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-11-06 11:11:13 +01:00
Olivier Matz	b291e69423	mempool: introduce function to get mempool page size In rte_mempool_populate_default(), we determine the page size, which is needed for calc_size and allocation of memory. Move this in a function and export it, it will be used in a next commit. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-11-06 11:11:12 +01:00
Olivier Matz	035ee5bea5	mempool: remove optimistic IOVA-contiguous allocation The previous commit reduced the amount of required memory when populating the mempool with non IOVA-contiguous memory. Since there is no big advantage to have a fully iova-contiguous mempool if it is not explicitly asked, remove this code, it simplifies the populate function. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-11-06 11:11:11 +01:00
Olivier Matz	eba11e3646	mempool: reduce wasted space on populate The size returned by rte_mempool_op_calc_mem_size_default() is aligned to the specified page size. Therefore, with big pages, the returned size can be much more than what we really need to populate the mempool. For instance, populating a mempool that requires 1.1GB of memory with 1GB hugepages can result in allocating 2GB of memory. This problem is hidden most of the time due to the allocation method of rte_mempool_populate_default(): when try_iova_contig_mempool=true, it first tries to allocate an iova contiguous area, without the alignment constraint. If it fails, it falls back to an aligned allocation that is not required to be iova-contiguous. This can also fall back to several smaller aligned allocations. This commit changes rte_mempool_op_calc_mem_size_default() to relax the alignment constraint to a cache line and to return a smaller size. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-11-06 11:11:10 +01:00
Olivier Matz	354788b60c	mempool: allow populating with unaligned virtual area rte_mempool_populate_virt() currently requires that both addr and length are page-aligned. Remove this unneeded constraint which can be annoying with big hugepages (ex: 1GB). Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-11-06 11:11:09 +01:00
Vladimir Medvedkin	c3e12e0f03	fib: add dataplane algorithm for IPv6 Add fib implementation for ipv6 using modified DIR24_8 algorithm. Implementation is similar to current LPM6 implementation but has a few enhancements: faster control plane operations, more bits for userdata in table entries, and configurable userdata size. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:11:44 +01:00
Vladimir Medvedkin	7dc7868b20	fib: add DIR24-8 dataplane algorithm Add fib implementation for DIR24_8 algorithm for IPv4. Implementation is similar to current LPM implementation but has a few enhancements: faster control plane operations, more bits for userdata in table entries, and configurable userdata size. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:11:44 +01:00
Vladimir Medvedkin	40d41a8a7b	fib: support IPv6 Add FIB library support for IPv6. It implements dataplane structures and algorithms designed for fast IPv6 longest prefix match. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:11:44 +01:00
Vladimir Medvedkin	39e9272484	fib: add FIB library Add FIB (Forwarding Information Base) library. This library implements dataplane structures and algorithms designed for fast longest prefix match. Internally it consists of two parts - RIB (control plane ops) and implementation for the dataplane tasks. Initial version provides two implementations for both IPv4 and IPv6: dummy (uses RIB as a dataplane) and DIR24_8 (same as current LPM). The proposed design allows FIB to be extended with new algorithms in future (for example DXR, poptrie, etc). Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:11:44 +01:00
Vladimir Medvedkin	f7e861e21c	rib: support IPv6 Extend RIB library with IPv6 support. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:09:48 +01:00
Vladimir Medvedkin	5a5793a5ff	rib: add RIB library Add RIB (Routing Information Base) library. This library implements an IPv4 routing table optimized for control plane operations. It implements a control plane struct containing routes in a tree and provides fast add/del operations for routes. Also it allows to perform fast subtree traversals (i.e. retrieve existing subroutes for a given prefix). This structure will be used as a control plane helper structure for FIB implementation. Also it might be used standalone in other different places such as bitmaps for example. Internal implementation is level compressed binary trie. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-11-06 00:08:56 +01:00
Dharmik Thakkar	b28f28ae80	rename private header files Some of the internal header files have 'rte_' prefix and some don't. Remove 'rte_' prefix from all internal header files. Suggested-by: Thomas Monjalon <thomas@monjalon.net> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2019-10-27 22:03:06 +01:00
Marcin Hajkowski	8c00828da8	power: add packet type for capabilities Add new packet type and commands for capabilities query. Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: Lee Daly <lee.daly@intel.com>	2019-10-27 21:12:04 +01:00
Marcin Hajkowski	04a8cb8ee9	power: extend guest channel for frequency query Extend incoming packet reading API with new packet type which carries CPU frequencies. Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: Lee Daly <lee.daly@intel.com>	2019-10-27 20:57:05 +01:00
Marcin Hajkowski	9dc843eb27	power: extend guest channel API for reading Added new experimental API rte_power_guest_channel_receive_msg which makes it possible to receive messages sent to the guest. Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: Lee Daly <lee.daly@intel.com>	2019-10-27 19:27:36 +01:00
Marcin Hajkowski	b4b2f84a59	power: fix socket indicator value Currently 0 is being used for not connected slot indication. This is not consistent with linux doc which identifies 0 as valid (connected) slot, thus modification was done to change it. Fixes: `cd0d5547` ("power: vm communication channels in guest") Cc: stable@dpdk.org Signed-off-by: Marcin Hajkowski <marcinx.hajkowski@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 19:26:35 +01:00
Bruce Richardson	da5350ef29	net: remove ethernet packing and set two-byte alignment The ether header does not need to be packed since that makes no sense for structures with only bytes in them, but it should be aligned to a two-byte boundary to simplify access to it from code. Other packed structures that use this also need to be updated to take account of the change, either by removing packing - where it is clearly unneeded - or by explicitly giving those structures 2-byte alignment also. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-10-27 18:13:44 +01:00
Bruce Richardson	268fa581b1	port: fix pcap support with meson The meson build was missing the define to enable pcap port support if libpcap (development) package was found on the build platform. Rather than duplicating the checks for libpcap found in the pcap net PMD build file, we can move the checks to the top-level config directory and reference the RTE_PCAP_PORT setting elsewhere in the build. Bugzilla ID: 351 Fixes: `5b9656b157` ("lib: build with meson") Cc: stable@dpdk.org Reported-by: Cristian Bidea <cristian.bidea@keysight.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Cristian Bidea <cristian.bidea@keysight.com>	2019-10-27 17:23:02 +01:00
Pavan Nikhilesh	f1c16d40ed	bitrate: use common macro RTE_DIM Use RTE_DIM instead of re-defining ARRAY_SIZE. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-27 14:40:59 +01:00
Bruce Richardson	a5d4ea5943	build: support building ABI versioned files twice Any file with ABI versioned functions needs different macros for shared and static builds, so we need to accommodate that. Rather than building everything twice, we just flag to the build system which libraries need that handling, by setting use_function_versioning in the meson.build files. To ensure we don't get silent errors at build time due to this meson flag being missed, we add an explicit error to the function versioning header file if a known C macro is not defined. Since "make" builds always only build one of shared or static libraries, this define can be always set, and so is added to the global CFLAGS. For meson, the build flag - and therefore the C define - is set for the three libraries that need the function versioning: "distributor", "lpm" and "timer". Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Andrzej Ostruszka <amo@semihalf.com> Reviewed-by: Andrzej Ostruszka <amo@semihalf.com>	2019-10-27 12:49:28 +01:00
Bruce Richardson	dc61aa74b7	eal: split compat header file The compat.h header file provided macros for two purposes: 1. it provided the macros for marking functions as rte_experimental 2. it provided the macros for doing function versioning Although these were in the same file, #1 is something that is for use by public header files, while #2 is for internal use only. Therefore, we can split these into two headers, keeping #1 in rte_compat.h and #2 in a new file rte_function_versioning.h. For "make" builds, since internal objects pick up the headers from the "include/" folder, we need to add the new header to the installation list, but for "meson" builds it does not need to be installed as it's not for public use. The rework also serves to limit the use of the function versioning macros to files that actually need them, so the use of experimental functions does not need including of the versioning code. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Andrzej Ostruszka <amo@semihalf.com>	2019-10-27 12:49:28 +01:00
Igor Ryzhov	49e7e2dee3	kni: add ability to set min/max MTU Starting with kernel version 4.10, there are new min/max MTU values in net_device structure, which are set to ETH_MIN_MTU and ETH_DATA_LEN by default. We should be able to change these values to allow MTU more than 1500 to be set on KNI. Signed-off-by: Igor Ryzhov <iryzhov@nfware.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-27 11:07:43 +01:00
David Christensen	4b462021b4	vhost: fix build on RHEL 7.6 for Power Use of %llx print formatting causes meson build error on Power systems with RHEL 7.6 and gcc 4.8.5. Replace with PRIx64 macro. Fixes: `9b62e2da18` ("vhost: register new regions with userfaultfd") Cc: stable@dpdk.org Signed-off-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-10-27 11:07:19 +01:00
David Marchand	9195ef7f78	ethdev: bump library version Let's stick to the current model of per library ABI version until the new model is in place. The ABI changed in the incriminated commit. The release notes were updated accordingly but the compiled version number has been missed. Fixes: `4f25d7d225` ("ethdev: add return code to device info get function") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-27 10:41:50 +01:00
David Marchand	f58cef079b	eal: make the global configuration private Now that all elements of the rte_config structure have (deinlined) accessors, we can hide it. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 10:41:49 +01:00
David Marchand	6614072791	eal: factorize lcore role code This code belongs to the lcore API, move the prototype to the right header, then factorize the code into the common code. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-27 10:41:08 +01:00
David Marchand	56564391d7	eal: deinline lcore APIs Those functions are used to setup or take control decisions. Move them into the EAL common code and put them directly in the stable ABI. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-27 10:41:08 +01:00
David Marchand	b5fedaedfc	log: add log stream accessor Define an accessor so that users can write their debug message to the same stream as the rte_log infrastructure. Use it in the qat infrastructure. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Kevin Traynor <ktraynor@redhat.com>	2019-10-27 10:41:08 +01:00
David Marchand	ca52fccbb3	pci: remove deprecated functions Those functions have been deprecated since 17.11 and have 1:1 replacement. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-27 10:41:05 +01:00
David Marchand	974be46e9e	mem: hide internal heap header Let's avoid exporting structures without an identified usecase. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 10:39:56 +01:00
David Marchand	bbabce218d	eal: remove deprecated malloc virt2phys function Remove rte_malloc_virt2phy as announced previously. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 10:36:19 +01:00
David Marchand	637af85090	eal: remove deprecated CPU flags check function Remove rte_cpu_check_supported as announced previously. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 10:35:58 +01:00
Stephen Hemminger	65661351ca	eal: make lcore config private The internal structure of lcore_config does not need to be part of visible API/ABI. Make it private to EAL. Rearrange the structure so it takes less memory (and cache footprint). Since we change the ABI, bump the library version. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-27 10:35:11 +01:00
Flavio Leitner	84c39beb2f	vhost: fix IPv4 checksum Currently the IPv4 header checksum is calculated including its current value, which can be a valid checksum or just garbage. In any case, if the original value is not zero, then the result is always wrong. The IPv4 checksum is defined in RFC791, page 14 says: Header Checksum: 16 bits The checksum algorithm is: The checksum field is the 16 bit one's complement of the one's complement sum of all 16 bit words in the header. For purposes of computing the checksum, the value of the checksum field is zero. Thus force the csum field to always be zero. Fixes: `b08b8cfeb2` ("vhost: fix IP checksum") Cc: stable@dpdk.org Signed-off-by: Flavio Leitner <fbl@sysclose.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:23:22 +02:00
Ilya Maximets	70c7747689	vhost: disable host TSO for linear buffers without extbuf If linear buffers are requested and external buffers are not, vhost will not be able to receive any buffer that doesn't fit in a single mbuf. Moreover, if such a buffer appears in a vring it will never be dequeued and the whole vring will become dead, breaking the network connection. Disable segmentation offloading from the host side to avoid having such big buffers. Fixes: `c3ff0ac70a` ("vhost: improve performance by supporting large buffer") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:23:06 +02:00
Ilya Maximets	19896c7393	vhost: return error message for mbuf allocation failure mbuf allocation failure is a hard failure that highlights some significant issues with memory pool size or a mbuf leak. We still have the message for subsequent chained mbufs, but not for the first one. It was removed while introducing extbuf support for large buffers. But it was useful for catching mempool issues and needs to be returned back. Fixes: `c3ff0ac70a` ("vhost: improve performance by supporting large buffer") Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Reviewed-by: Flavio Leitner <fbl@sysclose.org> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-10-25 19:22:46 +02:00
Marvin Liu	f974ca7a29	vhost: optimize packed ring dequeue when in-order When VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize dequeue function by only updating the first used descriptor. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	31d6c6a5b8	vhost: optimize packed ring dequeue Optimize vhost device packed ring dequeue function by splitting batch and single functions. Non-chained and direct descriptors will be handled by batch and others will be handled by single as before. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	d1eafb5322	vhost: add packed ring zcopy batch and single dequeue Add vhost packed ring zero copy batch and single dequeue functions like normal dequeue path. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	0294211bb6	vhost: optimize packed ring enqueue Optimize vhost device packed ring enqueue function by splitting batch and single functions. Packets that can be filled into one desc will be handled by batch and others will be handled by single as before. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	c119edbc2d	vhost: update packed ring dequeue Buffer used ring updates as many as possible in vhost dequeue function for coordinating with virtio driver. For supporting buffer, shadow used ring element should contain descriptor's flags. First shadowed ring index was recorded for calculating buffered number. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	f41516c309	vhost: flush batched enqueue descs directly Flush used elements when batched enqueue function is finished. Descriptor's flags are pre-calculated as they will be reset by vhost. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	33d4a554f9	vhost: flush enqueue updates by cacheline Buffer vhost packed ring enqueue updates, flush ring descs if buffered content filled up one cacheline. Thus virtio can receive packets at a faster frequency. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	75ed516978	vhost: add packed ring batch dequeue Add batch dequeue function like enqueue function for packed ring, batch dequeue function will not support chained descriptors, single packet dequeue function will handle it. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	47ac243ac4	vhost: add packed ring single dequeue Add vhost single packet dequeue function for packed ring and meanwhile left space for shadow used ring update function. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	ef861692c3	vhost: add packed ring batch enqueue Batch enqueue function will first check whether descriptors are cache aligned. It will also check prerequisites in the beginning. Batch enqueue function does not support chained mbufs, single packet enqueue function will handle it. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	934274065a	vhost: try to unroll for each loop Create macro for adding unroll pragma before for each loop. Batch functions will consist of several small loops which can be optimized by compilers' loop unrolling pragma. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
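A minimal sketch of the idea in the commit above: a loop macro that puts a compiler-specific unroll pragma in front of a small fixed-size loop. The names `ex_for_each_try_unroll`, `EX_BATCH_SIZE` and `ex_batch_total_len` are hypothetical and are not taken from the vhost code; the unroll factor of 4 is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_BATCH_SIZE 4u /* assumed batch size, for illustration only */

/* Pick an unroll hint the compiler understands; fall back to a plain loop. */
#if defined(__clang__)
#define ex_for_each_try_unroll(iter, val, size) \
	_Pragma("unroll 4") \
	for (iter = val; iter < size; iter++)
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#define ex_for_each_try_unroll(iter, val, size) \
	_Pragma("GCC unroll 4") \
	for (iter = val; iter < size; iter++)
#else
#define ex_for_each_try_unroll(iter, val, size) \
	for (iter = val; iter < size; iter++)
#endif

/* Sum the lengths of one batch; the loop body is small enough to unroll. */
static inline uint32_t
ex_batch_total_len(const uint16_t lens[EX_BATCH_SIZE])
{
	uint32_t total = 0;
	size_t i;

	assert(lens != NULL);
	ex_for_each_try_unroll(i, 0, EX_BATCH_SIZE)
		total += lens[i];
	return total;
}
```

The pragma is only a hint, so the function returns the same result whether or not the compiler unrolls the loop.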
Marvin Liu	93520085ef	vhost: add packed ring single enqueue Add vhost enqueue function for single packet and meanwhile left space for flush used ring function. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
Marvin Liu	86202aae94	vhost: add packed ring indexes increasing function When enqueuing or dequeuing, the virtqueue's local available and used indexes are increased. Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-25 19:20:47 +02:00
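As a rough illustration of the index handling described above, the sketch below advances a local available index on a ring and toggles a wrap counter when the index passes the end of the ring. The structure `pq_state` and the function `pq_inc_last_avail` are hypothetical, simplified stand-ins and not the actual vhost definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified local state of one packed virtqueue. */
struct pq_state {
	uint16_t size;            /* number of descriptors in the ring */
	uint16_t last_avail_idx;  /* next available descriptor to process */
	bool avail_wrap_counter;  /* flips each time the index wraps */
};

/* Advance the local available index by num descriptors. */
static inline void
pq_inc_last_avail(struct pq_state *q, uint16_t num)
{
	assert(q->size > 0 && num <= q->size);

	uint32_t idx = (uint32_t)q->last_avail_idx + num;

	if (idx >= q->size) {
		/* Passed the end of the ring: wrap and flip the counter. */
		idx -= q->size;
		q->avail_wrap_counter = !q->avail_wrap_counter;
	}
	q->last_avail_idx = (uint16_t)idx;
}
```

A used index could be advanced the same way with its own wrap counter.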
Flavia Musatescu	512d873ff1	net: add new header file for VXLAN The VXLAN related definitions and structures are moved from rte_ether.h to a new header file: rte_vxlan.h. Also introducing a new define macro for VXLAN default port id: RTE_VXLAN_DEFAULT_PORT Signed-off-by: Flavia Musatescu <flavia.musatescu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Raslan Darawsheh <rasland@mellanox.com>	2019-10-25 19:00:22 +02:00
David Marchand	40549b086c	net: hide internal CRC defines No need to let those (non RTE_ prefixed) defines public. Hide them where we use them. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-25 19:00:22 +02:00
David Marchand	d613fe10b3	net: add rte prefix to MPLS structure Add 'rte_' prefix to structures: - rename struct mpls_hdr as struct rte_mpls_hdr. Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-25 19:00:22 +02:00
David Marchand	2379572969	net: add missing rte prefix on PPPoE defines Those two defines have been missed. Fixes: `35b2d13fd6` ("net: add rte prefix to ether defines") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-25 19:00:22 +02:00
Ciara Power	22a0763673	ethdev: fix include of ethernet header file The include for rte_ether.h in each of these files should not use quotes, as the header file is not in the librte_ethdev directory. These are now updated to use <> symbols, to search directories pre-designated by the compiler. Fixes: `57668ed7bc` ("net: move ethernet definitions to the net library") Cc: stable@dpdk.org Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-25 19:00:22 +02:00
Ting Xu	d8e5e69f3a	app/testpmd: add GTP parsing and Tx checksum offload Enable testpmd to forward GTP packet in csum fwd mode. A GTP header structure (without optional fields and extension header) is defined in new rte_gtp.h. A parser function in testpmd is added. GTPU and GTPC packets are both supported, with respective UDP destination port and GTP message type. Signed-off-by: Ting Xu <ting.xu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-25 19:00:22 +02:00
Olivier Matz	4958ca3a44	mbuf: support dynamic fields and flags Many features require to store data inside the mbuf. As the room in mbuf structure is limited, it is not possible to have a field for each feature. Also, changing fields in the mbuf structure can break the API or ABI. This commit addresses these issues, by enabling the dynamic registration of fields or flags: - a dynamic field is a named area in the rte_mbuf structure, with a given size (>= 1 byte) and alignment constraint. - a dynamic flag is a named bit in the rte_mbuf structure. The typical use case is a PMD that registers space for an offload feature, when the application requests to enable this feature. As the space in mbuf is limited, the space should only be reserved if it is going to be used (i.e when the application explicitly asks for it). The registration can be done at any moment, but it is not possible to unregister fields or flags. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-26 19:08:50 +02:00
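The registration model described above (named areas with a size and alignment, carved out of limited space, never unregistered) can be sketched without DPDK as follows. This is not the rte_mbuf API: `dyn_registry`, `dyn_register`, and the 16-byte area size are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DYN_AREA_SIZE  16 /* assumed bytes available for dynamic fields */
#define DYN_MAX_FIELDS 8

struct dyn_field {
	char name[32];
	size_t size, align, offset;
};

struct dyn_registry {
	struct dyn_field fields[DYN_MAX_FIELDS];
	size_t count;
	unsigned char used[DYN_AREA_SIZE]; /* 1 = byte already reserved */
};

/* Reserve a named area; returns its offset, or -1 on failure.
 * Registering an existing name with the same parameters returns the
 * same offset. There is deliberately no unregister function. */
static int
dyn_register(struct dyn_registry *r, const char *name, size_t size, size_t align)
{
	size_t i, off;

	assert(size >= 1 && align >= 1 && (align & (align - 1)) == 0);

	for (i = 0; i < r->count; i++) {
		struct dyn_field *f = &r->fields[i];
		if (strcmp(f->name, name) == 0)
			return (f->size == size && f->align == align) ?
				(int)f->offset : -1;
	}
	if (r->count == DYN_MAX_FIELDS || strlen(name) >= sizeof(r->fields[0].name))
		return -1;

	/* First fit at an aligned offset. */
	for (off = 0; off + size <= DYN_AREA_SIZE; off += align) {
		for (i = 0; i < size && !r->used[off + i]; i++)
			;
		if (i < size)
			continue; /* some byte in the range is taken */
		memset(&r->used[off], 1, size);
		struct dyn_field *f = &r->fields[r->count++];
		strcpy(f->name, name);
		f->size = size;
		f->align = align;
		f->offset = off;
		return (int)off;
	}
	return -1;
}
```

A caller would register a field once, keep the returned offset, and use it to reach the field in every buffer. Because space is limited, registration is best deferred until the feature is actually requested.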
Anatoly Burakov	6d3f9917ff	eal: fix memory config allocation for multi-process Currently, mem config will be mapped without using the virtual area reservation infrastructure, which means it will be mapped at an arbitrary location. This may cause failures to map the shared config in secondary process due to things like PCI whitelist arguments allocating memory in a space where the primary has allocated the shared mem config. Fix this by using virtual area reservation to reserve space for the mem config, thereby avoiding the problem and reserving the shared config (hopefully) far away from any normal memory allocations. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-26 18:03:26 +02:00
Anatoly Burakov	6080796f65	mem: make base address hint OS specific Not all OS's follow Linux's memory layout, which may lead to problems following the suggested common address hint absent of a base-virtaddr flag. Make this address hint OS-specific. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-26 18:03:24 +02:00
Pallavi Kadam	7e708cd8c6	eal: move CPU operations to OS specific headers Moving RTE_CPU* definitions from the common code to the Linux and FreeBSD rte_os.h file to avoid #ifdef clutter. Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com> Signed-off-by: Antara Ganesh Kolar <antara.ganesh.kolar@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-10-26 17:06:41 +02:00
Pavan Nikhilesh	9b0a1dadc3	reciprocal: fix off-by-one with 32-bit divisor Fix off-by-one error in 64bit reciprocal division when divisor is 32bit. Caught with the unit test: RTE>>reciprocal_division Validating unsigned 32bit division. Validating unsigned 64bit division. Validating unsigned 64bit division with 32bit divisor. Division failed, 16983222950483802557/819 = expected 20736535959076681 result 20736535959076682 Validating division by power of 2. Test Failed Fixes: `6d45659eac` ("eal: add u64-bit variant for reciprocal divide") Cc: stable@dpdk.org Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2019-10-26 16:09:51 +02:00
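The failing case quoted in the commit above can be checked independently of any reciprocal-division code. The helper below, `quotient_is_exact` (a hypothetical name), only verifies that a candidate quotient q satisfies q*d <= n < (q+1)*d; it is a sketch of such a check, not the DPDK unit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if q equals floor(n / d), without dividing n by d. */
static inline bool
quotient_is_exact(uint64_t n, uint64_t d, uint64_t q)
{
	assert(d != 0);

	/* q * d must not overflow 64 bits. */
	if (q > UINT64_MAX / d)
		return false;
	/* q must not overshoot ... */
	if (q * d > n)
		return false;
	/* ... and the remainder must be smaller than the divisor. */
	return n - q * d < d;
}
```

For the values in the log, 819 * 20736535959076681 leaves a remainder of 818, one short of the divisor, so a quotient that is one too large overshoots the dividend by exactly 1.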
Konstantin Ananyev	3eb860b08e	mbuf: move definitions into a separate file Right now inclusion of rte_mbuf.h header can cause inclusion of some arch/os specific headers. That prevents it from being included directly by some non-DPDK (but related) entities: KNI, BPF programs, etc. To overcome that problem usually a separate definition of rte_mbuf structure is created within these entities. That approach has a lot of drawbacks: code duplication, error prone, etc. This patch moves rte_mbuf structure definition (and some related macros) into a separate file that can be included by both rte_mbuf.h and other non-DPDK entities. Note that it doesn't introduce any change for current DPDK code. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Michel Machado <michel@digirati.com.br> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-25 19:30:38 +02:00
Konstantin Ananyev	2dcb5f7987	eal: move cache line and IOVA related definitions Right now RTE_CACHE_ and IOVA definitions are located inside rte_memory.h That might cause unwanted inclusion of arch/os specific header files. See [1] for particular problem example. Probably the simplest way to deal with such problems - move these definitions into rte_common.h Note that this move doesn't introduce any change in functionality. [1] https://bugs.dpdk.org/show_bug.cgi?id=321 Suggested-by: Vipin Varghese <vipin.varghese@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Michel Machado <michel@digirati.com.br>	2019-10-25 19:30:36 +02:00
Rahul Shah	5d92c4e592	port: add eventdev port type Adding a new port type called eventdev to the rte_port library. Signed-off-by: Rahul Shah <rahul.r.shah@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:29:48 +02:00
Jasvinder Singh	68c1f26d42	sched: support 64-bit values Modify internal structure and functions to support 64-bit values for rates and stats parameters. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:07:37 +02:00
Jasvinder Singh	0edf18eee2	sched: add 64-bit values To support high bandwidth network interfaces, all rates (port, subport level token bucket and traffic class rates, pipe level token bucket and traffic class rates) and stats counters defined in public data structures (rte_sched.h) are modified to support 64 bit counters. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:07:26 +02:00
Jasvinder Singh	def9c49267	sched: remove redundant code Remove redundant data structure fields from port level data structures and update the release notes. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:53:36 +02:00
Jasvinder Singh	831104f0e8	sched: update queue stats read for config flexibility Modify pipe queue stats read function to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:26 +02:00
Jasvinder Singh	2a718309fd	sched: update pkt dequeue for flexible config Modify scheduler packet dequeue operation to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:22 +02:00
Jasvinder Singh	4d2ad6e34b	sched: update grinder functions for config flexibility Modify packet grinder functions of the schedule to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:19 +02:00
Jasvinder Singh	21dca4e3f6	sched: update memory compute to support flexibility Update memory footprint compute function for allowing subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:17 +02:00
Jasvinder Singh	6fbbb0ef48	sched: modify pkt enqueue for config flexibility Modify scheduler packet enqueue operation of the scheduler to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:14 +02:00
Jasvinder Singh	34a90f8665	sched: modify pipe functions for config flexibility Modify pipe level functions to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:12 +02:00
Jasvinder Singh	ce7c4fd7c2	sched: add pipe config to subport level Add pipes configuration from the port level to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:10 +02:00
Jasvinder Singh	d9213b829a	sched: remove pipe params config from port level Remove pipes configuration from the port level to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:07 +02:00
Jasvinder Singh	b757097e37	sched: modify internal structs for config flexibility Update internal structures related to port and subport to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:04 +02:00
Jasvinder Singh	85f52aa422	sched: add pipe config params to subport struct Add pipe configuration parameters to subport level structure to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:49:45 +02:00
Ting Xu	d892768c6d	mbuf: add GTP tunnel type Add GTP tunnel type flag in mbuf for future use in GTP Tx checksum offload. Signed-off-by: Ting Xu <ting.xu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:10 +02:00
Kiran Kumar K	01b3156d33	ethdev: add HIGIG2 key field to flow API Add new rte_flow_item_higig2_hdr in order to match higig2 header. It is a layer 2.5 protocol and used in Broadcom switches. Header format is based on the following document. http://read.pudn.com/downloads558/doc/comm/2301468/HiGig_protocol.pdf Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-23 16:43:10 +02:00
Ciara Power	400d758182	ethdev: check device promiscuous state The promiscuous enable and disable functions now check the promiscuous state of the device before checking if the dev_ops function exists for the device. This change is necessary to allow sample applications run on virtual PMDs, as previously -ENOTSUP returned when the promiscuous enable function was called. This caused the sample application to fail unnecessarily. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:10 +02:00
David Marchand	7eca7f7fd0	net: add missing endianness annotations OVS currently maintains a copy of those headers with the right endianness annotations so that sparse checks can pass. We introduced rte_beXX_t for better readability in v17.08. Let's make use of them, OVS then only needs to override those rte_beXX_t types by exposing a tweaked rte_byteorder.h header. Other existing dpdk users won't be affected since rte_beXX_t types are mapped to uintXX_t types. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-23 16:43:10 +02:00
Simei Su	d3ae8c44b8	ethdev: extend RSS offload types This patch reserves several bits as input set selection from the high end of the 64 bits. It is combined with existing ETH_RSS_* to represent RSS types. This patch also checks the simultaneous use of SRC_ONLY and DST_ONLY of the same level. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:09 +02:00
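A minimal sketch of the kind of check mentioned above, assuming four selector bits taken from the top of a 64-bit RSS type mask. The `EX_RSS_*` names, their bit positions, and `ex_rss_selectors_valid` are hypothetical; the commit message does not give the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical input-set selector bits at the high end of the mask. */
#define EX_RSS_L3_SRC_ONLY (UINT64_C(1) << 63)
#define EX_RSS_L3_DST_ONLY (UINT64_C(1) << 62)
#define EX_RSS_L4_SRC_ONLY (UINT64_C(1) << 61)
#define EX_RSS_L4_DST_ONLY (UINT64_C(1) << 60)

/* Reject masks that ask for SRC_ONLY and DST_ONLY at the same level. */
static inline bool
ex_rss_selectors_valid(uint64_t rss_hf)
{
	const uint64_t l3 = EX_RSS_L3_SRC_ONLY | EX_RSS_L3_DST_ONLY;
	const uint64_t l4 = EX_RSS_L4_SRC_ONLY | EX_RSS_L4_DST_ONLY;

	if ((rss_hf & l3) == l3)
		return false;
	if ((rss_hf & l4) == l4)
		return false;
	return true;
}
```

Mixing levels, for example L3 source only with L4 destination only, passes this check; only a contradictory pair within one level is rejected.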
Simei Su	fce6b66893	ethdev: decouple flow types and RSS offload types This patch decouples RTE_ETH_FLOW_* and ETH_RSS_*. The former defines flow types and the latter defines RSS offload types. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:09 +02:00


5879 Commits