numam-dpdk

Author	SHA1	Message	Date
David Marchand	79d69c6dcf	mbuf: remove seqn field As announced in the deprecation note, the field seqn is removed to give more space to the dynamic fields. This is what the mbuf layout looks like (pahole-style):
word  type                               name                    byte  size
0     void *                             buf_addr;               /*   0 +  8 */
1     rte_iova_t                         buf_iova                /*   8 +  8 */
      /* --- RTE_MARKER64                rearm_data;                         */
2     uint16_t                           data_off;               /*  16 +  2 */
      uint16_t                           refcnt;                 /*  18 +  2 */
      uint16_t                           nb_segs;                /*  20 +  2 */
      uint16_t                           port;                   /*  22 +  2 */
3     uint64_t                           ol_flags;               /*  24 +  8 */
      /* --- RTE_MARKER                  rx_descriptor_fields1;              */
4     uint32_t union                     packet_type;            /*  32 +  4 */
      uint32_t                           pkt_len;                /*  36 +  4 */
5     uint16_t                           data_len;               /*  40 +  2 */
      uint16_t                           vlan_tci;               /*  42 +  2 */
5.5   uint64_t union                     hash;                   /*  44 +  8 */
6.5   uint16_t                           vlan_tci_outer;         /*  52 +  2 */
      uint16_t                           buf_len;                /*  54 +  2 */
7     uint64_t                           timestamp;              /*  56 +  8 */
      /* --- RTE_MARKER                  cacheline1;                         */
8     struct rte_mempool *               pool;                   /*  64 +  8 */
9     struct rte_mbuf *                  next;                   /*  72 +  8 */
10    uint64_t union                     tx_offload;             /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info *  shinfo;                 /*  88 +  8 */
12    uint16_t                           priv_size;              /*  96 +  2 */
      uint16_t                           timesync;               /*  98 +  2 */
12.5  uint32_t                           dynfield1[7];           /* 100 + 28 */
16    /* --- END                                                    128      */
Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-31 22:14:44 +01:00
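The byte offsets listed in the seqn-removal commit (79d69c6dcf) can be cross-checked with a small stand-in structure. This is only a sketch: the type name `mbuf_layout_sketch` is hypothetical, the unions are modelled as plain integers, `rte_iova_t` is assumed to be 64 bits wide, and an LP64 target (8-byte pointers) is assumed. It is not the real `struct rte_mbuf` definition.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in mirroring the layout quoted in the commit message.
 * Assumptions: LP64 target, rte_iova_t modelled as uint64_t, unions modelled
 * as integers of the same size and alignment.
 */
struct mbuf_layout_sketch {
	void *buf_addr;          /*   0 +  8 */
	uint64_t buf_iova;       /*   8 +  8 */
	uint16_t data_off;       /*  16 +  2 */
	uint16_t refcnt;         /*  18 +  2 */
	uint16_t nb_segs;        /*  20 +  2 */
	uint16_t port;           /*  22 +  2 */
	uint64_t ol_flags;       /*  24 +  8 */
	uint32_t packet_type;    /*  32 +  4 */
	uint32_t pkt_len;        /*  36 +  4 */
	uint16_t data_len;       /*  40 +  2 */
	uint16_t vlan_tci;       /*  42 +  2 */
	uint32_t hash[2];        /*  44 +  8, only 4-byte aligned */
	uint16_t vlan_tci_outer; /*  52 +  2 */
	uint16_t buf_len;        /*  54 +  2 */
	uint64_t timestamp;      /*  56 +  8 */
	void *pool;              /*  64 +  8, start of the second cache line */
	void *next;              /*  72 +  8 */
	uint64_t tx_offload;     /*  80 +  8 */
	void *shinfo;            /*  88 +  8 */
	uint16_t priv_size;      /*  96 +  2 */
	uint16_t timesync;       /*  98 +  2 */
	uint32_t dynfield1[7];   /* 100 + 28 */
};

/* Compile-time checks of the offsets quoted in the commit message. */
_Static_assert(offsetof(struct mbuf_layout_sketch, hash) == 44, "hash at 44");
_Static_assert(offsetof(struct mbuf_layout_sketch, pool) == 64, "pool at 64");
_Static_assert(offsetof(struct mbuf_layout_sketch, dynfield1) == 100,
	       "dynamic fields start at byte 100");
_Static_assert(sizeof(struct mbuf_layout_sketch) == 128, "two cache lines");
```

Under these assumptions the dynamic field area is 28 bytes, compared with 24 bytes (`uint64_t dynfield1[3]`) in the layout quoted in the preceding userdata-removal commit.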
Thomas Monjalon	5284adad3e	mbuf: remove userdata field As announced in the deprecation note, the field userdata / udata64 is removed to give more space to the dynamic fields. This is what the mbuf layout looks like (pahole-style):
word  type                               name                    byte  size
0     void *                             buf_addr;               /*   0 +  8 */
1     rte_iova_t                         buf_iova                /*   8 +  8 */
      /* --- RTE_MARKER64                rearm_data;                         */
2     uint16_t                           data_off;               /*  16 +  2 */
      uint16_t                           refcnt;                 /*  18 +  2 */
      uint16_t                           nb_segs;                /*  20 +  2 */
      uint16_t                           port;                   /*  22 +  2 */
3     uint64_t                           ol_flags;               /*  24 +  8 */
      /* --- RTE_MARKER                  rx_descriptor_fields1;              */
4     uint32_t union                     packet_type;            /*  32 +  4 */
      uint32_t                           pkt_len;                /*  36 +  4 */
5     uint16_t                           data_len;               /*  40 +  2 */
      uint16_t                           vlan_tci;               /*  42 +  2 */
5.5   uint64_t union                     hash;                   /*  44 +  8 */
6.5   uint16_t                           vlan_tci_outer;         /*  52 +  2 */
      uint16_t                           buf_len;                /*  54 +  2 */
7     uint64_t                           timestamp;              /*  56 +  8 */
      /* --- RTE_MARKER                  cacheline1;                         */
8     struct rte_mempool *               pool;                   /*  64 +  8 */
9     struct rte_mbuf *                  next;                   /*  72 +  8 */
10    uint64_t union                     tx_offload;             /*  80 +  8 */
11    uint16_t                           priv_size;              /*  88 +  2 */
      uint16_t                           timesync;               /*  90 +  2 */
      uint32_t                           seqn;                   /*  92 +  4 */
12    struct rte_mbuf_ext_shared_info *  shinfo;                 /*  96 +  8 */
13    uint64_t                           dynfield1[3];           /* 104 + 24 */
16    /* --- END                                                    128      */
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-31 16:13:11 +01:00
Thomas Monjalon	eb8258402b	examples/rxtx_callbacks: switch TSC to dynamic field The example used the deprecated mbuf field udata64. It is moved to a dynamic field in order to allow removal of udata64. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-31 16:13:11 +01:00
Thomas Monjalon	614af75489	security: switch metadata to dynamic mbuf field The device-specific metadata was stored in the deprecated field udata64. It is moved to a dynamic mbuf field in order to allow removal of udata64. The name rte_security_dynfield is not very descriptive but it should be replaced later by separate fields for each type of data that drivers pass to the upper layer. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Haiyue Wang <haiyue.wang@intel.com>	2020-10-31 16:13:11 +01:00
Honnappa Nagarahalli	47bec9a5ca	ring: add zero copy API Add zero-copy APIs. These APIs provide the capability to copy the data to/from the ring memory directly, without having a temporary copy (for ex: an array of mbufs on the stack). Use cases that involve copying large amount of data to/from the ring can benefit from these APIs. Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-29 14:13:31 +01:00
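The zero-copy idea in the ring commit (47bec9a5ca) can be illustrated with a toy single-threaded ring: the caller first reserves a region inside the ring's own storage, writes or reads the objects in place, and then publishes the result. All names below (`toy_ring`, `toy_zc_region`, `toy_enqueue_zc_start`, and so on) are hypothetical and chosen for illustration; this is not the DPDK ring API and it omits the multi-thread synchronization a real ring needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* must be a power of two */

struct toy_ring {
	uint32_t head; /* next slot to write (free-running counter) */
	uint32_t tail; /* next slot to read (free-running counter) */
	void *slots[TOY_RING_SIZE];
};

/* A reserved region may wrap around, so it is described in two parts. */
struct toy_zc_region {
	void **ptr1; uint32_t n1; /* first contiguous part */
	void **ptr2; uint32_t n2; /* second part after the wrap, n2 may be 0 */
};

/* Describe n slots starting at counter value 'start'. */
static void toy_region(struct toy_ring *r, uint32_t start, uint32_t n,
		       struct toy_zc_region *z)
{
	uint32_t idx = start & (TOY_RING_SIZE - 1);
	uint32_t first = TOY_RING_SIZE - idx;

	if (first > n)
		first = n;
	z->ptr1 = &r->slots[idx];
	z->n1 = first;
	z->ptr2 = r->slots;
	z->n2 = n - first;
}

/* Reserve room for exactly n objects; returns n, or 0 if they do not fit. */
static uint32_t toy_enqueue_zc_start(struct toy_ring *r, uint32_t n,
				     struct toy_zc_region *z)
{
	assert(r != NULL && z != NULL);
	if (n > TOY_RING_SIZE - (r->head - r->tail))
		return 0;
	toy_region(r, r->head, n, z);
	return n;
}

/* Publish n objects that the caller wrote directly into the region. */
static void toy_enqueue_zc_finish(struct toy_ring *r, uint32_t n)
{
	r->head += n;
}

/* Expose exactly n queued objects in place; returns n, or 0 if too few. */
static uint32_t toy_dequeue_zc_start(struct toy_ring *r, uint32_t n,
				     struct toy_zc_region *z)
{
	assert(r != NULL && z != NULL);
	if (n > r->head - r->tail)
		return 0;
	toy_region(r, r->tail, n, z);
	return n;
}

/* Release n slots after the caller has consumed them in place. */
static void toy_dequeue_zc_finish(struct toy_ring *r, uint32_t n)
{
	r->tail += n;
}
```

The point of the start/finish split is that the caller never needs an intermediate array: a producer can fill `ptr1[0..n1)` and `ptr2[0..n2)` directly, which is the saving the commit message describes for large transfers.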
Xiaoyun Li	079981e980	examples/tep_term: remove this application This example sets up a scenario that VXLAN packets can be received by different PF queues based on VNID and each queue is bound to a VM with a VNID so that the VM can receive its inner packets. Usually, OVS is used to do the software encap/decap for VXLAN packets. And the VXLAN packets offloading can be replaced with flow rules in testpmd like Chapter "Sample VXLAN flow rules" in Testpmd Application User Guide. And this example hasn't been used for a long time. So deprecate this example. Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-29 12:37:51 +01:00
Vladimir Medvedkin	1e5630e40d	fib6: add AVX512 lookup Add new lookup implementation for FIB6 trie algorithm using AVX512 instruction set Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-28 21:29:13 +01:00
Vladimir Medvedkin	b3509fa365	fib: add AVX512 lookup Add new lookup implementation for DIR24_8 algorithm using AVX512 instruction set Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-28 21:29:11 +01:00
Ray Kinsella	23b4fd825f	doc: update ABI version references Updated references to abi versions in the contributors guide. Fixed an inaccurate reference to a symbol in the policy. Signed-off-by: Ray Kinsella <mdr@ashroe.eu> Reviewed-by: David Marchand <david.marchand@redhat.com>	2020-10-27 08:53:53 +01:00
Ruifeng Wang	ced5a6ce24	lpm: hide internal data Fields except tbl24 and tbl8 in rte_lpm structure have no need to be exposed to the user. Hide the unneeded exposure of structure fields for better ABI maintainability. Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2020-10-24 19:08:06 +02:00
Dharmik Thakkar	769b2de7fb	hash: implement RCU resources reclamation Currently, users have to use external RCU mechanisms to free resources when using lock free hash algorithm. Integrate RCU QSBR process to make it easier for the applications to use lock free algorithm. Refer to RCU documentation to understand various aspects of integrating RCU library into other libraries. Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2020-10-24 09:25:13 +02:00
Thomas Monjalon	513ee0ab74	doc: remove references to make from known issues The config options CONFIG_RTE_* are simple RTE_* defines with meson. Now that make support is dropped, update the HPET config reference. The comment about the AVX512 config option is not relevant anymore. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: David Marchand <david.marchand@redhat.com>	2020-10-23 19:25:21 +02:00
Ruifeng Wang	50ee0c2d0a	doc: update guide for armv8 crypto Added guide about building by using meson. Also added the command to create virtual device. Suggested-by: Akhil Goyal <akhil.goyal@nxp.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>	2020-10-22 23:11:58 +02:00
Kevin Laatz	559fe5f7b6	doc: update patch cheatsheet to use meson With 'make' being removed, the patch cheatsheet needs to be updated to remove any references to 'make'. These references have been replaced with meson alternatives in this patch. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	194124fb67	doc: add to release notes to reflect removal of make Added an entry to describe the removal of the Make build system to the release notes for 20.11. Signed-off-by: Ciara Power <ciara.power@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	532e4e48ca	doc: remove references to make from contributing guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Signed-off-by: Louise Kilheeney <louise.kilheeney@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	95fcf7bff4	doc: remove reference to make from tools guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	e2a94f9ad3	doc: remove references to make from apps guide While make has been deprecated for DPDK, it's still applicable for some example apps to be built standalone, this patch adjusts the guides to take that into consideration. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Nicolas Chautru <nicolas.chautru@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	6250e968ac	doc: remove references to make from rawdevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Rosen Xu <rosen.xu@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	08b1d50543	doc: remove references to make from eventdevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	89515c0348	doc: remove references to make from compressdevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Lee Daly <lee.daly@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	fd5f9fb95f	doc: remove references to make from cryptodevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-22 22:54:05 +02:00
Ciara Power	07a2a57261	doc: remove references to make from bbdevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	dde524d3ff	doc: remove references to make from vdpadevs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Matan Azrad <matan@nvidia.com>	2020-10-22 22:54:05 +02:00
Ciara Power	68d99d00ae	doc: remove references to make from NICs guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Harman Kalra <hkalra@marvell.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2020-10-22 22:54:05 +02:00
Ciara Power	a3b34b1df8	doc: remove references to make from mempool guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-22 22:54:05 +02:00
Ciara Power	d2e65d43fe	doc: remove references to make from platforms guide Make is no longer supported for compiling DPDK, references are now removed in the documentation. Signed-off-by: Ciara Power <ciara.power@intel.com> Signed-off-by: Harman Kalra <hkalra@marvell.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-22 22:54:05 +02:00
David Marchand	30105f664f	drivers: add headers install helper A lot of drivers export headers; reproduce the same facility as for libraries. Note: this change fixes an issue with the crypto scheduler headers which were not installed properly. A separate backport will be sent to stable branches. Suggested-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-22 14:16:22 +02:00
David Marchand	6b3848e211	build: fix version map file references in documentation Fixes: 63b3907833d8 ("build: remove library name from version map file name") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-22 14:11:49 +02:00
Stephen Hemminger	d250589d57	net/memif: replace master/slave arguments Replace master/slave terms in this driver. The memory interface driver uses a client/server architecture, so change the variable names and device arguments to match. The previous devargs are maintained for compatibility, but cause a notice in the log if used. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2020-10-20 13:17:08 +02:00
Stephen Hemminger	cb056611a8	eal: rename lcore master and slave Replace master lcore with main lcore and replace slave lcore with worker lcore. Keep the old functions and macros but mark them as deprecated for this release. The "--master-lcore" command line option is also deprecated and any usage will print a warning and use "--main-lcore" as replacement. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2020-10-20 13:17:08 +02:00
Stephen Hemminger	57c789fd94	doc: add policy about master/slave words Update the coding style document to include a policy against introducing new master/slave usage. This is taken from the similar place in the Linux kernel coding style. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2020-10-20 11:41:41 +02:00
Bruce Richardson	a20b2c01a7	build: standardize component names and defines As discussed on the dpdk-dev mailing list[1], we can make some easy improvements in standardizing the naming of the various components in DPDK, and their associated feature-enabled macros. Following this patch, each library will have a name in the format 'librte_<name>.so', and the macro indicating that the library is enabled in the build will have the form 'RTE_LIB_<NAME>'. Similarly, for drivers, the equivalent name formats and macros are: 'librte_<class>_<name>.so' and 'RTE_<CLASS>_<NAME>', where class is the device type taken from the relevant driver subdirectory name, i.e. 'net', 'crypto' etc. To avoid too many changes at once for end applications, the old macro names will still be provided in the build in this release, but will be removed subsequently. [1] http://inbox.dpdk.org/dev/ef7c1a87-79ab-e405-4202-39b7ad6b0c71@solarflare.com/t/#u Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Rosen Xu <rosen.xu@intel.com>	2020-10-19 22:15:34 +02:00
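The macro naming scheme in the build standardization commit (a20b2c01a7) is mechanical enough to express in a few lines. The helpers below are hypothetical (`toy_lib_macro`, `toy_drv_macro`) and only format strings according to the 'RTE_LIB_<NAME>' and 'RTE_<CLASS>_<NAME>' patterns quoted in the message; they are not part of the DPDK build system, which generates these defines from meson.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Upper-case a NUL-terminated ASCII string in place. */
static void toy_upcase(char *s)
{
	for (; *s != '\0'; s++)
		*s = (char)toupper((unsigned char)*s);
}

/* Build "RTE_LIB_<NAME>"; returns 0 on success, -1 if 'out' is too small. */
static int toy_lib_macro(const char *name, char *out, size_t cap)
{
	assert(name != NULL && out != NULL);
	if (strlen("RTE_LIB_") + strlen(name) + 1 > cap)
		return -1;
	snprintf(out, cap, "RTE_LIB_%s", name);
	toy_upcase(out);
	return 0;
}

/* Build "RTE_<CLASS>_<NAME>"; returns 0 on success, -1 if 'out' is too small. */
static int toy_drv_macro(const char *cls, const char *name, char *out,
			 size_t cap)
{
	assert(cls != NULL && name != NULL && out != NULL);
	if (strlen("RTE_") + strlen(cls) + 1 + strlen(name) + 1 > cap)
		return -1;
	snprintf(out, cap, "RTE_%s_%s", cls, name);
	toy_upcase(out);
	return 0;
}
```

For instance, a library named `ethdev` maps to `RTE_LIB_ETHDEV`, and a driver named `ixgbe` in class `net` maps to `RTE_NET_IXGBE` under this pattern.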
Bruce Richardson	c0a775a141	doc: add SPDX license tag header to meson guide The build-sdk-meson.rst file originates from the short plain-text meson instructions added in 2018. Add SPDX tag and copyright notice based on the original commit. Fixes: 9c3adc289c5e ("doc: add instructions on build using meson") Cc: stable@dpdk.org Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-19 18:21:44 +02:00
Ciara Power	1e6a661302	acl: check max SIMD bitwidth When choosing a vector path to take, an extra condition must be satisfied to ensure the max SIMD bitwidth allows for the CPU enabled path. These checks are added in the check alg helper functions. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-19 16:45:02 +02:00
Ciara Power	4635c840ce	doc: describe how to enable AVX512 This patch adds documentation on the usage of the max SIMD bitwidth EAL setting to enable AVX-512 at runtime. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2020-10-19 16:45:02 +02:00
Ciara Power	580af30dd6	eal: control max SIMD bitwidth This patch adds a max SIMD bitwidth EAL configuration. The API allows for an app to set this value. It can also be set using EAL argument --force-max-simd-bitwidth, which will lock the value and override any modifications made by the app. Each arch has a define for the default SIMD bitwidth value, this is used on EAL init to set the config max SIMD bitwidth. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2020-10-19 16:45:02 +02:00
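The interaction described in the max SIMD bitwidth commit (580af30dd6) can be sketched as follows: the application may change the limit unless it was locked from the command line, and a vector path is taken only when both the CPU and the configured limit allow it. The types and functions here (`toy_simd_cfg`, `toy_set_max_simd_bitwidth`, `toy_select_path`) are hypothetical illustrations, not the EAL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_simd_path {
	TOY_PATH_SCALAR = 0,
	TOY_PATH_128 = 128,
	TOY_PATH_256 = 256,
	TOY_PATH_512 = 512,
};

struct toy_simd_cfg {
	uint16_t max_bitwidth; /* current limit, initialised from an arch default */
	bool forced;           /* true when the limit was set on the command line */
};

/* Application-side setter: refused once the value has been locked. */
static int toy_set_max_simd_bitwidth(struct toy_simd_cfg *cfg, uint16_t bits)
{
	assert(cfg != NULL);
	if (cfg->forced)
		return -1;
	cfg->max_bitwidth = bits;
	return 0;
}

/* Pick the widest path allowed by both the CPU and the configured limit. */
static enum toy_simd_path toy_select_path(const struct toy_simd_cfg *cfg,
					  uint16_t cpu_max_bitwidth)
{
	static const uint16_t widths[] = { 512, 256, 128 };

	assert(cfg != NULL);
	for (size_t i = 0; i < sizeof(widths) / sizeof(widths[0]); i++) {
		if (cpu_max_bitwidth >= widths[i] &&
		    cfg->max_bitwidth >= widths[i])
			return (enum toy_simd_path)widths[i];
	}
	return TOY_PATH_SCALAR;
}
```

The same two-condition check is what the acl commit (1e6a661302) describes adding to its algorithm-selection helpers: CPU support alone is no longer sufficient to choose a vector path.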
Akhil Goyal	e30b2833c4	security: update session create API The API ``rte_security_session_create`` takes only a single mempool for both the session and the session private data. So the application needs to create a mempool for twice the number of sessions needed, which also wastes memory because the session private data needs more memory than the session itself. Hence the API is modified to take two mempool pointers - one for the session and one for the private data. This is very similar to the crypto session create APIs. Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com> Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>	2020-10-19 09:54:54 +02:00
Konstantin Ananyev	5636d60347	examples/l3fwd-acl: select ACL classify method Replace '--scalar' command-line option with new one: --alg=<algname> to allow user explicitly select desired classify method. This is an optional parameter, if not specified default classify algorithm will be used. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-19 09:20:25 +02:00
Konstantin Ananyev	20f76bb666	examples/l3fwd-acl: update MAC addresses Introduces two changes into l3fwd-acl behaviour to make it behave in the same way as l3fwd: - Add a command-line parameter to allow the user to specify the destination mac address for each ethernet port used. - While forwarding the packet update source and destination mac addresses. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-19 09:20:25 +02:00
Ferruh Yigit	a72cb3e765	doc: announce queue stats moving to xstats Queue stats will be moved from basic stats to xstats. It will be the PMD's responsibility to fill queue stats based on the number of queues it has. Until all PMDs implement the xstats, a temporary 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' device flag is created. PMDs that have switched to xstats should clear this flag to bypass the ethdev layer autofill for queue stats. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-10-16 23:27:15 +02:00
Ivan Ilchenko	58af59172b	ethdev: allow stop function to return an error Change rte_eth_dev_stop() return value from void to int and return negative errno values in case of error conditions. Also update the usage of the function in ethdev according to the new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 22:26:41 +02:00
Thomas Monjalon	8a5a0aad5d	ethdev: allow close function to return an error The API function rte_eth_dev_close() was returning void. The return type is changed to int for notifying of errors. If an error happens during a close operation, the status of the port is undefined, a maximum of resources having been freed. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Liron Himi <lironh@marvell.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 22:26:41 +02:00
Viacheslav Ovsiienko	91c78e090e	app/testpmd: add rxoffs commands and parameters Add command line parameter: --rxoffs=X[,Y] Sets the offsets of packet segments from the beginning of the receiving buffer if the split feature is engaged. Affects only the queues configured with split offloads (currently only BUFFER_SPLIT is supported). Add an interactive mode command providing the same: testpmd> set rxoffs (x[,y]) Where x[,y] represents a CSV list of values, without white space. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 22:26:40 +02:00
Viacheslav Ovsiienko	0f2096d7ab	app/testpmd: add rxpkts commands and parameters Add command line parameter: --rxpkts=X[,Y] Sets the length of segments to scatter packets on receiving if the split feature is engaged. Affects only the queues configured with split offloads (currently only BUFFER_SPLIT is supported). Add an interactive mode command providing the same: testpmd> set rxpkts (x[,y]) Where x[,y] represents a CSV list of values, without white space. Optionally, multiple memory pools can be specified with the --mbuf-size command line parameter, and the mbufs to receive will be allocated sequentially from these extra memory pools. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 22:26:40 +02:00
Viacheslav Ovsiienko	26cbb4191e	app/testpmd: add multiple pools per core creation The command line parameter --mbuf-size is updated; it can now handle multiple values, for example: --mbuf-size=2176,512,768,4096 which requests the creation of extra memory pools with the given mbuf data buffer sizes. If a buffer split feature is engaged, the extra memory pools can be used to configure the Rx queues with rte_the_dev_rx_queue_setup_ex(). The extra pools are created with the requested sizes, and pool names are assigned with an appended index: mbuf_pool_socket_%socket_%index. Index zero is used to specify the first mandatory pool to maintain compatibility with existing code. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 22:26:40 +02:00
Viacheslav Ovsiienko	4ff702b5df	ethdev: introduce Rx buffer split The DPDK datapath in the transmit direction is very flexible. An application can build a multi-segment packet and manage almost all data aspects - the memory pools where segments are allocated from, the segment lengths, the memory attributes like external buffers, registered for DMA, etc. In the receiving direction, the datapath is much less flexible: an application can only specify the memory pool to configure the receiving queue and nothing more. In order to extend the receiving datapath capabilities, it is proposed to add a way to provide extended information on how to split the packets being received. The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device capabilities is introduced so that a PMD can report to the application that it supports splitting Rx packets into configurable segments. Prior to invoking the rte_eth_rx_queue_setup() routine, the application should check the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag. The following structure is introduced to specify the Rx packet segment for the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:
struct rte_eth_rxseg_split {
    struct rte_mempool *mp; /* memory pools to allocate segment from */
    uint16_t length; /* segment maximal data length, configures "split point" */
    uint16_t offset; /* data offset from beginning of mbuf data buffer */
    uint32_t reserved; /* reserved field */
};
The segment descriptions are added to the rte_eth_rxconf structure:
rx_seg - pointer to the array of segment descriptions; each element describes the memory pool, maximal data length, and initial data offset from the beginning of the data buffer in the mbuf. This array allows different settings to be specified for each segment individually.
rx_nseg - number of elements in the array
If the extended segment descriptions are provided with these new fields, the mp parameter of rte_eth_rx_queue_setup must be specified as NULL to avoid ambiguity.
There are two options to specify the Rx buffer configuration:
- mp is not NULL, rx_conf.rx_nseg is zero: this is the compatible configuration, which follows the existing implementation and provides a single pool and no description for segment sizes and offsets.
- mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not zero: this provides the extended configuration, individually for each segment.
If the Rx queue is configured with the new settings, the packets being received will be split into multiple segments pushed to the mbufs with the specified attributes. The PMD will split the received packets into multiple segments according to the specification in the description array. For example, let's suppose we configured the Rx queue with the following segments:
seg0 - pool0, len0=14B, off0=2B
seg1 - pool1, len1=20B, off1=128B
seg2 - pool2, len2=20B, off2=0B
seg3 - pool3, len3=512B, off3=0B
A packet 46 bytes long will look like the following:
seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - 20B long @ 128 in mbuf from pool1
seg2 - 12B long @ 0 in mbuf from pool2
A packet 1500 bytes long will look like the following:
seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - 20B @ 128 in mbuf from pool1
seg2 - 20B @ 0 in mbuf from pool2
seg3 - 512B @ 0 in mbuf from pool3
seg4 - 512B @ 0 in mbuf from pool3
seg5 - 422B @ 0 in mbuf from pool3
The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and configured to support the new buffer split feature (if rx_nseg is greater than one). The split limitations imposed by the underlying PMD are reported in the newly introduced rte_eth_dev_info->rx_seg_capa field. The new approach would allow splitting the ingress packets into multiple parts pushed to memory with different attributes. For example, the packet headers can be pushed to the embedded data buffers within mbufs and the application data into the external buffers attached to mbufs allocated from different memory pools.
The memory attributes for the split parts may differ as well - for example, the application data may be pushed into external memory located on a dedicated physical device, say a GPU or NVMe. This would improve the flexibility of the DPDK receiving datapath while preserving compatibility with the existing API. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 22:26:40 +02:00
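The two worked examples in the Rx buffer split commit (4ff702b5df) follow a simple rule: each configured segment length is consumed in order, and the last configured segment is reused until the packet is exhausted. The helper below (`toy_split_lengths`, a hypothetical name) only computes the resulting segment lengths under that reading of the commit message; it does not model offsets or pools, and an actual PMD may impose further limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute how a packet of pkt_len bytes is cut according to the configured
 * segment lengths seg_len[0..nseg). The last configured length repeats, as in
 * the 1500-byte example of the commit message.
 *
 * Writes the resulting lengths to out[] and returns the number of segments,
 * or 0 if the configuration is invalid or out[] is too small.
 */
static size_t toy_split_lengths(const uint16_t *seg_len, size_t nseg,
				uint32_t pkt_len, uint32_t *out,
				size_t out_cap)
{
	size_t n = 0;

	assert(seg_len != NULL && out != NULL);
	if (nseg == 0)
		return 0;

	while (pkt_len > 0) {
		size_t cfg = n < nseg ? n : nseg - 1; /* reuse last entry */
		uint32_t chunk = seg_len[cfg];

		if (chunk == 0 || n == out_cap)
			return 0;
		if (chunk > pkt_len)
			chunk = pkt_len;
		out[n++] = chunk;
		pkt_len -= chunk;
	}
	return n;
}
```

With the configuration from the message (14, 20, 20 and 512 bytes), a 46-byte packet yields three segments and a 1500-byte packet yields six, the last one carrying the remaining 422 bytes.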
Gregory Etelson	1b9f274623	app/testpmd: add commands for tunnel offload Tunnel Offload API provides hardware independent, unified model to offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers during entire offload procedure;
- restore outer header of partially offloaded packet;
- model is implemented as a set of helper functions.
Implementation details:
* Create application tunnel:
  flow tunnel create <port> type <tunnel type>
  On success, the command creates application tunnel object and returns the tunnel descriptor. Tunnel descriptor is used in subsequent flow creation commands to reference the tunnel.
* Create tunnel steering flow rule:
  tunnel_set <tunnel descriptor> parameter used with steering rule template.
* Create tunnel matching flow rule:
  tunnel_match <tunnel descriptor> used with matching rule template.
* If tunnel steering rule was offloaded, outer header of a partially offloaded packet is restored after miss.
Example:
test packet=
<Ether dst=24:8a:07:8d:ae:d6 src=50:6b:4b:cc:fc:e2 type=IPv4 |
<IP version=4 ihl=5 proto=udp src=1.1.1.1 dst=1.1.1.10 |
<UDP sport=4789 dport=4789 len=58 chksum=0x7f7b |
<VXLAN NextProtocol=Ethernet vni=0x0 |
<Ether dst=24:aa:aa:aa:aa:d6 src=50:bb:bb:bb:bb:e2 type=IPv4 |
<IP version=4 ihl=5 proto=icmp src=2.2.2.2 dst=2.2.2.200 |
<ICMP type=echo-request code=0 chksum=0xf7ff id=0x0 seq=0x0 |>>>>>>>
>>> len(packet)
92
testpmd> flow flush 0
testpmd> port 0/queue 0: received 1 packets
src=50:6B:4B:CC:FC:E2 - dst=24:8A:07:8D:AE:D6 - type=0x0800 - length=92
testpmd> flow tunnel 0 type vxlan
port 0: flow tunnel #1 type vxlan
testpmd> flow create 0 ingress group 0 tunnel_set 1 pattern eth /ipv4 / udp dst is 4789 / vxlan / end actions jump group 0 / end
Flow rule #0 created
testpmd> port 0/queue 0: received 1 packets
tunnel restore info:
 - vxlan tunnel
 - outer header present # <--
src=50:6B:4B:CC:FC:E2 - dst=24:8A:07:8D:AE:D6 - type=0x0800 - length=92
testpmd> flow create 0 ingress group 0 tunnel_match 1 pattern eth / ipv4 / udp dst is 4789 / vxlan / eth / ipv4 / end actions set_mac_dst mac_addr 02:CA:FE:CA:FA:80 / queue index 0 / end
Flow rule #1 created
testpmd> port 0/queue 0: received 1 packets
src=50:BB:BB:BB:BB:E2 - dst=02:CA:FE:CA:FA:80 - type=0x0800 - length=42
* Destroy flow tunnel
  flow tunnel destroy <port> id <tunnel id>
* Show existing flow tunnels
  flow tunnel list <port>
Signed-off-by: Gregory Etelson <getelson@nvidia.com>	2020-10-16 19:48:19 +02:00
Eli Britstein	9ec0f97e02	ethdev: add tunnel offload model rte_flow API provides the building blocks for vendor-agnostic flow classification offloads. The rte_flow "patterns" and "actions" primitives are fine-grained, thus enabling DPDK applications the flexibility to offload network stacks and complex pipelines. Applications wishing to offload tunneled traffic are required to use the rte_flow primitives, such as group, meta, mark, tag, and others to model their high-level objects. The hardware model design for high-level software objects is not trivial. Furthermore, an optimal design is often vendor-specific. When hardware offloads tunneled traffic in multi-group logic, partially offloaded packets may arrive to the application after they were modified in hardware. In this case, the application may need to restore the original packet headers. Consider the following sequence: The application decaps a packet in one group and jumps to a second group where it tries to match on a 5-tuple, that will miss and send the packet to the application. In this case, the application does not receive the original packet but a modified one. Also, in this case, the application cannot match on the outer header fields, such as VXLAN vni and 5-tuple. There are several possible ways to use rte_flow "patterns" and "actions" to resolve the issues above. For example: 1 Mapping headers to a hardware registers using the rte_flow_action_mark/rte_flow_action_tag/rte_flow_set_meta objects. 2 Apply the decap only at the last offload stage after all the "patterns" were matched and the packet will be fully offloaded. Every approach has its pros and cons and is highly dependent on the hardware vendor. For example, some hardware may have a limited number of registers while other hardware could not support inner actions and must decap before accessing inner headers. The tunnel offload model resolves these issues. 
The model goals are: 1. Provide a unified application API to offload tunneled traffic that is capable of matching on outer headers after decap. 2. Allow the application to restore the outer header of partially offloaded packets. The tunnel offload model does not introduce new elements to the existing RTE flow model and is implemented as a set of helper functions. For the application to work with the tunnel offload API it has to adjust flow rules in multi-table tunnel offload in the following way: 1. Remove the explicit decap action and replace it with PMD actions obtained from the rte_flow_tunnel_decap_and_set() helper. 2. Add PMD items obtained from the rte_flow_tunnel_match() helper to all other rules in the tunnel offload sequence. VXLAN Code example: Assume the application needs to do inner NAT on the VXLAN packet. The first rule in group 0: flow create <port id> ingress group 0 pattern eth / ipv4 / udp dst is 4789 / vxlan / end actions {pmd actions} / jump group 3 / end The first VXLAN packet that arrives matches the rule in group 0 and jumps to group 3. In group 3 the packet misses, since there is no flow to match, and is sent to the application. The application calls rte_flow_get_restore_info() to get the packet outer header. The application then inserts a new rule in group 3 to match outer and inner headers: flow create <port id> ingress group 3 pattern {pmd items} / eth / ipv4 dst is 172.10.10.1 / udp dst is 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 / end actions set_ipv4_dst 186.1.1.1 / queue index 3 / end As a result of these rules, a VXLAN packet with vni=10, outer IPv4 dst=172.10.10.1 and inner IPv4 dst=184.1.2.3 is received decapped on queue 3 with IPv4 dst=186.1.1.1. Note: The packet in group 3 is considered decapped. All actions in that group are applied to the header that was inner before decap. The application may specify an outer header to be matched on. It is the PMD's responsibility to translate these items to outer metadata. API usage: /** * 1. 
Initiate RTE flow tunnel object */
const struct rte_flow_tunnel tunnel = {
    .type = RTE_FLOW_ITEM_TYPE_VXLAN,
    .tun_id = 10,
};

/*
 * 2. Obtain PMD tunnel actions
 *
 * pmd_actions is an intermediate variable the application uses to
 * compile the actions array
 */
struct rte_flow_action *pmd_actions;
rte_flow_tunnel_decap_and_set(&tunnel, &pmd_actions, &num_pmd_actions, &error);

/*
 * 3. Offload the first rule, which matches on VXLAN traffic
 * and jumps to group 3 (implicitly decaps the packet)
 */
app_actions = jump group 3
rule_items = app_items; /* eth / ipv4 / udp / vxlan */
rule_actions = { pmd_actions, app_actions };
attr.group = 0;
flow_1 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error);

/*
 * 4. After flow creation the application does not need to keep the
 * tunnel action resources.
 */
rte_flow_tunnel_action_release(port_id, pmd_actions, num_pmd_actions);

/*
 * 5. A partially offloaded packet misses in group 3 because there
 * was no matching rule; handle the miss
 */
struct rte_flow_restore_info info;
rte_flow_get_restore_info(port_id, mbuf, &info, &error);

/*
 * 6. Offload NAT rule:
 */
app_items = { eth / ipv4 dst is 172.10.10.1 / udp dst is 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 }
app_actions = { set_ipv4_dst 186.1.1.1 / queue index 3 }
rte_flow_tunnel_match(&info.tunnel, &pmd_items, &num_pmd_items, &error);
rule_items = {pmd_items, app_items};
rule_actions = app_actions;
attr.group = info.group_id;
flow_2 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error);

/*
 * 7. Release PMD items after rule creation
 */
rte_flow_tunnel_item_release(port_id, pmd_items, num_pmd_items);

References
1. https://mails.dpdk.org/archives/dev/2020-June/index.html

Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
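The commit message writes the combined action list as rule_actions = { pmd_actions, app_actions } without showing how the array is built. A minimal sketch of that concatenation follows. It assumes the application's list is END-terminated while the PMD list comes with an explicit count, and it uses stand-in types only: sketch_action, the SKETCH_ACTION_* values and sketch_compile_actions() are hypothetical names for illustration, not DPDK definitions.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the rte_flow action types; not the DPDK definitions. */
enum sketch_action_type {
    SKETCH_ACTION_END = 0,          /* list terminator */
    SKETCH_ACTION_JUMP = 1,         /* an ordinary application action */
    SKETCH_ACTION_PMD_PRIVATE = -1, /* negative value: PMD private action */
};

struct sketch_action {
    int type;
    const void *conf;
};

/*
 * Build { pmd_actions..., app_actions..., END } in a newly allocated array.
 * app_actions must be terminated by SKETCH_ACTION_END; pmd_actions is
 * described by its count. The caller frees the result.
 * Returns NULL on allocation failure.
 */
static struct sketch_action *
sketch_compile_actions(const struct sketch_action *pmd_actions,
                       size_t num_pmd_actions,
                       const struct sketch_action *app_actions,
                       size_t *num_out)
{
    size_t num_app = 0;

    /* Count application actions, excluding the END terminator. */
    while (app_actions[num_app].type != SKETCH_ACTION_END)
        num_app++;

    size_t total = num_pmd_actions + num_app + 1;
    struct sketch_action *rule = calloc(total, sizeof(*rule));
    if (rule == NULL)
        return NULL;

    /* PMD actions go first so the implicit decap precedes the jump. */
    if (num_pmd_actions > 0)
        memcpy(rule, pmd_actions, num_pmd_actions * sizeof(*rule));
    /* Copy the application actions together with their END terminator. */
    memcpy(rule + num_pmd_actions, app_actions, (num_app + 1) * sizeof(*rule));

    if (num_out != NULL)
        *num_out = total;
    return rule;
}
```

Because the combined array is a separate copy in this sketch, the PMD-provided array can be released right after rule creation, which matches step 4 of the commit message. The same pattern applies to rule_items = {pmd_items, app_items} in step 6.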
Gregory Etelson	5d1bff8fe2	ethdev: allow negative values in flow rule types RTE flow items & actions use positive values in item & action type. Negative values are reserved for PMD private types. PMD items & actions usually are not exposed to the application and are not used to create RTE flows. The patch allows applications with access to PMD flow items & actions to combine RTE and PMD items & actions and use them to create flow rules. The RTE flow item or action conversion library accepts only known positive element types with predefined sizes. Private PMD items and actions do not fit into this scheme because PMD type values are negative, each PMD has its own type numbering, and element types and their sizes are not visible at the RTE level. To resolve these limitations the patch proposes this solution: 1. A PMD can expose elements of pointer size only. RTE flow conversion functions will use pointer size for each configuration object in a private PMD element they process; 2. RTE flow verification will not reject elements with a negative type. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
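The size rule described in this commit can be illustrated with a small lookup. This is only a sketch of the idea: known non-negative types take their size from a descriptor table, and any negative (PMD private) type is treated as a pointer-sized configuration object. The table contents, the sizes in it, and the names sketch_desc, sketch_item_desc and sketch_conf_size() are hypothetical and are not taken from the DPDK sources.

```c
#include <stddef.h>

/* Hypothetical descriptor table; names and sizes are illustrative only. */
struct sketch_desc {
    const char *name;
    size_t size; /* size of the configuration object for this type */
};

static const struct sketch_desc sketch_item_desc[] = {
    { "END", 0 },
    { "ETH", 16 },
    { "IPV4", 20 },
};

#define SKETCH_NUM_ITEMS (sizeof(sketch_item_desc) / sizeof(sketch_item_desc[0]))

/*
 * Report the configuration size for an element type.
 *  - type < 0:  PMD private element, assumed to be pointer-sized
 *  - known type: size from the descriptor table
 *  - unknown non-negative type: rejected with -1
 * Returns 0 on success and stores the size in *size.
 */
static int
sketch_conf_size(int type, size_t *size)
{
    if (type < 0) {
        /* Private types are opaque at this level, so use pointer size. */
        *size = sizeof(void *);
        return 0;
    }
    if ((size_t)type >= SKETCH_NUM_ITEMS)
        return -1;
    *size = sketch_item_desc[type].size;
    return 0;
}
```

With this rule a generic copy or conversion routine can handle a mixed list of RTE and PMD elements without knowing each PMD's private numbering, at the cost of restricting private elements to pointer-sized configuration.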


4026 Commits