numam-dpdk

Author	SHA1	Message	Date
Ori Kam	b10a421a1f	ethdev: add packet integrity check flow rules Currently, DPDK application can offload the checksum check, and report it in the mbuf. However, as more and more applications are offloading some or all logic and action to the HW, there is a need to check the packet integrity so the right decision can be taken. The application logic can be positive meaning if the packet is valid jump / do actions, or negative if packet is not valid jump to SW / do actions (like drop) and add default flow (match all in low priority) that will direct the miss packet to the miss path. Since currently rte_flow works in positive way the assumption is that the positive way will be the common way in this case also. When thinking what is the best API to implement such feature, we need to consider the following (in no specific order): 1. API breakage. 2. Simplicity. 3. Performance. 4. HW capabilities. 5. rte_flow limitation. 6. Flexibility. First option: Add integrity flags to each of the items. For example add checksum_ok to IPv4 item. Pros: 1. No new rte_flow item. 2. Simple in the way that on each item the app can see what checks are available. Cons: 1. API breakage. 2. Increase number of flows, since app can't add global rule and must have dedicated flow for each of the flow combinations, for example matching on ICMP traffic or UDP/TCP traffic with IPv4 / IPv6 will result in 5 flows. Second option: dedicated item Pros: 1. No API breakage, and there will be no for some time due to having extra space. (by using bits) 2. Just one flow to support the ICMP or UDP/TCP traffic with IPv4 / IPv6. 3. Simplicity application can just look at one place to see all possible checks. 4. Allow future support for more tests. Cons: 1. New item, that holds number of fields from different items. For starter the following bits are suggested: 1. packet_ok - means that all HW checks depending on packet layer have passed. 
This may mean that in some HW such a flow should be split into a number of flows or fail. 2. l2_ok - all checks for layer 2 have passed. 3. l3_ok - all checks for layer 3 have passed. If the packet doesn't have an L3 layer this check should fail. 4. l4_ok - all checks for layer 4 have passed. If the packet doesn't have an L4 layer this check should fail. 5. l2_crc_ok - the layer 2 CRC is O.K. 6. ipv4_csum_ok - IPv4 checksum is O.K. It is possible that the IPv4 checksum will be O.K. but the l3_ok will be 0. It is not possible that checksum will be 0 and the l3_ok will be 1. 7. l4_csum_ok - layer 4 checksum is O.K. 8. l3_len_ok - check that the reported layer 3 length is smaller than the frame length. Example of usage: 1. Check packets from all possible layers for integrity. flow create integrity spec packet_ok = 1 mask packet_ok = 1 ..... 2. Check only packets with layer 4 (UDP / TCP) flow create integrity spec l3_ok = 1, l4_ok = 1 mask l3_ok = 1 l4_ok = 1 Signed-off-by: Ori Kam <orika@nvidia.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-19 19:05:17 +02:00
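The spec/mask matching described in commit b10a421a1f can be sketched in plain C as follows. This is only an illustration of the idea: the bit positions simply follow the order in which the commit message lists the fields, and the names `INTEG_*` and `integrity_match()` are hypothetical and not part of the rte_flow API.

```c
#include <stdint.h>

/* Hypothetical bit assignments; the order mirrors the list in the commit
 * message and is an assumption, not the layout used by rte_flow. */
enum {
    INTEG_PACKET_OK    = 1u << 0,
    INTEG_L2_OK        = 1u << 1,
    INTEG_L3_OK        = 1u << 2,
    INTEG_L4_OK        = 1u << 3,
    INTEG_L2_CRC_OK    = 1u << 4,
    INTEG_IPV4_CSUM_OK = 1u << 5,
    INTEG_L4_CSUM_OK   = 1u << 6,
    INTEG_L3_LEN_OK    = 1u << 7,
};

/* A packet matches when every bit selected by the mask has the value
 * requested in the spec; bits outside the mask are ignored. */
static int integrity_match(uint64_t pkt_bits, uint64_t spec, uint64_t mask)
{
    return (pkt_bits & mask) == (spec & mask);
}
```

With spec and mask both set to `INTEG_L3_OK | INTEG_L4_OK`, this corresponds to the second usage example in the commit message: only packets whose layer 3 and layer 4 checks both passed would match.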
Bing Zhao	4b61b8774b	ethdev: introduce indirect flow action Right now, rte_flow_shared_action_* APIs are used for some shared actions, like RSS, count. The shared action should be created before using it inside a flow. These shared actions sometimes are not really shared but just some indirect actions decoupled from a flow. The new functions rte_flow_action_handle_* are added to replace the current shared functions rte_flow_shared_action_*. There are two types of flow actions: 1. the direct (normal) actions that could be created and stored within a flow rule. Such an action is tied to its flow rule and cannot be reused. 2. the indirect action, in the past named shared_action. It is created from a direct action, like count or rss, and then used in the flow rules with an object handle. The PMD will take care of resolving the indirect action to the direct action when it is referenced. The indirect action is accessed (update / query) w/o any flow rule, just via the action object handle. For example, when querying or resetting a counter, it could be done out of any flow using this counter, but only the handle of the counter action object is required. The indirect action object could be shared by different flows or used by a single flow, depending on the direct action type and the real-life requirements. The handle of an indirect action object is opaque and defined in each driver and possibly different per direct action type. The old name "shared" is improper in a sense and should be replaced. Since the APIs are changed from "rte_flow_shared_action" to the new "rte_flow_action_handle", the testpmd application code and command line interfaces also need to be adapted. The testpmd application user guide is also updated. All the "shared action" related parts are replaced with "indirect action" to have a correct explanation. The parameter of the "update" interface is also changed. 
A general pointer will replace the rte_flow_action struct pointer due to the following facts: 1. Some actions may not support updating fields. In the example of a counter, the only "update" supported should be the reset. So passing a rte_flow_action struct pointer is meaningless and there is not even a corresponding action struct. What's more, if more than one operation should be supported for some other action, such a pointer parameter may not meet the need. 2. Some actions may need a conditional or partial update; the current parameter does not provide the ability to indicate which part(s) to update. For different types of indirect action objects, the pointer could either be the same rte_flow_action struct - in order not to break the current driver implementation - or some wrapper structures with bits as masks to indicate which part is to be updated, depending on the real needs of the corresponding direct action. For different direct actions, the structures used to update indirect action objects will be different. All the underlying PMD callbacks will be moved to these new APIs. The RTE_FLOW_ACTION_TYPE_SHARED is kept for now in order not to break the ABI. All the implementations are changed to use RTE_FLOW_ACTION_TYPE_INDIRECT. Since the APIs are changed from "rte_flow_shared_action" to the new "rte_flow_action_handle" and the "update" interface's 3rd input parameter is changed to a generic pointer, the mlx5 PMD that uses these APIs needs to be adapted to the new APIs as well. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Andrey Vesnovaty <andreyv@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-19 18:25:42 +02:00
Bruce Richardson	99a2dd955f	lib: remove librte_ prefix from directory names There is no reason for the DPDK libraries to all have 'librte_' prefix on the directory names. This prefix makes the directory names longer and also makes it awkward to add features referring to individual libraries in the build - should the lib names be specified with or without the prefix. Therefore, we can just remove the library prefix and use the library's unique name as the directory name, i.e. 'eal' rather than 'librte_eal' Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 14:04:09 +02:00
Vladimir Medvedkin	28ebff11c2	hash: add predictable RSS This patch adds predictable RSS API. It is based on the idea of searching partial Toeplitz hash collisions. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2021-04-20 23:13:23 +02:00
Vladimir Medvedkin	534fe5f339	doc: add Toeplitz hash guide Add documentation for the Toeplitz hash library. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Reviewed-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: John McNamara <john.mcnamara@intel.com>	2021-04-20 23:12:47 +02:00
Akhil Goyal	f96a8ebb27	eventdev: introduce crypto adapter enqueue API In case an event from a previous stage is required to be forwarded to a crypto adapter and PMD supports internal event port in crypto adapter, exposed via capability RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD, we do not have a way to check in the API rte_event_enqueue_burst(), whether it is for crypto adapter or for eth tx adapter. Hence we need a new API similar to rte_event_eth_tx_adapter_enqueue(), which can send to a crypto adapter. Note that RTE_EVENT_TYPE_* cannot be used to make that decision, as it is meant for event source and not event destination. And event port designated for crypto adapter is designed to be used for OP_NEW mode. Hence, in order to support an event PMD which has an internal event port in crypto adapter (RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode), exposed via capability RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD, application should use rte_event_crypto_adapter_enqueue() API to enqueue events. When internal port is not available(RTE_EVENT_CRYPTO_ADAPTER_OP_NEW mode), application can use API rte_event_enqueue_burst() as it was doing earlier, i.e. retrieve event port used by crypto adapter and bind its event queues to that port and enqueue events using the API rte_event_enqueue_burst(). Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-04-17 18:49:52 +02:00
Asaf Penso	d1355fcc46	doc: add links for build requirements per OS To compile with meson some dependencies should be installed. Section "Getting the Tools" describes what is needed, but there are additional steps per OS. Add links to the Linux, FreeBSD, and Windows guides for more info. Signed-off-by: Asaf Penso <asafp@nvidia.com>	2021-04-17 12:37:38 +02:00
Gabriel Ganne	8c10530836	build: update minimum required Meson version Bump Meson required version to 0.49.2 which is chosen so as to be provided by both redhat-8 and debian-10. Update documentation and travis setup script accordingly. This fixes the following warning: WARNING: Project targeting '>= 0.47.1' but tried to use feature introduced in '0.48.0': console arg in custom_target 'console' argument is used within kernel/linux/kni/meson.build Signed-off-by: Gabriel Ganne <gabriel.ganne@6wind.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-04-16 18:51:51 +02:00
Pavan Nikhilesh	d7c428e557	eventdev: support Rx adapter event vector Add event vector support for event eth Rx adapter, the implementation creates vector flows based on port and queue identifier of the received mbufs. The flow id for SW Rx event vectorization will use 12-bits of queue identifier and 8-bits port identifier when custom flow id is not set for simplicity. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
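The default flow id described in commit d7c428e557 (12 bits of queue identifier plus 8 bits of port identifier) can be illustrated with a small packing helper. The commit message does not say which bits sit where, so placing the queue bits in the low 12 bits and the port bits above them is an assumption, and `rx_vector_flow_id()` is a hypothetical name rather than an eventdev function.

```c
#include <stdint.h>

/* Hypothetical helper: pack 12 bits of queue id and 8 bits of port id into
 * a 20-bit flow id. The bit placement (queue low, port high) is assumed. */
static inline uint32_t rx_vector_flow_id(uint16_t port_id, uint16_t queue_id)
{
    uint32_t queue_bits = (uint32_t)queue_id & 0xFFFu; /* low 12 bits */
    uint32_t port_bits = (uint32_t)port_id & 0xFFu;    /* next 8 bits */

    return queue_bits | (port_bits << 12);
}
```

Under this layout, port 1 and queue 2 would yield 0x1002, and identifiers wider than their field are truncated, so two queues that differ only above bit 11 would share a flow id.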
Pavan Nikhilesh	3da4060a30	eventdev: introduce event vector Tx capability Introduce event vector transmit capability for event eth tx adapter. The capability indicates that the Tx adapter is capable of transmitting event vectors. When rte_event_vector::union_valid is set, the Tx adapter should transmit all the packets to the rte_event_vector::port using the rte_event_vector::queue. If rte_event_vector::union_valid is not set then the Tx adapter should peek into each mbuf to get the destination port and queue pair. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Pavan Nikhilesh	3c838062b9	eventdev: introduce event vector Rx capability Introduce event ethernet Rx adapter event vector capability. If an event eth Rx adapter has the capability of RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR then a given Rx queue can be configured to enable event vectorization by passing the flag RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR to rte_event_eth_rx_adapter_queue_conf::rx_queue_flags while configuring Rx adapter through rte_event_eth_rx_adapter_queue_add(). The max vector size, vector timeout define the vector size and mempool used for allocating vector event are configured through rte_event_eth_rx_adapter_queue_add. The element size of the element in the vector pool should be equal to sizeof(struct rte_event_vector) + (vector_sz * sizeof(uintptr_t)) Application can use `rte_event_vector_pool_create` to create the vector mempool used for rte_event_eth_rx_adapter_queue_conf::vector_mp. The Rx adapter would be responsible for vectorizing the mbufs based on the flow, the vector limits configured by the application and add the vector event of mbufs to the event queue set via rte_event_eth_rx_adapter_queue_conf::ev::queue_id. It should also mark rte_event_vector::union_valid and fill rte_event_vector::port, rte_event_vector::queue. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
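The element-size rule quoted in commit 3c838062b9, sizeof(struct rte_event_vector) + (vector_sz * sizeof(uintptr_t)), can be checked with a stand-in structure. The struct below is a placeholder whose fields are invented for illustration; the real rte_event_vector layout comes from the DPDK headers, and `vector_elt_size()` is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_event_vector: a fixed header followed by a
 * flexible array of pointers. The header fields here are placeholders. */
struct example_event_vector {
    uint64_t header[2];
    uintptr_t ptrs[];
};

/* Size of one mempool element able to hold vector_sz pointers. */
static size_t vector_elt_size(uint16_t vector_sz)
{
    return sizeof(struct example_event_vector) +
           (size_t)vector_sz * sizeof(uintptr_t);
}
```

In an application, rte_event_vector_pool_create (named in the commit message) is the intended way to obtain a correctly sized pool, so a calculation like this is mainly useful for understanding memory budgets.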
Pavan Nikhilesh	1cc44d4092	eventdev: introduce event vector capability Introduce the rte_event_vector data structure, which is capable of holding multiple uintptr_t of the same flow, thereby allowing applications to vectorize their pipeline and reducing the complexity of pipelining the events across multiple stages. This approach also reduces the scheduling overhead on an event device. Add an event vector mempool create handler to create mempools based on the best mempool ops available on a given platform. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Shijith Thotton	a10d79a60b	eventdev: introduce adapter flags for periodic mode A timer adapter in periodic mode can be used to arm periodic timers. This patch adds flags used to advertise capability and configure timer adapter in periodic mode. Capability flag should be set for adapters which support periodic mode. Below is a programming sequence on the usage: /* check for periodic mode support by reading capability. */ rte_event_timer_adapter_caps_get(...); /* create adapter in periodic mode by setting periodic flag (RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) and resolution. */ rte_event_timer_adapter_create_ext(...); /* arm periodic timer of configured resolution */ rte_event_timer_arm_burst(...); /* timer event will be periodically generated at configured resolution till cancel is called. */ while (running) { rte_event_dequeue_burst(...); } /* cancel periodic timer which stops generating events */ rte_event_timer_cancel_burst(...); Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-04-12 09:23:34 +02:00
Alexander Kozyrev	d2cf28145f	doc: add fields enum for modify action in flow guide Fix the documentation about the MODIFY_FIELD flow action. 1. Include the rte_flow_field_id enumeration reference to point to the full list of all supported Field IDs available. 2. Correct the formatting of the MODIFY_FIELD action and the destination/source field definition tables. Fixes: 73b68f4c54a0 ("ethdev: introduce generic modify flow action") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-08 15:21:43 +02:00
Juraj Linkeš	5b3a6ca6fd	build: alias default build as generic The current machine='default' build name is not descriptive. The actual default build is machine='native'. Add an alternative string which does the same build and better describes what we're building: machine='generic'. Leave machine='default' for backwards compatibility. Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-09 19:11:26 +02:00
Thomas Monjalon	fb7ad441d4	ethdev: replace callback getting filter operations Since rte_flow is the only API for filtering operations, the legacy driver interface filter_ctrl was too complicated for the simple task of getting the struct rte_flow_ops. The filter type RTE_ETH_FILTER_GENERIC and the filter operation RTE_ETH_FILTER_GET are removed. The new driver callback flow_ops_get replaces filter_ctrl. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-26 18:37:13 +01:00
Xueming Li	da97592635	ethdev: support PF index in representor With Kernel bonding, multiple underlying PFs are bonded, VFs come from different PF, need to identify representor of VFs unambiguously by adding PF index. This patch introduces optional 'pf' section to representor devargs syntax, examples: representor=pf0vf0 - single VF representor representor=pf[0-1]sf[0-1023] - SF representors from 2 PFs PF type representor is supported by using standalone 'pf' section: representor=pf1 - PF representor Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-03-16 20:15:29 +01:00
Xueming Li	fa4f3fecb9	ethdev: support sub-function representor A SubFunction (SF) is a portion of the PCI device, created on demand; an SF netdev has its own dedicated queues (txq, rxq). An SF netdev supports eswitch representation offload similar to existing PF and VF representors. To support SF representors, this patch introduces new devargs syntax, examples: representor=sf0 - single SubFunction representor representor=sf[1,3,5] - single list representor=sf[0-3] - single range representor=sf[0,2-6,8,10-12] - list with singles and ranges Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-03-16 20:15:29 +01:00
Xueming Li	cebf7f1715	ethdev: support new VF representor syntax Current VF representor syntax: representor=2 - single representor representor=[0-3] - single range To prepare for more representor types, this patch adds compatible VF representor devargs syntax: vf#: representor=vf2 - single representor representor=vf[1,3,5] - single list representor=vf[0-3] - single range representor=vf[0,1,4-7] - list with singles and range For backwards compatibility, representor "#" is interpreted as "vf#". Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-03-16 20:15:29 +01:00
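The VF representor forms listed in commit cebf7f1715 (single id, list, range, and the legacy bare "#") can be illustrated with a small standalone expander. This is a sketch of the syntax only, not the ethdev devargs parser: `parse_vf_representor()` is a hypothetical helper, it handles only the value after `representor=`, and its error handling is an assumption.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: expand "vf2", "vf[0,1,4-7]", "2" or "[0-3]" into
 * ids[]. Returns the number of ids, or -1 on a syntax error or overflow. */
static int parse_vf_representor(const char *s, unsigned *ids, int max)
{
    int n = 0;
    int bracket;

    if (strncmp(s, "vf", 2) == 0)
        s += 2;                      /* a bare "#" is read as "vf#" */
    bracket = (*s == '[');
    if (bracket)
        s++;
    for (;;) {
        char *end;
        unsigned long lo, hi;

        if (!isdigit((unsigned char)*s))
            return -1;
        lo = hi = strtoul(s, &end, 10);
        s = end;
        if (bracket && *s == '-') {  /* ranges are only valid in brackets */
            s++;
            if (!isdigit((unsigned char)*s))
                return -1;
            hi = strtoul(s, &end, 10);
            s = end;
            if (hi < lo)
                return -1;
        }
        for (unsigned long v = lo; v <= hi; v++) {
            if (n >= max)
                return -1;
            ids[n++] = (unsigned)v;
        }
        if (bracket && *s == ',') {
            s++;
            continue;
        }
        break;
    }
    if (bracket) {
        if (*s != ']')
            return -1;
        s++;
    }
    return *s == '\0' ? n : -1;      /* reject trailing characters */
}
```

For example, "vf[0,1,4-7]" would expand to the six ids 0, 1, 4, 5, 6 and 7, while the legacy form "2" yields the same result as "vf2".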
Xiaoyu Min	71b09bd950	doc: add more explanation about flow shared action Added more information of shared action on how to update, query, and the benefits. Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Reviewed-by: Asaf Penso <asafp@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-02-04 15:38:36 +01:00
Liang Ma	682a645438	power: add ethdev power management Add a simple on/off switch that will enable saving power when no packets are arriving. It is based on counting the number of empty polls and, when the number reaches a certain threshold, entering an architecture-defined optimized power state that will wait either until a TSC timestamp expires or until packets arrive. This API mandates a core-to-single-queue mapping (that is, multiple queues per device are supported, but they have to be polled on different cores). This design is using PMD RX callbacks. 1. UMWAIT/UMONITOR: When a certain threshold of empty polls is reached, the core will go into a power optimized sleep while waiting on an address of next RX descriptor to be written to. 2. TPAUSE/Pause instruction This method uses the pause (or TPAUSE, if available) instruction to avoid busy polling. 3. Frequency scaling Reuse existing DPDK power library to scale up/down core frequency depending on traffic volume. Signed-off-by: Liang Ma <liang.j.ma@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-01-29 15:29:48 +01:00
Alexander Kozyrev	73b68f4c54	ethdev: introduce generic modify flow action Implement the generic modify flow API to allow manipulations on an arbitrary header field (as well as mark, metadata or tag) using data from another field or a user-specified value. This generic modify mechanism removes the necessity to implement a separate RTE Flow action every time we need to modify a new packet field in the future. Supported operations are: - set: copy data from source to destination. - add: integer addition, stores the result in destination. - sub: integer subtraction, stores the result in destination. The field ID is used to specify the desired source/destination packet field in order to simplify the API for various encapsulation models. Specifying the packet field ID together with the needed encapsulation level makes it possible to quickly address a packet field in any inner packet header. Alternatively, the special ID (ITEM_START) can be used to point to the very beginning of a packet. This ID in conjunction with the offset parameter provides great flexibility to copy/modify any part of a packet as needed. The number of bits to use from a source as well as the offset can be specified to allow a partial copy or dividing a big packet field into multiple small fields (e.g. copying 128 bits of IPv6 to 4 tags). An immediate value (or a pointer to it) can be specified instead of the level and the offset for the special FIELD_VALUE ID (or FIELD_POINTER). Can be used as a source only. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-01-19 03:30:32 +01:00
Abhinandan Gujjar	1c3ffb9559	cryptodev: add enqueue and dequeue callbacks This patch adds APIs to add/remove callback functions on crypto enqueue/dequeue burst. The callback function will be called for each burst of crypto ops received/sent on a given crypto device queue pair. Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2021-01-19 18:05:44 +01:00
Thomas Monjalon	924e7d8f67	doc: fix figure numbering in graph guide Some figures had a title inside the picture but not in RST file. As a consequence, some versions of Sphinx are emitting a warning. Warning, treated as error: doc/guides/prog_guide/graph_lib.rst:64: no number is assigned for figure: figure-anatomy-of-a-node The titles are moved from SVG to RST, except for graph_mem_layout.svg where in-picture title must be kept. Fixes: 4dc6d8e63c16 ("doc: add graph library guide") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2021-01-15 12:28:19 +01:00
Yi Yang	76f093948f	gso: support VXLAN UDP/IPv4 As most NICs do not support segmentation for VXLAN-encapsulated UDP/IPv4 packets, this patch adds VXLAN UDP/IPv4 GSO support. Signed-off-by: Yi Yang <yangyi01@inspur.com> Acked-by: Jiayu Hu <jiayu.hu@intel.com>	2021-01-15 11:31:28 +01:00
Jiayu Hu	1b7b24389c	vhost: enhance async enqueue for small packets Async enqueue offloads large copies to DMA devices, and small copies are still performed by the CPU. However, it requires users to get enqueue completed packets by rte_vhost_poll_enqueue_completed(), even if they are completed by the CPU when rte_vhost_submit_enqueue_burst() returns. This design incurs extra overheads of tracking completed pktmbufs and function calls, thus degrading performance on small packets. This patch enhances async enqueue for small packets by enabling rte_vhost_submit_enqueue_burst() to return completed packets. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-13 18:51:58 +01:00
Eugeny Parshutin	6a9d1e28f1	doc: add vtune profiling config to prog guide Return back 'profiling with vtune' section to profiling programmers guide with updated instruction on how to enable vtune profiling with meson configuration option. Fixes: 89c67ae2cba7 ("doc: remove references to make from prog guide") Cc: stable@dpdk.org Signed-off-by: Eugeny Parshutin <eugeny.parshutin@linux.intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-01-13 21:25:13 +01:00
Andrew Rybchenko	8ca9bf26f5	ethdev: deprecate shared counters using action attribute A new generic shared actions API may be used to create shared counter. There is no point to keep duplicate COUNT action specific capability to create shared counters. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-11-27 19:16:45 +01:00
Reshma Pattan	2351615116	doc: clarify multi-process roles for pdump Update the pdump library programmers guide and Howto doc with the use of multi process channel replacing socket based communication. Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-11-26 16:32:11 +01:00
Sarosh Arif	c053d9e962	doc: fix grammar This patch corrects a grammatical error by changing 'an DPDK' to 'a DPDK', so that the sentences can become grammatically accurate. Fixes: 2e486e26328c ("doc: remove Intel references from linux guide") Fixes: 48624fd96e7c ("doc: remove Intel references from prog guide") Fixes: e0c7c4731957 ("doc: remove Intel references from sample apps guide") Cc: stable@dpdk.org Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2020-11-26 16:03:16 +01:00
Gregory Etelson	4467fed6f9	doc: update flow API guide for rule removal on stop There is a discrepancy between ethdev API and flow rules guide regarding flow rules maintenance after port stop. librte_ethdev.h declares that flow rules will not be stored in PMD after port stop: >>>>> Quote start Please note that some configuration is not stored between calls to rte_eth_dev_stop()/rte_eth_dev_start(). The following configuration will be retained: - MTU - flow control settings - receive mode configuration (promiscuous mode, all-multicast mode, hardware checksum mode, RSS/VMDQ settings etc.) - VLAN filtering configuration - default MAC address - MAC addresses supplied to MAC address array - flow director filtering mode (but not filtering rules) - NIC queue statistics mappings <<<< Quote end PMD cannot always correctly restore flow rules after port stop / port start because application may alter port configuration after port stop without PMD knowledge about undergoing changes. Consider the following scenario: application configures 2 queues 0 and 1 and creates a flow rule with 'queue index 1' action. After that application stops the port and removes queue 1. Although PMD can implement flow rule shadow copy to be used for restore after port start, attempt to restore flow rule from shadow will fail in example above and PMD could not notify application about that failure. As the result, flow rules map in HW will differ from what application expects. In addition, flow rules shadow copy used for port start restore consumes considerable amount of system memory, especially in systems with millions of flow rules. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-11-26 00:40:15 +01:00
Stephen Hemminger	db27370b57	eal: replace blacklist/whitelist options Replace -w / --pci-whitelist with -a / --allow options and --pci-blacklist with --block. The -b short option remains unchanged. Allow the old options for now, but print a nag warning since old options are deprecated. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Luca Boccassi <bluca@debian.org> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2020-11-16 00:11:22 +01:00
Cristian Dumitrescu	5adff62e5f	doc: describe the SWX pipeline type Add the new SWX pipeline type to the Programmer's Guide. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-11-13 13:55:07 +01:00
Ciara Power	cbd2f21ab7	doc: fix typo in KNI guide The typo "withe" should have been "with the". This is now fixed. Fixes: 89397a01ce4a ("kni: set default carrier state of interface") Cc: stable@dpdk.org Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2020-11-04 21:08:30 +01:00
Ajit Khaparde	6abd886826	doc: fix a typo in flow API guide flow_type_rss_offloads was misspelt as flow_tpe_rss_offloads Fixes: 6abee736abe6 ("doc: update RSS flow action with best effort") Cc: stable@dpdk.org Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2020-11-03 23:35:02 +01:00
Yi Yang	c0d002aed9	gso: fix mbuf freeing responsibility rte_gso_segment decreased the refcnt of pkt by one, which is wrong if pkt is an external mbuf: pkt is never freed because of the incorrect refcnt, and eventually the application cannot allocate mbufs because the mempool has run out of them. The correct approach is for the application to call rte_pktmbuf_free after calling rte_gso_segment to free pkt explicitly. rte_gso_segment must not handle it; this is the responsibility of the application. This commit changes the functional behavior and return value of rte_gso_segment, so the application must act according to the return value: "ret < 0" means it should free and drop 'pkt'; "ret == 0" means 'pkt' isn't GSOed but can be transmitted as a normal packet; "ret > 0" means 'pkt' has been GSOed into two or more segments, and the application should use "pkts_out" to transmit these segments. The application must free 'pkt' after calling rte_gso_segment when the return value isn't equal to 0. Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO") Cc: stable@dpdk.org Signed-off-by: Yi Yang <yangyi01@inspur.com> Acked-by: Jiayu Hu <jiayu.hu@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-11-03 22:45:02 +01:00
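The return-value contract described in commit c0d002aed9 can be summarised as a small decision helper. This sketch deliberately avoids calling into DPDK: `gso_classify()`, `gso_must_free_original()` and the enum are hypothetical names that only encode the three cases from the commit message; a real caller would pass in the value returned by rte_gso_segment and call rte_pktmbuf_free itself.

```c
/* Hypothetical names encoding the three cases from the commit message. */
enum gso_next_step {
    GSO_DROP_PKT,       /* ret < 0: free and drop the original packet  */
    GSO_SEND_ORIGINAL,  /* ret == 0: not segmented, send pkt as-is     */
    GSO_SEND_SEGMENTS   /* ret > 0: send the segments in pkts_out      */
};

/* Map the return value of the segmentation call to the next step. */
static enum gso_next_step gso_classify(int ret)
{
    if (ret < 0)
        return GSO_DROP_PKT;
    if (ret == 0)
        return GSO_SEND_ORIGINAL;
    return GSO_SEND_SEGMENTS;
}

/* The application frees the original pkt whenever ret is not 0. */
static int gso_must_free_original(int ret)
{
    return ret != 0;
}
```

Keeping the free decision separate from the transmit decision mirrors the commit's point that freeing 'pkt' is the responsibility of the application in both the error case and the segmented case.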
Bruce Richardson	8809f78c7d	doc: fix driver names Since the built driver filenames have changed in DPDK 20.11, we need to update the driver doc to match. Most drivers start their section with the driver filename highlighted in bold, while a number were missing the highlight. When updating the names, add the markers for bold text to any missing it, so as to have things more consistent. Fixes: a20b2c01a7a1 ("build: standardize component names and defines") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-11-03 16:23:03 +01:00
Thomas Monjalon	52bf2010c9	eventdev: remove software Rx timestamp This is a revert of the commit 569758758dcd ("eventdev: add Rx timestamp"). If the Rx timestamp is not configured on the ethdev port, there is no reason to set one. Also the accuracy of the timestamp was bad because it was set at a late stage. Anyway there is no trace of the usage of this timestamp. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-11-03 15:28:26 +01:00
Thomas Monjalon	614af75489	security: switch metadata to dynamic mbuf field The device-specific metadata was stored in the deprecated field udata64. It is moved to a dynamic mbuf field in order to allow removal of udata64. The name rte_security_dynfield is not very descriptive but it should be replaced later by separate fields for each type of data that drivers pass to the upper layer. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Haiyue Wang <haiyue.wang@intel.com>	2020-10-31 16:13:11 +01:00
Honnappa Nagarahalli	47bec9a5ca	ring: add zero copy API Add zero-copy APIs. These APIs provide the capability to copy the data to/from the ring memory directly, without having a temporary copy (for ex: an array of mbufs on the stack). Use cases that involve copying large amount of data to/from the ring can benefit from these APIs. Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-29 14:13:31 +01:00
Dharmik Thakkar	769b2de7fb	hash: implement RCU resources reclamation Currently, users have to use external RCU mechanisms to free resources when using lock free hash algorithm. Integrate RCU QSBR process to make it easier for the applications to use lock free algorithm. Refer to RCU documentation to understand various aspects of integrating RCU library into other libraries. Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2020-10-24 09:25:13 +02:00
Stephen Hemminger	cb056611a8	eal: rename lcore master and slave Replace master lcore with main lcore and replace slave lcore with worker lcore. Keep the old functions and macros but mark them as deprecated for this release. The "--master-lcore" command line option is also deprecated and any usage will print a warning and use "--main-lcore" as replacement. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2020-10-20 13:17:08 +02:00
Bruce Richardson	c0a775a141	doc: add SPDX license tag header to meson guide The build-sdk-meson.rst file originates from the short plain-text meson instructions added in 2018. Add SPDX tag and copyright notice based on the original commit. Fixes: 9c3adc289c5e ("doc: add instructions on build using meson") Cc: stable@dpdk.org Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-19 18:21:44 +02:00
Ciara Power	1e6a661302	acl: check max SIMD bitwidth When choosing a vector path to take, an extra condition must be satisfied to ensure the max SIMD bitwidth allows for the CPU enabled path. These checks are added in the check alg helper functions. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-19 16:45:02 +02:00
Ciara Power	580af30dd6	eal: control max SIMD bitwidth This patch adds a max SIMD bitwidth EAL configuration. The API allows for an app to set this value. It can also be set using EAL argument --force-max-simd-bitwidth, which will lock the value and override any modifications made by the app. Each arch has a define for the default SIMD bitwidth value, this is used on EAL init to set the config max SIMD bitwidth. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2020-10-19 16:45:02 +02:00
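The set-versus-force behavior described in the max SIMD bitwidth commit above can be modeled as follows. This is a sketch under stated assumptions, not the EAL code: `struct simd_cfg` and the `simd_cfg_*` functions are hypothetical, and returning `-EPERM` for a refused request is an assumption of this model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the configuration; not the EAL's internal layout. */
struct simd_cfg {
    uint16_t bitwidth;
    bool forced;  /* true once the command-line override has been applied */
};

/* Startup: take the per-architecture default value. */
static void simd_cfg_init(struct simd_cfg *cfg, uint16_t arch_default)
{
    cfg->bitwidth = arch_default;
    cfg->forced = false;
}

/* Command-line override: set the value and lock it. */
static void simd_cfg_force(struct simd_cfg *cfg, uint16_t bitwidth)
{
    cfg->bitwidth = bitwidth;
    cfg->forced = true;
}

/* Application request: accepted unless the value has been locked. */
static int simd_cfg_set(struct simd_cfg *cfg, uint16_t bitwidth)
{
    if (cfg->forced)
        return -EPERM;  /* assumption of this sketch */
    cfg->bitwidth = bitwidth;
    return 0;
}
```

The point of the lock is ordering independence: once the override is applied, a later request from the application cannot change the value.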
Akhil Goyal	e30b2833c4	security: update session create API The API ``rte_security_session_create`` takes only a single mempool for session and session private data. So the application needs to create a mempool for twice the number of sessions needed, which also wastes memory because session private data needs more memory than the session. Hence the API is modified to take two mempool pointers - one for session and one for private data. This is very similar to crypto based session create APIs. Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com> Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>	2020-10-19 09:54:54 +02:00
Eli Britstein	9ec0f97e02	ethdev: add tunnel offload model rte_flow API provides the building blocks for vendor-agnostic flow classification offloads. The rte_flow "patterns" and "actions" primitives are fine-grained, thus enabling DPDK applications the flexibility to offload network stacks and complex pipelines. Applications wishing to offload tunneled traffic are required to use the rte_flow primitives, such as group, meta, mark, tag, and others to model their high-level objects. The hardware model design for high-level software objects is not trivial. Furthermore, an optimal design is often vendor-specific. When hardware offloads tunneled traffic in multi-group logic, partially offloaded packets may arrive at the application after they were modified in hardware. In this case, the application may need to restore the original packet headers. Consider the following sequence: The application decaps a packet in one group and jumps to a second group where it tries to match on a 5-tuple, that will miss and send the packet to the application. In this case, the application does not receive the original packet but a modified one. Also, in this case, the application cannot match on the outer header fields, such as VXLAN vni and 5-tuple. There are several possible ways to use rte_flow "patterns" and "actions" to resolve the issues above. For example: 1 Mapping headers to hardware registers using the rte_flow_action_mark/rte_flow_action_tag/rte_flow_set_meta objects. 2 Apply the decap only at the last offload stage after all the "patterns" were matched and the packet will be fully offloaded. Every approach has its pros and cons and is highly dependent on the hardware vendor. For example, some hardware may have a limited number of registers while other hardware may not support inner actions and must decap before accessing inner headers. The tunnel offload model resolves these issues. 
The model goals are: 1 Provide a unified application API to offload tunneled traffic that is capable of matching on outer headers after decap. 2 Allow the application to restore the outer header of partially offloaded packets. The tunnel offload model does not introduce new elements to the existing RTE flow model and is implemented as a set of helper functions. For the application to work with the tunnel offload API it has to adjust flow rules in multi-table tunnel offload in the following way: 1 Remove explicit call to decap action and replace it with PMD actions obtained from rte_flow_tunnel_decap_and_set() helper. 2 Add PMD items obtained from rte_flow_tunnel_match() helper to all other rules in the tunnel offload sequence. VXLAN Code example: Assume application needs to do inner NAT on the VXLAN packet. The first rule in group 0: flow create <port id> ingress group 0 pattern eth / ipv4 / udp dst is 4789 / vxlan / end actions {pmd actions} / jump group 3 / end The first VXLAN packet that arrives matches the rule in group 0 and jumps to group 3. In group 3 the packet will miss since there is no flow to match and will be sent to the application. Application will call rte_flow_get_restore_info() to get the packet outer header. Application will insert a new rule in group 3 to match outer and inner headers: flow create <port id> ingress group 3 pattern {pmd items} / eth / ipv4 dst is 172.10.10.1 / udp dst 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 / end actions set_ipv4_dst 186.1.1.1 / queue index 3 / end The result of these rules is that a VXLAN packet with vni=10, outer IPv4 dst=172.10.10.1 and inner IPv4 dst=184.1.2.3 will be received decapped on queue 3 with IPv4 dst=186.1.1.1 Note: The packet in group 3 is considered decapped. All actions in that group will be done on the header that was inner before decap. The application may specify an outer header to be matched on. It is the PMD's responsibility to translate these items to outer metadata. API usage: /** * 1. 
Initiate RTE flow tunnel object */ const struct rte_flow_tunnel tunnel = { .type = RTE_FLOW_ITEM_TYPE_VXLAN, .tun_id = 10, } /* * 2. Obtain PMD tunnel actions * * pmd_actions is an intermediate variable application uses to * compile actions array */ struct rte_flow_action pmd_actions; rte_flow_tunnel_decap_and_set(&tunnel, &pmd_actions, &num_pmd_actions, &error); /* * 3. offload the first rule * matching on VXLAN traffic and jumps to group 3 * (implicitly decaps packet) */ app_actions = jump group 3 rule_items = app_items; /* eth / ipv4 / udp / vxlan */ rule_actions = { pmd_actions, app_actions }; attr.group = 0; flow_1 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error); /* * 4. after flow creation application does not need to keep the * tunnel action resources. */ rte_flow_tunnel_action_release(port_id, pmd_actions, num_pmd_actions); /* * 5. After partially offloaded packet miss because there was no * matching rule handle miss on group 3 */ struct rte_flow_restore_info info; rte_flow_get_restore_info(port_id, mbuf, &info, &error); /* * 6. Offload NAT rule: */ app_items = { eth / ipv4 dst is 172.10.10.1 / udp dst 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 } app_actions = { set_ipv4_dst 186.1.1.1 / queue index 3 } rte_flow_tunnel_match(&info.tunnel, &pmd_items, &num_pmd_items, &error); rule_items = {pmd_items, app_items}; rule_actions = app_actions; attr.group = info.group_id; flow_2 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error); /* * 7. Release PMD items after rule creation */ rte_flow_tunnel_item_release(port_id, pmd_items, num_pmd_items); References 1. https://mails.dpdk.org/archives/dev/2020-June/index.html Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
Gregory Etelson	5d1bff8fe2	ethdev: allow negative values in flow rule types RTE flow items & actions use positive values in item & action type. Negative values are reserved for PMD private types. PMD items & actions usually are not exposed to the application and are not used to create RTE flows. The patch allows applications with access to PMD flow items & actions to integrate RTE and PMD items & actions and use them to create flow rules. The RTE flow item or action conversion library accepts only positive known element types with predefined sizes. Private PMD items and actions do not fit into this scheme because PMD type values are negative, each PMD has its own type numbering, and element types and their sizes are not visible at RTE level. To resolve these limitations the patch proposes this solution: 1. PMD can expose elements of pointer size only. RTE flow conversion functions will use pointer size for each configuration object in private PMD element it processes; 2. RTE flow verification will not reject elements with negative type. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
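The sizing rule from the negative-types commit above (known types use a predefined size, private negative types are treated as pointer-sized, and negative types are not rejected) can be sketched as below. Everything here is hypothetical: `demo_known_conf_size`, its values, and `demo_conf_size` are illustration-only names, not the rte_flow conversion code.

```c
#include <stddef.h>

/* Hypothetical size table for known (non-negative) element types. */
static const size_t demo_known_conf_size[] = { 0, 12, 20 };
#define DEMO_KNOWN_TYPES \
    (sizeof(demo_known_conf_size) / sizeof(demo_known_conf_size[0]))

/*
 * Look up the configuration object size for an element type.
 * Returns 0 and stores the size on success, -1 for an unknown known-range type.
 */
static int demo_conf_size(int type, size_t *size)
{
    if (type < 0) {
        /* Private PMD type: not rejected, always pointer-sized. */
        *size = sizeof(void *);
        return 0;
    }
    if ((size_t)type >= DEMO_KNOWN_TYPES)
        return -1;
    *size = demo_known_conf_size[type];
    return 0;
}
```

Treating every private element as pointer-sized lets generic code copy it without knowing each PMD's own type numbering.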
Dekel Peled	09315fc838	ethdev: add VLAN attributes to ethernet and VLAN items This patch implements the change proposed in RFC [1], adding dedicated fields to ETH and VLAN items structs, to clearly define the required characteristics of a packet, and enable precise match criteria. Documentation is updated accordingly. [1] https://mails.dpdk.org/archives/dev/2020-August/177536.html Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2020-10-16 19:48:19 +02:00
Bing Zhao	5d9f23fb8f	ethdev: add new attributes to hairpin config To support two ports hairpin mode and keep the backward compatibility for the application, two new attribute members of the hairpin queue configuration structure will be added. `tx_explicit` indicates whether the application itself will insert the Tx part flow rules. If not set, the PMD will insert the rules implicitly. `manual_bind` indicates whether the application will bind the hairpin Tx queue and peer Rx queue manually. If not set, they will be bound automatically during the device start stage. Different Tx and Rx queue pairs could have different values, but it is highly recommended that all paired queues between one egress and its peer ingress ports have the same values, in order not to bring any chaos to the system. The actual support of these attribute parameters will be checked and decided by the PMD drivers. In the single port hairpin, if both are zero without any setting, the behavior will remain the same as before. It means that no bind API needs to be called and no Tx flow rules need to be inserted manually by the application. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 19:48:19 +02:00

540 Commits