numam-dpdk

Author	SHA1	Message	Date
Ivan Ilchenko	62024eb827	ethdev: change stop operation callback to return int Change eth_dev_stop_t return value from void to int. Make eth_dev_stop_t implementations across all drivers return negative errno values in case of error conditions. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 22:26:41 +02:00
Ivan Ilchenko	58af59172b	ethdev: allow stop function to return an error Change rte_eth_dev_stop() return value from void to int and return negative errno values in case of error conditions. Also update the usage of the function in ethdev according to the new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 22:26:41 +02:00
Thomas Monjalon	8a5a0aad5d	ethdev: allow close function to return an error The API function rte_eth_dev_close() was returning void. The return type is changed to int for notifying of errors. If an error happens during a close operation, the status of the port is undefined, a maximum of resources having been freed. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Liron Himi <lironh@marvell.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 22:26:41 +02:00
Thomas Monjalon	0607dadf98	ethdev: reset all when releasing a port The function rte_eth_dev_release_port() is partially resetting the struct rte_eth_dev. The drivers were completing this reset with more pointers set to NULL in the close or remove operations. More pointers are reset at ethdev level, and some redundant assignments are removed from PMDs. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Jeff Guo <jia.guo@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-10-16 22:26:41 +02:00
Thomas Monjalon	b8f5d2ae75	ethdev: remove forcing stopped state upon close When closing a port, it is supposed to be already stopped, and marked as such with "dev_started" state zeroed by the stop API. Resetting "dev_started" before calling the driver close operation was hiding the case of not properly stopped port being closed. The flag "dev_started" is not changed anymore in "rte_eth_dev_close()". In case the "dev_stop" function is called from "dev_close", bypassing "rte_eth_dev_stop()" API, the "dev_started" state must be explicitly reset in the PMD in order to keep the same behaviour. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2020-10-16 22:26:41 +02:00
Viacheslav Ovsiienko	4ff702b5df	ethdev: introduce Rx buffer split The DPDK datapath in the transmit direction is very flexible. An application can build a multi-segment packet and manage almost all data aspects - the memory pools where segments are allocated from, the segment lengths, the memory attributes like external buffers, registered for DMA, etc. In the receiving direction, the datapath is much less flexible: an application can only specify the memory pool to configure the receiving queue and nothing more. In order to extend receiving datapath capabilities it is proposed to add a way to provide extended information on how to split the packets being received. The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device capabilities is introduced so that a PMD can report to the application that it supports splitting Rx packets into configurable segments. Prior to invoking the rte_eth_rx_queue_setup() routine, the application should check the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag. The following structure is introduced to specify the Rx packet segment for the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload: struct rte_eth_rxseg_split { struct rte_mempool *mp; /* memory pools to allocate segment from */ uint16_t length; /* segment maximal data length, configures "split point" */ uint16_t offset; /* data offset from beginning of mbuf data buffer */ uint32_t reserved; /* reserved field */ }; The segment descriptions are added to the rte_eth_rxconf structure: rx_seg - pointer to the array of segment descriptions, each element describes the memory pool, maximal data length, initial data offset from the beginning of data buffer in mbuf. This array allows specifying different settings for each segment individually. rx_nseg - number of elements in the array. If the extended segment descriptions are provided with these new fields, the mp parameter of rte_eth_rx_queue_setup() must be specified as NULL to avoid ambiguity.
There are two options to specify Rx buffer configuration: - mp is not NULL, rx_conf.rx_nseg is zero: this is the compatible configuration, which follows the existing implementation and provides a single pool with no description of segment sizes and offsets. - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not zero: this provides the extended configuration, individually for each segment. If the Rx queue is configured with the new settings, the packets being received will be split into multiple segments pushed to the mbufs with specified attributes. The PMD will split the received packets into multiple segments according to the specification in the description array. For example, let's suppose we configured the Rx queue with the following segments: seg0 - pool0, len0=14B, off0=2B seg1 - pool1, len1=20B, off1=128B seg2 - pool2, len2=20B, off2=0B seg3 - pool3, len3=512B, off3=0B The packet 46 bytes long will look like the following: seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0 seg1 - 20B long @ 128 in mbuf from pool1 seg2 - 12B long @ 0 in mbuf from pool2 The packet 1500 bytes long will look like the following: seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0 seg1 - 20B @ 128 in mbuf from pool1 seg2 - 20B @ 0 in mbuf from pool2 seg3 - 512B @ 0 in mbuf from pool3 seg4 - 512B @ 0 in mbuf from pool3 seg5 - 422B @ 0 in mbuf from pool3 The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and configured to support the new buffer split feature (if rx_nseg is greater than one). The split limitations imposed by the underlying PMD are reported in the newly introduced rte_eth_dev_info->rx_seg_capa field. The new approach would allow splitting the ingress packets into multiple parts pushed to memory with different attributes. For example, the packet headers can be pushed to the embedded data buffers within mbufs and the application data into the external buffers attached to mbufs allocated from different memory pools.
The memory attributes for the split parts may differ as well - for example the application data may be pushed into external memory located on a dedicated physical device, say GPU or NVMe. This would improve the flexibility of the DPDK receiving datapath while preserving compatibility with the existing API. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 22:26:40 +02:00
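The two worked examples in the buffer split message above (46-byte and 1500-byte packets) can be reproduced with a small stand-alone sketch. It does not use DPDK: `split_packet()` is a hypothetical helper that models only the segment lengths, and it assumes, as the 1500-byte example suggests, that the last description is reused once the array is exhausted. Offsets and memory pools are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of Rx buffer split (not DPDK code).
 * seg_len[i] is the maximal data length of segment description i.
 * Segments are filled in order; after the last description is reached
 * it is reused for the remainder of the packet (assumption taken from
 * the 1500-byte example in the commit message).
 * Returns the number of segment lengths written to out_len.
 */
static size_t
split_packet(const uint16_t *seg_len, size_t nseg, uint32_t pkt_len,
             uint32_t *out_len, size_t max_out)
{
    size_t n = 0;

    assert(seg_len != NULL && out_len != NULL && nseg > 0);
    while (pkt_len > 0 && n < max_out) {
        uint16_t limit = seg_len[n < nseg ? n : nseg - 1];
        uint32_t chunk;

        assert(limit > 0); /* a zero-length description would never drain */
        chunk = pkt_len < limit ? pkt_len : limit;
        out_len[n++] = chunk;
        pkt_len -= chunk;
    }
    return n;
}
```

With the descriptions `{14, 20, 20, 512}`, a 46-byte packet yields three segments (14, 20, 12) and a 1500-byte packet yields six (14, 20, 20, 512, 512, 422), matching the layouts listed in the message.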
Eli Britstein	9ec0f97e02	ethdev: add tunnel offload model rte_flow API provides the building blocks for vendor-agnostic flow classification offloads. The rte_flow "patterns" and "actions" primitives are fine-grained, thus enabling DPDK applications the flexibility to offload network stacks and complex pipelines. Applications wishing to offload tunneled traffic are required to use the rte_flow primitives, such as group, meta, mark, tag, and others to model their high-level objects. The hardware model design for high-level software objects is not trivial. Furthermore, an optimal design is often vendor-specific. When hardware offloads tunneled traffic in multi-group logic, partially offloaded packets may arrive at the application after they were modified in hardware. In this case, the application may need to restore the original packet headers. Consider the following sequence: The application decaps a packet in one group and jumps to a second group where it tries to match on a 5-tuple, which will miss and send the packet to the application. In this case, the application does not receive the original packet but a modified one. Also, in this case, the application cannot match on the outer header fields, such as VXLAN vni and 5-tuple. There are several possible ways to use rte_flow "patterns" and "actions" to resolve the issues above. For example: 1 Mapping headers to hardware registers using the rte_flow_action_mark/rte_flow_action_tag/rte_flow_set_meta objects. 2 Apply the decap only at the last offload stage after all the "patterns" were matched and the packet will be fully offloaded. Every approach has its pros and cons and is highly dependent on the hardware vendor. For example, some hardware may have a limited number of registers while other hardware could not support inner actions and must decap before accessing inner headers. The tunnel offload model resolves these issues.
The model goals are: 1 Provide a unified application API to offload tunneled traffic that is capable of matching on outer headers after decap. 2 Allow the application to restore the outer header of partially offloaded packets. The tunnel offload model does not introduce new elements to the existing RTE flow model and is implemented as a set of helper functions. For the application to work with the tunnel offload API it has to adjust flow rules in multi-table tunnel offload in the following way: 1 Remove explicit call to decap action and replace it with PMD actions obtained from rte_flow_tunnel_decap_and_set() helper. 2 Add PMD items obtained from rte_flow_tunnel_match() helper to all other rules in the tunnel offload sequence. VXLAN Code example: Assume application needs to do inner NAT on the VXLAN packet. The first rule in group 0: flow create <port id> ingress group 0 pattern eth / ipv4 / udp dst is 4789 / vxlan / end actions {pmd actions} / jump group 3 / end The first VXLAN packet that arrives matches the rule in group 0 and jumps to group 3. In group 3 the packet will miss since there is no flow to match and will be sent to the application. Application will call rte_flow_get_restore_info() to get the packet outer header. Application will insert a new rule in group 3 to match outer and inner headers: flow create <port id> ingress group 3 pattern {pmd items} / eth / ipv4 dst is 172.10.10.1 / udp dst 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 / end actions set_ipv4_dst 186.1.1.1 / queue index 3 / end The result of these rules is that a VXLAN packet with vni=10, outer IPv4 dst=172.10.10.1 and inner IPv4 dst=184.1.2.3 will be received decapped on queue 3 with IPv4 dst=186.1.1.1 Note: The packet in group 3 is considered decapped. All actions in that group will be done on the header that was inner before decap. The application may specify an outer header to be matched on. It is the PMD's responsibility to translate these items to outer metadata. API usage: /** * 1. 
Initiate RTE flow tunnel object */ const struct rte_flow_tunnel tunnel = { .type = RTE_FLOW_ITEM_TYPE_VXLAN, .tun_id = 10, } /** * 2. Obtain PMD tunnel actions * * pmd_actions is an intermediate variable application uses to * compile actions array */ struct rte_flow_action pmd_actions; rte_flow_tunnel_decap_and_set(&tunnel, &pmd_actions, &num_pmd_actions, &error); /** * 3. offload the first rule * matching on VXLAN traffic and jumps to group 3 * (implicitly decaps packet) */ app_actions = jump group 3 rule_items = app_items; /* eth / ipv4 / udp / vxlan */ rule_actions = { pmd_actions, app_actions }; attr.group = 0; flow_1 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error); /** * 4. after flow creation application does not need to keep the * tunnel action resources. */ rte_flow_tunnel_action_release(port_id, pmd_actions, num_pmd_actions); /** * 5. After partially offloaded packet miss because there was no * matching rule handle miss on group 3 */ struct rte_flow_restore_info info; rte_flow_get_restore_info(port_id, mbuf, &info, &error); /** * 6. Offload NAT rule: */ app_items = { eth / ipv4 dst is 172.10.10.1 / udp dst 4789 / vxlan vni is 10 / ipv4 dst is 184.1.2.3 } app_actions = { set_ipv4_dst 186.1.1.1 / queue index 3 } rte_flow_tunnel_match(&info.tunnel, &pmd_items, &num_pmd_items, &error); rule_items = {pmd_items, app_items}; rule_actions = app_actions; attr.group = info.group_id; flow_2 = rte_flow_create(port_id, &attr, rule_items, rule_actions, &error); /** * 7. Release PMD items after rule creation */ rte_flow_tunnel_item_release(port_id, pmd_items, num_pmd_items); References 1. https://mails.dpdk.org/archives/dev/2020-June/index.html Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
Gregory Etelson	5d1bff8fe2	ethdev: allow negative values in flow rule types RTE flow items & actions use positive values in item & action type. Negative values are reserved for PMD private types. PMD items & actions usually are not exposed to application and are not used to create RTE flows. The patch gives applications with access to PMD flow items & actions the ability to integrate RTE and PMD items & actions and use them to create flow rules. The RTE flow item or action conversion library accepts only positive, known element types with predefined sizes. Private PMD items and actions do not fit into this scheme because PMD type values are negative, each PMD has its own type numbering, and element types and their sizes are not visible at RTE level. To resolve these limitations the patch proposes this solution: 1. PMD can expose elements of pointer size only. RTE flow conversion functions will use pointer size for each configuration object in private PMD element it processes; 2. RTE flow verification will not reject elements with negative type. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-16 19:48:19 +02:00
Dekel Peled	09315fc838	ethdev: add VLAN attributes to ethernet and VLAN items This patch implements the change proposed in RFC [1], adding dedicated fields to ETH and VLAN items structs, to clearly define the required characteristic of a packet, and enable precise match criteria. Documentation is updated accordingly. [1] https://mails.dpdk.org/archives/dev/2020-August/177536.html Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2020-10-16 19:48:19 +02:00
Bing Zhao	bc6e15de08	ethdev: add hairpin queue operations Every hairpin queue pair should be configured properly and the connection between Tx and Rx queues should be established, before hairpin function works. In single port hairpin mode, the queues of each pair belong to the same device. It is easy to get the hardware and software information of each queue and configure the hairpin connection with such information. In two ports hairpin mode, it is not easy or appropriate to access one queue's information from another device. Since hairpin is configured per queue pair, three new APIs are introduced; they are internal, for PMD use only. The peer update API helps to pass one queue's information to the peer queue and get the peer's information back for the next step. The peer bind API configures the current queue with the peer's information. For each hairpin queue pair, this API may need to be called twice to configure the Tx, Rx queues separately. The peer unbind API resets the current queue configuration and state to disconnect it from the peer queue. Also, it may need to be called twice to disconnect Tx, Rx queues from each other. Some parameters of the above APIs might not be mandatory, depending on the PMD implementation. The structure of `rte_hairpin_peer_info` is only a declaration and the actual members will be defined in each PMD when being used. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2020-10-16 19:48:19 +02:00
Bing Zhao	9a9ba10ada	ethdev: add function to get hairpin peer ports list After hairpin queues are configured, in general, the application will maintain the ports topology and even the queues configuration for the hairpin. But sometimes it will not. If there is no hot-plug, it is easy to bind and unbind hairpin among all the ports. The application can just connect or disconnect the hairpin egress ports to/from all the probed ingress ports. Then all the connections could be handled properly. But with hot-plug / hot-unplug, one port could be probed and removed dynamically. With two ports hairpin, all the connections from and to this port should be handled after start(bind) or before stop(unbind). It is necessary to know the hairpin topology with this port. This function will return the ports list with the actual peer ports number after configuration. Either the peer Rx or the peer Tx ports can be retrieved with this function call. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2020-10-16 19:48:19 +02:00
Bing Zhao	5d9f23fb8f	ethdev: add new attributes to hairpin config To support two ports hairpin mode and keep the backward compatibility for the application, two new attribute members of the hairpin queue configuration structure will be added. `tx_explicit` means if the application itself will insert the Tx part flow rules. If not set, PMD will insert the rules implicitly. `manual_bind` means if the hairpin Tx queue and peer Rx queue will be bound automatically during the device start stage. Different Tx and Rx queue pairs could have different values, but it is highly recommended that all paired queues between one egress and its peer ingress ports have the same values, in order not to bring any chaos to the system. The actual support of these attribute parameters will be checked and decided by the PMD drivers. In the single port hairpin, if both are zero without any setting, the behavior will remain the same as before. It means that no bind API needs to be called and no Tx flow rules need to be inserted manually by the application. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 19:48:19 +02:00
Bing Zhao	a9916fdfb8	ethdev: add hairpin bind and unbind API In single port hairpin mode, all the hairpin Tx and Rx queues belong to the same device. After the queues are set up properly, there is no other dependency between the Tx queue and its Rx peer queue. The binding process that connects the Tx and Rx queues together at hardware level will be done automatically during the device start procedure. Everything required is configured and initialized already for the binding process. But in two ports hairpin mode, there will be some cross-dependencies between two different ports. Usually, the ports will be initialized serially by the main thread and not in parallel. The earlier port will not be able to enable the bind if the following peer port is not yet configured with HW resources. What's more, if one port is detached / attached dynamically, it would introduce more trouble for the hairpin binding. To overcome these, new APIs for binding and unbinding are added. During startup, only the hairpin Tx and Rx peer queues will be set up. Nothing will be done when starting the device if the queues are without auto-bind attribute. Only after the required port pair has started can the `rte_eth_hairpin_bind()` API be called to bind all the Tx queues of the egress port to the Rx queues of the peer port. Then the connection between the egress and ingress port pair will be established. The `rte_eth_hairpin_unbind()` API could be used to disconnect the egress and the peer ingress ports. This should only be called before the device is closed if needed. When doing the clean up, all the egress and ingress pairs related to a single port should be taken into consideration, especially in the hot unplug case. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 19:48:19 +02:00
Patrick Fu	eb666d2408	vhost: fix async unregister deadlock When async unregister function is invoked in certain vhost event callbacks (e.g. vring state change), deadlock may occur due to recursive spinlock acquire. This patch uses trylock() primitive in the unregister API to avoid deadlock. Fixes: `78639d5456` ("vhost: introduce async enqueue registration API") Cc: stable@dpdk.org Signed-off-by: Patrick Fu <patrick.fu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:48:19 +02:00
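The trylock approach described in the async unregister fix above can be illustrated without vhost. The sketch below is not the vhost code: `struct vq_sketch`, `vq_trylock()`, `vq_unlock()` and `async_unregister()` are hypothetical names, and a C11 `atomic_flag` stands in for the spinlock. The point is only that an unregister path which tries the lock fails fast when the lock is already held (for example by the event callback that invoked it) instead of spinning forever.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a virtqueue protected by a spinlock. */
struct vq_sketch {
    atomic_flag lock;      /* set while the lock is held */
    bool async_registered; /* state guarded by the lock */
};

/* Try to take the lock once; true on success, false if already held. */
static bool
vq_trylock(struct vq_sketch *vq)
{
    return !atomic_flag_test_and_set_explicit(&vq->lock,
                                              memory_order_acquire);
}

static void
vq_unlock(struct vq_sketch *vq)
{
    atomic_flag_clear_explicit(&vq->lock, memory_order_release);
}

/*
 * Unregister without blocking: returns 0 on success and -1 if the lock
 * is busy, so a caller that already holds the lock gets an error back
 * rather than deadlocking on a recursive acquire.
 */
static int
async_unregister(struct vq_sketch *vq)
{
    assert(vq != NULL);
    if (!vq_trylock(vq))
        return -1;
    vq->async_registered = false;
    vq_unlock(vq);
    return 0;
}
```

The trade-off of this design is that the caller has to handle the busy case, typically by retrying the unregister outside the callback.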
Patrick Fu	3e1e9c2464	vhost: fix async vector buffer overrun Add check on the async vector buffer usage to prevent the buf overrun. If the unused vector buffer is not sufficient to prepare for next packet's iov creation, an async transfer will be triggered immediately to free the vector buffer. Fixes: `78639d5456` ("vhost: introduce async enqueue registration API") Cc: stable@dpdk.org Signed-off-by: Patrick Fu <patrick.fu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:48:19 +02:00
Patrick Fu	9287d3a5a7	vhost: allocate async memory dynamically Allocate async internal memory buffer by rte_malloc(), replacing array declaration inside vq structure. Dynamic allocation can help to save memory footprint when async path is not registered. Signed-off-by: Patrick Fu <patrick.fu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:48:19 +02:00
Patrick Fu	6b3c81db8b	vhost: simplify async copy completion Current async ops allow the check_completed_copies() callback to return an arbitrary number of async iov segments finished from backend async devices. This design creates complexity for vhost to handle breaking transfer of a single packet (i.e. transfer completes in the middle of an async descriptor) and prevents application callbacks from leveraging hardware capability to offload the work. Thus, this patch enforces the check_completed_copies() callback to return the number of async memory descriptors, which is aligned with async transfer data ops callbacks. The vhost async data path is revised to work with the new ops definition, which provides cleaner and simpler processing. Signed-off-by: Patrick Fu <patrick.fu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:48:19 +02:00
Dekel Peled	69cb50d9be	ethdev: add IPv6 fragment extension header item Applications handling fragmented IPv6 packets need to match on IPv6 fragment extension header, in order to identify the fragments order and location in the packet. This patch introduces the IPv6 fragment extension header item, proposed in [1]. Relevant definitions are moved from lib/librte_ip_frag/rte_ip_frag.h to lib/librte_net/rte_ip.h, as they are needed for IPv6 header handling. struct ipv6_extension_fragment renamed to rte_ipv6_fragment_ext to adapt it to the common naming convention. Default mask is not defined, since all fields are optional. [1] http://mails.dpdk.org/archives/dev/2020-March/160255.html Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 19:48:18 +02:00
Dekel Peled	ad976bd40d	ethdev: add extensions attributes to IPv6 item Using the current implementation of DPDK, an application cannot match on IPv6 packets, based on the existing extension headers, in a simple way. Field 'Next Header' in IPv6 header indicates type of the first extension header only. Following extension headers can't be identified by inspecting the IPv6 header. As a result, the existence or absence of specific extension headers can't be used for packet matching. For example, fragmented IPv6 packets contain a dedicated extension header (which is implemented in a later patch of this series). Non-fragmented packets don't contain the fragment extension header. For an application to match on non-fragmented IPv6 packets, the current implementation doesn't provide a suitable solution. Matching on the Next Header field is not sufficient, since additional extension headers might be present in the same packet. To match on fragmented IPv6 packets, the same difficulty exists. This patch implements the update as detailed in RFC [1]. A set of additional values will be added to IPv6 header struct. These values will indicate the existence of every defined extension header type, providing simple means for identification of existing extensions in the packet header. Continuing the above example, fragmented packets can be identified using the specific value indicating existence of fragment extension header. To match on non-fragmented IPv6 packets, need to use has_frag_ext 0. To match on fragmented IPv6 packets, need to use has_frag_ext 1. To match on any IPv6 packets, the has_frag_ext field should not be specified for match. [1] https://mails.dpdk.org/archives/dev/2020-August/177257.html Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-16 19:48:18 +02:00
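To see why the Next Header field of the fixed IPv6 header is not enough, here is a minimal stand-alone sketch (not DPDK code) that walks the extension header chain in software to find a fragment header. The function name `ipv6_has_frag_ext()` is hypothetical and merely echoes the `has_frag_ext` attribute. It handles only hop-by-hop, routing and destination options headers and stops at anything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV6_HDR_LEN 40 /* fixed IPv6 header size in bytes */
#define NH_HOPOPTS    0 /* hop-by-hop options */
#define NH_ROUTING   43 /* routing header */
#define NH_FRAGMENT  44 /* fragment header */
#define NH_DSTOPTS   60 /* destination options */

/*
 * Return true if the packet (starting at the IPv6 header) carries a
 * fragment extension header anywhere in its extension header chain.
 */
static bool
ipv6_has_frag_ext(const uint8_t *pkt, size_t len)
{
    size_t off = IPV6_HDR_LEN;
    uint8_t next;

    assert(pkt != NULL);
    if (len < IPV6_HDR_LEN)
        return false;
    next = pkt[6]; /* Next Header field of the fixed header */

    for (;;) {
        size_t ext_len;

        switch (next) {
        case NH_FRAGMENT:
            /* the fragment header itself is 8 bytes long */
            return len - off >= 8;
        case NH_HOPOPTS:
        case NH_ROUTING:
        case NH_DSTOPTS:
            if (len - off < 8)
                return false; /* truncated extension header */
            next = pkt[off];
            /* length is in 8-byte units, not counting the first 8 */
            ext_len = ((size_t)pkt[off + 1] + 1) * 8;
            if (ext_len > len - off)
                return false;
            off += ext_len;
            break;
        default:
            return false; /* upper-layer or unhandled header */
        }
    }
}
```

A packet whose fixed header announces hop-by-hop options (0) still counts as fragmented if the options header in turn points at a fragment header (44), which is the case a match on Next Header alone would miss.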
Honnappa Nagarahalli	2b69bd1179	ethdev: fix memory ordering for callback functions Call back functions are registered on the control plane. They are accessed from the data plane. Hence, correct memory orderings should be used to avoid race conditions. Fixes: `4dc294158c` ("ethdev: support optional Rx and Tx callbacks") Fixes: `c8231c63dd` ("ethdev: insert Rx callback as head of list") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-16 19:48:18 +02:00
Phil Yang	8dd4b2afc7	ethdev: replace full barrier with relaxed barrier While registering the call back functions full write barrier can be replaced with one-way write barrier. Signed-off-by: Phil Yang <phil.yang@arm.com> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-16 19:48:18 +02:00
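The ordering requirement behind the two callback commits above can be sketched with C11 atomics. This is an illustration under assumptions, not the ethdev implementation: `struct cb_node`, `cb_register()` and `cb_run()` are hypothetical, registration is assumed to be serialized on the control plane, and `<stdatomic.h>` stands in for whatever atomic primitives the library uses. The writer fills in the node and then publishes it with a release store; the reader loads the list head with acquire, so it never sees a partially initialized callback.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical callback list node. */
struct cb_node {
    int (*fn)(int);       /* callback applied to a value */
    struct cb_node *next; /* written once, before publication */
};

static _Atomic(struct cb_node *) cb_head; /* NULL when the list is empty */

/* Control plane: insert at the head (single writer assumed). */
static void
cb_register(struct cb_node *node, int (*fn)(int))
{
    assert(node != NULL && fn != NULL);
    node->fn = fn;
    node->next = atomic_load_explicit(&cb_head, memory_order_relaxed);
    /* Release: the stores to fn and next happen before publication. */
    atomic_store_explicit(&cb_head, node, memory_order_release);
}

/* Data plane: run every registered callback in list order. */
static int
cb_run(int value)
{
    /* Acquire pairs with the release store in cb_register(). */
    struct cb_node *cb = atomic_load_explicit(&cb_head,
                                              memory_order_acquire);

    for (; cb != NULL; cb = cb->next)
        value = cb->fn(value);
    return value;
}

/* Two example callbacks used to exercise the list. */
static int add_one(int v) { return v + 1; }
static int times_two(int v) { return v * 2; }
```

A release store orders only the writes that precede it, which is all publication needs; that is the sense in which a full write barrier can be relaxed to a one-way barrier.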
Andrey Vesnovaty	4d9fd85fb5	ethdev: add shared actions to flow API Introduce extension of flow action API enabling sharing of single rte_flow_action in multiple flows. The API is intended for PMDs where multiple HW offloaded flows can reuse the same HW essence/object representing flow action and modification of such an essence/object affects all the rules using it. Motivation and example === Adding or removing one or more queues to RSS used by multiple flow rules imposes per rule toll for current DPDK flow API; the scenario requires for each flow sharing cloned RSS action: - call `rte_flow_destroy()` - call `rte_flow_create()` with modified RSS action API for sharing action and its in-place update benefits: - reduce the overhead of multiple RSS flow rules reconfiguration - optimize resource utilization by sharing action across multiple flows Change description === Shared action === In order to represent flow action shared by multiple flows new action type RTE_FLOW_ACTION_TYPE_SHARED is introduced (see `enum rte_flow_action_type`). Actually the introduced API decouples action from any specific flow and enables sharing of single action by its handle across multiple flows. Shared action create/use/destroy === A shared action may be used by some flow rules or by none at any given moment, i.e. a shared action resides outside of the context of any flow. A shared action represents HW resources/objects used for action offloading implementation. API for shared action create (see `rte_flow_shared_action_create()`): - should allocate HW resources and make related initializations required for shared action implementation. - make necessary preparations to maintain shared access to the action resources, configuration and state. API for shared action destroy (see `rte_flow_shared_action_destroy()`) should release HW resources and make related cleanups required for shared action implementation.
In order to share a flow action, reuse the handle of type `struct rte_flow_shared_action` returned by rte_flow_shared_action_create() as the `conf` field of `struct rte_flow_action` (see "example" section). If a shared action is not used by any flow rule, all resources allocated by the shared action can be released by rte_flow_shared_action_destroy() (see "example" section). The shared action handle passed as argument to the destroy API should not be used any further, i.e. the result of such usage is undefined. Shared action re-configuration === Shared action behavior defined by its configuration can be updated via rte_flow_shared_action_update() (see "example" section). The shared action update operation modifies HW related resources/objects allocated on the action creation. The number of operations performed by the update operation should not depend on the number of flows sharing the related action. On return of the shared action update API, action behavior should be according to the updated configuration for all flows sharing the action. Shared action query === Provide separate API to query shared action state (see rte_flow_shared_action_query()). Taking a counter as an example: query returns value aggregating all counter increments across all flow rules sharing the counter. This API doesn't query shared action configuration since it is controlled by the rte_flow_shared_action_create() and rte_flow_shared_action_update() APIs and is not supposed to change by other means.
example === struct rte_flow_action actions[2]; struct rte_flow_shared_action_conf conf; struct rte_flow_action action; /* skipped: initialize conf and action */ struct rte_flow_shared_action *handle = rte_flow_shared_action_create(port_id, &conf, &action, &error); actions[0].type = RTE_FLOW_ACTION_TYPE_SHARED; actions[0].conf = handle; actions[1].type = RTE_FLOW_ACTION_TYPE_END; /* skipped: init attr0 & pattern0 args */ struct rte_flow *flow0 = rte_flow_create(port_id, &attr0, pattern0, actions, error); /* create more rules reusing shared action */ struct rte_flow *flow1 = rte_flow_create(port_id, &attr1, pattern1, actions, error); /* skipped: for flows 2 till N */ struct rte_flow *flowN = rte_flow_create(port_id, &attrN, patternN, actions, error); /* update shared action */ struct rte_flow_action updated_action; /* * skipped: initialize updated_action according to desired action * configuration change */ rte_flow_shared_action_update(port_id, handle, &updated_action, error); /* * from now on all flows 1 till N will act according to configuration of * updated_action */ /* skipped: destroy all flows 1 till N */ rte_flow_shared_action_destroy(port_id, handle, error); Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2020-10-16 19:48:18 +02:00
Michael Pfeiffer	9863627f52	net: add function to calculate IPv4 header length Add a function to calculate the length of an IPv4 header as suggested on the mailing list [1]. Call where appropriate. [1] https://mails.dpdk.org/archives/dev/2020-October/184471.html Suggested-by: Thomas Monjalon <thomas@monjalon.net> Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:48:17 +02:00
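For reference, the IPv4 header length calculation comes down to the IHL field: the low four bits of the first header byte give the length in 32-bit words. The sketch below is a stand-alone illustration. `ipv4_hdr_len()` and the two macros are hypothetical names chosen for this example, not necessarily those added by the patch.

```c
#include <assert.h> /* for callers that want to check results */
#include <stdint.h>

#define IPV4_IHL_MASK       0x0f /* low nibble of the version/IHL byte */
#define IPV4_IHL_MULTIPLIER 4    /* IHL counts 32-bit words */

/*
 * Return the IPv4 header length in bytes, given the first byte of the
 * header (version in the high nibble, IHL in the low nibble).
 */
static inline uint8_t
ipv4_hdr_len(uint8_t version_ihl)
{
    return (uint8_t)((version_ihl & IPV4_IHL_MASK) * IPV4_IHL_MULTIPLIER);
}
```

A header without options has a first byte of 0x45 and a length of 20 bytes; the maximum IHL of 15 gives 60 bytes.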
Wei Hu (Xavier)	ee7a8fafba	ethdev: check queue id in Rx interrupt control This patch adds queue ID checks to Rx interrupt control routines. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:48:17 +02:00
Wei Hu (Xavier)	83e813ec2a	ethdev: check if queue setup in queue-related APIs This patch adds checking whether the related Tx or Rx queue has been setup in the queue-related API functions to avoid illegal address access. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:48:17 +02:00
Wei Hu (Xavier)	9b47c352e3	ethdev: extract checking queue id into common functions This patch extracts the checking of rx_queue_id or tx_queue_id into two separate common functions named eth_dev_validate_rx_queue and eth_dev_validate_tx_queue. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:48:17 +02:00
Dekel Peled	f6859b5136	ethdev: support query of age action Existing API supports AGE action to monitor the aging of a flow. This patch implements RFC [1], introducing the response format for query of an AGE action. Application will be able to query the AGE action state. The response will be returned in the format implemented here. [1] https://mails.dpdk.org/archives/dev/2020-September/180061.html Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2020-10-16 19:48:16 +02:00
Jiawei Wang	805b8faa6b	ethdev: introduce flow sample action When using full offload, all traffic is handled by the HW and forwarded to the requested VF or wire, and the control application does not see this traffic anymore. So there is a need for an action that gives the control application some visibility into the forwarded traffic. The solution introduces a new action that samples the incoming traffic and sends duplicated traffic at the specified ratio to the application, while the original packet continues to the target destination. The fraction of packets sampled equals '1/ratio'; a ratio value of 1 means that the packets will be completely mirrored. The sample packet can be assigned a different set of actions from the original packet. In order to support the sample packet in rte_flow, a new rte_flow action definition RTE_FLOW_ACTION_TYPE_SAMPLE and structure rte_flow_action_sample are introduced. Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-10-16 19:47:58 +02:00
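To make the '1/ratio' semantics of the sample action above concrete, here is a small software-only sketch. It is an assumption-laden stand-in: real hardware may pick sampled packets differently (for example randomly), the function name `sample_this_packet_sketch` is hypothetical, and a zero ratio is simply treated as invalid.

```c
#include <assert.h>
#include <stdint.h>

/* Decide whether the packet with running index pkt_index is duplicated to
 * the application. Deterministic stand-in: every ratio-th packet is picked,
 * so on average 1/ratio of the traffic is sampled. */
static inline int
sample_this_packet_sketch(uint64_t pkt_index, uint32_t ratio)
{
	/* Assumption of this sketch: ratio 0 is not a valid configuration. */
	assert(ratio > 0);

	return (pkt_index % ratio) == 0;
}
```

With a ratio of 1 every packet is selected, which corresponds to the full mirroring described in the commit message; with a ratio of 4, one packet in four is selected.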
Thomas Monjalon	1372d0cb2a	ethdev: fix xstat name of basic stats per queue As described in doc/guides/prog_guide/poll_mode_drv.rst, the naming scheme for the xstats is parts separated with underscore: * direction * detail 1 * detail 2 * detail n * unit where detail 1 can be "q" followed with a queue number. It means the name of the stats per queue should be rx_qN_* or tx_qN_*. The second underscore was missing so far. Fixing the basic xstat names may be considered an API change, that's why it should not be backported. While fixing this mistake, some examples of the naming scheme are given as part of the API documentation of rte_eth_xstat_name. More proposals about standardizing statistics: http://fast.dpdk.org/events/slides/DPDK-2019-09-Ethernet_Statistics.pdf Fixes: `bd6aa172cf` ("ethdev: fetch extended statistics with integer ids") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ciara Power <ciara.power@intel.com>	2020-10-16 19:47:55 +02:00
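The naming scheme described in the commit above (direction, then "q" plus the queue number, then the remaining details, all separated by underscores) can be sketched like this. The helper name `queue_xstat_name_sketch` is hypothetical and the detail string used in the usage note is an illustrative assumption, not a list taken from ethdev.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a per-queue xstat name of the form "<dir>_q<N>_<detail>".
 * Returns the value of snprintf(), i.e. the length the full name needs. */
static inline int
queue_xstat_name_sketch(char *buf, size_t len, const char *dir,
			unsigned int queue, const char *detail)
{
	assert(buf != NULL && len > 0);
	assert(dir != NULL && detail != NULL);

	/* The second underscore (after the queue number) is the one the
	 * commit message says was missing. */
	return snprintf(buf, len, "%s_q%u_%s", dir, queue, detail);
}
```

Calling the helper with direction "rx", queue 3 and detail "packets" writes `rx_q3_packets` into the buffer, which matches the `rx_qN_*` pattern from the commit message.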
Ophir Munk	df655504e3	app/testpmd: cleanup tunnel protocols parsing This is a cleanup commit. It assembles all tunnel outer updates into one function call to avoid code duplication. It defines RTE_VXLAN_GPE_DEFAULT_PORT (4790) in accordance with all other tunnel protocol definitions. It replaces all numeric values 4789 with their corresponding definition RTE_VXLAN_DEFAULT_PORT. It updates the 'csum parse-tunnel' documentation. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:18:47 +02:00
Ophir Munk	ea0e711b8a	app/testpmd: add GENEVE parsing GENEVE is a widely used tunneling protocol in modern virtualized networks. testpmd already supports parsing of several tunneling protocols including VXLAN, VXLAN-GPE and GRE. This commit adds GENEVE parsing of inner protocols (IPv4-0x0800, IPv6-0x86dd, Ethernet-0x6558) based on IETF draft-ietf-nvo3-geneve-09. GENEVE is considered more flexible than the other protocols. In terms of protocol format, the GENEVE header has variable-length options, as opposed to other tunneling protocols which have a fixed header size. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-16 19:18:47 +02:00
Maxime Coquelin	7788819296	vhost: use fixed virtio-net header length packed ring This small optimization uses the static Virtio-net header length in the packed datapath, since the Virtio-net header cannot be the legacy one in case of packed ring. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2020-10-16 19:18:47 +02:00
Maxime Coquelin	22eaf26135	vhost: fix virtio-net header length with packed ring In case the packed ring layout has been negotiated, but neither Version 1 nor mergeable buffers, the Virtio-net header length is assigned the legacy devices value, which is wrong. This patch fixes this by using the proper length, as devices using packed ring are not legacy devices. Fixes: `a922401f35` ("vhost: add Rx support for packed ring") Fixes: `ae999ce49d` ("vhost: add Tx support for packed ring") Cc: stable@dpdk.org Reported-by: Marvin Liu <yong.liu@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2020-10-16 19:18:47 +02:00
Olivier Matz	fa5054c4bb	vhost: fix external mbuf creation In virtio_dev_extbuf_alloc(), the shinfo structure used to store the reference counter and the free callback of the external buffer is by default stored inside the mbuf data. This is wrong because the mbuf (and its data) can be freed before the external buffer, for instance in the following situation: pkt2 = rte_pktmbuf_alloc(mp); rte_pktmbuf_attach(pkt2, pkt); rte_pktmbuf_free(pkt); After this, pkt is freed, but it still contains shinfo, which is referenced by pkt2. Fix this by always storing the shinfo beside the external buffer. Fixes: `c3ff0ac70a` ("vhost: improve performance by supporting large buffer") Cc: stable@dpdk.org Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:18:47 +02:00
Fan Zhang	ea1b835a0e	vhost/crypto: fix feature negotiation This patch fixes the feature negotiation for vhost crypto during initialization. The patch uses the newly created driver start function to inform the driver type with the fixed vhost features. In addition the patch provides a new API specifically used by the application to start a vhost-crypto driver. Fixes: `939066d965` ("vhost/crypto: add public function implementation") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-10-16 19:18:47 +02:00
Bruce Richardson	e8a83681f4	eal/x86: fix memcpy AVX-512 enablement When testing on some x86 platforms, code compiled with meson was observed running at a different power-license level to that compiled with make. This is due to the fact that meson auto-detects the instruction sets available on the system and enabled AVX512 rte_memcpy when AVX512 was available, while on make, a build time AVX-512 flag needed to be explicitly set to enable that AVX512 rte_memcpy code path. In the absence of runtime path selection for rte_memcpy - which is complicated by it being a static inline function in a header file - we can fix this behaviour regression by similarly having a build-time option which must be set to enable the AVX-512 memcpy path. Fixes: `a25a650be5` ("build: add infrastructure for meson and ninja builds") Fixes: `3e1bb55fd6` ("build/x86: add SSE flags") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Yingya Han <yingyax.han@intel.com>	2020-10-17 12:22:01 +02:00
Omkar Maslekar	4ffc2276e2	eal: add cache line demotion API rte_cldemote is similar to a prefetch hint - in reverse. On x86, cldemote(addr) enables software to hint to hardware that line is likely to be shared. This is quite useful in core-to-core communications where cache-line is likely to be shared. ARM and PPC implementation is provided with NOP and can be added if any equivalent instructions could be used for implementation on those architectures. Signed-off-by: Omkar Maslekar <omkar.maslekar@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: David Christensen <drc@linux.vnet.ibm.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2020-10-16 14:11:45 +02:00
David Marchand	9e2af97f87	eal/windows: fix symbol export The incriminated commit forgot to clean the Windows export file. Fixes: `3cd73a1a1c` ("eal: simplify exit functions") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2020-10-16 14:01:37 +02:00
Timothy McDaniel	ff3bc497e4	eventdev: add PCI probe named convenience function Add new internal wrapper function for use by pci drivers as a .probe function to attach to an event interface. Same as rte_event_pmd_pci_probe, except the caller can specify the name. Updated rte_event_pmd_pci_probe so as to not duplicate code. Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com> Reviewed-by: Gage Eads <gage.eads@intel.com>	2020-10-15 23:25:35 +02:00
Timothy McDaniel	75d113136f	eventdev: express DLB/DLB2 PMD constraints This commit implements the eventdev ABI changes required by the DLB/DLB2 PMDs. Several data structures and constants are modified or added in this patch, thereby requiring modifications to the dependent apps and examples. The DLB/DLB2 hardware does not conform exactly to the eventdev interface. 1) It has a limit on the number of queues that may be linked to a port. 2) Some ports are further restricted to a maximum of 1 linked queue. 3) DLB does not have the ability to carry the flow_id as part of the event (QE) payload. Note that the DLB2 hardware is capable of carrying the flow_id. Following is a detailed description of the changes that have been made. 1) Add new fields to the rte_event_dev_info struct. These fields allow the device to advertise its capabilities so that applications can take the appropriate actions based on those capabilities. struct rte_event_dev_info { uint32_t max_event_port_links; /**< Maximum number of queues that can be linked to a single event port by this device. */ uint8_t max_single_link_event_port_queue_pairs; /**< Maximum number of event ports and queues that are optimized for (and only capable of) single-link configurations supported by this device. These ports and queues are not accounted for in max_event_ports or max_event_queues. */ } 2) Add a new field to the rte_event_dev_config struct. This field allows the application to specify how many of its ports are limited to a single link, or will be used in single link mode. /** Event device configuration structure */ struct rte_event_dev_config { uint8_t nb_single_link_event_port_queues; /**< Number of event ports and queues that will be singly-linked to each other. These are a subset of the overall event ports and queues; this value cannot exceed nb_event_ports or nb_event_queues. 
If the device has ports and queues that are optimized for single-link usage, this field is a hint for how many to allocate; otherwise, regular event ports and queues can be used. */ } 3) Replace the dedicated implicit_release_disabled field with a bit field of explicit port capabilities. The implicit_release_disable functionality is assigned to one bit, and a port-is-single-link-only attribute is assigned to another, with the remaining bits available for future assignment. /* Event port configuration bitmap flags */ #define RTE_EVENT_PORT_CFG_DISABLE_IMPL_REL (1ULL << 0) /**< Configure the port not to release outstanding events in rte_event_dev_dequeue_burst(). If set, all events received through the port must be explicitly released with RTE_EVENT_OP_RELEASE or RTE_EVENT_OP_FORWARD. Must be unset if the device is not RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capable. */ #define RTE_EVENT_PORT_CFG_SINGLE_LINK (1ULL << 1) /**< This event port links only to a single event queue. @see rte_event_port_setup(), rte_event_port_link() */ #define RTE_EVENT_PORT_ATTR_IMPLICIT_RELEASE_DISABLE 3 /** The implicit release disable attribute of the port */ struct rte_event_port_conf { uint32_t event_port_cfg; /**< Port cfg flags(EVENT_PORT_CFG_) */ } This patch also removes the deprecation notice and announces the new eventdev ABI changes in the release notes. Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2020-10-15 23:16:07 +02:00
Yunjian Wang	b7b09dab5e	eventdev: fix adapter leak in error path rte_event_crypto_adapter_create_ext() allocates memory for the adapter; it should be freed when an error happens, otherwise the memory is leaked. Fixes: `7901eac340` ("eventdev: add crypto adapter implementation") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2020-10-15 21:38:09 +02:00
Yunjian Wang	e3eebdced0	eventdev: check allocation in Tx adapter The function rte_zmalloc() could return NULL, so the return value needs to be checked. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>	2020-10-15 21:20:26 +02:00
Mike Ximing Chen	62303b08b7	eventdev: support telemetry with xstats info The telemetry library is connected with eventdev xstats and port link info. The following new telemetry commands are added: /eventdev/dev_list /eventdev/port_list,DevID /eventdev/queue_list,DevID /eventdev/dev_xstats,DevID /eventdev/port_xstats,DevID,PortID /eventdev/queue_xstats,DevID,PortID /eventdev/queue_links,DevID,PortID queue_links command displays a list of queues linked with a specified eventdev port and a service priority associated with each link. Signed-off-by: Mike Ximing Chen <mike.ximing.chen@intel.com> Reviewed-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Gage Eads <gage.eads@intel.com>	2020-10-15 21:09:35 +02:00
Suanming Mou	80d1a9aff7	ethdev: make flow API thread safe Currently, the rte_flow functions are not defined as thread safe. DPDK applications either call the functions from a single thread or protect concurrent calls to the rte_flow operations with a lock. For PMDs that natively support thread-safe flow operations, the redundant protection in the application hurts the performance of the rte_flow operation functions. The lack of a thread-safety guarantee for the rte_flow functions also limits what applications can expect. This feature changes the rte_flow functions to be thread safe. As different PMDs have different flow operations, some may already be thread safe and others may not. For PMDs that do not support thread-safe flow operations, a new lock is defined in ethdev in order to protect thread-unsafe PMDs at the rte_flow level. A new RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE device flag is added to indicate whether the PMD supports thread-safe flow operations. When a PMD sets the RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag, the rte_flow level functions skip the thread-safety helper lock for that PMD. Again, the rte_flow level lock only takes effect when the PMD operation functions are not thread safe. A PMD that does not want the default mutex lock can set the flag and add its preferred type of lock in the PMD; the default mutex lock is then replaced by the PMD level lock. The change has no effect on current DPDK applications, and no change is required in them. For the standard POSIX pthread_mutex, if there is no contention on the added rte_flow level mutex, the mutex only does an atomic increment in pthread_mutex_lock() and a decrement in pthread_mutex_unlock(). No futex() syscall is involved. 
Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2020-10-16 00:44:58 +02:00
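The flag-gated locking described in the commit above can be sketched roughly as follows. This is not the ethdev code: `struct sketch_dev`, `SKETCH_FLOW_OPS_THREAD_SAFE`, `sketch_flow_call()` and `sketch_noop()` are hypothetical names, and the `lock_count` field exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the "PMD flow ops are thread safe" flag. */
#define SKETCH_FLOW_OPS_THREAD_SAFE 0x1u

struct sketch_dev {
	uint32_t flags;            /* device flags set by the driver */
	pthread_mutex_t flow_lock; /* generic lock owned by the flow layer */
	unsigned int lock_count;   /* times the wrapper took flow_lock */
};

/* Example driver operation that does nothing and reports success. */
static inline int
sketch_noop(struct sketch_dev *dev)
{
	(void)dev;
	return 0;
}

/* Call a driver flow operation, taking the generic lock only when the
 * driver has not declared its flow operations thread safe. */
static inline int
sketch_flow_call(struct sketch_dev *dev, int (*op)(struct sketch_dev *))
{
	int ret;

	assert(dev != NULL && op != NULL);

	if (dev->flags & SKETCH_FLOW_OPS_THREAD_SAFE)
		return op(dev); /* driver handles its own locking */

	pthread_mutex_lock(&dev->flow_lock);
	dev->lock_count++;
	ret = op(dev);
	pthread_mutex_unlock(&dev->flow_lock);
	return ret;
}
```

A driver that brings its own lock would set the flag and synchronise inside its operation, so the wrapper adds no cost for it; every other driver is serialised by the generic mutex without any change on the application side.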
Suanming Mou	96bb99f270	eal/windows: add pthread mutex Add pthread mutex lock as it is needed for the thread safe rte_flow functions. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Tested-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>	2020-10-16 00:44:58 +02:00
Somnath Kotur	ed94631da7	mbuf: extend meaning of QinQ stripped bit Clarify the documentation of QinQ flags, and extend the meaning of the flag: if PKT_RX_QINQ_STRIPPED is set and PKT_RX_VLAN_STRIPPED is unset, only the outer VLAN is removed from packet data, but both tci are saved in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer). Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2020-10-15 23:04:55 +02:00
Akhil Goyal	486f067a41	security: modify PDCP xform to support SDAP The SDAP is a protocol in the LTE stack on top of PDCP for QoS. A particular PDCP session may or may not have SDAP enabled. But if it is enabled, the SDAP header should be authenticated but not encrypted if both confidentiality and integrity are enabled. Hence, the driver should be informed through the xform so that it skips the SDAP header during encryption. A new field is added in the PDCP xform to specify that SDAP is enabled. The overall size of the xform is not changed, as hfn_ovrd is just a flag and does not need uint32. Hence, it is converted to uint8_t and a 16 bit reserved field is added for future use. Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-14 22:24:41 +02:00
Arek Kusztal	3a6f835b33	cryptodev: remove algo lists end This patch removes enumerators RTE_CRYPTO_CIPHER_LIST_END, RTE_CRYPTO_AUTH_LIST_END, RTE_CRYPTO_AEAD_LIST_END to prevent ABI breakage that may arise when adding new crypto algorithms. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 22:22:06 +02:00
Fan Zhang	eb7eed345c	cryptodev: add raw crypto datapath API This patch adds raw data-path APIs for enqueue and dequeue operations to cryptodev. The APIs support flexible user-defined enqueue and dequeue behaviors. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Signed-off-by: Piotr Bronowski <piotrx.bronowski@intel.com> Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 22:22:06 +02:00
Fan Zhang	8d928d47a2	cryptodev: change crypto symmetric vector structure This patch updates ``rte_crypto_sym_vec`` structure to add support for both cpu_crypto synchronous operation and asynchronous raw data-path APIs. The patch also includes AESNI-MB and AESNI-GCM PMD changes, unit test changes and documentation updates. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 22:22:06 +02:00
Haggai Eran	784fb396f7	cryptodev: fix parameter parsing The rte_cryptodev_pmd_parse_input_args function crashes with a segmentation fault when passing a non-empty argument string. The function passes cryptodev_pmd_valid_params to rte_kvargs_parse, which accepts a NULL-terminated list of valid keys, yet cryptodev_pmd_valid_params does not end with NULL. The patch adds the missing NULL pointer. Fixes: `9e6edea418` ("cryptodev: add APIs to assist PMD initialisation") Cc: stable@dpdk.org Signed-off-by: Haggai Eran <haggaie@nvidia.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 22:22:06 +02:00
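The bug class fixed above, a key list that is scanned until a NULL sentinel but does not end with one, can be illustrated with a small standalone sketch. The array contents and the names `sketch_valid_params` and `key_is_valid_sketch` are hypothetical; the snippet does not reproduce the cryptodev or kvargs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of accepted keys. The trailing NULL is the sentinel
 * the scanning loop relies on; without it the loop would run past the
 * end of the array. */
static const char * const sketch_valid_params[] = {
	"name",
	"max_nb_queue_pairs",
	"socket_id",
	NULL
};

/* Return 1 if key appears in the NULL-terminated list, 0 otherwise. */
static inline int
key_is_valid_sketch(const char * const valid[], const char *key)
{
	assert(valid != NULL && key != NULL);

	for (size_t i = 0; valid[i] != NULL; i++) {
		if (strcmp(valid[i], key) == 0)
			return 1;
	}
	return 0;
}
```

A lookup for a key that is not in the list only terminates safely because the loop eventually reaches the NULL entry, which is why the missing terminator showed up as a crash as soon as a non-empty argument string was parsed.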
Adam Dybkowski	1483f91b30	cryptodev: remove v20 ABI compatibility This reverts commit `a0f0de06d4` as the rte_cryptodev_info_get function versioning was a temporary solution to maintain ABI compatibility for ChaCha20-Poly1305 and is not needed in 20.11. Fixes: `a0f0de06d4` ("cryptodev: fix ABI compatibility for ChaCha20-Poly1305") Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com> Reviewed-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 22:22:06 +02:00
Conor Walsh	a748d24d79	ipsec: promote library as stable Since librte_ipsec was first introduced in 19.02 and there were no changes in its public API since 19.11, it should be considered mature enough to remove the 'experimental' tag from it. The RTE_SATP_LOG2_NUM enum is also being dropped from rte_ipsec_sa.h to avoid possible ABI problems in the future. Signed-off-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2020-10-14 21:26:36 +02:00
Thomas Monjalon	3cd73a1a1c	eal: simplify exit functions The option RTE_EAL_ALWAYS_PANIC_ON_ERROR was off by default, and not customizable with meson. It is completely removed. The function rte_dump_registers is a trace of the bare metal support era, and was not supported in userland. It is completely removed. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>	2020-10-15 22:33:47 +02:00
Harry van Haaren	31f83163cf	eal: add new prefetch write variants This commit adds new rte_prefetchX_write() variants, that suggest to the compiler to use a prefetch instruction with intention to write. As a compiler builtin, the compiler can choose based on compilation target what the best implementation for this instruction is. Three versions are provided, targeting the different levels of cache. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2020-10-15 21:49:59 +02:00
Eli Britstein	057d9a92f0	eal: fix build with conflicting libc variable memory_order The cited commit introduced functions with 'int memory_order' argument. The C11 standard section 7.17.1.4 defines 'memory_order' as the "enumerated type whose enumerators identify memory ordering constraints". A compilation error occurs: error: declaration of 'memory_order' shadows a global declaration [-Werror=shadow] rte_atomic_thread_fence(int memory_order) This issue was hit when trying to compile OVS with gcc 4.8.5. This compiler version does not provide stdatomic.h, so enum memory_order is redefined in OVS code. In another case, if the compiler does provide stdatomic.h header, passing -Wsystem-headers in the CFLAGS will also cause that failure. Fix it by changing the argument name 'memory_order' to 'memorder'. Fixes: `672a150563` ("eal: add wrapper for C11 atomic thread fence") Signed-off-by: Eli Britstein <elibr@nvidia.com> Reviewed-by: Asaf Penso <asafp@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2020-10-15 18:49:53 +02:00
Konstantin Ananyev	e0a1cd7a62	acl: fix build with gcc 5.4.0 gcc 5.4 fails with: ../lib/librte_acl/acl_run_avx512x8.h: In function 'match_process_avx512x8': ../lib/librte_acl/acl_run_avx512x8.h:382:31: error: pointer targets in passing argument 1 of '_mm256_mask_i32scatter_epi32' differ in signedness [-Werror=pointer-sign] Later gcc versions work fine, as for them parameter type was changed to 'void *'. Fixed by applying explicit cast for offending argument. Bugzilla ID: 556 Fixes: `b64c2295f7` ("acl: add 256-bit AVX512 classify method") Fixes: `45da22e42e` ("acl: add 512-bit AVX512 classify method") Reported-by: Ali Alnubani <alialnu@nvidia.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Ali Alnubani <alialnu@nvidia.com>	2020-10-15 14:31:46 +02:00
David Marchand	0c0d0d9df7	eal: add experimental tags for write combining store Only marking the doxygen declarations is not enough. Arch specific implementations must be tagged as well since there is no common declaration of those inlines. Fixes: `8a00dfc738` ("eal: add write combining store") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Radu Nicolau <radu.nicolau@intel.com>	2020-10-15 08:45:30 +02:00
Savinay Dharmappa	bf32a357e2	sched: remove redundant subport parameters Remove redundant data structure fields. Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-15 02:14:28 +02:00
Savinay Dharmappa	ac6fcb841b	sched: update subport rate dynamically Add support to update subport rate dynamically. Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-15 02:13:08 +02:00
Savinay Dharmappa	5f757d8fcc	sched: introduce subport profile add function API to add new subport bandwidth profile. Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-15 02:11:55 +02:00
Savinay Dharmappa	0ea4c6afca	sched: add subport profile table Add subport profile table to internal port data structure and update the port config function. Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com> Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-15 02:11:50 +02:00
Dmitry Kozlyuk	841dfdd06d	cmdline: support Windows Implement terminal handling, input polling, and vdprintf() for Windows. Because Windows I/O model differs fundamentally from Unix and there is no concept of character device, polling is simulated depending on the underlying input device. Supporting non-terminal input is useful for automated testing. Windows emulation of VT100 uses "ESC [ E" for newline instead of standard "ESC E", so add a workaround. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	f40a74cfcf	eal/windows: improve compatibility networking headers Extend compatibility header system to support librte_cmdline. pthread.h has to include windows.h, which exposes struct in_addr, etc. conflicting with compatibility headers. WIN32_LEAN_AND_MEAN macro is required to disable this behavior. Use rte_windows.h to define WIN32_LEAN_AND_MEAN for pthread library. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	b5741b5704	cmdline: add internal wrapper for vdprintf Add internal wrapper for vdprintf(3) that is only available on Unix. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	9251cd97a6	cmdline: add internal wrappers for character input poll(3) is a purely Unix facility, so it cannot be directly used by common code. read(2) is limited in device support outside of Unix. Create wrapper functions and implement them for Unix. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	7b5f4e1d30	cmdline: add internal wrappers for terminal handling Add functions that set up, save, and restore terminal parameters. Use existing code as Unix implementation. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	51fcb6a1fe	cmdline: make implementation logically opaque struct cmdline exposes platform-specific members it contains, most notably struct termios that is only available on Unix. While ABI considerations prevent hiding the definition on already supported platforms, struct cmdline is considered logically opaque from now on. Add a deprecation notice targeted at 20.11. * Remove tests checking struct cmdline content as meaningless. * Fix missing cmdline_free() in unit test. * Add cmdline_get_rdline() to access history buffer indirectly. The new function is currently used only in tests. Suggested-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk	f4cbdbc7fb	eal/windows: implement alarm API Implementation is based on the waitable timers Win32 API. When a timer is set, a callback and its argument are supplied to the OS, while the timer handle is stored in the EAL alarm list. When the timer expires, the OS wakes up the interrupt thread and runs the callback. Upon completion it removes the alarm. Waitable timers must be set from the thread their callback will run in; eal_intr_thread_schedule() provides a way to schedule asynchronous code execution in the interrupt thread. The alarm module builds synchronous timer setup on top of it. Windows alarms are not a type of DPDK interrupt handle and do not interact with the interrupt module beyond executing in the same thread. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>	2020-10-14 22:54:04 +02:00
Dmitry Kozlyuk	5c016fc020	eal/windows: add interrupt thread skeleton Windows interrupt support is based on IO completion ports (IOCP). Interrupt thread would send the devices requests to notify about interrupts and then wait for any request completion. Add skeleton code of this model without any hardware support. Another way to wake up the interrupt thread is APC (asynchronous procedure call), scheduled by any other thread via eal_intr_thread_schedule(). This internal API is intended for alarm implementation. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>	2020-10-14 22:48:38 +02:00
Ting Xu	99541c3028	table: fix hash for 32-bit When create softnic hash table with 16 keys, it failed on 32-bit environment, because the pointer field in structure rte_bucket_4_16 is only 32 bits. Add a padding field in 32-bit environment to keep the structure to a multiple of 64 bytes. Apply this to 8-byte and 32-byte key hash function as well. Fixes: `8aa327214c` ("table: hash") Cc: stable@dpdk.org Signed-off-by: Ting Xu <ting.xu@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-14 14:42:29 +02:00
Konstantin Ananyev	c5cf148d89	acl: deduplicate AVX512 code Current rte_acl_classify_avx512x32() and rte_acl_classify_avx512x16() code paths are very similar. The only differences are due to 256/512 register/intrinsics naming conventions. So to deduplicate the code: - Move common code into “acl_run_avx512_common.h” - Use macros to hide difference in naming conventions Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:37:51 +02:00
Konstantin Ananyev	6fba1c8ba0	acl: optimize AVX512 classify with 4 bytes loads With current ACL implementation first field in the rule definition has always to be one byte long. Though for optimising classify implementation it might be useful to do 4B reads (as we do for rest of the fields). So at build phase, check user provided field definitions to determine is it safe to do 4B loads for first ACL field. Then at run-time this information can be used to choose classify behavior. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:01 +02:00
Konstantin Ananyev	45da22e42e	acl: add 512-bit AVX512 classify method Introduce classify implementation that uses AVX512 specific ISA. rte_acl_classify_avx512x32() is able to process up to 32 flows in parallel. It uses 512-bit width registers/instructions and provides higher performance than rte_acl_classify_avx512x16(), but can cause frequency level change. Note that for now only 64-bit version is supported. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:01 +02:00
Konstantin Ananyev	867d0d3649	acl: select 256-bit AVX512 classify method by default On supported platforms, set RTE_ACL_CLASSIFY_AVX512X16 as default ACL classify algorithm. Note that AVX512X16 implementation uses 256-bit registers/intrinsics only to avoid possibility of frequency drop. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:01 +02:00
Konstantin Ananyev	b64c2295f7	acl: add 256-bit AVX512 classify method Introduce classify implementation that uses AVX512 specific ISA. rte_acl_classify_avx512x16() is able to process up to 16 flows in parallel. It uses 256-bit width registers/instructions only (to avoid frequency level change). Note that for now only 64-bit version is supported. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	7c6cca6b60	acl: add infrastructure for AVX512 classify methods Add necessary changes to support new AVX512 specific ACL classify algorithm: - changes in meson.build to check that build tools (compiler, assembler, etc.) do properly support AVX512. - run-time checks to make sure target platform does support AVX512. - dummy rte_acl_classify_avx512() for targets where AVX512 implementation couldn't be properly supported. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	0cea36d689	acl: rework classify method selection Right now ACL library determines best possible (default) classify method on a given platform with special constructor function rte_acl_init(). This patch makes the following changes: - Move selection of default classify method into a separate private function and call it for each ACL context creation (rte_acl_create()). - Remove library constructor function - Make rte_acl_set_ctx_classify() to check that requested algorithm is supported on given platform. The purpose of these changes to improve and simplify algorithm selection process and prepare ACL library to be integrated with the max SIMD bitwidth series in discussion. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	ad20877a30	acl: remove classify methods count enum Removal of unused enum value (RTE_ACL_CLASSIFY_NUM). This enum value is not used inside DPDK, while it prevents to add new classify algorithms without causing an ABI breakage. Note that this change introduce a formal ABI incompatibility with previous versions of ACL library. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	85348c3e7d	acl: fix x86 build for compiler without AVX2 Right now we define dummy version of rte_acl_classify_avx2() when both X86 and AVX2 are not detected, though it should be for non-AVX2 case only. Fixes: `e53ce4e413` ("acl: remove use of weak functions") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2020-10-14 14:23:00 +02:00
Vladimir Medvedkin	afd9edb0d3	eal/x86: introduce type for AVX 512-bit New data type to manipulate 512 bit AVX values. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:00 +02:00
Tal Shnaiderman	bbdab351ee	eal/windows: export all built functions for clang export for clang build all the functions currently built on Windows and listed in rte_eal_version.map by adding them to rte_eal_exports.def. Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2020-10-14 11:49:33 +02:00
Mairtin o Loingsigh	17a937baed	net: add CRC AVX512 implementation This patch enables the optimized calculation of CRC32-Ethernet and CRC16-CCITT using the AVX512 and VPCLMULQDQ instruction sets. This CRC implementation is built if the compiler supports the required instruction sets. It is selected at run-time if the host CPU, again, supports the required instruction sets. Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com> Signed-off-by: David Coyle <david.coyle@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2020-10-13 19:26:15 +02:00
Mairtin o Loingsigh	ef94569cf9	net: add CRC implementation runtime selection This patch adds support for run-time selection of the optimal architecture-specific CRC path, based on the supported instruction set(s) of the CPU. The compiler option checks have been moved from the C files to the meson script. The rte_cpu_get_flag_enabled function is called automatically by the library at process initialization time to determine which instructions the CPU supports, with the most optimal supported CRC path ultimately selected. Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com> Signed-off-by: David Coyle <david.coyle@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-13 19:26:03 +02:00
Wei Hu (Xavier)	b15a936d75	eal/arm64: update CPU flags ARM64 Linux kernel updated the CPU flags using the HWCAP scheme. The related macro definition can be found in linux kernel: arch/arm64/include/uapi/asm/hwcap.h This patch incorporates those changes to the EAL library. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2020-10-13 17:52:11 +02:00
Ruifeng Wang	e9b9739264	config: remap flags used for Arm platforms RTE_ARCH_xx flags are used to distinguish platform architectures. These flags can be used to pick different code paths for different architectures at compile time. For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM, RTE_ARCH_ARMv7 and RTE_ARCH_ARM64. RTE_ARCH_ARM64 is for 64-bit aarch64 platforms, and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms. RTE_ARCH_ARMv7 is for ARMv7 platforms as its name suggested. The issue is meaning of RTE_ARCH_ARM is not clear enough. Because no info about platform word length is included in the name. To make the flag names more clear, a naming scheme is proposed. RTE_ARCH_ARM (all Arm platforms) \| +----RTE_ARCH_32 (New. 32-bit platforms of all architectures) \| \| \| +----RTE_ARCH_ARMv7 (ARMv7 platforms) \| \| \| +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine) \| +----RTE_ARCH_64 (64-bit platforms of all architectures) \| +----RTE_ARCH_ARM64 (64-bit Arm platforms) RTE_ARCH_32 will be explicitly defined for 32-bit platforms. To fit into the new naming scheme, current usage of RTE_ARCH_ARM in project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32). Matching flags for other architectures are: RTE_ARCH_X86 \| +----RTE_ARCH_32 \| \| \| +----RTE_ARCH_I686 \| \| \| +----RTE_ARCH_X86_X32 \| +----RTE_ARCH_64 \| +----RTE_ARCH_X86_64 RTE_ARCH_PPC_64 ---- RTE_ARCH_64 Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com>	2020-10-13 16:35:48 +02:00
Radu Nicolau	8a00dfc738	eal: add write combining store Add rte_write32_wc and rte_write32_wc_relaxed functions that implement 32bit stores using write combining memory protocol. Provided generic stubs and x86 implementation. Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-13 14:11:16 +02:00
Nick Connolly	9d42642e86	mem: fix allocation failure on non-NUMA kernel Running dpdk-helloworld on Linux with lib numa present, but no kernel support for NUMA (CONFIG_NUMA=n) causes rte_service_init() to fail with EAL: error allocating rte services array. alloc_seg() calls get_mempolicy to verify that the allocation has happened on the correct socket, but receives ENOSYS from the kernel and fails the allocation. The allocated socket should only be verified if check_numa() is true. Fixes: `2a96c88be8` ("mem: ease init in a docker container") Cc: stable@dpdk.org Signed-off-by: Nick Connolly <nick.connolly@mayadata.io> Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2020-10-13 14:02:18 +02:00
Chas Williams	d98b0fc1af	net: check segment pointer in raw checksum processing If the overall pkt_len and segment lengths are out of agreement, it is possible for the seg to be NULL after the loop. Add assert to check this condition in debug builds. Otherwise, return failure. Fixes: `c442fed81b` ("net: add function to calculate checksum in mbuf") Cc: stable@dpdk.org Signed-off-by: Chas Williams <3chas3@gmail.com>	2020-10-12 23:09:52 +02:00
David Marchand	1a11380bf4	eal: fix doxygen for EAL cleanup Align rte_eal_cleanup return codes description to the rest of dpdk. Fixes: `aec9c13c52` ("eal: add function to release internal resources") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-12 14:19:05 +02:00
Min Hu (Connor)	b7ccfb09da	ethdev: introduce FEC API This patch adds Forward error correction(FEC) support for ethdev. Introduce APIs which support query and config FEC information in hardware. Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: Chengchang Tang <tangchengchang@huawei.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-10-09 13:17:43 +02:00
Huisong Li	9f6dc8592d	ethdev: fix data type in TC queues Currently, base and nb_queue in the tc_rxq and tc_txq information of queue and TC mapping on both TX and RX paths are uint8_t. However, these data will be truncated when queue number under a TC is greater than 256. So it is necessary for base and nb_queue to change from uint8_t to uint16_t. Fixes: `89d6728c78` ("ethdev: get DCB information") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-08 19:58:11 +02:00
Dekel Peled	c7870bfe09	ethdev: move RSS expansion code to mlx5 driver Patch [1] added support for RSS flow expansion. It was added in ethdev for public use, but until now it is used only by MLX5 PMD. To allow local changes in this code, this patch removes it from ethdev and moves it to MLX5 PMD file. [1] commit `4ed05fcd44` ("ethdev: add flow API to expand RSS flows") Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2020-10-08 19:58:11 +02:00
Dekel Peled	7f6a3168ed	ethdev: fix RSS flow expansion in case of mismatch Function rte_flow_expand_rss() is used to expand a flow rule with partial pattern into several rules, to ensure all relevant packets are matched. It uses utility function rte_flow_expand_rss_item_complete(), to check if the last valid item in the flow rule pattern needs to be completed. For example the pattern "eth / ipv4 proto is 17 / end" will be completed with a "udp" item. This function returns "void" item in two cases: 1) The last item has empty spec, for example "eth / ipv4 / end". 2) The last item has spec that can't be expanded for RSS. For example the pattern "eth / ipv4 proto is 47 / end" ends with IPv4 item that has next protocol GRE. In both cases the flow rule may be expanded, but in the second case such expansion may create rules with invalid pattern. For example "eth / ipv4 proto is 47 / udp / end". In such a case the flow rule should not be expanded. This patch updates function rte_flow_expand_rss_item_complete(). Return value RTE_FLOW_ITEM_TYPE_END is used to indicate the flow rule should not be expanded. In such a case, rte_flow_expand_rss() will return with the original flow rule only, without any expansion. Fixes: `fc2dd8dd49` ("ethdev: fix expand RSS flows") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@nvidia.com> Acked-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2020-10-08 19:58:11 +02:00
Ferruh Yigit	7ae5c75f37	ethdev: check if queues are allocated before getting info A crash is detected when '--txpkts=#' parameter provided to the testpmd, this is because queue information is requested before queues have been allocated. Adding check to queue info APIs ('rte_eth_rx_queue_info_get()' & 'rte_eth_tx_queue_info_get') to protect against similar cases. Fixes: `ba2fb4f022` ("ethdev: check if queue setup when getting queue info") Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2020-10-08 19:58:11 +02:00
Stephen Hemminger	bfa63c4d7b	ethdev: use mbuf bulk free API The mbuf library now has routine to free multiple buffers. Loop is no longer needed. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2020-10-08 19:58:11 +02:00
David Marchand	0e995cbcfc	eal: fix experimental block for 20.11 In EAL, we try to sort the experimental symbols per the release they were introduced in. Fixes: `8929de043e` ("service: retrieve lcore active state") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2020-10-08 15:20:51 +02:00
Cristian Dumitrescu	f63ba2005e	pipeline: fix instruction config free Coverity issue: 362901 Fixes: `a1711f948d` ("pipeline: add SWX Rx and extract instructions") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-08 15:09:28 +02:00
Cristian Dumitrescu	941717ffe1	pipeline: fix unused variable Coverity issue: 362855 Fixes: `75634474ca` ("pipeline: add SWX instruction verifier") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-08 15:09:28 +02:00
Cristian Dumitrescu	0ebe8c38a3	pipeline: fix memory free Coverity issue: 362796, 362804, 362819, 362836, 362858, 362865, 362869 Fixes: `3ca60ceed7` ("pipeline: add SWX pipeline specification file") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2020-10-08 15:09:25 +02:00


6619 Commits