numam-dpdk

Author	SHA1	Message	Date
Stephen Hemminger	cbb44143be	app/dumpcap: add new packet capture application This is a new packet capture application to replace existing pdump. The new application works like Wireshark dumpcap program and supports the pdump API features. It is not complete yet some features such as filtering are not implemented. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2021-10-22 22:40:58 +02:00
Stephen Hemminger	10f726efe2	pdump: support pcapng and filtering This enhances the DPDK pdump library to support new pcapng format and filtering via BPF. The internal client/server protocol is changed to support two versions: the original pdump basic version and a new pcapng version. The internal version number (not part of exposed API or ABI) is intentionally increased to cause any attempt to try mismatched primary/secondary process to fail. Add new API to do allow filtering of captured packets with DPDK BPF (eBPF) filter program. It keeps statistics on packets captured, filtered, and missed (because ring was full). Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-22 22:07:48 +02:00
Stephen Hemminger	8d23ce8f5e	pcapng: add new library for writing pcapng files This is utility library for writing pcapng format files used by Wireshark family of utilities. Older tcpdump also knows how to read (but not write) this format. See https://github.com/pcapng/pcapng/ Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-22 17:19:07 +02:00
Shijith Thotton	796b07e9c6	examples/l2fwd-event: support event vector Added changes to receive packets as event vector. By default this is disabled and can be enabled using the option --event-vector. Vector size and timeout to form the vector can be configured using options --event-vector-size and --event-vector-tmo. Example: dpdk-l2fwd-event -l 0-3 -n 4 -- -p 0x03 --mode=eventdev \ --eventq-sched=ordered --event-vector --event-vector-size 16 Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:16:00 +02:00
Shijith Thotton	e8adca1951	examples/l3fwd: support event vector Added changes to receive packets as event vector. By default this is disabled and can be enabled using the option --event-vector. Vector size and timeout to form the vector can be configured using options --event-vector-size and --event-vector-tmo. Example: dpdk-l3fwd -l 0-3 -n 4 -- -p 0x03 --mode=eventdev \ --eventq-sched=ordered --event-vector --event-vector-size 16 Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:16:00 +02:00
Rashmi Shetty	20841a2551	app/eventdev: support burst enqueue Introduce a new command line option prod_enq_burst_sz to set burst size for eventdev enqueue at producer in perf_queue test. The newly added function perf_producer_burst is called when prod_enq_burst_sz is greater than 1. Signed-off-by: Rashmi Shetty <rashmi.shetty@intel.com> Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:16:00 +02:00
Pavan Nikhilesh	f3f3a91788	eventdev/timer: move adapters memory to hugepage Move memory used by timer adapters to hugepage. Allocate memory on the first adapter create or lookup to address both primary and secondary process usecases. This will prevent TLB misses if any and aligns to memory structure of other subsystems. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2021-10-21 10:16:00 +02:00
Pavan Nikhilesh	1dcd67ba1e	eventdev/timer: rearrange struct fields Rearrange fields in rte_event_timer data structure to remove holes. Also, remove use of volatile from rte_event_timer. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2021-10-21 10:14:50 +02:00
Pavan Nikhilesh	d35e61322d	eventdev: move inline APIs into separate structure Move fastpath inline function pointers from rte_eventdev into a separate structure accessed via a flat array. The intention is to make rte_eventdev and related structures private to avoid future API/ABI breakages.` Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-21 10:14:50 +02:00
Ganapati Kundapura	814d017093	eventdev/eth_rx: support telemetry Added telemetry callbacks to get Rx adapter stats, reset stats and to get Rx queue config information. Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:14:50 +02:00
Naga Harish K S V	b06bca69b7	eventdev/eth_rx: add per-queue event buffer Added per queue buffer. To configure per queue event buffer size, application sets rte_event_eth_rx_adapter_params::use_queue_event_buf flag as true while using rte_event_eth_rx_adapter_create_with_params(). The per queue event buffer size is populated in rte_event_eth_rx_adapter_queue_conf::event_buf_size and passed to rte_event_eth_rx_adapter_queue_add(). Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-10-21 10:14:50 +02:00
Naga Harish K S V	bc0df25c83	eventdev/eth_rx: add event buffer size configurability Currently event buffer is static array with a default size defined internally. To configure event buffer size from application, rte_event_eth_rx_adapter_create_with_params() API is added which takes struct rte_event_eth_rx_adapter_params to configure event buffer size in addition other params. The event buffer size is rounded up for better buffer utilization and performance. In case of NULL params argument, default event buffer size is used. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:14:50 +02:00
Ganapati Kundapura	da781e6488	eventdev/eth_rx: support Rx queue config get Added rte_event_eth_rx_adapter_queue_conf_get() API to get rx queue information - event queue identifier, flags for handling received packets, scheduler type, event priority, polling frequency of the receive queue and flow identifier in rte_event_eth_rx_adapter_queue_conf structure Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:14:50 +02:00
Pavan Nikhilesh	929ebdd543	eventdev/eth_rx: simplify event vector config Include vector configuration into the structure ``rte_event_eth_rx_adapter_queue_conf`` that is used to configure Rx adapter ethernet device Rx queue parameters. This simplifies event vector configuration as it avoids splitting configuration per Rx queue. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-21 10:14:50 +02:00
David Marchand	1752b08781	test: rely on EAL detection for core list Cores count has a direct impact on the time needed to complete unit tests. Currently, the core list used for unit test is enforced to "all cores on the system" with no way for (CI) users to adapt it. On the other hand, EAL default behavior (when no -c/-l option gets passed) is to start threads on as many cores available in the process cpu affinity. Remove logic from meson: users can then select where to run the tests by either running meson with a custom cpu affinity (using taskset/cpuset depending on OS) or by passing a --test-args option to meson. Example: $ sudo meson test -C build --suite fast-tests -t 3 --test-args "-l 0-3" Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Aaron Conole <aconole@redhat.com>	2021-10-21 17:48:04 +02:00
Gregory Etelson	59f3a8acbc	app/testpmd: add flex item commands Network port hardware is shipped with fixed number of supported network protocols. If application must work with a protocol that is not included in the port hardware by default, it can try to add the new protocol to port hardware. Flex item or flex parser is port infrastructure that allows application to add support for a custom network header and offload flows to match the header elements. Application must complete the following tasks to create a flow rule that matches custom header: 1. Create flow item object in port hardware. Application must provide custom header configuration to PMD. PMD will use that configuration to create flex item object in port hardware. 2. Create flex patterns to match. Flex pattern has a spec and a mask components, like a regular flow item. Combined together, spec and mask can target unique data sequence or a number of data sequences in the custom header. Flex patterns of the same flex item can have different lengths. Flex pattern is identified by unique handler value. 3. Create a flow rule with a flex flow item that references flow pattern. Testpmd flex CLI commands are: testpmd> flow flex_item create <port> <flex_id> <filename> testpmd> set flex_pattern <pattern_id> \ spec <spec data> mask <mask data> testpmd> set flex_pattern <pattern_id> is <spec_data> testpmd> flow create <port> ... \ / flex item is <flex_id> pattern is <pattern_id> / ... The patch works with the jansson library API. A new optional dependency on jansson library is added for testpmd. If jansson not detected the flex item functionality is disabled. Jansson development files must be present: jansson.pc, jansson.h libjansson.[a,so] Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-20 19:00:26 +02:00
Viacheslav Ovsiienko	dc4d860e8a	ethdev: introduce configurable flexible item 1. Introduction and Retrospective Nowadays the networks are evolving fast and wide, the network structures are getting more and more complicated, the new application areas are emerging. To address these challenges the new network protocols are continuously being developed, considered by technical communities, adopted by industry and, eventually implemented in hardware and software. The DPDK framework follows the common trends and if we bother to glance at the RTE Flow API header we see the multiple new items were introduced during the last years since the initial release. The new protocol adoption and implementation process is not straightforward and takes time, the new protocol passes development, consideration, adoption, and implementation phases. The industry tries to mitigate and address the forthcoming network protocols, for example, many hardware vendors are implementing flexible and configurable network protocol parsers. As DPDK developers, could we anticipate the near future in the same fashion and introduce the similar flexibility in RTE Flow API? Let's check what we already have merged in our project, and we see the nice raw item (rte_flow_item_raw). At the first glance, it looks superior and we can try to implement a flow matching on the header of some relatively new tunnel protocol, say on the GENEVE header with variable length options. And, under further consideration, we run into the raw item limitations: - only fixed size network header can be represented - the entire network header pattern of fixed format (header field offsets are fixed) must be provided - the search for patterns is not robust (the wrong matches might be triggered), and actually is not supported by existing PMDs - no explicitly specified relations with preceding and following items - no tunnel hint support As the result, implementing the support for tunnel protocols like aforementioned GENEVE with variable extra protocol option with flow raw item becomes very complicated and would require multiple flows and multiple raw items chained in the same flow (by the way, there is no support found for chained raw items in implemented drivers). This RFC introduces the dedicated flex item (rte_flow_item_flex) to handle matches with existing and new network protocol headers in a unified fashion. 2. Flex Item Life Cycle Let's assume there are the requirements to support the new network protocol with RTE Flows. What is given within protocol specification: - header format - header length, (can be variable, depending on options) - potential presence of extra options following or included in the header the header - the relations with preceding protocols. For example, the GENEVE follows UDP, eCPRI can follow either UDP or L2 header - the relations with following protocols. For example, the next layer after tunnel header can be L2 or L3 - whether the new protocol is a tunnel and the header is a splitting point between outer and inner layers The supposed way to operate with flex item: - application defines the header structures according to protocol specification - application calls rte_flow_flex_item_create() with desired configuration according to the protocol specification, it creates the flex item object over specified ethernet device and prepares PMD and underlying hardware to handle flex item. On item creation call PMD backing the specified ethernet device returns the opaque handle identifying the object has been created - application uses the rte_flow_item_flex with obtained handle in the flows, the values/masks to match with fields in the header are specified in the flex item per flow as for regular items (except that pattern buffer combines all fields) - flows with flex items match with packets in a regular fashion, the values and masks for the new protocol header match are taken from the flex items in the flows - application destroys flows with flex items - application calls rte_flow_flex_item_release() as part of ethernet device API and destroys the flex item object in PMD and releases the engaged hardware resources 3. Flex Item Structure The flex item structure is intended to be used as part of the flow pattern like regular RTE flow items and provides the mask and value to match with fields of the protocol item was configured for. struct rte_flow_item_flex { void handle; uint32_t length; const uint8_t pattern; }; The handle is some opaque object maintained on per device basis by underlying driver. The protocol header fields are considered as bit fields, all offsets and widths are expressed in bits. The pattern is the buffer containing the bit concatenation of all the fields presented at item configuration time, in the same order and same amount. If byte boundary alignment is needed an application can use a dummy type field, this is just some kind of gap filler. The length field specifies the pattern buffer length in bytes and is needed to allow rte_flow_copy() operations. The approach of multiple pattern pointers and lengths (per field) was considered and found clumsy - it seems to be much suitable for the application to maintain the single structure within the single pattern buffer. 4. Flex Item Configuration The flex item configuration consists of the following parts: - header field descriptors: - next header - next protocol - sample to match - input link descriptors - output link descriptors The field descriptors tell the driver and hardware what data should be extracted from the packet and then control the packet handling in the flow engine. Besides this, sample fields can be presented to match with patterns in the flows. Each field is a bit pattern. It has width, offset from the header beginning, mode of offset calculation, and offset related parameters. The next header field is special, no data are actually taken from the packet, but its offset is used as a pointer to the next header in the packet, in other words the next header offset specifies the size of the header being parsed by flex item. There is one more special field - next protocol, it specifies where the next protocol identifier is contained and packet data sampled from this field will be used to determine the next protocol header type to continue packet parsing. The next protocol field is like eth_type field in MAC2, or proto field in IPv4/v6 headers. The sample fields are used to represent the data be sampled from the packet and then matched with established flows. There are several methods supposed to calculate field offset in runtime depending on configuration and packet content: - FIELD_MODE_FIXED - fixed offset. The bit offset from header beginning is permanent and defined by field_base configuration parameter. - FIELD_MODE_OFFSET - the field bit offset is extracted from other header field (indirect offset field). The resulting field offset to match is calculated from as: field_base + (offset_base & offset_mask) << offset_shift This mode is useful to sample some extra options following the main header with field containing main header length. Also, this mode can be used to calculate offset to the next protocol header, for example - IPv4 header contains the 4-bit field with IPv4 header length expressed in dwords. One more example - this mode would allow us to skip GENEVE header variable length options. - FIELD_MODE_BITMASK - the field bit offset is extracted from other header field (indirect offset field), the latter is considered as bitmask containing some number of one bits, the resulting field offset to match is calculated as: field_base + bitcount(offset_base & offset_mask) << offset_shift This mode would be useful to skip the GTP header and its extra options with specified flags. - FIELD_MODE_DUMMY - dummy field, optionally used for byte boundary alignment in pattern. Pattern mask and data are ignored in the match. All configuration parameters besides field size and offset are ignored. Note: "" - means the indirect field offset is calculated and actual data are extracted from the packet by this offset (like data are fetched by pointer p from memory). The offset mode list can be extended by vendors according to hardware supported options. The input link configuration section tells the driver after what protocols and at what conditions the flex item can follow. Input link specified the preceding header pattern, for example for GENEVE it can be UDP item specifying match on destination port with value 6081. The flex item can follow multiple header types and multiple input links should be specified. At flow creation time the item with one of the input link types should precede the flex item and driver will select the correct flex item settings, depending on the actual flow pattern. The output link configuration section tells the driver how to continue packet parsing after the flex item protocol. If multiple protocols can follow the flex item header the flex item should contain the field with the next protocol identifier and the parsing will be continued depending on the data contained in this field in the actual packet. The flex item fields can participate in RSS hash calculation, the dedicated flag is present in the field description to specify what fields should be provided for hashing. 5. Flex Item Chaining If there are multiple protocols supposed to be supported with flex items in chained fashion - two or more flex items within the same flow and these ones might be neighbors in the pattern, it means the flex items are mutual referencing. In this case, the item that occurred first should be created with empty output link list or with the list including existing items, and then the second flex item should be created referencing the first flex item as input arc, drivers should adjust the item configuration. Also, the hardware resources used by flex items to handle the packet can be limited. If there are multiple flex items that are supposed to be used within the same flow it would be nice to provide some hint for the driver that these two or more flex items are intended for simultaneous usage. The fields of items should be assigned with hint indices and these indices from two or more flex items supposed to be provided within the same flow should be the same as well. In other words, the field hint index specifies the group of fields that can be matched simultaneously within a single flow. If hint indices are specified, the driver will try to engage not overlapping hardware resources and provide independent handling of the field groups with unique indices. If the hint index is zero the driver assigns resources on its own. 6. Example of New Protocol Handling Let's suppose we have the requirements to handle the new tunnel protocol that follows UDP header with destination port 0xFADE and is followed by MAC header. Let the new protocol header format be like this: struct new_protocol_header { rte_be32 header_length; /* length in dwords, including options / rte_be32 specific0; / some protocol data, no intention / rte_be32 specific1; / to match in flows on these fields / rte_be32 crucial; / data of interest, match is needed / rte_be32 options[0]; / optional protocol data, variable length / }; The supposed flex item configuration: struct rte_flow_item_flex_field field0 = { .field_mode = FIELD_MODE_DUMMY, / Affects match pattern only / .field_size = 96, / three dwords from the beginning / }; struct rte_flow_item_flex_field field1 = { .field_mode = FIELD_MODE_FIXED, .field_size = 32, / Field size is one dword / .field_base = 96, / Skip three dwords from the beginning / }; struct rte_flow_item_udp spec0 = { .hdr = { .dst_port = RTE_BE16(0xFADE), } }; struct rte_flow_item_udp mask0 = { .hdr = { .dst_port = RTE_BE16(0xFFFF), } }; struct rte_flow_item_flex_link link0 = { .item = { .type = RTE_FLOW_ITEM_TYPE_UDP, .spec = &spec0, .mask = &mask0, }; struct rte_flow_item_flex_conf conf = { .next_header = { .tunnel = FLEX_TUNNEL_MODE_SINGLE, .field_mode = FIELD_MODE_OFFSET, .field_base = 0, .offset_base = 0, .offset_mask = 0xFFFFFFFF, .offset_shift = 2 / Expressed in dwords, shift left by 2 */ }, .sample = { &field0, &field1, }, .nb_samples = 2, .input_link[0] = &link0, .nb_inputs = 1 }; Let's suppose we have created the flex item successfully, and PMD returned the handle 0x123456789A. We can use the following item pattern to match the crucial field in the packet with value 0x00112233: struct new_protocol_header spec_pattern = { .crucial = RTE_BE32(0x00112233), }; struct new_protocol_header mask_pattern = { .crucial = RTE_BE32(0xFFFFFFFF), }; struct rte_flow_item_flex spec_flex = { .handle = 0x123456789A .length = sizeiof(struct new_protocol_header), .pattern = &spec_pattern, }; struct rte_flow_item_flex mask_flex = { .length = sizeof(struct new_protocol_header), .pattern = &mask_pattern, }; struct rte_flow_item item_to_match = { .type = RTE_FLOW_ITEM_TYPE_FLEX, .spec = &spec_flex, .mask = &mask_flex, }; Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-10-20 18:58:54 +02:00
Sunil Kumar Kori	58397fedc6	net/cnxk: support meter action to flow create Meters are configured per flow using rte_flow_create API. Implement support for meter action applied on the flow. Signed-off-by: Sunil Kumar Kori <skori@marvell.com> Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-19 16:25:31 +02:00
Sunil Kumar Kori	26b034f78c	net/cnxk: support to validate meter policy Implement API to validate meter policy for CNXK platform. Signed-off-by: Sunil Kumar Kori <skori@marvell.com> Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-10-19 16:24:59 +02:00
Alvin Zhang	1506c90029	net/i40e: fix IPv6 fragment RSS offload type in flow To keep flow format uniform with ice, this patch adds support for this RSS rule: flow create 0 ingress pattern eth / ipv6 / ipv6_frag_ext / end \ actions rss types ipv6-frag end queues end queues end / end Fixes: ef4c16fd9148 ("net/i40e: refactor RSS flow") Cc: stable@dpdk.org Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2021-10-19 13:06:42 +02:00
Michal Krawczyk	f93e20e516	net/ena: check missing Tx completions In some cases Tx descriptors may be uncompleted by the HW and as a result they will never be released. This patch adds checking for the missing Tx completions to the ENA timer service, so in order to use this feature, the application must call the function rte_timer_manage(). Missing Tx completion reset threshold is determined dynamically, by taking into consideration ring size and the default value. Tx cleanup is associated with the Tx burst function. As DPDK applications can call Tx burst function dynamically, time when last cleanup was called must be traced to avoid false detection of the missing Tx completion. Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Igor Chauskin <igorch@amazon.com> Reviewed-by: Shai Brandes <shaibran@amazon.com>	2021-10-19 15:04:17 +02:00
Michal Krawczyk	08180833cb	net/ena: add NUMA-aware allocations Only the IO rings memory was allocated with taking the socket ID into the respect, while the other structures was allocated using the regular rte_zmalloc() API. Ring specific structures are now being allocated using the ring's socket ID. Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Igor Chauskin <igorch@amazon.com> Reviewed-by: Shai Brandes <shaibran@amazon.com>	2021-10-19 15:04:17 +02:00
Michal Krawczyk	005064e505	net/ena: support Tx/Rx free thresholds The caller can pass Tx or Rx free threshold value to the configuration structure for each ring. It determines when the Tx/Rx function should start cleaning up/refilling the descriptors. ENA was ignoring this value and doing it's own calculations. Now the user can configure ENA's behavior using this parameter and if this variable won't be set, the ENA will continue with the old behavior and will use it's own threshold value. The default value is not provided by the ENA in the ena_infos_get(), as it's being determined dynamically, depending on the requested ring size. Note that NULL check for Tx conf was removed from the function ena_tx_queue_setup(), as at this place the configuration will be either provided by the user or the default config will be used and it's handled by the upper (rte_ethdev) layer. Tx threshold shouldn't be used for the Tx cleanup budget as it can be inadequate to the used burst. Now the PMD tries to release mbufs for the ring until it will be depleted. Signed-off-by: Michal Krawczyk <mk@semihalf.com> Reviewed-by: Igor Chauskin <igorch@amazon.com> Reviewed-by: Shai Brandes <shaibran@amazon.com>	2021-10-19 15:04:17 +02:00
Viacheslav Galaktionov	26706314d4	net/sfc: implement transfer proxy port callback In sfc, MAE admin serves as a transfer proxy. In order to track which ethdev is privileged, augment every independent switch port structure with information about its MAE privilege. Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>	2021-10-18 20:56:02 +02:00
Ferruh Yigit	b563c14212	ethdev: remove jumbo offload flag Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag. Instead of drivers announce this capability, application can deduct the capability by checking reported 'dev_info.max_mtu' or 'dev_info.max_rx_pktlen'. And instead of application setting this flag explicitly to enable jumbo frames, this can be deduced by driver by comparing requested 'mtu' to 'RTE_ETHER_MTU'. Removing this additional configuration for simplification. Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Huisong Li <lihuisong@huawei.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Acked-by: Michal Krawczyk <mk@semihalf.com>	2021-10-18 19:20:21 +02:00
Ferruh Yigit	1bb4a528c4	ethdev: fix max Rx packet length There is a confusion on setting max Rx packet length, this patch aims to clarify it. 'rte_eth_dev_configure()' API accepts max Rx packet size via 'uint32_t max_rx_pkt_len' field of the config struct 'struct rte_eth_conf'. Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result stored into '(struct rte_eth_dev)->data->mtu'. These two APIs are related but they work in a disconnected way, they store the set values in different variables which makes hard to figure out which one to use, also having two different method for a related functionality is confusing for the users. Other issues causing confusion is: * maximum transmission unit (MTU) is payload of the Ethernet frame. And 'max_rx_pkt_len' is the size of the Ethernet frame. Difference is Ethernet frame overhead, and this overhead may be different from device to device based on what device supports, like VLAN and QinQ. * 'max_rx_pkt_len' is only valid when application requested jumbo frame, which adds additional confusion and some APIs and PMDs already discards this documented behavior. * For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory field, this adds configuration complexity for application. As solution, both APIs gets MTU as parameter, and both saves the result in same variable '(struct rte_eth_dev)->data->mtu'. For this 'max_rx_pkt_len' updated as 'mtu', and it is always valid independent from jumbo frame. For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user request and it should be used only within configure function and result should be stored to '(struct rte_eth_dev)->data->mtu'. After that point both application and PMD uses MTU from this variable. When application doesn't provide an MTU during 'rte_eth_dev_configure()' default 'RTE_ETHER_MTU' value is used. Additional clarification done on scattered Rx configuration, in relation to MTU and Rx buffer size. MTU is used to configure the device for physical Rx/Tx size limitation, Rx buffer is where to store Rx packets, many PMDs use mbuf data buffer size as Rx buffer size. PMDs compare MTU against Rx buffer size to decide enabling scattered Rx or not. If scattered Rx is not supported by device, MTU bigger than Rx buffer size should fail. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Acked-by: Huisong Li <lihuisong@huawei.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2021-10-18 19:20:20 +02:00
William Tu	d2e5ab2b42	doc: fix emulated device names in e1000 guide The device name should be 82574L Gigabit Ethernet Controller. The patch also remove a redundant "*". Fixes: fc1f2750a3ec ("doc: programmers guide") Cc: stable@dpdk.org Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com>	2021-10-15 15:50:50 +02:00
Jie Wang	632be32735	ethdev: add API to get device configuration The driver may change offloads info into dev->data->dev_conf in dev_configure which may cause apps use outdated values. Add a new API to get actual device configuration. Signed-off-by: Jie Wang <jie1x.wang@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-10-15 13:27:05 +02:00
Gregory Etelson	63f2bbfa82	net: introduce IPv4 IHL and version fields RTE IPv4 header definition combines the `version' and `ihl' fields into a single structure member. This patch introduces dedicated structure members for both `version' and `ihl' IPv4 fields. Separated header fields definitions allow to create simplified code to match on the IHL value in a flow rule. The original `version_ihl' structure member is kept for backward compatibility. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-10-14 23:00:45 +02:00
Gregory Etelson	5dee2a0a6b	net: fix IPv4 change announce IPv4 header encodes fragment information into 16 bits field. 3 bits hold flags and remaining 13 bits are for fragment offset. 13 bits bit-field cannot be defined both for big and little endian systems. The patch removes IPv4 fragments union announce. Fixes: f7383e7c7ec1 ("net: announce changes in IPv4 header access") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-14 23:00:40 +02:00
Ivan Ilchenko	63b7265717	app/testpmd: add option to display extended statistics Add 'display-xstats' option for using in accompanying with Rx/Tx statistics (i.e. 'stats-period' option or 'show port stats' interactive command) to display specified list of extended statistics. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-10-14 14:40:51 +02:00
Viacheslav Ovsiienko	14fc81aed7	ethdev: update modify field flow action The generic modify field flow action introduced in [1] has some issues related to the immediate source operand: - immediate source can be presented either as an unsigned 64-bit integer or pointer to data pattern in memory. There was no explicit pointer field defined in the union. - the byte ordering for 64-bit integer was not specified. Many fields have shorter lengths and byte ordering is crucial. - how the bit offset is applied to the immediate source field was not defined and documented. - 64-bit integer size is not enough to provide IPv6 addresses. In order to cover the issues and exclude any ambiguities the following is done: - introduce the explicit pointer field in rte_flow_action_modify_data structure - replace the 64-bit unsigned integer with 16-byte array - update the modify field flow action documentation Appropriate deprecation notice has been removed. [1] commit 73b68f4c54a0 ("ethdev: introduce generic modify flow action") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-14 14:34:31 +02:00
Ivan Malov	1179f05cc9	ethdev: query proxy port to manage transfer flows Not all DPDK ports in a given switching domain may have the privilege to manage "transfer" flows. Add an API to find a port with sufficient privileges by any port in the domain. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2021-10-14 13:42:59 +02:00
Andrew Rybchenko	f55b61cec9	net/sfc: support port representor flow item Add support for item PORT_REPRESENTOR which should be used instead of ambiguous item PORT_ID. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Andrew Rybchenko	8d13351d4c	net/octeontx2: support port representor flow action Action PORT_ID implementation assumes ingress only. Its semantics suggests that support for equal action PORT_REPRESENTOR be added. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Andrew Rybchenko	d35dd287a2	net/mlx5: support represented port flow action Semantics of the existing support for action PORT_ID suggests that support for equal action REPRESENTED_PORT be implemented. Helper functions keep port_id suffix since action MLX5_FLOW_ACTION_PORT_ID is still used internally. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Andrew Rybchenko	54bd4ebe8b	net/enic: support meta flow actions to overrule destinations Add support for actions PORT_REPRESENTOR and REPRESENTED_PORT based on the existing support for action PORT_ID. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2021-10-13 22:59:26 +02:00
Andrew Rybchenko	640b44aa5c	net/bnxt: support meta flow actions to overrule destinations Add support for actions PORT_REPRESENTOR and REPRESENTED_PORT based on the existing support for action PORT_ID. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Andrew Rybchenko	a8321e0979	net/bnxt: support meta flow items to match on traffic source Add support for items PORT_REPRESENTOR and REPRESENTED_PORT based on the existing support for item PORT_ID. The use of item PORT_ID depends on the specified direction attribute. Items PORT_REPRESENTOR and REPRESENTED_PORT, in turn, define traffic direction themselves. The former matches traffic from the driver's vNIC. The latter matches packets from either a v-port (network) or a VF's vNIC (if the driver's port is a VF representor). Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Ivan Malov	9d2a349b38	ethdev: deprecate direction attributes in transfer flows Attributes "ingress" and "egress" can only apply unambiguosly to non-"transfer" flows. In "transfer" flows, the standpoint is effectively shifted to the embedded switch. There can be many different endpoints connected to the switch, so the use of "ingress" / "egress" does not shed light on which endpoints precisely can be considered as traffic sources. Add relevant deprecation notices and suggest the use of precise traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT). Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-13 22:59:26 +02:00
Ivan Malov	5da44faa80	ethdev: deprecate hard-to-use or ambiguous items and actions PF, VF and PHY_PORT require that applications have extra knowledge of the underlying NIC and thus are hard to use. Also, the corresponding items depend on the direction attribute (ingress / egress), which complicates their use in applications and interpretation in PMDs. The concept of PORT_ID is ambiguous as it doesn't say whether the port in question is an ethdev or the represented entity. Items and actions PORT_REPRESENTOR, REPRESENTED_PORT should be used instead. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Ivan Malov	88caad251c	ethdev: add represented port action to flow API For use in "transfer" flows. Supposed to send matching traffic to the entity represented by the given ethdev, at embedded switch level. Such an entity can be a network (via a network port), a guest machine (via a VF) or another ethdev in the same application. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Ivan Malov	8edb6bc026	ethdev: add port representor action to flow API For use in "transfer" flows. Supposed to send matching traffic to the given ethdev (to the application), at embedded switch level. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Ivan Malov	49863ae2bf	ethdev: add represented port item to flow API For use in "transfer" flows. Supposed to match traffic entering the embedded switch from the entity represented by the given ethdev. Such an entity can be a network (via a network port), a guest machine (via a VF) or another ethdev in the same application. Must not be combined with direction attributes. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:26 +02:00
Ivan Malov	081e42dab1	ethdev: add port representor item to flow API For use in "transfer" flows. Supposed to match traffic entering the embedded switch from the given ethdev. Must not be combined with direction attributes. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 22:59:25 +02:00
Konstantin Ananyev	f9bdee267a	ethdev: hide internal structures Move rte_eth_dev, rte_eth_dev_data, rte_eth_rxtx_callback and related data into private header (ethdev_driver.h). Few minor changes to keep DPDK building after that. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Feifei Wang <feifei.wang2@arm.com>	2021-10-13 22:14:59 +02:00
Konstantin Ananyev	27a300e6af	ethdev: add API to retrieve multiple MAC addresses Introduce rte_eth_macaddrs_get() to allow user to retrieve all ethernet addresses assigned to given port. Change testpmd to use this new function and avoid referencing directly rte_eth_devices[]. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Feifei Wang <feifei.wang2@arm.com>	2021-10-13 22:14:59 +02:00
Konstantin Ananyev	8d7d4fcdca	ethdev: change input parameters for Rx queue count Currently majority of fast-path ethdev ops take pointers to internal queue data structures as an input parameter. While eth_rx_queue_count() takes a pointer to rte_eth_dev and queue index. For future work to hide rte_eth_devices[] and friends it would be plausible to unify parameters list of all fast-path ethdev ops. This patch changes eth_rx_queue_count() to accept pointer to internal queue data as input parameter. While this change is transparent to user, it still counts as an ABI change, as eth_rx_queue_count_t is used by ethdev public inline function rte_eth_rx_queue_count(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Feifei Wang <feifei.wang2@arm.com>	2021-10-13 22:14:58 +02:00
Ivan Malov	012bf708c2	net/sfc: support group flows in tunnel offload GROUP is an in-house term for so-called "tunnel_match" flows. On parsing, they are detected by virtue of PMD-internal item MARK. It associates a given flow with its tunnel context. Such a flow is represented by a MAE action rule which is chained with the corresponding JUMP rule's outer rule by virtue of matching on its recirculation ID. GROUP flows do narrower match than JUMP flows do and decapsulate matching packets (full offload). Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 21:30:13 +02:00
Ivan Malov	93de39f50a	net/sfc: support jump flows in tunnel offload JUMP is an in-house term for so-called "tunnel_set" flows. On parsing, they are identified by virtue of actions MARK (PMD-internal) and JUMP. The action MARK associates a given flow with its tunnel context. Such a flow is represented by a MAE outer rule (OR) which has its recirculation ID set. This ID is also associated with the tunnel context. The OR is supposed to set this ID in 8 high bits of Rx mark in matching packets. It also counts the packets. Packets that hit the OR but miss in action rule (AR) table, should go to MAE admin PF (that is, to DPDK) by default. Support for the use of action COUNT in JUMP flows will be introduced by later patches. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-13 21:30:13 +02:00

1 2 3 4 5 ...

4888 Commits