Prefix static functions and variables with 'eth_dev'.
For some, the 'rte_' prefix is dropped; for others, 'eth_dev' is
added. This is useful to differentiate public and static
functions/variables.
The cleanup also gives consistent naming, which helps when naming
new additions.
No functional change, only naming.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Gaetan Rivet <grive@u256.net>
Use ENODEV as the error code if the specified port ID is invalid.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Queue stats are stored in 'struct rte_eth_stats' as arrays, and the
array size is defined by the 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile-time
flag.
As a result of a technical board discussion, it was decided to remove
the queue statistics from 'struct rte_eth_stats' in the long term.
Instead, PMDs should represent the queue statistics via xstats, which
gives more flexibility in the number of queues supported.
Currently, queue stats in the xstats are filled by the ethdev layer
using the basic stats; when queue stats are removed from the basic
stats, the responsibility to fill the relevant xstats will be pushed
to the PMDs.
For the transition period, a temporary
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' device flag is created. Initially,
all PMDs using xstats set this flag. PMDs that implement queue stats
in their xstats should clear the flag.
When all PMDs have switched to xstats for the queue stats, the queue
stats related fields will be removed from 'struct rte_eth_stats',
as well as the 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' flag.
Later, the 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile-time flag can also be
removed.
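A hedged sketch of the transition mechanics described above (the flag
name is from this commit; where a real PMD sets or clears it during
probe may differ):

    /* A PMD that still relies on the ethdev layer auto-filling the
     * per-queue xstats keeps the temporary flag set: */
    eth_dev->data->dev_flags |= RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;

    /* Once the PMD reports queue stats through its own xstats, it
     * clears the flag so ethdev stops duplicating them: */
    eth_dev->data->dev_flags &= ~RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS;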
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Change the eth_dev_stop_t return value from void to int.
Make eth_dev_stop_t implementations across all drivers return
negative errno values in case of error conditions.
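A hedged sketch of a driver stop callback under the new signature (the
hardware helper name is hypothetical, not from any specific driver):

    static int
    pmd_example_dev_stop(struct rte_eth_dev *dev)
    {
        /* example_hw_stop() is a hypothetical device helper */
        if (example_hw_stop(dev) != 0)
            return -EIO; /* negative errno on failure */
        return 0;
    }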
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Change rte_eth_dev_stop() return value from void to int
and return negative errno values in case of error conditions.
Also update the usage of the function in ethdev according to
the new return type.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The API function rte_eth_dev_close() was returning void.
The return type is changed to int for notifying of errors.
If an error happens during a close operation,
the status of the port is undefined,
as many resources as possible having been freed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The function rte_eth_dev_release_port() is partially resetting
the struct rte_eth_dev. The drivers were completing this reset
with more pointers set to NULL in the close or remove operations.
More pointers are reset at ethdev level,
and some redundant assignments are removed from PMDs.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
When closing a port, it is supposed to be already stopped,
and marked as such with the "dev_started" state zeroed by the stop
API. Resetting "dev_started" before calling the driver close operation
was hiding the case of a port being closed without having been
properly stopped.
The flag "dev_started" is not changed anymore in "rte_eth_dev_close()".
In case the "dev_stop" function is called from "dev_close",
bypassing the "rte_eth_dev_stop()" API,
the "dev_started" state must be explicitly reset in the PMD
in order to keep the same behaviour.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The DPDK datapath in the transmit direction is very flexible.
An application can build multi-segment packets and manage
almost all data aspects - the memory pools where segments
are allocated from, the segment lengths, the memory attributes
like external buffers registered for DMA, etc.
In the receive direction, the datapath is much less flexible:
an application can only specify the memory pool to configure the
receive queue and nothing more. In order to extend the receive
datapath capabilities, it is proposed to add a way to provide
extended information on how to split the packets being received.
The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device
capabilities is introduced to let a PMD report to the
application that it supports splitting received packets into
configurable segments. Prior to invoking the rte_eth_rx_queue_setup()
routine, the application should check the
RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag.
The following structure is introduced to specify the Rx packet
segments for the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:

    struct rte_eth_rxseg_split {
        struct rte_mempool *mp; /* memory pool to allocate segment from */
        uint16_t length;   /* segment maximal data length,
                              configures the "split point" */
        uint16_t offset;   /* data offset from the beginning
                              of the mbuf data buffer */
        uint32_t reserved; /* reserved field */
    };
The segment descriptions are added to the rte_eth_rxconf structure:
rx_seg - pointer to the array of segment descriptions; each element
         describes the memory pool, the maximal data length, and the
         initial data offset from the beginning of the data buffer in
         the mbuf. This array allows specifying different settings for
         each segment individually.
rx_nseg - number of elements in the array
If the extended segment descriptions are provided via these new
fields, the mp parameter of rte_eth_rx_queue_setup() must be
specified as NULL to avoid ambiguity.
There are two options to specify the Rx buffer configuration:
- mp is not NULL, rx_conf.rx_nseg is zero: the compatible
  configuration, which follows the existing implementation and
  provides a single pool and no description of segment sizes
  and offsets.
- mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not
  zero: the extended configuration, specified individually for
  each segment.
If the Rx queue is configured with the new settings, the packets being
received will be split into multiple segments pushed to mbufs with
the specified attributes. The PMD will split the received packets
into multiple segments according to the specification in the
description array.
For example, let's suppose we configured the Rx queue with the
following segments:
seg0 - pool0, len0=14B, off0=2
seg1 - pool1, len1=20B, off1=128B
seg2 - pool2, len2=20B, off2=0B
seg3 - pool3, len3=512B, off3=0B
A packet 46 bytes long will look like the following:
seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - 20B long @ 128 in mbuf from pool1
seg2 - 12B long @ 0 in mbuf from pool2
A packet 1500 bytes long will look like the following:
seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - 20B @ 128 in mbuf from pool1
seg2 - 20B @ 0 in mbuf from pool2
seg3 - 512B @ 0 in mbuf from pool3
seg4 - 512B @ 0 in mbuf from pool3
seg5 - 422B @ 0 in mbuf from pool3
The RTE_ETH_RX_OFFLOAD_SCATTER offload must be present and
configured to support the new buffer split feature (if rx_nseg
is greater than one).
The split limitations imposed by the underlying PMD are reported
in the newly introduced rte_eth_dev_info->rx_seg_capa field.
The new approach allows splitting the ingress packets into
multiple parts pushed to memory with different attributes.
For example, the packet headers can be pushed to the embedded
data buffers within mbufs and the application data into
external buffers attached to mbufs allocated from
different memory pools. The memory attributes of the split
parts may differ as well - for example, the application data
may be pushed into external memory located on a dedicated
physical device, say a GPU or NVMe. This improves the DPDK
receive datapath flexibility while preserving compatibility
with the existing API.
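A hedged sketch of configuring a two-segment split, assuming pools
hdr_pool and data_pool exist and the port reports the
RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT capability; the exact field layout
may differ by DPDK version:

    /* Headers land in small mbufs from hdr_pool, the rest in data_pool. */
    struct rte_eth_rxseg_split segs[2] = {
        { .mp = hdr_pool,  .length = 64, .offset = 0 },
        { .mp = data_pool, .length = 0,  .offset = 0 }, /* 0: pool default */
    };
    struct rte_eth_rxconf rxconf = dev_info.default_rxconf;

    rxconf.rx_seg = (union rte_eth_rxseg *)segs;
    rxconf.rx_nseg = RTE_DIM(segs);
    rxconf.offloads |= RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT |
                       RTE_ETH_RX_OFFLOAD_SCATTER;
    /* mp must be NULL when the extended description is used */
    ret = rte_eth_rx_queue_setup(port_id, 0, 512, rte_socket_id(),
                                 &rxconf, NULL);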
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Every hairpin queue pair should be configured properly, and the
connection between Tx and Rx queues should be established, before
the hairpin function works. In single port hairpin mode, the queues of
each pair belong to the same device. It is easy to get the hardware
and software information of each queue and configure the hairpin
connection with such information. In two ports hairpin mode, it is
not easy, or even inappropriate, to access one queue's information
from another device.
Since hairpin is configured per queue pair, three new APIs are
introduced, and they are internal, for PMD use only.
The peer update API helps to pass one queue's information to the
peer queue and get the peer's information back for the next step.
The peer bind API configures the current queue with the peer's
information. For each hairpin queue pair, this API may need to be
called twice to configure the Tx, Rx queues separately.
The peer unbind API resets the current queue configuration and state
to disconnect it from the peer queue. Also, it may need to be called
twice to disconnect Tx, Rx queues from each other.
Some parameters of the above APIs might not be mandatory; it
depends on the PMD implementation.
The structure of `rte_hairpin_peer_info` is only a declaration and
the actual members will be defined in each PMD when being used.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
After the hairpin queues are configured, in general, the application
will maintain the port topology and even the queue configuration for
the hairpin. But sometimes it will not.
If there is no hot-plug, it is easy to bind and unbind hairpin among
all the ports. The application can just connect or disconnect the
hairpin egress ports to/from all the probed ingress ports. Then all
the connections could be handled properly.
But with hot-plug / hot-unplug, a port can be probed and removed
dynamically. With two ports hairpin, all the connections from and to
this port should be handled after start (bind) or before stop (unbind).
It is necessary to know the hairpin topology involving this port.
This function will return the list of ports with the actual number of
peer ports after configuration. Either the peer Rx or Tx ports can be
retrieved with this function call.
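A hedged usage sketch; the function name below matches the new API
(rte_eth_hairpin_get_peer_ports()), and a non-zero direction is
assumed to mean the current port is the Tx side - verify against the
headers:

    uint16_t peers[RTE_MAX_ETHPORTS];
    int i, n;

    /* Get the Rx peer ports of this egress (Tx) port. */
    n = rte_eth_hairpin_get_peer_ports(port_id, peers,
                                       RTE_DIM(peers), 1);
    if (n < 0)
        printf("Failed to get peer ports: %s\n", rte_strerror(-n));
    for (i = 0; i < n; i++)
        printf("Peer Rx port: %u\n", peers[i]);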
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
To support two ports hairpin mode and keep backward compatibility
for the application, two new attribute members of the hairpin queue
configuration structure will be added.
`tx_explicit` determines whether the application itself will insert
the Tx part of the flow rules. If not set, the PMD will insert the
rules implicitly.
`manual_bind` determines whether the hairpin Tx queue and the peer Rx
queue will be bound automatically during the device start stage.
Different Tx and Rx queue pairs could have different values, but it
is highly recommended that all paired queues between one egress and
its peer ingress ports have the same values, in order not to bring
any chaos to the system. The actual support of these attribute
parameters will be checked and decided by the PMD drivers.
In single port hairpin mode, if both attributes are left at zero, the
behavior remains the same as before: no bind API needs to be called
and no Tx flow rules need to be inserted manually by the application.
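A hedged sketch of setting the new attributes when setting up a
hairpin Tx queue (attribute names are from this commit; the
surrounding setup call follows the existing
rte_eth_tx_hairpin_queue_setup() API):

    struct rte_eth_hairpin_conf conf = {
        .peer_count = 1,
        .tx_explicit = 1, /* application inserts the Tx flow rules */
        .manual_bind = 1, /* bind later via rte_eth_hairpin_bind() */
    };

    conf.peers[0].port = peer_port_id;
    conf.peers[0].queue = peer_queue_id;
    ret = rte_eth_tx_hairpin_queue_setup(port_id, queue_id,
                                         nb_desc, &conf);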
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
In single port hairpin mode, all the hairpin Tx and Rx queues belong
to the same device. After the queues are set up properly, there is
no other dependency between the Tx queue and its Rx peer queue. The
binding process that connects the Tx and Rx queues together at the
hardware level will be done automatically during the device start
procedure. Everything required is already configured and initialized
for the binding process.
But in two ports hairpin mode, there are cross-dependencies
between two different ports. Usually, the ports are initialized
serially by the main thread, not in parallel. The earlier port
will not be able to enable the bind if the peer port is
not yet configured with HW resources. What's more, if one port is
detached / attached dynamically, it introduces more trouble
for the hairpin binding.
To overcome these issues, new APIs for binding and unbinding are
added. During startup, only the hairpin Tx and Rx peer queues will be
set up. Nothing will be done when starting the device if the queues
are without the auto-bind attribute. Only after the required pair of
ports are started can the `rte_eth_hairpin_bind()` API be called to
bind all Tx queues of the egress port to the Rx queues of the peer
port. Then the connection between the egress and ingress ports will
be established.
The `rte_eth_hairpin_unbind()` API can be used to disconnect the
egress port from the peer ingress port. If needed, this should only
be called before the device is closed. When doing the cleanup, all
the egress and ingress pairs related to a single port should be taken
into consideration, especially in the hot unplug case.
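A hedged sketch of the manual bind/unbind flow for a two-port pair
(the Tx-port-first argument order is an assumption to verify against
the headers):

    /* Both ports already started with manual-bind hairpin queues. */
    ret = rte_eth_hairpin_bind(tx_port_id, rx_port_id);
    if (ret != 0)
        printf("hairpin bind %u->%u failed: %s\n",
               tx_port_id, rx_port_id, rte_strerror(-ret));

    /* ... traffic runs ... */

    /* Disconnect before closing either port. */
    ret = rte_eth_hairpin_unbind(tx_port_id, rx_port_id);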
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Callback functions are registered on the control plane. They
are accessed from the data plane. Hence, correct memory orderings
should be used to avoid race conditions.
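A generic hedged sketch of the pattern with illustrative names (not
the actual ethdev fields): the control plane publishes a callback
with a release store, so a data-plane acquire load never observes a
partially initialized node:

    struct cb_node {
        void (*fn)(void *arg);
        void *arg;
        struct cb_node *next;
    };
    static struct cb_node *head;

    /* Control plane: fill the node completely, then publish it. */
    node->fn = fn;
    node->arg = arg;
    node->next = head;
    __atomic_store_n(&head, node, __ATOMIC_RELEASE);

    /* Data plane: acquire load pairs with the release store above. */
    struct cb_node *cb = __atomic_load_n(&head, __ATOMIC_ACQUIRE);
    for (; cb != NULL; cb = cb->next)
        cb->fn(cb->arg);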
Fixes: 4dc294158cac ("ethdev: support optional Rx and Tx callbacks")
Fixes: c8231c63ddcb ("ethdev: insert Rx callback as head of list")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
While registering the callback functions, the full write barrier
can be replaced with a one-way write barrier.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds a check on whether the related Tx or Rx queue has
been set up in the queue-related API functions, to avoid illegal
address access.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch extracts the checking of rx_queue_id or tx_queue_id into
two separate common functions named eth_dev_validate_rx_queue and
eth_dev_validate_tx_queue.
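A hedged sketch of what such a helper can look like (the actual
implementation in ethdev may differ in logging details):

    static int
    eth_dev_validate_rx_queue(const struct rte_eth_dev *dev,
                              uint16_t rx_queue_id)
    {
        if (rx_queue_id >= dev->data->nb_rx_queues) {
            RTE_ETHDEV_LOG(ERR, "Invalid Rx queue_id=%u\n",
                           rx_queue_id);
            return -EINVAL;
        }

        if (dev->data->rx_queues[rx_queue_id] == NULL) {
            RTE_ETHDEV_LOG(ERR,
                           "Rx queue %u has not been set up\n",
                           rx_queue_id);
            return -EINVAL;
        }

        return 0;
    }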
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
As described in doc/guides/prog_guide/poll_mode_drv.rst,
the naming scheme for the xstats is parts separated with underscore:
* direction
* detail 1
* detail 2
* detail n
* unit
where detail 1 can be "q" followed by a queue number.
It means the name of the stats per queue should be rx_qN_* or tx_qN_*.
The second underscore was missing so far.
Fixing the basic xstat names may be considered an API change,
that's why it should not be backported.
While fixing this mistake, some examples of the naming scheme
are given as part of the API documentation of rte_eth_xstat_name.
More proposals about standardizing statistics:
http://fast.dpdk.org/events/slides/DPDK-2019-09-Ethernet_Statistics.pdf
Fixes: bd6aa172cf35 ("ethdev: fetch extended statistics with integer ids")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Currently, the rte_flow functions are not defined as thread-safe.
DPDK applications either call the functions in a single thread or
protect any concurrent calls to the rte_flow operations with
a lock.
For PMDs that support thread-safe flow operations natively, the
redundant protection in the application hurts the performance of the
rte_flow operation functions.
The fact that thread safety is not guaranteed for the rte_flow
functions also limits what applications can expect from them.
This feature changes the rte_flow functions to be thread-safe.
As different PMDs have different flow operations, some may support
thread safety already and others may not. For PMDs that don't
support thread-safe flow operations, a new lock is defined in ethdev
in order to protect thread-unsafe PMDs at the rte_flow level.
A new RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE device flag is added to
indicate whether the PMD supports thread-safe flow operations.
PMDs that support thread-safe flow operations set the
RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE flag, and the rte_flow level
functions skip the thread-safe helper lock for these PMDs. Again, the
rte_flow level thread-safe lock only takes effect when the PMD
operation functions are not thread-safe.
For PMDs which don't want the default mutex lock, just set the
flag in the PMD and add the preferred type of lock in the PMD. Then
the default mutex lock is easily replaced by the PMD-level lock.
The change has no effect on current DPDK applications, and no change
is required from them. For the standard POSIX pthread_mutex, if there
is no lock contention on the added rte_flow level mutex, the mutex
only does an atomic increment in pthread_mutex_lock() and a decrement
in pthread_mutex_unlock(). No futex() syscall will be involved.
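A hedged sketch of how the helper lock can wrap a flow operation for
PMDs without the flag (the lock field name and the enter/exit helpers
below are illustrative, not the actual ethdev internals):

    static void
    flow_ops_lock(struct rte_eth_dev *dev)
    {
        if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
            pthread_mutex_lock(&dev->data->flow_ops_mutex);
    }

    static void
    flow_ops_unlock(struct rte_eth_dev *dev)
    {
        if (!(dev->data->dev_flags & RTE_ETH_DEV_FLOW_OPS_THREAD_SAFE))
            pthread_mutex_unlock(&dev->data->flow_ops_mutex);
    }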
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
This patch adds Forward Error Correction (FEC) support to ethdev.
Introduce APIs to query and configure FEC information in
hardware.
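A hedged usage sketch; the function and macro names follow the FEC
API added here, but verify them against the installed headers:

    uint32_t fec_capa;

    /* Query the currently configured FEC mode(s). */
    ret = rte_eth_fec_get(port_id, &fec_capa);
    if (ret == 0 && (fec_capa & RTE_ETH_FEC_MODE_CAPA_MASK(RS)))
        printf("RS FEC is enabled on port %u\n", port_id);

    /* Request auto-negotiated FEC. */
    ret = rte_eth_fec_set(port_id, RTE_ETH_FEC_MODE_CAPA_MASK(AUTO));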
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Chengchang Tang <tangchengchang@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
A crash is detected when the '--txpkts=#' parameter is provided to
testpmd; this is because queue information is requested before queues
have been allocated.
Adding a check to the queue info APIs
('rte_eth_rx_queue_info_get()' & 'rte_eth_tx_queue_info_get()')
to protect against similar cases.
Fixes: ba2fb4f022fc ("ethdev: check if queue setup when getting queue info")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The mbuf library now has a routine to free multiple buffers.
The loop is no longer needed.
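The bulk-free routine referred to is rte_pktmbuf_free_bulk(); a
minimal before/after sketch:

    /* Before: free each mbuf in a loop. */
    for (i = 0; i < nb_pkts; i++)
        rte_pktmbuf_free(pkts[i]);

    /* After: one call frees the whole array. */
    rte_pktmbuf_free_bulk(pkts, nb_pkts);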
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Telemetry only passed the first param to the command handler if multiple
were entered by the user, separated by commas. Telemetry is required to
pass the full params string to the command, by splitting by a comma
delimiter only once to remove the command part of the string. This will
enable future commands to take multiple param values.
Fixes: b1ad0e124536 ("rawdev: add telemetry callbacks")
Fixes: c190daedb9b1 ("ethdev: add telemetry callbacks")
Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.
The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
- trigger event callback
- reset state and few pointers
- free all generic port resources
The private port resources must be released in the .dev_close callback.
The .remove callback should:
- call .dev_close callback
- call rte_eth_dev_release_port()
- free multi-port device shared resources
Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only remaining solution is to declare these drivers as
unmaintained: ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.
* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.
* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.
* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.
* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.
* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is up to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.
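A hedged sketch of the expected split between .dev_close and .remove
(the pmd_example_* names and helpers are hypothetical; the required
calls are those listed in the notes above):

    static int
    pmd_example_dev_close(struct rte_eth_dev *dev)
    {
        /* Free private port resources here. */
        example_free_port_resources(dev); /* hypothetical helper */
        return 0;
    }

    static int
    pmd_example_remove(struct rte_device *rte_dev)
    {
        struct rte_eth_dev *dev;

        dev = example_lookup_eth_dev(rte_dev); /* hypothetical */
        pmd_example_dev_close(dev);
        /* Still needed here: the generic resources are freed by this
         * call, since .dev_close was invoked directly from .remove. */
        rte_eth_dev_release_port(dev);
        /* Free multi-port device shared resources here, if any. */
        return 0;
    }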
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The pointers .device and .intr_handle were already reset by the helper
rte_eth_dev_pci_generic_remove().
It is now made part of rte_eth_dev_release_port().
It makes rte_eth_dev_pci_release() meaningless,
so it is replaced with a call to rte_eth_dev_release_port().
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
A new link_speed value is introduced: INT_MAX, which means
that the speed is unknown. To simplify processing of the value in
applications, a new function is added which converts link_speed to a
string. Also, the DPDK examples have much duplicated code which
formats the entire link status structure as text.
This commit adds two functions:
* rte_eth_link_speed_to_str - format link_speed as a string
* rte_eth_link_to_str - convert the link status structure to a string
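A minimal usage sketch of the two helpers (the buffer-size macro is
an assumption from the same series):

    struct rte_eth_link link;
    char text[RTE_ETH_LINK_MAX_STR_LEN]; /* assumed size macro */

    ret = rte_eth_link_get_nowait(port_id, &link);
    if (ret == 0) {
        rte_eth_link_to_str(text, sizeof(text), &link);
        printf("Port %u: %s (speed: %s)\n", port_id, text,
               rte_eth_link_speed_to_str(link.link_speed));
    }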
Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The '_rte_eth_dev_callback_process()' & '_rte_eth_dev_reset()'
internal APIs have an unconventional underscore ('_') prefix.
Although this is not documented, most probably it is there to mark
them as internal. Since we have the '__rte_internal' flag to mark
this, the '_' is removed from the API names.
For '_rte_eth_dev_reset()', there is already a public API named
'rte_eth_dev_reset()', so '_rte_eth_dev_reset()' is renamed to
'rte_eth_dev_internal_reset()'.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
This patch adds a check on whether the related Tx or Rx queue has
been set up in the rte_eth_rx_queue_info_get and
rte_eth_tx_queue_info_get API functions, to avoid illegal address
access.
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some networks have a requirement for precise traffic timing
management. The ability to send (and, generally speaking, receive)
packets at a very precisely specified moment of time provides the
opportunity to support connections with Time Division Multiplexing
using contemporary general purpose NICs without involving auxiliary
hardware. For example, supporting the O-RAN Fronthaul interface is
one of the promising features for potential usage of precise time
management for egress packets.
The main objective of this patchset is to specify the way
applications can provide the moment of time at which the packet
transmission must be started, and to give a preliminary description
of the support for this feature on the mlx5 PMD side [1].
A new dynamic timestamp field is proposed. It provides some timing
information; the units and time references (initial phase) are not
explicitly defined but are always maintained the same for a given
port. Some devices allow querying rte_eth_read_clock(), which will
return the current device timestamp. The dynamic timestamp flag tells
whether the field contains an actual timestamp value. For the packets
being sent, this value can be used by the PMD to schedule packet
sending.
The device clock is an opaque entity; the units and frequency are
vendor specific and might depend on hardware capabilities and
configurations. It might (or might not) be synchronized with real
time via PTP, and might (or might not) be synchronous with the CPU
clock (for example, if the NIC and CPU share the same clock source,
there might be no drift between the NIC and CPU clocks), etc.
After the supposed deprecation and obsoleting of the PKT_RX_TIMESTAMP
flag and the fixed timestamp field, this dynamic flag and field might
be used to manage the timestamps on the receive datapath as well.
Having dedicated flags for Rx/Tx timestamps allows applications not
to perform explicit flag resets on forwarding and not to promote
received timestamps to the transmit datapath by default.
The static PKT_RX_TIMESTAMP is considered a candidate to become the
dynamic flag, and this move should be discussed.
When the PMD sees "rte_dynfield_timestamp" set on the packet being
sent, it tries to synchronize the time of the packet appearing on the
wire with the specified packet timestamp. If the specified one is in
the past, it should be ignored; if it is in the distant future, it
should be capped to some reasonable value (in the range of seconds).
These specific cases ("too late" and "distant future") can be
optionally reported via device xstats to assist applications in
detecting time-related problems.
No packet reordering according to the timestamps is supposed,
neither within a packet burst nor between packets; it is entirely the
application's responsibility to generate packets and their timestamps
in the desired order. The timestamp can be put only in the first
packet of a burst, providing scheduling for the entire burst.
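A hedged sketch of how an application could look up and use the
dynamic field; the "rte_dynfield_timestamp" name is from this
patchset, while the dynamic flag name below is an assumption:

    #include <rte_mbuf_dyn.h>

    static int ts_offset = -1; /* dynamic field offset in rte_mbuf */
    static uint64_t ts_flag;   /* "send on timestamp" flag mask */

    /* Once, after the PMD has registered the field and flag: */
    ts_offset = rte_mbuf_dynfield_lookup("rte_dynfield_timestamp", NULL);
    int bit = rte_mbuf_dynflag_lookup("rte_dynflag_tx_timestamp", NULL);
    if (ts_offset >= 0 && bit >= 0)
        ts_flag = UINT64_C(1) << bit;

    /* Per burst: stamp only the first packet. */
    if (ts_flag != 0) {
        *RTE_MBUF_DYNFIELD(pkts[0], ts_offset, uint64_t *) = when;
        pkts[0]->ol_flags |= ts_flag;
    }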
The PMD reports the ability to synchronize packet sending on the
timestamp with a new offload flag.
This is palliative and might be replaced with a new eth_dev API
for reporting/managing the supported dynamic flags and their related
features. This API would break ABI compatibility and can't be
introduced at the moment, so it is postponed to 20.11.
For testing purposes it is proposed to update the testpmd "txonly"
forwarding mode routine. With this update, the testpmd application
generates the packets and sets the dynamic timestamps according to a
specified time pattern if it sees that "rte_dynfield_timestamp" is
registered.
A new testpmd command is proposed to configure the sending pattern:
set tx_times <burst_gap>,<intra_gap>
<intra_gap> - the delay between the packets within the burst
              specified in device clock units. The number of
              packets in the burst is defined by the txburst parameter
<burst_gap> - the delay between the bursts in device clock units
As a result, the bursts of packets will be transmitted with specific
delays between the packets within the burst and a specific delay
between the bursts. rte_eth_read_clock() is supposed to be used to
get the current device clock value and provide the reference for the
timestamps.
[1] http://patches.dpdk.org/patch/73714/
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, there is a potential problem when calling the API function
rte_eth_dev_set_vlan_offload to enable VLAN hardware offloads which
the driver does not support. If the PMD driver does not support
certain VLAN hardware offloads and does not check for them, the
hardware setting will not change, but the VLAN offloads in
dev->data->dev_conf.rxmode.offloads will be turned on.
It is supposed to check the hardware capabilities to decide whether
the relative callback needs to be called, just like the behavior in
the API function named rte_eth_dev_configure. And it is also needed
to clean up duplicated checks which are done in some PMDs. Also, note
that this is a behaviour change for some PMDs which used to simply
ignore (with an error/warning log message) unsupported VLAN offloads,
but will now fail.
Fixes: a4996bd89c42 ("ethdev: new Rx/Tx offloads API")
Fixes: 0ebce6129bc6 ("net/dpaa2: support new ethdev offload APIs")
Fixes: f9416bbafd98 ("net/enic: remove VLAN filter handler")
Fixes: 4f7d9e383e5c ("fm10k: update vlan offload features")
Fixes: fdba3bf15c7b ("net/hinic: add VLAN filter and offload")
Fixes: b96fb2f0d22b ("net/i40e: handle QinQ strip")
Fixes: d4a27a3b092a ("nfp: add basic features")
Fixes: 56139e85abec ("net/octeontx: support VLAN filter offload")
Fixes: ba1b3b081edf ("net/octeontx2: support VLAN offloads")
Fixes: d87246a43759 ("net/qede: enable and disable VLAN filtering")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
In the rte_eth_rx_queue_setup API function, the local variable named
mbp_buf_size, which is the data room size of the input parameter mp,
is checked to guarantee that each memory chunk used for the net
device in the mbuf is bigger than min_rx_bufsize. But if mbp_buf_size
is less than RTE_PKTMBUF_HEADROOM, the value of the following
statement will be a large number, since mbp_buf_size is an unsigned
value:
mbp_buf_size - RTE_PKTMBUF_HEADROOM
As a result, it will cause a segmentation fault in this situation.
This patch fixes it by modifying the check condition to guarantee
that the local variable mbp_buf_size is bigger than
RTE_PKTMBUF_HEADROOM.
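A hedged sketch of the corrected check (variable names from the
commit text; the exact error handling in rte_eth_rx_queue_setup may
differ):

    uint32_t mbp_buf_size = rte_pktmbuf_data_room_size(mp);

    /* Require the headroom to fit first, so the later subtraction
     * mbp_buf_size - RTE_PKTMBUF_HEADROOM cannot wrap around. */
    if (mbp_buf_size < dev_info.min_rx_bufsize + RTE_PKTMBUF_HEADROOM) {
        RTE_ETHDEV_LOG(ERR, "%s mbuf_data_room_size %u < %u\n",
                       mp->name, mbp_buf_size,
                       dev_info.min_rx_bufsize +
                       RTE_PKTMBUF_HEADROOM);
        return -EINVAL;
    }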
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Function 'rte_eth_dma_zone_reserve()' returns an existing memzone
based on a name match, but the other requested attributes are
discarded. This may cause a driver to use a memzone with wrong size
or alignment.
Verify size, alignment and socket_id for the matched memzone, and do
not use the memzone if any one of the attributes is not satisfied.
It is possible to free the existing memzone and allocate it again
with the requested attributes, but it is better that the caller does
the explicit free.
Reported-by: Renata Saiakhova <renata.saiakhova@ekinops.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Some log macros were using the 'EAL' logtype; convert them to
'ethdev'. Also fix missing EOL and fix syntax for some logs.
Fixes: 214ed1acd125 ("ethdev: add iterator to match devargs input")
Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.
It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.
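A minimal before/after sketch of the macro's effect (the component
name is hypothetical):

    /* Before: boilerplate repeated in each component. */
    int example_logtype;
    RTE_INIT(example_init_log)
    {
        example_logtype = rte_log_register("pmd.net.example");
        if (example_logtype >= 0)
            rte_log_set_level(example_logtype, RTE_LOG_NOTICE);
    }

    /* After: one line declares, registers and sets the level. */
    RTE_LOG_REGISTER(example_logtype, pmd.net.example, NOTICE);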
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
There is no way to report back a link speed of 200 Gbps.
Adding the 200G link speed.
Signed-off-by: Asaf Penso <asafp@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch moves telemetry further down the build, and adds it as a
dependency for EAL. Telemetry V2 is now configured to build by default,
and the legacy support is built when the telemetry config flag is set.
Telemetry now has EAL flags, shown below:
"--telemetry" = Enables telemetry (this is default if no flags given)
"--no-telemetry" = Disables telemetry
When telemetry is enabled, it will attempt to open the new socket
version, and also the legacy support socket (this will depend on Jansson
external dependency and telemetry config flag, as before).
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
The ethdev library now registers commands with telemetry, and
implements the callback functions. These commands allow the list of
ethdev ports and the xstats and link status for a port to be queried.
An example using ethdev commands is shown below:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.05.0-rc0", "pid": 64379, "max_output_len": 16384}
--> /
{"/": ["/", "/ethdev/link_status", "/ethdev/list", "/ethdev/xstats", \
"/help", "/info"]}
--> /ethdev/list
{"/ethdev/list": [0, 1, 2, 3]}
--> /ethdev/link_status,0
{"/ethdev/link_status": {"status": "UP", "speed": 10000, "duplex": \
"full-duplex"}}
--> /ethdev/xstats,0
{"/ethdev/xstats": {"rx_good_packets": 0, "tx_good_packets": 0, \
<snip>
"tx_priority7_xon_to_xoff_packets": 0}}
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
Add tracepoints at important and mandatory APIs for tracing support.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Minor spelling errors found by aspell and codespell
Fixes: 1daa33805824 ("ethdev: validate offloads set by PMD")
Fixes: 81f9db8ecc2c ("ethdev: add vlan offload support")
Fixes: c8231c63ddcb ("ethdev: insert Rx callback as head of list")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, there is a potential problem: the content of
dev->data->dev_conf.rxmode.offloads is changed even when there is no
vlan_offload_set driver callback.
It is better to prevent this side effect and make the API return
success if no change is requested. This patch fixes the problem; the
details are as below:
- keep possibility to do dummy set even if there is no driver callback
- do not touch Rx mode offloads in device data before checking the
driver callback availability
- ensure that Rx mode offloads are rolled back correctly if driver
callback returns error
Fixes: 81f9db8ecc2c ("ethdev: add vlan offload support")
Cc: stable@dpdk.org
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chunsong Feng <fengchunsong@huawei.com>
Signed-off-by: Min Wang (Jushui) <wangmin3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The maximum number of unique switching domains is supposed
to be equal to RTE_MAX_ETHPORTS. The current implementation allows
allocating only RTE_MAX_ETHPORTS-1 domains.
The definition of RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID is
changed from 0 to UINT16_MAX, and rte_eth_dev_info_get is
updated to initialize the dev_info structure accordingly.
Fixes: ce9250406323 ("ethdev: add switch domain allocator")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The function was checking -1 against the callback data instead of
the given cb_arg parameter.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Ricardo Roldan <rroldan@bequant.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
A buffer overflow happens in testpmd with some drivers
since the queue arrays are limited to RTE_MAX_QUEUES_PER_PORT.
The advertised capabilities of mlx4, mlx5 and softnic
for the number of queues were the maximum number: UINT16_MAX.
They must be limited by the configured RTE_MAX_QUEUES_PER_PORT
that applications expect to be respected.
The limitation is applied at ethdev level (function rte_eth_dev_info_get),
in order to force the configured limit for all drivers.
Fixes: 14b53e27b30e ("ethdev: fix crash with multiprocess")
Cc: stable@dpdk.org
Reported-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
memcpy() source and destination areas must not overlap, and equal
pointers is a case which is really met, so handle it.
Fixes: 68b931bff287 ("ethdev: eliminate interim variable")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Avoid usage of "failed" in the message about a not requested but
enabled offload, since it is not a failure.
Fixes: 1daa33805824 ("ethdev: validate offloads set by PMD")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Right now a PMD decides if it is critical that an offload cannot
be disabled (i.e. not requested, but still enabled). If the PMD
treats it as OK, we should not spam logs with corresponding messages
by default. The default log level in ethdev is INFO, so change the
message level to DEBUG.
Fixes: 1daa33805824 ("ethdev: validate offloads set by PMD")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>