The queue-based flow rules management mechanism is suitable
not only for flow rule creation/destruction, but also
for speeding up other types of Flow API management.
Indirect action object operations may be executed
asynchronously as well. Provide async versions for all
indirect action operations, namely:
rte_flow_async_action_handle_create,
rte_flow_async_action_handle_destroy and
rte_flow_async_action_handle_update.
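A minimal usage sketch, assuming the async Flow API helper types from this
series (struct rte_flow_op_attr, struct rte_flow_op_result,
struct rte_flow_indir_action_conf) and a counter indirect action; variables
such as port_id and queue_id are illustrative:

    const struct rte_flow_op_attr op_attr = { .postpone = 0 };
    const struct rte_flow_indir_action_conf conf = { .ingress = 1 };
    const struct rte_flow_action action = {
        .type = RTE_FLOW_ACTION_TYPE_COUNT,
    };
    struct rte_flow_op_result res[8];
    struct rte_flow_error error;
    struct rte_flow_action_handle *handle;

    /* enqueue the creation; the call returns before the HW is updated */
    handle = rte_flow_async_action_handle_create(port_id, queue_id, &op_attr,
            &conf, &action, NULL /* user_data */, &error);

    /* completions are delivered through the same queue */
    while (rte_flow_pull(port_id, queue_id, res, 8, &error) == 0)
        ;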
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
A new, faster, queue-based flow rules management mechanism is needed for
applications offloading rules inside the datapath. This asynchronous
and lockless mechanism frees the CPU for further packet processing and
reduces the performance impact of the flow rules creation/destruction
on the datapath. Note that queues are not thread-safe: all operations
on a given queue should be issued from the same thread.
It is the application's responsibility to synchronize the queue functions
in case of multi-threaded access to the same queue.
The rte_flow_async_create() function enqueues a flow creation to the
requested queue. It benefits from already configured resources and sets
unique values on top of item and action templates. A flow rule is enqueued
on the specified flow queue and offloaded asynchronously to the hardware.
The function returns immediately to spare the CPU for further packet
processing. The application must invoke the rte_flow_pull() function
to complete the flow rule operation offloading, to clear the queue, and to
receive the operation status. The rte_flow_async_destroy() function
enqueues a flow destruction to the requested queue.
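A rough usage sketch, assuming a port already configured with one flow queue
and a template table prepared in advance; struct rte_flow_op_attr,
struct rte_flow_op_result (with status and user_data fields) and the
RTE_FLOW_OP_SUCCESS status value are the helper types added by this series,
and variables such as table, pattern and actions are illustrative:

    const struct rte_flow_op_attr op_attr = { .postpone = 0 };
    struct rte_flow_op_result results[32];
    struct rte_flow_error error;
    struct rte_flow *flow;

    /* enqueue a rule creation; unique item/action values are given here */
    flow = rte_flow_async_create(port_id, queue_id, &op_attr,
            table, pattern, pattern_template_index,
            actions, actions_template_index,
            NULL /* user_data */, &error);

    /* ... keep processing packets ... */

    /* drain completions and check each operation status */
    int done = rte_flow_pull(port_id, queue_id, results, 32, &error);
    for (int i = 0; i < done; i++) {
        if (results[i].status != RTE_FLOW_OP_SUCCESS) {
            /* handle the failed operation, identified by user_data */
        }
    }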
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Treating every single flow rule as a completely independent and separate
entity negatively impacts the flow rules insertion rate. Oftentimes in an
application, many flow rules share a common structure (the same item mask
and/or action list) so they can be grouped and classified together.
This knowledge may be used as a source of optimization by a PMD/HW.
The pattern template defines common matching fields (the item mask) without
values. The actions template holds a list of action types that will be used
together in the same rule. The specific values for items and actions will
be given only during the rule creation.
A table combines pattern and actions templates along with shared flow rule
attributes (group ID, priority and traffic direction). This way a PMD/HW
can prepare all the resources needed for efficient flow rules creation in
the datapath. To avoid any hiccups due to memory reallocation, the maximum
number of flow rules is defined at the table creation time.
The flow rule creation is done by selecting a table, a pattern template
and an actions template (which are bound to the table), and setting unique
values for the items and actions.
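A condensed setup sketch, assuming the attribute structures added by this
series (struct rte_flow_pattern_template_attr,
struct rte_flow_actions_template_attr, struct rte_flow_template_table_attr
with flow_attr and nb_flows fields); port_id is illustrative:

    struct rte_flow_error err;
    struct rte_flow_item_ipv4 ipv4_mask = { .hdr.src_addr = RTE_BE32(0xffffffff) };

    /* pattern template: the item mask only, values come at rule creation */
    const struct rte_flow_pattern_template_attr pt_attr = { .ingress = 1 };
    const struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_IPV4, .mask = &ipv4_mask },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };
    struct rte_flow_pattern_template *pt =
        rte_flow_pattern_template_create(port_id, &pt_attr, pattern, &err);

    /* actions template: the list of action types used together in a rule */
    const struct rte_flow_actions_template_attr at_attr = { .ingress = 1 };
    const struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_QUEUE },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    struct rte_flow_actions_template *at =
        rte_flow_actions_template_create(port_id, &at_attr,
                actions, actions /* masks */, &err);

    /* table: shared rule attributes plus the maximum number of rules */
    const struct rte_flow_template_table_attr tbl_attr = {
        .flow_attr = { .group = 1, .ingress = 1 },
        .nb_flows = 1 << 20,
    };
    struct rte_flow_template_table *tbl =
        rte_flow_template_table_create(port_id, &tbl_attr,
                &pt, 1, &at, 1, &err);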
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Flow rule creation/destruction at a large scale incurs a performance
penalty and may negatively impact the packet processing when used
as part of the datapath logic. This is mainly because software/hardware
resources are allocated and prepared during the flow rule creation.
In order to optimize the insertion rate, a PMD may use hints provided
by the application at the initialization phase. The rte_flow_configure()
function allows pre-allocating all the needed resources beforehand.
These resources can be used at a later stage without costly allocations.
Every PMD may use only a subset of the hints, ignore unused ones, or
fail in case the requested configuration is not supported.
The rte_flow_info_get() function is available to retrieve information about
supported pre-configurable resources. Both these functions must be called
before any other usage of the flow API engine.
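An illustrative call sequence; the hint fields shown (e.g. nb_counters in
struct rte_flow_port_attr, size in struct rte_flow_queue_attr) follow this
series and may vary between releases and PMDs:

    struct rte_flow_port_info port_info;
    struct rte_flow_queue_info queue_info;
    struct rte_flow_error error;

    /* discover what the PMD can pre-allocate */
    rte_flow_info_get(port_id, &port_info, &queue_info, &error);

    /* pre-allocate resources before any other Flow API usage */
    const struct rte_flow_port_attr port_attr = {
        .nb_counters = 1 << 16,        /* hint: expected number of counters */
    };
    const struct rte_flow_queue_attr queue_attr = { .size = 1024 };
    const struct rte_flow_queue_attr *queue_attrs[] = { &queue_attr };

    rte_flow_configure(port_id, &port_attr, 1, queue_attrs, &error);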
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
rte_gpu_mem_cpu_map() exposes a GPU memory area to the CPU.
In the gpudev communication list this is useful to store the
status flag.
A communication list status flag allocated in GPU memory
and mapped for CPU visibility can be updated by the CPU and polled
by a GPU workload.
The polling operation is more frequent than the CPU update operation.
Having the status flag in GPU memory reduces the GPU workload polling
latency.
If the CPU mapping feature is not enabled, the status flag resides in
CPU memory registered so that it is visible from the GPU.
To facilitate the interaction with the status flag, this patch
also provides set/get functions for it.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
A user data field is added to the asymmetric session structure.
Relevant APIs are added to get/set the field.
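A small usage sketch; the accessor names below mirror the symmetric session
API and should be checked against rte_cryptodev.h:

    struct app_ctx { uint64_t request_id; };    /* hypothetical per-session data */
    struct app_ctx ctx = { .request_id = 42 };

    /* the user data size is reserved when the session mempool is created */
    rte_cryptodev_asym_session_set_user_data(sess, &ctx, sizeof(ctx));

    /* later, e.g. on operation completion */
    struct app_ctx *p = rte_cryptodev_asym_session_get_user_data(sess);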
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The rte_cryptodev_asym_session structure is now moved to an internal
header. It will no longer be used directly by applications;
private session data can be accessed via a get API.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Rather than using a session buffer that contains pointers to private
session data elsewhere, have a single session buffer.
This session is created for a driver ID, and the mempool element
contains space for the max session private data needed for any driver.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The programmer's guide for cryptodev included sample code for using
asymmetric crypto. This is now replaced with direct code from the test
application, using literal includes. It is broken into snippets as the
test application didn't have all of the required code in one function.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Add flow pattern items and header format for matching the optional fields
(checksum/key/sequence) in the GRE header. The flags in the GRE item should
be set correspondingly with the newly added items.
Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Since dmadev was introduced in 21.11, this patch integrates dmadev into
the asynchronous data path to avoid the overhead of the vhost DMA
abstraction layer and to simplify application logic.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Enable exposing a GPU memory area and making it
accessible from the CPU.
GPU memory has to be allocated via rte_gpu_mem_alloc().
This patch allows the gpudev library to map (and unmap),
through the GPU driver, a chunk of GPU memory and to return
a memory pointer usable by the CPU to access the GPU memory area.
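A usage sketch, assuming the mapping call takes the device, the size and the
GPU pointer and returns a CPU-visible address; the exact prototypes
(including the unmap argument) are in rte_gpudev.h:

    void *gpu_buf = rte_gpu_mem_alloc(gpu_id, buf_size);        /* GPU memory */
    void *cpu_ptr = rte_gpu_mem_cpu_map(gpu_id, buf_size, gpu_buf);
    if (cpu_ptr != NULL) {
        /* the CPU can now read/write the GPU memory area directly */
        ((volatile uint32_t *)cpu_ptr)[0] = 1;
        rte_gpu_mem_cpu_unmap(gpu_id, gpu_buf);
    }
    rte_gpu_mem_free(gpu_id, gpu_buf);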
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
DPDK 21.11 adds VFIO support for DMA devices in vhost. This patch
updates the recommended IOVA mode in the async data path.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Expose the Linux EAL ability to reuse existing hugepage files
via the --huge-unlink=never switch.
The default behavior is unchanged; it can also be specified
explicitly using --huge-unlink=existing for consistency.
The old --huge-unlink switch is kept
as an alias for --huge-unlink=always.
Add a test case for the --huge-unlink=never mode.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The EAL malloc layer assumed that all free elements' content
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:
1. EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before returning into the heap.
Clearing the memory can be as slow as around 14 GiB/s.
To avoid this cost, the memalloc layer is now allowed to return dirty memory;
such segments are marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using the new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:
a) If the joint element contains dirty parts, it is dirty:

     dirty + freed + dirty = dirty  => no need to clean
             freed + dirty = dirty     the freed memory

   Dirty parts may be large (e.g. initial allocation),
   so clearing them could create unpredictable slowdown.

b) If the only dirty part of the joint element
   is the freed memory, the joint element can be made clean:

     clean + freed + clean = clean  => freed memory
     clean + freed         = clean     must be cleared
             freed + clean = clean
             freed         = clean
This logic naturally reproduces the old behavior
and always applies in modes when EAL memalloc layer
returns only clean segments.
As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.
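The merge logic can be summarized with the following illustrative pseudo-code
(not the actual EAL implementation):

    /* on free: the freed element is merged with adjacent free neighbours */
    joint_dirty = prev_free_is_dirty || next_free_is_dirty;
    if (!joint_dirty)
        memset(freed_start, 0, freed_len);   /* case b): element stays clean */
    /* case a): the joint element is marked dirty and nothing is cleared here;
     * rte_zmalloc*() will clear it at allocation time if it picks this element */
    elem_set_dirty(joint_elem, joint_dirty);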
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Hugepage mapping is a layer that EAL malloc builds upon.
There were implicit references to its details,
like mentions of segment file descriptors,
but no explicit description of its modes and operation.
Add an overview of the mechanics used on each supported OS.
Convert memory management subsections from list items
to level 4 headers: they are big and important enough.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The KNI kthreads seem to be re-scheduled at a granularity of roughly
1 millisecond right now, which seems to be insufficient for performing
tests involving a lot of control plane traffic.
Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
seems that the existing code cannot reschedule at the desired granularity,
due to precision constraints of schedule_timeout_interruptible().
In our use case, we leverage the Linux Kernel for control plane, and
it is not uncommon to have 60K - 100K pps for some signaling protocols.
Since we are not in atomic context, the usleep_range() function seems to be
more appropriate for being able to introduce smaller controlled delays,
in the range of 5-10 microseconds. Upon reading the existing code, it would
seem that this was the original intent. Adding sub-millisecond delays
seems unfeasible with a call to schedule_timeout_interruptible().
#define KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
schedule_timeout_interruptible(
usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
Below, we attempted a brief comparison between the existing implementation,
which uses schedule_timeout_interruptible() and usleep_range().
We attempt to measure the CPU usage, and RTT between two Kni interfaces,
which are created on top of vmxnet3 adapters, connected by a vSwitch.
insmod rte_kni.ko kthread_mode=single carrier=on
schedule_timeout_interruptible(usecs_to_jiffies(5))
kni_single CPU Usage: 2-4 %
[root@localhost ~]# ping 1.1.1.2 -I eth1
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
usleep_range(5, 10)
kni_single CPU usage: 50%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
usleep_range(20, 50)
kni_single CPU usage: 24%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
usleep_range(50, 100)
kni_single CPU usage: 13%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
usleep_range(100, 200)
kni_single CPU usage: 7%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
usleep_range(1000, 1100)
kni_single CPU usage: 2%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
latency and CPU usage to the variant with schedule_timeout_interruptible(),
while usleep_range(100, 200) seems to give a decent tradeoff between
latency and CPU usage, and allows users to tweak the limits for
improved precision if they have such use cases.
Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly seems to lead to a
softlockup on my kernel.
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 1226 Comm: kni_single Tainted: G W O 3.10 #1
<IRQ> [<ffffffff814f84de>] dump_stack+0x19/0x1b
[<ffffffff814f7891>] panic+0xcd/0x1e0
[<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
[<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
[<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
[<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
[<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80
This patch also attempts to remove this option.
References:
[1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Acked-by: Padraig Connolly <Padraig.J.Connolly@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_FLOW_ACTION_TYPE_MODIFY_FIELD action was
introduced by [1]. This action provides a unified way
to perform various arithmetic and transfer operations over
packet network header fields and packet metadata.
[1] 73b68f4c54 ("ethdev: introduce generic modify flow action")
On the other hand, there is a bunch of legacy actions
that can be superseded by the generic MODIFY_FIELD action:
RTE_FLOW_ACTION_TYPE_OF_SET_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_SET_NW_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_NW_TTL sfc
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_OUT
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_IN
RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC bnxt, cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV4_DST bnxt, cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_TP_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_TP_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_DEC_TTL mlx5, sfc
RTE_FLOW_ACTION_TYPE_SET_TTL mlx5
RTE_FLOW_ACTION_TYPE_SET_MAC_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_MAC_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_INC_TCP_SEQ mlx5
RTE_FLOW_ACTION_TYPE_DEC_TCP_SEQ mlx5
RTE_FLOW_ACTION_TYPE_INC_TCP_ACK mlx5
RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV4_DSCP mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_DSCP mlx5
RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID bnxt, cnxk, cxgbe, enic,
mlx5, octeontx2, sfc
RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP bnxt, cnxk, cxgbe, enic,
mlx5, octeontx2, sfc
RTE_FLOW_ACTION_TYPE_SET_TAG mlx5
RTE_FLOW_ACTION_TYPE_SET_META mlx5
This note deprecates the following RTE Flow actions,
as they are not supported by any PMD:
RTE_FLOW_ACTION_TYPE_OF_SET_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_SET_NW_TTL
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_OUT
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_IN
The following actions are supposed to be deprecated in 22.07
and replaced by the generic field modify action:
RTE_FLOW_ACTION_TYPE_OF_DEC_NW_TTL
RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC
RTE_FLOW_ACTION_TYPE_SET_IPV4_DST
RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC
RTE_FLOW_ACTION_TYPE_SET_IPV6_DST
RTE_FLOW_ACTION_TYPE_SET_TP_SRC
RTE_FLOW_ACTION_TYPE_SET_TP_DST
RTE_FLOW_ACTION_TYPE_DEC_TTL
RTE_FLOW_ACTION_TYPE_SET_TTL
RTE_FLOW_ACTION_TYPE_SET_MAC_SRC
RTE_FLOW_ACTION_TYPE_SET_MAC_DST
RTE_FLOW_ACTION_TYPE_INC_TCP_SEQ
RTE_FLOW_ACTION_TYPE_DEC_TCP_SEQ
RTE_FLOW_ACTION_TYPE_INC_TCP_ACK
RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK
RTE_FLOW_ACTION_TYPE_SET_IPV4_DSCP
RTE_FLOW_ACTION_TYPE_SET_IPV6_DSCP
RTE_FLOW_ACTION_TYPE_SET_TAG
RTE_FLOW_ACTION_TYPE_SET_META
The VLAN set actions are interrelated with VLAN header insertion/removal,
supported by multiple PMDs, and widely used by applications; they are
not supposed to be deprecated due to the potentially large impact on
drivers and applications.
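For example, what RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC does today can be
expressed with MODIFY_FIELD roughly as below; the layout of the immediate
value inside struct rte_flow_action_modify_data differs between releases,
so treat the initializer as a sketch:

    struct rte_flow_action_modify_field conf = {
        .operation = RTE_FLOW_MODIFY_SET,
        .dst = { .field = RTE_FLOW_FIELD_IPV4_SRC },
        .src = { .field = RTE_FLOW_FIELD_VALUE /* immediate source address */ },
        .width = 32,                  /* the whole IPv4 source address */
    };
    const struct rte_flow_action action = {
        .type = RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
        .conf = &conf,
    };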
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Currently, the programmer's guide sections for the RIB and FIB libraries
are missing. This commit adds them.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Change from "how many segments each segment can have" to
"how many segments each segment list can have".
Fixes: b317393283 ("doc: update guides for memory subsystem")
Cc: stable@dpdk.org
Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Update the docs to reflect the two new variables, cpu_instruction_set
for non-arm builds and platform for arm builds.
Fixes: bf66003b51 ("build: use platform for generic and native builds")
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Added a diagram to document meter library components
and added text for steps performed by the application to
configure the traffic meter and policing library.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
When using PMD Power Management, scale mode reacts slower than
monitor mode and pause mode. Add a note in the user guide to this
effect.
Signed-off-by: David Hunt <david.hunt@intel.com>
The docs contain references to 'pmd' but the proper usage is 'PMD'.
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Remove the use of 'driver' following 'PMD' as it is unnecessary.
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Conor Fogarty <conor.fogarty@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some duplicate words were detected with a script.
Fixes: fdec9301f5 ("doc: add flow classify guides")
Fixes: 4dc6d8e63c ("doc: add graph library guide")
Fixes: 30d3aa861d ("doc: rework VM power manager user guide")
Fixes: 0d547ed037 ("examples/ipsec-secgw: support configuration file")
Fixes: e64833f227 ("examples/l2fwd-keepalive: add sample application")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
To enable bifurcated device support, rtnl_lock is released before calling
userspace callbacks and asynchronous requests are enabled.
But these changes caused more issues, like bugs #809 and #816. To reduce the
scope of the problems, the bifurcated device support related changes are
only enabled when it is requested explicitly with new 'enable_bifurcated'
module parameter.
And bifurcated device support is disabled by default.
So the bifurcated device related problems are isolated and they can be
fixed without impacting all use cases.
Bugzilla ID: 816
Fixes: 631217c761 ("kni: fix kernel deadlock with bifurcated device")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Igor Ryzhov <iryzhov@nfware.com>
This library can be made optional.
drivers/gpu and app/test-gpudev depend on this library,
so they are automatically disabled if the lib is disabled.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Add support for Address Sanitizer (ASan) for PPC/POWER architecture.
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch defines ASAN_SHADOW_OFFSET for arm64 according to the ASan
documentation. This offset should cover all arm64 VMAs supported by
ASan.
Signed-off-by: Volodymyr Fialko <vfialko@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
In a heterogeneous computing system, processing is not only done in the CPU.
Some tasks can be delegated to devices working in parallel.
When mixing network activity with task processing, there may be a need
for the CPU to communicate with the device in order to synchronize
operations.
An example could be a receive-and-process application
where CPU is responsible for receiving packets in multiple mbufs
and the GPU is responsible for processing the content of those packets.
The purpose of this list is to provide a buffer in CPU memory visible
from the GPU that can be treated as a circular buffer
to let the CPU provide fundamental info about received packets to the GPU.
A possible use-case is described below.
CPU:
- Trigger some task on the GPU
- in a loop:
- receive a number of packets
- provide packets info to the GPU
GPU:
- Do some pre-processing
- Wait to receive a new set of packets to be processed
Layout of a communication list would be:
     --------
    |   0    |  => pkt_list
    | status |
    | #pkts  |
     --------
    |   1    |  => pkt_list
    | status |
    | #pkts  |
     --------
    |   2    |  => pkt_list
    | status |
    | #pkts  |
     --------
    |  ....  |  => pkt_list
     --------
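On the CPU side the loop could look roughly as below, assuming the
communication list helpers added by this series (rte_gpu_comm_create_list(),
rte_gpu_comm_populate_list_pkts(), rte_gpu_comm_cleanup_list_pkts()); burst
sizes and identifiers are illustrative:

    struct rte_gpu_comm_list *comm_list =
        rte_gpu_comm_create_list(gpu_id, nb_items);
    uint32_t idx = 0;

    /* ... trigger the persistent GPU task ... */

    for (;;) {
        struct rte_mbuf *mbufs[32];
        uint16_t nb = rte_eth_rx_burst(port_id, 0, mbufs, 32);
        if (nb == 0)
            continue;
        /* hand the burst over to the GPU through the next list entry */
        rte_gpu_comm_populate_list_pkts(&comm_list[idx], mbufs, nb);
        idx = (idx + 1) % nb_items;
        /* entries the GPU has marked as done are recycled with
         * rte_gpu_comm_cleanup_list_pkts() before being reused */
    }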
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
In a heterogeneous computing system, processing is not only done in the CPU.
Some tasks can be delegated to devices working in parallel.
When mixing network activity with task processing, there may be a need
for the CPU to communicate with the device in order to synchronize
operations.
The purpose of this flag is to allow the CPU and the GPU to
exchange ACKs. A possible use-case is described below.
CPU:
- Trigger some task on the GPU
- Prepare some data
- Signal to the GPU the data is ready updating the communication flag
GPU:
- Do some pre-processing
- Wait for more data from the CPU polling on the communication flag
- Consume the data prepared by the CPU
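A CPU-side sketch using the flag helpers from this series
(rte_gpu_comm_create_flag(), rte_gpu_comm_set_flag()); the enum value for a
CPU-resident flag is an assumption to be checked in rte_gpudev.h:

    struct rte_gpu_comm_flag ready_flag;

    /* flag allocated in CPU memory, visible from the GPU */
    rte_gpu_comm_create_flag(gpu_id, &ready_flag, RTE_GPU_COMM_FLAG_CPU);
    rte_gpu_comm_set_flag(&ready_flag, 0);

    /* ... trigger the persistent GPU task, prepare the data ... */

    /* signal the GPU that the data is ready */
    rte_gpu_comm_set_flag(&ready_flag, 1);

    /* the GPU workload polls the flag value from its own loop */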
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Add a function for the application to ensure the coherency
of the writes executed by another device into the GPU memory.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
In a heterogeneous computing system, processing is not only done in the CPU.
Some tasks can be delegated to devices working in parallel.
Such workload distribution can be achieved by sharing some memory.
As a first step, the features are focused on memory management.
A function allows allocating memory inside the device,
or in the main (CPU) memory while making it visible to the device.
This memory may be used to save packets or for synchronization data.
The next step should focus on GPU processing task control.
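An illustrative allocation sequence (the initial rte_gpu_mem_alloc() form
without an alignment argument is assumed):

    /* memory resident on the GPU, usable by GPU kernels */
    void *gpu_buf = rte_gpu_mem_alloc(gpu_id, buf_size);

    /* CPU memory registered so that the GPU can also access it */
    void *cpu_buf = rte_malloc(NULL, buf_size, 0);
    rte_gpu_mem_register(gpu_id, buf_size, cpu_buf);

    /* ... use the areas for packets or synchronization data ... */

    rte_gpu_mem_unregister(gpu_id, cpu_buf);
    rte_gpu_mem_free(gpu_id, gpu_buf);
    rte_free(cpu_buf);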
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The computing device may operate in some isolated contexts.
Memory and processing are isolated in a silo represented by
a child device.
The context is provided as an opaque pointer by the caller of
rte_gpu_add_child().
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
In a heterogeneous computing system, processing is not only done in the CPU.
Some tasks can be delegated to devices working in parallel.
The new library gpudev is for dealing with GPGPU computing devices
from a DPDK application running on the CPU.
The infrastructure is prepared to welcome drivers in drivers/gpu/.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
This patch adds a new API ``rte_event_eth_rx_adapter_queue_stats_get`` to
retrieve queue stats. The queue stats are in the format
``struct rte_event_eth_rx_adapter_queue_stats``.
For resetting the queue stats, the
``rte_event_eth_rx_adapter_queue_stats_reset`` API is added.
The adapter stats_get and stats_reset APIs are also updated to
handle the queue-level event buffer use case.
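Typical usage might look as below; the adapter/ethdev/queue identifier triple
is assumed from the API description:

    struct rte_event_eth_rx_adapter_queue_stats qstats;

    if (rte_event_eth_rx_adapter_queue_stats_get(adapter_id, eth_dev_id,
            rx_queue_id, &qstats) == 0) {
        /* inspect the per-queue counters */
    }

    /* clear the counters of that queue only */
    rte_event_eth_rx_adapter_queue_stats_reset(adapter_id, eth_dev_id,
            rx_queue_id);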
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Add support for transmit segmentation offload to inline crypto processing
mode. This offload is not supported by other offload modes, as at a
minimum it requires inline crypto for IPsec to be supported on the
network interface.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The cryptodev library now registers commands with telemetry, and
implements the corresponding callback functions. These commands
allow a list of cryptodevs to be queried, as well as info and stats
for the corresponding cryptodev.
An example usage can be seen below:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.11.0-rc0", "pid": 1135019, "max_output_len": 16384}
--> /
{"/": ["/", "/cryptodev/info", "/cryptodev/list", "/cryptodev/stats", ...]}
--> /cryptodev/list
{"/cryptodev/list": [0,1,2,3]}
--> /cryptodev/info,0
{"/cryptodev/info": {"device_name": "0000:1c:01.0_qat_sym", \
"max_nb_queue_pairs": 2}}
--> /cryptodev/stats,0
{"/cryptodev/stats": {"enqueued_count": 0, "dequeued_count": 0, \
"enqueue_err_count": 0, "dequeue_err_count": 0}}
Signed-off-by: Rebecca Troy <rebecca.troy@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
As previously announced, this patch renames struct
vhost_device_ops to struct rte_vhost_device_ops.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
rte_flow_action_handle_create() did not mention what happens
with an indirect action when a device is stopped and started again.
It is natural for some indirect actions, like counter, to be persistent.
Keeping others at least saves application time and complexity.
However, not all PMDs can support it, or the support may be limited
by particular action kinds, that is, combinations of action type
and the value of the transfer bit in its configuration.
Add a device capability to indicate if at least some indirect actions
are kept across the above sequence. Without this capability the behavior
is still unspecified, and the application is required to destroy
the indirect actions before stopping the device.
In the future, indirect actions may not be the only type of objects
shared between flow rules. The capability bit intends to cover all
possible types of such objects, hence its name.
Declare that the application can test for the persistence
of a particular indirect action kind by attempting to create
an indirect action of that kind when the device is stopped
and checking for the specific error type.
This is logical because if the PMD can create an indirect action
when the device is not started and use it after the start happens,
it is natural that it can move its internal flow shared object
to the same state when the device is stopped and restore the state
when the device is started.
Indirect action persistence across reconfigurations is not required.
In case a PMD cannot keep the indirect actions across reconfiguration,
it is allowed just to report an error.
Application must then flush the indirect actions before attempting it.
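An application would typically probe the capability before relying on
persistence; the capability bit name below is taken from this series and is
assumed to be reported in rte_eth_dev_info.dev_capa:

    struct rte_eth_dev_info dev_info;

    rte_eth_dev_info_get(port_id, &dev_info);
    if (dev_info.dev_capa & RTE_ETH_DEV_CAPA_FLOW_SHARED_OBJECT_KEEP) {
        /* indirect actions may survive stop/start; a particular kind can be
         * tested by creating it while the device is stopped */
    } else {
        /* unspecified behavior: destroy indirect actions before stopping */
    }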
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Previously, it was not specified what happens to the flow rules
when the device is stopped, possibly reconfigured, then started.
If flow rules were kept, it could be convenient for application
developers, because they wouldn't need to save and restore them.
However, due to the number of flows and possible creation rate it is
impractical to save all flow rules in DPDK layer. This means that flow
rules persistence really depends on whether PMD and HW can implement it
efficiently. It can also be limited by the rule item and action types,
and its attributes transfer bit (a combination of an item/action type
and a value of the transfer bit is called a rule feature).
Add a device capability bit for PMDs that can keep at least some
of the flow rules across restart. Without this capability behavior
is still unspecified and it is declared that the application must
flush the rules before stopping the device.
Allow the application to test for persistence of rules using
a particular feature by attempting to create a flow rule
using that feature when the device is stopped
and checking for the specific error.
This is logical because if the PMD can create the flow rule
when the device is not started and use it after the start happens,
it is natural that it can move its internal flow rule object
to the same state when the device is stopped and restore the state
when the device is started.
Rule persistence across reconfigurations is not required,
because tracking all the rules and configuration-dependent resources
they use may be infeasible. In case a PMD cannot keep the rules
across reconfiguration, it is allowed just to report an error.
Application must then flush the rules before attempting it.
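The corresponding application-side check, with the capability bit name
assumed from this series:

    struct rte_eth_dev_info dev_info;
    struct rte_flow_error error;

    rte_eth_dev_info_get(port_id, &dev_info);
    if (!(dev_info.dev_capa & RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP)) {
        /* behavior across stop/start is unspecified:
         * flush the rules before stopping the device */
        rte_flow_flush(port_id, &error);
    }
    rte_eth_dev_stop(port_id);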
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Implement PIE-based congestion management based on RFC 8033.
The Proportional Integral Controller Enhanced (PIE) algorithm works
by proactively dropping packets randomly.
PIE is implemented as more advanced queue management is required to
address the bufferbloat problem and provide desirable quality of
service to users.
Tests for PIE code added to test application.
Added PIE related information to documentation.
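The core of the algorithm is the periodic drop-probability update from
RFC 8033 (shown as a simplified sketch, not the library API):

    /* run every T_UPDATE interval (RFC 8033, section 4.2) */
    drop_prob += alpha * (cur_qdelay - QDELAY_REF)
               + beta  * (cur_qdelay - old_qdelay);
    old_qdelay = cur_qdelay;
    /* on enqueue, the packet is dropped with probability drop_prob */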
Signed-off-by: Wojciech Liguzinski <wojciechx.liguzinski@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
This patch adds a bulk version of the Toeplitz hash implemented
with Galois Field New Instructions (GFNI).
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds a new Toeplitz hash implementation using
Galois Field New Instructions (GFNI).
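Expected usage is to convert the RSS key into its matrix form once and then
hash tuples with it; the helper names below follow this series and should be
checked against rte_thash_gfni.h:

    uint64_t mtrx[40];          /* matrix form of the RSS key; sizing per rte_thash_gfni.h */

    /* pre-compute the matrices from the 40-byte RSS key (GFNI-capable CPU) */
    rte_thash_complete_matrix(mtrx, rss_key, 40);

    uint32_t hash = rte_thash_gfni(mtrx, tuple, tuple_len);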
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds necessary hooks in the memory allocator for ASan.
This feature is currently available in DPDK only on Linux x86_64.
If other OS/architectures want to support it, ASAN_SHADOW_OFFSET must be
defined and RTE_MALLOC_ASAN must be set accordingly in meson.
Signed-off-by: Xueqin Lin <xueqin.lin@intel.com>
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>