numam-dpdk

Author	SHA1	Message	Date
Anatoly Burakov	79a0bbe5b6	eal: pick IOVA as PA if IOMMU is not available When IOMMU is not available, /sys/kernel/iommu_groups will not be populated. This is happening since at least 3.6 when VFIO support was added. If the directory is empty, EAL should not pick IOVA as VA as the default IOVA mode. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-07-30 10:09:13 +02:00
Dekel Peled	68bb77e968	doc: fix ethernet addresses in flow API guide Ethernet address examples contain 5 bytes only. This patch changes the examples to valid 6 bytes Ethernet addresses. Fixes: 4d73b6fb9907 ("doc: add generic flow API guide") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-23 14:31:35 +02:00
Jerin Jacob	bbe29a9bd7	eal/linux: select IOVA as VA mode for default case When bus layer reports the preferred mode as RTE_IOVA_DC then select the RTE_IOVA_VA mode: - All drivers work in RTE_IOVA_VA mode, irrespective of physical address availability. - By default, a mempool asks for IOVA-contiguous memory using RTE_MEMZONE_IOVA_CONTIG. This is slow in RTE_IOVA_PA mode and it may affect the application boot time. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-07-22 17:47:27 +02:00
David Marchand	b76fafb174	eal: fix IOVA mode selection as VA for PCI drivers The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA which was intended to mean "driver only supports VA" but had been understood as "driver supports both PA and VA" by most net drivers and used to let DPDK processes run as non-root (which do not have access to physical addresses on recent kernels). The check on physical addresses actually closed the gap for those drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this flag can retain its intended meaning. Document explicitly its meaning. We can check that a driver requirement with respect to IOVA mode is fulfilled before trying to probe a device. Finally, document the heuristic used to select the IOVA mode and hope that we won't break it again. Fixes: 703458e19c16 ("bus/pci: consider only usable devices for IOVA mode") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-22 17:45:52 +02:00
Adam Dybkowski	1525374af3	doc: update compressdev guide This patch updates the constant names and function names used in code snippets in the Compression Device Library documentation. Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2019-07-19 14:17:11 +02:00
Thomas Monjalon	15a9ef841a	doc: fix PDF build of bbdev prog guide Some machines (like the one on dpdk.org) may fail to build the prog guide PDF because of a table character being "+" instead of "|". Some figure references are also fixed with automatic numbering. Fixes: 3f3f608142cf ("doc: update bbdev guide for 5GNR operations") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>	2019-07-18 23:56:12 +02:00
Harman Kalra	44dc7c0a26	metrics: add function to deinitialise library Once the library usage is over, it must be deinitialized, which will free the shared memory reserved during initialization. Observed an issue while running 'metrics_autotest' continuously without quitting. For the first run 'metrics_autotest' passes all test cases, but from the second run onwards the first test case fails because the metrics library is already initialized during the first run. Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Remy Horton <remy.horton@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2019-07-16 12:45:30 +02:00
Yangchao Zhou	5eb1708ec1	kni: fix kernel crash with multi-segments va2pa depends on the physical address and virtual address offset of the current mbuf. It may get the wrong physical address of the next mbuf if that mbuf is allocated in another hugepage segment. In rte_mempool_populate_default(), trying to allocate a whole block of contiguous memory could fail. Then, it would reserve memory in several memzones that have different physical address and virtual address offsets. The rte_mempool_populate_default() is used by rte_pktmbuf_pool_create(). Fixes: 8451269e6d7b ("kni: remove continuous memory restriction") Cc: stable@dpdk.org Signed-off-by: Yangchao Zhou <zhouyates@gmail.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-15 22:48:20 +02:00
Stephen Hemminger	89247e1a73	doc: update KNI guides Update KNI documentation to reflect current ethtool support. Replace references to outdated tools (ifconfig) with modern iproute2. Tshark is a better replacement for tcpdump. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-15 19:16:48 +02:00
Xiaoyu Min	3374db1857	ethdev: add GRE key field to flow API Add new rte_flow_item_gre_key in order to match the optional key field. Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-08 21:26:52 +02:00
Xiaolong Ye	566d66aa84	doc: fix a grammar mistake in rawdev guide Fixes: a9bb0c44c775 ("doc: add rawdev library guide and doxygen page") Cc: stable@dpdk.org Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>	2019-07-08 20:21:34 +02:00
Nicolas Chautru	3f3f608142	doc: update bbdev guide for 5GNR operations The documentation captures the related change in BBDEV API to support 5GNR encode/decode operations. Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>	2019-07-05 15:28:14 +02:00
Konstantin Ananyev	e95291f019	ipsec: support multi-segment packets Add support for packets that consist of multiple segments. Take into account that trailer bytes (padding, ESP tail, ICV) can span across multiple segments. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:28:14 +02:00
Fiona Trahe	48903a7967	doc: clarify xform zeroing in crypto guide Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:28:14 +02:00
Fiona Trahe	f7095d41bb	doc: clarify data plane error handling in compressdev Fixed some typos and clarified how errors on ops which fail to get submitted on the enqueue API should be handled. Fixes: a584d3bea902 ("doc: add compressdev library guide") Cc: stable@dpdk.org Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-07-05 14:52:25 +02:00
Thierry Herbelot	6b1a74ef80	doc: fix grammar in prog guides A copy'n'paste typo was present in three guides. Fixes: a584d3bea902 ("doc: add compressdev library guide") Fixes: 4935e1e9f76e ("bbdev: introduce wireless base band device lib") Fixes: 0318c02b57cf ("doc: add cryptodev chapter in prog guide") Cc: stable@dpdk.org Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Acked-by: Rami Rosen <ramirose@gmail.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2019-07-05 14:52:25 +02:00
Thierry Herbelot	8b283e90e6	doc: fix triplicated typo in prog guides A bad formula was copied and pasted. Fixes: 4935e1e9f76e ("bbdev: introduce wireless base band device lib") Fixes: 0318c02b57cf ("doc: add cryptodev chapter in prog guide") Fixes: a9bb0c44c775 ("doc: add rawdev library guide and doxygen page") Cc: stable@dpdk.org Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2019-07-05 14:52:25 +02:00
Dekel Peled	f10409b766	ethdev: allow action with any config object type In current implementation, an action which requires parameters must accept them enclosed in a structure. Some actions require a single, trivial type parameter, but it still must be enclosed in a structure. This obligation results in multiple, action-specific structures, each containing a single trivial type parameter. This patch introduces a new approach, allowing an action configuration object of any type, trivial or a structure. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-05 01:52:01 +02:00
Dekel Peled	5dca85377e	ethdev: add actions to modify TCP header fields Add actions: - INC_TCP_SEQ - Increase sequence number in the outermost TCP header. - DEC_TCP_SEQ - Decrease sequence number in the outermost TCP header. - INC_TCP_ACK - Increase acknowledgment number in the outermost TCP header. - DEC_TCP_ACK - Decrease acknowledgment number in the outermost TCP header. Original work by Xiaoyu Min. This patch uses the new approach introduced by [1], using a simple integer instead of using an action-specific structure for each of the new actions. [1] http://patches.dpdk.org/patch/55882/ Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-05 01:52:01 +02:00
Gage Eads	4ae9f32ecc	doc: add a note for multi-process in mempool guide The mempool library assigns handler ops indexes based on the dynamic load order of mempool handlers. Indexes are used so a mempool can be used by multiple processes, but this only works if all processes agree on the mapping from index to mempool handler. When using the '-d' argument, it's possible for different processes to load mempool handlers in different orders, and thus have different index->handler mappings. Using a mempool in multiple of such processes will result in undefined behavior. This commit adds a note to the mempool library programmer's guide warning users against this. Fixes: 449c49b93a6b ("mempool: support handler operations") Cc: stable@dpdk.org Signed-off-by: Gage Eads <gage.eads@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-07-04 23:14:28 +02:00
Thomas Monjalon	87d2f3c576	doc: fix typo in EAL guide The EAL init diagram had a typo for "lauch" instead of "launch". Fixes: fc1f2750a3ec ("doc: programmers guide") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-06-17 12:35:54 +03:00
David Marchand	0c9da7555d	net: replace IPv4/v6 constants with uppercase name Since we change these macros, we might as well avoid triggering complaints from checkpatch because of mixed case. old=RTE_IPv4 new=RTE_IPV4 git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g" old=RTE_ETHER_TYPE_IPv4 new=RTE_ETHER_TYPE_IPV4 git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g" old=RTE_ETHER_TYPE_IPv6 new=RTE_ETHER_TYPE_IPV6 git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g" Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-06-03 16:54:54 +02:00
Olivier Matz	24ac604ef7	net: add rte prefix to IP defines Add 'RTE_' prefix to defines: - rename IPv4( as RTE_IPv4(. - rename IPV4_MAX_PKT_LEN as RTE_IPV4_MAX_PKT_LEN. - rename IPV4_HDR_IHL_MASK as RTE_IPV4_HDR_IHL_MASK. - rename IPV4_IHL_MULTIPLIER as RTE_IPV4_IHL_MULTIPLIER. - rename IPV4_HDR_DF_SHIFT as RTE_IPV4_HDR_DF_SHIFT. - rename IPV4_HDR_MF_SHIFT as RTE_IPV4_HDR_MF_SHIFT. - rename IPV4_HDR_FO_SHIFT as RTE_IPV4_HDR_FO_SHIFT. - rename IPV4_HDR_DF_FLAG as RTE_IPV4_HDR_DF_FLAG. - rename IPV4_HDR_MF_FLAG as RTE_IPV4_HDR_MF_FLAG. - rename IPV4_HDR_OFFSET_MASK as RTE_IPV4_HDR_OFFSET_MASK. - rename IPV4_HDR_OFFSET_UNITS as RTE_IPV4_HDR_OFFSET_UNITS. - rename IPV4_ANY as RTE_IPV4_ANY. - rename IPV4_LOOPBACK as RTE_IPV4_LOOPBACK. - rename IPV4_BROADCAST as RTE_IPV4_BROADCAST. - rename IPV4_ALLHOSTS_GROUP as RTE_IPV4_ALLHOSTS_GROUP. - rename IPV4_ALLRTRS_GROUP as RTE_IPV4_ALLRTRS_GROUP. - rename IPV4_MAX_LOCAL_GROUP as RTE_IPV4_MAX_LOCAL_GROUP. - rename IPV4_MIN_MCAST as RTE_IPV4_MIN_MCAST. - rename IPV4_MAX_MCAST as RTE_IPV4_MAX_MCAST. - rename IS_IPV4_MCAST as RTE_IS_IPV4_MCAST. - rename IPV6_HDR_FL_SHIFT as RTE_IPV6_HDR_FL_SHIFT. - rename IPV6_HDR_TC_SHIFT as RTE_IPV6_HDR_TC_SHIFT. - rename IPV6_HDR_FL_MASK as RTE_IPV6_HDR_FL_MASK. - rename IPV6_HDR_TC_MASK as RTE_IPV6_HDR_TC_MASK. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:46 +02:00
Olivier Matz	a7c528e5d7	net: add rte prefix to IP structure Add 'rte_' prefix to structures: - rename struct ipv4_hdr as struct rte_ipv4_hdr. - rename struct ipv6_hdr as struct rte_ipv6_hdr. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:46 +02:00
Olivier Matz	35b2d13fd6	net: add rte prefix to ether defines Add 'RTE_' prefix to defines: - rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN. - rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN. - rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN. - rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN. - rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN. - rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN. - rename ETHER_MTU as RTE_ETHER_MTU. - rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN. - rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID. - rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN. - rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU. - rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR. - rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR. - rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4. - rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6. - rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP. - rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN. - rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP. - rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ. - rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG. - rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588. - rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW. - rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB. - rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP. - rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS. - rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM. - rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN. - rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE. - rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4. - rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6. - rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH. - rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH. - rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS. - rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP. - rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG. - rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN. Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:45 +02:00
Olivier Matz	6d13ea8e8e	net: add rte prefix to ether structures Add 'rte_' prefix to structures: - rename struct ether_addr as struct rte_ether_addr. - rename struct ether_hdr as struct rte_ether_hdr. - rename struct vlan_hdr as struct rte_vlan_hdr. - rename struct vxlan_hdr as struct rte_vxlan_hdr. - rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr. Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:45 +02:00
Anatoly Burakov	bfbc3a5041	ipc: add warnings about correct API usage When handling synchronous or asynchronous requests, the reply must be sent explicitly even if the result of the operation is an error, to avoid the other side timing out. Make note of this in documentation explicitly. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-05-09 17:50:59 +02:00
Anatoly Burakov	3855b41500	ipc: add warnings about not using IPC with memory API IPC and memory-related API's should not be mixed because memory relies on IPC internally. Add explicit warnings to IPC API and to the documentation about this. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-05-09 17:49:32 +02:00
Anatoly Burakov	ab96056d13	doc: fix typo in IPC guide The word "synchronous" appears twice. Fix it. Fixes: e22266669e86 ("doc: add IPC guide") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2019-05-09 17:45:30 +02:00
Herakliusz Lipiec	193774a799	doc: fix broken link in LPM guide Bugzilla ID: 235 Fixes: fc1f2750a3ec ("doc: programmers guide") Cc: stable@dpdk.org Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com> Acked-by: Rami Rosen <ramirose@gmail.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2019-05-04 23:39:23 +02:00
Yipeng Wang	c96b55df8e	hash: fix doc about thread/process safety rte_hash_hash is multi-thread safe but not multi-process safe because of the use of function pointers. Previous document and comment says the other way around. This commit fixes the issue. Fixes: fc1f2750a3ec ("doc: programmers guide") Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Cc: stable@dpdk.org Reported-by: Andrey Nikolaev <gentoorion@gmail.com> Suggested-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2019-05-04 23:29:45 +02:00
John McNamara	d629b7b5fe	doc: fix spelling reported by aspell in guides Fix spelling errors in the guide docs. Signed-off-by: John McNamara <john.mcnamara@intel.com> Acked-by: Rami Rosen <ramirose@gmail.com>	2019-05-03 00:37:13 +02:00
Honnappa Nagarahalli	d67e6ecf46	doc: add RCU guide Add lib RCU QSBR programmer guide documentation. Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Marko Kovacevic <marko.kovacevic@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2019-05-02 01:33:14 +02:00
Gage Eads	e75bc77f98	mempool/stack: add lock-free stack mempool handler This commit adds support for lock-free (linked list based) stack mempool handler. In mempool_perf_autotest the lock-based stack outperforms the lock-free handler for certain lcore/alloc count/free count combinations, however: - For applications with preemptible pthreads, a standard (lock-based) stack's worst-case performance (i.e. one thread being preempted while holding the spinlock) is much worse than the lock-free stack's. - Using per-thread mempool caches will largely mitigate the performance difference. Test setup: x86_64 build with default config, dual-socket Xeon E5-2699 v4, running on isolcpus cores with a tickless scheduler. The lock-based stack's rate_persec was 0.6x-3.5x the lock-free stack's. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-04-04 22:06:16 +02:00
Gage Eads	3340202f59	stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2019-04-04 22:06:16 +02:00
Gage Eads	05d3b5283c	stack: introduce stack library The rte_stack library provides an API for configuration and use of a bounded stack of pointers. Push and pop operations are MT-safe, allowing concurrent access, and the interface supports pushing and popping multiple pointers at a time. The library's interface is modeled after another DPDK data structure, rte_ring, and its lock-based implementation is derived from the stack mempool handler. An upcoming commit will migrate the stack mempool handler to rte_stack. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2019-04-04 22:06:16 +02:00
Fan Zhang	3ed37e0934	doc: update supported algorithms in IPsec guide This patch updates the ipsec library programmer's guide with the additional algorithms which are now supported. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-04-03 13:50:58 +02:00
Dharmik Thakkar	f401363d98	hash: support lock-free extendable bucket This patch enables lock-free read-write concurrency support for extendable bucket feature. Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2019-04-03 20:52:35 +02:00
Anatoly Burakov	1e3380a2f4	mem: do not use lockfiles for single file segments mode Due to internal glibc limitations [1], DPDK may exhaust internal file descriptor limits when using smaller page sizes, which results in inability to use system calls such as select() by user applications. Single file segments option stores lock files per page to ensure that pages are deleted when there are no more users, however this is not necessary because the processes will be holding onto the pages anyway because of mmap(). Thus, removing pages from the filesystem is safe even though they may be used by some other secondary process. As a result, single file segments mode no longer stores inordinate amounts of segment fd's, and the above issue with fd limits is solved. However, this will not work for legacy mem mode. For that, simply document that using bigger page sizes is the only option. [1] https://mails.dpdk.org/archives/dev/2019-February/124386.html Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-04-02 16:07:25 +02:00
Shahaf Shuler	c33a675b62	bus: introduce device level DMA memory mapping The DPDK APIs expose 3 different modes to work with memory used for DMA: 1. Use the DPDK owned memory (backed by the DPDK provided hugepages). This memory is allocated by the DPDK libraries, included in the DPDK memory system (memseg lists) and automatically DMA mapped by the DPDK layers. 2. Use memory allocated by the user and registered to the DPDK memory system. Upon registration of memory, the DPDK layers will DMA map it to all needed devices. After registration, allocation of this memory will be done with rte_malloc APIs. 3. Use memory allocated by the user and not registered to the DPDK memory system. This is for users who want to have tight control over this memory (e.g. avoid the rte_malloc header). The user should create a memory region, register it through the rte_extmem_register API, and call the DMA map function in order to register such memory to the different devices. This patch focuses on #3 above. Currently the only way to map external memory is through VFIO (rte_vfio_dma_map). While VFIO is common, there are other vendors which use different ways to map memory (e.g. Mellanox and NXP). The work in this patch moves the DMA mapping to vendor agnostic APIs. Device level DMA map and unmap APIs were added. Implementation of those APIs is currently done only for PCI devices. For PCI bus devices, the pci driver can expose its own map and unmap functions to be used for the mapping. In case the driver doesn't provide any, the memory will be mapped, if possible, to IOMMU through VFIO APIs. Application usage with those APIs is quite simple: * allocate memory. * call rte_extmem_register on the memory chunk. * take a device, and query its rte_device. * call the device specific mapping function for this device. Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap APIs, leaving the rte device APIs as the preferred option for the user. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-03-30 16:48:56 +01:00
Liron Himi	ff1e35fb5f	kni: calculate MTU from mbuf size - mbuf_size and mtu are now being calculated according to the given mb-pool. - max_mtu is now being set according to the given mtu. The above two changes provide the ability to work with jumbo frames. Signed-off-by: Liron Himi <lironh@marvell.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-30 00:59:59 +01:00
David Marchand	86dc5089e6	doc: fix examples in bonding guide Removed incorrect space character and fixed PCI addresses. Fixes: fc1f2750a3ec ("doc: programmers guide") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-21 21:09:16 +01:00
Pavan Nikhilesh	1534cc6ab1	doc: add notes about eventdev producer/consumer dependency The EventDev, i.e. the consumer, needs to be started before starting the event producers. Update documentation of EventDev and EventDev adapters. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Reviewed-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>	2019-03-15 06:46:50 +01:00
Bruce Richardson	218c4e68c1	mk: use linux and freebsd in config names Rather than using linuxapp and bsdapp everywhere, we can change things to use the, more readable, terms "linux" and "freebsd" in our build configs. Rather than renaming the configs we can just duplicate the existing ones with the new names using symlinks, and use the new names exclusively internally. ["make showconfigs" also only shows the new names to keep the list short] The result is that backward compatibility is kept fully but any new builds or development can be done using the newer names, i.e. both "make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc" work. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 23:05:06 +01:00
Bruce Richardson	91d7846ce6	eal/linux: rename linuxapp to linux The term "linuxapp" is a legacy one, but just calling the subdirectory "linux" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:31:13 +01:00
Bruce Richardson	25c99fbd68	eal/bsd: rename bsdapp to freebsd The term "bsdapp" is a legacy one, but just calling the subdirectory "freebsd" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:30:20 +01:00
David Marchand	c3568ea376	eal: restrict control threads to startup CPU affinity Spawning the ctrl threads on anything that is not part of the eal coremask is not that polite to the rest of the system, especially when you took good care to pin your processes on cpu resources with tools like taskset (linux) / cpuset (freebsd). Rather than introduce yet another eal options to control on which cpu those ctrl threads are created, let's take the startup cpu affinity as a reference and remove the eal coremask from it. If no cpu is left, then we default to the master core. The cpuset is computed once at init before the original cpu affinity is lost. Introduced a RTE_CPU_AND macro to abstract the differences between linux and freebsd respective macros. Examples in a 4 cores FreeBSD vm: $ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1057 PID TID COMM TDNAME CPU CSID CPU MASK 1057 100131 testpmd - 2 1 2 1057 100140 testpmd eal-intr-thread 1 1 0-1 1057 100141 testpmd rte_mp_handle 1 1 0-1 1057 100142 testpmd lcore-slave-3 3 1 3 $ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1061 PID TID COMM TDNAME CPU CSID CPU MASK 1061 100131 testpmd - 2 2 2 1061 100144 testpmd eal-intr-thread 1 2 1 1061 100145 testpmd rte_mp_handle 1 2 1 1061 100147 testpmd lcore-slave-3 3 2 3 $ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1065 PID TID COMM TDNAME CPU CSID CPU MASK 1065 100131 testpmd - 2 2 2 1065 100148 testpmd eal-intr-thread 2 2 2 1065 100149 testpmd rte_mp_handle 2 2 2 1065 100150 testpmd lcore-slave-3 3 2 3 Fixes: d651ee4919cd ("eal: set affinity for control threads") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-03-07 19:21:28 +01:00
Thomas Monjalon	5a10413c58	doc: fix PCI whitelist typo in prog guide The placeholder for PCI address should be named DBDF which stands for Domain/Bus/Device/Function. Fixes: 33af337773ac ("ethdev: add common devargs parser") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Rami Rosen <ramirose@gmail.com>	2019-03-05 11:57:33 +00:00
Rami Rosen	f959f1148a	doc: remove reference to rte.doc.mk in programmers guide This patch removes the reference to rte.doc.mk in DPDK programmers guide. Fixes: ee801f6cc7b8 ("mk: clean dead doc rules") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-05 10:42:03 +00:00
Tiwei Bie	5c6c1480b3	doc: improve vhost zero copy guide Highlight that vhost zero copy mbufs should be consumed as soon as possible. Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:36 +01:00

390 Commits