numam-dpdk

Author	SHA1	Message	Date
Pallavi Kadam	7e708cd8c6	eal: move CPU operations to OS specific headers Moving RTE_CPU* definitions from the common code to the Linux and FreeBSD rte_os.h file to avoid #ifdef clutter. Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com> Signed-off-by: Antara Ganesh Kolar <antara.ganesh.kolar@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-10-26 17:06:41 +02:00
Pavan Nikhilesh	9b0a1dadc3	reciprocal: fix off-by-one with 32-bit divisor Fix off-by-one error in 64bit reciprocal division when divisor is 32bit. Caught with the unit test: RTE>>reciprocal_division Validating unsigned 32bit division. Validating unsigned 64bit division. Validating unsigned 64bit division with 32bit divisor. Division failed, 16983222950483802557/819 = expected 20736535959076681 result 20736535959076682 Validating division by power of 2. Test Failed Fixes: `6d45659eac` ("eal: add u64-bit variant for reciprocal divide") Cc: stable@dpdk.org Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2019-10-26 16:09:51 +02:00
Konstantin Ananyev	3eb860b08e	mbuf: move definitions into a separate file Right now inclusion of rte_mbuf.h header can cause inclusion of some arch/os specific headers. That prevents it from being included directly by some non-DPDK (but related) entities: KNI, BPF programs, etc. To overcome that problem usually a separate definition of rte_mbuf structure is created within these entities. That approach has a lot of drawbacks: code duplication, error prone, etc. This patch moves rte_mbuf structure definition (and some related macros) into a separate file that can be included by both rte_mbuf.h and other non-DPDK entities. Note that it doesn't introduce any change for current DPDK code. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Michel Machado <michel@digirati.com.br> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-25 19:30:38 +02:00
Konstantin Ananyev	2dcb5f7987	eal: move cache line and IOVA related definitions Right now RTE_CACHE_ and IOVA definitions are located inside rte_memory.h That might cause unwanted inclusion of arch/os specific header files. See [1] for particular problem example. Probably the simplest way to deal with such problems is to move these definitions into rte_common.h. Note that this move doesn't introduce any change in functionality. [1] https://bugs.dpdk.org/show_bug.cgi?id=321 Suggested-by: Vipin Varghese <vipin.varghese@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Michel Machado <michel@digirati.com.br>	2019-10-25 19:30:36 +02:00
Rahul Shah	5d92c4e592	port: add eventdev port type Adding a new port type called eventdev to the rte_port library. Signed-off-by: Rahul Shah <rahul.r.shah@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:29:48 +02:00
Jasvinder Singh	68c1f26d42	sched: support 64-bit values Modify internal structure and functions to support 64-bit values for rates and stats parameters. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:07:37 +02:00
Jasvinder Singh	0edf18eee2	sched: add 64-bit values To support high bandwidth network interfaces, all rates (port, subport level token bucket and traffic class rates, pipe level token bucket and traffic class rates) and stats counters defined in public data structures (rte_sched.h) are modified to support 64 bit counters. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-10-25 18:07:26 +02:00
Jasvinder Singh	def9c49267	sched: remove redundant code Remove redundant data structure fields from port level data structures and update the release notes. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:53:36 +02:00
Jasvinder Singh	831104f0e8	sched: update queue stats read for config flexibility Modify pipe queue stats read function to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:26 +02:00
Jasvinder Singh	2a718309fd	sched: update pkt dequeue for flexible config Modify scheduler packet dequeue operation to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:22 +02:00
Jasvinder Singh	4d2ad6e34b	sched: update grinder functions for config flexibility Modify packet grinder functions of the scheduler to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:19 +02:00
Jasvinder Singh	21dca4e3f6	sched: update memory compute to support flexibility Update memory footprint compute function for allowing subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:17 +02:00
Jasvinder Singh	6fbbb0ef48	sched: modify pkt enqueue for config flexibility Modify the scheduler packet enqueue operation to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:14 +02:00
Jasvinder Singh	34a90f8665	sched: modify pipe functions for config flexibility Modify pipe level functions to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:12 +02:00
Jasvinder Singh	ce7c4fd7c2	sched: add pipe config to subport level Add pipes configuration to the subport level to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:10 +02:00
Jasvinder Singh	d9213b829a	sched: remove pipe params config from port level Remove pipes configuration from the port level to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:07 +02:00
Jasvinder Singh	b757097e37	sched: modify internal structs for config flexibility Update internal structures related to port and subport to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:51:04 +02:00
Jasvinder Singh	85f52aa422	sched: add pipe config params to subport struct Add pipe configuration parameters to subport level structure to allow different subports of the same port to have different configuration in terms of number of pipes, pipe queue sizes, etc. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-10-25 17:49:45 +02:00
Ting Xu	d892768c6d	mbuf: add GTP tunnel type Add GTP tunnel type flag in mbuf for future use in GTP Tx checksum offload. Signed-off-by: Ting Xu <ting.xu@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:10 +02:00
Kiran Kumar K	01b3156d33	ethdev: add HIGIG2 key field to flow API Add new rte_flow_item_higig2_hdr in order to match higig2 header. It is a layer 2.5 protocol and used in Broadcom switches. Header format is based on the following document. http://read.pudn.com/downloads558/doc/comm/2301468/HiGig_protocol.pdf Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-23 16:43:10 +02:00
Ciara Power	400d758182	ethdev: check device promiscuous state The promiscuous enable and disable functions now check the promiscuous state of the device before checking if the dev_ops function exists for the device. This change is necessary to allow sample applications run on virtual PMDs, as previously -ENOTSUP returned when the promiscuous enable function was called. This caused the sample application to fail unnecessarily. Signed-off-by: Ciara Power <ciara.power@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:10 +02:00
David Marchand	7eca7f7fd0	net: add missing endianness annotations OVS currently maintains a copy of those headers with the right endianness annotations so that sparse checks can pass. We introduced rte_beXX_t for better readability in v17.08. Let's make use of them, OVS then only needs to override those rte_beXX_t types by exposing a tweaked rte_byteorder.h header. Other existing dpdk users won't be affected since rte_beXX_t types are mapped to uintXX_t types. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-23 16:43:10 +02:00
Simei Su	d3ae8c44b8	ethdev: extend RSS offload types This patch reserves several bits as input set selection from the high end of the 64 bits. It is combined with existing ETH_RSS_* to represent RSS types. This patch also checks the simultaneous use of SRC_ONLY and DST_ONLY of the same level. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:09 +02:00
Simei Su	fce6b66893	ethdev: decouple flow types and RSS offload types This patch decouples RTE_ETH_FLOW_* and ETH_RSS_*. The former defines flow types and the latter defines RSS offload types. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:09 +02:00
Flavio Leitner	c3ff0ac70a	vhost: improve performance by supporting large buffer The rte_vhost_dequeue_burst supports two ways of dequeuing data. If the data fits into a buffer, then all data is copied and a single linear buffer is returned. Otherwise it allocates additional mbufs and chains them together to return a multiple segments mbuf. While that covers most use cases, it forces applications that need to work with larger data sizes to support multiple segments mbufs. The non-linear characteristic brings complexity and performance implications to the application. To resolve the issue, add support to attach external buffer to a pktmbuf and let the host provide during registration if attaching an external buffer to pktmbuf is supported and if only linear buffer are supported. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	7d0963d74a	vhost: add packed ring support to vring related APIs This patch adds packed ring support to two APIs so the user can get the packed ring. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	4d891f77dd	vhost: add APIs to get inflight ring This patch introduces two APIs. One is for getting the inflight ring and the other is for getting the base. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	bb0c2de960	vhost: add APIs to operate inflight ring This patch introduces three APIs to operate the inflight ring. Three APIs are set, set last and clear. It includes split and packed ring. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	ad0a4ae491	vhost: checkout resubmit inflight information This patch shows how to checkout the inflight ring and construct the resubmit information also include destroying resubmit info. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	d87f1a1cb7	vhost: support inflight info sharing This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared buffer between qemu and backend. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	7588ebed5d	vhost: add inflight structures This patch adds the inflight queue region structures for both the split and packed rings. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	62a70db553	vhost: add packed ring into vring struct This patch adds the packed ring to rte_vhost_vring. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Jin Yu	300cc9fd3d	vhost: add inflight description This patch adds the inflight message description and the inflight share fd protocol feature flag. Signed-off-by: Lin Li <lilin24@baidu.com> Signed-off-by: Xun Ni <nixun@baidu.com> Signed-off-by: Yu Zhang <zhangyu31@baidu.com> Signed-off-by: Jin Yu <jin.yu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Adrian Moreno	c49197ff29	vhost: prevent zero copy mode if IOMMU is on The simultaneous use of dequeue_zero_copy and IOMMU is problematic, not only because IOVA_VA mode is not supported but also because the potential invalidation of guest pages while the buffers are in use is not handled. Prevent these two features from being enabled simultaneously. Fixes: `69c90e98f4` ("vhost: enable IOMMU support") Cc: stable@dpdk.org Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Adrian Moreno	1fc3b3f06a	vhost: convert buffer addresses to GPA for logging Add IOVA versions of dirty page logging functions. Note that the API facing rte_vhost_log_write is not modified. So, make explicit that it expects the address in GPA space. Fixes: `69c90e98f4` ("vhost: enable IOMMU support") Cc: stable@dpdk.org Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Adrian Moreno	fbda9f1459	vhost: translate incoming log address to GPA When IOMMU is enabled the incoming log address is in IOVA space. In that case, look in IOTLB table and translate the resulting HVA to GPA. If IOMMU is not enabled, the incoming log address is already a GPA so no transformation is needed. Fixes: `69c90e98f4` ("vhost: enable IOMMU support") Cc: stable@dpdk.org Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Joyce Kong	2c661d418e	net/virtio: improve perf via one-way barriers on used flag In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend and backend are assumed to be implemented in software, that is they can run on identical CPUs in an SMP configuration. Thus a weak form of memory barriers like rte_smp_r/wmb, other than rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1) and yields better performance. For the above case, this patch helps yielding even better performance by replacing the two-way barriers with C11 one-way barriers for used flags in packed ring. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Joyce Kong	6094557de0	net/virtio: improve perf via one-way barrier on avail flag In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend and backend are assumed to be implemented in software, that is they can run on identical CPUs in an SMP configuration. Thus a weak form of memory barriers like rte_smp_r/wmb, other than rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1) and yields better performance. For the above case, this patch helps yielding even better performance by replacing the two-way barriers with C11 one-way barriers for avail flags in packed ring. Meanwhile, a read barrier is required to ensure ordering between descriptor's flags and content reads [1]. With C11, load-acquire can enforce the ordering instead of rmb barrier. [1] https://patchwork.dpdk.org/patch/49109/ Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-23 16:43:09 +02:00
Haiyue Wang	eb5902504a	ethdev: add API for getting burst mode information Some PMDs have more than one Rx/Tx burst paths, add the ethdev API that allows an application to retrieve the mode information about Rx/Tx packet burst such as Scalar or Vector, and Vector technology like AVX2. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:09 +02:00
Vivek Sharma	041dba5768	ethdev: fix QinQ offload Use correct flag for indicating QinQ strip rx offload. Fixes: `dfebfc9882` ("ethdev: support dynamic configuration of QinQ strip") Cc: stable@dpdk.org Signed-off-by: Vivek Sharma <viveksharma@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-23 16:43:08 +02:00
Kiran Kumar K	3266266db4	ethdev: add GTPU flow type Adding support to enable GTPU eth flow type for RSS hash index calculation. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:08 +02:00
Dekel Peled	790d6182c0	ethdev: add definitions for EEPROM standards This patch add definitions of maximal data length in module EEPROM, values are compatible with include/uapi/linux/ethtool.h. These definitions can be used by application to validate data length. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:08 +02:00
Kiran Kumar K	67f8d7b620	ethdev: add AH key field to flow API Add new rte_flow_item_ah in order to match the Authentication Header based on RFC 2402. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:08 +02:00
Kiran Kumar K	30f9f9f451	ethdev: add IGMP key field to flow API Add new rte_flow_item_igmp in order to match the Internet Group Management Protocol based on RFC 2236. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:08 +02:00
Kiran Kumar K	86e1974a42	ethdev: add NSH key field to flow API Add new rte_flow_item_nsh in order to match the network service header based on RFC 8300. Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-23 16:43:08 +02:00
Anatoly Burakov	3fe4bced1b	eal: use define instead of raw option name We are using '--base-virtaddr' in a few places. We have a define for that, so use it instead. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-25 11:35:10 +02:00
Anatoly Burakov	8f29a60764	eal/freebsd: support option --base-virtaddr According to our docs, only Linuxapp supports base-virtaddr option. That is, strictly speaking, not true because most of the things that are attempting to respect base-virtaddr are in common files, so FreeBSD already mostly supports this option in practice. This commit fixes the remaining bits to explicitly support base-virtaddr option, and moves the arg parsing from EAL to common options parsing code. Documentation is also updated to reflect that all platforms now support base-virtaddr. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-25 11:17:29 +02:00
Ruifeng Wang	5283392482	lib/distributor: fix deadlock on aarch64 Distributor and worker threads rely on data structs in cache line for synchronization. The shared data structs were not protected. This caused a deadlock on platforms with weaker memory ordering, such as aarch64. Fix this issue by adding memory barriers to ensure synchronization among cores. Bugzilla ID: 342 Fixes: `775003ad2f` ("distributor: add new burst-capable library") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: David Hunt <david.hunt@intel.com>	2019-10-25 10:20:31 +02:00
Fiona Trahe	80f5df0ae0	cryptodev: clarify wireless inputs in digest-encrypted cases Clarify constraints on fields specified in bits for wireless algorithms in digest-encrypted case. Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-10-23 16:57:06 +02:00
Vladimir Medvedkin	b2ee269267	ipsec: add SAD add/delete/lookup implementation Replace rte_ipsec_sad_add(), rte_ipsec_sad_del() and rte_ipsec_sad_lookup() stubs with actual implementation. It uses three librte_hash tables each of which contains an entries for a specific SA type (either it is addressed by SPI only or SPI+DIP or SPI+DIP+SIP) Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-23 16:57:06 +02:00
Vladimir Medvedkin	3feb23609c	ipsec: add SAD create/destroy implementation Replace rte_ipsec_sad_create(), rte_ipsec_sad_destroy() and rte_ipsec_sad_find_existing() API stubs with actual implementation. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-23 16:57:06 +02:00
Vladimir Medvedkin	401633d9c1	ipsec: add inbound SAD API According to RFC 4301 IPSec implementation needs an inbound SA database (SAD). For each incoming inbound IPSec-protected packet (ESP or AH) it has to perform a lookup within its SAD. Lookup should be performed by: Security Parameters Index (SPI) + destination IP (DIP) + source IP (SIP) or SPI + DIP or SPI only and an implementation has to return the 'longest' existing match. This patch extends DPDK IPsec library with inbound security association database (SAD) API implementation that: - conforms to the RFC requirements above - can scale up to millions of entries - supports fast lookups - supports incremental updates Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-23 16:57:06 +02:00
Julien Meunier	3dd4435cf4	cryptodev: fix checks related to device id Each cryptodev is indexed with dev_id in the global rte_crypto_devices variable. nb_devs is incremented / decremented each time a cryptodev is created / deleted. The goal of nb_devs was to prevent the user from getting an invalid dev_id. Let's imagine DPDK has configured N cryptodevs. If the cryptodev=1 is removed at runtime, the latest cryptodev N cannot be accessible, because nb_devs=N-1 with the current implementation. In order to prevent this kind of behavior, let's remove the check with nb_devs and iterate in all the rte_crypto_devices elements: if data is not NULL, that means a valid cryptodev is available. Also, remove max_devs field and use RTE_CRYPTO_MAX_DEVS in order to unify the code. Fixes: `d11b0f30df` ("cryptodev: introduce API and framework for crypto devices") Cc: stable@dpdk.org Signed-off-by: Julien Meunier <julien.meunier@nokia.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-10-23 16:57:06 +02:00
Arek Kusztal	f2b2a44971	cryptodev: add asymmetric session-less This commit adds asymmetric session-less option to rte_crypto_asym_op. Feature flag for session-less is added to rte_cryptodev. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-10-23 16:57:06 +02:00
David Marchand	8e35792c53	eal: remove dead code on NUMA node detection RTE_EAL_ALLOW_INV_SOCKET_ID had been introduced and documented as used with xen dom0 support (dropped for some time now). Closely looking at this, the code was changed later and ensures that the socket id is in the [0..RTE_MAX_NUMA_NODES] range anyway. Let's drop this dead code and the build option with it. Fixes: `94ef296414` ("eal/linux: fix numa node detection") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-24 14:15:28 +02:00
David Christensen	ed5d3d5cdb	eal/linux: restore specific hugepage ordering for ppc An ifdef present in eal_memory.c references "RTE_ARCH_PPC64" when it should actually use "RTE_ARCH_PPC_64". Simple testing revealed that both the PPC_64 and non-PPC_64 versions of the code involved work, but the PPC_64 version of the code is retained to be consistent with other instances in the same file where mmapped memory is accessed in reverse order on Power platforms. Fixes: `66cc45e293` ("mem: replace memseg with memseg lists") Cc: stable@dpdk.org Signed-off-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-24 14:15:10 +02:00
Morten Brørup	0f824df6f8	mbuf: add bulk free function Add function for freeing a bulk of mbufs. Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-24 02:45:40 +02:00
Bruce Richardson	47cce54ba8	build: allow stricter fallthrough warnings DPDK currently compiles with implicit-fallthrough=2 warning level. With gcc -Wextra flag, the default level is 3, so some minor changes are needed to support this in DPDK. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-10-24 01:02:30 +02:00
Bruce Richardson	7f8f7f4d0a	build: process dependencies before main build check If we want to add support for turning off components because of missing dependencies, then we need to check for those dependencies before we make a determination as to whether a component should be built or not, assuming that the component says it should be built. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-24 01:02:28 +02:00
Bruce Richardson	ae783b42c4	build: print out dependency names for clarity To help developers to get the correct dependency name e.g. when creating a new example that depends on a specific component, print out the dependency name for each lib/driver as it is processed. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-10-23 16:41:06 +02:00
Nipun Gupta	b21302a107	eventdev: add Tx flag for packets with same destination This patch introduces a `flag` in the Eth TX adapter enqueue API. Some drivers may support burst functionality only with the packets having same destination device and queue. The flag `RTE_EVENT_ETH_TX_ADAPTER_ENQUEUE_SAME_DEST` can be used to indicate this to the underlying driver, so that drivers can utilize burst functionality appropriately. Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2019-10-18 10:03:08 +02:00
David Marchand	08be0e0b68	rcu: fix reference to offline function Fixes: `64994b56cf` ("rcu: add RCU library supporting QSBR mechanism") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2019-10-21 21:21:30 +02:00
Honnappa Nagarahalli	33466e0fe1	rcu: update QS only when there are updates from writer When the writer is checking the quiescent state status, it is not deleting any entries in the data structure. This means, the readers do not need to update their quiescent state during that period. Readers update the quiescent state only when there are updates available from the writer. Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2019-10-21 17:54:41 +02:00
Honnappa Nagarahalli	1f90d32ce1	rcu: add least acknowledged token optimization When the rte_rcu_qsbr_check API is called, it is possible to calculate the least valued token acknowledged by all the readers. When the API is called next time, the readers' token counters do not need to be scanned if the value of the token being queried is less than the last least token acknowledged. This avoids the cache line bounces between readers and writer. Fixes: `64994b56cf` ("rcu: add RCU library supporting QSBR mechanism") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2019-10-21 17:54:40 +02:00
David Marchand	384b0a33fe	clean bare metal support traces Bare metal support has been gone for quite some time but we still had some checks on system includes. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-10-21 16:19:00 +02:00
Phil Yang	7911ba0473	stack: enable lock-free implementation for aarch64 Enable both C11 atomic and non C11 atomic lock-free stack for aarch64. Introduced a new header to reduce the ifdef clutter across generic and C11 files. The rte_stack_lf_stubs.h contains stub implementations of __rte_stack_lf_count, __rte_stack_lf_push_elems and __rte_stack_lf_pop_elems. Suggested-by: Gage Eads <gage.eads@intel.com> Suggested-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-10-21 10:15:57 +02:00
Phil Yang	7e2c3e17fe	eal/arm64: add 128-bit atomic compare exchange This patch adds the implementation of the 128-bit atomic compare exchange API on aarch64. Using 64-bit 'ldxp/stxp' instructions can perform this operation. Moreover, on the LSE atomic extension accelerated platforms, it is implemented by 'casp' instructions for better performance. Since the '__ARM_FEATURE_ATOMICS' flag only supports GCC-9, this patch adds a new config flag 'RTE_ARM_FEATURE_ATOMICS' to enable the 'cas' version on older version compilers. For octeontx2, we make sure that the lse (and other) extensions are enabled even if the compiler does not know of the octeontx2 target cpu. Since direct x0 register used in the code and cas_op_name() and rte_atomic128_cmp_exchange() is inline function, based on parent function load, it may corrupt x0 register aka break aarch64 ABI. Define CAS operations as rte_noinline functions to avoid an ABI break [1]. 1: https://git.dpdk.org/dpdk/commit/?id=5b40ec6b9662 Suggested-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-21 10:06:13 +02:00
Jim Harris	b30b134f82	eal: calibrate TSC only in primary process This ensures secondary processes never have to calculate the TSC rate themselves, which can be noticeable in VMs that don't have access to arch-specific detection mechanism (such as CPUID leaf 0x15 or MSR 0xCE on x86). Since rte_mem_config is now internal to the EAL library, we can add tsc_hz without ABI breakage concerns. Reduces rte_eal_init() execution time in a secondary process from 165ms to 66ms on my test system. Signed-off-by: Jim Harris <james.r.harris@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-10-18 13:23:10 +02:00
Ruifeng Wang	b36f587f01	rcu: fix spurious thread unregister Thread unregister returns success even though the unregister has not been performed. This is due to an incorrect thread registration status check. Fix this issue by correcting the bitmap check. Fixes: `64994b56cf` ("rcu: add RCU library supporting QSBR mechanism") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-18 06:13:36 +02:00
Nikhil Rao	e484ccddbe	service: avoid false sharing on core state For a valid service, the core mask of the service is checked against the current core and the corresponding entry in the active_on_lcore array is set or reset. Up to 8 cores share the same cache line for their service active_on_lcore array entries since each entry is a uint8_t. Some number of these entries also share the cache line with the internal_flags member of struct rte_service_spec_impl, hence this false sharing also makes the service_valid() check expensive. Eliminate false sharing by moving the active_on_lcore array to a per-core data structure. The array is now indexed by service id. Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Acked-by: Gage Eads <gage.eads@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2019-10-18 06:09:24 +02:00
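The layout change described in e484ccddbe can be sketched as follows. This is a minimal illustration, not the librte_eal service code: `struct core_state`, `set_service_active()`, `CACHE_LINE_SIZE` and `MAX_SERVICES` are hypothetical names, and a 64-byte cache line is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed cache line size */
#define MAX_SERVICES 64    /* assumed service limit, for illustration */

/*
 * Hypothetical per-core state. The array lives in the core's own
 * structure and is indexed by service id, so a core only ever
 * writes to cache lines it owns.
 */
struct core_state {
	/* Alignment pads the struct to whole cache lines. */
	_Alignas(CACHE_LINE_SIZE) uint8_t active_on_service[MAX_SERVICES];
};

static_assert(sizeof(struct core_state) % CACHE_LINE_SIZE == 0,
	"each core's state must occupy whole cache lines");

/* Record whether service_id is currently active on lcore. */
static inline void
set_service_active(struct core_state *cores, unsigned int lcore,
		unsigned int service_id, int active)
{
	cores[lcore].active_on_service[service_id] = (uint8_t)(active != 0);
}
```

With one aligned block per core, a store by one core no longer invalidates a line that another core is reading or writing. The cost is more memory per core than a shared array of `uint8_t` entries.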
Jim Harris	c1077933d4	timer: remove useless check on x86 TSC reliability This code was added 7+ years ago in commit `fb022b85ba` ("timer: check TSC reliability") presumably when variant TSCs were still somewhat common. But this code doesn't do anything except print a warning, and the warning doesn't give any kind of advice to the user, so let's just remove it. While the warning has no functional meaning, the /proc/cpuinfo parsing consumes a non-trivial amount of time which is especially noticeable in secondary processes. On my test system, it consumes 21ms out of the 66ms total execution time for rte_eal_init() in a secondary process. Signed-off-by: Jim Harris <james.r.harris@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-10-17 09:47:42 +02:00
Hemant Agrawal	ad4305d0d5	eal/ppc: add SPDX license tag Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: David Christensen <drc@linux.vnet.ibm.com>	2019-10-17 06:59:15 +02:00
David Christensen	72e69d801b	eal/ppc: fix 64-bit atomic exchange operation The rte_atomic64_exchange operation for ppc_64 incorrectly linked back to a 32 bit generic operation (__atomic_exchange_4) rather than the 64 bit generic operation (__atomic_exchange_8). As a result, applications that used rte_eth_link_get_nowait() would only receive the link speed, they would not receive the link state, link duplex, or link autoneg properties. Fixes: `ff2863570f` ("eal: introduce atomic exchange operation") Cc: stable@dpdk.org Signed-off-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-17 06:59:11 +02:00
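The effect of the operand width in 72e69d801b can be shown with a small sketch. It is not the DPDK ppc_64 code: `exchange64()` is a hypothetical wrapper, and it uses the type-generic GCC/Clang builtin `__atomic_exchange_n` rather than the size-specific `__atomic_exchange_8` named in the commit message.

```c
#include <stdint.h>

/*
 * Hypothetical wrapper, not the DPDK implementation.
 * With a 64-bit operand the builtin exchanges all eight bytes.
 * A 4-byte exchange would replace only half of the value, which is
 * how part of a 64-bit packed structure can be left unwritten.
 */
static inline uint64_t
exchange64(volatile uint64_t *dst, uint64_t val)
{
	return __atomic_exchange_n(dst, val, __ATOMIC_SEQ_CST);
}
```

The call returns the previous contents of `*dst` and stores `val` in full, so both the upper and lower 32 bits change in one atomic step.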
Stephen Hemminger	c3a90c381d	mbuf: add a copy routine This is a commonly used operation that surprisingly the DPDK has not supported. The new rte_pktmbuf_copy does a deep copy of packet. This is a complete copy including meta-data. It handles the case where the source mbuf comes from a pool with larger data area than the destination pool. The routine also has options for skipping data, or truncating at a fixed length. This patch also introduces internal inline to copy the metadata fields of mbuf. Add a test for this new function, based of the clone tests. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-16 12:43:53 +02:00
Stephen Hemminger	1d2db47c9f	mbuf: deinline clone function Cloning mbufs requires allocations and iteration and therefore should not be an inline. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-16 12:42:04 +02:00
Stephen Hemminger	6b1dd3be54	mbuf: deinline linearize function The copy part of this function is too big to be inlined. The places it is used are only in special exception paths where a highly fragmented mbuf arrives at a device that can't handle it. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-10-16 12:42:04 +02:00
Olivier Matz	a2b5a8722f	mempool: clarify default populate function No functional change. Clarify the populate function to make future changes easier to understand. Rename the variables: - to avoid negation in the name - to have more understandable names Remove useless variable (no_pageshift is equivalent to pg_sz == 0). Remove duplicate affectation of "external" variable. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-16 10:41:21 +02:00
Xiaolong Ye	b34801d1aa	kni: support allmulticast mode set This patch adds support to allow users enable/disable allmulticast mode for kni interface. This requirement comes from bugzilla 312, more details can refer to: https://bugs.dpdk.org/show_bug.cgi?id=312 Bugzilla ID: 312 Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-15 21:16:32 +02:00
Stephen Hemminger	a8f8ae1cf9	service: use log for error messages EAL should always use rte_log instead of putting errors to stderr (which may be redirected to /dev/null in a daemon). Also checks for null before rte_free are unnecessary. Minor code consistency improvements. Fixes: `21698354c8` ("service: introduce service cores concept") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2019-10-15 20:37:11 +02:00
Arnon Warshavsky	75dbb45f28	eal: fix mapping leak in secondary process Have rte_eal_config_reattach clean up the mapped address which is a valid address but not the one intended. Coverity issue: 343439 Fixes: `4e8854ae89` ("eal: do not panic on shared memory init") Fixes: `b149a70642` ("eal/freebsd: add config reattach in secondary process") Cc: stable@dpdk.org Signed-off-by: Arnon Warshavsky <arnon@qwilt.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-10-15 20:37:11 +02:00
Jim Harris	773a860aef	vfio: fix leak with multiprocess The code checks both rte_mp_request_sync() return code and that the number of messages in the reply equals 1. If rte_mp_request_sync() succeeds but there was more than one message, those messages would get leaked. Found via code review by Anatoly Burakov of patches that used the vhost code as a template for using rte_mp_request_sync(). Fixes: `83a73c5fef` ("vfio: use generic multi-process channel") Cc: stable@dpdk.org Reported-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-15 20:36:58 +02:00
Jerin Jacob	7fa2537226	bpf: hide internal program argument type RTE_BPF_ARG_PTR_STACK is used as internal program arg type. Rename to RTE_BPF_ARG_RESERVED to avoid exposing internal program type. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:27:19 +02:00
Jerin Jacob	082482cef4	bpf/arm: add branch operation Add branch and call operations. jump_offset_* APIs used for finding the relative offset to jump w.r.t current eBPF program PC. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:29 +02:00
Jerin Jacob	2acfae37f6	bpf/arm: add atomic-exchange-and-add operation Implement XADD eBPF instruction using STADD arm64 instruction. If the given platform does not have atomics support, use LDXR and STXR pair for critical section instead of STADD. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:29 +02:00
Jerin Jacob	e00906bdc7	bpf/arm: add load and store operations Add load and store operations. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:29 +02:00
Jerin Jacob	2b6d22fa9a	bpf/arm: add byte swap operations add le16, le32, le64, be16, be32 and be64 operations. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:29 +02:00
Jerin Jacob	9f4469d9e8	bpf/arm: add logical operations Add OR, AND, NEG, XOR, shift operations for immediate and source register variants. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:29 +02:00
Jerin Jacob	111e2a747a	bpf/arm: add basic arithmetic operations Add mov, add, sub, mul, div and mod arithmetic operations for immediate and source register variants. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:28 +02:00
Jerin Jacob	f3e5167724	bpf/arm: add prologue and epilogue Add prologue and epilogue as per arm64 procedure call standard. As an optimization the generated instructions are the function of whether eBPF program has stack and/or CALL class. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:25 +02:00
Jerin Jacob	6861c01001	bpf/arm: add build infrastructure Add build infrastructure and documentation update for arm64 JIT support. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-12 14:20:21 +02:00
David Marchand	7dde68cf0e	net: add missing rte prefix for ESP tail This structure has been missed during the big rework. Fixes: `5ef2546767` ("net: add rte prefix to ESP structure") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-08 12:14:31 +02:00
Simei Su	d172886440	ethdev: add symmetric Toeplitz hash Currently, there are DEFAULT,TOEPLITZ and SIMPLE_XOR hash function. To support symmetric hash by rte_flow RSS action, this patch adds new hash function "Symmetric Toeplitz" which is supported by some hardware. Signed-off-by: Simei Su <simei.su@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:58 +02:00
Ying A Wang	226c6e60c3	ethdev: add PPPoE to flow API - RTE_FLOW_ITEM_TYPE_PPPOES: matches a PPPoE session header. - RTE_FLOW_ITEM_TYPE_PPPOED: matches a PPPoE discovery header. - RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID: matches a PPPoE session protocol identifier. Signed-off-by: Ying A Wang <ying.a.wang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 15:00:58 +02:00
Ying A Wang	346553db5b	ethdev: add GTP extension header to flow API - RTE_FLOW_ITEM_TYPE_GTP_PSC: matches a GTP PDU extension header (PDU session container). Signed-off-by: Ying A Wang <ying.a.wang@intel.com> Acked-by: Ori Kam <orika@mellanox.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 15:00:58 +02:00
Adrian Moreno	5d9dc18e1b	vhost: fix vring memory partially mapped Only the mapping of the vring addresses is being ensured. This causes errors when the vring size is larger than the IOTLB page size. E.g. queue sizes > 256 for 4K IOTLB pages. Ensure the entire vring memory range gets mapped. Refactor duplicated code for IOTLB UPDATE and IOTLB INVALIDATE and add packed virtqueue support. Fixes: `09927b5249` ("vhost: translate ring addresses when IOMMU enabled") Cc: stable@dpdk.org Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:57 +02:00
Tiwei Bie	4e0de8dac8	vhost: protect vring access done by application Besides the enqueue/dequeue API, other APIs of the builtin net backend should also be protected. Fixes: `a368804699` ("vhost: protect active rings from async ring changes") Cc: stable@dpdk.org Reported-by: Peng He <xnhp0320@icloud.com> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:57 +02:00
Tiwei Bie	72d002b3eb	vhost: fix vring address handling during live migration When live migration starts, QEMU will set ring addrs again for each virtqueue. In this case, we should try to translate ring addrs after invalidating the ring, otherwise virtqueues can be enabled with the addrs untranslated. Besides, also leverage the access_ok flag in non-IOMMU case to prevent the data path accessing invalidated virtqueues. Fixes: `5a4933e56b` ("vhost: postpone ring address translations at kick time only") Cc: stable@dpdk.org Reported-by: Yilong Lv <lvyilong.lyl@alibaba-inc.com> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:57 +02:00
Tiwei Bie	37f7c1b609	vhost: forbid reallocation when running When the device has been started, don't do the reallocation anymore. Otherwise the pointers used in application threads can be invalidated without proper protection. Instead of introducing a global lock to protect the change of device pointers which will hurt the performance, let's just do the reallocation during setup. Fixes: `af295ad469` ("vhost: realloc device and queues to same numa node as vring desc") Cc: stable@dpdk.org Reported-by: Yinan Wang <yinan.wang@intel.com> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:57 +02:00
Jim Harris	61af1713d3	vhost: add missing experimental flag This function is listed under EXPERIMENTAL in the rte_vhost_version.map, so it needs to be marked with __rte_experimental in the header file as well. Found by check-experimental-syms.sh when trying to compile DPDK with -finstrument-functions. This script didn't catch this in the normal case, since the function is declared __rte_always_inline. This also requires updating the vhost_scsi example to allow use of this newly marked experimental API. Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:57 +02:00
Andrew Rybchenko	de5ccf0775	ethdev: do nothing if all-multicast mode is applied again Since driver callbacks return status code now, there is no necessity to enable or disable all-multicast mode once again if it is already successfully enabled or disabled. Configuration restore at startup tries to ensure that configured all-multicast mode is applied and start will return error if it fails. Also it avoids theoretical cases when already configured all-multicast mode is applied once again and fails. In these cases it is unclear which value should be reported on get (configured or opposite). Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:55 +02:00
Ivan Ilchenko	ca041cd44f	ethdev: change allmulticast callbacks to return status Enabling/disabling of allmulticast mode is not always successful and it should be taken into account to be able to handle it properly. When correct return status is unclear from driver code, -EAGAIN is used. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2019-10-07 15:00:55 +02:00
Ivan Ilchenko	4b0db43df3	ethdev: change allmulticast mode API to return errors Change rte_eth_allmulticast_enable()/rte_eth_allmulticast_disable() return value from void to int and return negative errno values in case of error conditions. Modify usage of these functions across the ethdev according to new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:55 +02:00
Igor Romanov	fd2d28fcb5	ethdev: change owner delete function to return int Change rte_eth_dev_owner_delete() return value from void to int and return negative errno values in case of error conditions. Right now there is only one error case for rte_eth_dev_owner_delete() - invalid owner, but it still makes sense to return error to catch bugs in the code which uses the function. Also update the usage of the function in drivers/netvsc according to the new return type. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:55 +02:00
Igor Romanov	1cde5e0aca	ethdev: change MAC address get function to return int Change rte_eth_macaddr_get() return value from void to int and return negative errno values in case of error conditions. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:54 +02:00
Igor Romanov	4633c3b2eb	ethdev: change link status get functions to return int Change rte_eth_link_get() and rte_eth_link_get_nowait() return value from void to int and return negative errno values in case of error conditions. Return value of link_update callback is ignored since the callback returns not errors but whether link up status has changed or not. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:54 +02:00
Igor Romanov	9970a9ad07	ethdev: make stats and xstats reset callbacks return int Change return value of the callbacks from void to int. Make implementations across all drivers return negative errno values in case of error conditions. Both callbacks are updated together because a large number of drivers assign the same function to both callbacks. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 15:00:54 +02:00
Igor Romanov	da328f7f11	ethdev: change xstats reset function to return int Change rte_eth_xstats_reset() return value from void to int and return negative errno values in case of error conditions. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 15:00:54 +02:00
Ivan Ilchenko	b57e35d6e9	kni: check code of promiscuous mode switch rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable() return value was changed from void to int, so modify usage of these functions across lib/librte_kni according to new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:54 +02:00
Andrew Rybchenko	3c9b7f5131	ethdev: do nothing if promiscuous mode is applied again Since driver callbacks return status code now, there is no necessity to enable or disable promiscuous mode once again if it is already successfully enabled or disabled. Configuration restore at startup tries to ensure that configured promiscuous mode is applied and start will return error if it fails. Also it avoids theoretical cases when already configured promiscuous mode is applied once again and fails. In these cases it is unclear which value should be reported on get (configured or opposite). Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:54 +02:00
Andrew Rybchenko	9039c81257	ethdev: change promiscuous callbacks to return status Enabling/disabling of promiscuous mode is not always successful and it should be taken into account to be able to handle it properly. When correct return status is unclear from driver code, -EAGAIN is used. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2019-10-07 15:00:54 +02:00
Ivan Ilchenko	69d0e70928	ethdev: change promiscuous mode controllers to return errors Change rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable() return value from void to int and return negative errno values in case of error conditions. Modify usage of these functions across the ethdev according to new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-10-07 15:00:54 +02:00
Tiwei Bie	761d57651c	vhost: fix slave request fd leak We need to close the old slave request fd if any first before taking the new one. Fixes: `275c3f9447` ("vhost: support slave requests channel") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:53 +02:00
Eelco Chaudron	039253166a	vhost: add device op when notification to guest is sent This patch adds an operation callback which gets called every time the library is waking up the guest through an eventfd_write() call. This can be used by 3rd party applications, like OVS, to track the number of times interrupts were generated. This might be of interest to find out whether system calls were made in the fast path. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-10-07 15:00:53 +02:00
Ivan Ilchenko	bdad90d12e	ethdev: change device info get callback to return int Change eth_dev_infos_get_t return value from void to int. Make eth_dev_infos_get_t implementations across all drivers return negative errno values in case of error conditions. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 14:45:35 +02:00
Ivan Ilchenko	3e09529f97	pdump: check status of getting ethdev info rte_eth_dev_info_get() return value was changed from void to int, so this patch modifies rte_eth_dev_info_get() usage across pdump component according to its new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 14:45:35 +02:00
Ivan Ilchenko	d00a52acf9	latency: check status of getting ethdev info rte_eth_dev_info_get() return value was changed from void to int, so this patch modifies rte_eth_dev_info_get() usage across latency component according to its new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 14:45:35 +02:00
Ivan Ilchenko	4f25d7d225	ethdev: add return code to device info get function Change rte_eth_dev_info_get() return value from void to int and return negative errno values in case of error conditions. Modify rte_eth_dev_info_get() usage across the ethdev according to new return type. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 14:43:50 +02:00
Julien Meunier	1a60db7f35	cryptodev: fix initialization on multi-process The primary process is responsible for initializing the data struct of each crypto device. A secondary process should not override this data during initialization. Fixes: `d11b0f30df` ("cryptodev: introduce API and framework for crypto devices") Cc: stable@dpdk.org Signed-off-by: Julien Meunier <julien.meunier@nokia.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-10-09 11:50:12 +02:00
Akhil Goyal	badac76cec	security: add HFN override option in PDCP HFN can be given as a per packet value also. As we do not have IV in case of PDCP, and HFN is used to generate IV. IV field can be used to get the per packet HFN while enq/deq If hfn_ovrd field in pdcp_xform is set, application is expected to set the per packet HFN in place of IV. Driver will extract the HFN and perform operations accordingly. Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-10-09 11:50:12 +02:00
Radu Nicolau	382df9dfb6	security: fix doxygen fields Replace /**< with /** for multiline doxygen comments. Fixes: `c261d1431b` ("security: introduce security API and framework") Cc: stable@dpdk.org Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Anoob Joseph <anoobj@marvell.com>	2019-10-09 11:50:12 +02:00
Radu Nicolau	9404e0138d	security: add IPsec statistics Update IPsec statistics struct definition, add per SA statistics collection enable flag. Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Anoob Joseph <anoobj@marvell.com>	2019-10-09 11:50:12 +02:00
David Marchand	8ac3591694	remove useless include of EAL memory config header Restrict this header inclusion to its real users. Fixes: `028669bc9f` ("eal: hide shared memory config") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-09 10:22:24 +02:00
Xiaolong Ye	e8c7df5d7d	ethdev: fix typos for ENOTSUP Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-09-05 18:34:45 +02:00
David Marchand	fbb25a3878	ethdev: fix endian annotation for SPI item Security Parameters Index (SPI) should be set with network endian values. While 0xffffffff == htonl(0xffffffff), this missing annotation is caught by sparse when compiling ovs (dpdk-latest branch). Fixes: `d4b684f719` ("net: add ESP header to generic flow steering") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-08-27 15:15:00 +02:00
Andrew Rybchenko	3c5f2cdb56	ethdev: fix doc reference to FDIR disabled mode There is no RTE_FDIR_DISABLE. The right name is RTE_FDIR_MODE_NONE. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-08-27 14:25:57 +02:00
Gagandeep Singh	47caefc163	eal: increase maximum different hugepage sizes on Arm Arm has long supported up to 4 hugepage sizes (64K, 2M, 32M and 1G) with a 4KB granule, whereas DPDK supports at most 3 hugepage sizes. With all 4 hugepage sizes enabled, applications and some stacks like VPP which run over DPDK and use the "in-memory" eal option, or use separate mount points on Arm based platforms, fail at hugepage initialization, reporting error messages from eal: EAL: FATAL: Cannot get hugepage information. EAL: Cannot get hugepage information. EAL: Error - exiting with code: 1 This issue originates from Linux 5.0 (a21b0b78eaf7 "arm64: hugetlb: Register hugepages during arch init") where the kernel by default creates directories for each supported hugepage size in /sys/kernel/mm/hugepages/ On earlier Stable Kernel LTR's, the directories visible in /sys/kernel/mm/hugepages/ depended upon which hugepage sizes were configured at boot time. This change increases the maximum supported hugepage sizes to 4 for Arm based platforms. Cc: stable@dpdk.org Signed-off-by: Gagandeep Singh <g.singh@nxp.com> Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>	2019-08-08 17:25:14 +02:00
David Christensen	8e3cb36d5b	replace license text with SPDX tag on PPC files Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>	2019-08-05 17:17:09 +02:00
Maxime Coquelin	bff3b9a80e	vhost: replace IOTLB license with SPDX tag Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2019-08-05 16:06:11 +02:00
David Marchand	4d05ac955e	ethdev: sort experimental symbols per release Sort the experimental symbols per release to make it easier/quicker to check for how long we have them. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2019-08-05 12:20:53 +02:00
David Marchand	ba5d78da70	eal: hide internal function This function has never been used outside of this code unit. Mark it static and remove it from the eal internal header. Fixes: `9e29251b2a` ("eal: thread affinity API") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-08-05 11:47:22 +02:00
Thomas Monjalon	5113798fbf	eal: warn on legacy memory allocation requirement When using --no-huge mode, dynamic allocation is not supported. Because of this limitation, the option --legacy-mem is implied and -m may be needed to specify the amount of memory to allocate. Otherwise the default amount MEMSIZE_IF_NO_HUGE_PAGE will be allocated. The option --socket-mem can also be used with --legacy-mem when hugepages are supported. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-08-05 11:26:57 +02:00
Aaron Conole	5924381221	acl: fix undefined behavior of bit shifts Left-shift of an integer constant is represented as 'int' type, but a left shift of 1 by 31 bits in 'int' is undefined. Use the U suffix to force a representation as unsigned. Caught while running with ubsan under gcc. Fixes: `dc276b5780` ("acl: new library") Cc: stable@dpdk.org Signed-off-by: Aaron Conole <aconole@redhat.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-08-02 22:07:07 +02:00
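The undefined behaviour fixed in 5924381221 can be shown in isolation. The following is a minimal sketch, not the librte_acl code: `bit_mask()` is a hypothetical helper, and a 32-bit `int` is assumed.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of librte_acl.
 * The constant 1 has type int, so (1 << 31) overflows a 32-bit int,
 * which is undefined behaviour. 1U has type unsigned int, so the
 * shift is well defined for any n from 0 to 31.
 */
static inline uint32_t
bit_mask(unsigned int n)
{
	return 1U << n;
}
```

Building with `-fsanitize=undefined` under GCC or Clang reports the signed variant at run time, which matches how the commit message says the problem was found.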
David Marchand	b6be16acfe	eal: fix control thread affinity with --lcores The ctrl thread cpu affinity setting has been broken when using --lcores. Using -l/-c options makes each lcore associated to a physical cpu in a 1:1 fashion. On the contrary, when using --lcores, each lcore cpu affinity can be set to a list of any online cpu on the system. To handle both cases, each lcore cpu affinity is considered and removed from the process startup cpu affinity. Introduced macros to manipulate dpdk cpu sets in both Linux and FreeBSD. Examples on a 8 cores Linux system: $ cd /sys/fs/cgroup/cpuset/ $ mkdir dpdk $ cd dpdk $ echo 4-7 > cpuset.cpus $ echo 0 > cpuset.mems $ echo $$ > tasks Before the fix: $ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \ --no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048 8427 cpu_list=4-5,7 testpmd 8428 cpu_list=4-6 eal-intr-thread 8429 cpu_list=4-6 rte_mp_handle 8430 cpu_list=4-5,7 lcore-slave-7 $ taskset -c 7 \ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \ --no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048 EAL: Detected 8 lcore(s) EAL: Detected 1 NUMA nodes EAL: Failed to create thread for interrupt handling EAL: FATAL: Cannot init interrupt-handling thread EAL: Cannot init interrupt-handling thread PANIC in main(): Cannot init EAL After the fix: $ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \ --no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048 15214 cpu_list=4-5,7 testpmd 15215 cpu_list=6 eal-intr-thread 15216 cpu_list=6 rte_mp_handle 15217 cpu_list=4-5,7 lcore-slave-7 $ taskset -c 7 \ ./master/app/testpmd --master-lcore 0 --lcores '(0,7)@(7,4,5)' \ --no-huge --no-pci -m 512 -- -i --total-num-mbufs=2048 15297 cpu_list=4-5,7 testpmd 15298 cpu_list=4-5,7 eal-intr-thread 15299 cpu_list=4-5,7 rte_mp_handle 15300 cpu_list=4-5,7 lcore-slave-7 Bugzilla ID: 322 Fixes: `c3568ea376` ("eal: restrict control threads to startup CPU affinity") Cc: stable@dpdk.org Reported-by: Johan Källström <johan.kallstrom@ericsson.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-07-30 23:11:19 +02:00
Anatoly Burakov	79a0bbe5b6	eal: pick IOVA as PA if IOMMU is not available When IOMMU is not available, /sys/kernel/iommu_groups will not be populated. This is happening since at least 3.6 when VFIO support was added. If the directory is empty, EAL should not pick IOVA as VA as the default IOVA mode. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-07-30 10:09:13 +02:00
Liang Ma	96c343868d	distributor: fix livelock on flush The Distributor autotest can lock if run enough times. Worker and distributor threads get into a livelock situation waiting on each other. To repeat: `while sudo sh -c "echo 'distributor_autotest' \| ./build/app/test/dpdk-test"; do :; done` The root cause is that, when flushing on exit, we do not wait for all worker packets to be returned before exiting. Add a delay on flush so that all worker packets are returned before completing the flush. Bugzilla ID: 316 Fixes: `775003ad2f` ("distributor: add new burst-capable library") Cc: stable@dpdk.org Reported-by: Michael Santana <msantana@redhat.com> Signed-off-by: David Hunt <david.hunt@intel.com> Signed-off-by: Liang Ma <liang.j.ma@intel.com> Tested-by: Michael Santana <msantana@redhat.com>	2019-07-29 23:23:37 +02:00
Hemant Agrawal	86c3cb2245	eal: replace license text with SPDX tag Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2019-07-29 22:55:14 +02:00
David Marchand	e27071744d	service: remove experimental banner in API description This was missed when promoting this API to stable. Fixes: `7a0ac7cdb4` ("service: promote experimental functions to stable") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Gage Eads <gage.eads@intel.com>	2019-07-29 22:38:50 +02:00
David Marchand	e2769068e5	eal: sort experimental symbols per release Sort the experimental symbols per release to make it easier/quicker to check for how long we have them. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Michael Santana <msantana@redhat.com>	2019-07-29 22:26:10 +02:00
Sean Morrissey	9f83b62f55	eal: revert parsing option --telemetry This reverts commit `debacba029`. Reverting this patch as it currently breaks the initialization of telemetry, more investigation is ongoing to fix the issue for the printed error message for unrecognized argument. Fixes: `debacba029` ("eal: fix parsing option --telemetry") Cc: stable@dpdk.org Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2019-07-29 22:24:59 +02:00
Xiaolong Ye	85458c2f3e	ethdev: replace license text with SPDX tag Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-29 15:07:48 +02:00
Xiaolong Ye	7cf68e4ec4	eal: replace license text with SPDX tag Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-07-29 13:31:45 +02:00
Xiaolong Ye	9375067d3f	hash: replace license text with SPDX tag Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-07-29 13:31:45 +02:00
Xiaolong Ye	84adcc9c39	lpm: replace license text with SPDX tag Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-29 13:31:45 +02:00
Fiona Trahe	760d966ab3	compressdev: clarify destination buffer size Clarify the corner case with incompressible data whereby the output can actually be greater than the uncompressed data. Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-07-29 10:18:24 +02:00
Ferruh Yigit	08644f1ad2	security: remove duplicated symbols from map file Fixes: `f63ffee26f` ("security: restore experimental tag for unimplemented APIs") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-26 16:03:21 +02:00
Ferruh Yigit	c417f59a6e	cryptodev: fix typo in comment Remove extra ';' which is probably added unintentionally, reported by ./devtools/check-includes.sh script. Fixes: `26008aaed1` ("cryptodev: add asymmetric xform and op definitions") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-26 15:48:52 +02:00
Anatoly Burakov	78a6d7ed19	vfio: use contiguous mapping for IOVA as VA mode When using IOVA as VA mode, there is no need to map segments page by page. This normally isn't a problem, but it becomes one when attempting to use DPDK in no-huge mode, where VFIO subsystem simply runs out of space to store mappings. Fix this for x86 by triggering different callbacks based on whether IOVA as VA mode is enabled. Fixes: `73a6390859` ("vfio: allow to map other memory regions") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Andrius Sirvys <andrius.sirvys@intel.com>	2019-07-23 20:47:14 +02:00
Andrew Rybchenko	b671987985	ethdev: avoid getting uninitialized info for bad port rte_eth_dev_info_get() returns void, so the caller does not know whether the function did its job or not. Changing the return value to int would be an API/ABI breakage, which requires a deprecation process and cannot be backported to stable branches. For now, make sure that device info is initialized even in the case of an invalid port ID. Fixes: `a30268e9a2` ("ethdev: reset whole dev info structure before filling") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-07-23 20:28:17 +02:00
Stephen Hemminger	d036e6a149	net: be more restrictive with ethernet address format The current ether_unformat_addr code was based off of BSD ether_aton. That version changed what was allowed by the cmdline ether address parser. For example, it allows dropping leading zeros. Change the code to be more restrictive and only allow the fully expanded standard formats. Bugzilla ID: 324 Fixes: `596d31092d` ("net: add function to convert string to ethernet address") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-23 14:31:35 +02:00
Dilshod Urazov	42caa11b37	net: adjust L2 length on soft VLAN insertion Layer 2 length must be updated after the prepend to mbuf to keep the length right to be used by other Tx offloads. If the packet has tunnel encapsulation, outer_l2_len should be updated. Otherwise l2_len should be updated. Fixes: `c974021a59` ("ether: add soft vlan encap/decap") Cc: stable@dpdk.org Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-23 14:31:35 +02:00
Nithin Dabilpuram	a159730c2f	eal: add ack interrupt API Add new ack interrupt API to avoid using VFIO_IRQ_SET_ACTION_TRIGGER(rte_intr_enable()) for acking interrupt purpose for VFIO based interrupt handlers. This implementation is specific to Linux. Using rte_intr_enable() for acking interrupt has below issues * Time consuming to do for every interrupt received as it will free_irq() followed by request_irq() and all other initializations * A race condition because of a window between free_irq() and request_irq() with packet reception still on and device still enabled and would throw warning messages like below. [158764.159833] do_IRQ: 9.34 No irq handler for vector In this patch, rte_intr_ack() is a no-op for VFIO_MSIX/VFIO_MSI interrupts as they are edge triggered and kernel would not mask the interrupt before delivering the event to userspace and we don't need to ack. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Shahed Shaikh <shshaikh@marvell.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-07-23 12:00:22 +02:00
Nithin Dabilpuram	33543fb3b6	vfio: revert interrupt eventfd setup at probe This reverts commit `89aac60e0b`. "vfio: fix interrupts race condition" The above mentioned commit moves the interrupt's eventfd setup to probe time but only enables one interrupt for all types of interrupt handles, i.e. VFIO_MSI, VFIO_LEGACY, VFIO_MSIX, UIO. It works fine in the default case but breaks the cases below, specifically for MSIX based interrupt handles. * Applications like l3fwd-power that request rxq interrupts during ethdev setup. * Drivers that need > 1 MSIx interrupts to be configured for functionality to work. VFIO PCI for MSIx expects all the possible vectors to be set up when using VFIO_IRQ_SET_ACTION_TRIGGER so that they can be allocated from the kernel pci subsystem. The only way to increase the number of vectors later is to first free all of them by using VFIO_IRQ_SET_DATA_NONE with action trigger and then enable the new vector count. The above commit changes the behavior of rte_intr_[enable\|disable] to only mask and unmask, unlike the earlier behavior, thereby breaking the above two scenarios. Fixes: `89aac60e0b` ("vfio: fix interrupts race condition") Cc: stable@dpdk.org Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Signed-off-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Shahed Shaikh <shshaikh@marvell.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-23 12:00:14 +02:00
Sean Morrissey	debacba029	eal: fix parsing option --telemetry Added telemetry to EAL long options so that when --telemetry is passed as an EAL arg that there is no unrecognized argument error message printed. Fixes: `8877ac688b` ("telemetry: introduce infrastructure") Cc: stable@dpdk.org Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Tested-by: John OLoughlin <john.oloughlin@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2019-07-22 17:58:35 +02:00
Jerin Jacob	bbe29a9bd7	eal/linux: select IOVA as VA mode for default case When bus layer reports the preferred mode as RTE_IOVA_DC then select the RTE_IOVA_VA mode: - All drivers work in RTE_IOVA_VA mode, irrespective of physical address availability. - By default, a mempool asks for IOVA-contiguous memory using RTE_MEMZONE_IOVA_CONTIG. This is slow in RTE_IOVA_PA mode and it may affect the application boot time. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-07-22 17:47:27 +02:00
David Marchand	b76fafb174	eal: fix IOVA mode selection as VA for PCI drivers The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA, which was intended to mean "driver only supports VA" but had been understood as "driver supports both PA and VA" by most net drivers and used to let dpdk processes run as non-root (which do not have access to physical addresses on recent kernels). The check on physical addresses actually closed the gap for those drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this flag can retain its intended meaning. Document explicitly its meaning. We can check that a driver requirement with respect to IOVA mode is fulfilled before trying to probe a device. Finally, document the heuristic used to select the IOVA mode and hope that we won't break it again. Fixes: `703458e19c` ("bus/pci: consider only usable devices for IOVA mode") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-22 17:45:52 +02:00
David Marchand	62f8f5ace5	bus/pci: remove Mellanox kernel driver type This reverts commit `0cb86518db`. The PCI bus now reports DC when faced with a device bound to an unknown driver and, in such a case, the IOVA mode is selected against physical address availability. As a consequence, there is no reason for this special case for Mellanox drivers. Fixes: `703458e19c` ("bus/pci: consider only usable devices for IOVA mode") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2019-07-22 17:44:08 +02:00
Jasvinder Singh	b767a6a213	sched: remove redundant macros Remove unused macros from the library, and update release notes. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:44:16 +02:00
Jasvinder Singh	3f2eaa4ce9	examples/ip_pipeline: add config flexibility to TM Update ip pipeline sample app for configuration flexibility of pipe traffic classes and queues. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:42:59 +02:00
Jasvinder Singh	b86ba6aa12	sched: improve doxygen comments Improve doxygen comments. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:41:47 +02:00
Jasvinder Singh	3cd66f0093	sched: improve error log messages Replace hard-coded numbers for reporting errors with error messages. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:41:36 +02:00
Jasvinder Singh	db06a17ede	sched: rename TC3 params to best-effort TC Change the traffic class 3 related params name to best-effort(be) traffic class. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:41:21 +02:00
Jasvinder Singh	6d27101561	sched: add max pipe profiles config in run-time Allow setting the maximum number of pipe profiles in run time. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:41:11 +02:00
Jasvinder Singh	ca4d5915b8	sched: add config flexibility to TC queue sizes Add support for zero queue sizes of the traffic classes. The queues which are not used can be set to zero size. This helps in reducing memory footprint of the hierarchical scheduler. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:41:04 +02:00
Jasvinder Singh	e16b06da09	sched: remove WRR from strict priority TC queues All higher priority traffic classes contain only one queue, thus remove the wrr function for them. The lowest priority best-effort traffic class continues to have multiple queues, and packets are scheduled from its queues using the wrr function. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Abraham Tovar <abrahamx.tovar@intel.com> Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>	2019-07-22 15:16:25 +02:00
Arek Kusztal	40dd1f42e1	cryptodev: remove RSA PKCS1 BT0 padding BT0 block type padding after rfc2313 has been discontinued. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-07-19 14:17:11 +02:00
Arek Kusztal	501ed9c661	cryptodev: add cipher field to RSA op The asymmetric nature of the RSA algorithm suggests using an additional field for output. In-place operations can still be done by setting the cipher and message pointers to the same memory address. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-07-19 14:17:11 +02:00
Arek Kusztal	2e728ed3e4	cryptodev: change RSA API comments about primes RSA modulus cannot be prime as its security depends on the problem of integer factorization. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-07-19 14:17:11 +02:00
Fan Zhang	186b14d685	cryptodev: make xform key pointer constant This patch changes the key pointer data types in cipher, auth, and aead xforms from "uint8_t *" to "const uint8_t *" for a more intuitive and safer session creation. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Liron Himi <lironh@marvell.com>	2019-07-19 14:15:21 +02:00
Ruifeng Wang	316095eb41	lpm: use atomic store to avoid partial update Compiler could generate non-atomic stores for whole table entry updating. This may cause incorrect nexthop to be returned, if the byte with valid flag is updated prior to the byte with nexthop is updated. Besides, field by field updating of table entries follow read-modify-write sequences. The operations are not atomic, nor efficient. And could cause entries out of synchronization. Changed to use atomic store to update whole table entry. Suggested-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Suggested-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-07-19 12:37:14 +02:00
Ruifeng Wang	be8c9ad0c2	lpm: avoid race conditions for v20 When a tbl8 group is getting attached to a tbl24 entry, lookup might fail even though the entry is configured in the table. For ex: consider a LPM table configured with 10.10.10.1/24. When a new entry 10.10.10.32/28 is being added, a new tbl8 group is allocated and tbl24 entry is changed to point to the tbl8 group. If the tbl24 entry is written without the tbl8 group entries updated, a lookup on 10.10.10.9 will return failure. Correct memory orderings are required to ensure that the store to tbl24 does not happen before the stores to tbl8 group entries complete. Besides, explicit structure alignment is used to address atomic operation building issue with older version clang. Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-07-19 12:37:09 +02:00
Ruifeng Wang	dd728f9fe1	lpm: avoid race conditions for v1604 When a tbl8 group is getting attached to a tbl24 entry, lookup might fail even though the entry is configured in the table. For ex: consider a LPM table configured with 10.10.10.1/24. When a new entry 10.10.10.32/28 is being added, a new tbl8 group is allocated and tbl24 entry is changed to point to the tbl8 group. If the tbl24 entry is written without the tbl8 group entries updated, a lookup on 10.10.10.9 will return failure. Correct memory orderings are required to ensure that the store to tbl24 does not happen before the stores to tbl8 group entries complete. The ordering patches in general have no notable impact on LPM performance test on both Arm A72 platform and x86 E5 platform. Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-07-19 12:37:03 +02:00
Ruifeng Wang	c63d004bdd	lpm: uninline some functions Tests showed that the function inlining caused a performance drop on some x86 platforms with the memory ordering patches applied. By forcing these functions to be non-inlined, the performance was better than before on x86, with no impact on arm64 platforms. Besides, inlines of other functions are removed to let the compiler decide whether to inline. Suggested-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-07-19 12:36:57 +02:00
Stephen Hemminger	8570a88d42	cfgfile: use calloc Better to use calloc when allocating arrays. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-19 00:48:43 +02:00
Stephen Hemminger	0cfffed310	cfgfile: use log for error messages In general, DPDK libraries do not print error messages to stdout because that is often redirected to /dev/null for daemons. This patch changes cfgfile library to use RTE_LOG with its own type. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-19 00:47:50 +02:00
Stephen Hemminger	1b58407653	cfgfile: remove unnecessary initialization No need to initialize variable if it is immediately overwritten. It is better style not to do unnecessary initialization with modern tools since it lets compiler and other static checkers detect uninitialized data. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2019-07-19 00:47:44 +02:00
Erik Gabriel Carrillo	c485a5e65a	timer: fix null access if not initialized If the timer subsystem is not initialized before rte_timer_manage (for example) is invoked, a pointer to a shared hugepage memory region will still be null and dereferenced when it is checked for validity; handle this case. Fixes: `c0749f7096` ("timer: allow management in shared memory") Cc: stable@dpdk.org Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>	2019-07-18 23:14:53 +02:00
Harman Kalra	acc6e5503e	distributor: fix check of workers number The number of workers should never exceed RTE_MAX_LCORE. RTE_DIST_ALG_SINGLE also requires the number of workers check. Fixes: `775003ad2f` ("distributor: add new burst-capable library") Cc: stable@dpdk.org Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: David Hunt <david.hunt@intel.com>	2019-07-17 22:35:19 +02:00
Thomas Monjalon	b685129c79	eal: update comment about --no-huge option The old comment, on top of the function rte_eal_has_hugepages(), is really outdated and not generic enough. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-17 19:53:06 +02:00
Gavin Hu	be0330cb1b	hash: flush rings instead of dequeuing one by one Within rte_hash_reset, dequeuing entries one by one from the ring in a while loop, without using them at all, wastes cycles. This patch flushes the ring by resetting its indices, which saves CPU cycles. Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2019-07-17 19:52:37 +02:00
Gavin Hu	272d87b01b	ring: add reset function for flushing Currently, the flush is done by dequeuing the ring in a while loop. It is much simpler to flush the queue by resetting the head and tail indices. Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-17 19:51:56 +02:00
Sunil Kumar Kori	d601426a27	ip_frag: remove remaining IP checkum offload flag Currently PKT_TX_IP_CKSUM is set in mbuf->ol_flags implicitly by the library during the fragmentation operation. Because of this, the application is forced to use checksum offload whether the platform supports it or not. Also, the documentation does not give any expected value of ol_flags in the returned fragmented mbufs, so the application never learns which offloads are enabled, and transmission may fail on platforms which do not support checksum offload. Remove the mentioned flag from the library. This change is part of http://patches.dpdk.org/patch/53475. The changes for the reassembly operation have already been accepted; this patch set implements the similar change for the fragmentation operation. Fixes: `e29fc44370` ("ip_frag: remove IP checkum offload flag") Signed-off-by: Sunil Kumar Kori <skori@marvell.com>	2019-07-16 13:23:04 +02:00
Takeshi Yoshimura	e072d16f89	vfio: fix expanding DMA area in ppc64le In ppc64le, expanding DMA areas always fails because we cannot remove a DMA window. As a result, we cannot allocate more than one memseg in ppc64le. This is because vfio_spapr_dma_mem_map() doesn't unmap all the mapped DMA before removing the window. This patch fixes this incorrect behavior. I also fixed the order of ioctl for unregister and unmap. The ioctl for unregister sometimes reports device busy errors due to the existence of a mapped area. Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com> Acked-by: David Christensen <drc@linux.vnet.ibm.com>	2019-07-16 12:56:03 +02:00
Harman Kalra	44dc7c0a26	metrics: add function to deinitialise library Once the library usage is over, it must be deinitialized, which frees the shared memory reserved during initialization. Observed an issue while running 'metrics_autotest' continuously without quitting. On the first run 'metrics_autotest' passes all test cases, but from the second run onwards the first test case fails because the metrics library was already initialized during the first run. Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Remy Horton <remy.horton@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2019-07-16 12:45:30 +02:00
Yangchao Zhou	5eb1708ec1	kni: fix kernel crash with multi-segments va2pa depends on the physical address and virtual address offset of the current mbuf. It may get the wrong physical address for the next mbuf if that mbuf is allocated in another hugepage segment. In rte_mempool_populate_default(), trying to allocate a whole block of contiguous memory can fail. In that case, it reserves memory in several memzones that have different physical address and virtual address offsets. The rte_mempool_populate_default() is used by rte_pktmbuf_pool_create(). Fixes: `8451269e6d` ("kni: remove continuous memory restriction") Cc: stable@dpdk.org Signed-off-by: Yangchao Zhou <zhouyates@gmail.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-15 22:48:20 +02:00
Stephen Hemminger	dbb69b7b64	kni: fix style rte_kni does not follow standard style rules. Noticed some extra \ line continuation etc. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-15 19:15:34 +02:00
Anatoly Burakov	e6d3c09282	eal/freebsd: fix config creation The config create function did not store the mem config address in the shared memconfig structure, so the secondary processes couldn't map it at the required address. Fixes: `b149a70642` ("eal/freebsd: add config reattach in secondary process") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-07-14 15:37:19 +02:00
Takeshi Yoshimura	22a55d2eb6	vfio: fix build on Linux < 4.2 The commit `db90b4969e` ("vfio: retry creating sPAPR DMA window") introduced a build breakage on old Linux. Linux <4.2 does not define ddw in struct vfio_iommu_spapr_tce_info. Without ddw, we cannot change window size and so should give up the creation. I just excluded the retrying code if ddw is not supported. Fixes: `db90b4969e` ("vfio: retry creating sPAPR DMA window") Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com> Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-11 11:28:20 +02:00
Bernard Iremonger	8d244bb3a3	flow_classify: fix out-of-bounds access This patch fixes the out-of-bounds coverity issue by removing the offending line of code at line 107 in rte_flow_classify_parse.c which is never executed. Coverity issue: 343454 Fixes: `be41ac2a33` ("flow_classify: introduce flow classify library") Cc: stable@dpdk.org Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>	2019-07-10 23:41:25 +02:00
Anatoly Burakov	f3330a2650	fbarray: fix destruction Currently, when fbarray is destroyed, the fbarray structure is not zeroed out, which leads to stale data being there and confusing secondary process init in legacy mem mode. Fix it by always memsetting the fbarray to zero when destroying it. Fixes: `5b61c62cfd` ("fbarray: add internal tailq for mapped areas") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-10 23:27:46 +02:00
David Marchand	89aac60e0b	vfio: fix interrupts race condition Populating the eventfd in rte_intr_enable in each request to vfio triggers a reconfiguration of the interrupt handler on the kernel side. The problem is that rte_intr_enable is often used to re-enable masked interrupts from drivers interrupt handlers. This reconfiguration leaves a window during which a device could send an interrupt and then the kernel logs this (unsolicited from the kernel point of view) interrupt: [158764.159833] do_IRQ: 9.34 No irq handler for vector VFIO api makes it possible to set the fd at setup time. Make use of this and then we only need to ask for masking/unmasking legacy interrupts and we have nothing to do for MSI/MSIX. "rxtx" interrupts are left untouched but are most likely subject to the same issue. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824 Fixes: `5c782b3928` ("vfio: interrupts") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Shahed Shaikh <shshaikh@marvell.com>	2019-07-10 18:53:47 +02:00
Stephen Hemminger	6efc06725f	cmdline: use new ethernet address parser Now that there is a version of ether_aton in rte_ether, it can be used by the cmdline ethernet address parser. Note: ether_aton_r can not be used in cmdline because the old code would accept either bytes XX:XX:XX:XX:XX:XX or words XXXX:XXXX:XXXX and we need to keep compatibility. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-08 21:26:52 +02:00
Stephen Hemminger	7750b036d8	net: speedup ethernet address comparison with bitops Using bit operations like or and xor is faster than a loop on all architectures. Really just explicit unrolling. Similar cast to uint16 unaligned is already done in other functions here. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-08 21:26:52 +02:00
Stephen Hemminger	22d2103dfe	ethdev: use new ethernet parsing function Use rte_eth_unformat_addr, so that ethdev can be built and work without the cmdline library. The dependency on cmdline was an arrangement of convenience anyway. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-08 21:26:52 +02:00
Stephen Hemminger	596d31092d	net: add function to convert string to ethernet address Make a function that can be used in place of eth_aton_r to convert a string to rte_ether_addr. This function allows both byte (xx:xx:xx:xx:xx:xx) and word (XXXX:XXXX:XXXX) format and has the same lack of error handling as the original. This also allows ethdev to no longer have a hard dependency on the cmdline library. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-07-08 21:26:52 +02:00
Stephen Hemminger	3bdd09ae07	net: deinline non-critical ethernet functions Formatting Ethernet address and getting a random value are not in critical path so they should not be inlined. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-08 21:26:52 +02:00
Xiaoyu Min	3374db1857	ethdev: add GRE key field to flow API Add new rte_flow_item_gre_key in order to match the optional key field. Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-08 21:26:52 +02:00
Maxime Coquelin	906accb60b	vhost: log virtio and vhost-user negotiated features Having this info logged by default when analysing bug reports has proved to be useful. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-07-08 21:26:52 +02:00
Honnappa Nagarahalli	b0437f8b02	hash: load value after full key compare When a hash entry is added, there are 2 sets of stores. 1) The application writes its data to memory (whose address is provided in rte_hash_add_key_with_hash_data API (or NULL)) 2) The rte_hash library writes to its own internal data structures; key store entry and the hash table. The only ordering requirement between these 2 is that - store to the application data must complete before the store to key_index. There are no ordering requirements between the stores to key/signature and store to application data. The synchronization point for application data can be any point between the 'store to application data' and 'store to the key_index'. So, 'pdata' should not be a guard variable for the data in hash table. It should be a guard variable only for the application data written to the memory location pointed by 'pdata'. Hence, in the lookup functions, 'pdata' can be loaded after full key comparison succeeds. The synchronization point for the application data (store-release to 'pdata' in key store) is changed to be consistent with the order of loads in lookup function. However, this change is cosmetic and does not affect the functionality. Fixes: `e605a1d36` ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Tested-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2019-07-08 20:03:39 +02:00
Honnappa Nagarahalli	4d9ca3ed21	hash: use ordered loads only if signature matches Relaxed signature comparison is done first. Further ordered loads are done only if the signature matches. Any false positives are caught by the full key comparison. This provides performance benefits as load-acquire is executed only when required. Fixes: `e605a1d36` ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Tested-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>	2019-07-08 20:00:23 +02:00
Dilshod Urazov	468d4c4e20	eventdev: fix error sign Fixes: `c9bf83947e` ("eventdev: add eth Tx adapter APIs") Fixes: `47d05b2928` ("eventdev: add timer adapter common code") Fixes: `6750b21bd6` ("eventdev: add default software timer adapter") Fixes: `c75f7897ea` ("eventdev: set error code in port link/unlink functions") Fixes: `7d1acc9dde` ("eventdev: introduce helper function for enqueue burst") Fixes: `406aed4e0d` ("eventdev: add errno-style return values") Fixes: `cc7b73ea9e` ("eventdev: add new software timer adapter") Cc: stable@dpdk.org Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-07 13:23:57 +02:00
Gage Eads	7a0ac7cdb4	service: promote experimental functions to stable The functions rte_service_may_be_active(), rte_service_lcore_attr_get(), and rte_service_attr_reset_all() were introduced nearly a year ago in DPDK 18.08. They can be considered non-experimental for the 19.08 release. rte_service_may_be_active() is used by the sw PMD, and this commit allows it to not need any experimental API. Signed-off-by: Gage Eads <gage.eads@intel.com>	2019-07-08 12:35:06 +02:00
Sunil Kumar Kori	e29fc44370	ip_frag: remove IP checksum offload flag Currently PKT_TX_IP_CKSUM is set in mbuf->ol_flags implicitly during fragmentation and reassembly. Because of this, the application is forced to use checksum offload whether or not the platform supports it. Also, the documentation does not state the expected value of ol_flags in the returned mbuf (reassembled or fragmented), so the application cannot know which offloads are enabled. Transmission may therefore fail on platforms that do not support checksum offload. Also, the IPv6 header does not contain a checksum field, so setting PKT_TX_IP_CKSUM in mbuf->ol_flags is itself invalid. Remove the flag from the library. Signed-off-by: Sunil Kumar Kori <skori@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-07-08 11:04:01 +02:00
Krzysztof Kanas	d32411d034	eal: fix typo in comments Fixes: `a753e53d51` ("eal: add device event monitor framework") Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Krzysztof Kanas <kkanas@marvell.com>	2019-07-08 10:57:14 +02:00
Phil Yang	2173f3333b	mcslock: add MCS queued lock implementation If there are multiple threads contending, they all attempt to take the spinlock lock at the same time once it is released. This results in a huge amount of processor bus traffic, which is a huge performance killer. Thus, if we somehow order the lock-takers so that they know who is next in line for the resource we can vastly reduce the amount of bus traffic. This patch added MCS lock library. It provides scalability by spinning on a CPU/thread local variable which avoids expensive cache bouncings. It provides fairness by maintaining a list of acquirers and passing the lock to each CPU/thread in the order they acquired the lock. Signed-off-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2019-07-07 23:45:22 +02:00
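The queueing idea in this commit can be sketched with C11 atomics. This is a textbook MCS lock written for illustration, not the code added by the patch; the names `mcs_node`, `mcs_lock_t`, `mcs_lock` and `mcs_unlock` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Per-thread queue node (hypothetical layout, for illustration only). */
struct mcs_node {
    _Atomic(struct mcs_node *) next; /* successor waiting behind us */
    atomic_int locked;               /* 1 while this waiter must spin */
};

/* The lock itself is just the queue tail; NULL means the lock is free. */
typedef _Atomic(struct mcs_node *) mcs_lock_t;

static void mcs_lock(mcs_lock_t *lock, struct mcs_node *me)
{
    atomic_store_explicit(&me->next, NULL, memory_order_relaxed);
    atomic_store_explicit(&me->locked, 1, memory_order_relaxed);

    /* Append ourselves to the queue and learn who was last before us. */
    struct mcs_node *prev =
        atomic_exchange_explicit(lock, me, memory_order_acq_rel);
    if (prev == NULL)
        return; /* queue was empty: lock acquired immediately */

    /* Link behind the predecessor, then spin on our own node only. */
    atomic_store_explicit(&prev->next, me, memory_order_release);
    while (atomic_load_explicit(&me->locked, memory_order_acquire))
        ; /* local spin: no traffic on the shared tail pointer */
}

static void mcs_unlock(mcs_lock_t *lock, struct mcs_node *me)
{
    struct mcs_node *next =
        atomic_load_explicit(&me->next, memory_order_acquire);

    if (next == NULL) {
        /* No visible successor: try to reset the tail to "free". */
        struct mcs_node *expected = me;
        if (atomic_compare_exchange_strong_explicit(
                lock, &expected, NULL,
                memory_order_release, memory_order_relaxed))
            return;
        /* A successor swapped the tail but has not linked itself yet. */
        while ((next = atomic_load_explicit(&me->next,
                                            memory_order_acquire)) == NULL)
            ;
    }
    /* Hand the lock to the next waiter in arrival order. */
    atomic_store_explicit(&next->locked, 0, memory_order_release);
}
```

Each waiter spins on the `locked` flag in its own node, which is what keeps the cache line bouncing low, and the hand-off in `mcs_unlock` follows the queue order, which is where the fairness comes from.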
Takeshi Yoshimura	db90b4969e	vfio: retry creating sPAPR DMA window sPAPR allows only page_shift from VFIO_IOMMU_SPAPR_TCE_GET_INFO ioctl. However, Linux 4.17 or before returns incorrect page_shift for Power9. I added the code for retrying creation of sPAPR DMA window. Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-07 23:20:23 +02:00
Fan Zhang	cd5b860c18	ipsec: support header construction Add support for RFC 4301 (section 5.1.2) updates of the Type of Service field (IPv4) and Traffic Class field (IPv6) for both outbound and inbound cases, which deal with updating the DSCP/ECN bits inside each of the fields. Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-07-05 15:28:14 +02:00
Nicolas Chautru	b443f318c6	bbdev: extend API for 5G FEC Extension to BBDEV operations to support 5G on top of existing 4G operations. Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>	2019-07-05 15:28:14 +02:00
Nicolas Chautru	c4b0d66376	bbdev: rename non-generic LTE specific structure Renaming of the enums and structure which were LTE specific to allow for extension and support for 5GNR operations. Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>	2019-07-05 15:28:14 +02:00
Damian Nowak	90197eb094	cryptodev: add digest encrypted feature flag Some PMDs can only support digest being encrypted separately in auth-cipher operations. Thus it is required to add feature flag in PMD to reflect if it does support digest-appended both: digest generation with encryption and decryption with digest verification. This patch also adds information about new feature flag to the release notes. Signed-off-by: Damian Nowak <damianx.nowak@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2019-07-05 15:28:14 +02:00
Damian Nowak	735b77dd85	cryptodev: document usage of digest-appended operations This patch explains what are the conditions and how to use digest appended for auth-cipher operations. Signed-off-by: Damian Nowak <damianx.nowak@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2019-07-05 15:28:14 +02:00
Junxiao Shi	b482a57863	cryptodev: free memzone when releasing device When a cryptodev is created in a primary process, rte_cryptodev_data_alloc reserves a memzone. However, this memzone was not released when the cryptodev is uninitialized. After that, new cryptodev cannot be created due to memzone name conflict. This commit frees the memzone when a cryptodev is uninitialized, fixing this bug. This approach is chosen instead of keeping and reusing the old memzone, because the new cryptodev could belong to a different NUMA socket. Also, rte_cryptodev_data pointer is now properly recorded in cryptodev_globals.data array. Bugzilla ID: 105 Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:28:14 +02:00
Lukasz Bartosik	8734e669fc	ipsec: include high order bytes of ESN in packet length When esn is used then high-order 32 bits are included in ICV calculation however are not transmitted. Update packet length to be consistent with auth data offset and length before crypto operation. High-order 32 bits of esn will be removed from packet length in crypto post processing. Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:28:14 +02:00
Konstantin Ananyev	e95291f019	ipsec: support multi-segment packets Add support for packets that consist of multiple segments. Take into account that trailer bytes (padding, ESP tail, ICV) can span multiple segments. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:28:14 +02:00
Marcin Smoczynski	2c1887fad0	ipsec: fix transport mode for IPv6 with extensions Reconstructing IPv6 header after encryption or decryption requires updating 'next header' value in the preceding protocol header, which is determined by parsing IPv6 header and iteratively looking for next IPv6 header extension. It is required that 'l3_len' in the mbuf metadata contains a total length of the IPv6 header with header extensions up to ESP header. Fixes: `4d7ea3e145` ("ipsec: implement SA data-path API") Cc: stable@dpdk.org Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-07-05 15:28:14 +02:00
Marcin Smoczynski	1f79768ca6	net: add IPv6 header extension parsing function Introduce new function for IPv6 header extension parsing able to determine extension length and next protocol number. This function is helpful when implementing IPv6 header traversing. Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-07-05 15:28:14 +02:00
Anoob Joseph	c9030ae382	cryptodev: add feature flags to disable Adding a new field, ff_disable, to allow applications to control the features enabled on the crypto device. This would allow for efficient usage of HW/SW offloads. Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-07-05 15:03:25 +02:00
Arek Kusztal	fac52fb26a	cryptodev: add option to support both IV and J0 for GCM This patch adds an option to support both IV (of all supported sizes) and J0 when using Galois Counter Mode of crypto operation. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Anoob Joseph <anoobj@marvell.com>	2019-07-05 14:52:25 +02:00
Saleh Alsouqi	2318d8d545	net: define IPv4 IHL and VHL Define IPv4 Minimum IHL and VHL according to rfc791 (see [1]) "The Version field indicates the format of the internet header." "Internet Header Length (ihl) is the length of the internet header in 32 bit words, and thus points to the beginning of the data. Note that the minimum value for a correct header is 5." [1] https://tools.ietf.org/html/rfc791 Signed-off-by: Saleh Alsouqi <salehals@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-05 01:52:01 +02:00
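As a small illustration of the RFC 791 layout quoted above, the version and IHL share the first header byte, with the version in the high nibble and the IHL in the low nibble. The macro and function names below are hypothetical and are not the definitions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; the DPDK macros may differ. */
#define EX_IPV4_VERSION  4u
#define EX_IPV4_MIN_IHL  5u    /* minimum header length in 32-bit words */
#define EX_IPV4_IHL_MASK 0x0fu

/* Pack version (high nibble) and IHL (low nibble) into one byte. */
static uint8_t ipv4_make_vhl(unsigned version, unsigned ihl)
{
    return (uint8_t)((version << 4) | (ihl & EX_IPV4_IHL_MASK));
}

/* Header length in bytes encoded by a version/IHL byte. */
static unsigned ipv4_hdr_len_bytes(uint8_t vhl)
{
    return (vhl & EX_IPV4_IHL_MASK) * 4u;
}
```

With version 4 and the minimum IHL of 5, the byte is `0x45` and the header is 5 x 4 = 20 bytes long, which is the value most option-free IPv4 packets carry.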
Vivek Sharma	dfebfc9882	ethdev: support dynamic configuration of QinQ strip Enable missing support for runtime configuration (setting/getting) of QinQ strip rx offload for a given ethdev. Signed-off-by: Vivek Sharma <viveksharma@marvell.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-07-05 01:52:01 +02:00
Dekel Peled	f10409b766	ethdev: allow action with any config object type In current implementation, an action which requires parameters must accept them enclosed in a structure. Some actions require a single, trivial type parameter, but it still must be enclosed in a structure. This obligation results in multiple, action-specific structures, each containing a single trivial type parameter. This patch introduces a new approach, allowing an action configuration object of any type, trivial or a structure. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-05 01:52:01 +02:00
Dekel Peled	5dca85377e	ethdev: add actions to modify TCP header fields Add actions: - INC_TCP_SEQ - Increase sequence number in the outermost TCP header. - DEC_TCP_SEQ - Decrease sequence number in the outermost TCP header. - INC_TCP_ACK - Increase acknowledgment number in the outermost TCP header. - DEC_TCP_ACK - Decrease acknowledgment number in the outermost TCP header. Original work by Xiaoyu Min. This patch uses the new approach introduced by [1], using a simple integer instead of using an action-specific structure for each of the new actions. [1] http://patches.dpdk.org/patch/55882/ Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2019-07-05 01:52:01 +02:00
Andrew Rybchenko	ca5ba1b697	ethdev: avoid error on PCI unplug of closed port If a PCI Ethernet device driver removes the device on close (RTE_ETH_DEV_CLOSE_REMOVE) and the PCI device itself is later unplugged, the unplug should not fail because the Ethernet device is already removed. Fixes: `23ea57a2a0` ("ethdev: complete closing of port") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reported-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-07-05 01:52:01 +02:00
Anatoly Burakov	f9d6cd8bfe	timer: fix resource leak in finalize Currently, whenever timer library is initialized, the memory is leaked because there is no telling when primary or secondary processes get to use the state, and there is no way to initialize/deinitialize timer library state without race conditions [1] because the data itself must live in shared memory. Add a spinlock to the shared mem config to have a way to exclusively initialize/deinitialize the timer library without any races, and implement the synchronization mechanism based on this lock in the timer library. Also, update the API doc. Note that the behavior of the API itself did not change - the requirement to call init in every process was simply not documented explicitly. [1] See the following email thread: https://mails.dpdk.org/archives/dev/2019-May/131498.html Fixes: `c0749f7096` ("timer: allow management in shared memory") Cc: stable@dpdk.org Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	ae3b4bc4fb	eal: prevent different primary/secondary process versions Currently, nothing stops DPDK from attempting to run primary and secondary processes with different versions. This can lead to all sorts of weird behavior and makes it harder to maintain compatibility without breaking ABI every once in a while. Fix it by explicitly disallowing running different DPDK versions as primary and secondary processes. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	b39fcd9569	eal: unify internal config init Currently, each EAL will update internal/shared config in their own way at init, resulting in needless duplication of code and OS-dependent behavior. Move the functions to a common file and add missing FreeBSD steps. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	00299d3960	eal: unify wait for complete init Currently, mcfg completion function exists in two independent implementations doing the same thing, which is bug prone. Unify the two functions and move them into one place. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	a08a5dd20e	eal: uninline wait for complete init Currently, the function to wait until config completion is static inline for no reason. Move its implementation to an EAL common file. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	0d6e584247	eal: remove packed attribute from mcfg structure There is no reason to pack the memconfig structure, and doing so gives out warnings in some static analyzers. Fix it by removing the packed attribute. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:40 +02:00
Anatoly Burakov	028669bc9f	eal: hide shared memory config Now that everything that has ever accessed the shared memory config is doing so through the public APIs, we can make it internal. Since we're removing quite a few headers from rte_eal_memconfig.h, we need to add them back in places where this header is used. This bumps the ABI, so also change all build files and update documentation. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:34 +02:00
Anatoly Burakov	119cee86cd	eal: add API to lock/unlock mempool list Currently, in order to lock access to the mempool list, a direct access to the shared memory structure is needed. Add an API to do the same, and search-and-replace all usages. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-05 22:31:39 +02:00
Anatoly Burakov	a36f5ce06e	eal: add API to lock/unlock tailq list Currently, locking/unlocking the TAILQ list requires direct access to the shared memory config. Add an API to do the same, and search-and-replace all usages. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-05 22:13:23 +02:00
Anatoly Burakov	76f80881ef	mem: add API to lock/unlock memory hotplug Currently, the memory hotplug is locked automatically by all memory-related _walk() functions, but sometimes locking the memory subsystem outside of them is needed. There is no public API to do that, so it creates a dependency on shared memory config to be public. Fix this by introducing a new API to lock/unlock the memory hotplug subsystem. Create a new common file for all things mem config, and a new API namespace rte_mcfg_*, and search-and-replace all usages of the locks with the new API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-05 22:12:40 +02:00
Ben Walker	c2361bab70	eal: compute IOVA mode based on PA availability Currently, if the bus selects IOVA as PA, the memory init can fail when lacking access to physical addresses. It can be quite hard for normal users to understand what is wrong, since this is the default behavior. Catch this situation earlier in eal init by validating physical addresses availability, or select IOVA when no clear preference had been expressed. The bus code is changed so that it reports when it does not care about the IOVA mode and lets the eal init decide. In Linux implementation, rework rte_eal_using_phys_addrs() so that it can be called earlier but still avoid a circular dependency with rte_mem_virt2phys(). In FreeBSD implementation, rte_eal_using_phys_addrs() always returns false, so the detection part is left as is. If librte_kni is compiled in and the KNI kmod is loaded, - if the buses requested VA, force to PA if physical addresses are available as it was done before, - else, keep iova as VA, KNI init will fail later. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-05 16:55:44 +02:00
David Marchand	fe8fc6b0b7	kni: abort when IOVA is not PA If a forced iova-mode has been passed at init, kni is not supposed to work. Fixes: `075b182b54` ("eal: force IOVA to a particular mode") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-07-05 16:54:40 +02:00
David Marchand	d39d8c4bb6	mem: fix typo in API description Fixes: `552afc420a` ("mem: add contig walk function") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-07-05 16:54:21 +02:00
Jananee Parthasarathy	984d9e522f	table: fix crash in LPM IPv6 Unit test table_autotest results in segmentation fault. Crash occurs in test_table_lpm_ipv6_combined(). Variable 'nht_pos0' used as array subscript is not initialized in rte_table_lpm_ipv6_entry_add(). It will not be assigned, if a rule does not exist. In such case a junk number or invalid array index might result in segmentation fault due to array out of bounds when lpm->nht_users is used with such invalid array index. Fix is to initialize the variables used for array subscript. Bugzilla ID: 285 Fixes: `d89a5bce1d` ("lpm6: extend next hop field") Cc: stable@dpdk.org Signed-off-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com> Tested-by: David Marchand <david.marchand@redhat.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-07-04 23:44:58 +02:00
Flavia Musatescu	5c0bdc3203	telemetry: fix build with gcc 9 Suppress the unaligned packed member address warnings by extending the telemetry library build flags with -Wno-address-of-packed-member option, through the WERROR_FLAGS makefile variable. With this change additional warnings are turned on to be treated as errors, which causes the following build issues to be seen: - no previous prototype [-Werror=missing-prototypes] - initialization discards ‘const’ qualifier from pointer target type [-Werror=discarded-qualifiers] - old-style function definition [-Werror=old-style-definition] - variable may be used before its value is set (when using icc compiler). Fixes: `0fe3a37924` ("telemetry: format json response when sending stats") Fixes: `ee5ff0d329` ("telemetry: add client feature and sockets") Fixes: `8877ac688b` ("telemetry: introduce infrastructure") Fixes: `1b756087db` ("telemetry: add parser for client socket messages") Fixes: `fff6df7bf5` ("telemetry: fix using ports of different types") Fixes: `4080e46c80` ("telemetry: support global metrics") Cc: stable@dpdk.org Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Signed-off-by: Flavia Musatescu <flavia.musatescu@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2019-07-04 18:48:04 +02:00
Ferruh Yigit	15a74163b1	net: forbid VLAN insert in shared mbuf The vlan_insert() is buggy when it tries to handle the shared mbufs, instead don't support inserting VLAN tag into shared mbufs and return an error for that case. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-04 15:59:14 +02:00
Konstantin Ananyev	4715bb1623	bpf: fix validate for function return value eval_call() blindly calls eval_max_bound() for external function return value for all return types. That causes wrong estimation for returned pointer min and max boundaries. So any attempt to dereference that pointer value causes verifier to fail with error message: "memory boundary violation at pc: ...". To fix - estimate min/max boundaries based on the return value type. Bugzilla ID: 298 Fixes: `8021917293` ("bpf: add extra validation for input BPF program") Cc: stable@dpdk.org Reported-by: Michel Machado <michel@digirati.com.br> Suggested-by: Michel Machado <michel@digirati.com.br> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-07-04 10:33:13 +02:00
Bruce Richardson	a951e147ea	rawdev: allow devices to skip extra memory allocation Some device drivers want to allocate their own private memory, and should be allowed to do so. Therefore skip memory allocation and associated error checks if zero-length private memory is requested. While adjusting the code for new indent level, fix incorrect error message. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2019-07-04 09:33:59 +02:00
Morten Brørup	fc4ebf7872	net: add definition for DSCP and ECN masks Added definitions of DSCP and ECN masks. Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-03 12:57:30 +02:00
Morten Brørup	854c1333b5	net: fix definition of IPv6 traffic class mask IPv6 header TC field is 8 bits, not 4. Fixes: `9b20c6e902` ("net: add IPv6 header fields macros") Cc: stable@dpdk.org Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-03 12:57:30 +02:00
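To see why the traffic class mask has to cover 8 bits, recall that the first 32-bit word of the IPv6 header holds a 4-bit version, an 8-bit traffic class and a 20-bit flow label. The sketch below uses hypothetical macro and function names (not the DPDK definitions) and assumes the word has already been converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; field positions follow the IPv6 header layout. */
#define EX_IPV6_TC_MASK  0x0ff00000u /* 8-bit traffic class */
#define EX_IPV6_TC_SHIFT 20
#define EX_DSCP_MASK     0xfcu       /* upper 6 bits of the traffic class */
#define EX_ECN_MASK      0x03u       /* lower 2 bits of the traffic class */

/* Extract the traffic class from the first header word (host order). */
static uint8_t ipv6_traffic_class(uint32_t vtc_flow)
{
    return (uint8_t)((vtc_flow & EX_IPV6_TC_MASK) >> EX_IPV6_TC_SHIFT);
}

/* DSCP code point carried in a traffic class (or IPv4 ToS) byte. */
static uint8_t tc_dscp(uint8_t tc)
{
    return (uint8_t)((tc & EX_DSCP_MASK) >> 2);
}

/* ECN bits carried in a traffic class (or IPv4 ToS) byte. */
static uint8_t tc_ecn(uint8_t tc)
{
    return (uint8_t)(tc & EX_ECN_MASK);
}
```

A 4-bit mask would drop the upper half of the traffic class: for a traffic class of `0xb8` (DSCP 46, ECN 0) it would yield `0x08` instead of `0xb8`.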
Morten Brørup	f0c2dc8201	net: move TCP flags from ethdev header TCP flags were moved to the TCP header file from the Ethernet control header file, and the RTE prefix was added to their names. Missing TCP ECN flags were added. The ALL mask did not include TCP ECN flags, so it was renamed to reflect that it applies to N-tuple filtering only. Updated other files affected by the renaming accordingly. Signed-off-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-07-03 12:57:30 +02:00
Jerin Jacob	374348cd33	eventdev: fix doxygen comment Update rte_eventdev.h file description to adapt to Tx adapter changes. Fixes: `c9bf83947e` ("eventdev: add eth Tx adapter APIs") Cc: stable@dpdk.org Signed-off-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>	2019-07-03 06:55:41 +02:00
Nikhil Rao	68d21669f3	eventdev: change Rx adapter callback and stats structure Replace the mbuf pointer array in the event eth Rx adapter callback with an event array. Using an event array allows the application to change attributes of the events enqueued by the SW adapter. The callback can drop packets and populate a callback argument with the number of dropped packets. Add a Rx adapter stats field to keep track of the total number of dropped packets. This commit removes the experimental tags from the callback and stats APIs, the experimental tag from eventdev is also removed and eventdev functions become part of the main DPDK API/ABI. Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-03 06:55:39 +02:00
Erik Gabriel Carrillo	cc7b73ea9e	eventdev: add new software timer adapter This patch introduces a new version of the event timer adapter software PMD. In the original design, timer event producer lcores in the primary and secondary processes enqueued event timers into a ring, and a service core in the primary process dequeued them and processed them further. To improve performance, this version does away with the ring and lets lcores insert timers directly into timer skiplist data structures; the service core directly accesses the lists as well, when looking for timers that have expired. To compare the burst and non-burst performance of the original and new versions of the software event timer adapter, I ran the following commands: $ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \ -- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \ --prod_type_timerdev --worker_deq_depth=32 $ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \ -- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \ --prod_type_timerdev_burst --worker_deq_depth=32 With the new version, I see a 151% improvement in throughput for the non-burst case, and a 270% improvement in throughput for the burst case. I also see a 53% improvement in arm latency in the non-burst case and a 65% improvement in arm latency in the burst case. Note: To perform the test, I commented out a check in the original version that checks the adapter tick interval against a minimum value. Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-03 06:55:35 +02:00
Nikhil Rao	b41363e477	eventdev: optimize Rx adapter enqueue Setup event when the Rx queue is added to the adapter in place of generating the event when it is being enqueued to the event device. Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-03 06:55:33 +02:00
Nikhil Rao	894c373ce5	eventdev: remove event copy in Rx adapter Remove copy from temporary event array on the stack to the enqueue buffer event array entry, instead initialize event in the enqueue buffer event array entry. Suggested-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-07-03 06:55:31 +02:00
Bruce Richardson	759a5fb18e	lib: add reasons for components being disabled For each library where we optionally disable it, add in the reason why it's being disabled, so the user knows how to fix it. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-07-02 23:21:05 +02:00
Bruce Richardson	8baad6f884	build: print list of disabled components When configuring with meson we print out a list of enabled components, but it is also useful to list out the disabled components and the reasons why. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-07-02 23:20:26 +02:00

5853 Commits