numam-dpdk

Author	SHA1	Message	Date
Maxime Coquelin	a121f17572	net/virtio: fix memory init with vDPA backend This patch fixes an overhead met with mlx5-vdpa Kernel driver, where for every page in the mapped area, all the memory tables get updated. For example, with 2MB hugepages, a single IOTLB_UPDATE for a 1GB region causes 512 memory updates on mlx5-vdpa side. Using batching mode, the mlx5 driver will only trigger a single memory update for all the IOTLB updates that happen between the batch begin and batch end commands. Fixes: 6b901437056e ("net/virtio: introduce vhost-vDPA backend") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-13 18:51:58 +01:00
Maxime Coquelin	35a6630e2b	net/virtio: add missing backend features negotiation This patch adds missing backend features negotiation in Vhost-vDPA. Without it, IOTLB messages v2 could be sent by Virtio-user PMD while not supported by the backend. Fixes: 6b901437056e ("net/virtio: introduce vhost-vDPA backend") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-13 18:51:58 +01:00
Karra Satwik	c3bbc38147	net/cxgbe: accept VLAN flow items without ethertype When apps pass the RTE_FLOW_ITEM_TYPE_VLAN without setting the ethertype field in RTE_FLOW_ITEM_TYPE_ETH, then assume 0x8100 VLAN by default and don't reject the rule. Fixes: 55f003d8884c ("net/cxgbe: support flow API for matching QinQ VLAN") Cc: stable@dpdk.org Signed-off-by: Karra Satwik <kaara.satwik@chelsio.com> Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>	2021-01-13 18:51:57 +01:00
John Daley	8ffaae0d09	net/enic: remove deprecated flow director code The Flow Director (FDIR) API was removed in release 20.11. This patch removes the remainder of the FDIR code in the PMD. Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>	2021-01-13 18:51:57 +01:00
Selwin Sebastian	ff70acdf42	net/axgbe: support reading FW version Added support for fw_version_get API Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com> Acked-by: Somalapuram Amaranath <asomalap@amd.com>	2021-01-13 18:51:57 +01:00
Xuan Ding	38d632cbdc	net/ice: refactor PF RSS This patch refactors the PF RSS code based on the below design: 1. ice_pattern_match_item->input_set_mask is the superset of ETH_RSS_xxx. 2. ice_pattern_match_item->meta is the ice_rss_hash_cfg template. 3. ice_hash_parse_pattern will generate pattern hint. 4. ice_hash_parse_action will refine the ice_rss_hash_cfg based on the pattern hint and rss_type. 5. The refine process includes: 1) refine protocol headers(VLAN/PPPOE/GTPU). 2) refine hash bit fields of l2, l3, l4. 3) refine hash bit fields for gtpu header. Signed-off-by: Xuan Ding <xuan.ding@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-13 18:51:57 +01:00
Ruifeng Wang	d16331c054	config/arm: add Neoverse N2 Add Arm Neoverse N2 cpu support. Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-01-14 16:42:25 +01:00
Ruifeng Wang	fe55802814	common/octeontx2: fix build with SVE Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:4002: Error: selected processor does not support `mov z3.b,#0' {standard input}:4003: Error: selected processor does not support `whilelo p1.b,xzr,x7' {standard input}:4005: Error: selected processor does not support `ld1b z0.b,p1/z,[x8]' {standard input}:4006: Error: selected processor does not support `whilelo p4.s,wzr,w7' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Added SVE to the cpu model specified by inline assembly for SVE support. Not replacing the inline assembly with C atomics because the driver relies on specific LSE instruction to interface to co-processor [1]. Fixes: 8a4f835971f5 ("common/octeontx2: add IO handling APIs") Cc: stable@dpdk.org [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2021-01-14 16:42:25 +01:00
Ruifeng Wang	e88bd47467	net/octeontx: fix build with SVE Building with gcc 10.2 with SVE extension enabled got error: {standard input}: Assembler messages: {standard input}:91: Error: selected processor does not support `addvl x4,x8,#-1' {standard input}:95: Error: selected processor does not support `ptrue p1.d,all' {standard input}:135: Error: selected processor does not support `whilelo p2.d,xzr,x5' {standard input}:137: Error: selected processor does not support `decb x1' This is because inline assembly code explicitly resets cpu model to not have SVE support. Thus SVE instructions generated by compiler auto vectorization got rejected by assembler. Added SVE to the cpu model specified by inline assembly for SVE support. Not replacing the inline assembly with C atomics because the driver relies on specific LSE instruction to interface to co-processor [1]. Fixes: f0c7bb1bf778 ("net/octeontx/base: add octeontx IO operations") Cc: stable@dpdk.org [1] https://mails.dpdk.org/archives/dev/2021-January/196092.html Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>	2021-01-14 16:42:25 +01:00
Ruifeng Wang	21c4f1c7b2	net/hns3: fix build with SVE Building with SVE extension enabled stopped with error: error: ACLE function ‘svwhilelt_b64_s32’ requires ISA extension ‘sve’ 18 \| #define PG64_256BIT svwhilelt_b64(0, 4) This is caused by unintentional cflags reset. Fixed the issue by not touching cflags, and using flags defined by compiler. Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-01-14 16:42:25 +01:00
Ruifeng Wang	67b68824a8	lpm/arm: support SVE Added new path to do lpm4 lookup by using scalable vector extension. The SVE path will be selected if compiler has flag SVE set. Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2021-01-14 16:42:25 +01:00
Ruifeng Wang	f942122fef	test: improve coverage on LPM tbl8 Existing test cases create 256 tbl8 groups for testing. The number covers only 8 bit next_hop/group field. Since the next_hop/group field had been extended to 24-bits, creating more than 256 groups in tests can improve the coverage. Coverage was not expanded to reach the max supported group number, because it would take too much time to run for this fast-test. Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2021-01-14 16:41:40 +01:00
Ruifeng Wang	5702b7bf1c	lpm: fix vector IPv4 lookup rte_lpm_lookupx4 could return a wrong next hop when more than 256 tbl8 groups are created. This is caused by incorrect type casting of the tbl8 group index that is stored in the tbl24 entry. The casting truncated the group index, so the wrong tbl8 group was searched. The issue is fixed by applying a proper mask to the tbl24 entry to get the tbl8 group index. Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field") Fixes: cbc2f1dccfba ("lpm/arm: support NEON") Fixes: d2cc7959342b ("lpm: add AltiVec for ppc64") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2021-01-14 14:19:57 +01:00
Vladimir Medvedkin	6e4d4a6381	fib6: improve AVX512 lookup performance Improved performance for AVX512 FIB6 lookup by doubling the number of flows being processed Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-01-13 22:13:37 +01:00
Dmitry Kozlyuk	da042bcfc6	build: fix linker flags on Windows The --export-dynamic linker option is only applicable to ELF. On Windows, where COFF is used, it causes warnings: x86_64-w64-mingw32-ld: warning: --export-dynamic is not supported for PE+ targets, did you mean --export-all-symbols? (MinGW) LINK : warning LNK4044: unrecognized option '/-export-dynamic'; ignored (clang) Don't add --export-dynamic on Windows anywhere. Fixes: b031e13d7f0d ("build: fix plugin load on static build") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-01-13 22:13:37 +01:00
Eugeny Parshutin	6a9d1e28f1	doc: add vtune profiling config to prog guide Return back 'profiling with vtune' section to profiling programmers guide with updated instruction on how to enable vtune profiling with meson configuration option. Fixes: 89c67ae2cba7 ("doc: remove references to make from prog guide") Cc: stable@dpdk.org Signed-off-by: Eugeny Parshutin <eugeny.parshutin@linux.intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-01-13 21:25:13 +01:00
Thomas Monjalon	0144eeafd1	devtools: adjust verbosity of ABI check The scripts gen-abi.sh and check-abi.sh are updated to print error messages to stderr so they are likely never ignored. When called from test-meson-builds.sh, the standard messages on stdout can be more quiet depending on the verbosity settings. The beginning of the ABI check is announced in verbose mode. The commands are printed in very verbose mode. The check result details are available in verbose mode. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-01-13 00:04:33 +01:00
Ophir Munk	e5e518edd6	app/regex: measure performance with precise clock Performance measurement (elapsed time and Gbps) are based on Linux clock() API. The resolution is improved by replacing the clock() API with rte_rdtsc_precise() API. Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-13 00:04:27 +01:00
Ophir Munk	6e3c6bd6ab	app/regex: measure performance per queue pair Up to this commit measuring the parsing elapsed time and Giga bits per second performance was done on the aggregation of all QPs (per core). This commit separates the time measurements per individual QP. Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-13 00:00:21 +01:00
Ophir Munk	6b99ba8d4b	app/regex: support multiple cores Up to this commit the regex application was running with multiple QPs on a single core. This commit adds the option to specify a number of cores on which multiple QPs will run. A new parameter 'nb_lcores' was added to configure the number of cores: --nb_lcores <num of cores>. If not configured the number of cores is set to 1 by default. On application startup a few initial steps occur by the main core: the number of QPs and cores are parsed. The QPs are distributed as evenly as possible on the cores. The regex device and all QPs are initialized. The data file is read and saved in a buffer. Then for each core the application calls rte_eal_remote_launch() with the worker routine (run_regex) as its parameter. Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:59:51 +01:00
Ophir Munk	f5cffb7eb7	app/regex: read data file once at startup Up to this commit the input data file was read from scratch for each QP, which is redundant. Starting from this commit the data file is read only once at startup. Each QP will clone the data. Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:59:33 +01:00
Ophir Munk	4545bd0088	app/regex: support multiple queue pairs Up to this commit the regex application used one QP which was assigned a number of jobs, each with a different segment of a file to parse. This commit adds support for multiple QPs assignments. All QPs will be assigned the same number of jobs, with the same segments of file to parse. It will enable comparing functionality with different numbers of QPs. All queues are managed on one core with one thread. This commit focuses on changing routines API to support multi QPs, mainly, QP scalar variables are replaced by per-QP struct instance. The enqueue/dequeue operations are interleaved as follows: enqueue(QP #1) enqueue(QP #2) ... enqueue(QP #n) dequeue(QP #1) dequeue(QP #2) ... dequeue(QP #n) A new parameter 'nb_qps' was added to configure the number of QPs: --nb_qps <num of qps>. If not configured, nb_qps is set to 1 by default. Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:56:46 +01:00
Ophir Munk	2d1fb3f2a6	app/regex: move mempool creation to worker routine Function rte_pktmbuf_pool_create() is moved from init_port() routine to run_regex() routine. Looking forward on multi core support - init_port() will be called only once as part of application startup while mem pool creation should be called multiple times (per core). Signed-off-by: Ophir Munk <ophirmu@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:56:13 +01:00
Ori Kam	9b27a37b84	regex/mlx5: add response flags This commit propagates the response flags from the regex engine. Signed-off-by: Francis Kelly <fkelly@nvidia.com> Signed-off-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:32:04 +01:00
Ori Kam	1922db13bf	regexdev: add resource limit reached flag When scanning a buffer it is possible that the scan will abort due to some internal resource limit. This commit adds such response flag, so application can handle such cases. Signed-off-by: Francis Kelly <fkelly@nvidia.com> Signed-off-by: Ori Kam <orika@nvidia.com>	2021-01-12 23:31:39 +01:00
Tal Shnaiderman	b1fd151267	eal: add generic thread-local-storage functions Add support for TLS functionality in EAL. The following functions are added: rte_thread_tls_key_create - create a TLS data key. rte_thread_tls_key_delete - delete a TLS data key. rte_thread_tls_value_set - set value bound to the TLS key rte_thread_tls_value_get - get value bound to the TLS key TLS key is defined by the new type rte_tls_key. The API allocates the thread local storage (TLS) key. Any thread of the process can subsequently use this key to store and retrieve values that are local to the thread. Those functions are added in addition to TLS capability in rte_per_lcore.h to allow abstraction of the pthread layer for all operating systems. Windows implementation is under librte_eal/windows and implemented using WIN32 API for Windows only. Unix implementation is under librte_eal/unix and implemented using pthread for UNIX compilation. Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-01-11 23:28:12 +01:00
Tal Shnaiderman	d136fae560	eal: move thread affinity functions to new file Move the definition of the functions rte_thread_set_affinity and rte_thread_get_affinity to new file, rte_thread.h The file will implement generic threading functionality and will only host threading functions which do not reference pthread API. Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-01-11 23:27:39 +01:00
Alvin Zhang	ef4c16fd91	net/i40e: refactor RSS flow 1. Delete original code. 2. Add 2 tables(One maps flow pattern and RSS type to PCTYPE, another maps RSS type to input set). 3. Parse RSS pattern and RSS type to get PCTYPE. 4. Parse RSS action to get queues, RSS function and hash field. 5. Create and destroy RSS filters. 6. Create new files for hash flows. Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:20:09 +01:00
Alvin Zhang	c222d2a1d0	net/i40e: fix returned code for RSS hardware failure The API should return the system error status, but it returned the hardware error status, which confuses the caller. This patch adds a check on the hardware execution status and returns -EIO in case of hardware execution failure. Fixes: 1d4b2b4966bb ("net/i40e: fix VF overwrite PF RSS LUT for X722") Fixes: d0a349409bd7 ("i40e: support AQ based RSS config") Cc: stable@dpdk.org Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:20:09 +01:00
Alvin Zhang	742d9f87f6	doc: fix RSS flow description in i40e guide The command here does not create a queue region, but only sets the lookup table, so the description in the doc is not exact. Fixes: feaae285b342 ("net/i40e: support hash configuration in RSS flow") Cc: stable@dpdk.org Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:20:09 +01:00
Qi Zhang	d84e220a8d	net/ice/base: update copyright date Updated the Copyright for 2021 Updated ice driver version. Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:09 +01:00
Qi Zhang	d5be7f9375	net/ice/base: update add scheduler node counter The number of nodes added counter was updated incorrectly. This issue was exposed when the driver tried to add more than 128 queues per TC. Fix added to update the counter correctly. Fixes: 93e84b1bfc92 ("net/ice/base: add basic Tx scheduler") Cc: stable@dpdk.org Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	36a7d65eb5	net/ice/base: cleanup style A few style issues reported by checkpatch have snuck into the code, resolve the style issues. PARENTHESIS_ALIGNMENT: Alignment should match open parenthesis COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	9966f7fcb9	net/ice/base: support GTPU inner for AVF flow director Add dummy packets for IPV4_GTPU with inner IPV4/UDP/TCP with all kinds of GTPU (EH) type (i.e., IP/EH/DL/UL) for AVF FDIR. Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	02d6b64051	net/ice/base: limit forced overrides based on FW version Beyond a specific version of firmware, there is no need to provide override values to the firmware when setting PHY capabilities. In this case, we do not need to indicate whether we're in Strict or Lenient Link Mode. In the case of translating capabilities to the configuration structure, the module compliance enforcement is already correctly set by firmware, so the extra code block is redundant. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	964bafcf5e	net/ice/base: fix memory handling Fixed memory handling when memory allocated in user space was handled as memory allocated in kernel space within QV os_dep implementation of the ice_memdup function. Fixes: 93e84b1bfc92 ("net/ice/base: add basic Tx scheduler") Cc: stable@dpdk.org Signed-off-by: Andrii Pypchenko <andrii.pypchenko@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	b3d554edfe	net/ice/base: add package ptype enable information Scan the 'Marker PType TCAM' session to retrieve the Rx parser PTYPE enable information from the current package. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	171522b829	net/ice/base: remove deprecated field hw_vsi_id is used to replace vsi_id, so remove the deprecated vsi_id. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:03:08 +01:00
Qi Zhang	9ea028123a	net/ice/base: align add VSI and update VSI AQ command buffer Aligned the buffer of the following admin commands to their new definitions: * 0x210 = add_vsi * 0x211 = update_vsi Signed-off-by: Shay Amir <shay.amir@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2021-01-08 19:02:58 +01:00
Maxime Coquelin	52ae8f2fab	net/virtio: improve logs in vhost-vDPA DMA mapping This patch adds debug logs in vhost_vdpa_dma_map() and vhost_vdpa_dma_unmap() to ease debugging. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-08 18:07:56 +01:00
Maxime Coquelin	be1525c6b4	vhost: refactor memory regions mapping This patch moves memory region mmaping and related preparation in a dedicated function in order to simplify VHOST_USER_SET_MEM_TABLE request handling function. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-08 18:07:56 +01:00
Maxime Coquelin	761ea501ce	vhost: refactor postcopy registration This patch moves the registration of postcopy to a dedicated function, with the goal of simplifying VHOST_USER_SET_MEM_TABLE request handling function. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-08 18:07:56 +01:00
Maxime Coquelin	fc2225dbc5	vhost: refactor postcopy region registration This patch moves the registration of memory regions to userfaultfd to a dedicated function, with the goal of simplifying VHOST_USER_SET_MEM_TABLE request handling function. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-01-08 18:07:56 +01:00
Xueming Li	1f93bee4e7	vdpa/mlx5: add hardware queue moderation The next parameters control the HW queue moderation feature. This feature helps to control the traffic performance and latency trade-off. Each packet completion report from HW to SW requires CQ processing by SW and triggers interrupt for the guest driver. Interrupt report and handling cost CPU cycles and time and the amount of this affects directly on packet performance and latency. hw_latency_mode parameters [int] 0, HW default. 1, Latency is counted from the first packet completion report. 2, Latency is counted from the last packet completion. hw_max_latency_us parameters [int] 0 - 4095, The maximum time in microseconds that packet completion report can be delayed. hw_max_pending_comp parameter [int] 0 - 65535, The maximum number of pending packets completions in an HW queue. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:56 +01:00
Xueming Li	6623dc2b76	common/mlx5: support vDPA completion queue moderation This patch introduces new parameters for VirtQ CQ moderation, used for performance tuning. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:56 +01:00
Joyce Kong	a33c3584f3	vhost: replace SMP with thread fence for control path Simply replace the smp barriers with atomic thread fence for vhost control path, if there are no synchronization points. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:56 +01:00
Joyce Kong	5faf0a9c54	vhost: replace SMP with thread fence for packed vring Simply replace smp barriers with atomic thread fence for virtio packed vring. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:55 +01:00
Joyce Kong	10b8c36af0	vhost: relax full barriers for used idx Used idx can be synchronized by one-way barrier instead of full write barrier for split vring. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:55 +01:00
Joyce Kong	9253c34cfb	vhost: relax full barriers for desc flags Relax the full read barrier to one-way barrier for desc flags in packed vring. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:55 +01:00
Joyce Kong	2d031675b2	vhost: remove unnecessary SMP barrier for avail idx The ordering between avail index and desc reads has been enforced by load-acquire for split vring, so smp_rmb barrier is not needed behind it. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-01-08 18:07:55 +01:00
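
The "lpm: fix vector IPv4 lookup" entry (5702b7bf1c) describes a tbl8 group index being truncated by a narrowing cast and then recovered with a mask. The following is a minimal sketch of that class of bug, not the actual DPDK code: the bit layout (group index in the low 24 bits, flags above), the constants, and the function names are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: the low 24 bits of a tbl24 entry
 * hold the tbl8 group index, the upper 8 bits hold flags. */
#define TBL8_GROUP_MASK    0x00FFFFFFu
#define TBL8_GROUP_ENTRIES 256u

/* Buggy variant: the cast keeps only the low 8 bits, so any group
 * index above 255 silently wraps to a different group. */
static uint32_t tbl8_base_truncated(uint32_t tbl24_entry)
{
    return (uint32_t)(uint8_t)tbl24_entry * TBL8_GROUP_ENTRIES;
}

/* Fixed variant: mask off the flag bits and keep all 24 index bits. */
static uint32_t tbl8_base_masked(uint32_t tbl24_entry)
{
    return (tbl24_entry & TBL8_GROUP_MASK) * TBL8_GROUP_ENTRIES;
}

/* Small self-check: group 300 (0x12C) needs more than 8 bits. */
static void tbl8_index_self_check(void)
{
    uint32_t entry = 0x03000000u | 300u; /* flag bits set, group 300 */

    assert(tbl8_base_masked(entry) == 300u * TBL8_GROUP_ENTRIES);
    /* 0x12C truncated to 8 bits is 0x2C, i.e. group 44. */
    assert(tbl8_base_truncated(entry) == 44u * TBL8_GROUP_ENTRIES);
}
```

With 256 or fewer groups both variants agree, which is consistent with the companion test commit (f942122fef) raising the number of tbl8 groups above 256 to improve coverage.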

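The "app/regex: support multiple cores" entry (6b99ba8d4b) says the QPs are distributed as evenly as possible over the cores. One common way to do that is sketched below. It is an assumption, not the application's actual code: the commit message does not say which cores receive the extra QPs when the division is not exact, and the function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of QPs handled by core `core_idx` (0-based) when `nb_qps`
 * queue pairs are split over `nb_lcores` cores. Assumption: the first
 * (nb_qps % nb_lcores) cores each take one extra QP. */
static uint32_t qps_on_core(uint32_t nb_qps, uint32_t nb_lcores,
                            uint32_t core_idx)
{
    assert(nb_lcores > 0 && core_idx < nb_lcores);

    uint32_t base = nb_qps / nb_lcores;
    uint32_t rem = nb_qps % nb_lcores;

    return base + (core_idx < rem ? 1u : 0u);
}

/* Index of the first QP owned by core `core_idx` under the same split,
 * so that each core works on a contiguous range of QP ids. */
static uint32_t first_qp_on_core(uint32_t nb_qps, uint32_t nb_lcores,
                                 uint32_t core_idx)
{
    assert(nb_lcores > 0 && core_idx < nb_lcores);

    uint32_t base = nb_qps / nb_lcores;
    uint32_t rem = nb_qps % nb_lcores;

    /* Every core before this one took `base` QPs, and the first
     * min(core_idx, rem) of them took one more. */
    return core_idx * base + (core_idx < rem ? core_idx : rem);
}
```

For example, under this scheme 7 QPs on 3 cores would be split as 3, 2 and 2, starting at QP ids 0, 3 and 5. Per-core counts never differ by more than one.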

26307 Commits