4163 Commits

Author SHA1 Message Date
Anatoly Burakov
8f7335c1be mempool: use memseg walk instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:49:16 +02:00
Anatoly Burakov
221b67bca0 eal: use memseg walk instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:48:15 +02:00
Anatoly Burakov
2b9f98d8a5 mem: add function to walk all memsegs
For code that might need to iterate over list of allocated
segments, using this API will make it more resilient to
internal API changes and will prevent copying the same
iteration code over and over again.

Additionally, down the line there will be locking implemented,
so users of this API will not need to care about locking
either.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:47:25 +02:00
Anatoly Burakov
ba0009560c mempool: support new allocation methods
If a user has specified that the zone should have contiguous memory,
add a memzone flag to request contiguous memory. Otherwise, account
for the fact that unless we're in IOVA_AS_VA mode, we cannot
guarantee that the pages would be physically contiguous, so we
calculate the memzone size and alignments as if we were getting
the smallest page size available.

However, for the non-IOVA contiguous case, existing mempool size
calculation function doesn't give us expected results, because it
will return memzone sizes aligned to page size (e.g. a 1MB mempool
may use an entire 1GB page), therefore in cases where we weren't
specifically asked to reserve non-contiguous memory, first try
reserving a memzone as IOVA-contiguous, and if that fails, then
try reserving with page-aligned size/alignment.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:45:48 +02:00
Anatoly Burakov
74dbbcd6f8 ethdev: use contiguous allocation for DMA memory
All hardware drivers should allocate IOVA-contiguous
memzones for their hardware resources.

This fixes the following drivers in one go:

grep -Rl rte_eth_dma_zone_reserve drivers/

drivers/net/avf/avf_rxtx.c
drivers/net/thunderx/nicvf_ethdev.c
drivers/net/e1000/igb_rxtx.c
drivers/net/e1000/em_rxtx.c
drivers/net/fm10k/fm10k_ethdev.c
drivers/net/vmxnet3/vmxnet3_rxtx.c
drivers/net/liquidio/lio_rxtx.c
drivers/net/i40e/i40e_rxtx.c
drivers/net/sfc/sfc.c
drivers/net/ixgbe/ixgbe_rxtx.c
drivers/net/nfp/nfp_net.c

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:44:53 +02:00
Anatoly Burakov
23fa86e529 memzone: enable IOVA-contiguous reserving
This adds a new flag to request reserved memzone to be IOVA
contiguous. This is useful for allocating hardware resources like
NIC rings/queues etc.For now, hugepage memory is always contiguous,
but we need to prepare the drivers for the switch.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:44:05 +02:00
Anatoly Burakov
5ea85289a9 malloc: support contiguous allocation
No major changes, just add some checks in a few key places, and
a new parameter to pass around.

Also, add a function to check malloc element for physical
contiguousness. For now, assume hugepage memory is always
contiguous, while non-hugepage memory will be checked.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:43:55 +02:00
Anatoly Burakov
d1162b77c9 malloc: replace panics with error messages
We shouldn't ever panic in libraries, let alone in EAL, so
replace all panic messages with error messages.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:43:50 +02:00
Anatoly Burakov
883179b493 malloc: make free return resulting element
This will be needed because we need to know how big is the
new empty space, to check whether we can free some pages as
a result.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:43:41 +02:00
Anatoly Burakov
0a59238f80 malloc: make free list removal function public
We will need to be able to remove entries from free lists from
heaps during certain events, such as rollbacks, or when freeing
memory to the system (where a previously element disappears and
thus can no longer be in the free list).

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:41:39 +02:00
Anatoly Burakov
f21aa4ec9d malloc: make join elements function public
Down the line, we will need to join free segments to determine
whether the resulting contiguous free space is bigger than a
page size, allowing to free some memory back to the system.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:38:08 +02:00
Anatoly Burakov
30bc6bf0d5 malloc: add function to dump heap contents
Malloc heap is now a doubly linked list, so it's now possible to
iterate over each malloc element regardless of its state.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:37:53 +02:00
Anatoly Burakov
bb372060da malloc: make heap a doubly-linked list
As we are preparing for dynamic memory allocation, we need to be
able to handle holes in our malloc heap, hence we're switching to
doubly linked list, and prepare infrastructure to support it.

Since our heap is now aware where are our first and last elements,
there is no longer any need to have a dummy element at the end of
each heap, so get rid of that as well. Instead, let insert/remove/
join/split operations handle end-of-list conditions automatically.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:37:46 +02:00
Anatoly Burakov
b5dd92226f malloc: move all locking to heap
Down the line, we will need to do everything from the heap as any
alloc or free may trigger alloc/free OS memory, which would involve
growing/shrinking heap.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:37:39 +02:00
Anatoly Burakov
b7cc54187e mem: move virtual area function in common directory
Move get_virtual_area out of linuxapp EAL memory and make it
common to EAL, so that other code could reserve virtual areas
as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:33:06 +02:00
Anatoly Burakov
bef5a2d629 vfio: do not needlessly check for IOVA mode
We already set IOVA addresses of memsegs and memzones to VA
address during initialization, so we don't need to check
whether we're in RTE_IOVA_VA mode anywhere else.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2018-04-11 02:18:19 +02:00
Anatoly Burakov
048303b6f3 mem: do not use physical addresses in IOVA as VA mode
We already use VA addresses for IOVA purposes everywhere if we're in
RTE_IOVA_VA mode:
 1) rte_malloc_virt2phy()/rte_malloc_virt2iova() always return VA addresses
 2) Because of 1), memzone's IOVA is set to VA address on reserve
 3) Because of 2), mempool's IOVA addresses are set to VA addresses

The only place where actual physical addresses are stored is in memsegs at
init time, but we're not using them anywhere, and there is no external API
to get those addresses (aside from manually iterating through memsegs), nor
should anyone care about them in RTE_IOVA_VA mode.

So, fix EAL initialization to allocate VA-contiguous segments at the start
without regard for physical addresses (as if they weren't available), and
use VA to set final IOVA addresses for all pages.

Fixes: 62196f4e0941 ("mem: rename address mapping function to IOVA")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2018-04-11 02:15:24 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Jan Viktorin
07cbe8f27d eal/arm: use SPDX tag for Cavium and RehiveTech copyright file
Replace the BSD license header with the SPDX tag for files
with a RehiveTech and Cavium copyright on them.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2018-04-11 01:47:46 +02:00
Jan Viktorin
27d8b82635 use SPDX tag for RehiveTech copyright files
Replace the BSD license header with the SPDX tag for files
with only an RehiveTech copyright on them.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2018-04-11 01:47:43 +02:00
Pavan Nikhilesh
e166e55c1a hash: fix missing spinlock unlock in add key
Fix missing spinlock unlock during add key when key is already present.

Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2018-04-10 23:35:40 +02:00
Jasvinder Singh
8173863865 table: remove incorrect check for ACL
Remove wrong check for table entry pointer.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
2018-04-04 12:26:20 +02:00
Jasvinder Singh
f0e352ddb0 pipeline: add port in action APIs
This API provides a common set of actions for pipeline input ports to speed
up application development.

Each pipeline input port can be assigned an action handler to be executed
on every input packet during the pipeline execution.

The pipeline library allows the user to define his own input port actions
by providing customized input port action handler. While the user can
still follow this process, this API is intended to provide a quicker
development alternative for a set of predefined actions.

The typical steps to use this API are:
* Define an input port action profile.
* Instantiate the input port action profile to create input port action
  objects.
* Use the input port action to generate the input port action handler
  invoked by the pipeline.
* Use the input port action object to generate the internal data structures
  used by the input port action handler based on given action parameters.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:26:07 +02:00
Jasvinder Singh
934db41a31 pipeline: add load balance action
Add implementation of the load balance action.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:26 +02:00
Jasvinder Singh
2c3558c6bf pipeline: add timestamp action
Add implementation of timestamp action.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:26 +02:00
Jasvinder Singh
394a0739c8 pipeline: add statistics read action
Add implementation of stats read action

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:25 +02:00
Jasvinder Singh
625f4d4040 pipeline: add TTL update action
Add implementation of ttl update action.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:25 +02:00
Jasvinder Singh
a19dc6cd01 pipeline: add NAT action
Add implementation of Network Address Translation(NAT) action.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:25 +02:00
Jasvinder Singh
871f44164e pipeline: add packet encapsulation action
Add implementation of different type of packet encap
such as vlan, qinq, mpls, pppoe, etc.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:24 +02:00
Jasvinder Singh
c8ae949197 pipeline: add traffic manager action
Add implementation of traffic manager action.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:24 +02:00
Jasvinder Singh
7c9e5b9a12 pipeline: add traffic metering action
Add traffic metering action implementation.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:23 +02:00
Jasvinder Singh
406a2bc0c6 pipeline: get table action params
Add API to specify action related parameters such as action
handler, table entry data size, etc. for the pipeline table.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:23 +02:00
Jasvinder Singh
654dd41112 pipeline: add table action APIs
This API provides a common set of actions for pipeline tables to speed up
application development.

Each match-action rule added to a pipeline table has associated data
that stores the action context. This data is input to the table
action handler called for every input packet that hits the rule as
part of the table lookup during the pipeline execution.

The pipeline library allows the user to define his own table
actions by providing customized table action handlers (table
lookup) and complete freedom of setting the rules and their data
(table rule add/delete). While the user can still follow this
process, this API is intended to provide a quicker development
alternative for a set of predefined actions.

The typical steps to use this API are:
* Define a table action profile.
* Instantiate the table action profile to create table action objects.
* Use the table action object to generate the pipeline table action
  handlers (invoked by the pipeline table lookup operation).
* Use the table action object to generate the rule data (for the
  pipeline table rule add operation) based on given action parameters.
* Use the table action object to read action data (e.g. stats counters)
  for any given rule.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2018-04-04 12:21:11 +02:00
Anatoly Burakov
952b207772 eal: provide API for querying valid socket ids
During lcore scan, find all socket ID's and store them, and
provide public API to query valid socket id's. This will break
the ABI, so bump ABI version.

Also, remove deprecation notice corresponding to this change.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-05 00:27:13 +02:00
Anatoly Burakov
f05e26051c eal: add IPC asynchronous request
This API is similar to the blocking API that is already present,
but reply will be received in a separate callback by the caller
(callback specified at the time of request, rather than registering
for it in advance).

Under the hood, we create a separate thread to deal with replies to
asynchronous requests, that will just wait to be notified by the
main thread, or woken up on a timer.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2018-04-04 23:47:59 +02:00
Anatoly Burakov
ce3a731235 eal: rename IPC request as synchronous one
Rename rte_mp_request to rte_mp_request_sync to indicate
that this request will be done synchronously (as opposed to
asynchronous request, which comes in next patch).

Also, fix alphabetical ordering for .map file.

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2018-04-04 23:32:21 +02:00
Anatoly Burakov
0891faf5d8 eal: rename IPC sync request to pending request
Originally, there was only one type of request which was used
for multiprocess synchronization (hence the name - sync request).

However, now that we are going to have two types of requests,
synchronous and asynchronous, having it named "sync request" is
very confusing, so we will rename it to "pending request". This
is internal-only, so no externally visible API changes.

Suggested-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2018-04-04 23:30:32 +02:00
Stephen Hemminger
8ea081f381 mbuf: fix truncated strncpy
Gcc-8 discovers issue with platform_mempool_ops.
rte_mbuf_pool_ops.c:26:3: error: ‘strncpy’ output truncated before
  terminating nul copying as many bytes from a string as its length
  [-Werror=stringop-truncation]
  strncpy(mz->addr, ops_name,  strlen(ops_name));

Since the ops_name is already checked for size, using strncpy
here is unnecessary; just use strcpy.

Fixes: a3acc3144a76 ("mbuf: add pool ops selection functions")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:34:20 +02:00
Remy Horton
255d42d5b6 metrics: fix potential missing string termination
Fixes a potential memory overrun detected by Coverity.
This overrun cannot currently happen in practice because
rte_metrics_reg_names() explicitly forces the last name
character to be a NULL terminator.

This patches uses strlcpy instead of strncpy to copy name strings.

Coverity issue: 143434
Fixes: 349950ddb9c5 ("metrics: add information metrics library")
Fixes: 710cab6f675a ("metrics: fix out of bound access")

Signed-off-by: Remy Horton <remy.horton@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-04 17:33:08 +02:00
Bruce Richardson
c022cb400e convert snprintf to strlcpy
Since we have support for the strlcpy function in DPDK, replace all
instances where a string is copied using snprintf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Bruce Richardson
5364de644a eal: support strlcpy function
The strncpy function is error prone for doing "safe" string copies, so
we generally try to use "snprintf" instead in the code. The function
"strlcpy" is a better alternative, since it better conveys the
intention of the programmer, and doesn't suffer from the non-null
terminating behaviour of it's n'ed brethern.

The downside of this function is that it is not available by default
on linux, though standard in the BSD's. It is available on most
distros by installing "libbsd" package.

This patch therefore provides the following in rte_string_fns.h to ensure
that strlcpy is available there:
* for BSD, include string.h as normal
* if RTE_USE_LIBBSD is set, include <bsd/string.h>
* if not set, fallback to snprintf for strlcpy

Using make build system, the RTE_USE_LIBBSD is a hard-coded value to "n",
but when using meson, it's automatically set based on what is available
on the platform.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Pavan Nikhilesh
08f683174e eal: add functions for previous power of 2 alignment
Add 32b and 64b API's to align the given integer to the previous power
of 2. Update common auto test to include test for previous power of 2 for
both 32 and 64bit integers.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
2018-04-04 17:33:08 +02:00
Pavan Nikhilesh
5120203d75 eal: add macros to align value to multiple
Add macros to align given value to the multiple of the supplied
integer.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
2018-04-04 13:43:34 +02:00
Stephen Hemminger
2f35377892 mem: use z specifier to format size_t
The recommended way to format size_t in printf is to use the
z modifier which handles the case where size_t maybe 32 or 64 bits.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 13:43:33 +02:00
Stephen Hemminger
97d4aaffa7 pci: use z specifier to format size_t
This addresses potential issues where size_t and off_t can vary
on some platforms.  For size_t the best way to format the value
is to use the z modifier to printf. For off_t need to cast to
long long to handle 64 bit offset on 32 bit platforms.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 13:43:33 +02:00
Jianfeng Tan
768274ebbd vhost: avoid populate guest memory
It's not necessary to populate guest memory from vhost side unless
zerocopy is enabled or users want better performance.

Update the doc for guest memory requirement clarification.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-03-30 17:25:45 +02:00
Tonghao Zhang
d64c43773a vhost: add pipe event for optimizing negotiation
When vhost-user connects qemu successfully, dpdk will call
the vhost_user_add_connection to add unix socket fd to poll.
And fdset_add only set the socket fd to a fdentry while poll
may sleep now. In a general case, this is no problem. But if
we use hot update for vhost-user, most downtime of VMs network
is 750+ms. This patch adds pipe event, so after connections are
ok, dpdk rebuild the poll immediately. With this patch, the
most downtime is 20~30ms.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-03-30 17:25:45 +02:00
Tonghao Zhang
9426ee2678 vhost: move stdbool include
The vhost.h file uses bool type, but not include stdbool
header file. If other c files include vhost.h directly,
there will be a compile error.

This patch will be used in the next patch.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-03-30 14:08:44 +02:00
Tonghao Zhang
ce5bd5fcae vhost: add fdset-event thread name
This patch adds the name for vhost fdset thread.
It can help us to know whether the thread is running.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
2018-03-30 14:08:44 +02:00
Tonghao Zhang
2db2d3220b vhost: raise error on fdset-thread creation
When first call the 'rte_vhost_driver_start', the
fdset_event_dispatch thread should be created successfully.
Because the vhost uses it to poll socket events for vhost
server or clients. Without it, for example, vhost will not
get the connection event.

This patch returns err code directly when created not successful.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
2018-03-30 14:08:44 +02:00