Commit Graph

3563 Commits

Adrien Mazarguil
5b4efcc6b9 ethdev: expose flow API error helper
rte_flow_error_set() is a convenient helper to initialize error objects.

Since there is no fundamental reason to prevent applications from using it,
expose it through the public interface after modifying its return value
from positive to negative. This is done for consistency with the rest of
the public interface.

Documentation is updated accordingly.
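
A minimal sketch of how a PMD callback might use the now-public helper
(the surrounding function and item check are hypothetical):

    #include <errno.h>
    #include <rte_flow.h>

    /* Hypothetical PMD validation step: fill the caller-supplied error
     * object and return a negative errno, per the new convention. */
    static int
    flow_check_item(const struct rte_flow_item *item,
                    struct rte_flow_error *error)
    {
        if (item->type != RTE_FLOW_ITEM_TYPE_ETH)
            return rte_flow_error_set(error, ENOTSUP,
                                      RTE_FLOW_ERROR_TYPE_ITEM, item,
                                      "only Ethernet items supported");
        return 0;
    }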

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Zhiyong Yang
3eb7e207ea ethdev: fix port id type
Some recently applied features were still developed against the older
uint8_t port_id type, but the port_id range has since been increased to
uint16_t. This patch fixes the remaining occurrences.

Fixes: f8244c6399 ("ethdev: increase port id range")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
2017-10-13 01:17:49 +01:00
Pavan Bhagavatula
bc48589e47 eal: move bitmap from sched library
librte_sched uses rte_bitmap to manage large arrays of bits in an
optimized way; moving it to eal/common allows other libraries and
applications to use it.
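
A minimal usage sketch, assuming the rte_bitmap.h API is unchanged by
the move (footprint query, init over caller-provided memory, then
set/get/clear):

    #include <rte_bitmap.h>
    #include <rte_malloc.h>

    /* Allocate cache-line-aligned backing memory and initialize a
     * bitmap over it; positions can then be set, tested and cleared. */
    static struct rte_bitmap *
    bitmap_create(uint32_t n_bits)
    {
        uint32_t size = rte_bitmap_get_memory_footprint(n_bits);
        uint8_t *mem = rte_zmalloc("bmp", size, RTE_CACHE_LINE_SIZE);

        if (mem == NULL)
            return NULL;
        return rte_bitmap_init(n_bits, mem, size);
        /* rte_bitmap_set(bmp, pos); rte_bitmap_get(bmp, pos); ... */
    }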

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-10-12 22:31:33 +02:00
Luca Boccassi
f642036f3a mk: sort headers before wildcard inclusion
In order to achieve fully reproducible builds, always use the same
inclusion order for headers in the Makefiles.

Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com>
2017-10-12 22:31:33 +02:00
Jianfeng Tan
7100dfe381 mem: honor IOVA mode for no-huge case
With the introduction of IOVA mode, the only blocker to running
with 4KB pages for NICs bound to vfio-pci is that
RTE_BAD_PHYS_ADDR is not a valid IOVA address.

We can refine this by using the VA as the IOVA when running in VA IOVA mode.
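
A sketch of the idea (rte_eal_iova_mode() and the fallback shown here
are assumptions about the surrounding EAL code, not the exact diff):

    #include <rte_eal.h>
    #include <rte_memory.h>

    /* When EAL runs in VA mode (e.g. 4KB pages with vfio-pci), the
     * virtual address itself serves as the IOVA instead of the
     * RTE_BAD_PHYS_ADDR placeholder. */
    static rte_iova_t
    addr_to_iova(void *va)
    {
        if (rte_eal_iova_mode() == RTE_IOVA_VA)
            return (uintptr_t)va;
        /* may be RTE_BAD_PHYS_ADDR without hugepages */
        return rte_mem_virt2phy(va);
    }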

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2017-10-12 21:05:21 +01:00
Pablo de Lara
1a4998dc4d crypto/openssl: support AES-CCM
Add support for AES-CCM with 128, 192 and 256-bit keys.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
2017-10-12 15:22:07 +01:00
Pablo de Lara
3657283172 cryptodev: clarify API for AES-CCM
The AES-CCM algorithm has some restrictions on how the
nonce (IV) and AAD information are handled.

As the API states, the nonce must be placed 1 byte
after the start of the IV field. This field must
be 16 bytes long, regardless of the nonce length,
but it is important to clarify that the first byte
and the padding added after the nonce may be modified
by the PMDs implementing this algorithm.

The same applies to the AAD: it must be placed 18 bytes
after the start of the AAD field. That field must also
be a multiple of 16 bytes long, and all the reserved memory
(the first bytes and the padding) may be modified by the PMDs.

Lastly, the nonce does not need to be placed in the first 16 bytes
of the AAD, as the API previously stated, since that depends on the
PMD used, so that comment has been removed.
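
A layout sketch for a 12-byte nonce and a small AAD (buffer names and
sizes are illustrative only, not part of the API):

    #include <stdint.h>
    #include <string.h>

    static void
    ccm_layout(const uint8_t *nonce, const uint8_t *aad, size_t aad_len)
    {
        /* IV field: 16 bytes; the nonce starts at byte 1. Byte 0 and
         * the bytes after the nonce may be overwritten by the PMD. */
        uint8_t iv_field[16] = { 0 };
        memcpy(iv_field + 1, nonce, 12);

        /* AAD field: data starts at byte 18 and the field is padded to
         * a multiple of 16 bytes (here assuming aad_len <= 30); the
         * leading bytes and padding may also be modified by the PMD. */
        uint8_t aad_field[48] = { 0 };
        memcpy(aad_field + 18, aad, aad_len);
    }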

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
2017-10-12 15:14:46 +01:00
Pablo de Lara
761fd95d82 cryptodev: add function to retrieve device name
Currently, in order to get the name of a crypto device,
a user needs to access it through the crypto device structure.

It is better practice to have a function that retrieves this
name, given a device id.
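
Assuming the new accessor is rte_cryptodev_name_get() (see
rte_cryptodev.h), usage becomes:

    #include <stdio.h>
    #include <rte_cryptodev.h>

    /* Retrieve the device name from the id alone, without touching
     * the rte_cryptodev structure. */
    static void
    print_crypto_name(uint8_t dev_id)
    {
        const char *name = rte_cryptodev_name_get(dev_id);

        if (name != NULL)
            printf("crypto device %u: %s\n", dev_id, name);
    }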

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-12 15:14:45 +01:00
Pablo de Lara
effd3b9fcf cryptodev: allocate driver structure statically
When registering a crypto driver, a cryptodev driver
structure was being allocated with malloc.
Since this call may fail, it is safer to allocate
this memory statically in each PMD, so driver registration
can never fail.

Coverity issue: 158645
Fixes: 7a364faef1 ("cryptodev: remove crypto device type enumeration")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
2017-10-12 15:10:40 +01:00
Hemant Agrawal
e9508b64ca mempool: remove get capability debug log
This message does not need to be printed for every mempool call.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2017-10-12 03:30:26 +02:00
Ferruh Yigit
72e3efb149 ethdev: revert use port name from device structure
This reverts commit a1e7c17555.

The original commit assumed a 1:1 mapping between physical device and
ethdev port, so that the device name could be used per port instead of
the ethdev name field.

But one physical device may have multiple ethdev ports and each port
needs its own unique name.

One issue reported here:
http://dpdk.org/ml/archives/users/2017-September/002484.html

So revert the commit to continue using the ethdev name field per
port.

Fixes: a1e7c17555 ("ethdev: use device name from device structure")
Cc: stable@dpdk.org

Reported-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-12 01:52:50 +01:00
Matan Azrad
d5b0924ba6 ethdev: add return value to stats get dev op
The stats_get dev op API doesn't include a return value, so a PMD cannot
report an error when collecting statistics fails.

Since PCI devices can be removed, and there is a window between the
physical removal and the RMV interrupt, the user may get invalid stats
without any indication.

This patch changes the stats_get API return value to be int instead of
void.

All the net PMDs stats_get dev ops are adjusted by this patch.
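
With the int return value, callers can detect the failure (sketch;
the error handling shown is illustrative):

    #include <stdio.h>
    #include <rte_ethdev.h>

    /* A failed read (e.g. the device was removed before the RMV
     * interrupt fired) is now reported instead of yielding stale
     * counters. */
    static int
    read_port_stats(uint16_t port_id, struct rte_eth_stats *stats)
    {
        int ret = rte_eth_stats_get(port_id, stats);

        if (ret != 0)
            printf("stats unavailable on port %u: %d\n", port_id, ret);
        return ret;
    }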

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:52:49 +01:00
Raslan Darawsheh
42ffc45aa3 ethdev: add Rx HW timestamp capability
Add a new offload capability flag for Rx HW timestamping, which can be
enabled/disabled via rte_eth_rxmode.
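
A hedged sketch: probe the new capability, then read the timestamp on
mbufs that carry one (DEV_RX_OFFLOAD_TIMESTAMP, PKT_RX_TIMESTAMP and
mbuf->timestamp per the ethdev/mbuf headers of this release):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    static uint64_t
    rx_timestamp_of(uint16_t port_id, const struct rte_mbuf *m)
    {
        struct rte_eth_dev_info info;

        rte_eth_dev_info_get(port_id, &info);
        if (!(info.rx_offload_capa & DEV_RX_OFFLOAD_TIMESTAMP))
            return 0;                  /* port cannot timestamp */
        if (m->ol_flags & PKT_RX_TIMESTAMP)
            return m->timestamp;       /* device-provided Rx timestamp */
        return 0;
    }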

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2017-10-12 01:52:49 +01:00
David Harton
80d0ff81e8 ethdev: add return code to stats reset function
Some devices do not support resetting of eth stats.  An application may
need to know this so that it does not clear its shadow stats when the
device cannot.

rte_eth_stats_reset is updated to provide a return code to share
whether the device supports reset or not.
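
This lets an application keep its shadow counters consistent (sketch):

    #include <string.h>
    #include <rte_ethdev.h>

    static struct rte_eth_stats shadow_stats;

    /* Only clear the shadow copy when the device actually reset its
     * counters; e.g. -ENOTSUP means it could not. */
    static void
    reset_stats(uint16_t port_id)
    {
        if (rte_eth_stats_reset(port_id) == 0)
            memset(&shadow_stats, 0, sizeof(shadow_stats));
    }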

Signed-off-by: David Harton <dharton@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:36:58 +01:00
Stephen Hemminger
2cb43002af ethdev: increase device internal name length
Allow sufficient space for a UUID in string form (36 + 1 bytes).
This is needed to use UUIDs with Hyper-V.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:36:58 +01:00
Tonghao Zhang
e087d4cd3a ethdev: fix a comment for config struct
The type of rx_adv_conf has changed, so update its comment accordingly.

Fixes: 4bdefaade6 ("ethdev: VMDQ enhancements")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:36:58 +01:00
Mark Kavanagh
70e737e448 gso: support GRE GSO
This patch adds GSO support for GRE-tunneled packets. Supported GRE
packets must contain an outer IPv4 header, and inner TCP/IPv4 headers.
They may also contain a single VLAN tag. GRE GSO doesn't check if all
input packets have correct checksums and doesn't update checksums for
output packets. Additionally, it doesn't process IP fragmented packets.

As with VxLAN GSO, GRE GSO uses a two-segment MBUF to organize each
output packet, which requires multi-segment mbuf support in the TX
functions of the NIC driver. Also, if a packet is GSOed, GRE GSO reduces
its MBUF refcnt by 1. As a result, when all of its GSOed segments are
freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 01:36:57 +01:00
Mark Kavanagh
b058d92ea9 gso: support VxLAN GSO
This patch adds a framework that allows GSO on tunneled packets.
Furthermore, it leverages that framework to provide GSO support for
VxLAN-encapsulated packets.

Supported VxLAN packets must have an outer IPv4 header (prepended by an
optional VLAN tag), and contain an inner TCP/IPv4 packet (with an optional
inner VLAN tag).

VxLAN GSO doesn't check if input packets have correct checksums and
doesn't update checksums for output packets. Additionally, it doesn't
process IP fragmented packets.

As with TCP/IPv4 GSO, VxLAN GSO uses a two-segment MBUF to organize each
output packet, which mandates support for multi-segment mbufs in the TX
functions of the NIC driver. Also, if a packet is GSOed, VxLAN GSO
reduces its MBUF refcnt by 1. As a result, when all of its GSO'd segments
are freed, the packet is freed automatically.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 01:36:57 +01:00
Jiayu Hu
119583797b gso: support TCP/IPv4 GSO
This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
2017-10-12 01:36:57 +01:00
Jiayu Hu
ec51443cc9 gso: add Generic Segmentation Offload API framework
Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.

To enable more flexibility to applications, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. Segmenting a packet requires two steps. The first
is to set proper flags to mbuf->ol_flags, where the flags are the same
as that of TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.

rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment consists of multiple MBUFs (i.e. 2
MBUFs), the driver of the interface to which the GSO segments are sent
must support transmitting multi-segment packets.

The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.
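
A hedged usage sketch of the framework (context fields per rte_gso.h;
pool setup, port id and burst size are illustrative):

    #include <rte_common.h>
    #include <rte_ethdev.h>
    #include <rte_gso.h>

    /* Segment one large packet and transmit the resulting two-segment
     * mbufs; pkt must have the appropriate PKT_TX_* flags set in
     * ol_flags, exactly as for TSO. The ctx carries the direct and
     * indirect mempools, gso_types and gso_size. */
    static void
    gso_xmit(struct rte_mbuf *pkt, struct rte_gso_ctx *ctx, uint16_t port)
    {
        struct rte_mbuf *segs[64];
        int nb = rte_gso_segment(pkt, ctx, segs, RTE_DIM(segs));

        if (nb > 0)
            rte_eth_tx_burst(port, 0, segs, nb);
    }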

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 01:36:57 +01:00
David Hunt
db0dc9b32a power: add send channel msg function to map file
Add a new wrapper function, with an rte_power_ prefix, for an existing
private (but until now unused) function.

The plan is to clean up all the header files in the next release so
that only the intended public functions are in the map file and only
the relevant headers have the rte_ prefix so that only they are
included in the documentation.

Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 00:46:11 +01:00
David Hunt
a8cb5f6571 power: add extra msg type for policies
Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com>
Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-10-12 00:42:50 +01:00
Shreyansh Jain
63bdef1827 bus: ignore scan and probe failures
Bus scan is responsible for finding devices over *all* buses.
Some of these buses might not be able to scan, but that should
not prevent the other buses from being scanned.

The same is the case for probing. It is possible that some devices which
were scanned didn't have a matching driver. That should not prevent
other buses from being probed.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-10-12 00:29:06 +02:00
Jerin Jacob
1e36bf301b timer: use 64-bit specific code on more platforms
64-bit loads and stores are atomic operations on all
64-bit processors.
Change RTE_ARCH_X86_64 to RTE_ARCH_64 to reflect this.

Fixes: 9b15ba895b ("timer: use a skip list")
Cc: stable@dpdk.org

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-11 22:59:31 +02:00
Pavan Nikhilesh
351f463456 timer: allow reset on service cores
The rte_timer_reset function should be able to register timers on service
lcores as they are EAL threads.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-11 22:35:02 +02:00
Pavan Nikhilesh
78666372fa eal: add function to check lcore role
This function can be used to check the role of a specific lcore.
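
For example (note: in this version the function appears to return 0
when the role matches, so the comparison below is hedged against the
rte_lcore.h of this release):

    #include <rte_lcore.h>

    /* Identify lcores already claimed as service cores. */
    static int
    lcore_is_service(unsigned int lcore_id)
    {
        return rte_lcore_has_role(lcore_id, ROLE_SERVICE) == 0;
    }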

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-11 22:30:16 +02:00
Sergio Gonzalez Monroy
5b618b5b29 eal/x86: use cpuid builtin
GCC does have the __get_cpuid_count builtin which checks for maximum
supported leaf, but implementations differ between CLANG and GCC.

This change provides an implementation compatible with both GCC and
CLANG 3.4+.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-11 21:59:56 +02:00
Bruce Richardson
d65b3b1668 vhost: fix false-positive warning from clang 5
When compiling with clang extra warning flags, such as used by default with
meson, a warning is given in iotlb.c:

lib/librte_vhost/iotlb.c:318:6: warning:
	variable 'socket' is used uninitialized whenever
	'if' condition is false [-Wsometimes-uninitialized]

This is a false positive, as the socket value will be initialized by the
call to get_mempolicy in the case where the NUMA build-time flag is set,
and in cases where it is not set, "if (ret)" will always be true as ret is
initialized to -1 and never changed.

However, this is not immediately obvious, and is perhaps a little fragile,
as it will break if other code using ret is subsequently added above the
call to get_mempolicy by someone unaware of this subtle dependency.
Therefore, we can fix the warning and make the code more robust by
explicitly initializing socket to zero, and by moving the extra check
on the return value of get_mempolicy() into the #ifdef.
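
The resulting shape of the code is roughly (paraphrased sketch, not
the exact diff):

    #ifdef RTE_LIBRTE_VHOST_NUMA
    #include <numaif.h>
    #endif

    static int
    vq_socket_id(void *vq)
    {
        int socket = 0;

    #ifdef RTE_LIBRTE_VHOST_NUMA
        /* The return-value check now lives inside the #ifdef; on
         * failure, fall back to socket 0. */
        if (get_mempolicy(&socket, NULL, 0, vq,
                          MPOL_F_NODE | MPOL_F_ADDR) != 0)
            socket = 0;
    #endif
        return socket;
    }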

Fixes: d012d1f293 ("vhost: add IOTLB helper functions")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-11 13:56:34 +02:00
Nikhil Rao
9c38b704d2 eventdev: add eth Rx adapter implementation
The adapter implementation uses eventdev PMDs to configure the packet
transfer if HW support is available; if not, it uses an EAL service
function that reads packets from ethernet Rx queues and injects them
as events into the event device.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:34:09 +02:00
Nikhil Rao
06ac00686e eventdev: add event type for eth Rx adapter
Add the RTE_EVENT_TYPE_ETH_RX_ADAPTER event type. Certain platforms (e.g.,
octeontx) need to identify, in the event dequeue function, events
injected from ethernet hardware into eventdev so that the DPDK mbuf
can be populated from the HW descriptor.

Events injected from ethernet hardware would use an event type of
RTE_EVENT_TYPE_ETHDEV and events injected from the rx adapter service
function would use an event type of RTE_EVENT_TYPE_ETH_RX_ADAPTER to
help the event dequeue function differentiate between these two event
sources.

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:33:51 +02:00
Nikhil Rao
dcc806c263 eventdev: add eth Rx adapter API
Add common APIs for configuring packet transfer from ethernet Rx
queues to event devices across HW & SW packet transfer mechanisms.
A detailed description of the adapter is contained in the header's
comments.
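
A hedged configuration sketch (structure fields and values are
illustrative; see rte_event_eth_rx_adapter.h):

    #include <rte_event_eth_rx_adapter.h>

    /* Create adapter 0 on an event device and feed every Rx queue of
     * an ethdev port (rx_queue_id = -1) into event queue 0. */
    static int
    setup_rx_adapter(uint8_t evdev_id, uint8_t eth_port_id)
    {
        struct rte_event_port_conf pconf = {
            .new_event_threshold = 1024,
            .dequeue_depth = 16,
            .enqueue_depth = 16,
        };
        struct rte_event_eth_rx_adapter_queue_conf qconf = {
            .ev.queue_id = 0,
            .ev.sched_type = RTE_SCHED_TYPE_ATOMIC,
            .servicing_weight = 1,
        };
        int ret = rte_event_eth_rx_adapter_create(0, evdev_id, &pconf);

        if (ret == 0)
            ret = rte_event_eth_rx_adapter_queue_add(0, eth_port_id,
                                                     -1, &qconf);
        return ret;
    }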

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:33:36 +02:00
Nikhil Rao
67255ee987 event/sw: add eth Rx adapter capabilities function
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:33:19 +02:00
Nikhil Rao
b1ce8ebd97 eventdev: add PMD callbacks for eth Rx adapter
The PMD callbacks are used by the rte_event_eth_rx_xxx() APIs to
configure and control the ethernet Rx adapter when the packet transfer
from ethdev to eventdev is implemented in hardware.

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:33:04 +02:00
Nikhil Rao
2b5c7409ec eventdev: add capabilities API
The caps API allows an application to retrieve the capability
information needed to configure the ethernet Rx adapter for an
eventdev and ethdev pair.

For example, the ethdev/eventdev pairing may be such that all of the
ethdev Rx queues can only be connected to a single event queue; in
this case the application is required to pass -1 as the queue id
when adding a receive queue to the adapter.
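
For example (flag name hedged against rte_event_eth_rx_adapter.h):

    #include <rte_event_eth_rx_adapter.h>

    /* Without MULTI_EVENTQ support, all Rx queues of the port must be
     * added with queue id -1. */
    static int32_t
    rx_queue_id_for(uint8_t evdev_id, uint8_t eth_port_id)
    {
        uint32_t caps = 0;

        rte_event_eth_rx_adapter_caps_get(evdev_id, eth_port_id, &caps);
        return (caps & RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ)
                ? 0 : -1;
    }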

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:32:51 +02:00
Gage Eads
be1bf6077e eventdev: extend port attribute get function
This commit adds the new_event_threshold port attribute, so the entire port
configuration structure passed to rte_event_port_setup can be queried.

Signed-off-by: Gage Eads <gage.eads@intel.com>
2017-10-10 18:32:24 +02:00
Gage Eads
0a2ecfa00f eventdev: extend queue attribute get function
This commit adds three new queue attributes, so that the entire queue
configuration structure passed to rte_event_queue_setup can be queried.

Signed-off-by: Gage Eads <gage.eads@intel.com>
2017-10-10 18:32:11 +02:00
Harry van Haaren
dfb7f82a5a eventdev: bump library version
This commit bumps the library version to reflect the ABI change
caused by removing the individual rte_event_port_count, queue_count,
and other get functions. These functions are superseded by the
get-attribute style API, which allows fetching values without API/ABI
changes.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-10 18:31:44 +02:00
Harry van Haaren
44f3b4a4b5 eventdev: add device started attribute
This commit adds an attribute to the eventdev, allowing applications
to query whether the eventdev is running or stopped. Note that no API
or ABI changes were required to add this attribute, and the code changes
are minimal.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:31:30 +02:00
Harry van Haaren
783bdfef7e eventdev: add queue attribute function
This commit adds a generic queue attribute function. It also removes
the previous individual getters such as rte_event_queue_priority(), and
updates the map files and unit tests to use the new attr functions.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-10 18:31:17 +02:00
Harry van Haaren
64103dbcd6 eventdev: add dev attribute get function
This commit adds a device attribute function, allowing flexible
fetching of device attributes, like port count or queue count.
The unit tests and .map file are updated to the new function.
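
For instance, the removed rte_event_port_count() becomes (attribute id
per rte_eventdev.h):

    #include <rte_eventdev.h>

    /* Query the port count through the generic attribute interface. */
    static uint32_t
    event_port_count(uint8_t dev_id)
    {
        uint32_t nb_ports = 0;

        rte_event_dev_attr_get(dev_id, RTE_EVENT_DEV_ATTR_PORT_COUNT,
                               &nb_ports);
        return nb_ports;
    }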

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:31:04 +02:00
Harry van Haaren
78ffab9611 eventdev: add port attribute function
This commit reworks the port functions to retrieve information
about a port, such as the enqueue or dequeue depths. Note that "port count"
is a device attribute, and is added as such in a later patch.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-10 18:30:50 +02:00
Tim McDaniel
cec04e240d eventdev: clarify usage of forward and release ops
Update doxygen to make it clear that RTE_EVENT_OP_FORWARD and
RTE_EVENT_OP_RELEASE must only be enqueued to the same port that the
original event was dequeued from.

Signed-off-by: Tim McDaniel <timothy.mcdaniel@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:30:37 +02:00
Gage Eads
381acec2b1 eventdev: ease single-link queue config requirements
Events sent through single-link queues are naturally in-order and
atomic, without reordering or atomic scheduling. Logically the
nb_atomic_flows and nb_atomic_order_sequences arguments don't apply to a
single link queue, but applications must set these (depending on the queue
config type) to bypass the is_valid_{ordered, atomic}_queue_conf() checks
in the eventdev layer.

This commit updates those is_valid_* functions to ignore queues with the
SINGLE_LINK flag, to simplify their configuration.
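
A single-link queue can then be configured without dummy values
(sketch; field names per rte_eventdev.h):

    #include <rte_eventdev.h>

    /* The atomic-flow/ordered-sequence fields are ignored for
     * SINGLE_LINK queues after this change. */
    static int
    setup_single_link_queue(uint8_t dev_id, uint8_t queue_id)
    {
        struct rte_event_queue_conf qconf = {
            .event_queue_cfg = RTE_EVENT_QUEUE_CFG_SINGLE_LINK,
        };

        return rte_event_queue_setup(dev_id, queue_id, &qconf);
    }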

Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-10 18:30:24 +02:00
Maxime Coquelin
3494ed045e vhost: distinguish master and slave requests
This patch adds a union in VhostUserMsg to distinguish between
master- and slave-initiated requests, instead of casting slave
requests as master requests.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:54:31 +02:00
Dariusz Stojaczyk
efba12a78d vhost: add user callbacks for socket open/close
Added new callbacks to notify about socket connection status.
As destroy_device is used for virtqueue processing *pause* as well as
connection close, the user cannot distinguish between the two.

Consider the following scenario:
rte_vhost: received SET_VRING_BASE message,
           calling destroy_device() as usual

user:  end-user asks to remove the device (together with socket file),
       OK, device is not *in use* - that's NOT the behavior we want
       calling rte_vhost_driver_unregister() etc.

Instead of changing new_device/destroy_device callbacks and breaking
the ABI, a set of new functions new_connection/destroy_connection
has been added.
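
Registration then looks roughly like this (callback bodies are
illustrative):

    #include <rte_vhost.h>

    static int
    on_new_connection(int vid)
    {
        /* socket connected; device not yet ready for processing */
        (void)vid;
        return 0;
    }

    static void
    on_destroy_connection(int vid)
    {
        /* socket closed; safe to drop per-connection state */
        (void)vid;
    }

    static const struct vhost_device_ops ops = {
        .new_connection = on_new_connection,
        .destroy_connection = on_destroy_connection,
        /* .new_device / .destroy_device as before */
    };
    /* rte_vhost_driver_callback_register(path, &ops); */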

Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2017-10-10 15:54:31 +02:00
Kuba Kozak
66a6210124 vhost: check poll error code
Add return value check for poll() call.

Coverity issue: 140740
Fixes: 59317cef24 ("vhost: allow many vhost-user ports")
Cc: stable@dpdk.org

Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:54:31 +02:00
Maxime Coquelin
69c90e98f4 vhost: enable IOMMU support
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:53:27 +02:00
Maxime Coquelin
36031f80cc vhost: invalidate vring in case of matching IOTLB invalidate
As soon as a page used by a ring is invalidated, the access_ok flag
is cleared, so that processing threads try to map them again.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Maxime Coquelin
eefac9536a vhost: postpone device creation until rings are mapped
Translating the start addresses of the rings is not enough; we need to
be sure the whole ring is made available by the guest.

That depends on the size of the rings, which is not known on SET_VRING_ADDR
reception. Furthermore, we need to be safe against vring page
invalidations.

This patch introduces a new access_ok flag per virtqueue, which is set
when all the rings are mapped, and cleared as soon as a page used by a
ring is invalidated. The invalidation part is implemented in a following
patch.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Maxime Coquelin
09927b5249 vhost: translate ring addresses when IOMMU enabled
When IOMMU is enabled, the ring addresses set by the
VHOST_USER_SET_VRING_ADDR requests are the guest's I/O virtual addresses,
whereas they are QEMU virtual addresses when IOMMU is disabled.

When enabled and the required translation is not in the IOTLB cache,
an IOTLB miss request is sent, but being called by the vhost-user
socket handling thread, the function does not wait for the requested
IOTLB update.

The function will be called again on the next IOTLB update message
reception if it matches the vring addresses.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00