2633 Commits

Author SHA1 Message Date
Huilong Xu
2379ae8128 pci: increase log level to show blacklisted devices
Maybe we should change log level, when add port in blacklist,
for check it easy.
It does not influence performance and function.

Signed-off-by: Huilong Xu <huilongx.xu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2016-06-30 14:01:37 +02:00
Lazaros Koromilas
4b5062755a mempool: allow user-owned cache
The mempool cache is only available to EAL threads as a per-lcore
resource. Change this so that the user can create and provide their own
cache on mempool get and put operations. This works with non-EAL threads
too. This commit introduces the new API calls:

    rte_mempool_cache_create(size, socket_id)
    rte_mempool_cache_free(cache)
    rte_mempool_cache_flush(cache, mp)
    rte_mempool_default_cache(mp, lcore_id)

Changes the API calls:

    rte_mempool_generic_put(mp, obj_table, n, cache, flags)
    rte_mempool_generic_get(mp, obj_table, n, cache, flags)

The cache-oblivious API calls use the per-lcore default local cache.

Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-30 11:28:10 +02:00
Lazaros Koromilas
d6f78df6fe mempool: use bit flags for multi consumers and producers
Pass the same flags as in rte_mempool_create(). Changes API calls:

    rte_mempool_generic_put(mp, obj_table, n, flags)
    rte_mempool_generic_get(mp, obj_table, n, flags)

Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-30 11:28:10 +02:00
Lazaros Koromilas
656f2d3ede mempool: deprecate specific get and put functions
This commit introduces the API calls:

    rte_mempool_generic_put(mp, obj_table, n, is_mp)
    rte_mempool_generic_get(mp, obj_table, n, is_mc)

Deprecates the API calls:

    rte_mempool_mp_put_bulk(mp, obj_table, n)
    rte_mempool_sp_put_bulk(mp, obj_table, n)
    rte_mempool_mp_put(mp, obj)
    rte_mempool_sp_put(mp, obj)
    rte_mempool_mc_get_bulk(mp, obj_table, n)
    rte_mempool_sc_get_bulk(mp, obj_table, n)
    rte_mempool_mc_get(mp, obj_p)
    rte_mempool_sc_get(mp, obj_p)

We also check cookies in one place now.

Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-30 11:28:10 +02:00
Beilei Xing
f2a5f0b8ee net/ixgbe/base: update device IDs
There are two device IDs changed from 15C6/15C7 to 15E4/15E5 because of
PHY info changes. 15C6/15C7 IDs are now used for the backplane
SGMII versions.
Also, clean up some discovery kludges from the previous shared ID,
and also add 15C6/15C7 to ixgbe_set_mdio_speed just for paranoia
to control MDIO speed even though nothing should be attached.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
2016-06-27 16:17:53 +02:00
Beilei Xing
2b84092427 net/ixgbe: fix single VLAN tag to be outer VLAN tag
Previously, a single VLAN header is treated as inner VLAN,
but generally, a single VLAN header is treated as the outer
VLAN header.
The patch fixes the ether type of a single VLAN type, and
enables configuring inner and outer TPID for double VLAN.

Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type")

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
2016-06-27 16:17:45 +02:00
Beilei Xing
5b2d37858d net/i40e: fix single VLAN tag to be outer VLAN tag
In current i40e codebase, if single VLAN header is added in a packet,
it's treated as inner VLAN. Generally, a single VLAN header is
treated as the outer VLAN header, so update the driver behaviour
appropriately.

Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type")

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
2016-06-24 18:28:09 +02:00
Ajit Khaparde
40d941a517 net/bnxt: support Cumulus+ Ethernet adapter
This patch adds support for Cumulus+ Ethernet adapters.
These Cumulus+ Ethernet adapters support 10Gb/25Gb/40Gb/50Gb speeds.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2016-06-24 18:28:09 +02:00
Ajit Khaparde
3522681460 net/bnxt: add driver for Broadcom NetXtreme-C devices
This patch adds the initial skeleton for bnxt driver along with the
nic guide, and ties the driver into the build system.
At this point, the driver simply fails init.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Reviewed-by: David Christensen <david.christensen@broadcom.com>
[Release Note Addition]
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2016-06-20 17:21:51 +02:00
Helin Zhang
fc3a79d74a net/i40e/base: support new devices
Add new device IDs and PHY types.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2016-06-20 17:21:50 +02:00
Olivier Matz
6046898f50 net/mbuf: remove unused Rx error flags
Following the discussions from:
http://dpdk.org/ml/archives/dev/2015-July/021721.html
http://dpdk.org/ml/archives/dev/2016-April/038143.html

The value of these flags is 0, making them useless. Today, no example
application checks them on Rx, and only few drivers sets them and
silently give wrong packets to the application, which should not happen.

This patch removes the unused flags from rte_mbuf and their use in the
drivers. The i40e and fm10k are kept as they are today and should be
fixed to drop bad packets. The enic driver is managed by its maintainer
in another patch.

Fixes: c22265f6 ("mbuf: add new packet flags for i40e")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-06-20 17:21:49 +02:00
Beilei Xing
43e5488c0a net/i40e: support MTU configuration
This patch enables configuring MTU for i40e.
Since changing MTU needs to reconfigure queue, the port must be
stopped before configuring MTU.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2016-06-15 17:13:55 +02:00
Rahul Lakkireddy
d9d4b291ba pci: fix config space access on FreeBSD
PCIOCREAD and PCIOCWRITE ioctls to read/write PCI config space fail
with EPERM due to missing write permission.  Fix by opening /dev/pci/
with O_RDWR instead.

Fixes: 632b2d1deeed ("eal: provide functions to access PCI config")

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2016-06-15 17:13:55 +02:00
Panu Matilainen
8b1a26d226 pdump: fix missing dependency on libpthread
Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
2016-06-29 13:33:01 +02:00
Thomas Monjalon
f8e9cbe2aa mk: fix internal dependencies
Some libraries were missing their dependency on eal, mbuf, mempool,
ring and kvargs.
It is revealed by the linker option "-z defs".

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-06-29 13:33:01 +02:00
Panu Matilainen
fef3066c40 pipeline: fix truncated dependency list
In other libraries, dependency list is always appended to, but
in commit 6cbf4f75e059 it with an assignment. This causes the
librte_eal dependency added in commit 6cbf4f75e059 to get discarded,
resulting in missing dependency on librte_eal.

Fixes: 6cbf4f75e059 ("mk: fix missing internal dependencies")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-06-29 13:33:01 +02:00
Thomas Monjalon
f3e764fa2f cryptodev: uninline parameter parsing
There is no need to have this parsing inlined in the header.
It brings kvargs dependency to every crypto drivers.
The functions are moved into rte_cryptodev.c.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-06-27 16:50:27 +02:00
Thomas Monjalon
2fa5e196d5 mempool: fix symbol export
Every new symbols in release 16.07 are exported with the version
string DPDK_16.07.
Also remove the empty local: section which is not needed because
inherited from the DPDK_2.0 block.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-27 16:50:27 +02:00
Reshma Pattan
3bb262e263 pdump: fix string overflow
replaced strncpy with snprintf for safely
copying the strings.

Coverity issue: 127350

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-06-27 16:50:17 +02:00
Reshma Pattan
f3c1829130 pdump: check missing home environment variable
inside pdump_get_socket_path(), getenv can return
a NULL pointer if the match for SOCKET_PATH_HOME is
not found in the environment. NULL check is added to
return -1 immediately. Since pdump_get_socket_path()
returns -1 now, wherever this function is called
there the return value is checked and error message
is logged.

Coverity issue: 127344, 127347

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-06-27 16:50:13 +02:00
Reshma Pattan
bdd8dcc6e2 pdump: fix default socket path
SOCKET_PATH_HOME is to specify environment variable "HOME",
so it should not contain "/pdump_sockets"  in the macro.
So removed "/pdump_sockets" from SOCKET_PATH_HOME and
SOCKET_PATH_VAR_RUN. New changes will create pdump sockets under
/var/run/.dpdk/pdump_sockets for root users and
under HOME/.dpdk/pdump_sockets for non root users.
Changes are done in pdump_get_socket_path() to accommodate
new socket path changes.

Fixes: 278f945402c5 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-06-27 16:50:09 +02:00
Panu Matilainen
05dc7b05da port: fix build without KNI
Commit 9fc37d1c071c is missing a conditional in the dependencies,
causing builds to fail when KNI is not enabled:
    == Build lib/librte_port
      LD librte_port.so.3
    /usr/bin/ld: cannot find -lrte_kni
    collect2: error: ld returned 1 exit status

Fixes: 9fc37d1c071c ("port: support KNI")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-06-27 12:28:10 +02:00
Pablo de Lara
949e26c4cf kni: fix build with gcc 6.1
Using gcc 6.1, in some cases, kni fails to compile
because of unused variables:

lib/librte_eal/linuxapp/kni/ixgbe_main.c:82:19:
error: ‘ixgbe_copyright’
defined but not used [-Werror=unused-const-variable=]

lib/librte_eal/linuxapp/kni/ixgbe_main.c:62:19:
error: ‘ixgbe_driver_string’
defined but not used [-Werror=unused-const-variable=]

Fixes: 3fc5ca2f6352 ("kni: initial import")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-06-27 12:28:10 +02:00
Wei Shen
be856325cb hash: add scalable multi-writer insertion with Intel TSX
This patch introduced scalable multi-writer Cuckoo Hash insertion
based on a split Cuckoo Search and Move operation using Intel
TSX. It can do scalable hash insertion with 22 cores with little
performance loss and negligible TSX abortion rate.

* Added an extra rte_hash flag definition to switch default single writer
  Cuckoo Hash behavior to multiwriter.
    - If HTM is available, it would use hardware feature for concurrency.
    - If HTM is not available, it would fall back to spinlock.

* Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related
  cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
  select x86 file or other platform-specific implementations. While HTM check
  is still done at runtime (same idea with
  RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)

* Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
  rte_cuckoo_hash_x86.h or future platform dependent functions to include.

* Following new functions are created for consistent names when new platform
  TM support are added.
    - rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement.
    - rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement.

* One extra multi-writer test case is added.

Signed-off-by: Wei Shen <wei1.shen@intel.com>
Signed-off-by: Sameh Gobriel <sameh.gobriel@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-06-24 16:25:07 +02:00
Simon Kagstrom
5590a60241 mbuf: fix dump format
Do not add 0x when using %p in format strings to avoid dump messages
with double 0x0x, e.g.,

  dump mbuf at 0x0x7fac7b17c800, phys=17b17c880, buf_len=2176
    pkt_len=2064, ol_flags=0, nb_segs=1, in_port=255
    segment at 0x0x7fac7b17c800, data=0x0x7fac7b17c8f0, data_len=2064

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
2016-06-24 11:01:05 +02:00
David Hunt
152ca51790 mbuf: use default mempool handler from config
By default, the mempool ops used for mbuf allocations is a multi
producer and multi consumer ring. We could imagine a target (maybe some
network processors?) that provides an hardware-assisted pool
mechanism. In this case, the default configuration for this architecture
would contain a different value for RTE_MBUF_DEFAULT_MEMPOOL_OPS.

Signed-off-by: David Hunt <david.hunt@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-24 11:01:05 +02:00
David Hunt
449c49b93a mempool: support handler operations
Until now, the objects stored in a mempool were internally stored in a
ring. This patch introduces the possibility to register external handlers
replacing the ring.

The default behavior remains unchanged, but calling the new function
rte_mempool_set_ops_byname() right after rte_mempool_create_empty() allows
the user to change the handler that will be used when populating
the mempool.

This patch also adds a set of default ops (function callbacks) based
on rte_ring.

Signed-off-by: David Hunt <david.hunt@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-06-24 11:01:05 +02:00
Jingjing Wu
87ce17abbe mbuf: add NSH packet type
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
2016-06-23 21:39:42 +02:00
Hiroyuki Mikita
358f9c7b5b ethdev: fix doxygen formatting
This commit fixes some functions missing in API documentation.

Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
2016-06-22 23:56:18 +02:00
Jerin Jacob
0cdf516a71 ethdev: align device structure with cache line
Elements of struct rte_eth_dev used in the fast path.
Make struct rte_eth_dev cache aligned to avoid the cases where
rte_eth_dev elements share the same cache line with other structures.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-06-22 23:26:34 +02:00
Jerin Jacob
fbb280b6ae ethdev: add RSS RETA size constant 256
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-06-22 17:32:58 +02:00
Jerin Jacob
f56620dddb ethdev: add tunnel and port RSS offload types
- added VXLAN, GENEVE and NVGRE tunnel flow types
- added PORT flow type for accounting physical/virtual
port or channel number in flow creation

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-06-22 17:32:28 +02:00
Huawei Xie
2cdc118eef vhost: check hugepage fstat error
Value returned from fstat is not checked for errors before being used.
This patch fixes following coverity issue.

    static uint64_t
    get_blk_size(int fd)
    {
    	struct stat stat;

    	fstat(fd, &stat);
    	return (uint64_t)stat.st_blksize;
    >>>  CID 107103 (#1 of 1): Unchecked return value from library
         (CHECKED_RETURN)
    >>>  check_return: Calling fstat(fd, &stat) without checking
         return value.
    >>>  This library function may fail and return an error code.

Fixes: 8f972312b8f4 ("vhost: support vhost-user")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Ilya Maximets
428261b461 vhost: unmap log memory on cleanup
Fixes memory leak on QEMU migration.

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Ilya Maximets
53af5b1e0a vhost: fix leak of file descriptors
While migration of vhost-user device QEMU allocates memfd
to store information about dirty pages and sends fd to
vhost-user process.

File descriptor for this memory should be closed to prevent
"Too many open files" error for vhost-user process after
some amount of migrations.

Ex.:
 # ls /proc/<ovs-vswitchd pid>/fd/ -alh
 total 0
 root qemu  .
 root qemu  ..
 root qemu  0 -> /dev/pts/0
 root qemu  1 -> pipe:[1804353]
 root qemu  10 -> socket:[1782240]
 root qemu  100 -> /memfd:vhost-log (deleted)
 root qemu  1000 -> /memfd:vhost-log (deleted)
 root qemu  1001 -> /memfd:vhost-log (deleted)
 root qemu  1004 -> /memfd:vhost-log (deleted)
 [...]
 root qemu  996 -> /memfd:vhost-log (deleted)
 root qemu  997 -> /memfd:vhost-log (deleted)

 ovs-vswitchd.log:
 |WARN|punix:ovs-vswitchd.ctl: accept failed: Too many open files

Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Marcin Kerlin
b497724652 vhost: fix null pointer dereference
Return value of function get_device() is not checking before
dereference. Fix this problem by adding checking condition.

Coverity issue: 119262

Fixes: 77d20126b4c2 ("vhost-user: handle message to enable vring")

Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Huawei Xie
39449e7429 vhost: remove concurrent enqueue
All other DPDK PMDs doesn't support concurrent receiving or sending
packets to the same queue. The upper application should deal with
this, normally through queue and core bindings.

Due to historical reason, vhost internally supports concurrent lockless
enqueuing packets to the same virtio queue through costly cmpset operation.
This patch removes this internal lockless implementation and should improve
performance a bit.

Luckily DPDK OVS doesn't rely on this behavior.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
a66bcad322 vhost: arrange struct fields for better cache sharing
The ifname[] field takes so much space, that it seperates some frequently
used fields into different caches, say, features and broadcast_rarp.

This patch moves all those fields that will be accessed frequently in Rx/Tx
together (before the ifname[] field) to let them share one cache line.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
1d41d77cf8 vhost: optimize dequeue for small packets
A virtio driver normally uses at least 2 desc buffers for Tx: the
first for storing the header, and the others for storing the data.

Therefore, we could fetch the first data desc buf before the main
loop, and do the copy first before the check of "are we done yet?".
This could save one check for small packets that just have one data
desc buffer and need one mbuf to store it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
7f74b95c44 vhost: pre update used ring for Tx and Rx
Pre update and update used ring in batch for Tx and Rx at the stage
while fetching all avail desc idx. This would reduce some cache misses
and hence, increase the performance a bit.

Pre update would be feasible as guest driver will not start processing
those entries as far as we don't update "used->idx". (I'm not 100%
certain I don't miss anything, though).

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
0823c1cb0a vhost: workaround stale vring base
When DPDK app crashes (or quits, or gets killed), a restart of DPDK
app would get stale vring base from QEMU. That would break the kernel
virtio net completely, making it non-work any more, unless a driver
reset is done.

So, instead of getting the stale vring base from QEMU, Huawei suggested
we could get a much saner (and may not the most accurate) vring base
from used->idx. That would work because:

- there is a memory barrier between updating used ring entries and
  used->idx. So, even though we crashed at updating the used ring
  entries, it will not cause any issue, as the guest driver will not
  process those stale used entries, for used-idx is not updated yet.

- DPDK process vring in order, that means a crash may just lead some
  packet retransmission for Tx and drop for Rx.

Suggested-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
e623e0c6d8 vhost: add reconnect ability
Allow reconnecting on failure by default when:

- DPDK app starts first and QEMU (as the server) is not started yet.
  Without reconnecting, DPDK app would simply fail on vhost-user
  registration.

- QEMU restarts, say due to OS reboot.
  Without reconnecting, you can't re-establish the connection without
  restarting DPDK app.

This patch make it work well for both above cases. It simply creates
a new thread, and keep trying calling "connect()", until it succeeds.

The reconnect could be disabled when RTE_VHOST_USER_NO_RECONNECT flag
is set.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:12 +02:00
Yuanhan Liu
64ab701c3d vhost: add vhost-user client mode
Add a new paramter (flags) to rte_vhost_driver_register(). DPDK
vhost-user acts as client mode when RTE_VHOST_USER_CLIENT flag
is set.  The flags would also allow future extensions without
breaking the API (again).

The rest is straingfoward then: allocate a unix socket, and
bind/listen for server, connect for client.

This extension is for vhost-user only, therefore we simply quit
and report error when any flags are given for vhost-cuse.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:47:07 +02:00
Yuanhan Liu
9ebcd4f9c7 vhost: rename structs for enabling client mode
DPDK vhost-user just acts as server so far, so, using a struct named
as "vhost_server" is okay. However, if we add client mode, it doesn't
make sense any more. Here renames it to "vhost_user_socket".

There was no obvious wrong about "connfd_ctx", but I think it's obviously
better to rename it to "vhost_user_connection", as it does represent
a connection, a connection between the backend (DPDK) and the frontend
(QEMU).

Similarly, few more renames are taken, such as "vserver_new_vq_conn"
to "vhost_user_new_connection".

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-06-22 09:44:21 +02:00
Ilya Maximets
7fd5dde987 vhost: make buffer vector for scatter Rx local
Array of buf_vector's is just an array for temporary storing information
about available descriptors. It used only locally in virtio_dev_merge_rx()
and there is no reason for that array to be shared.

Fix that by allocating local buf_vec inside virtio_dev_merge_rx().

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:44:21 +02:00
Yuanhan Liu
e197bd630a vhost: make virtio header length per device
Virtio net header length is set per device, but not per queue. So, there
is no reason to store it in vhost_virtqueue struct, instead, we should
store it in virtio_net struct, to make one copy only.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:44:20 +02:00
Yuanhan Liu
004b8ca8b5 vhost: reserve few more space for future extension
"virtio_net_device_ops" is the only left open struct that an application
can access, therefore, it's the only place that might introduce potential
ABI break in future for extension.

So, do some reservation for it. 5 should be pretty enough, considering
that we have barely touched it for a long while. Another reason to
choose 5 is for cache alignment: 5 makes the struct 64 bytes for 64 bit
machine.

With this, it's confidence to say that we might be able to be free from
the ABI violation forever.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:43:01 +02:00
Yuanhan Liu
f0fa04e35e vhost: remove virtio-net.h
It barely has anything useful there, just 2 functions prototype. Here
move them to vhost-net.h, and delete it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:43:01 +02:00
Yuanhan Liu
758e3471b4 vhost: remove unnecessary fields
The "reserved" field in virtio_net and vhost_virtqueue struct is not
necessary any more. We now expose virtio_net device with a number "vid".

This patch also removes the "priv" field: all fields are priviate now:
application can't access it now. The only way that we could still access
it is to expose it by a function, but I doubt that's needed or worthwhile.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:43:01 +02:00
Yuanhan Liu
db69be54b6 vhost: hide internal code
We are now safe to move all those internal structs/macros/functions to
vhost-net.h, to hide them from external access.

This patch also breaks long lines and removes some redundant comments.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
2016-06-22 09:43:01 +02:00