numam-dpdk

Author	SHA1	Message	Date
Panu Matilainen	05dc7b05da	port: fix build without KNI Commit `9fc37d1c07` is missing a conditional in the dependencies, causing builds to fail when KNI is not enabled: == Build lib/librte_port LD librte_port.so.3 /usr/bin/ld: cannot find -lrte_kni collect2: error: ld returned 1 exit status Fixes: `9fc37d1c07` ("port: support KNI") Signed-off-by: Panu Matilainen <pmatilai@redhat.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2016-06-27 12:28:10 +02:00
Pablo de Lara	949e26c4cf	kni: fix build with gcc 6.1 Using gcc 6.1, in some cases, kni fails to compile because of unused variables: lib/librte_eal/linuxapp/kni/ixgbe_main.c:82:19: error: ‘ixgbe_copyright’ defined but not used [-Werror=unused-const-variable=] lib/librte_eal/linuxapp/kni/ixgbe_main.c:62:19: error: ‘ixgbe_driver_string’ defined but not used [-Werror=unused-const-variable=] Fixes: `3fc5ca2f63` ("kni: initial import") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-06-27 12:28:10 +02:00
Thomas Monjalon	c175e542c0	mk: fix parallel build of test resources The build was failing sometimes when building with multiple parallel jobs: # rm build/build/app/test/res # make -j6 objcopy: 'resource.tmp': No such file The reason is that each resource was built from the same temporary file. The failure is seen because of a race condition when removing the temporary file after each resource creation. It also means that some resources may be created from the wrong source. The fix is to have a different input file for each resource. The source file is not directly used because it may have a long path which is used by objcopy to name the symbols after some transformations. When linking a tar resource, the input file is already in the current directory. The hard case is for simply linked resources. The trick is to create a symbolic link of the source file if it is not already in the current build directory. Then there is a replacement of dot by an underscore to predict the symbol names computed by objcopy which must be redefined. There is an additional change for the test_resource_c which is both a real source file and a test resource. An intermediate file test_resource.res is created to avoid compiling resource.c from the wrong directory through a symbolic link. Fixes: `1e9e0a6270` ("app/test: fix resource creation with objcopy on FreeBSD") Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2016-06-24 16:57:27 +02:00
Wei Shen	be856325cb	hash: add scalable multi-writer insertion with Intel TSX This patch introduced scalable multi-writer Cuckoo Hash insertion based on a split Cuckoo Search and Move operation using Intel TSX. It can do scalable hash insertion with 22 cores with little performance loss and negligible TSX abortion rate. * Added an extra rte_hash flag definition to switch default single writer Cuckoo Hash behavior to multiwriter. - If HTM is available, it would use hardware feature for concurrency. - If HTM is not available, it would fall back to spinlock. * Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to select x86 file or other platform-specific implementations. While HTM check is still done at runtime (same idea with RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT) * Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow rte_cuckoo_hash_x86.h or future platform dependent functions to include. * Following new functions are created for consistent names when new platform TM support are added. - rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement. - rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement. * One extra multi-writer test case is added. Signed-off-by: Wei Shen <wei1.shen@intel.com> Signed-off-by: Sameh Gobriel <sameh.gobriel@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-06-24 16:25:07 +02:00
Simon Kagstrom	5590a60241	mbuf: fix dump format Do not add 0x when using %p in format strings to avoid dump messages with double 0x0x, e.g., dump mbuf at 0x0x7fac7b17c800, phys=17b17c880, buf_len=2176 pkt_len=2064, ol_flags=0, nb_segs=1, in_port=255 segment at 0x0x7fac7b17c800, data=0x0x7fac7b17c8f0, data_len=2064 Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>	2016-06-24 11:01:05 +02:00
David Hunt	152ca51790	mbuf: use default mempool handler from config By default, the mempool ops used for mbuf allocations is a multi producer and multi consumer ring. We could imagine a target (maybe some network processors?) that provides a hardware-assisted pool mechanism. In this case, the default configuration for this architecture would contain a different value for RTE_MBUF_DEFAULT_MEMPOOL_OPS. Signed-off-by: David Hunt <david.hunt@intel.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Jan Viktorin <viktorin@rehivetech.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2016-06-24 11:01:05 +02:00
David Hunt	99ca3b7a82	app/test: add mempool handler Create a minimal custom mempool handler and check that it passes basic mempool autotests. Signed-off-by: David Hunt <david.hunt@intel.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Jan Viktorin <viktorin@rehivetech.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2016-06-24 11:01:05 +02:00
David Hunt	449c49b93a	mempool: support handler operations Until now, the objects stored in a mempool were internally stored in a ring. This patch introduces the possibility to register external handlers replacing the ring. The default behavior remains unchanged, but calling the new function rte_mempool_set_ops_byname() right after rte_mempool_create_empty() allows the user to change the handler that will be used when populating the mempool. This patch also adds a set of default ops (function callbacks) based on rte_ring. Signed-off-by: David Hunt <david.hunt@intel.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2016-06-24 11:01:05 +02:00
Thomas Monjalon	479e160b2e	net/virtio-user: fix 32-bit build The compilation for 32-bit fails when CONFIG_RTE_VIRTIO_USER is enabled: drivers/net/virtio/virtio_user_ethdev.c:84:47: error: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘size_t {aka unsigned int}’ Fixes: `e9efa4d938` ("net/virtio-user: add new virtual PCI driver") Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>	2016-06-23 22:54:41 +02:00
Jingjing Wu	1bff80cf57	net/i40e: support NSH packet type NSH packet can be recognized by Intel X710/XL710 series. This patch enables the new packet type. Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Tested-by: Yulong Pei <yulong.pei@intel.com> Acked-by: Zhe Tao <zhe.tao@intel.com>	2016-06-23 22:39:01 +02:00
Jingjing Wu	87ce17abbe	mbuf: add NSH packet type Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Zhe Tao <zhe.tao@intel.com>	2016-06-23 21:39:42 +02:00
Hiroyuki Mikita	358f9c7b5b	ethdev: fix doxygen formatting This commit fixes some functions missing in API documentation. Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>	2016-06-22 23:56:18 +02:00
Jerin Jacob	0cdf516a71	ethdev: align device structure with cache line Elements of struct rte_eth_dev are used in the fast path. Make struct rte_eth_dev cache aligned to avoid the cases where rte_eth_dev elements share the same cache line with other structures. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2016-06-22 23:26:34 +02:00
Jerin Jacob	fbb280b6ae	ethdev: add RSS RETA size constant 256 Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2016-06-22 17:32:58 +02:00
Jerin Jacob	f56620dddb	ethdev: add tunnel and port RSS offload types - added VXLAN, GENEVE and NVGRE tunnel flow types - added PORT flow type for accounting physical/virtual port or channel number in flow creation Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2016-06-22 17:32:28 +02:00
Huawei Xie	b81026f1e7	net/virtio: fix used index retrieved only once In the following loop: while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) { ... } There is no external function call or any explicit memory barrier in the loop, so the re-read of used->idx might be optimized away and the value retrieved only once. Use of volatile normally should be prohibited, and access_once is the Linux kernel's style to handle this issue; once we have that macro in DPDK, we could change to that style. virtio_recv_mergable_pkts might also have the same issue, so fix it as well. Fixes: `823ad64795` ("virtio: support multiple queues") Fixes: `13ce5e7eb9` ("virtio: mergeable buffers") Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	7e1eb993f2	net/virtio: fix crash on querying xstats Trying to access xstats_names after "if (xstats_names == NULL)" is obviously wrong, which would result in a crash while running "show port xstats 0" in testpmd with virtio PMD. The fix is straightforward; just reverse the check. Fixes: `baf91c395b` ("net/virtio: fetch extended statistics with integer ids") Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Huawei Xie	2cdc118eef	vhost: check hugepage fstat error Value returned from fstat is not checked for errors before being used. This patch fixes following coverity issue. static uint64_t get_blk_size(int fd) { struct stat stat; fstat(fd, &stat); return (uint64_t)stat.st_blksize; >>> CID 107103 (#1 of 1): Unchecked return value from library (CHECKED_RETURN) >>> check_return: Calling fstat(fd, &stat) without checking return value. >>> This library function may fail and return an error code. Fixes: `8f972312b8` ("vhost: support vhost-user") Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Ilya Maximets	428261b461	vhost: unmap log memory on cleanup Fixes memory leak on QEMU migration. Fixes: `54f9e32305` ("vhost: handle dirty pages logging request") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Ilya Maximets	53af5b1e0a	vhost: fix leak of file descriptors During migration of a vhost-user device, QEMU allocates a memfd to store information about dirty pages and sends the fd to the vhost-user process. The file descriptor for this memory should be closed to prevent a "Too many open files" error in the vhost-user process after a number of migrations. Ex.: # ls /proc/<ovs-vswitchd pid>/fd/ -alh total 0 root qemu . root qemu .. root qemu 0 -> /dev/pts/0 root qemu 1 -> pipe:[1804353] root qemu 10 -> socket:[1782240] root qemu 100 -> /memfd:vhost-log (deleted) root qemu 1000 -> /memfd:vhost-log (deleted) root qemu 1001 -> /memfd:vhost-log (deleted) root qemu 1004 -> /memfd:vhost-log (deleted) [...] root qemu 996 -> /memfd:vhost-log (deleted) root qemu 997 -> /memfd:vhost-log (deleted) ovs-vswitchd.log: \|WARN\|punix:ovs-vswitchd.ctl: accept failed: Too many open files Fixes: `54f9e32305` ("vhost: handle dirty pages logging request") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	1b69528e5f	net/virtio-user: handle control queue in driver In virtio-user driver, when notify ctrl-queue, invoke API of virtio-user device emulation to handle ctrl-q command. Besides, multi-queue requires ctrl-queue and ctrl-queue will be enabled automatically when multi-queue is specified. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	f9b9d1a557	net/virtio-user: add multiple queues in device emulation The main purpose of this patch is to enable multi-queue. But multi-queue requires ctrl-queue so that driver can send how many queues will be enabled through ctrl-queue messages. So we partially implement ctrl-queue to handle control command with class of VIRTIO_NET_CTRL_MQ and with cmd of VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET to handle mq support. This patch provides a function, virtio_user_handle_cq(), for driver to handle ctrl-queue messages. Besides, multi-queue requires VIRTIO_NET_F_MQ and VIRTIO_NET_F_CTRL_VQ are enabled when we do feature negotiation. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	0b6df936c8	net/virtio-user: add multiple queues in vhost-user adapter This patch mainly adds method in vhost user adapter to communicate enable/disable queues messages with vhost user backend, aka, VHOST_USER_SET_VRING_ENABLE. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	ce2eabdd43	net/virtio-user: add virtual device Add a new virtual device named virtio-user, which can be used just like eth_ring, eth_null, etc. To reuse the code of original virtio, we do some adjustment in virtio_ethdev.c, such as remove key _static_ of eth_virtio_dev_init() so that it can be reused in virtual device; and we add some check to make sure it will not crash. Configured parameters include: - queues (optional, 1 by default), number of queue pairs, multi-queue not supported for now. - cq (optional, 0 by default), not supported for now. - mac (optional), random value will be given if not specified. - queue_size (optional, 256 by default), size of virtqueues. - path (mandatory), path of vhost user. When enable CONFIG_RTE_VIRTIO_USER (enabled by default), the compiled library can be used in both VM and container environment. Examples: path_vhost=<path_to_vhost_user> # use vhost-user as a backend sudo ./examples/l2fwd/build/l2fwd -c 0x100000 -n 4 \ --socket-mem 0,1024 --no-pci --file-prefix=l2fwd \ --vdev=virtio-user0,mac=00:01:02:03:04:05,path=$path_vhost -- -p 0x1 Known issues: - Control queue and multi-queue are not supported yet. - Cannot work with --huge-unlink. - Cannot work with no-huge. - Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8) hugepages. - Root privilege is a must (mainly because of sorting hugepages according to physical address). - Applications should not use file name like HUGEFILE_FMT ("%smap_%d"). - Cannot work with vhost-net backend. Signed-off-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	e9efa4d938	net/virtio-user: add new virtual PCI driver This patch implements another new instance of struct virtio_pci_ops to drive the virtio-user virtual device. Instead of rd/wr ioport or PCI configuration space, this virtual pci driver will rd/wr the virtual device struct virtio_user_hw, and when necessary, invokes APIs provided by device emulation later to start/stop the device. ---------------------- \| ------------------ \| \| \| virtio driver \| \|----> (virtio_user_ethdev.c) \| ------------------ \| \| \| \| \| ------------------ \| ------> virtio-user PMD \| \| device emulate \| \| \| \| \| \| \| \| vhost adapter \| \| \| ------------------ \| ---------------------- \| \| \| ------------------ \| vhost backend \| ------------------ Signed-off-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	37a7eb2ae8	net/virtio-user: add device emulation layer Few device emulation layer functions are added for virtio driver to call: - virtio_user_start_device() - virtio_user_stop_device() - virtio_user_dev_init() - virtio_user_dev_uninit() These functions will get called by virtio driver, and they call vhost adapter layer functions to implement the functionality. All stats related to virtual user device as logged in virtio_user_dev structure. ---------------------- \| ------------------ \| \| \| virtio driver \| \| \| ------------------ \| \| \| \| \| ------------------ \| ------> virtio-user PMD \| \| device emulate \|-\|----> (virtio_user_dev.c, virtio_user_dev.h) \| \| \| \| \| \| vhost adapter \| \| \| ------------------ \| ---------------------- \| \| \| ------------------ \| vhost backend \| ------------------ Signed-off-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	6a84c37e39	net/virtio-user: add vhost-user adapter layer This patch provides vhost adapter layer implementation. Two main help functions are provided to upper layer (device emulation): - vhost_user_setup(), to set up vhost user backend; - vhost_user_sock(), to talk with vhost user backend. ---------------------- \| ------------------ \| \| \| virtio driver \| \| \| ------------------ \| \| \| \| \| ------------------ \| ------> virtio-user PMD \| \| device emulate \| \| \| \| \| \| \| \| vhost adapter \|-\|----> (vhost_user.c) \| ------------------ \| ---------------------- \| \| -------------- --> (vhost-user protocol) \| ------------------ \| vhost backend \| ------------------ Signed-off-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	f24f8f9fee	net/virtio: allow virtual address to fill vring descriptors This patch is related to how to calculate relative address for vhost backend. The principle is that: based on one or multiple shared memory regions, vhost maintains a reference system with the frontend start address, backend start address, and length for each segment, so that each frontend address (GPA, Guest Physical Address) can be translated into vhost-recognizable backend address. To make the address translation efficient, we need to maintain as few regions as possible. In the case of VM, GPA is always locally continuous. But for some other case, like virtio-user, GPA continuous is not guaranteed, therefore, we use virtual address here. It basically means: a. when set_base_addr, VA address is used; b. when preparing RX's descriptors, VA address is used; c. when transmitting packets, VA is filled in TX's descriptors; d. in TX and CQ's header, VA is used. Signed-off-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Jianfeng Tan	595454c5ac	net/virtio: hide vring address check inside PCI ops This patch moves phys addr check from virtio_dev_queue_setup to pci ops. To make that happen, make sure virtio_ops.setup_queue return the result if we pass through the check. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Marcin Kerlin	b497724652	vhost: fix null pointer dereference Return value of function get_device() is not checked before dereference. Fix this problem by adding a check. Coverity issue: 119262 Fixes: `77d20126b4` ("vhost-user: handle message to enable vring") Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Huawei Xie	39449e7429	vhost: remove concurrent enqueue All other DPDK PMDs don't support concurrent receiving or sending packets to the same queue. The upper application should deal with this, normally through queue and core bindings. For historical reasons, vhost internally supports concurrent lockless enqueuing packets to the same virtio queue through costly cmpset operation. This patch removes this internal lockless implementation and should improve performance a bit. Luckily DPDK OVS doesn't rely on this behavior. Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Huawei Xie	7e40200c56	net/virtio: fix crash when no devargs We skip kernel managed virtio devices, if it isn't whitelisted. Before checking if the virtio device is whitelisted, check if devargs is specified. Fixes: `ac5e1d838d` ("virtio: skip error when probing kernel managed device") Reported-by: Vincent Li <vincent.mc.li@gmail.com> Signed-off-by: Huawei Xie <huawei.xie@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	a66bcad322	vhost: arrange struct fields for better cache sharing The ifname[] field takes so much space that it separates some frequently used fields into different cache lines, say, features and broadcast_rarp. This patch moves all those fields that will be accessed frequently in Rx/Tx together (before the ifname[] field) to let them share one cache line. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Huawei Xie <huawei.xie@intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	1d41d77cf8	vhost: optimize dequeue for small packets A virtio driver normally uses at least 2 desc buffers for Tx: the first for storing the header, and the others for storing the data. Therefore, we could fetch the first data desc buf before the main loop, and do the copy first before the check of "are we done yet?". This could save one check for small packets that just have one data desc buffer and need one mbuf to store it. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Huawei Xie <huawei.xie@intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	7f74b95c44	vhost: pre update used ring for Tx and Rx Pre update and update used ring in batch for Tx and Rx at the stage while fetching all avail desc idx. This would reduce some cache misses and hence, increase the performance a bit. Pre update would be feasible as guest driver will not start processing those entries as far as we don't update "used->idx". (I'm not 100% certain I don't miss anything, though). Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	39cac2adca	net/vhost: add client option Add client option to vhost pmd, to let it act as the vhost-user client. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	2345e3be86	examples/vhost: add client option Add --client option to let vhost-switch act as the client. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	0823c1cb0a	vhost: workaround stale vring base When DPDK app crashes (or quits, or gets killed), a restart of DPDK app would get stale vring base from QEMU. That would break the kernel virtio net completely, making it stop working unless a driver reset is done. So, instead of getting the stale vring base from QEMU, Huawei suggested we could get a much saner (though maybe not the most accurate) vring base from used->idx. That would work because: - there is a memory barrier between updating used ring entries and used->idx. So, even though we crashed at updating the used ring entries, it will not cause any issue, as the guest driver will not process those stale used entries, for used->idx is not updated yet. - DPDK processes the vring in order, which means a crash may just lead to some packet retransmission for Tx and drops for Rx. Suggested-by: Huawei Xie <huawei.xie@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Huawei Xie <huawei.xie@intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	e623e0c6d8	vhost: add reconnect ability Allow reconnecting on failure by default when: - DPDK app starts first and QEMU (as the server) is not started yet. Without reconnecting, DPDK app would simply fail on vhost-user registration. - QEMU restarts, say due to OS reboot. Without reconnecting, you can't re-establish the connection without restarting DPDK app. This patch makes it work well for both above cases. It simply creates a new thread, which keeps calling "connect()" until it succeeds. The reconnect could be disabled when RTE_VHOST_USER_NO_RECONNECT flag is set. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:12 +02:00
Yuanhan Liu	64ab701c3d	vhost: add vhost-user client mode Add a new parameter (flags) to rte_vhost_driver_register(). DPDK vhost-user acts as client mode when RTE_VHOST_USER_CLIENT flag is set. The flags would also allow future extensions without breaking the API (again). The rest is straightforward then: allocate a unix socket, and bind/listen for server, connect for client. This extension is for vhost-user only, therefore we simply quit and report error when any flags are given for vhost-cuse. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:47:07 +02:00
Yuanhan Liu	9ebcd4f9c7	vhost: rename structs for enabling client mode DPDK vhost-user just acts as server so far, so, using a struct named as "vhost_server" is okay. However, if we add client mode, it doesn't make sense any more. Here renames it to "vhost_user_socket". There was no obvious wrong about "connfd_ctx", but I think it's obviously better to rename it to "vhost_user_connection", as it does represent a connection, a connection between the backend (DPDK) and the frontend (QEMU). Similarly, few more renames are taken, such as "vserver_new_vq_conn" to "vhost_user_new_connection". Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-06-22 09:44:21 +02:00
Ilya Maximets	7fd5dde987	vhost: make buffer vector for scatter Rx local Array of buf_vector's is just an array for temporary storing information about available descriptors. It used only locally in virtio_dev_merge_rx() and there is no reason for that array to be shared. Fix that by allocating local buf_vec inside virtio_dev_merge_rx(). Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:44:21 +02:00
Yuanhan Liu	e197bd630a	vhost: make virtio header length per device Virtio net header length is set per device, but not per queue. So, there is no reason to store it in vhost_virtqueue struct, instead, we should store it in virtio_net struct, to make one copy only. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:44:20 +02:00
Yuanhan Liu	004b8ca8b5	vhost: reserve few more space for future extension "virtio_net_device_ops" is the only remaining open struct that an application can access, therefore, it's the only place that might introduce a potential ABI break in future extensions. So, do some reservation for it. 5 should be plenty, considering that we have barely touched it for a long while. Another reason to choose 5 is for cache alignment: 5 makes the struct 64 bytes on a 64-bit machine. With this, we can say with some confidence that we might be free from ABI violations forever. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:43:01 +02:00
Yuanhan Liu	f0fa04e35e	vhost: remove virtio-net.h It barely has anything useful there, just 2 functions prototype. Here move them to vhost-net.h, and delete it. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:43:01 +02:00
Yuanhan Liu	758e3471b4	vhost: remove unnecessary fields The "reserved" field in virtio_net and vhost_virtqueue struct is not necessary any more. We now expose virtio_net device with a number "vid". This patch also removes the "priv" field: all fields are private now, so an application can't access them. The only way that we could still access it is to expose it by a function, but I doubt that's needed or worthwhile. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:43:01 +02:00
Yuanhan Liu	db69be54b6	vhost: hide internal code We are now safe to move all those internal structs/macros/functions to vhost-net.h, to hide them from external access. This patch also breaks long lines and removes some redundant comments. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:43:01 +02:00
Yuanhan Liu	4ecf22e356	vhost: export device id as the interface to applications With all the previous prepare works, we are just one step away from the final ABI refactoring. That is, to change current API to let them stick to vid instead of the old virtio_net dev. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:42:57 +02:00
Yuanhan Liu	16ae8abe1c	vhost: remove dependency on device private field This change could let us avoid the dependency of "virtio_net" struct, to prepare for the ABI refactoring. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:07:36 +02:00
Yuanhan Liu	a67f286a65	vhost: export queue free entries The new API rte_vhost_avail_entries() is actually a rename of rte_vring_available_entries(), with the "vring" to "vhost" name change to keep the consistency of other vhost exported APIs. This change could let us avoid the dependency of "virtio_net" struct, to prepare for the ABI refactoring. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Tested-by: Rich Lane <rich.lane@bigswitch.com> Acked-by: Rich Lane <rich.lane@bigswitch.com>	2016-06-22 09:02:58 +02:00
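
Note on `c175e542c0` (mk: fix parallel build of test resources): the message says the dot in the input file name is replaced by an underscore to predict the symbol names computed by objcopy. A minimal sketch of that prediction follows, assuming the usual `_binary_<name>_<suffix>` scheme of `objcopy -I binary`, where characters that are not alphanumeric become underscores. The helper name `sketch_objcopy_symbol` is hypothetical and is not part of the DPDK build system.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: predict the symbol objcopy would emit for a binary
 * input file, e.g. ("test_resource.res", "start") gives
 * "_binary_test_resource_res_start".
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int
sketch_objcopy_symbol(const char *input, const char *suffix,
		      char *out, size_t outlen)
{
	char *p;

	assert(input != NULL && suffix != NULL && out != NULL);

	/* "_binary_" (8) + '_' (1) + terminating NUL (1) = sizeof("_binary__") */
	if (strlen(input) + strlen(suffix) + sizeof("_binary__") > outlen)
		return -1;

	snprintf(out, outlen, "_binary_%s_%s", input, suffix);

	/* Every character that is not alphanumeric becomes an underscore. */
	for (p = out; *p != '\0'; p++)
		if (!isalnum((unsigned char)*p))
			*p = '_';

	return 0;
}
```

Giving each resource its own input file name, as the commit does, makes each predicted symbol distinct, which is what removes the race on a shared temporary file.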
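
Note on `7e1eb993f2` (net/virtio: fix crash on querying xstats): the fix reverses a NULL check so that the names array is only written when the caller supplied one. The sketch below shows that pattern in isolation. The struct, the constants, the stat labels and the function name are all hypothetical stand-ins, and returning the stat count when the array is NULL is an assumption about the calling convention, not a copy of the driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_NAME_SIZE 64	/* hypothetical name buffer size */
#define SKETCH_NB_STATS 3	/* hypothetical number of statistics */

struct sketch_xstat_name {
	char name[SKETCH_NAME_SIZE];
};

/*
 * Fill xstats_names only when the caller passed a buffer. A NULL buffer is
 * treated as "just tell me how many entries there are".
 */
static int
sketch_xstats_get_names(struct sketch_xstat_name *xstats_names)
{
	static const char *const labels[SKETCH_NB_STATS] = {
		"good_packets", "good_bytes", "errors"
	};
	size_t i;

	/* The buggy form tested "== NULL" here and then dereferenced NULL. */
	if (xstats_names != NULL) {
		for (i = 0; i < SKETCH_NB_STATS; i++) {
			int n = snprintf(xstats_names[i].name,
					 sizeof(xstats_names[i].name),
					 "rx_q0_%s", labels[i]);
			/* The labels above are short enough to always fit. */
			assert(n > 0 && (size_t)n < sizeof(xstats_names[i].name));
		}
	}

	return SKETCH_NB_STATS;
}
```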
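
Note on `64ab701c3d` (vhost: add vhost-user client mode): the message summarizes the change as "allocate a unix socket, and bind/listen for server, connect for client". The sketch below shows that split using plain POSIX sockets. The function name and the `is_client` argument are hypothetical; the real API selects the mode through the `RTE_VHOST_USER_CLIENT` flag passed to `rte_vhost_driver_register()`, which is not reproduced here.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Open a unix stream socket on path. In server mode the socket is bound and
 * put into listening state; in client mode it is connected to an existing
 * server. Returns the file descriptor, or -1 on any failure.
 */
static int
sketch_vhost_user_socket(const char *path, int is_client)
{
	struct sockaddr_un un;
	int fd;

	assert(path != NULL);

	if (strlen(path) >= sizeof(un.sun_path))
		return -1;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&un, 0, sizeof(un));
	un.sun_family = AF_UNIX;
	strcpy(un.sun_path, path);	/* length was checked above */

	if (is_client) {
		if (connect(fd, (struct sockaddr *)&un, sizeof(un)) < 0)
			goto fail;
	} else {
		if (bind(fd, (struct sockaddr *)&un, sizeof(un)) < 0 ||
		    listen(fd, 1) < 0)
			goto fail;
	}

	return fd;

fail:
	close(fd);
	return -1;
}
```

In client mode the `connect()` call fails when no server is listening yet, which is the case the later reconnect commit (`e623e0c6d8`) handles by retrying from a separate thread.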


4834 Commits