numam-dpdk

Author	SHA1	Message	Date
Emmanuel Roullit	68759bbe73	vhost: remove unneeded variable assignment Found with clang static analysis: lib/librte_vhost/vhost_user.c:996:3: warning: Value stored to 'ret' is never read ret = vhost_user_get_vring_base(dev, &msg.payload.state); ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2017-01-30 13:47:20 +01:00
Yuanhan Liu	b8b992e93f	vhost: fix long stall of negotiation Setting up the mapping from GPA (guest physical address) to HPA (host physical address) could be very time consuming when the guest memory is backed with small pages (4K). The bigger the guest memory, the longer it takes. This could lead to a very long vhost-user negotiation. Since the mapping is only needed in zero copy mode so far, we could avoid such time consuming setup when zero copy is turned off (which is the default case). It's actually a workaround, a right fix might be to start a new thread, and hide the big latency there. Fixes: e246896178e6 ("vhost: get guest/host physical address mappings") Cc: stable@dpdk.org Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2017-01-28 14:25:40 +01:00
Maxime Coquelin	73c8f9f69c	vhost: introduce reply ack feature REPLY_ACK features provide a generic way for QEMU to ensure both completion and success of a request. As described in vhost-user spec in QEMU repository, QEMU sets VHOST_USER_NEED_REPLY flag (bit 3) when expecting a reply_ack from the backend. Backend must reply with 0 for success or non-zero otherwise when flag is set. Currently, only VHOST_USER_SET_MEM_TABLE request implements reply_ack, in order to synchronize mapping updates. This patch enables REPLY_ACK feature generally, but only checks error code for VHOST_USER_SET_MEM_TABLE. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2017-01-17 09:20:18 +01:00
Haifeng Lin	8c33fc10f6	vhost: fix guest/host physical address mapping When reg_size < page_size the function read in rte_mem_virt2phy would not return, because host_user_addr is invalid. Fixes: e246896178e6 ("vhost: get guest/host physical address mappings") Cc: stable@dpdk.org Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2017-01-17 09:20:17 +01:00
Zhihong Wang	f689586bc0	vhost: shadow used ring update The basic idea is to shadow the used ring update: update them into a local buffer first, and then flush them all to the virtio used vring at once in the end. And since we do avail ring reservation before enqueuing data, we would know which and how many descs will be used. Which means we could update the shadow used ring at the reservation time. It also introduces another slight advantage: we don't need to access the desc->flag any more inside copy_mbuf_to_desc_mergeable(). Signed-off-by: Zhihong Wang <zhihong.wang@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-10-26 13:39:09 +02:00
Yuanhan Liu	b0a985d1f3	vhost: add dequeue zero copy The basic idea of dequeue zero copy is, instead of copying data from the desc buf, here we let the mbuf reference the desc buf addr directly. Doing so, however, has one major issue: we can't update the used ring at the end of rte_vhost_dequeue_burst. Because we don't do the copy here, an update of the used ring would let the driver reclaim the desc buf. As a result, DPDK might reference a stale memory region. To update the used ring properly, this patch does several tricks: - when mbuf references a desc buf, refcnt is added by 1. This is to pin lock the mbuf, so that a mbuf free from the DPDK won't actually free it, instead, refcnt is subtracted by 1. - We chain all those mbuf together (by tailq). And we check it every time on the rte_vhost_dequeue_burst entrance, to see if the mbuf is freed (when refcnt equals 1). If that happens, it means we are the last user of this mbuf and we are safe to update the used ring. - "struct zcopy_mbuf" is introduced, to associate an mbuf with the right desc idx. Dequeue zero copy is introduced for performance reasons, and some rough tests show about 50% performance boost for packet size 1500B. For small packets, (e.g. 64B), it actually slows down a bit (well, it could be up to 15%). That is expected because this patch introduces some extra work, and it outweighs the benefit from saving a few bytes of copying. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:14 +02:00
Yuanhan Liu	f6be82d725	vhost: introduce last available index for dequeue So far, we retrieve both the used ring and avail ring idx by the var last_used_idx; it won't be a problem because the used ring is updated immediately after those avail entries are consumed. But that's not true when dequeue zero copy is enabled: then the used ring is updated only when the mbuf is consumed. Thus, we need to use another var to note the last avail ring idx we have consumed. Therefore, last_avail_idx is introduced. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:12 +02:00
Yuanhan Liu	e246896178	vhost: get guest/host physical address mappings So that we can convert a guest physical address to host physical address, which will be used in later Tx zero copy implementation. MAP_POPULATE is set while mmapping guest memory regions, to make sure the page tables are set up and then rte_mem_virt2phy() could yield proper physical address. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:09 +02:00
Yuanhan Liu	552e8fd3d2	vhost: simplify memory regions handling Due to historical reasons (that vhost-cuse comes before vhost-user), some fields for maintaining the vhost-user memory mappings (such as mmapped address and size, with those we then can unmap on destroy) are kept in "orig_region_map" struct, a structure that is defined only in vhost-user source file. The right way to go is to remove the structure and move all those fields into virtio_memory_region struct. But we simply couldn't do that before, because it breaks the ABI. Now, thanks to the ABI refactoring, it's no longer a blocking issue. And here it goes: this patch removes orig_region_map and redefines virtio_memory_region, to include all necessary info. With that, we can simplify the guest/host address conversion a bit. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:44:56 +02:00
Yuanhan Liu	484e42d46f	vhost: simplify features set/get No need to use a pointer to store/retrieve features. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-09-13 05:25:09 +02:00
Yuanhan Liu	bbd7e83520	vhost: get device once Invoke get_device() at the beginning of vhost_user_msg_handler, so that we could check the return value once. Which could save tons of duplicate get-and-check device. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-09-13 05:25:08 +02:00
Yuanhan Liu	fc2a9b5b64	vhost: unify function names Some functions are with prefix "user_", while others with "vhost_". Making them all starting with "vhost_user_" to unify the function names. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-09-13 05:25:08 +02:00
Yuanhan Liu	de854fd95d	vhost: fold common message handlers Due to historical reasons (that we had 2 vhost implementations), some messages are handled in two calls: the vhost specific implementation handles it first and then invokes the common one for further handling. We have one implementation only now, so we could write one method for each message. Here fold those common handlers into the corresponding vhost user handlers. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-09-13 05:25:08 +02:00
Yuanhan Liu	a277c71598	vhost: refactor code structure The code structure is a bit messy now. For example, vhost-user message handling is spread to three different files: vhost-net-user.c virtio-net.c virtio-net-user.c Where, vhost-net-user.c is the entrance to handle all those messages and then invoke the right method for a specific message. Some of them are stored at virtio-net.c, while others are stored at virtio-net-user.c. The truth is all of them should be in one file, vhost_user.c. So this patch refactors the source code structure: mainly on renaming files and moving code from one file to another file that is more suitable for storing it. Thus, no functional changes are made. After the refactor, the code structure becomes: - socket.c handles all vhost-user socket file related stuff, such as, socket file creation for server mode, reconnection for client mode. - vhost.c mainly on stuff like vhost device creation/destroy/reset. Most of the vhost API implementations are there, too. - vhost_user.c all stuff about vhost-user messages handling goes there. - virtio_net.c all stuff about virtio-net should go there. It has virtio net Rx/Tx implementation only so far: it's just a rename from vhost_rxtx.c Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2016-09-13 05:25:08 +02:00

14 Commits