numam-dpdk

Author	SHA1	Message	Date
Damjan Marion	5ea0942129	net/i40e: add packet type metadata in vector Rx The ptype is decoded from the Rx descriptor and stored in the packet type field in the mbuf using the same function in the non-vector driver. Signed-off-by: Damjan Marion <damarion@cisco.com> Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Jing Chen <jing.d.chen@intel.com>	2016-10-13 15:30:59 +02:00
Eric Kinzie	c771e4ef38	net/bonding: enable slave VLAN filter SR-IOV virtual functions cannot rely on promiscuous mode for the reception of VLAN tagged frames. Program the VLAN filter for each slave when a VLAN is configured for the bonding master. Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>	2016-10-13 15:30:59 +02:00
Eric Kinzie	b4092cacf6	net/bonding: validate speed after link up It's possible for the bonding driver to mistakenly reject an interface based on its, as yet, unnegotiated link speed and duplex. Always allow the interface to be added to the bonding interface but require link properties validation to succeed before the slave is activated. Fixes: 2efb58cbab6e ("bond: new link bonding library") Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>	2016-10-13 15:30:59 +02:00
Jingjing Wu	edf1b61831	doc: add limitations for i40e PMD This patch adds "Limitations or Known issues" section for i40e PMD, including three items: 1. MPLS packet classification on X710/XL710 2. 16 Byte Descriptor cannot be used on DPDK VF 3. Link down with i40e kernel driver after DPDK application exit Signed-off-by: Jingjing Wu <jingjing.wu@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2016-10-13 15:30:59 +02:00
Nélio Laranjeiro	ecf60761fc	net/mlx5: return RSS hash result in mbuf Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2016-10-13 15:30:59 +02:00
Ferruh Yigit	473c4957a4	kni: move kernel version ifdefs to compat header Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:12:26 +02:00
Ferruh Yigit	dd6d3570c8	kni: prefer uint32_t to unsigned int Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:12:11 +02:00
Ferruh Yigit	e4728a1cfa	kni: update log messages Remove some function entrance logs and change the log level of some logs. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:12:00 +02:00
Ferruh Yigit	f13fbc033a	kni: remove compile time debug configuration Since the switch to kernel dynamic debugging, it is possible to remove the compile time debug log configuration. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:11:26 +02:00
Ferruh Yigit	dec1ffcae7	kni: move functions to eliminate declarations Function implementations are kept the same. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:11:12 +02:00
Ferruh Yigit	d2d7d6fc5f	kni: remove unnecessary messages for out of memory Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:09:12 +02:00
Ferruh Yigit	dd3e4e36d4	kni: update kernel logging Switch to dynamic logging functions. Depending on the kernel configuration, this may cause previously visible logs to disappear. How to enable dynamic logging: https://www.kernel.org/doc/Documentation/dynamic-debug-howto.txt Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:09:02 +02:00
Ferruh Yigit	05788ff054	kni: prefer ether_addr_copy to memcpy Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:08:44 +02:00
Ferruh Yigit	e479813505	kni: prefer min_t to min Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:08:40 +02:00
Ferruh Yigit	88b7a2a7bd	kni: enclose macros with complex values in parens Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:08:14 +02:00
Ferruh Yigit	59b36980e4	kni: do not use assignment in if condition Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:08:04 +02:00
Ferruh Yigit	50e25e4049	kni: move trailing statement on next line Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:07:16 +02:00
Ferruh Yigit	fa34b39dfd	kni: move comparison constants on the right Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:06:42 +02:00
Ferruh Yigit	e227435ec0	kni: remove useless return Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:06:38 +02:00
Ferruh Yigit	0861751c93	kni: prefer unsigned int to unsigned Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:06:30 +02:00
Ferruh Yigit	1b9190aff3	kni: fix spacing and line lengths Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:05:31 +02:00
Ferruh Yigit	9afcc5bc74	kni: make static struct const Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:05:28 +02:00
Ferruh Yigit	fbd71b623f	kni: uninitialize global variables Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 23:05:20 +02:00
Ferruh Yigit	f60c4df6bb	kni: move externs to the header file Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 22:49:41 +02:00
Vladyslav Buslov	93a298b34e	kni: support core id parameter in single threaded mode Allow binding KNI thread to specific core in single threaded mode by setting core_id and force_bind config parameters. Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>	2016-10-13 22:24:45 +02:00
Wei Dai	231fa88ed5	app/test: verify LPM tbl8 recycle As a bug-fix for lpm tbl8 recycle is introduced, add a test case to verify tbl8 group is correctly freed when it only includes a rule with depth=24. Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2016-10-13 22:17:41 +02:00
Wei Dai	f05b0fbe7d	lpm: remove redundant check when adding rule When a rule with depth > 24 is added into an existing rule with depth <=24, a new tbl8 is allocated, and the existing rule first fills the whole new tbl8, so the valid field of each entry in this tbl8 is always true and the depth of each entry is always <= 24 before adding the new rule with depth > 24. Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2016-10-13 22:17:00 +02:00
Wei Dai	69ed52dddc	lpm: fix freeing unused sub-table on rule delete When all rules with depth > 24 are deleted in the same sub-table (tbl8 group) and only a rule with depth <=24 is left in it, this sub-table (tbl8 group) should be recycled. Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field") Fixes: af75078fece3 ("first public release") Signed-off-by: Wei Dai <wei.dai@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2016-10-13 22:13:19 +02:00
John Ousterhout	844bd77c03	log: respect logger configured before EAL init Before this patch, application-specific loggers could not be installed before rte_eal_init completed (the initialization process called rte_openlog_stream, overwriting any previously installed logger). This made it impossible for an application to capture the initial log messages generated during rte_eal_init. This patch changes initialization so that information from a previous call to rte_openlog_stream is not lost. Specifically: * The default log stream is now maintained separately from an application-specific log stream installed with rte_openlog_stream. * rte_eal_common_log_init has been renamed to eal_log_set_default, since this is all it does. It no longer invokes rte_openlog_stream; it just updates the default stream. Also, this method now returns void, rather than int, since there are no errors. This patch also removes the "early log" mechanism and cleans up the log initialization mechanism: * The default log stream defaults to stderr on all platforms if eal_log_set_default hasn't been invoked (Linux used to use stdout during the first part of initialization). * Removed rte_eal_log_early_init; all of the desired functionality can be achieved by calling eal_log_set_default. * Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one function, rte_eal_log_init, which is not needed or invoked for BSD. * Removed declaration for eal_default_log_stream in rte_log.h (it's now private to eal_common_log.c). * Moved call to rte_eal_log_init earlier in rte_eal_init for Linux, so that it starts using the preferred log ASAP. Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>	2016-10-13 22:13:18 +02:00
Mauricio Vasquez B	29f1cb4b38	doc: fix file argument of debug functions Previous patch updated the functions without updating all the comments. Fixes: 591a9d7985c1 ("add FILE argument to debug functions") Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it> Acked-by: John McNamara <john.mcnamara@intel.com>	2016-10-13 21:25:53 +02:00
Olivier Matz	696573046e	net/virtio: support TSO Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:56 +02:00
Olivier Matz	86d59b2146	net/virtio: support LRO Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:56 +02:00
Olivier Matz	58169a9c81	net/virtio: support Tx checksum offload Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:56 +02:00
Olivier Matz	96cb671193	net/virtio: support Rx checksum offload Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:56 +02:00
Olivier Matz	5896999295	app/testpmd: display LRO segment size In csumonly engine, display the value of LRO segment if the LRO flag is set. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:56 +02:00
Olivier Matz	6ca3a595e0	mbuf: add flag for LRO When receiving coalesced packets in virtio, the original size of the segments is provided. This is useful information because it allows resegmenting with the same size. Add a new Rx flag in mbuf, that can be set when packets are coalesced by a hardware or virtual driver when the m->tso_segsz field is valid and is set to the segment size of original packets. This flag is used in next commits in the virtio pmd. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:54 +02:00
Olivier Matz	5842289a54	mbuf: add new Rx checksum flags Following discussions in [1] and [2], introduce a new bit to describe the Rx checksum status in mbuf. Before this patch, only one flag was available: PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK. And same for L3: PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK. This had 2 issues: - it was not possible to differentiate "checksum good" from "checksum unknown". - it was not possible for a virtual driver to say "the checksum in packet may be wrong, but data integrity is valid". This patch tries to solve this issue by having 4 states (2 bits) for the IP and L4 Rx checksums. New values are: - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum -> the application should verify the checksum by sw - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong -> the application can drop the packet without additional check - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid -> the application can accept the packet without verifying the checksum by sw - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet data, but the integrity of the L4 data is verified. -> the application can process the packet but must not verify the checksum by sw. It has to take care to recalculate the cksum if the packet is transmitted (either by sw or using tx offload) And same for L3 (replace L4 by IP in description above). This commit tries to be compatible with existing applications that only check the existing flag (CKSUM_BAD). [1] http://dpdk.org/ml/archives/dev/2016-May/039920.html [2] http://dpdk.org/ml/archives/dev/2016-June/040007.html Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:45:33 +02:00
Olivier Matz	c442fed81b	net: add function to calculate checksum in mbuf This function can be used to calculate the checksum of data embedded in mbuf, that can be composed of several segments. This function will be used by the virtio pmd in next commits to calculate the checksum in software in case the protocol is not recognized. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:44:18 +02:00
Olivier Matz	60e6f4707e	net/virtio: reinitialize device when configuring Add the ability to reset the virtio device in the configure callback if the features flag changed since previous reset. This will be possible with the introduction of offload support in next commits. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:30:33 +02:00
Olivier Matz	45e4acd476	net/virtio: move control queue configuration Move the configuration of control queue in the configure callback. This is needed by next commit, which introduces the reinitialization of the device in the configure callback to change the feature flags. Therefore, the control queue will have to be restarted at the same place. As virtio_dev_cq_queue_setup() is called from a place where config->max_virtqueue_pairs is not available, we need to store this in the private structure. It replaces max_rx_queues and max_tx_queues which have the same value. The log showing the value of max_rx_queues and max_tx_queues is also removed since config->max_virtqueue_pairs is already displayed above. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:17:38 +02:00
Olivier Matz	198ab33677	net/virtio: move device initialization in a function Move all code related to device initialization in a new function virtio_init_device(). This commit brings no functional change, it prepares the next commits that will add the offload support. For that, it will be needed to reinitialize the device from ethdev->configure(), using this new function. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 20:15:29 +02:00
Zhihong Wang	f46f655143	vhost: fix Windows VM hang This patch fixes a Windows VM compatibility issue in DPDK 16.07 vhost code, which causes the guest to hang once any packets are enqueued when mrg_rxbuf is turned on. The fix is to set the right id and len in the used ring. As defined in virtio spec 0.95 and 1.0, in each used ring element, id means index of start of used descriptor chain, and len means total length of the descriptor chain which was written to. In the 16.07 code, however, the index of the last descriptor is assigned to id, and the length of the last descriptor is assigned to len. How to test? 1. Start testpmd in the host with a vhost port. 2. Start a Windows VM image with qemu and connect to the vhost port. 3. Start io forwarding with tx_first in host testpmd. For 16.07 code, the Windows VM will hang once any packets are enqueued. Signed-off-by: Zhihong Wang <zhihong.wang@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-13 10:29:31 +02:00
Yuanhan Liu	4ce97c6f6b	net/vhost: add an option to enable dequeue zero copy Add an option, dequeue-zero-copy, to enable this feature in vhost-pmd. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-13 10:29:31 +02:00
Yuanhan Liu	00b8b70666	examples/vhost: add --dequeue-zero-copy option Add an option, --dequeue-zero-copy, to enable dequeue zero copy. One thing worth noting while using dequeue zero copy is that nb_tx_desc has to be small enough so that the eth driver will hit the mbuf free threshold easily and thus free mbufs more frequently. The reason behind that is, when dequeue zero copy is enabled, the guest Tx used vring will be updated only when the corresponding mbuf is freed. If mbufs are not freed frequently, the guest Tx vring could be starved. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-13 10:29:31 +02:00
Yuanhan Liu	9ba1e744ab	vhost: add a flag to enable dequeue zero copy Dequeue zero copy is disabled by default. Add a new flag ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` to explicitly enable it. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-13 10:29:20 +02:00
Yuanhan Liu	b0a985d1f3	vhost: add dequeue zero copy The basic idea of dequeue zero copy is, instead of copying data from the desc buf, here we let the mbuf reference the desc buf addr directly. Doing so, however, has one major issue: we can't update the used ring at the end of rte_vhost_dequeue_burst. Because we don't do the copy here, an update of the used ring would let the driver reclaim the desc buf. As a result, DPDK might reference a stale memory region. To update the used ring properly, this patch does several tricks: - When an mbuf references a desc buf, refcnt is incremented by 1. This is to pin the mbuf, so that an mbuf free from the DPDK won't actually free it; instead, refcnt is decremented by 1. - We chain all those mbufs together (by tailq), and we check the chain every time on the rte_vhost_dequeue_burst entrance, to see if the mbuf is freed (when refcnt equals 1). If that happens, it means we are the last user of this mbuf and we are safe to update the used ring. - "struct zcopy_mbuf" is introduced, to associate an mbuf with the right desc idx. Dequeue zero copy is introduced for performance reasons, and some rough tests show about a 50% performance boost for packet size 1500B. For small packets (e.g. 64B), it actually slows things down a bit (by up to 15%). That is expected because this patch introduces some extra work, which outweighs the benefit of saving a few bytes of copying. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:14 +02:00
Yuanhan Liu	f6be82d725	vhost: introduce last available index for dequeue So far, we retrieve both the used ring and avail ring idx by the var last_used_idx; it won't be a problem because the used ring is updated immediately after those avail entries are consumed. But that's not true when dequeue zero copy is enabled, where the used ring is updated only when the mbuf is consumed. Thus, we need to use another var to note the last avail ring idx we have consumed. Therefore, last_avail_idx is introduced. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:12 +02:00
Yuanhan Liu	e246896178	vhost: get guest/host physical address mappings So that we can convert a guest physical address to host physical address, which will be used in later Tx zero copy implementation. MAP_POPULATE is set while mmapping guest memory regions, to make sure the page tables are set up and then rte_mem_virt2phy() could yield proper physical address. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:45:09 +02:00
Yuanhan Liu	552e8fd3d2	vhost: simplify memory regions handling For historical reasons (vhost-cuse came before vhost-user), some fields for maintaining the vhost-user memory mappings (such as mmapped address and size, with which we can then unmap on destroy) are kept in the "orig_region_map" struct, a structure that is defined only in the vhost-user source file. The right way to go is to remove the structure and move all those fields into the virtio_memory_region struct. But we simply couldn't do that before, because it would break the ABI. Now, thanks to the ABI refactoring, it is no longer a blocking issue. And here it goes: this patch removes orig_region_map and redefines virtio_memory_region, to include all necessary info. With that, we can simplify the guest/host address conversion a bit. Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Qian Xu <qian.q.xu@intel.com>	2016-10-12 09:44:56 +02:00
Jason Wang	7a75276ef5	net/virtio: support IOMMU platform Negotiate VIRTIO_F_IOMMU_PLATFORM to have IOMMU support. Signed-off-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>	2016-10-11 10:28:34 +02:00
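
The four Rx checksum states described in commit 5842289a54 ("mbuf: add new Rx checksum flags") can be illustrated with a small sketch. The macro names, bit positions, and the helper l4_cksum_action() below are stand-ins invented for this example, not the definitions from rte_mbuf.h; in particular, encoding the NONE state as both bits set is an assumption of the sketch. Only the mapping from state to application behaviour is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the mbuf Rx L4 checksum flags. The bit
 * positions are placeholders chosen for this sketch; a real application
 * would use the definitions from rte_mbuf.h instead. */
#define RX_L4_CKSUM_BAD     (1ULL << 3)
#define RX_L4_CKSUM_GOOD    (1ULL << 8)
#define RX_L4_CKSUM_MASK    (RX_L4_CKSUM_BAD | RX_L4_CKSUM_GOOD)
#define RX_L4_CKSUM_UNKNOWN 0ULL
/* Assumption of this sketch: the fourth state is encoded as both bits set. */
#define RX_L4_CKSUM_NONE    (RX_L4_CKSUM_BAD | RX_L4_CKSUM_GOOD)

/* What the application should do with a packet, per the commit message. */
enum l4_action {
	L4_VERIFY_IN_SW,           /* UNKNOWN: verify the checksum in software */
	L4_DROP,                   /* BAD: drop without any additional check */
	L4_ACCEPT,                 /* GOOD: accept without verifying */
	L4_ACCEPT_RECOMPUTE_ON_TX  /* NONE: data is valid, checksum field is not;
	                            * recalculate it if the packet is sent on */
};

/* Decode the 2-bit L4 checksum state out of the offload flags. */
enum l4_action l4_cksum_action(uint64_t ol_flags)
{
	switch (ol_flags & RX_L4_CKSUM_MASK) {
	case RX_L4_CKSUM_BAD:
		return L4_DROP;
	case RX_L4_CKSUM_GOOD:
		return L4_ACCEPT;
	case RX_L4_CKSUM_NONE:
		return L4_ACCEPT_RECOMPUTE_ON_TX;
	default: /* RX_L4_CKSUM_UNKNOWN */
		return L4_VERIFY_IN_SW;
	}
}
```

The same decoding would apply to the IP checksum state by substituting the IP flags. Masking before the switch matters because the flags word also carries unrelated bits.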

5850 Commits