The VMXNet3 protocol has a start-of-packet (SOP) and end-of-packet (EOP)
marker. If there was a bug where mbuf arrived without SOP the code that
chains the mbuf would dereference a null pointer.
Also, record any mbuf's dropped in statistics.
Although did the initial code no longer have access to VMware.
Compile tested only!
Coverity issue: 124563
Fixes: 8ee787ce80 ("vmxnet3: remove asserts that confuse coverity")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yong Wang <yongwang@vmware.com>
When calling to setup RSS on v4 API, ESX will expect
IPv4/6 TCP RSS to be set/requested mandatory.
This patch will:
- Set IPv4/6 TCP RSS when these have not been set. A warning
message is thrown to make sure we warn the application we are
setting IPv4/6 TCP RSS when not set.
- An additional check has been added to dodge RSS configuration
altogether unless MQ_RSS has been requested, similar to v3.
The alternative (returning error) was considered, the intent
is to ease the task of setting up and running vmxnet3 in situations
where it's supposed to be most straightforward (testpmd, pktgen).
Bugzilla ID: 400
Fixes: 643fba7707 ("net/vmxnet3: add v4 boot and guest UDP RSS config")
Cc: stable@dpdk.org
Signed-off-by: Eduard Serra <eserra@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch introduces:
- VMxnet3 v4 negotiation and,
- entirely guest-driven UDP RSS support.
VMxnet3 v3 already has UDP RSS support, however it
depends on hypervisor provisioning on the VM through
ESX specific flags, which are not transparent or known
to the guest later on.
Vmxnet3 v4 introduces a new API transaction which allows
configuring RSS entirely from the guest. This API must be
invoked after device shared mem region is initialized.
IPv4 ESP RSS (SPI based) is also available, but currently
there are no ESP RSS definitions on rte_eth layer to
handle that.
Signed-off-by: Eduard Serra <eserra@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Since below commit, several tx_prep functions are broken, they fail to
pass supported Tx offload features check:
PKT_TX_IPVx must be set when any PKT_TX_L4 checksum is requested,
but these values are not present in the mask of supported Tx offloads
of several drivers that advertise PKT_TX_L4_MASK.
So any packet sent to those drivers with a L4 checksum request and
one of PKT_TX_IPVx bit set is rejected by the tx prepare function.
Fixes: 1037ed842c ("mbuf: fix Tx offload mask")
Cc: stable@dpdk.org
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.
This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.
This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().
In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Packets containing empty segments are dropped by hypervisor, prevent
this case by skipping empty segments in transmission.
Also drop empty mbufs to be sure that at least one segment is transmitted
for each mbuf.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
When several TCP fragments are contained in a packet that is only one mbuf
segment long, vmxnet3 receives an empty segment following first one, that
contains offload information. In current version, this segment is
propagated as is to upper application.
Remove those empty segments directly when receiving buffers, they may
generate unneeded extra processing in the upper application.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Not so old variants of vmxnet3 do not provide MSS value along with
LRO packet. When this case happens, try to guess MSS value with
information at hand.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Add support for IPv6, LRO and properly set packet type in all
supported cases.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
In case we are working on a multisegment buffer, most bit are set
in last segment of the buffer. Correctly look at those bits in eop part
of the rx_offload function.
Fixes: 2fdd835f99 ("vmxnet3: support jumbo frames")
Cc: stable@dpdk.org
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Offloads are split between first and last segment of a packet.
Call a single vmxnet3_rx_offload function that will contain all
offload operations. This patch does not introduce any code modification.
Pass a vmxnet3_hw as parameter to the function, it is not presently
used in this patch, but will be later used for TSO offloads.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Rather than parsing IP header to get proper ptype to return, just return
RTE_PTYPE_L3_IPV4_EXT_UNKNOWN, that tells application that we have an IP
packet with unknown header length.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Yong Wang <yongwang@vmware.com>
If a reconfiguration happens, queuedesc is reallocated. Any queues that
are preserved point to the previous queuedesc since the queues are only
configured during queue setup. Delay configuration of the shared queue
pointers until device start when queuedesc is no longer changing.
Fixes: 8618d19b52 ("net/vmxnet3: reallocate shared memzone on re-config")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <chas3@att.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
With bonding, after sending sufficient ipv4 packets,
bond_ethdev_rx_burst_8023ad() no longer recognizes LACP packets
because the packet_type is set to RTE_PTYPE_L3_IPV4.
Ensure packet_type is reset for non-ipv4 packets in vmxnet3_rx_offload.
Signed-off-by: George Wilkie <george.wilkie@intl.att.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
At the end of the queue release, we can free the containers for the
queue objects.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <chas3@att.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
The following inline functions and macros have been renamed to be
consistent with the IOVA wording:
rte_mbuf_data_dma_addr -> rte_mbuf_data_iova
rte_mbuf_data_dma_addr_default -> rte_mbuf_data_iova_default
rte_pktmbuf_mtophys -> rte_pktmbuf_iova
rte_pktmbuf_mtophys_offset -> rte_pktmbuf_iova_offset
The deprecated functions and macros are kept to avoid breaking the API.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The struct rte_memzone field .phys_addr is renamed to .iova.
The deprecated name is kept in an anonymous union to avoid breaking
the API.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated for a while.
As explained in [1], these flags were kept to let the applications and
PMDs move to the new flag. There is also a need to support Rx vlan
offload without vlan strip (at least for the ixgbe driver).
This patch renames the old flags for this feature, knowing that some
PMDs were using PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT to indicate that
the vlan tci has been saved in the mbuf structure.
It is likely that some PMDs do not set the proper flags when doing vlan
offload, and it would be worth making a pass on all of them.
Link: [1] http://dpdk.org/ml/archives/dev/2017-June/067712.html
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Coverity reports check_after_deref:
Null-checking rq suggests that it may be null, but it
has already been dereferenced on all paths leading to
the check.
This patch removes NULL checking of "rq" from function
vmxnet3_dev_rx_queue_reset as it is already checked against NULL
one level up the callstack (function vmxnet3_dev_clear_queues).
Coverity issue: 143468
Fixes: 5aecdc17a9 ("vmxnet3: fix stop/restart")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
vmxnet3 Rx processing should replenish ring buffers after new buffers
are available to prevent the interface from getting stuck in a state
that no new work is processed.
Signed-off-by: David Harton <dharton@cisco.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
Refactor vmxnet3_post_rx_bufs() to call vmxnet3_renew_desc()
to update the newly allocated mbufs. While here, relocate the
relevant comments to vmxnet3_renew_desc().
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
remove __rte_unused instances that are not required.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
If the user reconfigures the queue size, then the previously allocated
memzone may potentially be too small. Release the memzone when a queue
is released and allocate a new one each time a queue is setup.
While here convert to rte_eth_dma_zone_reserve() which does basically
the same things as the private function.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
vmxnet3 driver preallocates buffers for receiving packets and posts the
buffers to the emulation. In order to deliver a received packet to the
guest, the emulation must map buffer(s) and copy the packet into it.
To avoid this memory mapping overhead, this patch introduces the receive
data ring - a set of small sized buffers that are always mapped by
the emulation. If a packet fits into the receive data ring buffer, the
emulation delivers the packet via the receive data ring (which must be
copied by the guest driver), or else the usual receive path is used.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
vmxnet3 driver supports transmit data ring viz. a set of fixed size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into these buffers entirely.
Currently this buffer size of fixed at 128 bytes. This patch extends
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
Our use case is that we have an app that needs to keep mbufs around
for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
descriptors (where it returns -err). Since there are no mbufs that the
virtual hardware can use, no packets will be received after this; the
driver won't refill the mbuf after this so it gets stuck in this
state. I call this a deadlock for lack of a better term - the virtual
HW waits for free mbufs, while the app waits for the hardware to
notify it for data (by flipping the generation bit on the used Rx
descriptors). Note that after this, the app can't recover.
This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port
it, address review comments and also reverted the allocation
failure handling to the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
that's the only approach that seems to work, and seems to be what
other drivers are doing (I checked ixgbe and em). Reusing the mbuf
that's getting passed to the application doesn't seem to make
sense, and it was causing weird issues in our app. Also, reusing
rxm without checking if it's NULL could cause the code to crash.
Fixes: 14680e3747 ("vmxnet3: improve Rx performance")
Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Acked-by: Yong Wang <yongwang@vmware.com>
During device reset/stop, vmxnet3 releases all mbufs in tx and
rx cmd ring. For rx, we should go over all ring descriptors and
free using rte_pktmbuf_free_seg() instead of rte_pktmbuf_free()
as the metadata of the mbuf might not be properly initialized
(initialization after mempool creation is done in the rx routine)
and the mbuf should always be a single-segment one when populated.
For tx, we can use the existing way as mbuf, if any, will be a
valid one stashed in the eop.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Signed-off-by: Yong Wang <yongwang@vmware.com>
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.
Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver has been removed.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Remove 0x prefix for %p format to prevent double 0x in logs
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Yong Wang <yongwang@vmware.com>
The VLAN tag information should be stored in the first mbuf of a chain
of buffers, not in the last one.
Fixes: 9fd5e98b62 ("vmxnet3: support RSS and refactor Rx offload")
Signed-off-by: John Guzik <john@shieldxnetworks.com>
Acked-by: Yong Wang <yongwang@vmware.com>
The behavior of PKT_RX_VLAN_PKT was not very well defined, resulting in
PMDs not advertising the same flags in similar conditions.
Following discussion in [1], introduce 2 new flags PKT_RX_VLAN_STRIPPED
and PKT_RX_QINQ_STRIPPED that are better defined:
PKT_RX_VLAN_STRIPPED: a vlan has been stripped by the hardware and its
tci is saved in mbuf->vlan_tci. This can only happen if vlan stripping
is enabled in the RX configuration of the PMD.
For now, the old flag PKT_RX_VLAN_PKT is kept but marked as deprecated.
It should be removed from applications and PMDs in a future revision.
This patch also updates the drivers. For PKT_RX_VLAN_PKT:
- e1000, enic, i40e, mlx5, nfp, vmxnet3: done, PKT_RX_VLAN_PKT already
had the same meaning than PKT_RX_VLAN_STRIPPED, minor update is
required.
- fm10k: done, PKT_RX_VLAN_PKT already had the same meaning than
PKT_RX_VLAN_STRIPPED, and vlan stripping is always enabled on fm10k.
- ixgbe: modification done (vector and normal), the old flag was set
when a vlan was recognized, even if vlan stripping was disabled.
- the other drivers do not support vlan stripping.
For PKT_RX_QINQ_PKT, it was only supported on i40e, and the behavior was
already correct, so we can reuse the same bit value for
PKT_RX_QINQ_STRIPPED.
[1] http://dpdk.org/ml/archives/dev/2016-April/037837.html,
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The macro RTE_VERIFY always checks a condition.
It is optimized with "unlikely" hint.
While this macro is well suited for test applications, it is preferred
in libraries and examples to enable such check in debug mode.
That's why the macro RTE_ASSERT is introduced to call RTE_VERIFY only
if built with debug logs enabled.
A lot of assert macros were duplicated and enabled with a specific flag.
Removing these #ifdef allows to test these code branches more easily
and avoid dead code pitfalls.
The ENA_ASSERT is kept (in debug mode only) because it has more
parameters to log.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
These asserts are only for debugging and never fired during
any testing, but they confuse coverity's null tracking.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yong Wang <yongwang@vmware.com>
Now that vmxnet3 supports TCP/UDP checksum offload, let's update
the default txq flags to allow such offloads. Also fixed the tx
queue setup check to allow TCP/UDP checksum and only error out
if SCTP checksum is requested.
Fixes: f598fd063b ("vmxnet3: add Tx L4 checksum offload")
Reported-by: Heng Ding <hengx.ding@intel.com>
Signed-off-by: Yong Wang <yongwang@vmware.com>
Add support for linking multi-segment buffers together to
handle Jumbo packets. The vmxnet3 API supports having header
and body buffer types. What this patch does is fill the primary
ring completely with header buffers and the secondary ring
with body buffers. This allows for non-jumbo frames to only
use one mbuf (from primary ring); and jumbo frames will have
first mbuf from primary ring and following mbufs from other
ring.
This could be optimized in future if the DPDK had API
to supply different sized mbufs (two pools) into driver.
Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Release note addition:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This commit adds vmxnet3 TSO support.
Verified with test-pmd (set fwd csum) that both tso and
non-tso pkts can be successfully transmitted and all
segmentes for a tso pkt are correct on the receiver side.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>