Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix
portability issues across different architectures.
CC: Yong Wang <yongwang@vmware.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Our use case is that we have an app that needs to keep mbufs around
for a while. We've seen cases when calling vmxnet3_post_rx_bufs() from
vmxet3_recv_pkts(), it might not succeed to add any mbufs to any RX
descriptors (where it returns -err). Since there are no mbufs that the
virtual hardware can use, no packets will be received after this; the
driver won't refill the mbuf after this so it gets stuck in this
state. I call this a deadlock for lack of a better term - the virtual
HW waits for free mbufs, while the app waits for the hardware to
notify it for data (by flipping the generation bit on the used Rx
descriptors). Note that after this, the app can't recover.
This fix is a rework of this patch by Marco Lee:
http://dpdk.org/dev/patchwork/patch/6575/. I had to forward port
it, address review comments and also reverted the allocation
failure handling to the first version of the patch
(http://dpdk.org/ml/archives/dev/2015-July/022079.html), since
that's the only approach that seems to work, and seems to be what
other drivers are doing (I checked ixgbe and em). Reusing the mbuf
that's getting passed to the application doesn't seem to make
sense, and it was causing weird issues in our app. Also, reusing
rxm without checking if it's NULL could cause the code to crash.
Fixes: 14680e3747 ("vmxnet3: improve Rx performance")
Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Attaching and detaching ethernet ports from an application
is not the same thing as physically removing a PCI device,
so clarify the flags indicating support. All PCI devices
are assumed to be physically removable, so no flag is
necessary in the PCI layer.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This makes struct rte_eth_dev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Only the drivers itself can decide if it could fill PCI information fields
of dev_info.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.
Today, most PCI drivers require uio/vfio.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
During device reset/stop, vmxnet3 releases all mbufs in tx and
rx cmd ring. For rx, we should go over all ring descriptors and
free using rte_pktmbuf_free_seg() instead of rte_pktmbuf_free()
as the metadata of the mbuf might not be properly initialized
(initialization after mempool creation is done in the rx routine)
and the mbuf should always be a single-segment one when populated.
For tx, we can use the existing way as mbuf, if any, will be a
valid one stashed in the eop.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Signed-off-by: Yong Wang <yongwang@vmware.com>
All macros related to driver registeration renamed from DRIVER_*
to RTE_PMD_*
This includes:
DRIVER_REGISTER_PCI -> RTE_PMD_REGISTER_PCI
DRIVER_REGISTER_PCI_TABLE -> RTE_PMD_REGISTER_PCI_TABLE
DRIVER_REGISTER_VDEV -> RTE_PMD_REGISTER_VDEV
DRIVER_REGISTER_PARAM_STRING -> RTE_PMD_REGISTER_PARAM_STRING
DRIVER_EXPORT_* -> RTE_PMD_EXPORT_*
Fix PMDINFOGEN tool to look for matches of RTE_PMD_REGISTER_*.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This change enables device LRO if requested.
The current implementation of jumbo frame Rx can be used for LRO
directly without changes.
Note that since jumbo frame uses both ring0 and ring1, it cannot
be enabled in UPT (VMDirectPath) mode.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
When adding a DPDK port to ovs-vswitchd with DPDK, the vmxnet3 device
fails to activate due to mismatched magic number. This failure causes
following operations to run: start the port, stop the port,
reconfigure and re-start the port.
During reconfigure, if there is an existing memzone, driver will reuse
it. But reconfigure may request different number of Tx/Rx queues.
This results in a memzone with wrong size and potential invalid memory
access.
To fix this, free the memzone if found and reserve a new one.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Reviewed-by: Guolin Yang <gyang@vmware.com>
Reviewed-by: Daniele Di Proietto <ddiproietto@vmware.com>
Tested-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.
Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver has been removed.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.
Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.
Update documentation for replacing an example referring to
PMD_REGISTER_DRIVER.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
As discussed in the past release, driver names are modified
to be more consistent, and the future driver should follow
this new convention.
Driver names consist of:
"driver category"_"driver folder name"_"optional extra name".
For example:
- Crypto null driver -> "crypto_null"
- Network IXGBE VF driver -> "net_ixgbe_vf"
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Moved vmware device ids macro since the driver had no such information.
Used RTE_PCI_DEVICE in place of RTE_PCI_DEV_ID_DECL* stuff.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Modify the PMD_REGISTER_DRIVER macro, adding a name argument to it. The
addition of a name argument creates a token that can be used for subsequent
macros in the creation of unique symbol names to export additional bits of
information for use by the pmdinfogen tool. For example:
PMD_REGISTER_DRIVER(ena_driver, ena);
registers the ena_driver struct as it always did, and creates a symbol
const char this_pmd_name0[] __attribute__((used)) = "ena";
which pmdinfogen can search for and extract. The subsequent macro
DRIVER_REGISTER_PCI_TABLE(ena, ena_pci_id_map);
creates a symbol const char ena_pci_tbl_export[] __attribute__((used)) =
"ena_pci_id_map";
Which allows pmdinfogen to find the pci table of this driver
Using this pattern, we can export arbitrary bits of information.
pmdinfo uses this information to extract hardware support from an object
file and create a json string to make hardware support info discoverable
later.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Remove 0x prefix for %p format to prevent double 0x in logs
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Yong Wang <yongwang@vmware.com>
The VLAN tag information should be stored in the first mbuf of a chain
of buffers, not in the last one.
Fixes: 9fd5e98b62 ("vmxnet3: support RSS and refactor Rx offload")
Signed-off-by: John Guzik <john@shieldxnetworks.com>
Acked-by: Yong Wang <yongwang@vmware.com>
The behavior of PKT_RX_VLAN_PKT was not very well defined, resulting in
PMDs not advertising the same flags in similar conditions.
Following discussion in [1], introduce 2 new flags PKT_RX_VLAN_STRIPPED
and PKT_RX_QINQ_STRIPPED that are better defined:
PKT_RX_VLAN_STRIPPED: a vlan has been stripped by the hardware and its
tci is saved in mbuf->vlan_tci. This can only happen if vlan stripping
is enabled in the RX configuration of the PMD.
For now, the old flag PKT_RX_VLAN_PKT is kept but marked as deprecated.
It should be removed from applications and PMDs in a future revision.
This patch also updates the drivers. For PKT_RX_VLAN_PKT:
- e1000, enic, i40e, mlx5, nfp, vmxnet3: done, PKT_RX_VLAN_PKT already
had the same meaning than PKT_RX_VLAN_STRIPPED, minor update is
required.
- fm10k: done, PKT_RX_VLAN_PKT already had the same meaning than
PKT_RX_VLAN_STRIPPED, and vlan stripping is always enabled on fm10k.
- ixgbe: modification done (vector and normal), the old flag was set
when a vlan was recognized, even if vlan stripping was disabled.
- the other drivers do not support vlan stripping.
For PKT_RX_QINQ_PKT, it was only supported on i40e, and the behavior was
already correct, so we can reuse the same bit value for
PKT_RX_QINQ_STRIPPED.
[1] http://dpdk.org/ml/archives/dev/2016-April/037837.html,
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Removed comparison against $CC in Makefiles as
in cross-compiling mode CC can be a different string
instead of string "gcc"
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The macro RTE_VERIFY always checks a condition.
It is optimized with "unlikely" hint.
While this macro is well suited for test applications, it is preferred
in libraries and examples to enable such check in debug mode.
That's why the macro RTE_ASSERT is introduced to call RTE_VERIFY only
if built with debug logs enabled.
A lot of assert macros were duplicated and enabled with a specific flag.
Removing these #ifdef allows to test these code branches more easily
and avoid dead code pitfalls.
The ENA_ASSERT is kept (in debug mode only) because it has more
parameters to log.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Some statistics were deprecated since release 2.1 (49f386542a).
The last deprecated counter to be used was imcasts.
The VF loopback statistics are also removed as they are used only
in igb and duplicated in extended statistics.
The new counters should be added to extended statistics.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
This patch redesigns the API to set the link speed/s configuration
of an ethernet port. Specifically:
- it allows to define a set of advertised speeds for
auto-negociation.
- it allows to disable link auto-negociation (single fixed speed).
- default: auto-negociate all supported speeds.
A flag autoneg in struct rte_eth_link indicates if link speed was a
result of auto-negociation or was fixed by configuration.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Tested-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Beilei Xing <beilei.xing@intel.com>
Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The speed numbers ETH_LINK_SPEED_ are renamed ETH_SPEED_NUM_.
The prefix ETH_LINK_SPEED_ is kept for AUTONEG and will be used
for bit flags in next patch.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Define and use ETH_LINK_UP and ETH_LINK_DOWN where appropriate.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
These asserts are only for debugging and never fired during
any testing, but they confuse coverity's null tracking.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yong Wang <yongwang@vmware.com>
Now that vmxnet3 supports TCP/UDP checksum offload, let's update
the default txq flags to allow such offloads. Also fixed the tx
queue setup check to allow TCP/UDP checksum and only error out
if SCTP checksum is requested.
Fixes: f598fd063b ("vmxnet3: add Tx L4 checksum offload")
Reported-by: Heng Ding <hengx.ding@intel.com>
Signed-off-by: Yong Wang <yongwang@vmware.com>
Add a new API rte_eth_dev_get_supported_ptypes to query what packet types
can be filled by a given device. The device should be already started or
its PMD RX burst function already decided, since the packet types supported
may vary depending on RX function.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Allow overriding the base mac address of the device.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Remy Horton <remy.horton@intel.com>
During an MTU change, the adapter is restarted. If hardware VLAN offload
is in use, this existing filter table would also be cleared. Instead,
setup the shadow table once during device initialization and just update
during restart.
vmxnet3_dev_vlan_offload_set(dev, mask) was incorrectly treating the
mask parameter as the bitmask for vlan_strip and vlan_filter, whereas
the mask indicates only what has changed - the values for
vlan_stripping and vlan_filter needs to be taken from dev_conf.rxmode.
Fixes: f003fc3834 ("vmxnet3: enable vlan filtering")
Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com>
Signed-off-by: Nachiketa Prachanda <nprachan@brocade.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Remy Horton <remy.horton@intel.com>
Add support for linking multi-segment buffers together to
handle Jumbo packets. The vmxnet3 API supports having header
and body buffer types. What this patch does is fill the primary
ring completely with header buffers and the secondary ring
with body buffers. This allows for non-jumbo frames to only
use one mbuf (from primary ring); and jumbo frames will have
first mbuf from primary ring and following mbufs from other
ring.
This could be optimized in future if the DPDK had API
to supply different sized mbufs (two pools) into driver.
Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Release note addition:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This commit adds vmxnet3 TSO support.
Verified with test-pmd (set fwd csum) that both tso and
non-tso pkts can be successfully transmitted and all
segmentes for a tso pkt are correct on the receiver side.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Tx data ring support was removed in a previous change that
added multi-seg transmit. This change adds it back.
According to the original commit (2e849373), 64B pkt
rate with l2fwd improved by ~20% on an Ivy Bridge
server at which point we start to hit some bottleneck
on the rx side.
I also re-did the same test on a different setup (Haswell
processor, ~2.3GHz clock rate) on top of the master
and still observed ~17% performance gains.
Fixes: 7ba5de417e ("vmxnet3: support multi-segment transmit")
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include rte_mbuf.h file directly/indirectly hence no
additioanl header file inclusion is necessary.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses in return a function like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
This is one of those trivial things git and other tools complain
about.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Use new function rte_eth_copy_pci_info.
Copy device info for the following pdevs:
bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The extended unified packet type is now part of the standard ABI.
As mbuf struct is changed, the mbuf library version is incremented.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>