Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include rte_mbuf.h file directly/indirectly hence no
additional header file inclusion is necessary.
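A minimal sketch of what such a macro-to-inline conversion can look like.
The struct, the field names and the headroom value below are simplified,
hypothetical stand-ins for illustration, not the actual rte_mbuf.h
definitions:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf; only the fields used here. */
struct toy_mbuf {
	uint64_t buf_physaddr; /* physical address of the buffer start */
	uint16_t data_off;     /* offset of the packet data in the buffer */
};

/* Assumed default headroom; the real value is a build-time setting. */
#define TOY_PKTMBUF_HEADROOM 128

/* DMA address of the current data position in the buffer. */
static inline uint64_t
toy_mbuf_data_dma_addr(const struct toy_mbuf *mb)
{
	return mb->buf_physaddr + mb->data_off;
}

/* DMA address of the data position when the default headroom applies. */
static inline uint64_t
toy_mbuf_data_dma_addr_default(const struct toy_mbuf *mb)
{
	return mb->buf_physaddr + TOY_PKTMBUF_HEADROOM;
}
```

Compared to a macro, an inline function gives type checking on the
argument and evaluates it exactly once.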
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
ConnectX-4 NICs can handle at most 512 entries in RETA table.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The physically linked-together combined library has been an increasing
source of problems, as was predicted when library and symbol versioning
was introduced. Replace the complex and fragile construction with a
simple linker script which achieves the same without all the problems,
remove the related kludges from e.g. mlx drivers.
Since creating the linker script is practically zero cost, remove the
config option and just create it always.
Based on a patch by Sergio Gonzalez Monroy, linker script approach
initially suggested by Neil Horman.
Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move all os / arch specifics to eal.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
According to the api, rte_eal_pci_map_device is only successful when
returning 0.
Fixes: 6ba1f63b5a ("virtio: support specification 1.0")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Fixes: c52afa68d7 ("virtio: move left PCI stuff in the right file")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The file rte_config.h is automatically generated and included.
No need to #include it.
The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
As discussed on list, switch numbering scheme to be based on year/month.
Release 2.3 then becomes 16.04.
Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html
Also, add zero padding to the month so that it appears as 16.04 and
not 16.4 in "make showversion" and rte_version().
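The zero padding can be sketched as follows. The helper name is
hypothetical and this is not the actual rte_version() code, only an
illustration of the "%02u" format:

```c
#include <stdio.h>

/* Format a year/month version; the month is zero padded to two digits. */
static int
toy_format_version(char *buf, size_t len, unsigned int year,
		   unsigned int month)
{
	return snprintf(buf, len, "%u.%02u", year, month);
}
```

With year 16 and month 4 this yields "16.04" rather than "16.4".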
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses around a returned function call like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Modern (v1.0) virtio pci device defines several pci capabilities.
Each cap has a configure structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.
Firstly, we map the pci resources by rte_eal_pci_map_device().
We could then easily locate a cfg structure by:
    cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;
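A sketch of that lookup with basic bounds checks added. All type and
function names here are hypothetical, and the "length" field is an
assumption used only to show the range check:

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_BARS 6

/* Hypothetical mapped BAR: base address and size of the mapping. */
struct toy_mem_resource {
	void *addr;
	uint64_t len;
};

/* Hypothetical capability: which BAR, and where in it the cfg lives. */
struct toy_pci_cap {
	uint8_t bar;
	uint32_t offset;
	uint32_t length;
};

/* Return the address of the cfg structure, or NULL if it cannot be valid. */
static void *
toy_get_cfg_addr(const struct toy_mem_resource *res,
		 const struct toy_pci_cap *cap)
{
	const struct toy_mem_resource *r;

	if (cap->bar >= TOY_MAX_BARS)
		return NULL;
	r = &res[cap->bar];
	if (r->addr == NULL)
		return NULL; /* BAR was not mapped */
	if ((uint64_t)cap->offset + cap->length > r->len)
		return NULL; /* structure would run past the mapping */
	return (uint8_t *)r->addr + cap->offset;
}
```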
Therefore, the entry point for enabling modern (v1.0) pci device support
is to iterate over the pci capability list and locate the configs
we care about. They are:
- common cfg
  For generic virtio and virtqueue configuration, such as setting/getting
  features, enabling a specific queue, and so on.
- notify cfg
  Combined with `queue_notify_off' from common cfg, we could use it to
  notify a specific virt queue.
- device cfg
  Where virtio_net_config structure is located.
- isr cfg
  Where to read isr (interrupt status).
If any of the above caps is not found, we fall back to the legacy virtio
handling.
On success, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing one (or a few) specific
configuration fields from the above 4 cfg structures. And that's basically
how this patch works.
Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.
Last, set the VIRTIO_F_VERSION_1 feature flag.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
no longer hardcode hdr_size to "sizeof(struct virtio_net_hdr)"
at virtio_recv_pkts(); otherwise, there would be a mismatch of
hdr size between rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading to packet corruption.
Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Switch to 64 bit features, which virtio 1.0 supports.
Legacy virtio only supports 32 bit features, so it complains loudly
and quits when trying to set features beyond 32 bits.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementations of a
specific pci action, such as reading the host status.
With that, this patch reimplements all exported pci functions, in
a way like:
    vtpci_foo_bar(struct virtio_hw *hw)
    {
            hw->vtpci_ops->foo_bar(hw);
    }
So that we only need to pay attention to those pci related functions
while adding virtio 1.0 support.
This patch introduces a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: it just sets
hw->vtpci_ops to legacy_ops as we don't support 1.0 yet.
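The dispatch pattern described above can be sketched as below. Every
name is a hypothetical stand-in, and the "status" field merely emulates
a device register; this is not the actual virtio_pci.c code:

```c
#include <stdint.h>

struct toy_hw;

/* Table of pci actions; legacy and modern devices would each provide one. */
struct toy_pci_ops {
	uint8_t (*get_status)(struct toy_hw *hw);
	void (*set_status)(struct toy_hw *hw, uint8_t status);
};

struct toy_hw {
	const struct toy_pci_ops *ops;
	uint8_t status; /* stands in for a device status register */
};

static uint8_t
toy_legacy_get_status(struct toy_hw *hw)
{
	return hw->status;
}

static void
toy_legacy_set_status(struct toy_hw *hw, uint8_t status)
{
	hw->status = status;
}

static const struct toy_pci_ops toy_legacy_ops = {
	.get_status = toy_legacy_get_status,
	.set_status = toy_legacy_set_status,
};

/* Select the ops table; only the legacy one exists in this sketch. */
static void
toy_pci_init(struct toy_hw *hw)
{
	hw->ops = &toy_legacy_ops;
}

/* Exported wrappers: callers never touch the ops table directly. */
static uint8_t
toy_pci_get_status(struct toy_hw *hw)
{
	return hw->ops->get_status(hw);
}

static void
toy_pci_set_status(struct toy_hw *hw, uint8_t status)
{
	hw->ops->set_status(hw, status);
}
```

Adding a second device flavour then only means providing another ops
table and choosing it in the init function.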
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
offset arg of vtpci_read/write_dev_config is derived from offsetof(),
which is of size_t type, instead of uint64_t. So, define it as size_t
type.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
As we have already set it up at virtio_dev_queue_setup(), and a vq
restart will not reset the settings.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The x550 MDIO clock speed must be configured prior to first MDIO read or
write. The default MDIO clock speed is not valid, therefore the driver
is configuring a valid speed prior to reading the copper PHY device id.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch removes KR PHY reset from ixgbe_init_phy_ops_X550em, since
this function is meant to initialize function pointers for the detected
PHY type. Internal PHY reset was moved to ixgbe_setup_internal_phy_t_x550em,
which will now detect which mode the internal PHY works in and set it up
as required.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
KR auto-neg mode is what we will be using going forward. The SW
interface for this mode is different than what was used for iXFI.
While debugging, it was determined that the ucode diagnostic was
no longer needed. This code has been removed to simplify the init
flow.
A subtle semaphore error in the CS4227 reset flow was fixed.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Avoid a needless PHY access on copper phys to save the 10ms wait
time for each PHY access. A helper function is introduced to
actually do the register access and process the contents.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch changes the code to use register offsets stored in the mvals
table instead of values defined statically.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch adds ixgbe_set_fdir_drop_queue_82599 for enabling and
setting the flow director drop queue, and sets drop no match in
ixgbe_init_fdir_perfect_82599 for x550.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Waiting for FDIRCMD completion is an expensive thing to do in the
transmit hot path. This wait was added to catch problems with perfect
filter rules, and, at least in the Linux driver, there is no error
check anyway, so there is no point to adding the delay. So do not wait
for completion. Change the return of the function to void, since it has
no meaningful return value.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch adds the flow control ethertype to the defines for the
ETQF filter list. This only adds the define. Each driver
can add this ethertype to the filter. This is needed to prevent
denial of service by malicious VFs sending out flow control
packets.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Currently credit_refill and credit_max could be zero for a TC, which
causes a Tx hang in CEE mode configuration. To fix that, assign at least
the minimum credit to each TC, as IEEE mode already does.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Coverity reported the following issue:
CID 119268 (#1 of 1): Unintended sign extension
(SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension:
vsi_id with type unsigned short (16 bits, unsigned) is promoted in
vsi_id << 23 to type int (32 bits, signed), then sign-extended to type
unsigned long (64 bits, unsigned). If vsi_id << 23 is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.
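The usual remedy for this class of report is to widen the operand before
shifting. The sketch below uses hypothetical names and is not the actual
i40e change. The second helper only shows what converting a negative
32-bit value to a 64-bit unsigned type does:

```c
#include <stdint.h>

#define TOY_VSI_SHIFT 23

/* Widen first, so the shift is done in 64-bit unsigned arithmetic. */
static inline uint64_t
toy_vsi_field(uint16_t vsi_id)
{
	return (uint64_t)vsi_id << TOY_VSI_SHIFT;
}

/* Signed to unsigned widening: a negative value sets all upper bits. */
static inline uint64_t
toy_widen_signed(int32_t v)
{
	return (uint64_t)v;
}
```

For vsi_id = 0x100 the widened shift gives 0x80000000, while a 32-bit
signed value with that bit pattern would widen to 0xFFFFFFFF80000000.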
Fixes: 88ebc2b7f9 ("i40e: extend flow director to support VF")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
In the FreeBSD driver, the max frame size is changed to the MTU instead
of keeping the default value defined in the datasheet. When DPDK runs on
that NIC, the configured value is not the expected one.
This patch sets the max frame size to the default during initialization.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This counter was left unmodified. Restore it in ixgbe_dev_stats_get.
The ierrors counter still includes imissed for ixgbe. This behaviour is
not consistent amongst all drivers. Another patch may be needed to unify
the meaning of the ierrors counter.
Fixes: 5e50ad1c1b ("ixgbe: add specific stats")
Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Fragmented IPv4 packets have no TCP/UDP headers, so we hashed
random data, introducing reordering of the fragments.
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
The following messages might appear after some idle time:
"PMD: Failed to allocate LACP packet from pool"
The fix ensures the mempool size is greater than the sum
of TX descriptors.
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Add BNX2X PMD version, print it as part of adapter info.
Adjusted print adapter info output formatting.
This patch versions BNX2X PMD at 1.0.0.
Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
The periodic debug option is used to collect periodic
events like statistics, register access etc and won't
interfere with user-level messages.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
Fix for the following clang build error:
drivers/net/bnx2x/elink.c:10384:41: error: shifting a
negative signed value is undefined [-Werror,-Wshift-negative-value]
vars->eee_status &= ~SHMEM_EEE_1G_ADV <<
~~~~~~~~~~~~~~~~~ ^
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
SR-IOV is supported using bnx2x poll mode driver running as VF driver and
native linux driver running as PF (in host/hypervisor). There is no issue
while running with the PF driver which is at the base version as that of
PMD. However, there is a compatibility issue between newer out-of-box PF
drivers with older VF driver. So the newer VFs would also need to send
BNX2X_VF_TLV_PHYS_PORT_ID (among other TLVs) to differentiate between
newer and older VFs.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
If you stop and start the driver, the rx queue will have the previous
index values when programming the adapter. Therefore, we should always
reset the queue indices when the rx ring is setup. Note: We need to
clear (write) the status block's completion queue index since it is
possibly in a read cache.
Tidy some init code to make it clearer what the defaults are.
Signed-off-by: Chas Williams <3chas3@gmail.com>
When refilling freelists for the first time and if it fails, the rxq
is freed and returns ENOMEM. There is a check while freeing hardware rxq
to pass freelist context id if the freelist exists or 0xffff otherwise.
The error path is only reached if freelist exists. So, fix is to remove
the useless check for freelist existence.
Coverity issue: 107108
Fixes: 92c8a63223 ("cxgbe: add device configuration and Rx support")
Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
It corrects the input set values to be set for vlan or
double vlan.
Fixes: 98f0557076 ("i40e: configure input fields for RSS or flow director")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
We should provide VIRTIO_PCI_QUEUE_SEL with vq->vq_queue_idx,
but not vq->queue_id.
vq->queue_id is the queue id from rte_eth_rx/tx_queue_setup(),
which always starts from 0 no matter which queue it is. However,
for virtio, even number is for RX queue, and odd number is for
TX queue.
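A sketch of the mapping implied by the even/odd rule, assuming RX and TX
queues are interleaved pairwise. The names are hypothetical and any
control queue is ignored:

```c
#include <stdint.h>

enum toy_queue_type {
	TOY_QUEUE_RX = 0,
	TOY_QUEUE_TX = 1,
};

/* Map an ethdev queue id (0, 1, 2, ...) to a virtqueue index. */
static inline uint16_t
toy_vq_queue_idx(uint16_t queue_id, enum toy_queue_type type)
{
	return (uint16_t)(2 * queue_id + type);
}
```

RX queue 1 and TX queue 1 share queue_id 1 but map to virtqueue
indices 2 and 3 respectively.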
Fixes: 5382b188fb ("virtio: add queue release")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Check sc->state to prevent double init.
If the link is up, then the driver cannot be stopped and started
successfully. Instead of checking the link status, use the driver's
state.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Similar to commit 5f9115e58c, but
for qat and mpipe drivers. The former did not exist when the
previous patch was sent and the latter I just missed.
Fixes: 5f9115e58c ("mk: fix shared library dependencies of drivers")
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
The standard for DPDK is to use memset() not bzero which
is a leftover BSD-ism.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This is one of those trivial things git and other tools complain
about.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The rx_mbuf_alloc_failed counter was only cleared by virtio driver.
Now it is cleared by common rte_eth_stats_reset function for all
drivers at once.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Call rte_eth_copy_pci_info() after the RTE_PCI_DRV_INTR_LSC flag
has been initialised.
Fixes: eeefe73f0a ("drivers: copy PCI device info to ethdev data")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
The space check for transmit ring only needs a single conditional.
I.e. we only need to recheck for space if there was no space in the
first check. This can help performance and simplifies the loop.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The virtio driver was not initializing all the fields in
the receive mbuf. This would cause bugs where previous usage
of mbuf would leave stale TCI and offload flags.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The original was always setting unicast. While here, clean up some
other references that also point into the Ethernet header.
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
The upcoming Hyper-V driver converts the pci_drv element
in struct eth_driver to a union. When vmbus is added the
pci_drv needs to be explicit. Easier to fix the issue
ahead of time.
This is backwards compatible with previous code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
When dev_stop is called in i40evf pmd driver, queues are switched off
to stop receiving and transmitting. But the mac address of this VF
still exists in VEB switch.
To stop the traffic at the VSI level, the mac address needs to be removed
too. Then the bandwidth for this SRIOV VSI can be freed.
This patch fixes this issue.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
This patch fixes tx byte statistics when transmitting packets
with link down.
Previously, the counter would decrement 4 bytes for each packet that
was transmitted with link down, causing the uint64 to wrap around.
Fixes: c03fcee9ab ("ixgbe: remove CRC size from byte counters")
Reported-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Similar to commit 113c8e13c4, but
for bnx2x, pcap, sze2data and xenvirt PMDs.
Requiring applications to know about library internal details like
dependencies to external helper libraries is a limitation of
static linkage, shared libraries should always know their own
dependencies for sane operation. This is especially highlighted
with dlopen()'ed items: having applications link against plugin
internal dependencies borders on the absurd.
Note that linking with a shared combined library still requires
knowing the internal dependencies.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
There is a new function in the EAL API for internal use.
It has neither a proper prefix nor a .map export:
libethdev.so: undefined reference to `is_xen_dom0_supported'
Fixes: 719dbebceb ("xen: allow determining DOM0 at runtime")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Parallel build of bonding driver can fail because of
missing dependency.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Declan Doherty <declan.doherty@intel.com>
There is a compilation issue with some compilers.
On i686 platforms, long is 32 bits, so XXX_CYCLECOUNTER_MASK
needs to be defined as 'ULL'.
Fixes: 9c857bf6be ("igb: support ieee1588 functions for device time")
Fixes: 1c4445e1f2 ("ixgbe: support ieee1588 functions for device time")
Fixes: f3a4e40eca ("i40e: support ieee1588 functions for device time")
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
After introducing the vPMD feature, the fm10k driver selects the best
Rx/Tx functions at run time. The original implementation selects them
silently, without notification.
This patch adds debug info to notify the user which Rx/Tx
functions are actually used.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Steps to reproduce the bug:
1. Disable all Rx offloading and start the device; Vector Rx is
   used.
2. Stop the device. Re-configure with hw_ip_checksum = 1 and
   start the device again.
3. In this case, regular Rx should be used since Vector Rx
   doesn't support ip checksum offload. But Vector Rx is actually
   used, so the checksum is not done by hardware.
The reason is that the driver misses an "else" in
fm10k_set_rx_function(), so after re-configuring, the Rx function
from the last round is still used.
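The effect of the missing branch can be sketched as follows. All names
are hypothetical and the selection condition is reduced to a single
flag; this is not the actual fm10k code:

```c
#include <stdbool.h>

typedef int (*toy_rx_fn)(void);

static int toy_rx_scalar(void) { return 0; }
static int toy_rx_vector(void) { return 1; }

struct toy_dev {
	bool hw_ip_checksum; /* offload that vector Rx cannot provide */
	toy_rx_fn rx_burst;  /* currently selected Rx function */
};

/* Re-evaluate the choice on every (re)configuration. */
static void
toy_set_rx_function(struct toy_dev *dev)
{
	if (!dev->hw_ip_checksum)
		dev->rx_burst = toy_rx_vector;
	else
		/* Without this branch, the previous choice would survive. */
		dev->rx_burst = toy_rx_scalar;
}
```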
Fixes: 77a8ab47eb ("fm10k: select best Rx function")
Reported-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
It fixes the issue of not configuring rss key enable flags
when there is no rss key available.
Fixes: 647d1eaf75 ("i40evf: support AQ based RSS config")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
If DPDK PF + DPDK VF is used on igb NICs, the MAC address of the VF
port is always 0, because we forget to give it a value if the MAC
address is not assigned by the PF. This patch assigns a random MAC
address to the igb VF port, as ixgbe does.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Without a .map file, a compile error occurs when
CONFIG_RTE_BUILD_SHARED_LIB is enabled.
Reported-by: Guo Xin <gxin@ezchip.com>
Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Secondary processes are expected to use queues and other resources
allocated by the primary, however Verbs resources can only be shared
between processes when inherited through fork().
This limitation can be worked around for TX by configuring separate queues
from secondary processes.
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The number of available entries in TX rings is taken before performing
completion, effectively making rings smaller than they are and causing
TX performance issues under load.
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The number of available entries in TX rings is taken before performing
completion, effectively making rings smaller than they are and causing
TX performance issues under load.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
When MP to MR cache is full, the last (newest) MR is freed instead of the
first (oldest) one, causing local protection errors during TX.
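A sketch of oldest-first eviction in a small fixed-size cache. The types,
names and cache size are hypothetical, and a plain integer stands in for
the memory region handle:

```c
#include <string.h>

#define TOY_MR_CACHE_N 4

struct toy_mr_entry {
	const void *mp; /* mempool this entry belongs to */
	int mr;         /* stand-in for the memory region handle */
};

struct toy_mr_cache {
	struct toy_mr_entry e[TOY_MR_CACHE_N]; /* e[0] is the oldest */
	unsigned int n;                        /* entries in use */
};

/* Add an entry; returns the evicted mr, or -1 if nothing was evicted. */
static int
toy_mr_cache_add(struct toy_mr_cache *c, const void *mp, int mr)
{
	int evicted = -1;

	if (c->n == TOY_MR_CACHE_N) {
		/* Full: drop the first (oldest) entry, not the last one. */
		evicted = c->e[0].mr;
		memmove(&c->e[0], &c->e[1],
			(TOY_MR_CACHE_N - 1) * sizeof(c->e[0]));
		c->n--;
	}
	c->e[c->n].mp = mp;
	c->e[c->n].mr = mr;
	c->n++;
	return evicted;
}
```

Evicting the newest entry instead would throw away the region most
likely to be needed by the packets currently in flight.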
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
When MP to MR cache is full, the last (newest) MR is freed instead of the
first (oldest) one, causing local protection errors during TX.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Pre-registering mbuf memory pools when creating TX queues avoids costly
registrations later in the data path.
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Pre-registering mbuf memory pools when creating TX queues avoids costly
registrations later in the data path.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Buffers with too many segments are linearized to overcome
MLX5_PMD_SGE_WR_N, unfortunately the last segment is never sent.
Fixes: 3ee8444608 ("mlx5: support scattered Rx and Tx")
Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Buffers with too many segments are linearized to overcome
MLX4_PMD_SGE_WR_N, unfortunately the last segment is never sent.
Fixes: be11b35817 ("mlx4: move scattered Tx processing to helper function")
Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Indirect mbuf data may come from a different mempool which must be
registered separately as another memory region, otherwise such mbufs cannot
be sent.
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Indirect mbuf data may come from a different mempool which must be
registered separately as another memory region, otherwise such mbufs cannot
be sent.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A typo causes TX stats indices to be retrieved from RX queues.
Fixes: 87011737b7 ("mlx5: add software counters")
Reported-by: Nicolas Harnois <nicolas.harnois@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A typo causes TX stats indices to be retrieved from RX queues.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Reported-by: Nicolas Harnois <nicolas.harnois@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
CXGBE PMD depends on an alarm to periodically transmit any pending
coalesced packets and hence spins for a lock for each tx queue in the
alarm callback.
A better solution is to try to get a lock whenever possible, instead
of spinning for it.
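The idea can be sketched with a C11 atomic_flag standing in for the
driver's lock. All names are hypothetical; in DPDK itself,
rte_spinlock_trylock() offers the same non-blocking attempt:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_txq {
	atomic_flag lock;
	unsigned int pending; /* coalesced packets waiting to be sent */
	unsigned int flushed; /* packets sent by the alarm callback */
};

/* Single attempt to take the lock; never spins. */
static bool
toy_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void
toy_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

/* Alarm callback body for one queue: skip it if the lock is busy. */
static bool
toy_alarm_flush(struct toy_txq *q)
{
	if (!toy_trylock(&q->lock))
		return false; /* data path owns the queue; retry next alarm */
	q->flushed += q->pending;
	q->pending = 0;
	toy_unlock(&q->lock);
	return true;
}
```

A skipped queue is simply picked up on the next alarm, so the callback
never stalls behind a busy transmit path.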
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Most dpdk examples and apps seem to initialize with a minimum of 128 rx
descriptors and 512 tx descriptors for queue setup. However, CXGBE PMD
enforces a minimum of 1024 descriptors. This causes the dpdk stack
to return -EINVAL because the apps do not request the minimum required.
The fix is to relax this enforcing in CXGBE PMD. A minimum of 128
descriptors seems to be a good number with the least impact on the
performance.
Fixes: 946c9ed956 ("cxgbe: get descriptor limits")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
The vector number may change during 'dev_start'.
Before enabling a new vector mapping,
it's necessary to disable/unmap the previous setting.
Fixes: 7ab8500037 ("ixgbe: fix VF start with PF stopped")
Reported-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The vector number may change during 'dev_start'.
Before enabling a new vector mapping,
it's necessary to disable/unmap the previous setting.
Fixes: fe685de2b1 ("igb: fix VF start with PF stopped")
Reported-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The fm10k driver fails to compile on non-x86 platforms due to
SSE instructions. The original implementation had no switch to
turn off vPMD.
This patch introduces a macro to turn vPMD functions on/off;
it is on by default. On non-x86 platforms, it can simply be turned
off to fix the compile issue.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
When the fm10k port is closed, both tx_queue_clean() and
fm10k_tx_queue_release_mbufs_vec() try to release buffers in the
SW ring. The latter does no sanity check on those pointers,
which causes a crash.
The fix removes the Vector TX buffer release function since it can
share the release functions with regular TX.
Fixes: fb9066e479 ("fm10k: reset and release mbuf for vector Tx")
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
vlan_tci should be copied from RX descriptor to mbuf,
and flag PKT_RX_VLAN_PKT should be set for every RX packet,
this patch fixes this issue.
Packets in fm10k device always carry at least one VLAN tag.
For those packets coming in without VLAN tag,
the port default VLAN tag will be used.
So in fm10k, the PKT_RX_VLAN_PKT flag is always set and vlan_tci
is valid for each RX packet's mbuf.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The i40e VF driver should use I40EVF_WRITE_FLUSH to flush
configuration, not I40E_WRITE_FLUSH. This patch fixes this issue.
Fixes: be6c228d4d ("i40evf: support Rx interrupt")
Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
If DCB is not enabled, the BW info is not stored for VSI. This
patch fixes this issue by merging functions i40e_vsi_dump_bw_config
and i40e_vsi_get_bw_info together.
Fixes: c8b9a3e3fe ("i40e: support DCB mode")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
When the NVM version is updated to 5.x, DCB cannot be configured,
because the FW version validation is not correct.
This patch fixes this issue.
Fixes: c8b9a3e3fe ("i40e: support DCB mode")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yulong Pei <yulong.pei@intel.com>
As firmware does not support any link control from software driver
side, any phy config should be ignored as a workaround. Otherwise
the link might not be up again after binding back to kernel driver.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
It fixes the issue of not re-configuring hash enable flags (HENA)
if there is no key.
Fixes: d0a349409b ("i40e: support AQ based RSS config")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch removes the CRC bytes from byte counter statistics.
Doing so fixes a bug where CRC bytes were included on TX but not
on RX, causing a mismatch between bytes received and sent.
Fixes: 9aace75fc8 ("i40e: fix statistics")
Reported-by: Weichun Chen <weichunx.chen@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch removes the crc bytes from byte counter statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch removes the crc bytes from byte counter statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This fixes a bug added to em and igb drivers which causes the pci info
seen by the primary process to become invalidated by secondary process
startup.
This call was added after the process type check in the other drivers.
Fixes: eeefe73f0a ("drivers: copy PCI device info to ethdev data")
Signed-off-by: Jon DeVree <nuxi@vault24.org>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
When freeing the device it is also necessary to free
rx_queues and tx_queues.
Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Add virtual PMD which communicates with COMBO cards through sze2
layer using libsze2 library.
Since link_speed is uint16_t, it cannot hold the value for 100G
speed; therefore link_speed is set to ETH_LINK_SPEED_10G until the
type of link_speed is resolved.
Signed-off-by: Matej Vido <matejvido@gmail.com>
This commit introduces rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb(), in
order to enable memory barriers between lcores.
The patch does not introduce any functional change for IA; the goal is to
have infrastructure for weakly ordered machines like ARM to work with DPDK.
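A sketch of why such barriers matter between cores. C11 fences stand in
for the write and read barriers here; how they map to rte_smp_wmb() and
rte_smp_rmb() is an assumption made for illustration only, and all names
are hypothetical:

```c
#include <stdatomic.h>
#include <stdint.h>

struct toy_mailbox {
	uint32_t payload;  /* data handed from one core to another */
	atomic_uint ready; /* set once payload is valid */
};

/* Producer: the payload must be visible before the ready flag. */
static void
toy_publish(struct toy_mailbox *m, uint32_t value)
{
	m->payload = value;
	atomic_thread_fence(memory_order_release); /* write barrier role */
	atomic_store_explicit(&m->ready, 1, memory_order_relaxed);
}

/* Consumer: returns 1 and fills *out once the payload was published. */
static int
toy_consume(struct toy_mailbox *m, uint32_t *out)
{
	if (atomic_load_explicit(&m->ready, memory_order_relaxed) == 0)
		return 0;
	atomic_thread_fence(memory_order_acquire); /* read barrier role */
	*out = m->payload;
	return 1;
}
```

On a strongly ordered CPU these fences mostly constrain the compiler,
while a weakly ordered machine also needs real barrier instructions.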
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add additional functions to support the existing IEEE1588
functionality and to enable getting, setting and adjusting
the device time.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
Add additional functions to support the existing IEEE1588
functionality and to enable getting, setting and adjusting
the device time.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
Add additional functions to support the existing IEEE1588
functionality and to enable getting, setting and adjusting
the device time.
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
Use common code to handle Xen support at runtime.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
These statistics are marked deprecated, leave them alone.
Also keep track of imissed across all ports.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch refactors the queue and priority statistic handling.
Generic queue stats are presented by rte_eth_xstats_get(), and the
i40e_xstats_get() exposes only the extra stats.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
This patch refactors the queue handling. Generic queue stats are
handled by rte_eth_xstats_get() and the ixgbe_xstats_get() exposes
only the extra stats.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
It fixes the issue of trying to allocate more VSIs for VMDq than
the hardware has remaining. It adds a check of the remaining hardware
resources before allocating VSIs for VMDq.
Fixes: c80707a0fd ("i40e: fix VMDq pool limit")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
ICC complains about enumerated types being mixed in link bonding driver,
as ETH_MQ_RX_RSS is an enum type of mq_mode and not a bitmask as it was
being treated.
Fixes: 734ce47f71 ("bonding: support RSS dynamic configuration")
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
One of the ways to reproduce the issue:
testpmd <EAL-OPTIONS> -- -i --txqflags=0
testpmd> set fwd txonly
testpmd> set txpkts 64,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4
testpmd> set txsplit rand
testpmd> start
After some time TX on ixgbe queue will hang,
and all packet transmission on that queue will stop.
This bug was first reported and investigated by
Vlad Zolotarov <vladz@cloudius-systems.com>:
"We can reproduce this issue when stressed the xmit path with a lot of highly
fragmented TCP frames (packets with up to 33 fragments with non-headers
fragments as small as 4 bytes) with all offload features enabled."
The root cause is that ixgbe_xmit_pkts() in some cases violates the HW rule
that the distance between TDs with RS bit set should not exceed 40 TDs.
From the latest 82599 spec update:
"When WTHRESH is set to zero, the software device driver should set the RS bit
in the Tx descriptors with the EOP bit set and at least once in the 40
descriptors."
The fix is to make sure that the distance between TDs with RS bit set
would never exceed HW limit.
As part of that fix, tx_rs_thresh for ixgbe PMD is not allowed to be greater
than 32 to comply with HW restrictions.
With that fix slight slowdown for the full-featured ixgbe TX path
might be observed (from our testing - up to 4%).
ixgbe simple TX path is unaffected by that patch.
Reported-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
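The rule above can be modelled with a small sketch. It assumes a simplified
view in which the RS bit can only be placed on the last descriptor of a
packet, and it closes the gap on the previous packet whenever adding the next
one would reach the threshold. The function max_rs_gap and its parameters are
hypothetical and do not reproduce ixgbe_xmit_pkts().

```c
#include <stddef.h>

/*
 * Returns the largest number of descriptors between two RS marks for a
 * sequence of packets, where pkt_descs[i] is the descriptor count of packet i.
 */
unsigned max_rs_gap(const unsigned *pkt_descs, size_t n_pkts, unsigned rs_thresh)
{
	unsigned used = 0;    /* descriptors queued since the last RS mark */
	unsigned max_gap = 0;

	for (size_t i = 0; i < n_pkts; i++) {
		unsigned n = pkt_descs[i];

		/* Adding this packet would reach the threshold:
		 * mark RS on the previous packet's last descriptor. */
		if (used > 0 && used + n >= rs_thresh) {
			if (used > max_gap)
				max_gap = used;
			used = 0;
		}
		used += n;
		/* The packet alone reaches the threshold:
		 * mark RS on its own last descriptor. */
		if (used >= rs_thresh) {
			if (used > max_gap)
				max_gap = used;
			used = 0;
		}
	}
	return max_gap;
}
```

In this model, a 31-descriptor packet followed by a 33-descriptor packet
yields gaps of 31 and 33 with a threshold of 32, whereas marking RS only
after the second packet would leave 64 descriptors between RS marks.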
This patch fixes a bug where only some of the statistics were being
reset when calling rte_eth_stats_reset() or rte_eth_xstats_reset().
As both the stats reset and xstats reset do the same, refactor away
the duplicated function.
This patch marks the VSI to update its offset, causing the stats to
look like they are reset.
Fixes: 9aace75fc8 ("i40e: fix statistics")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
This fixes the issue of not freeing memzone in a call to free the
memory for adminq DMA.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
It fixes the issue of not being able to use more than 1 pool for VMDq,
according to the queues left.
Fixes: 705b57f820 ("i40e: enlarge the number of supported queues")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Fix for the following error on icc 2015, due to incorrect type:
drivers/net/i40e/i40e_fdir.c(1376): error #188:
enumerated type mixed with another type
case RTE_ETH_FDIR_FILTER_INPUT_SET_SELECT:
Fixes: 98f05570 ("i40e: configure input fields for RSS or flow director")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
It fixes compile issue on ICC 13.0.0.
Error logs:
i40e_ethdev.c(7943): error #188: enumerated type mixed with another type
Fixes: c8b9a3e3fe ("i40e: support DCB mode")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Once posted through mailbox, we must check for nack from the PF
and report an error in this case.
Signed-off-by: Julien Meunier <julien.meunier@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Report an error when something went wrong.
Signed-off-by: Julien Meunier <julien.meunier@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sujith Sankar <ssujith@cisco.com>
When creating a bond device without adding any slaves and then
quitting testpmd, the application crashed since rx/tx queues
are NULL.
Add checking of these parameters before trying to free.
Signed-off-by: Raslsn Darawsheh <rdarawsheh@asaltech.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Add support for RTE_ETH_FDIR_PASSTHRU flow director behavior so
output queue is assigned by other filters.
Signed-off-by: Andrey Chilikin <andrey.chilikin@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch removes l3_l4_xsum_errors from rx errors.
The reason to remove it is that UDP packets have an optional checksum, and
when not calculated the checksum field should be set to zero. When the
checksum is not calculated (zero-ed out), the hardware still counts a valid
UDP packet as an l3_l4_xsum_error.
This hardware issue is documented in 82599 errata, titled:
"Integrity Error Reported for IPv4/UDP Packets with Zero Checksum"
The solution is to remove l3_l4_xsum_errors from rx_errors, as discussed on
http://thread.gmane.org/gmane.comp.networking.dpdk.devel/25590/
Fixes: f6bf669b99 ("ixgbe: account more Rx errors")
Suggested-by: Martin Weiser <martin.weiser@allegro-packets.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Following the same approach taken with dev_started field
in rte_eth_dev_data structure, this patch adds two new fields
in it, rx_queue_state and tx_queue_state arrays, which track
which queues have been started and which not.
This is important to avoid trying to start/stop a queue twice,
which would result in undefined behaviour
(which may cause RX/TX disruption).
Mind that only the PMDs which have queue_start/stop functions
have been changed to update this field, as the functions will
check the queue state before switching it.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
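A rough sketch of the state check described above, using a reduced,
hypothetical stand-in for the device data. The rx_queue_state field name
follows the text; struct dev_data, MAX_QUEUES, hw_starts and the function
are assumptions for illustration only.

```c
#include <stdint.h>

#define MAX_QUEUES 16 /* hypothetical per-port limit */

enum queue_state { QUEUE_STOPPED = 0, QUEUE_STARTED = 1 };

struct dev_data {
	uint8_t rx_queue_state[MAX_QUEUES]; /* one entry per RX queue */
	unsigned hw_starts;                 /* counts real hardware start operations */
};

/* Start an RX queue only if it is not already running.
 * Returns 0 on success, -1 on an invalid queue id. */
int rx_queue_start(struct dev_data *d, unsigned qid)
{
	if (qid >= MAX_QUEUES)
		return -1;
	if (d->rx_queue_state[qid] == QUEUE_STARTED)
		return 0; /* already started: nothing to do */

	d->hw_starts++; /* placeholder for the driver's real start routine */
	d->rx_queue_state[qid] = QUEUE_STARTED;
	return 0;
}
```

A second start request on the same queue returns early and never reaches the
hardware.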
The patch enables rx interrupt support on i40e VF and makes some necessary
changes to PF IOV mode to support VF.
On PF side, running in IOV mode via uio won't allow rx interrupt
which is exclusive with mbox interrupt in single vector competition.
On VF side, one single vector is shared for all the rx queues.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch enables rx interrupt support on i40e PF non-IOV mode.
Per queue rx interrupt works on vfio, however on uio, all rx queues
share one interrupt vector.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
As ixgbe vf doesn't support lsc, the patch removes that unused code.
In addition, it does some tiny cleanup.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
When ixgbe runs as a PF, mbox interrupt is prerequisite to make VF
start normally.
And PF sometimes won't 'dev_start', so the mbox interrupt register
during 'dev_init' is required.
The patch rolls back the interrupt register for mbox,lsc to the 'dev_init'.
As UIO doesn't support multiple vectors, mbox has to occupy the only one.
It adds condition check on 'dev_start', rxq interrupt is not allowed
when PF running in IOV mode via UIO.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
When igb runs as a PF, mbox interrupt is prerequisite to make VF
start normally.
And PF sometimes won't 'dev_start', so the mbox interrupt register
during 'dev_init' is required.
The patch rolls back the interrupt register for mbox,lsc to the 'dev_init'.
As UIO doesn't support multiple vectors, mbox has to occupy the only one.
It adds condition check on 'dev_start', rxq interrupt is not allowed
when PF running in IOV mode via UIO.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
According to the VFIO interrupt mapping, the interrupt vector id
for rxq starts from RX_VEC_START.
It doesn't impact the UIO cases.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
According to the VFIO interrupt mapping, the interrupt vector id
for rxq starts from RX_VEC_START.
It doesn't impact the UIO cases.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The old statistics on i40e only counted the packets on ports.
So the discarded packets on VSI were not counted.
This patch collects statistics for packets on both ports and VSI.
Also update release notes.
Signed-off-by: Xutao Sun <xutao.sun@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Some comments have a wrong space between /** and <.
Seen with
git grep '\*\* <'
Reported-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
By default, only first 3 bytes of GRE key will be used for hash or
FD calculation. With these changes, it can select 3 or 4 bytes of
GRE key for hash or FD calculation.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Andrey Chilikin <andrey.chilikin@intel.com>
The default input set of fields of a received packet are loaded from
firmware, which cannot be modified even if users want to use different
fields for RSS or flow director. This patch adds more flexibility in
selecting packet fields for hash calculation or flow director for
users.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Andrey Chilikin <andrey.chilikin@intel.com>
It enlarges the number of supported queues to hardware allowed
maximum. There was a software limitation of 64 per physical port
which is not reasonable.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
It adjusts the number of queues for RSS from a power of 2 to any number, as
long as it does not exceed what the hardware allows.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Only seen since IPv6 RSS support was added, confusion about the purpose of
the hash_rxq_type_from_n() function has caused it to return invalid hash RX
queue types.
Refactor function for its intended purpose, rename it
hash_rxq_type_from_pos() and update comment with a better description.
Fixes: a76133214d ("mlx5: use separate indirection table for default hash Rx queue")
Reported-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The following error occurs when CONFIG_RTE_LIBRTE_MLX5_DEBUG=y:
drivers/net/mlx5/mlx5.c:381:4: error: ISO C forbids braced-groups within expressions
RTE_MIN() uses the non-standard ({ ... }) syntax to declare variables within
parentheses, which is rejected by -pedantic.
Since the RSS_INDIRECTION_TABLE_SIZE check is meant to go away as soon as
DPDK supports larger/variable indirection tables, put it in a separate
condition.
Fixes: 634efbc2c8 ("mlx5: support RETA query and update")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
remove pci_dev, pci_drv, rte_bond_pmd and pci_id_table.
handle numa_node for vdevs
handle RTE_ETH_DEV_INTR_LSC for vdevs
rename the function valid_bonded_device to check_for_bonded_device
remove branches on pci_dev
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John W. Linville <linville@tuxdriver.com>
remove rte_pcap_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
remove rte_ring_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
remove rte_null_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Initialise dev_flags, driver, kdrv, drv_name and numa_node fields
in eth_dev data
for the following vdevs:
null
ring
pcap
af_packet
xenvirt
mpipe
bonding
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Use new function rte_eth_copy_pci_info.
Copy device info for the following pdevs:
bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Make the virtio PMD allocate the array of unicast MAC addresses with
the maximum of entries (VIRTIO_MAX_MAC_ADDRS) that it exports.
Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
According to Table 7-38: Valid Fields by Offload Option
of Intel ® 82599 10 GbE Controller Datasheet,
L4LEN field is not needed for L4 XSUM computation by the hardware.
So remove l4_len from tx_offload_mask in ixgbe_set_xmit_ctx
function used to build the context transmitted to the hardware.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
ConnectX-4 adapters do not have a constant indirection table size, which is
set at runtime from the number of RX queues. The maximum size is retrieved
using a hardware query and is normally 512.
Since the current RETA API cannot handle a variable size, any query/update
command causes it to be silently updated to RSS_INDIRECTION_TABLE_SIZE
entries regardless of the original size.
Also due to the underlying type of the configuration structure, the maximum
size is limited to RSS_INDIRECTION_TABLE_SIZE (currently 128, at most 256
entries).
A port stop/start must be done to apply the new RETA configuration.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
It supports both Admin queue based and directly writing registers
based RSS hash key and lookup table configuration, as X722 supports
AQ based configuration.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
It supports both Admin queue based and directly writing registers
based RSS hash key and lookup table configuration, as X722 supports
AQ based configuration.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
In order to provide users early access of X722 and its A0 hardware,
new device IDs are added, and also compilation with those support
in base driver is enabled.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add a one-parameter function to take an existing rte_ring and wrap it as
an ethdev, returning the port id of the new ethdev instance.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The ring ethdev creation function creates an ethdev, but does not
actually set it up for use. Even if it's just a single ring, the user
still needs to create a mempool, call rte_eth_dev_configure, then call
rx and tx setup functions before the ethdev can be used.
This patch changes things so that the ethdev is fully set up after the
call to create the ethdev. The above-mentioned functions can still be
called - as will be the case, for instance, if the NIC is created via
commandline parameters - but they no longer are essential.
The function now also sets rte_errno appropriately on error, so the
caller can get a better indication of why a call may have failed.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch removes the mac local fault count and
mac remote fault count from rx errors. The mac
fault count registers count faults, not packets,
and hence should not be added to packet counters.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Add xstats() functions and statistic strings to virtio PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats() functions and statistic strings.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add implementation of xstats() functions in i40evf PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats functions to i40e PMD, allowing extended statistics
to be retrieved from the NIC and exposed to the DPDK.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats() functions and stat strings as necessary to ixgbevf PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Added and updated statistic strings as used by xstats_get(),
exposed extended queue statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats_get() and xstats_reset() functions to igb
driver, and the necessary strings to expose these
NIC statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Simple rx/tx func is chosen when mergeable rx is disabled and user
specifies single segment and no offload support.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Bulk free of mbufs when cleaning the used ring.
Shift operation of idx could be saved if vq_free_cnt means
free slots rather than free descriptors.
TODO: rearrange vq data structure, pack the stats var together so that
we could use one vec instruction to update all of them.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
With fixed avail ring, we don't need to get desc idx from avail ring.
virtio driver only has to deal with desc ring.
This patch uses vector instruction to accelerate processing desc ring.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Add software RX ring in virtqueue.
Add fake_mbuf in virtqueue for wraparound processing.
Fill avail ring with blank mbufs in virtio_dev_vring_start
Add virtio_rxtx.h header file for RTE_VIRTIO_PMD_MAX_BURST.
Would move all rx/tx related declarations into this header file in future.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
In DPDK based switching environment, mostly vhost runs on a dedicated core
while virtio processing in guest VMs runs on different cores.
Take RX for example, with generic implementation, for each guest buffer,
a) virtio driver allocates a descriptor from free descriptor list
b) modify the entry of avail ring to point to allocated descriptor
c) after packet is received, free the descriptor
When vhost fetches the avail ring, it needs to fetch the modified L1 cache from
virtio core, which is a heavy cost in current CPU implementation.
The idea of this optimization is:
allocate the fixed descriptor for each entry of avail ring, so avail ring will
always be the same during the run.
This removes L1M cache transfer from virtio core to vhost core for avail ring.
(Note we couldn't avoid the cache transfer for descriptors).
Besides, descriptor allocation and free operation is eliminated.
This also makes vector processing possible to further accelerate the processing.
This is the layout for the avail ring(take 256 ring entries for example), with
each entry pointing to the descriptor with the same index.
                  avail
                   idx
                    +
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 | ... |  254  | 255  |  avail ring
+-+--+-+--+-+-+---------+---+--+---+
  |    |    |       |   |      |
  |    |    |       |   |      |
  v    v    v       |   v      v
+-+--+-+--+-+-+---------+---+--+---+
| 0  | 1  | 2 | ... |  254  | 255  |  desc ring
+----+----+---+-------------+------+
                    |
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 |     |  254  | 255  |  used ring
+----+----+---+-------------+------+
                    |
                    +
This is the ring layout for TX.
As we need one virtio header for each xmit packet, we have 128 slots available.
                         ++
                         ||
                         ||
+-----+-----+-----+--------------+------+------+------+
|  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 128 | 129 | ... |  255 || 128  | 129  | ...  | 255  |   desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
|  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx data
+-----+-----+-----+--------------+------+------+------+
                         ||
                         ||
                         ++
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
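The two layouts above can be expressed as a short initialization sketch,
assuming a 256-entry ring as in the diagrams. The array names and functions
are hypothetical simplifications; a real virtqueue carries more state than
plain index arrays.

```c
#include <stdint.h>

#define RING_SIZE 256u
#define TX_SLOTS  (RING_SIZE / 2) /* one header and one data descriptor per packet */

/* RX: avail entry i always refers to descriptor i. */
void rx_avail_init(uint16_t avail[RING_SIZE])
{
	for (uint16_t i = 0; i < RING_SIZE; i++)
		avail[i] = i;
}

/*
 * TX: every avail entry refers to a header descriptor in the upper half of
 * the descriptor ring, and each header descriptor chains to the data
 * descriptor with the same offset in the lower half. hdr_next is indexed by
 * descriptor number; only entries 128..255 are written.
 */
void tx_ring_init(uint16_t avail[RING_SIZE], uint16_t hdr_next[RING_SIZE])
{
	for (uint16_t i = 0; i < TX_SLOTS; i++) {
		avail[i] = TX_SLOTS + i;            /* entries 0..127   -> desc 128..255 */
		avail[TX_SLOTS + i] = TX_SLOTS + i; /* entries 128..255 -> desc 128..255 */
		hdr_next[TX_SLOTS + i] = i;         /* header desc 128+i -> data desc i */
	}
}
```

Because these mappings never change after setup, the avail ring contents stay
constant at runtime and only the avail index moves.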
Vector RX function will process 4 packets at a time. When the RX
ring wraps to the tail and the number of remaining descriptors is not a
multiple of 4, SW will overwrite memory that does not belong to it and cause
a crash. The fix allocates 4 additional HW/SW entries at the tail to avoid
the overwrite.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
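A small sketch of the bounds argument behind the padding, assuming a 4-wide
access can start at any valid descriptor index. Both helper names are
hypothetical.

```c
#include <stddef.h>

#define VEC_BURST 4u /* descriptors handled per vector iteration */

/* Entries to allocate so that a 4-wide access starting at any valid
 * index stays inside the allocation. */
size_t rx_ring_alloc_entries(size_t nb_desc)
{
	return nb_desc + VEC_BURST;
}

/* Highest entry index touched by a vector access starting at idx. */
size_t last_index_touched(size_t idx)
{
	return idx + VEC_BURST - 1;
}
```

With 510 descriptors, an access starting at index 509 touches index 512,
which is past the ring itself but within the padded allocation.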
Add func fm10k_set_tx_function to decide the best TX func in
fm10k_dev_tx_init.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Vector TX uses a different way to manage the TX queue, so it's necessary
to use different functions to reset the TX queue and release mbufs
in the TX queue. So, introduce 2 function pointers to do such ops.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Since Vector RX uses different variables to trace the RX HW ring, it
needs a different func to release mbufs properly.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
This patch adds the functions below:
1. Add function fm10k_rxq_rearm to re-allocate mbuf for used desc
in RX HW ring.
2. Add 2 functions which use SSE instructions to parse RX desc
to get pkt_type and ol_flags in mbuf.
3. Add func fm10k_recv_raw_pkts_vec to parse raw packets, which
may include chained packets.
4. Add func fm10k_recv_pkts_vec to receive single mbuf packet.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add new function fm10k_params_init to initialize all fm10k related
variables.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add condition check in rx_queue_setup func. If number of RX desc
can't satisfy vPMD requirement, record it into a variable. Otherwise,
call fm10k_rxq_vec_setup to initialize Vector RX.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add new file fm10k_rxtx_vec.c and add it into compiling.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
According to XL710 datasheet:
RX QLEN restrictions: When the PXE_MODE flag in the GLLAN_RCTL_0
register is cleared, the QLEN must be whole number of 32
descriptors.
TX QLEN restrictions: When the PXE_MODE flag in the GLLAN_RCTL_0
register is cleared, the QLEN must be whole number of 32
descriptors.
So make sure that for both RX and TX queues number of HW descriptors is
a multiple of 32.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
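A minimal sketch of the resulting check, assuming only the multiple-of-32
rule quoted above. The function name is hypothetical, and any minimum or
maximum ring size the driver may also enforce is left out.

```c
#include <stdint.h>

#define QLEN_ALIGN 32u /* QLEN granularity quoted from the datasheet text */

/* Returns 0 if nb_desc is a non-zero multiple of 32, -1 otherwise. */
int check_ring_size(uint16_t nb_desc)
{
	if (nb_desc == 0 || nb_desc % QLEN_ALIGN != 0)
		return -1;
	return 0;
}
```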
This patch fills bonding port's stats with a sum of corresponding
values taken from bonded slaves, when stats are requested for bonding port.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
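The aggregation can be sketched as follows, using a reduced, hypothetical
statistics structure with four counters. The real ethdev statistics
structure has more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced, hypothetical stand-in for a port statistics structure. */
struct port_stats {
	uint64_t ipackets;
	uint64_t opackets;
	uint64_t ibytes;
	uint64_t obytes;
};

/* Sum each counter over all slaves and store the result in *out. */
void bond_stats_sum(const struct port_stats *slaves, size_t n,
		    struct port_stats *out)
{
	struct port_stats total = { 0, 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		total.ipackets += slaves[i].ipackets;
		total.opackets += slaves[i].opackets;
		total.ibytes += slaves[i].ibytes;
		total.obytes += slaves[i].obytes;
	}
	*out = total;
}
```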
This implementation allows setting and reading RSS configuration for null
device, and is used to validate correct propagation of values over the slaves
in unit tests for dynamic RSS configuration for bonding.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
This patch adds a possibility to configure more than one queue on null
device.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
This patch initializes eth_dev->link_intr_cbs queue used when null pmd is
added to the bonding.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Bonding device implements independent management of RSS settings. It
stores its own copies of settings i.e. RETA, RSS hash function and RSS
key. It’s required to ensure consistency.
1) RSS hash function set for bonding device is maximal set of RSS hash
functions supported by all bonded devices. That means, to have RSS support
for bonding, all slaves should be RSS-capable.
2) RSS key is propagated over the slaves "as is".
3) RETA for bonding is an internal table managed by bonding API, and is
used as a pattern to set up slaves. Its size is GCD of all RETA sizes, so
it can be easily used as a pattern providing expected behavior, even if
slaves RETA sizes are different.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
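Points 1) and 3) above can be sketched as a bitwise AND over the slaves'
hash function masks and a GCD over their RETA sizes. The structure and
function names are hypothetical, and the mask values used with them are
arbitrary bits rather than real RSS flag definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-slave RSS capabilities. */
struct slave_rss_caps {
	uint64_t hash_types; /* bitmask of supported RSS hash functions */
	uint16_t reta_size;  /* number of RETA entries */
};

static uint16_t gcd16(uint16_t a, uint16_t b)
{
	while (b != 0) {
		uint16_t t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/* Compute the hash functions common to all slaves and the GCD of RETA sizes. */
void bond_rss_caps(const struct slave_rss_caps *s, size_t n,
		   uint64_t *hash_types, uint16_t *reta_size)
{
	uint64_t types = n ? ~UINT64_C(0) : 0;
	uint16_t size = 0;

	for (size_t i = 0; i < n; i++) {
		types &= s[i].hash_types;            /* keep only what every slave supports */
		size = gcd16(size, s[i].reta_size);  /* gcd16(0, x) == x for the first slave */
	}
	*hash_types = types;
	*reta_size = size;
}
```

Since the bonding RETA size divides every slave's RETA size, the pattern can
be repeated a whole number of times in each slave table.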
Add separate functions to configure VMDQ and RSS.
Update dglort map and logic ports accordingly.
Reset MAC/VLAN filter after VMDQ config was changed.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
The patch does below things for fm10k MAC/VLAN filter:
- Add separate functions for VMDQ and main VSI to change
MAC filter.
- Disable modification to VLAN filter in VMDQ mode.
- In device close phase, delete logic ports to remove all
MAC/VLAN filters belonging to those ports.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Add multi-queue checking in device configure function.
Currently, VMDQ and RSS are supported.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
This patch adds one new API to get dcb related info.
rte_eth_dev_get_dcb_info
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch enables DCB+RSS multi-queue mode, and also fixes some coding
style issues.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch enables DCB feature on Intel XL710/X710 NICs. It includes:
Receive queue classification based on traffic class
Round Robin ETS schedule (rx and tx)
Priority flow control
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Each NIC has its own specific constraints on the multi-queue
configuration, so move the checking from ethdev lib to drivers.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Add interrupts handler for port status notification.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Seen with GCC < 4.6:
error: unknown field ‘tcp_udp’ specified in initializer
error: extra brace group at end of initializer
Static initialization of anonymous structs/unions is a C11 feature
properly supported only since GCC 4.6.
Work around compilation errors with older versions by expanding
struct ibv_exp_flow_spec into struct hash_rxq_init.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Normal flows do not currently provide IPv6 support.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Only a single flow per hash RX queue is needed in promiscuous mode.
Disable others to free up hardware resources.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
DPDK expects to have an RSS hash key per flow type (IPv4, IPv6, UDPv4,
etc.), to handle this the PMD must keep a table of hash keys to be able
to reconfigure the queues at each start/stop call.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
First implementation of rss_hash_update and rss_hash_conf_get. These
functions still lack some functionality but are usable to change the RSS
hash key. For now, the PMD does not handle an indirection table for
each kind of flow (IPv4, IPv6, etc.), the same RSS hash key is used
for all protocols. This situation explains why the rss_hash_conf_get
returns the RSS hash key for all DPDK supported protocols and why the
hash key is set for all of them too.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Promiscuous and allmulticast modes were historically enabled by adding
specific flows with types IBV_FLOW_ATTR_ALL_DEFAULT or
IBV_EXP_FLOW_ATTR_MC_DEFAULT to each hash RX queue, but this method is
deprecated.
- Promiscuous mode is now enabled by omitting destination MAC addresses from
basic flow specifications.
- Allmulticast mode is now enabled by using flow specifications that match
the broadcast bit in destination MAC addresses.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
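For reference, the bit that the allmulticast description above calls the
broadcast bit is assumed here to be the group bit of the destination MAC
address, which is the least significant bit of the first octet. The helper
below is a hypothetical software check, not the flow specification used by
the PMD.

```c
#include <stdint.h>

/* Returns 1 when the group (multicast/broadcast) bit of a MAC address is set. */
int mac_group_bit_set(const uint8_t mac[6])
{
	return (mac[0] & 0x01) != 0;
}
```

Both multicast and broadcast destinations have this bit set, while unicast
destinations do not.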
All hash RX QPs currently use the same flow steering rule (L2 MAC filtering)
regardless of their type (TCP, UDP, IPv4, IPv6), which prevents them from
being dispatched properly. This is fixed by adding flow information to the
hash RX queue initialization data and generating specific flow steering
rules for each of them.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Use the maximum size of the indirection table when the number of requested
RX queues is not a power of two; this helps improve RSS balancing.
A message informs users that balancing is not optimal in such cases.
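
As a rough sketch of the sizing rule, under stated assumptions: the
512-entry limit and the round-robin fill are illustrative choices for
this example (the real limit is reported by the device), and
ind_table_size and ind_table_fill are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical upper bound; the real limit is reported by the device. */
#define IND_TABLE_MAX_SIZE 512u

/* Pick the indirection table size for a given number of RX queues. */
static unsigned int
ind_table_size(unsigned int rxqs_n)
{
	assert(rxqs_n > 0 && rxqs_n <= IND_TABLE_MAX_SIZE);
	if ((rxqs_n & (rxqs_n - 1)) == 0)
		return rxqs_n; /* power of two: one entry per queue */
	fprintf(stderr,
		"warning: %u RX queues is not a power of two,"
		" RSS balancing is not optimal\n", rxqs_n);
	return IND_TABLE_MAX_SIZE; /* spread the remainder over many entries */
}

/* Fill the table round-robin with queue indices (assumed layout). */
static void
ind_table_fill(unsigned int *table, unsigned int size, unsigned int rxqs_n)
{
	assert(table != NULL && rxqs_n > 0);
	for (unsigned int i = 0; i < size; ++i)
		table[i] = i % rxqs_n;
}
```

With 3 queues and 512 entries, two queues own 171 entries and one owns
170, so the imbalance is far smaller than with a 4-entry table shared
by 3 queues.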
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The default hash RX queue handles packets that are not matched by more
specific types and requires its own indirection table of size 1 to work
properly.
This commit implements support for multiple indirection tables by grouping
their layout and properties in a static initialization table.
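
A static initialization table of this kind could look like the sketch
below. All names, fields and the set of hash RX queue types are
assumptions made for illustration and may differ from the PMD's actual
definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical hash RX queue types; HASH_RXQ_ETH is the default one. */
enum hash_rxq_type {
	HASH_RXQ_TCPV4,
	HASH_RXQ_UDPV4,
	HASH_RXQ_IPV4,
	HASH_RXQ_ETH,
};

#define TYPE_BIT(t) (1u << (t))

/* Layout and properties of one indirection table. */
struct ind_table_init {
	unsigned int max_size;   /* maximum number of entries */
	unsigned int hash_types; /* bitmask of hash RX queue types served */
};

static const struct ind_table_init ind_table_init[] = {
	{ /* RSS table shared by the specific types. */
		.max_size = UINT_MAX,
		.hash_types = TYPE_BIT(HASH_RXQ_TCPV4) |
			      TYPE_BIT(HASH_RXQ_UDPV4) |
			      TYPE_BIT(HASH_RXQ_IPV4),
	},
	{ /* Default hash RX queue: its own table of size 1. */
		.max_size = 1,
		.hash_types = TYPE_BIT(HASH_RXQ_ETH),
	},
};

#define IND_TABLES_N (sizeof(ind_table_init) / sizeof(ind_table_init[0]))

/* Return the index of the table serving a type, or -1 if none does. */
static int
ind_table_for(enum hash_rxq_type type)
{
	for (size_t i = 0; i < IND_TABLES_N; ++i)
		if (ind_table_init[i].hash_types & TYPE_BIT(type))
			return (int)i;
	return -1;
}

/* Clamp the wanted number of entries to the table's maximum size. */
static unsigned int
ind_table_entries(unsigned int table, unsigned int wanted)
{
	assert(table < IND_TABLES_N);
	return wanted < ind_table_init[table].max_size ?
	       wanted : ind_table_init[table].max_size;
}
```

Keeping the layout in one static table means adding another indirection
table is a matter of adding an entry rather than new special cases.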
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.
Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.
RSS hash key is not configurable yet.
Summary of changes:
- Individual DPDK RX queues do not store flow properties anymore; this info
is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
RSS) to remove that special case.
- Hash RX queues and the indirection table are created/destroyed when the
device is started/stopped, in addition to creating/destroying flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
it does not work correctly with WQs. Use the largest possible size
instead, since CQ size does not have to be the same as the number of
elements in the RX queue. This also applies to the maximum number of
outstanding WRs in a WQ (max_recv_wr).
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Removing this structure reduces the size of SG and non-SG RX queue elements
significantly to improve performance.
A nice side effect is that the mbuf pointer is now fully stored in
struct rxq_elt instead of relying on the WR ID data offset hack.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>