When application tried to allocate 1024 descriptors, device was not
initializing properly.
This patch solves it by avoiding allocation of all descriptors in the
ring in one attempt. At least one descriptor must remain unused in the
HW ring.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Jakub Palider <jpalider@gmail.com>
Acked-by: Jan Medala <jan.medala@outlook.com>
Expose additional fpga status registers.
Report hardware status during PMD init.
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: John Miller <john.miller@atomicrules.com>
Time stamp was carried in the packet meta data, but not
place in the mbuf. The new time stamp field is the proper
destination.
Remove the setting of data offset since this is done by
rte_pktmbuf_free()
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: John Miller <john.miller@atomicrules.com>
Removing this check improves performance as VLAN and CRC stripping are
enabled most of the time.
Convert MLX5_CQE_VLAN_STRIPPED to network order to speed up the check
instead of doing it on the completion queue entry field.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Trying to assign more queues to stats struct break only from one loop
when the maximum size is reached. Outside loop iteration is continued.
This leads to access an array out of bounds.
Fixes: 21e3fb0050 ("net/thunderx: add final bits for secondary queue support")
Cc: stable@dpdk.org
Signed-off-by: Marcin Wilk <marcin.wilk@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Create a new file rte_pmd_ixgbe.c for all the private
APIs. Move all the related code to the new file.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
To keep consistent with previous release, Priority Flow Control (PFC)
need to be disabled by default. This patch fixes it.
This also fixes an issue where traffic was not forwarded by testpmd
occasionally. In those cases ~4770 pps seen on one of the ports rather
than the full rate (>20mpps).
Fixes: 6f0a707e5b ("net/i40e: enable DCB on SRIOV VFs")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
There are some new PCTYPEs on X722, but they have not been announced
on the RTE lib, so if it can not set corresponding hash input set for
these packet type, the hash function won’t work.
So we need to handle them base on the translation of the new
PCTYPE and the original PCTYPE.
Fixes: b6a0ec4182 ("i40e: use AQ for Rx control register read/write")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Remove initialization of next and nb_segs mbuf fields in the Rx path
since they are now initialized in the mbuf pool.
See commit 8f094a9ac5 ("mbuf: set mbuf fields while in pool").
Signed-off-by: John Daley <johndale@cisco.com>
The base code was using native GCC macros
for endian detection. Change to portable
rte_byteorder.h based endian detection.
Fixes: 262d43fe20 ("net/thunderx/base: add HW register definitions")
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicates little
benefit is got with the current code by turning off the flags. This makes
the build-time option pointless, so we can remove it.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
With the mbuf rework, we now have 8 contiguous bytes to be rearmed in the
mbuf just before the 8-bytes of olflags. If we don't do the rearm write
inside the descriptor ring replenishment function, and delay it to
receiving the packet, we can do a single 16B write inside the RX function
to set both the rearm data, and the flags together.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Porting two changes from x86 SSE implematation.
net/i40e: fix checksum flag in x86 vector Rx
net/i40e: eliminate mbuf write on rearm
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicates little to no
benefit is got with the current code by turning off the flags. This makes
the build-time option pointless, so we can remove it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
With the mbuf rework, we now have 8 contiguous bytes to be rearmed in the
mbuf just before the 8-bytes of olflags. If we don't do the rearm write
inside the descriptor ring replenishment function, and delay it to
receiving the packet, we can do a single 16B write inside the RX function
to set both the rearm data, and the flags together.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Initialize mail box command to request VF FLR. Data field was
uninitialized before as it was not required and caused the following
error during scan.
Coverity issue: 1384518
Fixes: cdb166963c ("net/liquidio: add API for VF FLR")
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
When PF notify link speed to VF, encode of link speed need to
be converted from ETH_SPEED_XXX to I40E_LINK_SPEED_XXX.
Fixes: bb6722fb5c ("net/i40e: fix VF bonded device link down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Make i40e_notify_all_vfs_link_status as a wrapper of
i40e_notify_vf_link_status so duplicate code can be removed.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The eth keyword only should be accepted.
Fixes: d37705068e ("net/i40e: parse QinQ pattern")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
mbuf next is equal to NULL, nb_segs and reference counter are equal to 1
when mbuf is obtained from mempool.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
* Core TX packet moving functions
* Flesh out ark_adapter struct to support TX code
(not all fields used at this patch)
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Provide C-level interface for Arkville's internal HW resources
DDM (Downstream Data Mover) and UDM (Upstream Data Mover) modules
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Enable Arkville on supported configurations
Add overview documentation
Minimum driver support for valid compile
Arkville PMD is not supported on ARM or PowerPC at this time
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
function i40evf_add_del_all_mac_addr without check return
value of rte_zmalloc
Fixes: 97ac72aa71 ("i40e: support setting VF MAC address")
Cc: stable@dpdk.org
Signed-off-by: Henry Cai <caihe@huawei.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Check return value of malloc.
Fixes: 3bd122eef2 ("cxgbe/base: add hardware API for Chelsio T5 series adapters")
Cc: stable@dpdk.org
Signed-off-by: Henry Cai <caihe@huawei.com>
Acked-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
build error:
.../drivers/net/xenvirt/rte_eth_xenvirt.c:676:9:
error: ‘struct rte_eth_dev’ has no member named ‘driver’
eth_dev->driver = NULL;
^~
ethdev driver removed, removing assignment to it to fix the build error
Fixes: 9dca21fb80 ("ethdev: remove ethdev driver")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This moves the rte_eth_copy_pci_info() into the PCI specific ethdev
header. As a side effect this also removes it from the list of symbols
exported by the rte_ethdev library.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This removes the now unused struct eth_driver.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
There is no need to export this API. Remaining users should use the
rte_eal_vdev_init() function instead.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
A virtual device should get initialized through the rte_eal_vdev_init()
function to properly initialize the driver.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This driver can use the library function rte_eth_dma_zone_reserve()
instead of duplicating the code.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
To properly embed the generic rte_device into the rte_eth_dev this reworks
the bonding API to call through rte_eal_vdev_init().
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This is a preparation to embed the generic rte_device into the rte_eth_dev
also for virtual devices.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Secondary process doesn't properly attach to the rte_eth_device
initialized by the primary process.
Accessing device from secondary process (e.g. via rte_eth_rx_burst),
causes process to crash. because rte_eth_dev_data is not properly set.
The issue was flood by
'commit 7f95f78a8a ("ethdev: clear data when allocating device")'
which now clears rte_eth_dev_data entry.
For pci devices the struct is initialized by rte_eth_dev_pci_probe
->eth_dev_attach_secondary().
However, for virtio-user virtio_user_pmd_probe() is called instead of
rte_eth_dev_pci_probe().
The fix is to call rte_eth_dev_attach_secondary(), for secondary
process, from virtio_user_pmd_probe.
Fixes: 7f95f78a8a ("ethdev: clear data when allocating device")
Cc: stable@dpdk.org
Signed-off-by: Ami Sabo <amis@radware.com>
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Failed to destroy tunnel filter rule if the action of
the tunnel filter is VF, root cause is the wrong vsi
used.
Fixes: c50474f31e ("net/i40e: support tunnel filter to VF")
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Disable for gcc < 4.7 and icc <= 14.0
PMD uses some compiler builtins and new compiler options. Tested with
gcc 4.5.1 and following were not supported:
option:
-Ofast
macros:
_Static_assert
__ORDER_LITTLE_ENDIAN__
__ORDER_BIG_ENDIAN__
__BYTE_ORDER__
__atomic_fetch_add
__ATOMIC_ACQUIRE
__atomic_load_n
__ATOMIC_RELAXED
__atomic_store_n
__ATOMIC_RELEASE
It is not easy to fix all in PMD, disabling PMD for older compilers.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The macro RTE_PMD_TAP_MAX_QUEUES was defined twice.
On machines with kernel < 3.8, IFF_MULTI_QUEUE didn't exist, and thus
both definitions used different values.
Fixes: cf56436611 ("net/tap: move private elements to external header")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Since patch "mbuf: structure reorganization" the compiler complains
sometimes (in some conditions):
.../drivers/net/mlx5/mlx5_rxtx.c: In function ‘mlx5_rx_burst’:
.../drivers/net/mlx5/mlx5_rxtx.c:2082:17: error: ‘len’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
len is not initialised as it will be at the first segment of a received
packet, but it remains hard for the compiler to determine it.
Fixes: 9964b965ad ("net/mlx5: re-add Rx scatter support")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
To avoid multiple stores on fast path, Ethernet drivers
aggregate the writes to data_off, refcnt, nb_segs and port
to an uint64_t data and write the data in one shot
with uint64_t* at &mbuf->rearm_data address.
Some of the non-IA platforms have store operation overhead
if the store address is not naturally aligned.This patch
fixes the performance issue on those targets.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Now that the m->next pointer and m->nb_segs is expected to be set (to
NULL and 1 respectively) after a mempool_get(), we can avoid to write them
in the Rx functions of drivers.
Only some drivers are patched, it's not an exhaustive patch. It gives
the idea to do the same in other drivers.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Set the value of m->refcnt to 1, m->nb_segs to 1 and m->next
to NULL when the mbuf is stored inside the mempool (unused).
This is done in rte_pktmbuf_prefree_seg(), before freeing or
recycling a mbuf.
Before this patch, the value of m->refcnt was expected to be 0
while in pool.
The objectives are:
- to avoid drivers to set m->next to NULL in the early Rx path, since
this field is in the second 64B of the mbuf and its access could
trigger a cache miss
- rationalize the behavior of raw_alloc/raw_free: one is now the
symmetric of the other, and refcnt is never changed in these functions.
To optimize the freeing of the segments, we try try to only update
m->refcnt, m->next, and m->nb_segs when it's required (idea from
Konstantin Ananyev <konstantin.ananyev@intel.com>).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Rename __rte_mbuf_raw_free() as rte_mbuf_raw_free() and make
it public. The old function is kept for compat but is marked as
deprecated.
The next commit changes the behavior of rte_mbuf_raw_free() to
make it more consistent with rte_mbuf_raw_alloc().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Document the function and make it public, since it is used at several
places in the drivers. The old one is marked as deprecated.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Rename "rte_virtio_net.h" to "rte_vhost.h", to not let it be virtio
net specific.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We used to use rte_vhost_driver_session_start() to trigger the vhost-user
session. It takes no argument, thus it's a global trigger. And it could
be problematic.
The issue is, currently, rte_vhost_driver_register(path, flags) actually
tries to put it into the session loop (by fdset_add). However, it needs
a set of APIs to set a vhost-user driver properly:
* rte_vhost_driver_register(path, flags);
* rte_vhost_driver_set_features(path, features);
* rte_vhost_driver_callback_register(path, vhost_device_ops);
If a new vhost-user driver is registered after the trigger (think OVS-DPDK
that could add a port dynamically from cmdline), the current code will
effectively starts the session for the new driver just after the first
API rte_vhost_driver_register() is invoked, leaving later calls taking
no effect at all.
To handle the case properly, this patch introduce a new API,
rte_vhost_driver_start(path), to trigger a specific vhost-user driver.
To do that, the rte_vhost_driver_register(path, flags) is simplified
to create the socket only and let rte_vhost_driver_start(path) to
actually put it into the session loop.
Meanwhile, the rte_vhost_driver_session_start is removed: we could hide
the session thread internally (create the thread if it has not been
created). This would also simplify the application.
NOTE: the API order in prog guide is slightly adjusted for showing the
correct invoke order.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
rename "virtio_net_device_ops" to "vhost_device_ops", to not let it
be virtio-net specific.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
They are virtio-net specific and should be defined inside the virtio-net
driver.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We used to use rte_vhost_get_queue_num() for telling how many vrings.
However, the return value is the number of "queue pairs", which is
very virtio-net specific. To make it generic, we should return the
number of vrings instead, and let the driver do the proper translation.
Say, virtio-net driver could turn it to the number of queue pairs by
dividing 2.
Meanwhile, mark rte_vhost_get_queue_num as deprecated.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Assume there is an application both support vhost-user net and
vhost-user scsi, the callback should be different. Making notify
ops per vhost driver allow application define different set of
callbacks for different driver.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The rte_eth_vhost_feature_disable/enable/get APIs are just a wrapper of
rte_vhost_feature_disable/enable/get. However, the later are going to
be refactored; it's going to take an extra parameter (socket_file path),
to let it be per-device.
Instead of changing those vhost-pmd APIs to adapt to the new vhost APIs,
we could simply remove them, and let vdev to serve this purpose. After
all, vdev options is better for disabling/enabling some features.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
So far, virtio-user with vhost-user as the backend can only support
client mode. So when vhost user backend is down, i.e., unix socket
connection is broken, the connection cannot be re-connected. We will
forcely set the link state to be down.
Note: virtio-user with vhost-kernel as the backend still cannot
support lsc now as we fail to find a way to monitor the backend, tap
device, up/down events.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Originally, we did not report support of VIRTIO_NET_F_STATUS.
This feature is not reported by vhost backend, instead, it
is added/removed by QEMU in virtio PCI case.
We report the support of this feature so that following patch
will depend on this feature to enable LSC interrupt.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
For rxq interrupt, the device (backend driver) will notify driver
through callfd. Each virtqueue has a callfd. To keep compatible
with the existing framework, we will give these callfds to
interrupt thread for listening for interrupts.
Before that, we need to allocate intr_handle, and fill callfds
into it so that driver can use it to set up rxq interrupt mode.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Originally, eventfd is opened when initializing each vq; and gets closded
in virtio_user_stop_device().
To make it possible to initialize intr_handle struct in init() in following
patch, we put the open() of all eventfds into init(); and put the close()
into uninit().
Suggested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch adds a new option 'iface' to change the interface name of
tap device with vhost-kernel as backend.
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch implements support for the Virtio MTU feature.
When negotiated, the host shares its maximum supported MTU,
which is used as initial MTU and as maximum MTU the application
can set.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch adds a call to rte_vhost_mtu_get() at device creation
time to fill device's MTU property when available.
This makes the MTU value defined in QEMU cmdline accessible to the
application by calling rte_eth_dev_get_mtu().
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The link state change interrupt can only be configured if the virtio device
supports MSIX. Prior to this change the writing of the vector to the PCI
config space was causing it to overwrite the initial part of the MAC
address since the MSIX vector is not in the config space and is occupied by
the MAC address.
This has been reproduced in Virtual Box (v5.0.30.r112061) in Windows 7.
Fixes: 954ea11540 ("virtio: do not report link state feature unless available")
Cc: stable@dpdk.org
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
virtio-user limits the qeueue number to 8 but provides no limit
check against the queue number input from user. If a bigger queue
number (> 8) is given, there is an overflow issue. Doing a sanity
check could avoid it.
Fixes: 37a7eb2ae8 ("net/virtio-user: add device emulation layer")
Cc: stable@dpdk.org
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The valid tap file descriptor range should be equal or greater
than zero instead of non-zero
Fixes: e3b434818b ("net/virtio-user: support kernel vhost")
Cc: stable@dpdk.org
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch revmoves include of the numaif.h header from rte_eth_vhost.c.
Commit 586e390013 ("vhost: export numa node") moved the invocation of
get_mempolicy() from rte_eth_vhost.c to librte_vhost. So there is no need
to include the numaif.h header anymore in rte_eth_vhost.c.
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The minor change aims to remove the redundant computing and make
it easier to understand the code.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Improve error checking to avoid a caller overflowing the MCDI
request buffer if the requested TXQ size was excessively large.
Coverity issue: 1305527
Fixes: e7cd430c86 ("net/sfc/base: import SFN7xxx family support")
CC: stable@dpdk.org
Signed-off-by: Andy Moreton <amoreton@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ixgbe supports to set the relative bandwidth for the TCs.
It's a global setting for the PF and all the VFs of a
physical port.
This feature provide the API to set the bandwidth.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Some DPDK applications/examples check link status on their
start. NICVF does not wait for the link, so those apps fail.
Wait up to 9 seconds for the link as other PMDs do in order
to fix those apps/examples.
Signed-off-by: Andriy Berestovskyy <andriy.berestovskyy@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
After adding VLAN filter, the VLAN promiscuous mode is
disabled. But there's no chance to enable it.
So add the check after deleting VLAN filter. If there's
no VLAN filter left, enable the VLAN promiscuous mode.
Fixes: 9f0645cd14 ("net/i40e: fix VLAN filter")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This is necessary to ensure packets with the new MAC address as
destination get redirected to the tap device.
Also change the MAC address only if the current one is different from
the requested one.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Immediately after init (probing), the device MAC address is all zeroes.
It should be possible to get a correct MAC address as soon as that,
without need for a dev_configure().
With this patch, a MAC address is set in eth_dev_tap_create()
explicitly. It either comes from the remote if any was configured, or is
randomly generated. In any case, the device MAC address is guaranteed to
be the correct one when the tap netdevice actually gets created in
tun_alloc().
Fixes: f76d46b4ff ("net/tap: add MAC address management")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Some errors received from the kernel are acceptable, such as a -ENOENT
for a rule deletion (the rule was already no longer existing in the
kernel). Make sure we consider return codes properly. For that,
nl_recv() has been simplified.
qdisc_exists() function is no longer needed as we can check whether the
kernel returned -EEXIST when requiring the qdisc creation. It's simpler
and faster.
Add a few messages for clarity when a netlink error occurs.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Flow validation already processes the final actions to verify if a rule can
be applied or not and the same is done during creation. As the creation
function relies on validation to generate and apply a rule, this job can be
fully handled by the validation function.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
With support for segmented packets, it is now possible to easily receive
packets of many sizes, given an adequate number of descriptors.
Remove limitation on the minimum size of mbuf: on reception, if a packet
won't fit in the queue's mbufs, it will be detected in the packet info
and the packet will be discarded.
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Only full mask (0xffff) is accepted, there is no way to specify a mask
for layer 4 ports to the kernel using TC rules.
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Remove unnecessary interface queries and the Resource Domain when
creating the Queue Pair. Since mlx5 PMD data path is on top of native
APIs, such Verbs library calls are no longer needed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Currently mlx5_dev_rss_reta_update() just updates tables in the host,
therefore it isn't immediately effective until restarting the device by
calling mlx5_dev_stop()/mlx5_dev_start() to update the changes in the
device side. This patch adds rebuilding the device-specific data
structure and applying it to the device right away.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
When querying and updating RSS RETA table, it always uses the max size of
the device instead of configured value. This patch fixes it and removed the
related comments in the code.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
If the primary process has initialized all the queues to vector
pmd mode, the secondary process should not use scalar code path,
because the per queue data structures haven't been prepared for
that, e.g. txq->ops is for vector Tx rather than scalar Tx.
Fixes: a6ce64a975 ("fm10k: introduce vector driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
The QinQ filter uses big buffers, set the big_buffer flag
when restoring a QinQ filter.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The QinQ filter uses big buffers, set the big_buffer flag
when removing a QinQ filter.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add i40e_cloud_filter_qinq_create function, and call it
from i40e_dev_consistent_tunnel_filter_set function.
Replace the Outer IP filter with the QinQ filter.
QinQ allows multiple VLAN tags to be inserted into a single Ethernet
frame. A QinQ frame is a frame that has two VLAN 802.1Q headers.
802.1Q tunneling (QinQ) is a technique often used by Metro Ethernet
providers as a layer 2 VPN for customers.
Signed-off-by: Laura Stroe <laura.stroe@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The L3 MAP register is initialised to support QinQ
cloud filters.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Mark ID in the completion queue entry is 24 bits, the remaining 8 bits are
reserved and may be nonzero. Do not take them into account when looking for
marked packets.
Fixes: ea3bc3b1df ("net/mlx5: support mark flow action")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Device reconfigure should be done without close which releases
all transmit and receive queue. ethdev API assumes that previously
setup queues (minimum from configured before and now) are kept
across device reconfigure.
Fixes: aaa3f5f0f7 ("net/sfc: add configure and close stubs")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Fixes: a8ad8cf83f ("net/sfc: provide basic stubs for Tx subsystem")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure. Make it clear that corresponding
functions should be called on device configure and close operations.
No functional change.
Fixes: a8ad8cf83f ("net/sfc: provide basic stubs for Tx subsystem")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure. Make it clear that corresponding
functions should be called on device configure and close operations.
No functional change.
Fixes: a8e64c6b45 ("net/sfc: implement Rx subsystem stubs")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Port configuration should be initialized on attach to avoid reset to
defaults on device reconfigure.
Fixes: 03ed21195d ("net/sfc: minimum port control sufficient to receive traffic")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Prepare to fix device reconfigure.
Device arguments should be parsed once on attach.
Management event queue should be initialized once on attach.
Fixes: 58294ee65a ("net/sfc: support event queue")
Fixes: c22d3c508e ("net/sfc: support parameter to choose performance profile")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
EvQ info array keeps information about EvQ centralized, however
EvQ pointers are available in TxQ and RxQ structures. Single
array for all EvQs complicates device reconfigure handling, so
simply git rid of it.
It removes notion of EvQ software index since there is no EvQ
array in software any more.
Fixes: 58294ee65a ("net/sfc: support event queue")
Fixes: 9a75f75cb1 ("net/sfc: maintain management event queue")
Fixes: ce35b05c63 ("net/sfc: implement Rx queue setup release operations")
Fixes: 28944ac098 ("net/sfc: implement Rx queue start and stop operations")
Fixes: b1b7ad933b ("net/sfc: set up and release Tx queues")
Fixes: fed9aeb46c ("net/sfc: implement transmit path start / stop")
Fixes: 3b809c27b1 ("net/sfc: support link status change interrupt")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Next step to get rid of EvQ info at all.
Fixes: c22d3c508e ("net/sfc: support parameter to choose performance profile")
Fixes: 3b809c27b1 ("net/sfc: support link status change interrupt")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
EvQ info array is a problem on device reconfigure when number of
Rx and Tx queues may change. It is a step to get rid of it.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The index of an event queue index is computed from the index of the
corresponding transmit or receive queue, and depends on the total
number of receive queues. As a consequence, the index of an event
queue bound to a transmit queue changes if the total number of
receive queues is changed.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Make it clear that corresponding functions should be called on device
configure and close operations. No functional change.
Fixes: 06bc197796 ("net/sfc: interrupts support sufficient for event queue init")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If Tx datapath does not support TSO, TSO was dropped on device configure.
It is incorrect to change advertised offloads.
Fixes: 7a4d44a639 ("net/sfc: make TSO a datapath-dependent feature")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Datapath choice requires NIC capabilities knowledge and, therefore,
should be done after probe. Whereas NIC resources estimation needs
to know chosen datapath (e.g. if Tx datapath is going to use TSO).
Fixes: df1bfde4ff ("net/sfc: factor out libefx-based Rx datapath")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Put to mempool does not free chained segments.
Fixes: e0b063941e ("net/sfc: support scattered Rx DMA")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
MPLSoUDP & MPLSoGRE is not supported by tunnel
filter due to limited resource of HW, this patch
enables MPLS tunnel filter by replacing inner_mac
filter.
This configuration will be set when adding MPLSoUDP
and MPLSoGRE filter rules, and it will be invalid
only by NIC core reset.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch add MPLS parsing function to support MPLS filtering.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
First segment size must be at least 18 bytes, packets not respecting this
are silently not sent by the NIC but counted as sent by the PMD. The only
way to figure out is compiling the PMD in debug mode.
Fixes: 6579c27c11 ("net/mlx5: remove gather loop on segments")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The driver can handle dynamic MTU change without needing the port to be
stopped explicitly by the application. However, there is currently no
check to prevent I/Os from happening on a different thread while the
port is going thru' reset internally. This patch fixes this issue by
assigning RX/TX burst functions to a dummy function and also reconfigure
RX bufsize for each rx queue based on the new MTU value.
Fixes: 200645ac79 ("net/qede: set MTU")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
The newer SR-IOV PF drivers expects RX/TX queues to be created before
applying RSS configuration. This patch addresses this requirement by
deferring RSS configuration till the queues are created. Even though
this issue is only seen in SR-IOV context, the changes will be made
applicable to PF also to keep the behavior consistent between VF/PF.
Fixes: 7ab35bf6b9 ("net/qede: fix RSS")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Both UDP and TCP based RSS offload types are supported by the device.
This patch adds UDP protocol which got missed out in the original patch.
Fixes: 4c98f2768e ("net/qede: support RSS hash configuration")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add i40e_tunnel_type enumeration type to refine consistent
tunnel filter, it will be esay to add new tunnel type for
i40e.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Previously, only tunnel filter to PF is supported.
This patch adds i40e_dev_consistent_tunnel_filter_set
function for consistent filter API to support tunnel
filter to VF.
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Change tunnel filter function name to VXLAN filter
function, prepare for other tunnel filter function.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rework tunnel filter functions to align with the
new command buffer for add/remove cloud filter.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
vhost removes limit of Rx burst size(32 pkts) and supports to make
an best effort to receive pkts.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
vhost removes limit of Tx burst size(32 pkts) and supports to make
an best effort to transmit pkts.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size and
implement the "make an best effort to transmit the pkts" policy.
The patch makes ixgbe vec function work in a consistent behavior
like ixgbe_xmit_pkts_simple and ixgbe_xmit_pkts do that.
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size. The patch
makes i40e vec function an best effort to transmit the pkts in the
consistent behavior like i40e_xmit_pkts_simple and i40e_xmit_pkts do that.
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size.
The patch makes fm10k vec function an best effort to transmit
pkts in the consistent behavior like fm10k_xmit_pkts does that.
Cc: Jing Chen <jing.d.chen@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch includes slowpath configuration and fastpath changes
to support LRO and TSO. A bit of revamping is needed in order
to make use of existing packet classification schemes in Rx fastpath
and for SG element processing in Tx.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add limited support for ntuple filter and flow director configuration.
The filtering is based on 4-tuples viz src-ip, dst-ip, src-port,
dst-port. The mask fields, tcp_flags, flex masks, priority fields,
Rx queue drop etc are not supported.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add base driver APIs to enable accelerated RFS[aRFS] mode and ramrod
to configure rfs and ntuple filter.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Merge hw_stop and hw_reset into one function.
Prevent race condition between MFW attentions and pf stop command during
unload flow that causes an ASSERT.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
A step toward having multi-Txq support on same queue-zone for VFs.
This change takes care of:
- VFs assume a single CID per-queue, where queue X receives CID X.
Switch to a model similar to that of PF - I.e., Use different CIDs
for Rx/Tx, and use mapping to acquire/release those. Each VF
currently will have 32 CIDs available for it [for its possible 16
Rx & 16 Tx queues].
- To retain the same interface for PFs/VFs when initializing queues,
the base driver would have to retain a unique number per-each queue
that would be communicated in some extended TLV [current TLV
interface allows the PF to send only the queue-id]. The new TLV isn't
part of the current change but base driver would now start adding
such unique keys internally to queue_cids. This would also force
us to start having alloc/setup/free for L2 [we've refrained from
doing so until now]
The limit would be no-more than 64 queues per qzone [This could be
changed if needed, but hopefully no one needs so many queues]
- In IOV, Add infrastructure for up to 64 qids per-qzone, although
at the moment hard-code '0' for Rx and '1' for Tx [Since VF still
isn't communicating via new TLV which index to associate with a
given queue in its queue-zone].
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>