In virtio, Explicit Congestion Notification (ECN) includes two parts:
guest ECN and host ECN. Guest ECN means the frontend can handle TSO
packets which have ECN set, and host ECN means the backend can handle
TSO packets which have ECN set.
The ECN features are rarely used. However, virtio-net enables them by
default, and vhost-net support both. To make live migration from
vhost-net to vhost-user possible, this patch announces to support
guest and host ECN in vhost-user.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Suggested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds dev_pause, dev_resume and inject_pkts APIs to allow
driver to pause the worker threads and inject special packets into
Tx queue. The next patch will be based on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The virtio_send_command function may be called from app's configuration
routine, but also from an interrupt handler called when live migration
is done on the backup side. So this patch makes control queue
thread-safe first.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
For vhost sample, the operation if (dev_info.max_rx_queues >
MAX_QUEUES) in the function port_init causes startup failure
when using X710(i40e driver). X710 requires that MAX_QUEUES
should be defined no less than 320, however it is defined as
128 currently.
Such checking is overkill and Removal don't affect any
functionality (have already validated ixgbe and i40e).
The removal can avoid similar issue when introduing new physical NIC.
Fixes: 8bd6c395a5 ("examples/vhost: increase maximum queue number")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
On error, pthread_create() returns a positive number (errno).
Fix the test on the return value.
Fixes: af14759181 ("vhost: introduce API to start a specific driver")
Fixes: e623e0c6d8 ("vhost: add reconnect ability")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
The max_mtu is kept as zero in case no CRTL channel, which leads
to failure when calling virtio_mtu_set().
Signed-off-by: Zhike Wang <wangzhike@jd.com>
Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
This patch adds the name for vhost-user reconnect thread.
It can help us to know whether the thread is running.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
The driver can suppress interrupt when VIRTIO_F_EVENT_IDX feature bit is
negotiated. The driver set vring flags to 0, and MAY use used_event in
available ring to advise device interrupt util reach an index specified
by used_event. The device ignore the lower bit of vring flags, and send
an interrupt when index reach used_event.
The device can suppress notification in a manner analogous to the ways
driver suppress interrupt. The device manipulates flags or avail_event in
the used ring in the same way the driver manipulates flags or used_event in
available ring.
Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
PKT_RX_FDIR_ID should be set only if flow_tag is neither non-zero nor
MLX5_FLOW_MARK_DEFAULT.
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Adding the size of ibv_flow_spec_action_drop is missing. This can cause
overflow when converting rte_flow to ibv_flow and thus corrupt memory.
Fixes: 8086cf08b2 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
If mlx5_dev_link_status_handler() is executed while canceling the alarm,
deadlock can happen because rte_eal_alarm_cancel() waits for all callbackes
to finish execution and both calls are protected by priv->lock.
Fixes: 198a3c339a ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Mailbox interrupt handler only takecare of the PF reset notification,
for other message mbx->ops.read should not be called since it gets
chance to break the foreground VF to PF communication.
Fixes: 316f4f1adc ("net/igb: support VF mailbox interrupt for link up/down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
Mailbox interrupt handler only takes care of PF reset notification, for
other message ixgbe_read_mbx should not be called since it gets chance
to break the foreground VF to PF communication.
This can be simply repeated by 'testpmd>rx_vlan rm all 0'.
Fixes: 77234603fb ("net/ixgbe: support VF mailbox interrupt for link up/down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
This patch adds the PCI device ID for Intel 82579LM support.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The transmit and receive controller state machines are only enabled after
receiving an interrupt and the link status is now valid. If an adapter
is being used in conjunction with NC-SI, network controller sideband
interface, the adapter may never get a link state change interrupt since
the adapter's PHY is always link up and never changes state.
To fix this, always enable and disable the transmit and receive with
.dev_start and .dev_stop. This is a better match for what is typically
done with the other PMD's. Since we may never get an interrupt to check
the link state, we also poll once at the end of .dev_start to get the
current link status.
Signed-off-by: Chas Williams <chas3@att.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Calling i40evf_dev_xstats_reset can sometimes crash. Fixed issue
by checking return code before using pstats.
Fixes: 8210e9e0d8 ("net/i40e: fix clear xstats bug in VF")
Cc: stable@dpdk.org
Signed-off-by: David Harton <dharton@cisco.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Wei Zhao <wei.zhao1@intel.com>
This patch will go into the process of clear all queue region
related configuration when dev stop even if there is no queue region
command before, so this is a bug, it may cause error. So add code
to check if there is queue configuration exist when flush queue
region config and remove this process when device stop. Queue region
clear only do when device initialization or PMD get flush command.
Fixes: 7cbecc2f74 ("net/i40e: support queue region set and flush")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
New MAC address is copied to dev->data->mac_addrs[0] before calling
setting MAC address ops. So it will fail when deleting
dev->data->mac_addrs[0]. Deleting hw->mac.addr will fix the issue.
Fixes: 943c2d899a ("net/i40e: set VF MAC from VF")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
VF can't run in multiple queue mode, if nb_q_per_pool is set to 1.
Nb_q_per_pool is passed through to max_rx_q and max_tx_q in VF.
So if nb_q_per_pool is equal to 1, max_rx_q and max_tx_q shouldn't
be more than 1, otherwise VF multiple queue mode will fail.
RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool is the Max number of queue can be
used in VF, that would be assigned as IXGBE_MAX_RX_QUEUE_NUM /
RTE_ETH_DEV_SRIOV(dev).active, so that assigning nb_q_per_pool as 1
when PF is in ETH_MQ_RX_NONE, which will make VF can just use only 1
queue, is not right.
Fixes: 27b609cbd1 ("ethdev: move the multi-queue mode check to specific drivers")
Cc: stable@dpdk.org
Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
Rte_flow was defined to include RSS, this patch moves i40e
existing RSS to rte_flow. The old RSS configuration is kept
as it was, and can be deprecated in the future.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch adds command to configure input set for RSS,
FDIR, and FDIR flexible payload.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch supports getting/setting input set info for
RSS, FDIR, and FDIR flexible payload. It also adds some
helper functions for input set configuration.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Change shadow queues allocation from port/core to txq/core.
Use array of shadow queues (one per lcore) for each tx queue object to
avoid data corruption when few tx queues are handled by one lcore and
buffers that were not sent yet, can be released and used for receive.
Fixes: 0ddc9b815b ("net/mrvl: add net PMD skeleton")
Cc: stable@dpdk.org
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
Don't return mbuf to dpdk pool if failed to get buffer from the bpool.
Fix maximum bpool size calculation to prevent unnecessary bpool
oversize cases when working with small rx queues.
Fixes: 0ddc9b815b ("net/mrvl: add net PMD skeleton")
Cc: stable@dpdk.org
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
1. Add checking for non-EAL threads.
2. Create hif objects on first use since sometimes on probe not all
lcores are initialized and can be added later.
In this case the hif objects for later cores were not created and
this caused system crash.
Fixes: 0ddc9b815b ("net/mrvl: add net PMD skeleton")
Cc: stable@dpdk.org
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
MUSDK library initialization and cleanup should be done once.
This commit fixes that by doing necessary initialization once the first
port is probed and cleanup once the last port is removed.
Fixes: 0ddc9b815b ("net/mrvl: add net PMD skeleton")
Cc: stable@dpdk.org
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
This operation is already done by the ethdev layer, it should not
be done by the driver.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The TAP device does not use these definitions in current version.
And kernel version is not the correct way to detect features.
Fixes: 7c2d03d65f ("net/tap: remove Linux version check")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
FSLMC bus detects a multiple type of logical objects representing
components of the datapath.
Using the type of device, a newly introduced API
rte_fslmc_get_device_count can return the count of devices
scanned of that device type.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Connecting the sub-devices each other by cyclic linked list can help to
iterate over them by Rx burst functions because there is no need to
check the sub-devices ring wraparound.
Create the aforementioned linked-list and change the Rx burst functions
iteration accordingly.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Fail-safe uses atomic operations to protect sub-device close operation
calling by host thread in removal time while the removed sub-device
burst functions are still in process by application threads.
Using "set" atomic operations is a little bit more efficient than "add"
or "sub" atomic operations because "set" shouldn't read the value and
in fact, it does not need a special atomic mechanism in x86 platforms.
Replace "add 1" and "sub 1" atomic operations by "set 1" and "set 0"
atomic operations.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
failsafe_rx_burst function is used when there are no sub-devices or at
least one of them has been removed, on the other hand, when all the
sub-devices are present, failsafe_rx_burst_fast function is used.
So it's really expected that some of the sub-devices will be unsafe for
Rx burst in failsafe_rx_burst execution.
Remove unlikely compiler hint from fs_rx_unsafe calling.
Fixes: a46f8d584e ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Set the MTU for bonding device by calling .mtu_set for all
the slaves. Set the MTU only if all slaves support .mtu_set,
and there is no error returned from any slave.
Signed-off-by: Sharmila Podury <sharmila.podury@att.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
On single NUMA device the log information shows as boundary value
instead of -1. The change brings in unsigned to signed.
Fixes: 4c173302c3 ("pcap: add new driver")
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The VLAN insert flag and VLAN tag used in the VIC write descriptor
can be set unconditionally.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Depend on the tx_offload flags in the mbuf to determine the length
of the headers instead of looking into the packet itself.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
enic is currently using BSD-2-Clause, whereas the DPDK approved
license is BSD-3-Clause. So replace license text with BSD-3-Clause.
Remove LICENSE as it is redundant.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Once the RQ descriptors are initialized (enic_alloc_rx_queue_mbufs),
their length_type does not change during normal RX
operations. rx_pkt_burst only needs to reset their address field for
newly allocated mbufs.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
No need to zero ol_flags as it is overwritten at the end of the
function. No need to check for EOP as the caller (enic_recv_pkts) has
already checked it.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Header splitting has been disabled at least since the following
commit. Remove the remaining code to avoid confusion.
commit 947d860c82 ("enic: improve Rx performance")
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
For non-UDP/TCP packets, enic may wrongly set PKT_RX_L4_CKSUM_BAD in
ol_flags. The comparison that checks if a packet is UDP or TCP assumes
that RTE_PTYPE_L4 values are bit flags, but they are not. For example,
the following evaluates to true because NONFRAG is 0x600 and UDP is
0x200, and causes the current code to think the packet is UDP.
!!(RTE_PTYPE_L4_NONFRAG & RTE_PTYPE_L4_UDP)
So, fix this by comparing the packet type against UDP and TCP
individually.
Fixes: 453d15059b ("net/enic: use new Rx checksum flags")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
PKT_RX_IP_CKSUM_UNKNOWN and PKT_RX_L4_CKSUM_UNKNOWN are zeros, so no
need to set them.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
The following commits deprecate the use of the offload bit fields
(e.g. header_split) in rte_eth_rxmode and txq_flags in rte_eth_txconf.
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
For enic, the required changes are mechanical. Use the new 'offloads'
field in rxmode instead of the bit fields. And, no changes required
with respect to txq_flags, as enic does not use it at all.
Per-queue RX offload capabilities are not set, as all offloads are
per-port at the moment.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>