Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
name. The old flags remain usable, but a deprecation warning is issued
at compilation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
The completion queue index could be implicitly extended past its
uint16_t size when multiplied by the size of the descriptor. While
this should not be a problem, coverity flags it. Do the extension
explicitly by casting the index to uintptr_t.
Coverity issue: 161317
Fixes: 8b428cb5a9 ("net/enic: use 64B completion queue entries if available")
Cc: stable@dpdk.org
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
Latest VIC adapters support 64B CQ (completion queue) entries as well
as 16B entries available on all VIC models. 64B entries can greatly
reduce cache contention (CPU stall cycles) between DMA writes (Rx
packet descriptors) and polling CPU. The effect is very noticeable on
Intel platforms with DDIO. As most UCS servers are based on Intel
platforms, enable and use 64B CQ entries by default, if
available. Also, add devarg 'cq64' so the user can explicitly disable
64B CQ.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
There were defines which originally allowed sharing of some code with
the enic kernel driver. The code has long since diverged and now the
abstraction just makes the code harder to read. Mostly mechanical
replacement of defines and reformatting.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
The current code wrongly assumes that packets are non-TSO and ends up
rejecting large TSO packets. Check non-TSO and TSO max packet sizes
separately.
Fixes: 5a12c38740 ("net/enic: check maximum packet size in Tx prepare handler")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Move a number of Rx functions to the header file so that the avx2
based Rx handler can use them.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Currently the simple Tx handler supports no offloads, which makes it
usable only for a small number of benchmarks. Add vlan and checksum
offloads to the handler, as cycles/packet increases only by about 3
cycles, and applications commonly use those offloads.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
The NIC indicates VLAN TCI to the driver even when VLAN stripping is
disabled. The driver sets mbuf's vlan_tci but not PKT_RX_VLAN. Set
PKT_RX_VLAN to indicate that vlan_tci is valid.
Fixes: c6f4555074 ("net/enic: add ethernet VLAN packet type")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
This reverts the patch that enabled mbuf fast free.
There are two main reasons.
First, enic_fast_free_wq_bufs is broken. When
DEV_TX_OFFLOAD_MBUF_FAST_FREE is enabled, the driver calls this
function to free transmitted mbufs. This function currently does not
reset next and nb_segs. This is simply wrong as the fast-free flag
does not imply anything about next and nb_segs.
We could fix enic_fast_free_wq_bufs by making it to call
rte_pktmbuf_prefree_seg to reset the required fields. But, it negates
most of cycle saving.
Second, there are customer applications that blindly enable all Tx
offloads supported by the device. Some of these applications do not
satisfy the requirements of mbuf fast free (i.e. a single pool per
queue and refcnt = 1), and end up crashing or behaving badly.
Fixes: bcaa54c1a1 ("net/enic: support mbuf fast free offload")
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
In the default Rx handler stop processing packets at the end of
the completion ring so that wrapping doesn't have to be checked
in the inner while loop.
Also, check the color bit in the completion without using a conditional.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
The default tx handler checks the maximum packet size. Check it in the
prepare handler too. WQ stops working if the app/driver tries to send
oversized packets, so these checks are unavoidable.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Add a much-simplified handler that works when all offloads are
disabled, except mbuf fast free. When compared against the default
handler, under ideal conditions, cycles per packet drop by 60+%.
The driver tries to use the simple handler first.
The idea of using specialized/simplified handlers is from the Intel
and Mellanox drivers.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Request one completion update per roughly 32 buffers. It saves DMA
resources on the NIC, PCIe utilization, and cache miss rates.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
WQ is currently using vnic_wq_buf to store mbuf pointers for Tx
packets. But, it contains an unused mempool pointer and mbuf is
unnecessarily cast to void pointer. Remove vnic_wq_buf entirely and
use an mbuf pointer array instead.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Fix missing or incorrect packet types discovered by DTS.
- Non-IP inner packets
Set the tunnel flag.
- Inner Ethernet packets
All supported tunnel packets have Ethernet as inner packets. So, set
INNER_L2_ETHER for all tunnel types.
- IPv4 fragments carrying TCP/UDP
The NIC indicates TCP/UDP based on the protocol in IP header. For
fragments, ignore that bit and always set L4_FRAG.
- IPv6 fragments
The NIC does regconize fragments (IPv6 packets with fragment extension
headers). Set packet types for these.
Fixes: 93fb21fdbe ("net/enic: enable overlay offload for VXLAN and GENEVE")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Related to d9fff8a31, where rte_errno should always have positive
errno values.
Technically this is an ABI change since it fixes an error code
introduced in 18.02, but is minor and inconsequential.
Fixes: 1e81dbb532 ("net/enic: add Tx prepare handler")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Recent NIC models support overlay offload. The overlay offload
feature enables the following on the NIC.
- Rx/Tx checksum offloads for both inner and outer packets.
- Rx inner packet type classification.
- TSO.
- Inner RSS.
TX descriptors do not require any changes, except the header length
for TSO. The NIC parses outer/inner packets and performs offloads on
them as necessary. The header length for tunneled TSO includes both
inner and outer headers.
The NIC actually parses and performs the above for NVGRE as well. DPDK
currently has no offload flags for NVGRE, and the hardware has no
controls to individually enable tunnel types either. So do nothing for
now.
The driver enables overlay offload by default. Add a devargs
'disable-overlay=<0|1>' to allow the app to disable it.
Also update the enic guide doc.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
for IPv4. The hardware actually validates the TCP/UDP checksum of
IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Like most NICs, this hardware (Cisco VIC) also requires partial
checksum in the packet for checksum offload and TSO. So, add
the tx_pkt_prepare handler like other PMDs do.
Technically, VIC has an offload mode that does not require partial
checksum for non-TSO packets. But, it has no such mode for TSO
packets, making tx_pkt_prepare unavoidable.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The VLAN insert flag and VLAN tag used in the VIC write descriptor
can be set unconditionally.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Depend on the tx_offload flags in the mbuf to determine the length
of the headers instead of looking into the packet itself.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
enic is currently using BSD-2-Clause, whereas the DPDK approved
license is BSD-3-Clause. So replace license text with BSD-3-Clause.
Remove LICENSE as it is redundant.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Once the RQ descriptors are initialized (enic_alloc_rx_queue_mbufs),
their length_type does not change during normal RX
operations. rx_pkt_burst only needs to reset their address field for
newly allocated mbufs.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
No need to zero ol_flags as it is overwritten at the end of the
function. No need to check for EOP as the caller (enic_recv_pkts) has
already checked it.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
For non-UDP/TCP packets, enic may wrongly set PKT_RX_L4_CKSUM_BAD in
ol_flags. The comparison that checks if a packet is UDP or TCP assumes
that RTE_PTYPE_L4 values are bit flags, but they are not. For example,
the following evaluates to true because NONFRAG is 0x600 and UDP is
0x200, and causes the current code to think the packet is UDP.
!!(RTE_PTYPE_L4_NONFRAG & RTE_PTYPE_L4_UDP)
So, fix this by comparing the packet type against UDP and TCP
individually.
Fixes: 453d15059b ("net/enic: use new Rx checksum flags")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
PKT_RX_IP_CKSUM_UNKNOWN and PKT_RX_L4_CKSUM_UNKNOWN are zeros, so no
need to set them.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
A check was previously added to drop Tx packets greater than what the Nic
is capable of sending since such packets can freeze the send queue. The
check did not account for TSO packets however, so TSO was limited to 9208
bytes.
Check packet length only for non-TSO packets. Also insure that TSO packet
segment size plus the headers do not exceed what the Nic is capable of
since this also can freeze the send queue.
Use the PKT_TX_TCP_SEG ol_flag instead of m->tso_segsz which is the
preferred way to check for TSO.
Fixes: ed6e564c21 ("net/enic: fix memory leak with oversized Tx packets")
Cc: stable@dpdk.org
Signed-off-by: John Daley <johndale@cisco.com>
Rename buf_physaddr to buf_iova.
Keep the deprecated name in an anonymous union to avoid breaking
the API.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated for a while.
As explained in [1], these flags were kept to let the applications and
PMDs move to the new flag. There is also a need to support Rx vlan
offload without vlan strip (at least for the ixgbe driver).
This patch renames the old flags for this feature, knowing that some
PMDs were using PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT to indicate that
the vlan tci has been saved in the mbuf structure.
It is likely that some PMDs do not set the proper flags when doing vlan
offload, and it would be worth making a pass on all of them.
Link: [1] http://dpdk.org/ml/archives/dev/2017-June/067712.html
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Occasionally, the amount of packets to free from the work queue ends
perfectly on a boundary to have nb_free = 0 and pool = 0. This causes
a segfault as follows:
(gdb) bt
#0 rte_mempool_default_cache
#1 rte_mempool_put_bulk (n=0, obj_table=0x7f10deff2530, mp=0x0)
#2 enic_free_wq_bufs (wq=wq@entry=0x7efabffcd5b0,
completed_index=completed_index@entry=33)
#3 0x00007f11e9c86e17 in enic_cleanup_wq (enic=<optimized out>,
wq=wq@entry=0x7efabffcd5b0)
at /usr/src/debug/openvswitch-2.6.1/dpdk-16.11/drivers/net/enic/enic_rxtx.c:442
#4 0x00007f11e9c86e5f in enic_xmit_pkts (tx_queue=0x7efabffcd5b0,
tx_pkts=0x7f10deffb1a8, nb_pkts=<optimized out>)
at /usr/src/debug/openvswitch-2.6.1/dpdk-16.11/drivers/net/enic/enic_rxtx.c:470
#5 0x00007f11e9e147ad in rte_eth_tx_burst (nb_pkts=<optimized out>,
tx_pkts=0x7f10deffb1a8, queue_id=0, port_id=<optimized out>)
This commit makes the enic wq driver match other drivers who call the
bulk free, by checking that there are actual packets to free.
Fixes: 36935afbc5 ("net/enic: refactor Tx mbuf recycling")
CC: stable@dpdk.org
Reported-by: Vincent S. Cojot <vcojot@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1468631
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: John Daley <johndale@cisco.com>
For VICs with filter tagging, support the MARK and FLAG actions
by setting appropriate mbuf ol_flags if there is a filter match.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Remove initialization of next and nb_segs mbuf fields in the Rx path
since they are now initialized in the mbuf pool.
See commit 8f094a9ac5 ("mbuf: set mbuf fields while in pool").
Signed-off-by: John Daley <johndale@cisco.com>
Document the function and make it public, since it is used at several
places in the drivers. The old one is marked as deprecated.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
If a packet send is attempted with a packet larger than the NIC
is capable of processing (9208) it will be dropped with no
completion descriptor returned or completion index update, which
will lead to an mbuf leak and eventual hang.
Drop and count oversized Tx packets in the Tx burst function and
dereference/free the mbuf without sending it to the NIC.
Since the maximum Rx and Tx packet sizes are different on enic
and are now both being used, make the define ENIC_DEFAULT_MAX_PKT_SIZE
be 2 defines, one for Rx and one for Tx.
Fixes: fefed3d1e6 ("enic: new driver")
Cc: stable@dpdk.org
Signed-off-by: John Daley <johndale@cisco.com>
Replace the raw I/O device memory read/write access with eal
abstraction for I/O device memory read/write access to fix portability
issues across different architectures.
CC: John Daley <johndale@cisco.com>
CC: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Use the new L3 and L4 ..CKSUM_GOOD and ..CKSUM_UNKNOWN flags to
distinguish good checksums from unknown ones.
Signed-off-by: John Daley <johndale@cisco.com>
The enic TSO implementation requires that the length of the Eth/IP/TCP
headers be passed to the NIC. Other than that, it's just a matter of
setting the mss and offload mode on a per packet basis.
In TSO mode, IP and TCP checksums are offloaded even if not requested
with mb->ol_flags.
Signed-off-by: John Daley <johndale@cisco.com>
Enic is capable of recognizing packets to be delivered to the
app with single VLAN tags. Advertise this with the ptype
RTE_PTYPE_L2_ETHER_VLAN and set the ptype for VLAN packets.
Signed-off-by: John Daley <johndale@cisco.com>
Re-initialize Rq's when MTU is changed. This allows for more
efficient use of mbufs when moving from an MTU that is greater
than the mbuf size to one that is less. Also move to using Rx
scatter mode when moving from an MTU less than the mbuf size
to one that is greater.
Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
The bad L4 checksum flag was set on IP packets which were not
also TCP or UDP packets. This includes ICMP, IGMP and OSPF packets.
L4 ptypes were being treated as bits instead of values within the
L4 mask causing the code to check L4 checksum in the completion
queue and incorrectly set the L4 bad checksum flag.
Fixes: 947d860c82 ("enic: improve Rx performance")
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Initialize the mbuf data offset to RTE_PKTMBUF_HEADROOM as the
enic takes ownership of them. If allocated mbufs had some offset
other than RTE_PKTMBUF_HEADROOM, the application would read mbuf
data starting at the wrong place and misinterpret the packet.
Fixes: 856d7ba7ed ("net/enic: support scattered Rx")
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
In the burst Tx cleanup function, the reference count in mbufs
returned to the pool should to be decremented before they are
returned. Decrementing is not done by rte_mempool_put_bulk()
so it must be done separately using __rte_pktmbuf_prefree_seg().
Also when returning unsent buffers when the device is stopped
use rte_mbuf_free_seg() instead of rte_mempool_put() so that
reference counts are properly decremented.
Fixes: 36935afbc5 ("net/enic: refactor Tx mbuf recycling")
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
The macro ENIC_ASSERT does the same thing as RTE_ASSERT,
thus it can be removed.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: John Daley <johndale@cisco.com>