This patch adds a notice that the ABI change for ethtool app to
get the NIC firmware version in the 17.02 release.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
In 17.02 five rte_eth_dev_set_vf_*** functions will be removed
from librte_ether, renamed and moved to the ixgbe PMD.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The _rte_eth_dev_call_process function will change to return "int"
and a fourth parameter "void* ret_param" will be added. This change
targets release 17.02.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In igb_uio, iomem is mapped, and both ioport and io mem are recorded
into uio framework (then into sysfs files), which is duplicated with
what Linux has already provided for user space, and makes the code
too complex.
For iomem, DPDK user space code never opens or reads files under
/sys/pci/bus/devices/xxxx:xx:xx.x/uio/uioY/maps/. Instead,
/sys/pci/bus/devices/xxxx:xx:xx.x/resourceY are used to map device
memory.
For ioport, non-x86 platforms cannot read from files under
/sys/pci/bus/devices/xxxx:xx:xx.x/uio/uioY/portio/ directly, because
non-x86 platforms need to map port region for access in user space,
see non-x86 version pci_uio_ioport_map(). x86 platforms can use the
the same way as uio_pci_generic.
This will remove iomem and ioport mapping in igb_uio kernel module,
and adjusts the iomem implementation in both igb_uio and
uio_pci_generic:
- for x86 platform, get ports info from /proc/ioports;
- for non-x86 platform, map and get ports info by pci_uio_ioport_map().
Note: this will affect those applications who are using files under
/sys/pci/bus/devices/xxxx:xx:xx.x/uio/uioY/maps/ and
/sys/pci/bus/devices/xxxx:xx:xx.x/uio/uioY/portio/.
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Add more tested platforms and nics and OSes to the release notes.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Issue is with the digest appended feature on QAT PMD.
A workaround is also documented.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: John Griffin <john.griffin@intel.com>
The changes for the feature "Tx prepare" should be made in version 17.02.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
L3fwd-power app needs vector mode to be disabled in order to work
properly. The app used to work previously, because it was using
Rx scalar function, but now it uses vector function.
Vector mode needs to be disabled to make the app works,
which has been documented in release notes.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Add list of tested and validated NICs too.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Add information about new ixgbe PMD API.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
The library was named libethdev without rte_ prefix.
It is now fixed, the library namespace is consistent.
Note: the ABI version has already been changed in this release cycle.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
rte_device/driver generalization patches [1] were merged without a change
in the LIBABIVER variable. This patches bumps the macro of affected libs:
- libcryptodev and libetherdev have been bumped
- librte_eal version changed in
d7e61ad3ae ("log: remove deprecated history dump")
Details of ABI/API changes:
- EAL [version already bumped in: d7e61ad3ae]
|- type field was removed from rte_driver
|- rte_pci_device now embeds rte_device
|- rte_pci_resource renamed to rte_mem_resource
|- numa_node and devargs of rte_pci_driver is moved to rte_driver
|- APIs for device hotplug (attach/detach) moved into EAL
|- API rte_eal_pci_device_name added for PCI device naming
|- vdev registration API introduced (rte_eal_vdrv_register,
| rte_eal_vdrv_unregister
- librte_crypto (v 1=>2)
|- removed rte_cryptodev_create_unique_device_name API
|- moved device naming to EAL
- librte_ethdev (v 4=>5)
|- rte_eth_dev_type is removed
|- removed dev_type from rte_eth_dev_allocate API
|- removed API rte_eth_dev_get_device_type
|- removed API rte_eth_dev_get_addr_by_port
|- removed API rte_eth_dev_get_port_by_addr
|- removed rte_cryptodev_create_unique_device_name API
|- moved device naming to EAL
Also, deprecation notice from 16.07 has been removed and release notes for
16.11 added.
[1] http://dpdk.org/ml/archives/dev/2016-September/047087.html
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
This patch replaces name "libcrypto" to "openssl" from file directories,
symbol prefixes and sub-names connected with old name.
Renamed poll mode driver files, test files, and documentations.
It is done to better name association with library because
the cryptography operations are using Openssl library crypto API.
Fixes: d61f70b4c9 ("crypto/libcrypto: add driver for OpenSSL library")
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
When receiving coalesced packets in virtio, the original size of the
segments is provided. This is a useful information because it allows to
resegment with the same size.
Add a RX new flag in mbuf, that can be set when packets are coalesced by
a hardware or virtual driver when the m->tso_segsz field is valid and is
set to the segment size of original packets.
This flag is used in next commits in the virtio pmd.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Following discussions in [1] and [2], introduce a new bit to
describe the Rx checksum status in mbuf.
Before this patch, only one flag was available:
PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK.
And same for L3:
PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK.
This had 2 issues:
- it was not possible to differentiate "checksum good" from
"checksum unknown".
- it was not possible for a virtual driver to say "the checksum
in packet may be wrong, but data integrity is valid".
This patch tries to solve this issue by having 4 states (2 bits)
for the IP and L4 Rx checksums. New values are:
- PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
-> the application should verify the checksum by sw
- PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
-> the application can drop the packet without additional check
- PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
-> the application can accept the packet without verifying the
checksum by sw
- PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
data, but the integrity of the L4 data is verified.
-> the application can process the packet but must not verify the
checksum by sw. It has to take care to recalculate the cksum
if the packet is transmitted (either by sw or using tx offload)
And same for L3 (replace L4 by IP in description above).
This commit tries to be compatible with existing applications that
only check the existing flag (CKSUM_BAD).
[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-June/040007.html
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This function can be used to calculate the checksum of data embedded in
mbuf, that can be composed of several segments.
This function will be used by the virtio pmd in next commits to calculate
the checksum in software in case the protocol is not recognized.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Dequeue zero copy is disabled by default. Here add a new flag
``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` to explictily enable it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
The ``file_name`` data type of ``struct rte_port_source_params`` and
``struct rte_port_sink_params`` is changed from `char *`` to ``const char *``.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The GCC 4.9 -march option supports the intel code names for processors,
for example -march=silvermont, -march=broadwell.
The RTE_MACHINE config flag can be used to pass code name to
the compiler as -march flag.
Release notes is updated.
Linux and FreeBSD getting started guides are updated with recommended
gcc version as 4.9 and above.
Some of the gmake command examples in sample application guide and driver
guides are updated with gcc version as 4.9.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Dumping the packet type is useful for debug purposes. Instead
of having each application providing its function to do that,
introduce functions to do it.
It factorizes the code and reduces the risk of desynchronization between
the new packet types and the dump function.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Introduce the function rte_net_get_ptype() that parses a mbuf and
returns its packet type. For now, the following packet types are parsed:
L2: Ether
L3: IPv4, IPv6
L4: TCP, UDP, SCTP
The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd in next commits, allowing
to compare its result with the value given by the hardware.
This function will also be useful when implementing Rx offload support
in virtio pmd. Indeed, the virtio protocol gives the csum start and
offset, but it does not give the L4 protocol nor it tells if the
checksum is relevant for inner or outer. This information has to be
known to properly set the ol_flags in mbuf.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Introduce a new function to read the packet data from an mbuf chain. It
linearizes the data if required, and also ensures that the mbuf is large
enough.
This function is used in next commits that add a software parser to
retrieve the packet type.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
RFC3686: Using AES Counter (CTR) Mode With IPsec ESP.`
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Add support for AES-GCM (Galois-Counter Mode).
RFC4106: The Use of Galois-Counter Mode (GCM) in IPSec ESP.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
NIST SP800-38A recommends two methods to generate unpredictable IVs
(Initilisation Vector) for CBC mode:
1) Apply the forward function to a nonce (ie. counter)
2) Use a FIPS-approved random number generator
This patch implements the first recommended method by using the forward
function to generate the IV.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This code provides the initial implementation of the libcrypto
poll mode driver. All cryptography operations are using Openssl
library crypto API. Each algorithm uses EVP_ interface from
openssl API - which is recommended by Openssl maintainers.
This patch adds libcrypto poll mode driver support to librte_cryptodev
library.
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch adds the configuration file support to ipsec_secgw
sample application. Instead of hard-coded rules, the users can
specify their own SP, SA, and routing rules in the configuration
file. A command line option "-f" is added to pass the
configuration file location to the application.
Configuration item formats:
SP rule format:
sp <ip_ver> <dir> esp <action> <priority> <src_ip> <dst_ip> \
<proto> <sport> <dport>
SA rule format:
sa <dir> <spi> <cipher_algo> <cipher_key> <auth_algo> <auth_key> \
<mode> <src_ip> <dst_ip>
Routing rule format:
rt <ip_ver> <src_ip> <dst_ip> <port>
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
3DES support added to QuickAssist PMD with CTR and CBC mode.
Both cipher-only and chained with HMAC_SHAx.
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
The ixgbe base driver was updated to version
cid-10g-shared-code.2016.04.12.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This feature adds vhost pmd extended statistics from per port perspective
in order to meet the requirements of the applications such as OVS etc.
RX/TX xstats count the bytes without CRC. This is different from physical
NIC stats with CRC.
The statistics counters are based on RFC 2819 and RFC 2863 as follows:
rx/tx_good_packets
rx/tx_total_bytes
rx/tx_missed_pkts
rx/tx_broadcast_packets
rx/tx_multicast_packets
rx/tx_unicast_packets
rx/tx_undersize_errors
rx/tx_size_64_packets
rx/tx_size_65_to_127_packets;
rx/tx_size_128_to_255_packets;
rx/tx_size_256_to_511_packets;
rx/tx_size_512_to_1023_packets;
rx/tx_size_1024_to_1522_packets;
rx/tx_1523_to_max_packets;
rx/tx_errors
rx_fragmented_errors
rx_jabber_errors
rx_unknown_protos_packets;
No API is changed or added.
rte_eth_xstats_get_names() to retrieve what kinds of vhost xstats are
supported,
rte_eth_xstats_get() to retrieve vhost extended statistics,
rte_eth_xstats_reset() to reset vhost extended statistics.
The usage of vhost pmd xstats is the same as virtio pmd xstats.
for example, when test-pmd application is running in interactive mode
vhost pmd xstats will support the two following commands:
show port xstats all | port_id will show vhost xstats
clear port xstats all | port_id will reset vhost xstats
net/virtio pmd xstats(the function virtio_update_packet_stats) is used
as reference when implementing the feature.
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Added neon based Rx vector implementation.
Selection of the new handler based neon availability at runtime.
Updated the release notes and MAINTAINERS file.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Indirect descriptors are usually supported by virtio-net devices,
allowing to dispatch a larger number of requests.
When the virtio device sends a packet using indirect descriptors,
only one slot is used in the ring, even for large packets.
The main effect is to improve the 0% packet loss benchmark.
A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
(fwd io for host, macswap for VM) on DUT shows a +50% gain for
zero loss.
On the downside, micro-benchmark using testpmd txonly in VM and
rxonly on host shows a loss between 1 and 4%. But depending on
the needs, feature can be disabled at VM boot time by passing
indirect_desc=off argument to vhost-user device in Qemu.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
remove vhost-cuse code, including the eventfd_link kernel module that
is for vhost-cuse only.
The lib/virt/qemu-wrap.py is also removed, as it's mainly for vhost-cuse
usage.
As we have one vhost implementation now, one vhost config option is
needed only. Thus, CONFIG_RTE_LIBRTE_VHOST_USER is removed.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As discussed in the past release, driver names are modified
to be more consistent, and the future driver should follow
this new convention.
Driver names consist of:
"driver category"_"driver folder name"_"optional extra name".
For example:
- Crypto null driver -> "crypto_null"
- Network IXGBE VF driver -> "net_ixgbe_vf"
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Following discussions on the mailing list [1] and since nobody stood up to
implement the necessary cleanups, here is the ivshmem integration removal.
There is not much to say about this patch, a lot of code is being removed.
The default configuration file for packet_ordering example is replaced with
the "native" x86 file.
The only tricky part is in eal_memory with the memseg index stuff.
More cleanups can be done after this but will come in subsequent patchsets.
[1]: http://dpdk.org/ml/archives/dev/2016-June/040844.html
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The log history feature was deprecated in 16.07.
The remaining empty functions are removed in 16.11.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
It was planned to remove some mempool functions which are deprecated
since 16.07.
As no other mempool ABI change is planned in 16.11, it is better
to postpone and group every mempool ABI changes in 17.02.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add template release notes for DPDK 16.11 with inline
comments and explanations of the various sections.
Signed-off-by: John McNamara <john.mcnamara@intel.com>
The API changes are planned for rte_port_source_params and
rte_port_sink_params, which will be supported from release 16.11.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Vhost-cuse was invented before vhost-user exist. The both are actually
doing the same thing: a vhost-net implementation in user space. But they
are not exactly the same thing.
Firstly, vhost-cuse is harder for use; no one seems to care it, either.
Furthermore, since v2.1, a large majority of development effort has gone
to vhost-user. For example, we extended the vhost-user spec to add the
multiple queue support. We also added the vhost-user live migration at
v16.04 and the latest one, vhost-user reconnect that allows vhost app
restart without restarting the guest. Both of them are very important
features for product usage and none of them works for vhost-cuse.
You now see that the difference between vhost-user and vhost-cuse is
big (and will be bigger and bigger as time moves forward), that you
should never use vhost-cuse, that we should drop it completely.
The remove would also result to a much cleaner code base, allowing us
to do all kinds of extending easier.
So here to mark vhost-cuse as deprecated in this release and will be
removed in the next release (v16.11).
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
There was a prior call with an explanation of what needs to be done:
http://dpdk.org/ml/archives/dev/2016-June/040844.html
- Qemu patch upstreamed
- IVSHMEM PCI device managed by a PCI driver
- No DPDK objects (ring/mempool) allocated by EAL
As nobody seems interested, it is time to remove this code which
makes EAL improvements harder.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
For 16.11, the mbuf structure will be modified implying ABI breakage.
Some discussions already took place here:
http://www.dpdk.org/dev/patchwork/patch/12878/
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: John Daley <johndale@cisco.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This is an ABI deprecation notice for DPDK 16.11 in librte_ether about
changes in rte_eth_dev and rte_eth_desc_lim structures.
As discussed in that thread:
http://dpdk.org/ml/archives/dev/2015-September/023603.html
Different NIC models depending on HW offload requested might impose
different requirements on packets to be TX-ed in terms of:
- Max number of fragments per packet allowed
- Max number of fragments per TSO segments
- The way pseudo-header checksum should be pre-calculated
- L3/L4 header fields filling
- etc.
MOTIVATION:
-----------
1) Some work cannot (and didn't should) be done in rte_eth_tx_burst.
However, this work is sometimes required, and now, it's an
application issue.
2) Different hardware may have different requirements for TX offloads,
other subset can be supported and so on.
3) Some parameters (eg. number of segments in ixgbe driver) may hung
device. These parameters may be vary for different devices.
For example i40e HW allows 8 fragments per packet, but that is after
TSO segmentation. While ixgbe has a 38-fragment pre-TSO limit.
4) Fields in packet may require different initialization (like eg. will
require pseudo-header checksum precalculation, sometimes in a
different way depending on packet type, and so on). Now application
needs to care about it.
5) Using additional API (rte_eth_tx_prep) before rte_eth_tx_burst let to
prepare packet burst in acceptable form for specific device.
6) Some additional checks may be done in debug mode keeping tx_burst
implementation clean.
PROPOSAL:
---------
To help user to deal with all these varieties we propose to:
1. Introduce rte_eth_tx_prep() function to do necessary preparations of
packet burst to be safely transmitted on device for desired HW
offloads (set/reset checksum field according to the hardware
requirements) and check HW constraints (number of segments per
packet, etc).
While the limitations and requirements may differ for devices, it
requires to extend rte_eth_dev structure with new function pointer
"tx_pkt_prep" which can be implemented in the driver to prepare and
verify packets, in devices specific way, before burst, what should to
prevent application to send malformed packets.
2. Also new fields will be introduced in rte_eth_desc_lim:
nb_seg_max and nb_mtu_seg_max, providing an information about max
segments in TSO and non-TSO packets acceptable by device.
This information is useful for application to not create/limit
malicious packet.
APPLICATION (CASE OF USE):
--------------------------
1) Application should to initialize burst of packets to send, set
required tx offload flags and required fields, like l2_len, l3_len,
l4_len, and tso_segsz
2) Application passes burst to the rte_eth_tx_prep to check conditions
required to send packets through the NIC.
3) The result of rte_eth_tx_prep can be used to send valid packets
and/or restore invalid if function fails.
eg.
for (i = 0; i < nb_pkts; i++) {
/* initialize or process packet */
bufs[i]->tso_segsz = 800;
bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
| PKT_TX_IP_CKSUM;
bufs[i]->l2_len = sizeof(struct ether_hdr);
bufs[i]->l3_len = sizeof(struct ipv4_hdr);
bufs[i]->l4_len = sizeof(struct tcp_hdr);
}
/* Prepare burst of TX packets */
nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);
if (nb_prep < nb_pkts) {
printf("tx_prep failed\n");
/* drop or restore invalid packets */
}
/* Send burst of TX packets */
nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);
/* Free any unsent packets. */
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The right name of ethdev should be dpdk_netdev. However:
1/ We are using rte_ prefix in the code and library names.
2/ The API uses rte_ethdev
That's why 16.11 will just have the rte_ prefix prepended to
the library filename as every other libraries.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Driver names for all the supported devices in DPDK do not have
a naming convention. Some are using a prefix, some are not
and some have long names. Driver names are used when creating
virtual devices, so it is useful to have consistency in the names.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Remove deprecation notice pertaining to introduction of new flow
types in favor of a more generic filtering infrastructure proposal.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Add new section on tested platforms and nics and OSes to the release notes.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Improve the wording of some text in the "new features" section of
the release notes.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
When use i40e linux kernel driver as host driver and DPDK handler the i40e
VF, the promiscuous mode doesn't work in i40e VF. It is not supported by
DPDK i40e VF driver right now.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
The following tools may be installed system-wide.
It may be cleaner and more convenient to find them with the same
dpdk- prefix (especially for autocompletion).
Moreover, the script dpdk_nic_bind.py deserves a new name because it is
not restricted to NICs and can be used for e.g. crypto.
These files are renamed:
pmdinfogen -> dpdk-pmdinfogen
pmdinfo.py -> dpdk-pmdinfo.py
dpdk_pdump -> dpdk-pdump
dpdk_proc_info -> dpdk-procinfo
dpdk_nic_bind.py -> dpdk-devbind.py
setup.sh -> dpdk-setup.sh
The tools pmdinfogen, pmdinfo.py and dpdk_pdump are new in 16.07.
The scripts dpdk_nic_bind.py and setup.sh may have been used with
previous releases by end users. That's why a symbolic link still
provide the old name in the installed tools directory.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch describes the procedure to be be followed
to perform Live Migration of a VM with Virtio and VF PMD's
using the bonding PMD.
It includes sample host and VM scripts used in the procedure,
and a sample switch configuration.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
The commit cb6696d220 ("drivers: update registration macro usage")
changes the name from virtio-user to virtio_user, because hyphen
cannot be used in a C symbol name. However, this commit does not
update the strings in docs and source code, which could lead to
failure to start this device as per the docs.
This patch updates related strings in the docs and source code.
Fixes: cb6696d220 ("drivers: update registration macro usage")
Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
mmap the iomem range of the PCI device fails for kernels that
enabled CONFIG_IO_STRICT_DEVMEM option:
EAL: pci_map_resource():
cannot mmap(39, 0x7f1c51800000, 0x100000, 0x0):
Invalid argument (0xffffffffffffffff)
CONFIG_IO_STRICT_DEVMEM is introduced in Linux v4.5 and not enabled
by default:
Linux commit: 90a545e restrict /dev/mem to idle io memory ranges
As a workaround igb_uio can stop reserving PCI memory resources, from
kernel point of view iomem region looks like idle and mmap works
again. This matches uio_pci_generic usage.
With this update device iomem range is not protected against any
other kernel drivers or userspace access. But this shouldn't
be a problem for dpdk usage module since purpose of the igb_uio
module is to provide userspace access.
Fixes: af75078fec ("first public release")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The mempool_count and mempool_free_count behaved contrary to what their
names suggested. The free_count function actually returned the number of
elements that were allocated from the pool, not the number unallocated as
the name implied.
Fix this by introducing two new functions to replace the old ones,
* rte_mempool_avail_count to replace rte_mempool_count
* rte_mempool_in_use_count to replace rte_mempool_free_count
In this patch, the new functions are added, and the old ones are marked
as deprecated. All apps and examples that use the old functions are
updated to use the new functions.
Fixes: af75078fec ("first public release")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The mempool cache is only available to EAL threads as a per-lcore
resource. Change this so that the user can create and provide their own
cache on mempool get and put operations. This works with non-EAL threads
too. This commit introduces the new API calls:
rte_mempool_cache_create(size, socket_id)
rte_mempool_cache_free(cache)
rte_mempool_cache_flush(cache, mp)
rte_mempool_default_cache(mp, lcore_id)
Changes the API calls:
rte_mempool_generic_put(mp, obj_table, n, cache, flags)
rte_mempool_generic_get(mp, obj_table, n, cache, flags)
The cache-oblivious API calls use the per-lcore default local cache.
Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This commit introduces the API calls:
rte_mempool_generic_put(mp, obj_table, n, is_mp)
rte_mempool_generic_get(mp, obj_table, n, is_mc)
Deprecates the API calls:
rte_mempool_mp_put_bulk(mp, obj_table, n)
rte_mempool_sp_put_bulk(mp, obj_table, n)
rte_mempool_mp_put(mp, obj)
rte_mempool_sp_put(mp, obj)
rte_mempool_mc_get_bulk(mp, obj_table, n)
rte_mempool_sc_get_bulk(mp, obj_table, n)
rte_mempool_mc_get(mp, obj_p)
rte_mempool_sc_get(mp, obj_p)
We also check cookies in one place now.
Signed-off-by: Lazaros Koromilas <l@nofutznetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The standard Virtual Ethernet Bridge(VEB) definition in 1Qbg is a bridge
which has an uplink port to the outside world (maybe another bridge), but
a "floating" VEB is a special VEB without an uplink port to the outside.
Instead, traffic can be sent from one VF to another using the floating
VEB - even when the physical link on the NIC port is down.
This patch adds floating VEB options in the devargs for i40e driver.
Using these parameters, applications can decide whether to use legacy
VEB/VEPA or a floating VEB.
To enable this feature, the user should pass a devargs parameter to the
EAL, for example "-w 84:00.0,enable_floating_veb=1", to control whether
the PMD will to use the floating VEB feature or not.
Once the floating VEB feature is enabled, all the VFs created by
this PF device are connected to the floating VEB.
NOTE: The floating VEB functionality requires a NIC firmware version
of 5.0 or greater.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
The ixgbe base driver was updated to version
cid-10g-shared-code.2016.04.12
The changes include:
Added sgmii link for X550.
Added mac link setup for X550a SFP and SFP+.
Added KR support for X550em_a.
Added new phy definitions for M88E1500.
Added support for the VLVF to be bypassed when adding/removing
a VFTA entry.
Added X550a flow control auto negotiation support.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
In current i40e codebase, if single VLAN header is added in a packet,
it's treated as inner VLAN. Generally, a single VLAN header is
treated as the outer VLAN header, so update the driver behaviour
appropriately.
Fixes: 19b16e2f64 ("ethdev: add vlan type when setting ether type")
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Updated doc/guides/nics/overview.rst, doc/guides/nics/thunderx.rst
and release notes
Changed "*" to "P" in overview.rst to capture the partially supported
feature as "*" creating alignment issues with Sphinx table
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This patch adds the initial skeleton for bnxt driver along with the
nic guide, and ties the driver into the build system.
At this point, the driver simply fails init.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Reviewed-by: David Christensen <david.christensen@broadcom.com>
[Release Note Addition]
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When using kernel PF and DPDK VF, when the PF driver finds the link
state changes, up -> down or down -> up, the driver will send a
message to VF by mailbox. This link state change may be
triggered by PHY disconnection/reconnection, user config change
like *ifconfig down/up* or interface parameter, like MTU change.
This patch enables the support of the mailbox interrupt,
so VF driver can receive the message for link up/down.
After VF receives this message, VF port need to be reset to
recover. This needs to be handled by the application so this patch
allows the app to register a reset callback so it can reset the VF port.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
When using kernel PF and DPDK VF, when the PF driver finds the link
state changes, up -> down or down -> up, the driver will send a
message to VF by mailbox. This link state change may be
triggered by PHY disconnection/reconnection, user config change
like *ifconfig down/up* or interface parameter, like MTU change.
This patch enables the support of the mailbox interrupt,
so VF driver can receive the message for link up/down.
After VF receives this message, VF port need to be reset to
recover. This needs to be handled by the application so this patch
allows the app to register a reset callback so it can reset the VF port.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Update the documentation and comments with brief details on the base
code version included in this release.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch enables configuring MTU for i40e.
Since changing MTU needs to reconfigure queue, the port must be
stopped before configuring MTU.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Previously, there was a known issue "On Intel® 40G Ethernet
Controller stopping the port does not really down the port link."
There were two reasons why the port was always kept up.
1. Old firmware versions had issues when "Set PHY config command"
was used on 40G NICs.
2. The kernel i40e driver didn't call "Set PHY config command" when
ifconfig up/down was used, it assumes the link is always up. But
in DPDK, ports are forced down when an applications quits. So if
the port is then switched to being controlled by kernel the driver,
the port can not be brought up through "ifconfig <ethx> up".
This patch fixes this issue by adding in "Set PHY config command"
into our driver. This is now possible because with newer firmware
there is no longer a problem using this command.
With this fix, after DPDK quit, if the port is switched to being used
by the kernel driver, "ethtool -s <ethx> autoneg on" can be used to
turn on the auto negotiation, and then port can be brought up through
"ifconfig <ethx> up".
NOTE: requires kernel i40e driver version >= 1.4.X
Fixes: 2f1e228174 ("i40e: skip link control as firmware workaround")
Fixes: 16c979f9ad ("i40e: disable setting of PHY configuration")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
This patch introduced scalable multi-writer Cuckoo Hash insertion
based on a split Cuckoo Search and Move operation using Intel
TSX. It can do scalable hash insertion with 22 cores with little
performance loss and negligible TSX abortion rate.
* Added an extra rte_hash flag definition to switch default single writer
Cuckoo Hash behavior to multiwriter.
- If HTM is available, it would use hardware feature for concurrency.
- If HTM is not available, it would fall back to spinlock.
* Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related
cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to
select x86 file or other platform-specific implementations. While HTM check
is still done at runtime (same idea with
RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT)
* Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow
rte_cuckoo_hash_x86.h or future platform dependent functions to include.
* Following new functions are created for consistent names when new platform
TM support are added.
- rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement.
- rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement.
* One extra multi-writer test case is added.
Signed-off-by: Wei Shen <wei1.shen@intel.com>
Signed-off-by: Sameh Gobriel <sameh.gobriel@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Until now, the objects stored in a mempool were internally stored in a
ring. This patch introduces the possibility to register external handlers
replacing the ring.
The default behavior remains unchanged, but calling the new function
rte_mempool_set_ops_byname() right after rte_mempool_create_empty() allows
the user to change the handler that will be used when populating
the mempool.
This patch also adds a set of default ops (function callbacks) based
on rte_ring.
Signed-off-by: David Hunt <david.hunt@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
NSH packet can be recognized by Intel X710/XL710 series.
This patch enables the new packet type.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yulong Pei <yulong.pei@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Add a new virtual device named virtio-user, which can be used just like
eth_ring, eth_null, etc. To reuse the code of original virtio, we do
some adjustment in virtio_ethdev.c, such as remove key _static_ of
eth_virtio_dev_init() so that it can be reused in virtual device; and
we add some check to make sure it will not crash.
Configured parameters include:
- queues (optional, 1 by default), number of queue pairs, multi-queue
not supported for now.
- cq (optional, 0 by default), not supported for now.
- mac (optional), random value will be given if not specified.
- queue_size (optional, 256 by default), size of virtqueues.
- path (madatory), path of vhost user.
When enable CONFIG_RTE_VIRTIO_USER (enabled by default), the compiled
library can be used in both VM and container environment.
Examples:
path_vhost=<path_to_vhost_user> # use vhost-user as a backend
sudo ./examples/l2fwd/build/l2fwd -c 0x100000 -n 4 \
--socket-mem 0,1024 --no-pci --file-prefix=l2fwd \
--vdev=virtio-user0,mac=00:01:02:03:04:05,path=$path_vhost -- -p 0x1
Known issues:
- Control queue and multi-queue are not supported yet.
- Cannot work with --huge-unlink.
- Cannot work with no-huge.
- Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
hugepages.
- Root privilege is a must (mainly becase of sorting hugepages according
to physical address).
- Applications should not use file name like HUGEFILE_FMT ("%smap_%d").
- Cannot work with vhost-net backend.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
All other DPDK PMDs doesn't support concurrent receiving or sending
packets to the same queue. The upper application should deal with
this, normally through queue and core bindings.
Due to historical reason, vhost internally supports concurrent lockless
enqueuing packets to the same virtio queue through costly cmpset operation.
This patch removes this internal lockless implementation and should improve
performance a bit.
Luckily DPDK OVS doesn't rely on this behavior.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Allow reconnecting on failure by default when:
- DPDK app starts first and QEMU (as the server) is not started yet.
Without reconnecting, DPDK app would simply fail on vhost-user
registration.
- QEMU restarts, say due to OS reboot.
Without reconnecting, you can't re-establish the connection without
restarting DPDK app.
This patch make it work well for both above cases. It simply creates
a new thread, and keep trying calling "connect()", until it succeeds.
The reconnect could be disabled when RTE_VHOST_USER_NO_RECONNECT flag
is set.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Add a new paramter (flags) to rte_vhost_driver_register(). DPDK
vhost-user acts as client mode when RTE_VHOST_USER_CLIENT flag
is set. The flags would also allow future extensions without
breaking the API (again).
The rest is straingfoward then: allocate a unix socket, and
bind/listen for server, connect for client.
This extension is for vhost-user only, therefore we simply quit
and report error when any flags are given for vhost-cuse.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
With all the previous prepare works, we are just one step away from
the final ABI refactoring. That is, to change current API to let them
stick to vid instead of the old virtio_net dev.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
The new API rte_vhost_avail_entries() is actually a rename of
rte_vring_available_entries(), with the "vring" to "vhost" name
change to keep the consistency of other vhost exported APIs.
This change could let us avoid the dependency of "virtio_net"
struct, to prepare for the ABI refactoring.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
Added new SW PMD which makes use of the libsso_kasumi SW library,
which provides wireless algorithms KASUMI F8 and F9
in software.
This PMD supports cipher-only, hash-only and chained operations
("cipher then hash" and "hash then cipher") of the following
algorithms:
- RTE_CRYPTO_SYM_CIPHER_KASUMI_F8
- RTE_CRYPTO_SYM_AUTH_KASUMI_F9
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
The new pdump tool is added for packet capturing on dpdk.
This tool runs as secondary process by default.
Tool facilitates the command line options like
port, device_id, queue which user should pass on
to the tool to request the packet capture on those devices.
Tool creates the rte ring, mempool and pcap vdev and
calls the enable API of the pdump library with port/device_id,
queue, ring and mempool as arguments to enable the packet
capture on specific devices and gets the packets from the
primary process over the ring. Once the packets are
received, those packets will be send to the pcap vdev.
Tool can be terminated by using ctrl+c(SIGINT) upon which tool
calls the disable API of the pdump library to disable the packet capture
and dequeues the rest of the packets from the ring and sends them on
to the pcap vdev, then after releases all allocated resources.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The librte_pdump library provides a framework for
packet capturing in dpdk. The library provides set of
APIs to initialize the packet capture framework, to
enable or disable the packet capture, and to uninitialize
it.
The librte_pdump library works on a client/server model.
The server is responsible for enabling or disabling the
packet capture and the clients are responsible
for requesting the enabling or disabling of the packet
capture.
Enabling APIs are supported with port, queue, ring and
mempool parameters. Applications should pass on this information
to get the packets from the dpdk ports.
For enabling requests from applications, library creates the client
request containing the mempool, ring, port and queue information and
sends the request to the server. After receiving the request, server
registers the Rx and Tx callbacks for all the port and queues.
After the callbacks registration, registered callbacks will get the
Rx and Tx packets. Packets then will be copied to the new mbufs that
are allocated from the user passed mempool. These new mbufs then will
be enqueued to the application passed ring. Applications need to dequeue
the mbufs from the rings and direct them to the devices like
pcap vdev for viewing the packets outside of the dpdk
using the packet capture tools.
For disabling requests, library creates the client request containing
the port and queue information and sends the request to the server.
After receiving the request, server removes the Rx and Tx callback
for all the port and queues.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The new fields nb_rx_queues and nb_tx_queues are added to the
rte_eth_dev_info structure.
Changes to API rte_eth_dev_info_get() are done to update these new fields
to the rte_eth_dev_info object.
Release notes is updated with the changes.
The librte_pdump library needs to register Rx and Tx callbacks for all
the nb_rx_queues and nb_tx_queues, when application wants to capture the
packets on all the software configured number of Rx and Tx queues of the
device. So far there is no support to get nb_rx_queues and nb_tx_queues
information from the ethdev library. Hence these changes are introduced.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Adds and documents new callbacks that allow transitions to core
states other than dead to be reported to applications.
Signed-off-by: Remy Horton <remy.horton@intel.com>
The current extended ethernet statistics fetching involve doing several
string operations, which causes performance issues if there are lots of
statistics and/or network interfaces. This patch changes the test-pmd
and proc_info applications to use the new xstats API, and removes
deprecated code associated with the old API.
Signed-off-by: Remy Horton <remy.horton@intel.com>
The behavior of PKT_RX_VLAN_PKT was not very well defined, resulting in
PMDs not advertising the same flags in similar conditions.
Following discussion in [1], introduce 2 new flags PKT_RX_VLAN_STRIPPED
and PKT_RX_QINQ_STRIPPED that are better defined:
PKT_RX_VLAN_STRIPPED: a vlan has been stripped by the hardware and its
tci is saved in mbuf->vlan_tci. This can only happen if vlan stripping
is enabled in the RX configuration of the PMD.
For now, the old flag PKT_RX_VLAN_PKT is kept but marked as deprecated.
It should be removed from applications and PMDs in a future revision.
This patch also updates the drivers. For PKT_RX_VLAN_PKT:
- e1000, enic, i40e, mlx5, nfp, vmxnet3: done, PKT_RX_VLAN_PKT already
had the same meaning than PKT_RX_VLAN_STRIPPED, minor update is
required.
- fm10k: done, PKT_RX_VLAN_PKT already had the same meaning than
PKT_RX_VLAN_STRIPPED, and vlan stripping is always enabled on fm10k.
- ixgbe: modification done (vector and normal), the old flag was set
when a vlan was recognized, even if vlan stripping was disabled.
- the other drivers do not support vlan stripping.
For PKT_RX_QINQ_PKT, it was only supported on i40e, and the behavior was
already correct, so we can reuse the same bit value for
PKT_RX_QINQ_STRIPPED.
[1] http://dpdk.org/ml/archives/dev/2016-April/037837.html,
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch docs the issue on EAL argument that the last EAL
argument is replaced by program name in argv[].
Reported-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This patch is used to add the class_id (class_code,
subclass_code, programming_interface) support for
pci_device probe. With this patch, it will be
flexible for users to probe a class of devices
by class_id.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The log history uses rte_mempool. In order to remove the mempool
dependency in EAL (and improve the build), this feature is deprecated.
The ABI is kept but the behaviour is now voided because it seems this
function was not used. The history can be read from syslog.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The routing pipeline registers a callback function with the nic ports and
this function is invoked for updating the routing entries (corresponding to
local host and directly attached network) tables whenever the nic ports
change their states (up/down).
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
As a result of tracking, output ports of routing pipelines are linked with
physical nic ports (potentially through other pipeline instances).
Thus, the mac addresses of the NIC ports are assigned to routing pipeline
out ports which are connected to them and are further used in routing table
entries instead of hardcoded default values.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This patch enables rss (receive side scaling) per network interface
through the configuration file. The user can specify following
parameters in LINK section for enabling the rss feature - rss_qs,
rss_proto_ipv4, rss_proto_ipv6 and ip_proto_l2.
The "rss_qs" is mandatory parameter which indicates the queues to be
used for rss, while rest of the parameters are optional. When optional
parameters are not provided in the configuration file, default setting
(ETH_RSS_IPV4 | ETH_RSS_IPV6) is assumed for "rss_hf" field of the
rss_conf structure.
For example, following configuration can be applied for using the rss
on port 0 of the network interface;
[PIPELINE0]
type = MASTER
core = 0
[LINK0]
rss_qs = 0 1
[PIPELINE1]
type = PASS-THROUGH
core = 1
pktq_in = RXQ0.0 RXQ0.1 RXQ1.0
pktq_out = TXQ0.0 TXQ1.0 TXQ0.1
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This patch provides counter mode support to AES-NI multi-buffer library.
The following cipher algorithm is enabled:
- RTE_CRYPTO_CIPHER_AES_CTR
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Added possibility for AES to work in counter mode
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
Remove the deprecation notice and add an entry in the release note
for the changes in mempool allocation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The rte_pktmbuf_detach() function should decrease refcnt on a direct
buffer as stated in doc/guides/prog_guide/mbuf_lib.rst:
"whenever the indirect buffer is detached, the reference counter on the
direct buffer is decremented."
Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The rte_mempool structure is changed, which will cause an ABI change
for this structure. Providing backward compat is not reasonable
here as this structure is used in multiple defines/inlines.
Allow mempool cache support to be dynamic depending on if the
mempool being created needs cache support. Saves about 1.5M of
memory used by the rte_mempool structure.
Allocating small mempools which do not require cache can consume
larges amounts of memory if you have a number of these mempools.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
We don't want to have this instructions in the generated docs, so use
comments. It's also less confusing for people adding entries in the
documentation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Previously, for tunnel packets, such as VXLAN/NVGRE, the vlan
tags of the inner header will be stripped without putting vlan
info to descriptor, what is not expected behaviour.
This patch fixes it by changing hardware configuration to leave
the inner packet alone.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Fixes: a778a1fa2e ("i40e: set up and initialize flow director")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Some statistics were deprecated since release 2.1 (49f386542a).
The last deprecated counter to be used was imcasts.
The VF loopback statistics are also removed as they are used only
in igb and duplicated in extended statistics.
The new counters should be added to extended statistics.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
The driver i40e was using a specific PCI config before the release 16.04.
Since 16.04, it is always enabled in i40e (commit 56465cfaf).
The API has been deprecated in the commit 68f7759382.
The igb_uio implementation has been deprecated in commit b7cf8e155.
The config helper - through igb_uio sysfs entries - is now removed.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Support of PCAP file has been added to rte_port in release 16.04
as NEXT_ABI. It is in the standard ABI of the release 16.07.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add a new section on tested platforms and nics to the release notes.
Signed-off-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Following discussions with Jan, here is a deprecation notice to prepare for
hotplug and rte_device changes to come in 16.07.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
We currently exposed way too many fields (or even structures) than
necessary. For example, vhost_virtqueue struct should NOT be exposed
to user at all: application just need to tell the right queue id to
locate a specific queue, and that's all. Instead, the structure should
be defined in an internal header file. With that, we could do any changes
to it we want, without worrying about that we may offense the painful
ABI rules.
Similar changes could be done to virtio_net struct as well, just exposing
very few fields that are necessary and moving all others to an internal
structure.
Huawei then suggested a more radical yet much cleaner one: just exposing
a virtio_net handle to application, just like the way kernel exposes an
fd to user for locating a specific file, and exposing some new functions
to access those old fields, such as flags, virt_qp_nb.
With this change, we're likely to be free from ABI violations forever
(well, except when we have to extend the virtio_net_device_ops struct).
For example, following nice cleanup would not be a blocking one then:
http://dpdk.org/ml/archives/dev/2016-February/033528.html
Suggested-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch adds a notice that the API for the xstats
functionality will be modified in the 16.07 release, with
no backwards compatibility planned as it would require
code duplication in each PMD that supports xstats.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Acked-by: David Harton <dharton@cisco.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Several new fields will be added to structure rte_port_source_params for
source port enhancement with pcap file reading support.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Change rte_hash*_create() functions to return NULL and set rte_errno to
EEXIST when the object name already exists. This is the behavior
described in the API documentation in the header file.
These functions were returning a pointer to the existing object in that
case, but it is a problem as the caller did not know if the object had
to be freed or not.
Doing this change also makes the hash API more consistent with the other
APIs (mempool, rings, ...).
Fixes: 916e4f4f4e ("memory: fix for multi process support")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Change rte_lpm*_create() functions to return NULL and set rte_errno to
EEXIST when the object name already exists. This is the behavior
described in the API documentation in the header file.
These functions were returning a pointer to the existing object in that
case, but it is a problem as the caller did not know if the object had
to be freed or not.
Doing this change also makes the lpm API more consistent with the other
APIs (mempool, rings, ...).
Fixes: 916e4f4f4e ("memory: fix for multi process support")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The issue is the VF's link speed kept as 10G and status always was up.
It did not change even the physical link's status changed.
This patch fixes this issue to make VF's link info consistent with
physical link.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
The purpose of this patch is used to add a new field
"class" in rte_pci_id structure. The new class field includes
class_id, subcalss_id, programming interface of a pci device.
With this field, we can identify pci device by its class info,
which can be more flexible instead of probing the device by
vendor_id OR device_id OR subvendor_id OR subdevice_id.
For example, we can probe all nvme devices by class field, which
can be quite convenient.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add a deprecation notice for coming changes in mempool for 16.07.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Hunt <david.hunt@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Announce the ABI breakage due to addition of external mempool
manager functionality which requires changes to rte_mempool
structure.
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Deprecation notice for 16.04 for changes to occur in
release 16.07 for rte_mempool memory reduction.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: David Hunt <david.hunt@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
The link speed configuration is now done with bitmaps so 100G speed
requires only a new bit flag.
The actual link speed is a number so its size must be increased from
16-bit to 32-bit.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Tested-by: Matej Vido <vido@cesnet.cz>
This patch redesigns the API to set the link speed/s configuration
of an ethernet port. Specifically:
- it allows to define a set of advertised speeds for
auto-negociation.
- it allows to disable link auto-negociation (single fixed speed).
- default: auto-negociate all supported speeds.
A flag autoneg in struct rte_eth_link indicates if link speed was a
result of auto-negociation or was fixed by configuration.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Tested-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Beilei Xing <beilei.xing@intel.com>
Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The speed capabilities of a device can be retrieved with
rte_eth_dev_info_get().
The new field speed_capa is initialized in the drivers without
taking care of device characteristics in this patch.
When the capabilities of a driver are accurate, the table in
overview.rst must be filled.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Hash library used a function pointer to choose a different
key compare function, depending on the key size.
As a result, multiple processes could not use the same hash table,
as the function addresses vary from one process to another.
Instead, a jump table is used, so each process has its own
function addresses, accessing this table with an index stored
in the hash table (note that using a custom key compare function
is not supported in multi-process mode).
Fixes: 48a3991196 ("hash: replace with cuckoo hash implementation")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Previously, vector driver is not the first (default) choice for i40e,
as it cannot fill packet type info for l3fwd to work well. Now there
is an option for l3fwd to analysis packet type softly. So enable it
by default.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As a example to use ptype info, l3fwd needs firstly to use
rte_eth_dev_get_supported_ptypes() API to check if device and/or
its PMD driver will parse and fill the needed packet type; if not,
use the newly added option, --parse-ptype, to analyze it in the
callback softly.
As the mode of EXACT_MATCH uses the 5 tuples to caculate hash, so
we narrow down its scope to:
a. ip packets with no extensions, and
b. L4 payload should be either tcp or udp.
Note: this patch does not completely solve the issue, "cannot run
l3fwd on virtio or other devices", because hw_ip_checksum may be
not supported by the devices. Currently we can:
a. remove this requirements, or
b. wait for virtio front end (pmd) to support it.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Ixgbe HW supports 128 TX queues. However, the full 128 queues are only
available in VT and DCB mode. In normal default "none" mode (VT/DCB off)
the maximum number of available queues is only 64.
The driver doesn't check the mode when reporting the available
number of queues, allowing more that 64 queues to be used in all cases.
If a queue no. >=64 is used in default mode, the TX packets will be dropped
silently.
This change adds a check to forbid using a queue number larger than 64
during device configuration (in default mode), so that the problem is
reported as early as possible.
Fixes: 27b609cbd1 ("ethdev: move the multi-queue mode check to specific drivers")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This patch extends flow director to select vlan id as part of
filter's input set and program the filter rule with vlan id.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch adds RTE_ETH_INPUT_SET_L3_IP4_TTL,
RTE_ETH_INPUT_SET_L3_IP6_HOP_LIMITS input field types and extends
struct rte_eth_ipv4_flow and rte_eth_ipv6_flow to support filtering
by tos, protocol and ttl.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.
When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.
In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.
This is disabled by default since this can also decrease performance for
unaligned packet sizes.
Signed-off-by: Olga Shern <olgas@mellanox.com>
fix packet padding macro check
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Secondary processes are expected to use queues and other resources
allocated by the primary, however Verbs resources can only be shared
between processes when inherited through fork().
This limitation can be worked around for TX by configuring separate queues
from secondary processes.
Signed-off-by: Or Ami <ora@mellanox.com>
Add driver functions to set link state up or down.
Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.
Signed-off-by: Or Ami <ora@mellanox.com>
When Linux PF and DPDK VF are used for i40e PMD, when a PF reset occurs,
an interrupt will go via adminq event to inform the VF of the reset.
A callback mechanism is introduced for the VF to allow it to invoke a
registered callback when PF reset happens.
Users can register a callback for this interrupt event using:
rte_eth_dev_callback_register(portid,
RTE_ETH_EVENT_INTR_RESET,
reset_event_callback,
arg);
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
The patch introduces a new PMD. This PMD is implemented as thin wrapper
of librte_vhost. It means librte_vhost is also needed to compile the PMD.
The vhost messages will be handled only when a port is started. So start
a port first, then invoke QEMU.
The PMD has 2 parameters.
- iface: The parameter is used to specify a path to connect to a
virtio-net device.
- queues: The parameter is used to specify the number of the queues
virtio-net device has.
(Default: 1)
Here is an example.
$ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i
To connect above testpmd, here is qemu command example.
$ qemu-system-x86_64 \
<snip>
-chardev socket,id=chr0,path=/tmp/sock0 \
-netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
-device virtio-net-pci,netdev=net0,mq=on
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
Tested-by: Rich Lane <rich.lane@bigswitch.com>
Update for queue state event name:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This is a PMD for the Amazon ethernet ENA (Elastic Network Adapters)
family.
The driver operates variety of ENA adapters through feature negotiation
with the adapter and upgradable commands set.
ENA driver handles PCI Physical and Virtual ENA functions.
Signed-off-by: Evgeny Schemeilin <evgenys@amazon.com>
Signed-off-by: Jan Medala <jan@semihalf.com>
Signed-off-by: Jakub Palider <jpa@semihalf.com>
Release Note addition:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch updates the release notes with the features that
have been added to ip_pipeline application.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Allow dynamic deallocation of af_packet device through proper
API functions. To achieve this:
* set device flag to RTE_ETH_DEV_DETACHABLE
* implement rte_pmd_af_packet_devuninit() and expose it
through rte_driver.uninit()
* copy device name to ethdev->data to make discoverable with
rte_eth_dev_allocated()
Moreover, make af_packet init function static, as there is no
reason to keep it public.
Signed-off-by: Wojciech Zmuda <woz@semihalf.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Add support for linking multi-segment buffers together to
handle Jumbo packets. The vmxnet3 API supports having header
and body buffer types. What this patch does is fill the primary
ring completely with header buffers and the secondary ring
with body buffers. This allows for non-jumbo frames to only
use one mbuf (from primary ring); and jumbo frames will have
first mbuf from primary ring and following mbufs from other
ring.
This could be optimized in future if the DPDK had API
to supply different sized mbufs (two pools) into driver.
Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Release note addition:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This commit adds vmxnet3 TSO support.
Verified with test-pmd (set fwd csum) that both tso and
non-tso pkts can be successfully transmitted and all
segmentes for a tso pkt are correct on the receiver side.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Tx data ring support was removed in a previous change that
added multi-seg transmit. This change adds it back.
According to the original commit (2e849373), 64B pkt
rate with l2fwd improved by ~20% on an Ivy Bridge
server at which point we start to hit some bottleneck
on the rx side.
I also re-did the same test on a different setup (Haswell
processor, ~2.3GHz clock rate) on top of the master
and still observed ~17% performance gains.
Fixes: 7ba5de417e ("vmxnet3: support multi-segment transmit")
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Mmap PCI resource file and add inline functions for reading from and
writing to PCI resource address space.
Add description of IBUF and OBUF address space.
Add configuration option for setting which firmware type will be used.
Right address space values for IBUFs and OBUFs offsets are used
according to configuration option CONFIG_RTE_LIBRTE_PMD_SZEDATA2_AS.
Setting link up/down and getting info about link status is done through
mmapped PCI resource address space.
Signed-off-by: Matej Vido <vido@cesnet.cz>
PMD was of type PMD_VDEV which means that PCI device is not recognised
automatically during EAL initialization, but it has to be created by
EAL option --vdev.
Now, PMD is of type PMD_PDEV which means that PCI device is probed
and recognised during EAL initialization automatically.
Path to szedata2 device file is matched with device and the count
of available RX and TX DMA channels is found out during device
initialization.
Initialization, starting and stopping of queues is changed to better
correspond with Ethernet device API model. Function callbacks
(rx|tx)_queue_(start|stop) are added. Unnecessary items are removed
from ethernet device private data structure.
Signed-off-by: Matej Vido <vido@cesnet.cz>
Change rxq_cq_to_ol_flags() to set checksum flags according to packet type,
so for non L3/L4 packets the mbuf chksum_bad flags will not be set.
Fixes: 67fa62bc67 ("mlx5: support checksum offload")
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
RSS configuration should not be freed when priv is NULL.
Fixes: 2f97422e77 ("mlx5: support RSS hash update and get")
Signed-off-by: Or Ami <ora@mellanox.com>
Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.
This feature requires MLNX_OFED >= 3.2.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This patch enables reading sglort (global resource tag) info into the
mbuf for RX and inserting an FTAG (Fabric Tag) at the beginning of the
packet for TX. The vlan_tci_outer field selected from rte_mbuf structure
for sglort is not used in fm10k now.
In FTAG based forwarding mode, the switch will forward packets according
to glort info in FTAG rather than mac and vlan table.
To activate this feature, user needs to pass a devargs parameter to eal
for fm10k device like "-w 0000:84:00.0,enable_ftag=1". Currently this
feature is supported only on PF, because FM10K_PFVTCTL register is
read-only for VF.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Using SSE instructions to parse error flags in HW Rx descriptor,
then set corresponding bits of mbuf.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
When the TX function tries to free a bunch of mbufs, it will free
them one by one. This change will scan the free list and merge the
requests in case they belongs to same pool, then free once, which
will reduce cycles on freeing mbufs.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Previous l3fwd-power only processes IP and IPv6 packets, other
packets' mbufs are not freed, and this causes a memory leak.
This patch fixes this issue.
Fixes: 3c0184cc0c ("examples: replace some offload flags with packet type")
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
In interrupt mode, each rx queue can have one interrupt to notify the
application when packets are available in that queue. Some queues
also can share one interrupt.
Currently, fm10k needs one separate interrupt for mailbox. So, only those
drivers which support multiple interrupt vectors e.g. vfio-pci can work
in fm10k interrupt mode.
This patch uses the RXINT/INT_MAP registers to map interrupt causes
(rx queue and other events) to vectors, and enable these interrupts
through kernel drivers like vfio-pci.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
This patch implemented the ops of adding and removing mac
address in i40evf driver. Functions are assigned like:
.mac_addr_add = i40evf_add_mac_addr,
.mac_addr_remove = i40evf_del_mac_addr,
To support multiple mac addresses setting, this patch also
extended the mac addresses adding and deletion when device
start and stop. Each VF can have a maximum of 64 mac
addresses.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
VEB switching feature for i40e is used to enable the switching between the
VSIs connect to the virtual bridge. The old implementation is setting the
virtual bridge mode as VEPA which is port aggregation. Enable the switching
ability by setting the loop back mode for the specific VSIs which connect
to PF or VFs.
VEB/VSI/VEPA are concepts not specific to the i40e HW, the concepts are
from 802.1qbg spec
IEEE EVB tutorial:
http://www.ieee802.org/802_tutorials/2009-11/evb-tutorial-draft-20091116_v09.pdf
VEB: a virtual switch can forward the packet based on the specific match
field.
VSI: a virtual interface connect between the VEB/VEPA and virtual machine.
VEPA: a virtual Ethernet port aggregator will upstream the packets from
VSI to the LAN port.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Previously, DCB(Data Center Bridging) is only enabled on PF,
queue mapping and BW configuration is only done on PF.
This patch enables DCB for VMDQ VSIs(Virtual Station Interfaces)
by following steps:
1. Take BW and ETS(Enhanced Transmission Selection)
configuration on VEB(Virtual Ethernet Bridge).
2. Take BW and ETS configuration on VMDQ VSIs.
3. Update TC(Traffic Class) and queues mapping on VMDQ VSIs.
To enable DCB on VMDQ, the number of TCs should not be larger than
the number of queues in VMDQ pools, and the number of queues per
VMDQ pool is specified by CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM
in config/common_* file.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Several structures and macros are added or updated, such
as 'struct i40e_aqc_get_link_status',
'struct i40e_aqc_run_phy_activity' and
'struct i40e_aqc_lldp_set_local_mib_resp'.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
RX control register read/write functions are added, as directly
read/write may fail when under stress small traffic. After the
adminq is ready, all rx control registers should be read/written
by dedicated functions.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Generate a MAC address for each VF during PF host
initialization.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Normally the auto-negotiation is supported by FW. SW need not care about
that. But on x550em_x, FW doesn't support auto-neg. As the x550em_x ports
are 10G, if we connect the port will a peer which is 1G, the link will
always be down.
We need support auto-neg by SW to avoid this link down issue. As we already
have the code to handle the link speed setting, what we need is a trigger.
When the advertised link speed changes, a PHY interruption will be
triggered. So, we should handle this interrupt and call ixgbe_handle_lasi
to set the link speed correctly.
Please be aware it's working when auto-neg is on. If the auto-neg of the
peer port is turned off and its speed is indicated manually, we should also
set the speed of our own port manually.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Add multicast promiscuous mode support on ixgbe VF driver.
Please note if we want to use this promiscuous mode, we need both PF
and VF driver to support it. The reason is this VF feature is
configged on PF.
If use kernel PF driver + dpdk VF driver, make sure kernel PF driver
support VF multicast promiscuous mode. If use dpdk PF + dpdk VF,
better make sure PF driver is the same version as VF.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
The MDIO clock speed must be reconfigured after the MAC reset.
The MDIO clock speed becomes invalid, therefore the driver reads
invalid PHY register values. The driver now set the MDIO clock
speed prior to initializing PHY ops and again after the MAC reset.
As now the MDIO speed gets set in more than one place, make a
function for it so it will always be done correctly.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Do not set FDIRCTRL.DROP_NO_MATCH in ixgbe_init_fdir_perfect_82599(),
this bit is already set in ixgbe_set_fdir_drop_queue_82599() which
makes more sense for drivers that call that function.
This resolves an issue where packets were being dropped when switching
to perfect filters mode.
Setting this bit makes no sense in perfect filters mode for the
driver as we do not want to route all packets that don't match an FDIR
rule to a single queue and instead fall back to RSS.
Drivers that need this bit set can call ixgbe_set_fdir_drop_queue_82599()
and the ones that don't, can preserve the old behavior.
Fixes: 2241ce2816 ("ixgbe/base: add flow director drop queue")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch resolves an issue where VF mac address is zeroed out
in cases where the VF driver is loaded while the PF interface
is down.
The solution is to only set it when we get an ACK from the PF.
Fixes: 6202266e56 ("ixgbe/base: vf changes")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Only x550em_x V1 was supported before. Now V2 is supported.
A mask for V1 and V2 is defined and used to support both.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add new X550EM_a devices and their mac types, X550EM_a
and X550EM_a_vf.
Update the code to use the new devices and mac types.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
max_rx_pkt_len already includes ETHER_HDR_LEN and ETHER_CRC_LEN for the
mtu. But, the firmware also adds ETHER_HDR_LEN and ETHER_CRC_LEN to the
mtu specified. Fix by subtracting these values from the mtu before
passing it to firmware.
Fixes: 4b2eff452d ("cxgbe: enable jumbo frames")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
The size of each entry in the port's rss table is actually 2 bytes
and not 1 byte. A segfault occurs when accessing part of port 0's rss
table because it gets overwritten by subsequent port 1's part of the
rss table. Fix by setting the size of each entry appropriately.
Fixes: 92c8a63223 ("cxgbe: add device configuration and Rx support")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Change the fields of outer_mac and inner_mac in struct
rte_eth_tunnel_filter_conf from pointer to struct in order to
keep the code's readability.
Signed-off-by: Xutao Sun <xutao.sun@intel.com>
Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The patch add VxLAN & NVGRE TX checksum off-load. When the flag of
outer IP header checksum offload is set, we'll set the context
descriptor to enable this checksum off-load.
Also update release notes for VxLAN & NVGRE checksum off-load support.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The names of function for tunnel port configuration are not
accurate. They're tunnel_add/del, better change them to
tunnel_port_add/del.
The old functions are directly replaced because the API and ABI
compatibility of ethdev are already broken in 16.04.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add support of l2 tunnel configuration and operations.
1, Support modifying ether type of a type of l2 tunnel.
2, Support enabling and disabling the support of a type of l2 tunnel.
3, Support enabling/disabling l2 tunnel tag insertion/stripping.
4, Support enabling/disabling l2 tunnel packets forwarding.
5, Support adding/deleting forwarding rules for l2 tunnel packets.
Only support E-tag now.
Also update the release note.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>