Add support of l2 tunnel configuration and operations.
1, Support modifying ether type of a type of l2 tunnel.
2, Support enabling and disabling the support of a type of l2 tunnel.
3, Support enabling/disabling l2 tunnel tag insertion/stripping.
4, Support enabling/disabling l2 tunnel packets forwarding.
5, Support adding/deleting forwarding rules for l2 tunnel packets.
Only support E-tag now.
Also update the release note.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
On X550, as required by datasheet, E-tag packets are not expected
when double VLAN are used. So modify the register PFVTCTL after
enabling double VLAN to select pool by MAC but not MAC or E-tag.
An introduction of E-tag:
It's defined in IEEE802.1br. Please reference this website,
http://www.ieee802.org/1/pages/802.1br.html.
A brief description.
E-tag means external tag, and it's a kind of l2 tunnel. It means a
tag will be inserted in the l2 header. Like below,
|31 24|23 16|15 8|7 0|
0| Destination MAC address |
4| Dest MAC address(cont.) | Src MAC address |
8| Source MAC address(cont.) |
12| E-tag Etherenet type (0x893f) | E-tag header |
16| E-tag header(cont.) |
20| VLAN Ethertype(optional) | VLAN header(optional) |
24| Original type | ...... |
...| ...... |
The E-tag format is like below,
|0 15|16 18|19 |20 31|
| Ethertype - 0x893f | E-PCP |DEI| Ingress E-CID_base |
|32 33|34 35|36 47|48 55 |56 63|
| RSV | GRP |E-CID_base|Ingress_E-CID_ext| E-CID_ext |
The Ingess_E-CID_ext and E-CID_ext are always zero for endpoints
and are effectively reserved.
The more details of E-tag is in IEEE 802.1BR. 802.1BR is used to
replace 802.1Qbh. 802.1BR is a standard for Bridge Port Extension.
It specifies the operation of Bridge Port Extenders, including
management, protocols, and algorithms. Bridge Port Extenders
operate in support of the MAC Service by Extended Bridges.
The E-tag is added to l2 header to identify the VM channel and
the virtual port.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
The array 'ptype_table' was defined in depth of 'UINT8_MAX' which
is 255, while the querying index could be from 0 to 255. The issue
can be fixed with expanding the array to one more element.
Fixes: 9571ea028489 ("i40e: replace some offload flags with unified packet type")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
In order to set ether type of VLAN for single VLAN, inner
and outer VLAN, the VLAN type as an input parameter is added
to 'rte_eth_dev_set_vlan_ether_type()'.
In addition, corresponding changes in e1000, ixgbe and i40e
are also added.
It is an ABI break but ethdev library is already bumped for 16.04.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
virtio PMD could use IO port to configure the virtio device without
using UIO/VFIO driver in legacy mode.
There are two issues with previous implementation:
1) virtio PMD will take over the virtio device(s) blindly even if not
intended for DPDK.
2) driver conflict between virtio PMD and virtio-net kernel driver.
This patch checks if there is kernel driver other than UIO/VFIO managing
the virtio device before using port IO.
If legacy_virtio_resource_init fails and kernel driver other than
VFIO/UIO is managing the device, return 1 to tell the upper layer we
don't take over this device.
For all other IO port mapping errors, return -1.
Note than if VFIO/UIO fails, now we don't fall back to port IO.
Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
PCIe feature of 'Extended Tag' is important for 40G performance.
It adds its enabling during each port initialization, to ensure
the high performance.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Fixed issue of byte order in ethdev library that the structure
for setting fdir's mask and flow entry is inconsist and made
inputs of mask be in big endian.
Fixes: 2d4c1a9ea2ac ("ethdev: add new flow director masks")
Fixes: 76c6f89e80d4 ("ixgbe: support new flow director masks")
Reported-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include rte_mbuf.h file directly/indirectly hence no
additioanl header file inclusion is necessary.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
ConnectX-4 NICs can handle at most 512 entries in RETA table.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The physically linked-together combined library has been an increasing
source of problems, as was predicted when library and symbol versioning
was introduced. Replace the complex and fragile construction with a
simple linker script which achieves the same without all the problems,
remove the related kludges from eg mlx drivers.
Since creating the linker script is practically zero cost, remove the
config option and just create it always.
Based on a patch by Sergio Gonzales Monroy, linker script approach
initially suggested by Neil Horman.
Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move all os / arch specifics to eal.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
According to the api, rte_eal_pci_map_device is only successful when
returning 0.
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Fixes: c52afa68d763 ("virtio: move left PCI stuff in the right file")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The file rte_config.h is automatically generated and included.
No need to #include it.
The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
As discussed on list, switch numbering scheme to be based on year/month.
Release 2.3 then becomes 16.04.
Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html
Also, added zero padding to the month so that it appear as 16.04 and
not 16.4 in "make showversion" and rte_version().
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses in return a function like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Modern (v1.0) virtio pci device defines several pci capabilities.
Each cap has a configure structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.
Firstly, we map the pci resources by rte_eal_pci_map_device().
We then could easily locate a cfg structure by:
cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;
Therefore, the entrance of enabling modern (v1.0) pci device support
is to iterate the pci capability lists, and to locate some configs
we care; and they are:
- common cfg
For generic virtio and virtqueue configuration, such as setting/getting
features, enabling a specific queue, and so on.
- nofity cfg
Combining with `queue_notify_off' from common cfg, we could use it to
notify a specific virt queue.
- device cfg
Where virtio_net_config structure is located.
- isr cfg
Where to read isr (interrupt status).
If any of above cap is not found, we fallback to the legacy virtio
handling.
If succeed, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing a (or few) specific
configuration space from above 4 cfg structures. And that's basically
how this patch works.
Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.
Last, set the VIRTIO_F_VERSION_1 feature flag.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
not hardcode hdr_size to "sizeof(struct virtio_net_hdr)" any more
at virtio_recv_pkts(), otherwise, there would be a mismatch of
hdr size from rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading a packet corruption.
Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Switch to 64 bit features, which virtio 1.0 supports.
While legacy virtio only supports 32 bit features, it complains aloud
and quit when trying to setting > 32 bit features.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementation regarding to a
specific pci action, such as read host status.
With that, this patch reimplements all exported pci functions, in
a way like:
vtpci_foo_bar(struct virtio_hw *hw)
{
hw->vtpci_ops->foo_bar(hw);
}
So that we need pay attention to those pci related functions only
while adding virtio 1.0 support.
This patch introduced a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: just sets
hw->vtpci_ops to legacy_ops as we don't support 1.0 yet.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
offset arg of vtpci_read/write_dev_config is derived from offsetof(),
which is of size_t type, instead of uint64_t. So, define it as size_t
type.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
As we have already set up it at virtio_dev_queue_setup(), and a vq
restart will not reset the settings.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The x550 MDIO clock speed must be configured prior to first MDIO read or
write. The default MDIO clock speed is not valid, therefore the driver
is configuring a valid speed prior to reading the copper PHY device id.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch removes KR PHY reset from ixgbe_init_phy_ops_X550em. Since
this function is meant to initialize function pointers for detected PHY
type. Internal PHY reset was moved to ixgbe_setup_internal_phy_t_x550em
which will now detect which mode does internal PHY work in, and setup it
as required.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
KR auto-neg mode is what we will be using going forward. The SW
interface for this mode is different than what was used for iXFI.
While debugging, it was determined that the ucode diagnostic was
no longer needed. This code has been removed to simplify the init
flow.
A subtle semaphore error in the CS4227 reset flow was fixed.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Avoid a needless PHY access on copper phys to save the 10ms wait
time for each PHY access. A helper function is introduced to
actually do the register access and process the contents.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch changes code to use registers offsets stored in mvals table
instead of values defined statically.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch adds ixgbe_set_fdir_drop_queue_82599 for enabling and
setting flow director drop queue, and adds sets drop no match in
ixgbe_init_fdir_perfect_82599 for x550.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Waiting for FDIRCMD completion is an expensive thing to do in the
transmit hot path. This wait was added to catch problems with perfect
filter rules, and, at least in the Linux driver, there is no error
check anyway, so there is no point to adding the delay. So do not wait
for completion. Change the return of the function to void, since it has
no meaningful return value.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch adds the flow control ethertype to the defines for the
ETQF filter list. This only adds the define. Each driver
can add this ethertype to the filter. This is needed to prevent
denial of service by malicious VFs sending out flow control
packets.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Currently credit_refill and credit_max could be zero for a TC and that
is causing Tx hang for CEE mode configuration, so to fix that have at
min credit assigned to a TC and that is as what IEEE mode already does.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Coverity issue reported like
CID 119268 (#1 of 1): Unintended sign extension
(SIGN_EXTENSION)sign_extension: Suspicious implicit sign extension:
vsi_id with type unsigned short (16 bits, unsigned) is promoted in
vsi_id << 23 to type int (32 bits, signed), then sign-extended to type
unsigned long (64 bits, unsigned). If vsi_id << 23 is greater than
0x7FFFFFFF, the upper bits of the result will all be 1.
Fixes: 88ebc2b7f976 ("i40e: extend flow director to support VF")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
In FreeBsd driver, the max frame size is changed to MTU, but not
keep the default value defined in DataSheet. When DPDK runs on that
NIC, the configured value is not expected.
This patch sets the max frame size to default when initialization.
Fixes: 4861cde46116 ("i40e: new poll mode driver")
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This counter was left unmodified. Restore it in ixgbe_dev_stats_get.
The ierrors counter still includes imissed for ixgbe. This behaviour is
not consistent amongst all drivers. Another patch may be needed to unify
the meaning of the ierrors counter.
Fixes: 5e50ad1c1b63 ("ixgbe: add specific stats")
Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Fragmented IPv4 packets have no TCP/UDP headers, so we hashed
random data introducing reordering of the fragments.
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
The following messages might appear after some idle time:
"PMD: Failed to allocate LACP packet from pool"
The fix ensures the mempool size is greater than the sum
of TX descriptors.
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Add BNX2X PMD version, print it as part of adapter info.
Adjusted print adapter info output formatting.
This patch versions BNX2X PMD at 1.0.0.
Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>