3256 Commits

Author SHA1 Message Date
Jianfeng Tan
b08b8cfeb2 vhost: fix IP checksum
There is no way to bypass IP checksum verification in the Linux
kernel, whether skb->ip_summed is set to CHECKSUM_UNNECESSARY
or CHECKSUM_PARTIAL.

So any packets with bad IP checksum will be dropped at VM IP layer.

To correct this, we check the PKT_TX_IP_CKSUM flag and calculate the
IP csum when it is set.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Cc: stable@dpdk.org

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-07-02 01:28:34 +02:00
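A minimal sketch of the idea behind b08b8cfeb2, not the actual vhost code: when the sender requested IP checksum offload, compute the IPv4 header checksum in software before handing the packet to the guest. The names sketch_ipv4_cksum, sketch_fix_ip_cksum and SKETCH_TX_IP_CKSUM are hypothetical, and the flag value is illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the PKT_TX_IP_CKSUM bit; value is illustrative. */
#define SKETCH_TX_IP_CKSUM (1ULL << 0)

/*
 * One's-complement sum (RFC 1071 style) over an IPv4 header.
 * hdr_len is assumed to be even, which always holds for IPv4 headers,
 * and the checksum field (bytes 10 and 11) must be zero on entry.
 */
static uint16_t sketch_ipv4_cksum(const uint8_t *hdr, size_t hdr_len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < hdr_len; i += 2)
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Fill in the checksum only when IP checksum offload was requested. */
static void sketch_fix_ip_cksum(uint8_t *hdr, size_t hdr_len, uint64_t ol_flags)
{
	uint16_t csum;

	if (!(ol_flags & SKETCH_TX_IP_CKSUM) || hdr_len < 20)
		return;
	hdr[10] = 0;
	hdr[11] = 0;
	csum = sketch_ipv4_cksum(hdr, hdr_len);
	hdr[10] = (uint8_t)(csum >> 8);
	hdr[11] = (uint8_t)(csum & 0xff);
}
```

Without the flag the header is left untouched, so a sender that already wrote a valid checksum is not penalised.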
Jianfeng Tan
46b7a8372d vhost: fix TCP checksum
As PKT_TX_TCP_SEG flag in mbuf->ol_flags implies PKT_TX_TCP_CKSUM,
applications, e.g., testpmd, don't set PKT_TX_TCP_CKSUM when TSO
is set.

As a result, packets get dropped in the VM TCP stack because of a
bad TCP csum.

To fix this, we make sure TCP NEEDS_CSUM info is set into virtio net
header when PKT_TX_TCP_SEG is set, so that VM tcp stack will not
check the TCP csum.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Cc: stable@dpdk.org

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-07-02 01:28:22 +02:00
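The rule described in 46b7a8372d can be sketched as follows. This is an illustration under assumptions, not the vhost implementation: the flag values, the header struct and the function name are hypothetical, and only the flag names mirror those in the commit message.

```c
#include <stdint.h>

/* Hypothetical flag values; only the names echo the commit message. */
#define SKETCH_TX_TCP_CKSUM (1ULL << 0)
#define SKETCH_TX_TCP_SEG   (1ULL << 1)
/* Hypothetical stand-in for the "needs checksum" bit of the virtio net header. */
#define SKETCH_HDR_F_NEEDS_CSUM 1u

struct sketch_virtio_hdr {
	uint8_t flags;
};

static void sketch_fill_hdr(struct sketch_virtio_hdr *hdr, uint64_t ol_flags)
{
	/* TSO implies TCP checksum offload even if the application
	 * did not set the checksum flag explicitly. */
	if (ol_flags & SKETCH_TX_TCP_SEG)
		ol_flags |= SKETCH_TX_TCP_CKSUM;

	hdr->flags = 0;
	if (ol_flags & SKETCH_TX_TCP_CKSUM)
		hdr->flags |= SKETCH_HDR_F_NEEDS_CSUM;
}
```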
Daniel Verkamp
3cb502b310 vhost: clean up per-socket mutex
vsocket->conn_mutex was allocated with pthread_mutex_init() but never
freed with pthread_mutex_destroy().  This is a potential memory leak,
depending on how pthread_mutex_t is implemented.

Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-07-02 01:16:31 +02:00
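The pairing that 3cb502b310 restores can be sketched like this. The struct and function names are hypothetical; only pthread_mutex_init() and pthread_mutex_destroy() are real calls.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-socket state holding the connection mutex. */
struct sketch_socket {
	pthread_mutex_t conn_mutex;
};

static struct sketch_socket *sketch_socket_create(void)
{
	struct sketch_socket *s = calloc(1, sizeof(*s));

	if (s == NULL)
		return NULL;
	if (pthread_mutex_init(&s->conn_mutex, NULL) != 0) {
		free(s);
		return NULL;
	}
	return s;
}

/* Every successful pthread_mutex_init() gets a matching destroy. */
static int sketch_socket_destroy(struct sketch_socket *s)
{
	int ret;

	if (s == NULL)
		return 0;
	ret = pthread_mutex_destroy(&s->conn_mutex);
	free(s);
	return ret;
}
```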
Harish Patil
d95188551f mbuf: introduce new Tx offload flag for MPLS-in-UDP
Some PMDs need to know the tunnel type in order to handle advanced TX
features. This patch adds a new TX offload flag for MPLS-in-UDP packets.

Signed-off-by: Harish Patil <harish.patil@cavium.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-07-06 15:00:57 +02:00
Stephen Hemminger
463ced957c pci: increase domain storage to 32 bits
In some environments, the PCI domain can be larger than 16 bits.
For example, a PCI device passed through in Azure gets a synthetic domain
id which is internally generated based on GUID. The PCI standard does
not restrict domain to be 16 bits.

This change breaks ABI for APIs that expose PCI address structure.

The printf format for PCI remains unchanged, so that on most
systems (with only 16 bit domain) the output format is unchanged
and is 4 characters wide.  For example: 0000:00:01.0
Only on systems with higher bits will the domain take up more
space; example: 12000:00:01.0

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-07-06 01:28:02 +02:00
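The formatting behaviour described in 463ced957c can be illustrated with a small sketch. The struct and function names are hypothetical; the point is that a minimum-width (precision 4) hexadecimal format keeps 16-bit domains at four characters and only grows for larger values.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical PCI address with a 32-bit domain. */
struct sketch_pci_addr {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/* Format as domain:bus:devid.function; the domain is at least 4 hex digits. */
static int sketch_pci_format(const struct sketch_pci_addr *a,
			     char *buf, size_t len)
{
	return snprintf(buf, len, "%.4" PRIx32 ":%.2x:%.2x.%x",
			a->domain, (unsigned int)a->bus,
			(unsigned int)a->devid, (unsigned int)a->function);
}
```

With domain 0 this yields 0000:00:01.0, and with domain 0x12000 it yields 12000:00:01.0, matching the two examples in the commit message.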
Stephen Hemminger
c023dabc07 pci: remove unnecessary casts in address parsing
The function strtoul returns unsigned long and can be directly
assigned to a smaller type. Removing the casts allows easier
expansion of PCI domain.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-07-06 01:27:19 +02:00
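A rough sketch of strtoul()-based address parsing in the spirit of c023dabc07. The names are hypothetical, and the explicit range checks are an addition for this illustration rather than something the commit message describes.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parsed PCI location. */
struct sketch_pci_loc {
	uint32_t domain;
	uint8_t bus;
	uint8_t devid;
	uint8_t function;
};

/*
 * Parse one hexadecimal field that must be followed by sep
 * (use '\0' for the last field). Returns 0 on success, -1 on error.
 */
static int sketch_parse_field(const char **p, char sep,
			      unsigned long max, unsigned long *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(*p, &end, 16);
	if (errno != 0 || end == *p || *end != sep || v > max)
		return -1;
	*out = v;
	*p = (sep == '\0') ? end : end + 1;
	return 0;
}

static int sketch_pci_parse(const char *s, struct sketch_pci_loc *loc)
{
	unsigned long dom, bus, dev, fn;

	if (sketch_parse_field(&s, ':', UINT32_MAX, &dom) ||
	    sketch_parse_field(&s, ':', UINT8_MAX, &bus) ||
	    sketch_parse_field(&s, '.', UINT8_MAX, &dev) ||
	    sketch_parse_field(&s, '\0', 7, &fn))
		return -1;

	/* strtoul() returns unsigned long; after the range checks the
	 * values fit and are assigned to the narrower fields without casts. */
	loc->domain = dom;
	loc->bus = bus;
	loc->devid = dev;
	loc->function = fn;
	return 0;
}
```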
Ferruh Yigit
a1e7c17555 ethdev: use device name from device structure
Device name resides in two different locations, in rte_device->name and
in ethernet device private data.

For now, the copy in the ethernet device private data is required for
multi-process support: the name is how a secondary process finds the
primary process device.

But in the ethdev library some eth_dev->data->name usage can be
converted to rte_device->name.

This patch updates ethdev to use rte_device->name when possible.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-07-06 00:17:11 +02:00
Ferruh Yigit
48d8675c9c ethdev: ensure same name size for device and ethdev
rte_device->name is copied into eth_dev->name. Right now the size is the
same for both, but the requirement is not made explicit.

This patch highlights the relation without changing actual sizes.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-07-06 00:16:15 +02:00
Qi Zhang
a3a2e2c8f7 ethdev: add fuzzy match in flow API
Add new meta pattern item RTE_FLOW_ITEM_TYPE_FUZZY in flow API.

This is for devices that support a fuzzy match option.
Usually a fuzzy match is fast, but the cost is accuracy.
For example, Signature Match only matches the pattern's hash value, and
it is possible that two different patterns have the same hash value.

The matching accuracy level can be configured by the threshold subfield.
A driver can divide the range of threshold and map it to the different
accuracy levels that the device supports.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2017-07-05 19:51:56 +02:00
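The accuracy trade-off mentioned in a3a2e2c8f7 can be illustrated with a toy example. This is not how any device computes signatures: the byte-sum hash is deliberately weak so that a collision is easy to show, and all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Deliberately weak toy signature (byte sum modulo 256). */
static uint8_t sketch_sig(const uint8_t *p, size_t n)
{
	uint8_t s = 0;
	size_t i;

	for (i = 0; i < n; i++)
		s = (uint8_t)(s + p[i]);
	return s;
}

/* Exact match: compares every byte of the two patterns. */
static int sketch_exact_match(const uint8_t *a, const uint8_t *b, size_t n)
{
	return memcmp(a, b, n) == 0;
}

/* Fuzzy (signature) match: compares only the hash values, so two
 * different patterns with the same hash are reported as a match. */
static int sketch_fuzzy_match(const uint8_t *a, const uint8_t *b, size_t n)
{
	return sketch_sig(a, n) == sketch_sig(b, n);
}
```

Here the patterns {1, 2} and {2, 1} differ but share a signature, so the fuzzy match accepts them while the exact match does not.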
Jianfeng Tan
641566b38b eal: fix config file path when checking process
When primary process is booted with --file-prefix option, the API,
rte_eal_primary_proc_alive(), uses a wrong config file path to
check if primary process is alive.

Fix it by calling helper function to get config file path.

Fixes: dd3e00138d74 ("eal: check if primary process is alive")
Cc: stable@dpdk.org

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-07-05 15:17:05 +02:00
Jianfeng Tan
a33c81e38c ethdev: fix secondary process crash on unused virtio
Suppose we have 2 virtio devices for a VM, with only the first one,
virtio0, binding to igb_uio. Start a primary DPDK process, driving
only virtio0. Then start a secondary DPDK process: it encounters a
segfault in eth_virtio_dev_init() because hw is NULL when trying
to initialize the 2nd virtio device.
        if (!hw->virtio_user_dev) {

We could add a precheck to return error when hw is NULL. But the
root cause is that virtio devices which are not driven by the primary
process are not excluded by the secondary eal probe function.

To support legacy virtio devices not bound to any kernel driver, we
removed RTE_PCI_DRV_NEED_MAPPING in
commit 962cf902e6eb ("pci: export device mapping functions").
At the boot of primary process, ether dev is allocated in rte_eth_devices
array, rte_eth_dev_data is also allocated in rte_eth_dev_data array; then
probe function fails; and ether dev is released. However, the entry in
rte_eth_dev_data array is not cleared. Then we start secondary process,
and try to attach the virtio device that is not used in primary process,
the field, dev_private (or hw), in rte_eth_dev_data, is NULL.

To fail the dev attach, we need to clear the field, name, when we
release any ether devices in primary, so that below loop in
rte_eth_dev_attach_secondary() will not find any matched names.
        for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
                if (strcmp(rte_eth_dev_data[i].name, name) == 0)
                        break;
        }

Fixes: 6d890f8ab512 ("net/virtio: fix multiple process support")
Cc: stable@dpdk.org

Reported-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-07-05 12:10:40 +02:00
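The effect of clearing the name on release, as described in a33c81e38c, can be sketched with a small stand-alone model. The table, sizes and function names are hypothetical; only the lookup loop follows the one quoted in the commit message.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_PORTS 4
#define SKETCH_NAME_LEN  64

/* Hypothetical model of the shared per-port data table. */
struct sketch_dev_data {
	char name[SKETCH_NAME_LEN];
	void *dev_private;
};

static struct sketch_dev_data sketch_dev_data[SKETCH_MAX_PORTS];

static void sketch_alloc(int port, const char *name)
{
	snprintf(sketch_dev_data[port].name, SKETCH_NAME_LEN, "%s", name);
}

/* On release, clear the name so that later lookups cannot match it. */
static void sketch_release(int port)
{
	memset(sketch_dev_data[port].name, 0, SKETCH_NAME_LEN);
	sketch_dev_data[port].dev_private = NULL;
}

/* Same shape as the lookup loop above: returns the index or -1. */
static int sketch_find(const char *name)
{
	int i;

	for (i = 0; i < SKETCH_MAX_PORTS; i++) {
		if (strcmp(sketch_dev_data[i].name, name) == 0)
			return i;
	}
	return -1;
}
```

If sketch_release() left the name in place, sketch_find() would still return the stale entry whose dev_private is NULL, which is the situation that crashed the secondary process.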
Olivier Matz
ad10c17821 mem: do not advertise physical address when no hugepages
When populating a mempool with a virtual memory area, the mempool
library expects to be able to get the physical address of each page.

When started with --no-huge, the physical addresses may not be available
because the pages are not locked in memory. It sometimes returns
RTE_BAD_PHYS_ADDR, which makes the mempool_populate() function fail.

This was working before the commit cdc242f260e7 ("eal/linux: support
running as unprivileged user"), because rte_mem_virt2phy() was returning
0 instead of RTE_BAD_PHYS_ADDR, which was seen as a valid physical
address.

Since --no-huge is a debug function that breaks the support of physical
drivers, always set physical addresses to RTE_BAD_PHYS_ADDR in memzones
or in rte_mem_virt2phy(), and ensure that mempool won't complain in that
case.

Fixes: cdc242f260e7 ("eal/linux: support running as unprivileged user")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 17:51:22 +02:00
Jianbo Liu
3c4b4024c2 arch/arm: add vcopyq_laneq_u32 for old gcc
Implement vcopyq_laneq_u32 if gcc version is lower than 7.

Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-07-04 17:41:53 +02:00
Ashwin Sekhar T K
a566400e8b net: implement CRC for ARM64 NEON
Added CRC compute APIs for arm64 utilizing the pmull
capability.

Added new file net_crc_neon.h to hold the arm64 pmull
CRC implementation.

Added wrappers in rte_vect.h for those neon intrinsics
which are not supported in GCC version < 7.

Verified the changes with crc_autotest unit test case

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-07-04 15:58:45 +02:00
Ashwin Sekhar T K
266451e419 eal: move gcc version definition to common header
Moved the definition of GCC_VERSION from lib/librte_table/rte_lru.h
to lib/librte_eal/common/include/rte_common.h.

Tested compilation on:
 * arm64 with gcc
 * x86 with gcc and clang

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-07-04 15:57:22 +02:00
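For readers unfamiliar with this kind of macro, 266451e419 concerns a compiler version number that code can compare against. The sketch below uses a hypothetical name and assumes the common encoding major * 10000 + minor * 100 + patchlevel; the commit message itself does not spell out the encoding. Note that clang also defines the __GNUC__ macros, so a check like this does not by itself identify gcc.

```c
/* Hypothetical name; assumed encoding: major*10000 + minor*100 + patch. */
#if defined(__GNUC__)
#define SKETCH_GCC_VERSION \
	(__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define SKETCH_GCC_VERSION 0
#endif

/* Example use: decide whether wrappers for intrinsics that are
 * missing before version 7 would be needed. */
static int sketch_needs_wrappers(void)
{
	return SKETCH_GCC_VERSION < 70000;
}
```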
Bruce Richardson
887c272fab table: remove check for SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
the scalar version on x86.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:39:18 +02:00
Bruce Richardson
ff1b2b39d6 sched: remove check for SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
to check for its presence any more.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:39:18 +02:00
Bruce Richardson
e08555a041 net: remove check for SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
to check for its presence any more.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:35:41 +02:00
Bruce Richardson
3f50cf9075 ip_frag: check for x86 rather than SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
to check for its presence any more.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:35:41 +02:00
Bruce Richardson
4f4cd8717e hash: remove checks for SSE
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
a fallback case to handle selection of algorithm when SSE4 is unavailable.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2017-07-04 14:35:41 +02:00
Bruce Richardson
673e2fe586 distributor: remove checks for SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we no longer
need this check.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:35:41 +02:00
Bruce Richardson
35320649fa acl: remove checks for SSE4
Since SSE4 is now part of the minimum requirements for DPDK, we no longer
need this check.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-07-04 14:35:41 +02:00
Bruce Richardson
f46e442ca0 eal: remove unneeded conditionals for SSE headers
Our x86 baseline is to have support for SSE4.2, so therefore there is no
point in conditions around the inclusion of SSE1 - SSE4 headers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2017-07-04 14:35:37 +02:00
Tiwei Bie
190ce8645e contigmem: do not zero pages during each mmap
Don't zero the pages during each mmap. Instead, only zero the pages
when they are not already mmapped. Otherwise, the multi-process
support will be broken, as the pages will be zeroed when secondary
processes map the memory. Besides, track the open and mmap operations
on the cdev, and prevent the module from being unloaded when it is
still in use.

Fixes: 82f931805506 ("contigmem: zero all pages during mmap")
Cc: stable@dpdk.org

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-07-04 01:32:57 +02:00
Tiwei Bie
5f51eca224 contigmem: free allocated memory on error
Fixes: 764bf26873b9 ("add FreeBSD support")
Cc: stable@dpdk.org

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-07-04 01:32:28 +02:00
Jan Blunck
0bba9e6050 eal: use new hotplug API in attach
Using the new hotplug API allows attach to be backwards compatible while
decoupling it from the concrete bus implementations.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:22:19 +02:00
Jan Blunck
cbb4c648c5 ethdev: use device handle to detach
This is changing the API of rte_eal_dev_detach().

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2017-07-04 01:22:19 +02:00
Jan Blunck
a3ee360f44 eal: add hotplug add/remove device
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:10:24 +02:00
Gaetan Rivet
00e62aae69 bus/pci: implement plug/unplug operations
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:09:33 +02:00
Jan Blunck
96f54a07c8 bus/vdev: implement unplug operation
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:09:17 +02:00
Jan Blunck
7c8810f43f bus: introduce device plug/unplug
This allows the buses to plug and probe specific devices.
This is meant to be a building block for hotplug support.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:08:42 +02:00
Jan Blunck
2f517390e5 bus: add helper to find bus by name
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:08:36 +02:00
Jan Blunck
95d57b2b03 bus: add helper to find which bus holds a device
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:08:28 +02:00
Jan Blunck
dd288f0dfb bus: require to implement device finding
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:08:27 +02:00
Jan Blunck
9a58384b74 bus/pci: implement method to find device
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2017-07-04 01:08:21 +02:00
Jan Blunck
7729daf9ed bus/vdev: implement method to find device
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:08:17 +02:00
Jan Blunck
3a8f0bc68a bus: add method to find device
This new method allows buses to expose their devices in a controlled
manner. A comparison function is provided by the user to discriminate
between devices, using arbitrary data as identifier.

It is possible to start an iteration from a specific point, in order to
continue a search.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:08:13 +02:00
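The lookup style described in 3a8f0bc68a (a user-supplied comparison function, opaque data, and an optional starting point) can be sketched as below. The types and names are hypothetical and the convention that the comparison returns 0 on a match is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device node in a bus-owned singly linked list. */
struct sketch_device {
	const char *name;
	struct sketch_device *next;
};

/* Comparison callback: returns 0 when dev matches data (sketch convention). */
typedef int (*sketch_dev_cmp_t)(const struct sketch_device *dev,
				const void *data);

/*
 * Return the first matching device. When start is NULL the search begins
 * at head; otherwise it resumes with the device following start.
 */
static struct sketch_device *
sketch_find_device(struct sketch_device *head,
		   const struct sketch_device *start,
		   sketch_dev_cmp_t cmp, const void *data)
{
	struct sketch_device *dev = (start != NULL) ? start->next : head;

	for (; dev != NULL; dev = dev->next) {
		if (cmp(dev, data) == 0)
			return dev;
	}
	return NULL;
}

/* Example comparison: match devices whose name starts with a prefix. */
static int sketch_cmp_prefix(const struct sketch_device *dev, const void *data)
{
	const char *prefix = data;

	return strncmp(dev->name, prefix, strlen(prefix));
}
```

Passing the previous result as start continues the search, so a caller can visit every matching device without the bus exposing its list directly.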
Jan Blunck
87bfa873af bus: add iterator to find a bus
This helper allows iterating over all registered buses to find one
matching the data passed as parameter.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:08:11 +02:00
Gaetan Rivet
fea892e35f bus/vdev: use standard bus registration
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-07-04 01:07:53 +02:00
Jerin Jacob
577329e66b eal: switch to architecture specific pause function
Remove rte_pause() definition from rte_common.h and
switchover to architecture specific rte_pause.h

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-07-03 23:58:51 +02:00
Jerin Jacob
ad0c241386 eal/ppc64: add empty pause function
The patch does not provide any functional change for ppc64
with respect to existing rte_pause() definition.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
2017-07-03 23:58:51 +02:00
Jerin Jacob
d2f8d65f6e eal/x86: copy pause function
The patch does not provide any functional change for x86
with respect to existing rte_pause() definition.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-07-03 23:58:51 +02:00
Jerin Jacob
dfd33f01cd eal/arm64: add pause function
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-07-03 23:58:51 +02:00
Jerin Jacob
b8d08b0dc3 eal/arm32: add empty pause function
The patch does not provide any functional change for ARM32
with respect to existing rte_pause() definition.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-07-03 23:58:51 +02:00
Jerin Jacob
841e7ae580 eal: introduce architecture specific pause function
Each architecture may have different instructions for optimized
and power consumption aware rte_pause() implementation.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-07-03 23:57:49 +02:00
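A sketch of what an architecture-specific pause can look like, as motivated in 841e7ae580. The function names are hypothetical and the per-architecture instruction choices (the x86 pause intrinsic, the arm64 yield hint, an empty fallback) are assumptions of this sketch rather than a copy of the EAL headers.

```c
#include <stdatomic.h>

#if defined(__x86_64__)
#include <emmintrin.h>
/* x86: the PAUSE instruction hints that this is a spin-wait loop. */
static inline void sketch_pause(void)
{
	_mm_pause();
}
#elif defined(__aarch64__)
/* arm64: YIELD hints that the core is waiting. */
static inline void sketch_pause(void)
{
	__asm__ volatile("yield" ::: "memory");
}
#else
/* Other architectures: empty fallback. */
static inline void sketch_pause(void)
{
}
#endif

/* Spin until *state equals wanted, pausing between polls
 * instead of busy-waiting at full speed. */
static void sketch_wait_until(atomic_int *state, int wanted)
{
	while (atomic_load_explicit(state, memory_order_acquire) != wanted)
		sketch_pause();
}
```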
Ashwin Sekhar T K
3b557b932c eal/arm: fix build with clang
Fixed warning -Wasm-operand-widths seen with armv8a
clang compilation.

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-07-03 22:28:16 +02:00
Ashwin Sekhar T K
30b156d5ef acl: fix build with ARMv8 clang
Fixed warning -Wunknown-warning-option seen with
armv8a clang compilation.

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-07-03 22:28:10 +02:00
Ashwin Sekhar T K
fa50d3b27a hash: compile ARMv8 CRC32 support conditionally
Compile the armv8a CRC32 support only if the machine
has the CRC extensions, i.e. if RTE_MACHINE_CPUFLAG_CRC32
is defined.

Removed the .arch assembly directives as these are no
longer necessary.

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-07-03 22:27:42 +02:00
Ashwin Sekhar T K
8d55ebcc78 eal: pause while busy-waiting for lcore slave
Instead of simply busy-waiting for slave in rte_eal_wait_lcore()
do rte_pause(). This will give power savings.

This also fixes warning -Wempty-body seen with armv8a clang
compilation.

Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-07-03 22:27:40 +02:00
Ashwin Sekhar T K
47e15e618a table: add NEON implementation of LRU strategy 3
* Added new file rte_lru_arm64.h for holding arm64 specific
  definitions
* Verified the changes with table_autotest unit test case

Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
2017-07-03 17:15:47 +02:00