13453 Commits

Author SHA1 Message Date
Andy Green
dd6f8d712e bus/dpaa: fix inconsistent struct alignment
The actual descriptor for qm_mr_entry is 64-byte aligned.

But the original code plays a trick, and puts a u8 common
to the three descriptor subtypes in the union afterwards
outside their structure definitions.

Unfortunately since they compose a struct qm_fd with
alignment 8, this trick destroys the ability of the compiler
to understand what has happened, resulting in this kind of
problem:

drivers/bus/dpaa/include/fsl_qman.h:354:3: error:
alignment 1 of ‘struct <anonymous>’ is less than 8 [-Werror=packed-not-aligned]
   } __packed dcern;

on gcc 8 / Fedora 28 out of the box.

This patch moves the u8 verb into the structure definitions
composed into the union, so the alignment of the parent struct
containing the alignment 8 object can also be seen to be
alignment 8 by the compiler.  Uses of .verb are fixed up to use
.ern.verb (the same offset of +0 inside all the structs in
the union).

The final struct layout should be unchanged.
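
A minimal sketch of the idea, using hypothetical stand-in types (fd_like,
mr_entry_like) rather than the real definitions in fsl_qman.h, and omitting
the packing attributes for brevity. The verb byte is the first member of
every struct in the union, so it stays at offset +0 and the compiler can see
the alignment of the 8-aligned member:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct qm_fd: any object with alignment 8. */
struct fd_like {
	_Alignas(8) uint64_t words[2];
};

/* Hypothetical, simplified entry: verb is the first member of every
 * struct in the union, so each subtype carries it at offset 0. */
struct mr_entry_like {
	union {
		struct {
			uint8_t verb;
			uint8_t rc;
			uint16_t seqnum;
			uint32_t tag;
			struct fd_like fd;
		} ern;
		struct {
			uint8_t verb;
			uint8_t reserved[7];
			struct fd_like fd;
		} dcern;
	};
};

/* Read the verb through one member; all members alias the same byte. */
static inline uint8_t mr_entry_verb(const struct mr_entry_like *e)
{
	assert(e != NULL);
	return e->ern.verb;
}

/* Compile-time checks of the layout claims made above. */
static_assert(offsetof(struct mr_entry_like, ern.verb) == 0, "verb at +0");
static_assert(offsetof(struct mr_entry_like, dcern.verb) == 0, "verb at +0");
static_assert(_Alignof(struct mr_entry_like) == 8, "alignment 8 is visible");
```

Callers that previously read a top-level verb field would read ern.verb
instead, as mr_entry_verb() does here.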

Fixes: c47ff048b99a ("bus/dpaa: add QMAN driver core routines")
Fixes: f6fadc3e6310 ("bus/dpaa: add QMAN interface driver")
Cc: stable@dpdk.org

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2018-05-14 23:32:23 +02:00
Andy Green
fe5f777b53 bus/pci: replace strncpy by strlcpy
In function ‘pci_get_kernel_driver_by_path’,
    inlined from ‘pci_scan_one.isra.1’ at
	drivers/bus/pci/linux/pci.c:317:8:
drivers/bus/pci/linux/pci.c:57:3: error:
‘strncpy’ specified bound depends on the length of the source argument
[-Werror=stringop-overflow=]
   strncpy(dri_name, name + 1, strlen(name + 1) + 1);
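
The flagged call bounds the copy by the length of the source rather than by
the size of the destination. A minimal sketch of a destination-bounded copy,
using a hypothetical helper copy_bounded() built on snprintf() (this is an
illustration of the idea, not the code of the patch; strlcpy() itself is not
available in every libc):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, never writing more than
 * dst_size bytes and always NUL-terminating the result.
 * Returns 0 on success, -1 if the source had to be truncated. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	int n = snprintf(dst, dst_size, "%s", src);

	return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

The bound passed in is sizeof() of the destination buffer, so the copy cannot
overflow no matter how long the source string is.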

Fixes: d9a8cd9595f2 ("pci: add kernel driver type")
Cc: stable@dpdk.org

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2018-05-14 23:32:23 +02:00
Vipin Varghese
f5fd98c802 net/tap: add default name to tun
This change adds a default name for the TUN PMD instance. If the name
option is not passed, the default dtun is used.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:32:23 +01:00
Hyong Youb Kim
2b9919feab net/enic: fix missing offload capabilities
Add the following missing flags to the advertised offloads.
- DEV_RX_OFFLOAD_CRC_STRIP
  CRC is always stripped.
- DEV_RX_OFFLOAD_JUMBO_FRAME
  Jumbo support is always enabled on the NIC.
- DEV_RX_OFFLOAD_SCATTER
  Scatter Rx is currently supported.
- DEV_TX_OFFLOAD_MULTI_SEGS
  Multiple-segment transmit has always been supported.

Fixes: 93fb21fdbe23 ("net/enic: enable overlay offload for VXLAN and GENEVE")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
2018-05-14 22:32:23 +01:00
Matan Azrad
59056833cc net/bonding: fix slave activation simultaneously
The bonding PMD decides whether to activate or deactivate its slaves
according to the slaves' link statuses.
Thus, it registers for the LSC events of the slave ports and
activates/deactivates them from its LSC callbacks, which are called
asynchronously by the host thread when a slave's link status changes.

In addition, the bonding PMD uses the same callback to activate a slave
when it tries to start it; this operation is probably called by the
master thread.

Consequently, a slave may be activated at the same time by two
different threads, which can cause a variety of errors; for example,
recreating the slave mempool with the same name fails.

Synchronize the critical section in the LSC callback using a new
dedicated spinlock.
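
A minimal sketch of the locking pattern, assuming a hypothetical bond_like
structure and a C11 atomic_flag used as the spinlock (the PMD itself would
use its own structures and the DPDK spinlock API):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-bond state; not the bonding PMD's real structures. */
struct bond_like {
	atomic_flag lsc_lock; /* spinlock guarding slave (de)activation */
	bool slave_active;
	int activations;      /* how many times activation really ran */
};

static void bond_like_init(struct bond_like *b)
{
	assert(b != NULL);
	atomic_flag_clear(&b->lsc_lock);
	b->slave_active = false;
	b->activations = 0;
}

/* May be called from the LSC callback thread or from the start path. */
static void bond_like_link_up(struct bond_like *b)
{
	assert(b != NULL);
	while (atomic_flag_test_and_set_explicit(&b->lsc_lock,
						 memory_order_acquire))
		; /* spin until the other thread leaves the critical section */
	if (!b->slave_active) {
		/* Only the first caller performs the activation. */
		b->slave_active = true;
		b->activations++;
	}
	atomic_flag_clear_explicit(&b->lsc_lock, memory_order_release);
}
```

With the check and the state change inside one critical section, a second
caller observes slave_active already set and does not repeat the activation.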

Fixes: 414b202343ce ("bonding: fix initial link status of slave")
Fixes: a45b288ef21a ("bond: support link status polling")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2018-05-14 22:32:23 +01:00
Jingjing Wu
efe73c0d1d net/avf: fix Rx interrupt mapping
The vector used for Rx mapping is different if WB_ON_ITR
is supported. The mapping table needs to be updated.

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Cc: stable@dpdk.org
2018-05-14 22:32:23 +01:00
Shreyansh Jain
346b02d1dc net/dpaa2: change VLAN strip value to offload flag
Fixes: 0ebce6129bc6 ("net/dpaa2: support new ethdev offload APIs")

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2018-05-14 22:32:23 +01:00
Rahul Lakkireddy
b84bcf4019 net/cxgbe: free resources during uninit
Move freeing up resources from dev_close() to dev_uninit(). This fixes
a NULL pointer dereference when accessing the adapter context needed by
other ports under the same PF after it has been freed by the first port.
This can happen if only the first port is started and the check
to free all resources is still satisfied. When dev_close() is
called for the other ports, the adapter context is NULL since it was
freed by the first port.

Thus, by moving to dev_uninit(), all the ports can be torn down
safely without the need for extra checks.

Fixes: 2195df6d11bd ("net/cxgbe: rework ethdev device allocation")

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
2018-05-14 22:32:23 +01:00
Ophir Munk
97b2217ae5 net/mlx4: advertise supported RSS hash functions
Advertise the RSS hash functions supported by mlx4 as part of the
dev_infos_get callback.
Prior to this commit, RSS support was reported as none. Since the
introduction of [1], all RSS configurations must be verified.

[1] commit 8863a1fbfc66 ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-05-14 22:32:23 +01:00
Ophir Munk
cbd737416c net/mlx4: avoid constant recreations in function
Function mlx4_conv_rss_types() contains constant array variables
which are recreated on every call to the function. Changing the
array definitions from "const" to "static const" avoids these
recreations.
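
A minimal sketch of the difference, using a hypothetical lookup function
conv_type_like() rather than the mlx4 code itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical conversion table lookup, not the mlx4 function. */
static uint64_t conv_type_like(size_t idx)
{
	/* "static const": one read-only copy for the program's lifetime,
	 * instead of an automatic array that may be rebuilt on the stack
	 * at every call. */
	static const uint64_t table[] = { 0x1, 0x4, 0x10, 0x40 };

	assert(idx < sizeof(table) / sizeof(table[0]));
	return table[idx];
}
```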

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-05-14 22:32:23 +01:00
Yongseok Koh
c9ec2192ff net/mlx5: use correct field in a union structure
This is not a bug, but it is better to use the semantically correct field.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:23 +01:00
Yongseok Koh
0cfdc1808d net/mlx5: use coherent I/O memory barrier
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:22 +01:00
Yongseok Koh
5f44cfd011 net/mlx5: fix inlining segmented TSO packet
When a multi-segmented packet is inlined, data can be further inlined even
after the first segment. In the case of a TSO packet, extra inline data after
the TSO header should be carried by an inline DSEG which has a 4B inline
header recording the length of the inline data. If more than one segment is
inlined, the recorded length does not count the data inlined from the second
segment onward. This causes a fault in HW and the CQE will have an error,
which is ignored by the PMD.

Fixes: f895536be4fa ("net/mlx5: enable inlining data from multiple segments")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:22 +01:00
Ferruh Yigit
7c45f6c079 app/testpmd: check if CRC strip offload supported
Testpmd sets the CRC_STRIP offload blindly. This is wrong according to the
offload API definition and causes an error for PMDs that don't support
CRC_STRIP, such as virtual PMDs.

Check whether the underlying device reports this capability and don't set it
if it is not supported.

Fixes: 0074d02fca21 ("app/testpmd: convert to new Rx offloads API")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2018-05-14 22:32:22 +01:00
Ophir Munk
907252079a net/failsafe: add an RSS hash update callback
Add an RSS hash update callback to eth_dev_ops.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2018-05-14 22:32:22 +01:00
Ivan Malov
2d65bf3cfb ethdev: improve doc for name by port ID API
The description of rte_eth_dev_get_name_by_port() calls the
port ID argument a pointer, which is misleading.
Also, the minimum size of the output buffer is not mentioned.
These points need to be improved.

Fixes: bde516d5a85a ("ethdev: get port by name")
Cc: stable@dpdk.org

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:32:22 +01:00
Remy Horton
266c467f60 net/ixgbe: fix missing port representor data-path
This patch adds Rx and Tx burst functions to the ixgbe
Port Representors, so that the implementation within
ixgbe PMD can be tested using applications such as
testpmd which require data-path functionality.

Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports")

Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
2018-05-14 22:32:17 +01:00
Remy Horton
8973508a0c net/i40e: fix missing port representor data-path
This patch adds Rx and Tx burst functions to the i40e Port
Representors, so that the implementation within this PMD
can be tested using applications such as testpmd which
require data-path functionality.

Fixes: e0cb96204b71 ("net/i40e: add support for representor ports")

Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
2018-05-14 22:32:11 +01:00
Yanglong Wu
b447e89e33 ethdev: fix checking Rx/Tx queue status
Relax the check for queue setup, since some devices
may not update queue states during dev_stop.

Fixes: cac923cfea47 ("ethdev: support runtime queue setup")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2018-05-14 22:31:54 +01:00
Beilei Xing
3d4faec985 net/i40e: print original value for global register change
Currently, only the new value is printed when a global
register changes. Also print the original value to help
debugging.

Fixes: bc66b9717c50 ("net/i40e: add debug logs when writing global registers")
Cc: stable@dpdk.org

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:53 +01:00
Zhiyong Yang
201a416517 net/virtio-user: fix multiple queues fail in server mode
This patch fixes multiple queues failure when virtio-user works in
server mode.

This patch adds feature negotiation in the processing of virtio-user
connection and enables multiple-queue pairs.

Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")
Cc: stable@dpdk.org

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-05-14 22:31:53 +01:00
Yanglong Wu
4fa9cd4338 net/i40e: fix missing VLAN offload capability
VLAN offload capability should be exposed in VF
since i40e does support it.

Fixes: c3ac7c5b0b8a ("net/i40e: convert to new Rx offloads API")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:53 +01:00
Qi Zhang
399421100e net/i40e: fix missing mbuf fast free offload
Expose the missing mbuf fast free capability since i40e does
support it.

Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:53 +01:00
Matan Azrad
bafa9aa0d7 ethdev: fix port removal notification timing
When an ethdev port is released, a destroy event is triggered to notify
the users about the released port.

Shortly before the destroy event is triggered, the port becomes invalid
because its state is changed to UNUSED and its data is cleaned. Therefore,
the port is invalid during the destroy event callback process and the users
may get wrong information about the port.

Emit the destroy event before the port is invalidated.

Fixes: 133b54779aa1 ("ethdev: fix port data reset timing")
Fixes: 29aa41e36de7 ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Matan Azrad
7fda13d3a5 net/failsafe: fix sub-device ownership race
There is a window between the probing of the sub-device port by the
sub-device PMD and the fail-safe port taking ownership of it.

During this window, the port is available for use by the application. For
example, the port is exposed to applications which use the
RTE_ETH_FOREACH_DEV iterator.

Thus, ownership-unaware applications may manage the port during this window,
which may cause many kinds of problematic behavior in the fail-safe
sub-device initialization.

Register for the ethdev NEW event to take ownership of the sub-device port
before it becomes exposed to the application.

Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
be8cd21037 ethdev: fix port probing notification
The new device was notified as soon as it was allocated.
This could lead to using a device which is not yet initialized.

The notification must be published after the initialization is done
by the PMD, but before the state is changed, in order to let
notified entities take ownership before general availability.

Fixes: 29aa41e36de7 ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
e06227e2fa ethdev: fix port visibility before initialization
The port was set to the state ATTACHED during allocation.
The consequence was to iterate over ports which are not initialized.

The state ATTACHED is now set as the last step of probing.

The uniqueness of port name is now checked before the availability
of a port id for allocation (order reversed).

As the state is not set on allocation anymore, it is also not checked
in the function telling whether a port is allocated or not.
The name of the port is set on allocation, so it is enough as a check.

Fixes: 5588909af21b ("ethdev: add device iterator")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Matan Azrad
ac7d3b6ddf ethdev: add lock to port allocation check
When comparing the port name, there can be a race condition with
a thread allocating a new port and writing the name at the same time.
This can lead to an erroneous match against a partially written name.

The check of the port is now considered a critical section
protected with locks.

This fix will be even more necessary for multi-process when the
port availability relies only on the name, in a following patch.

Fixes: 84934303a17c ("ethdev: synchronize port allocation")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Matan Azrad
33c73aae32 ethdev: allow ownership operations on unused port
Once the state is updated later than at allocation,
we may need to update the ownership of a port which is
still in the UNUSED state.

This will be used to take ownership of a port before it is
declared as available to other entities.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
fbe90cdd77 ethdev: add probing finish function
A new hook function is added and called inside the PMDs at the end
of the device probing:
	- in primary process, after allocating, init and config
	- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from the helper, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
01a98fdd08 drivers/net: use higher level of probing helper for PCI
The drivers avp, bnx2x and liquidio were using the helper function
rte_eth_dev_pci_allocate(), which can be replaced by
rte_eth_dev_pci_generic_probe(), which calls the former.

Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
d5e54c355c ethdev: add doxygen comments for each state
The enum rte_eth_dev_state was not properly documented.
Its values did not appear in the doxygen output
and could be misunderstood.

The state RTE_ETH_DEV_DEFERRED is no longer useful
since the ownership mechanism brings a more flexible categorization.
This state could be removed later.

Fixes: d52268a8b24b ("ethdev: expose device states")
Fixes: cb894d99eceb ("ethdev: add deferred intermediate device state")
Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
Fixes: 7106edc12380 ("ethdev: add devop to check removal status")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Thomas Monjalon
1958142640 net/failsafe: fix sub-device visibility
The iterator function rte_eth_find_next_owned_by(), used by the
iterator macro RTE_ETH_FOREACH_DEV_OWNED_BY, ignores the devices
which are neither ATTACHED nor REMOVED. Thus sub-devices, having
the state DEFERRED, cannot be seen with the ethdev iterator.
The state RTE_ETH_DEV_DEFERRED can be replaced by
RTE_ETH_DEV_ATTACHED + owner.

Fixes: dcd0c9c32b8d ("net/failsafe: use ownership mechanism for slaves")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:52 +01:00
Thomas Monjalon
444e2b7829 ethdev: fix debug log of owner id
The owner id is 64-bit.
In a 32-bit environment, it must be printed with PRIX64.
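
A minimal sketch of portable 64-bit formatting, using a hypothetical helper
format_owner_id() rather than the ethdev logging macro:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical formatter: renders a 64-bit owner id as 16 hex digits. */
static int format_owner_id(char *buf, size_t len, uint64_t owner_id)
{
	assert(buf != NULL && len > 0);
	/* PRIX64 expands to the correct length modifier on both 32-bit and
	 * 64-bit targets; a bare "%lX" is wrong where long is 32 bits. */
	return snprintf(buf, len, "%016" PRIX64, owner_id);
}
```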

Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:52 +01:00
Wei Dai
c73a907187 app/testpmd: add commands to test new offload API
Add the following testpmd run-time commands to support testing of the
new Rx offload API:
show port <port_id> rx_offload capabilities
show port <port_id> rx_offload configuration
port config <port_id> rx_offload <offload> on|off
port <port_id> rxq <queue_id> rx_offload <offload> on|off
The last two commands above should be run when the port is stopped.
<offload> can be one of "vlan_strip", "ipv4_cksum", ...

Add the following testpmd run-time commands to support testing of the
new Tx offload API:
show port <port_id> tx_offload capabilities
show port <port_id> tx_offload configuration
port config <port_id> tx_offload <offload> on|off
port <port_id> txq <queue_id> tx_offload <offload> on|off
The last two commands above should be run when the port is stopped.
<offload> can be one of "vlan_insert", "udp_cksum", ...

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2018-05-14 22:31:52 +01:00
Beilei Xing
9153411e44 net/i40e: print global register change info
Global register change info is not printed when
flexible payload is enabled.
This patch changes the macro to print the global
register change info.

Fixes: d2f9fe8ae309 ("net/i40e: turn off flexible payload on driver init")
Cc: stable@dpdk.org

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
2018-05-14 22:31:52 +01:00
Andrew Rybchenko
4436cf988b net/sfc: fix inner TCP/UDP checksum offload control
If an application uses the Tx offload API and sets the ETH_TXQ_FLAGS_IGNORE
flag, it should still have inner TCP/UDP checksum offload enabled if it is
supported and TCP/UDP checksum offload is requested.

Fixes: c78d280e88ef ("net/sfc: convert to new Tx offload API")
Cc: stable@dpdk.org

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-05-14 22:31:52 +01:00
Qiming Yang
dbb36bb2a1 doc: add XXV710 support in i40e guide
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
2018-05-14 22:31:52 +01:00
Hyong Youb Kim
9713ece25c net/enic: fix flow drop action
Drop is a fate-deciding action, so mark it as FATE. It was missing in
a previous commit.

Fixes: cc17feb90413 ("ethdev: alter behavior of flow API actions")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
2018-05-14 22:31:52 +01:00
Wei Dai
acca1807e7 net/e1000: report Tx multi segment offload
This feature has been confirmed with testpmd:
testpmd> set fwd txonly
testpmd> port stop all
testpmd> port config all txd 1024
testpmd> set txsplit on
testpmd> set txpkts 70,80,90,100
testpmd> start
UDP packets with a UDP data length of 298 bytes
can be observed at the peer port.
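
The 298-byte figure follows from the segment lengths given to "set txpkts".
A small worked check, assuming untagged Ethernet and an IPv4 header without
options (the helper name and constants are hypothetical, for illustration
only):

```c
#include <assert.h>
#include <stddef.h>

/* Assumed header sizes: Ethernet 14 B, IPv4 20 B (no options), UDP 8 B. */
enum { ETH_HDR_LEN = 14, IPV4_HDR_LEN = 20, UDP_HDR_LEN = 8 };

/* Hypothetical helper: UDP payload length of a frame built from segments. */
static size_t udp_payload_len(const size_t *seg_len, size_t nb_segs)
{
	const size_t hdrs = ETH_HDR_LEN + IPV4_HDR_LEN + UDP_HDR_LEN;
	size_t total = 0;

	assert(seg_len != NULL);
	for (size_t i = 0; i < nb_segs; i++)
		total += seg_len[i];
	assert(total >= hdrs); /* the frame must at least hold the headers */
	return total - hdrs;
}
```

For segments of 70, 80, 90 and 100 bytes the frame is 340 bytes long, and
340 - (14 + 20 + 8) = 298.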

Signed-off-by: Wei Dai <wei.dai@intel.com>
2018-05-14 22:31:52 +01:00
Beilei Xing
b5f6272c24 net/i40e: fix link status update
Link status is not updated correctly: link speed is 0
when the link is up and non-zero when the link is
down. This patch fixes the issue.

Fixes: eef2daf2e199 ("net/i40e: fix link update no wait")
Cc: stable@dpdk.org

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
2018-05-14 22:31:52 +01:00
Qi Zhang
2f203d44ba app/testpmd: fix device configure with zero queue
Setting the number of Rx & Tx queues to 0 in rte_eth_dev_configure() means
taking the driver's default queue number, so during a re-configuration
the previous queue number is overwritten. This is not expected when
we configure DCB. The patch fixes it by re-configuring the device with the
original queue number.

Fixes: 3be82f5cc5e ("ethdev: support PMD-tuned Tx/Rx parameters")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
7d6bf6b866 net/mlx5: add Multi-Packet Rx support
Multi-Packet Rx Queue (MPRQ, a.k.a. Striding RQ) can further save PCIe
bandwidth by posting a single large buffer for multiple packets. Instead of
posting one buffer per packet, one large buffer is posted in order to
receive multiple packets in that buffer. An MPRQ buffer consists of multiple
fixed-size strides and each stride receives one packet.

An Rx packet is mem-copied to a user-provided mbuf if the size of the packet
is comparatively small; otherwise the PMD attaches the Rx packet to the mbuf
by external buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for
external buffers will be allocated and managed by the PMD.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
18bee13096 net/mlx5: add a function to rdma-core glue
mlx5dv_create_wq() is added for the Multi-Packet RQ (a.k.a Striding RQ).

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
3e1f82a1f1 net/mlx5: separate filling Rx flags
Filling in fields of mbuf becomes a separate inline function so that this
can be reused.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
9797bfcce1 net/mlx4: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx4_mr_lookup_cache().
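
A minimal sketch of the L0/L1 step described above, using hypothetical types
and names (mr_ctrl_like, mr_lookup_cache_like(), MR_CACHE_N, LKEY_MISS) that
are not taken from the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MR_CACHE_N 4          /* hypothetical size of the fixed L1 array */
#define LKEY_MISS UINT32_MAX  /* hypothetical "not found" marker */

/* Hypothetical cache entry: [start, end) address range and its key. */
struct mr_cache_entry_like {
	uintptr_t start;
	uintptr_t end;
	uint32_t lkey;
};

struct mr_ctrl_like {
	uint16_t mru;                                 /* last-hit index (L0) */
	struct mr_cache_entry_like cache[MR_CACHE_N]; /* linear array (L1) */
};

static inline uint32_t
mr_lookup_cache_like(struct mr_ctrl_like *c, uintptr_t addr)
{
	assert(c != NULL && c->mru < MR_CACHE_N);

	const struct mr_cache_entry_like *e = &c->cache[c->mru];

	/* L0: check the most recently used entry first. */
	if (addr >= e->start && addr < e->end)
		return e->lkey;
	/* L1: linear search over the small fixed-size array. */
	for (uint16_t i = 0; i < MR_CACHE_N; i++) {
		e = &c->cache[i];
		if (addr >= e->start && addr < e->end) {
			c->mru = i; /* remember the hit for the next L0 check */
			return e->lkey;
		}
	}
	return LKEY_MISS; /* the caller would fall back to the L2 lookup */
}
```

An unused slot with start == end never matches, so a zeroed array behaves as
an empty cache.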

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx4_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, the freed space is looked
up in the MR list and the corresponding bits are cleared from the bitmap of
MRs. This can fragment an MR, and the MR will then have multiple search
entries in the caches. Once there's a change caused by the event, the global
cache must be rebuilt and all the per-queue caches will be flushed as well.
If memory is frequently freed at run-time, that may cause jitter on dataplane
processing in the worst case by incurring MR cache flush and rebuild. But it
would be the least probable scenario.

For the best performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. It is also recommended to configure the
per-lcore cache of the Mempool. Even though there may be many MRs for a
device or the MRs may be highly fragmented, the cache of the Mempool will be
very helpful to reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
2d684b911d net/mlx4: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
974f1e7ef1 net/mlx5: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx5_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx5_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, the freed space is looked
up in the MR list and the corresponding bits are cleared from the bitmap of
MRs. This can fragment an MR, and the MR will then have multiple search
entries in the caches. Once there's a change caused by the event, the global
cache must be rebuilt and all the per-queue caches will be flushed as well.
If memory is frequently freed at run-time, that may cause jitter on dataplane
processing in the worst case by incurring MR cache flush and rebuild. But it
would be the least probable scenario.

For the best performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. It is also recommended to configure the
per-lcore cache of the Mempool. Even though there may be many MRs for a
device or the MRs may be highly fragmented, the cache of the Mempool will be
very helpful to reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
d561b5dc13 net/mlx5: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Wei Dai
a4996bd89c ethdev: new Rx/Tx offloads API
This patch checks whether a requested offload is valid.
Any requested offload must be supported in the device capabilities.
An offload is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If an offload is enabled in rte_eth_dev_configure() by the application,
it is enabled on all queues no matter whether it is of per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offload hasn't been enabled in rte_eth_dev_configure(),
it can be enabled or disabled for an individual queue in
rte_eth_[rt]x_queue_setup().
A newly added offload is one which hasn't been enabled in
rte_eth_dev_configure() and is requested to be enabled in
rte_eth_[rt]x_queue_setup(). It must be of per-queue type,
otherwise an error is logged.
The underlying PMD must be aware that the offloads passed
to the PMD-specific queue_setup() function only carry those
newly added offloads of per-queue type.

This patch performs the above checks in a common way in the rte_ethdev
layer, to avoid repeating the same checks in each underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already been
converted to the offload API defined in 17.11. It also assumes
that all PMDs can return correct offload capabilities
in rte_eth_dev_infos_get().

At the beginning of [rt]x_queue_setup() of the underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep the same behavior as the
offload API defined in 17.11 and avoid breaking applications due to the
offload API change.
A PMD can use the fact that the input [rt]xconf->offloads only carries
the newly added per-queue offloads to do some optimization or
code changes on top of this patch.
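
A minimal sketch of the queue-level check described above, using a
hypothetical capability structure and function names that are not the
rte_ethdev ones; offloads are modelled as plain 64-bit masks:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability model; not the rte_ethdev structures. */
struct offload_caps_like {
	uint64_t port_capa;  /* everything the device can offload */
	uint64_t queue_capa; /* subset that can be toggled per queue */
};

/* Port level: every requested offload must be a device capability. */
static int check_port_offloads(const struct offload_caps_like *caps,
			       uint64_t port_offloads)
{
	assert(caps != NULL);
	return (port_offloads & ~caps->port_capa) ? -1 : 0;
}

/* Queue level: returns 0 and stores the newly added per-queue offloads
 * in *added, or -1 if a newly added offload is not of per-queue type. */
static int check_queue_offloads(const struct offload_caps_like *caps,
				uint64_t port_offloads,
				uint64_t queue_offloads,
				uint64_t *added)
{
	assert(caps != NULL && added != NULL);

	/* Offloads already enabled at port level apply to every queue. */
	uint64_t fresh = queue_offloads & ~port_offloads;

	if ((fresh & caps->queue_capa) != fresh)
		return -1;
	*added = fresh; /* only this part would be passed on to the PMD */
	return 0;
}

/* What a PMD's queue setup would reconstruct as the effective set. */
static uint64_t effective_offloads(uint64_t port_offloads, uint64_t added)
{
	return port_offloads | added;
}
```

The last helper mirrors the OR of the queue-level and port-level masks that
the message asks PMDs to compute at the start of their queue setup.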

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:51 +01:00