Secondary process is not allowed to register MR due to a restriction of
library and kernel driver.
Fixes: 7e43a32ee060 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
In order to support secondary process, a few features are required.
a) rdma-core library should allocate device resources using DPDK's
memory allocator.
b) UAR should be remapped for secondary processes. Currently, in order
not to use different data structure for secondary processes, PMD
tries to reserve identical virtual address space for both primary
and secondary processes.
c) IPC channel is necessary, which can be easily set with rte_mp APIs.
Through the channel, Verbs command FD is delivered to the secondary
process and the device stop/start event is also broadcast from
primary process.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
To support secondary process, the memory allocated by library such as
completion rings (CQ) and buffer rings (WQ) must be manageable by EAL,
in order to share it with secondary processes. With new changes in
rdma-core and kernel driver, it is possible to provide an external
allocator to the library layer for this purpose. All such resources
will now be allocated within DPDK framework.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
rte_eth_devices[] is not shared between primary and secondary process,
but a static array to each process. The reverse pointer of device
(priv->dev) becomes invalid if mlx4 supports secondary process.
Instead, priv has the pointer to shared data of the device,
struct rte_eth_dev_data *dev_data;
Two macros are added,
#define PORT_ID(priv) ((priv)->dev_data->port_id)
#define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])
Cc: stable@dpdk.org
Suggested-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to a process. Even though primary process replaces the
function pointers, secondary will not run the new ones. With rte_mp
APIs, primary can easily broadcast a request to stop/start the datapath
of secondary processes.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
There's more need to have PMD global data structure. This should be
initialized once per a process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and make sure all the
init functions are called once per a process. Currently, such global
data and its initialization functions are even scattered. Rather than
'extern'-ing such variables and calling such functions one by one making
sure it is called only once by checking the validity of such variables, it
will be better to have a global storage to hold such data and a
consolidated function having all the initializations. The existing shared
memory gets more extensively used for this purpose. As there could be
multiple secondary processes, a static storage (local to process) is also
added.
As the reserved virtual address for UAR remap is a PMD global resource,
this doesn't need to be stored in the device priv structure, but in the
PMD global data.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Socket API is used for IPC in order for secondary process to acquire
Verb command file descriptor. The FD is used to remap UAR address.
The multi-process APIs (rte_mp) in EAL are newly introduced.
mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.
As it is PMD global infrastructure, only one IPC channel is established.
All the IPC message types may have port_id in the message if there is
need to reference a specific device.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
As the memory event is propagated to secondary processes, the event is
processed redundantly. This should be processed once because the data
structure used for MR and the event is global across the processes.
Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
loaded. non-x86 processors (mostly RISC such as ARM and Power) are more
vulnerable to load stall. For x86, reducing the number of instructions
seems to matter most.
For x86, this is simply a load but for other architectures, it is
calculated from the address of mbuf structure by rte_mbuf_buf_addr()
without having to load the first cacheline of the mbuf.
Fixes: 12d468a62bc1 ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
MC firmware is the core component of FSLMC bus and DPAA2 devices.
Prior to this patch, MC firmware supported 10.10.x version. This
patch bumps the min supported version to 10.14.x.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed. The function changes in this patch were
auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place
and then the files edited using awk to add in the missing header:
gawk -i inplace '/include <rte_/ && ! seen { \
print "#include <rte_string_fns.h>"; seen=1} {print}'
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
For files that already have rte_string_fns.h included in them, we can
do a straight replacement of snprintf(..."%s",...) with strlcpy. The
changes in this patch were auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy-with-header.cocci --dir . --in-place
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Using the size of the source string is incorrect when printing using
snprintf. Instead pass in the buffer size to be used appropriately.
Fixes: 457ecf2953fc ("bond: add debug info for mode 6")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This commit sets the min and max supported MTU values for igb devices
via the eth_igb_info_get() function. Min MTU supported is set to
ETHER_MIN_MTU and max MTU is calculated as the max packet length
supported minus the transport overhead. To aid in these calculations
a new MACRO 'E1000_ETH_OVERHEAD' has been introduced to consolidate
overhead calculation and avoid duplication.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit sets the min and max supported MTU values for ixgbe VF
devices via the ixgbevf_dev_set_mtu() function. Min MTU supported is
set to ETHER_MIN_MTU and max MTU is calculated as the max packet length
supported minus the transport overhead. As transport overhead is the
same for VF and PF ixgbe devices, reuse MACRO 'IXGBE_ETH_OVERHEAD' to
avoid duplication.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit sets the min and max supported MTU values for ixgbe devices
via the ixgbe_dev_info_get() function. Min MTU supported is set to
ETHER_MIN_MTU and max MTU is calculated as the max packet length
supported minus the transport overhead. To aid in these calculations
a new MACRO 'IXGBE_ETH_OVERHEAD' has been introduced to consolidate
overhead calculation and avoid duplication.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit sets the min and max supported MTU values for i40e VF
devices via the i40evf_dev_info_get() function. Min MTU supported
is set to ETHER_MIN_MTU and max MTU is calculated as the max packet
length supported minus the transport overhead.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit sets the min and max supported MTU values for i40e devices
via the i40e_dev_info_get() function. Min MTU supported is set to
ETHER_MIN_MTU and max mtu is calculated as the max packet length
supported minus the transport overhead.
Signed-off-by: Ian Stokes <ian.stokes@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Device speed capability should be specified based on different PHY types
instead of a fixed value, this patch fix the issue.
Fixes: 690175ee51bf ("net/ice: support getting device information")
Cc: stable@dpdk.org
Signed-off-by: Chenmin Sun <chenmin.sun@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Since previous test is for mtu < 1519 the next else if
is always true. This causes the lgtm static tool to complain.
Not a real issue, just cosmetic.
Fixes: 76d4c652e07d ("virtio: add extended stats")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Since previous test is for mtu < 1519 the next else if
is always true. This causes the lgtm static tool to complain.
Not a real issue, just cosmetic.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
For E-Switch configurations over multiport Infiniband devices
we should add source vport match to correctly distribute
traffic between representors.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch modifies asynchronous event handler to support multiport
Infiniband devices. Handler queries the event parameters, including
event source port index, and invokes the handler for specific
devices with appropriate port_id.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
We are implementing the support for multiport Infiniband device
with representors attached to these multiple ports. Asynchronous
device event notifications (link status change, removal event, etc.)
should be shared between ports. We are going to implement shared
event handler and this patch introduces appropriate device
structure changes and updated event handler install and uninstall
routines.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The code is updated to provide IB port index for the Verbs
objects being created - QPs and Verbs Flows.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The code is updated to use the shared IB device context and
device handles. The IB device context is shared between
reprentors created over the single multiport IB device. All
Verbs and DevX objects will be created within this shared context.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The code is updated to use the shared IB device attributes,
located in the shared IB context. It saves some memory if
there are representors created over the single Infiniband
device with multiple ports.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The PMD code is updated to use Protected Domain from the
shared IB device context. The Domain is shared between
all devices belonging to the same multiport Infiniband device.
If IB device has only one port, the PD is not shared, because
there is only ethernet device created over IB one.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The IB device names are moved from device private data
to the shared context, code involving the names is updated.
The IB port index treatment is added where it is relevant.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The Mellanox NICs support SR-IOV and have E-Switch feature.
When SR-IOV is set up in switchdev mode and E-Switch is enabled
we have so called VF representors in the system. All representors
belonging to the same E-Switch are created on the basis of the
single PCI function and with current implementation each representor
has its own dedicated Infiniband device and operates within its
own Infiniband context. It is proposed to provide representors
as ports of the single Infiniband device and operate on the
shared Infiniband context saving various resources. This patch
introduces appropriate structures.
Also the functions to allocate and free shared IB context for
multiport are added. The IB device context, Protection Domain,
device attributes, Infiniband names are going to be relocated
to the shared structure from the device private one.
mlx5_dev_spawn() is updated to support shared context.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
mlx5_pci_probe() routine is refactored to probe the ports
of found Infiniband devices. All active ports (with attached
network interface), belonging to the same Infiniband device
will use the single shared Infiniband context of that device.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
There is the routine mlx5_nl_portnum() added to get
the number of ports of multiport Infiniband device.
It is assumed the Uplink/VF representors are attached
on these ports.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
There is the routine mlx5_nl_ifindex() returning the
network interface index associated with Infiniband device.
We are going to support multiport IB devices, now function
takes the IB port as argument and returns ifindex associated
with tuple <IB device, IB port>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The master device and VF representors were distinguished by
presence of port name, master device did not have one. The new Linux
kernels starting from 5.0 provide the port name for master device
and the implemented representor recognizing method does not work.
The new recognizing method is based on querying the VF number,
has been created on the base of the device.
The IFLA_NUM_VF attribute is returned by kernel if IFLA_EXT_MASK
attribute is specified in the Netlink request message.
Also the presence check of device symlink in device sysfs folder
is added to distinguish representors with sysfs based method.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
We are consistently passing 1 as the argument in the data path,
so there is no need to define avail/used flags as function-like
macros anymore. This patch changes the avail and used flags to
constants. And a frequently used combination is also introduced.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the multi-process support for virtio-user.
Currently virtio-user just provides some limited secondary
process supports. Only some basic operations can be done in
secondary process on virtio-user port, e.g. getting port stats.
Actions which will trigger the communication with vhost backend
can't be done in secondary process for now, as the fds are
not synced between processes. The processing of server mode
devargs is also moved into virtio_user_dev_init().
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Fixes: ee27edbe0c10 ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
"The macro name '_VHOST_NET_USER_H' of this include guard is used
in 2 different header files."
lib/librte_vhost/vhost_user.h has the same include guard.
Renamed the include guard in vhost.h to differentiate.
Fixes: 6a84c37e3975 ("net/virtio-user: add vhost-user adapter layer")
Cc: stable@dpdk.org
Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Add functions to set the link state up or down.
Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Since the VLAN header is stripped from mbuf data, PKT_RX_VLAN_STRIPPED
should be set in offload flag.
Fixes: 6b59a3bc82b1 ("fm10k: fix VLAN in Rx mbuf")
Fixes: 7092be8437bd ("fm10k: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Add implementation for probe in secondary.
Failsafe will attempt to attach all the sub-devices in
secondary process.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
In multiprocess context, the pointer to sub-device is shared between
processes. Previously, it was a pointer to per process eth_dev so
it's needed to replace this dependency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
In multiprocess context, the sub-device structure is shared
between processes. The reference to the failsafe device was
a per process pointer. It's changed to port id which is the
same for all processes.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
In multiprocess context, the private structure is shared between
processes. The back reference from private to generic data was using
a pointer to a per process eth_dev. It's now changed to a reference of
the shared data.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This patch fixes the build failure with message:
drivers/net/mlx5/mlx5_ethdev.c: In function ‘mlx5_sysfs_switch_info’:
drivers/net/mlx5/mlx5_ethdev.c:1381:3:
error: ignoring return value of ‘fscanf’, declared with attribute
warn_unused_result [-Werror=unused-result]
fscanf(file, "%s", port_name);
^
Which reproduces on Ubuntu 16.04 LTS with
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609.
Fixes: b2f3a3810125 ("net/mlx5: support new representor naming format")
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Dekel Peled <dekelp@mellanox.com>
The bnxt driver is not correctly setting the receive VLAN offload
flags. When VLAN is offloaded the driver must set the
PKT_RX_VLAN_STRIPPED flag.
Actually, several drivers have the same bug, only most of the
Intel drivers look right. Any driver that sets vlan_tci is probably
stripping the tag, and should be setting RX_VLAN_STRIPPED.
To quote rte_mbuf.h:
/**
* The RX packet is a 802.1q VLAN packet, and the tci has been
* saved in in mbuf->vlan_tci.
* If the flag PKT_RX_VLAN_STRIPPED is also present, the VLAN
* header has been stripped from mbuf data, else it is still
* present.
*/
Fixes: 2eb53b134aae ("net/bnxt: add initial Rx code")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
A log message is required when provided RSS key is
not valid so that driver will use the default RSS key.
Fixes: ecad87d22383 ("net/i40e: move RSS to flow API")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Tested-by: Yuan Peng <yuan.peng@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>