Previouly, there are three channels for multi-process
(i.e., primary/secondary) communication.
1. Config-file based channel, in which, the primary process writes
info into a pre-defined config file, and the secondary process
reads the info out.
2. vfio submodule has its own channel based on unix socket for the
secondary process to get container fd and group fd from the
primary process.
3. pdump submodule also has its own channel based on unix socket for
packet dump.
It'd be good to have a generic communication channel for multi-process
communication to accommodate the requirements including:
a. Secondary wants to send info to primary, for example, secondary
would like to send request (about some specific vdev to primary).
b. Sending info at any time, instead of just initialization time.
c. Share FDs with the other side, for vdev like vhost, related FDs
(memory region, kick) should be shared.
d. A send message request needs the other side to response immediately.
This patch proposes to create a communication channel, based on datagram
unix socket, for above requirements. Each process will block on a unix
socket waiting for messages from the peers.
Three new APIs are added:
1. rte_eal_mp_action_register() is used to register an action,
indexed by a string, when a component at receiver side would like
to response the messages from the peer processe.
2. rte_eal_mp_action_unregister() is used to unregister the action
if the calling component does not want to response the messages.
3. rte_eal_mp_sendmsg() is used to send a message, and returns
immediately. If there are n secondary processes, the primary
process will send n messages.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Calling rte_smp_{w/r}mb macro expands into a compound block, which
would break compiling a else clause following it, if that calling
place has been terminated already with ";", as in below code.
This patch adds { } around this macro to allow compiling else too.
Fixes: d23a6bd04d ("eal/ppc: fix memory barrier for IBM POWER")
Fixes: 05c3fd7110 ("eal/ppc: atomic operations for IBM Power")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
This patch defines __NR_bpf for powerpc architecture and hence,
fixes compiling tap driver for this architecture.
Fixes: b02d85e1 ("net/tap: add eBPF API")
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
In failsafe device start can be called for ports/devices that
had been plugged out.
The mlx4 PMD detects device removal by listening to the device RMV
events, when the mlx4 port is being stopped, the PMD no longer
listens to these events causing the PMD to stop detecting device
removals.
This patch fixes this issue by moving installation of the interrupt
handler to device configuration, and toggle only the Rx-queue
interrupts on start/stop.
Fixes: a6e8b01c3c ("net/mlx4: compact interrupt functions")
Cc: stable@dpdk.org
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Test the return value of ecore_ptt_acquire for NULL.
Coverity issue: 257049
Fixes: d378cefab8 ("net/qede: add support for GENEVE tunneling offload")
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
This patch fixes issues related to MTU set and max_rx_pkt_len usage.
- Adjust MTU during device configuration when jumbo is enabled
- In qede_set_mtu():
Return not supported for VF as currently we do not support it.
Cache new mtu value in mtu_new for proper update.
Add check for RXQ allocation before calculating RX buffer size
if not allocated defer RX buffer size calculation till RXQ setup.
Add check for before performing device start/stop.
- Use max_rx_pkt_len appropriately
- Change QEDE_ETH_OVERHEAD macro to adjust driver specifics
Fixes: 4c4bdadfa9 ("net/qede: refactoring multi-queue implementation")
Fixes: 9a6d30ae6d ("net/qede: refactoring vport handling code")
Fixes: 1ef4c3a5c1 ("net/qede: prevent crash while changing MTU dynamically")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
- Fix incorrect header size. In the tunnel case, the outer L2/L3 lengths
should be included to calculate tunnel header_size.
- In TSO case, skip manipulating TX BD1 and TX BD2 data buffer fields
since those fields are already updated with header and payload lengths
respectively.
- Update TX BD debug data collection.
Fixes: 3d4bb44116 ("net/qede: add fastpath support for VXLAN tunneling")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@cavium.com>
- Add a check to verify tunnel IP header checksum is valid and mark MBUF
flag as appropriate.
- Bit of refactoring so that inner frame handling for tunneled packets can
be made common as regular (non-tunneled) packets.
- make qede_tunn_exist() as inline.
- remove RTE_PTYPE_L2_ETHER as default L2 pkt_type.
Fixes: 3d4bb44116 ("net/qede: add fastpath support for VXLAN tunneling")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@cavium.com>
By default, the PF driver enables tunnel offload for its child VF.
So mark tunnel offloads as enabled in the VF driver to reflect the
actual state.
Fixes: 52d94b57e1 ("net/qede: add slowpath support for VXLAN tunneling")
Fixes: d378cefab8 ("net/qede: add support for GENEVE tunneling offload")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@cavium.com>
This patch adds support for registering and waiting for Rx interrupts.
This allows applications to wait for Rx events from the PMD using the
DPDK rte_epoll subsystem.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
This patch fixes the issue of mlx4 not receiving broadcast packets
when configured to work promiscuous or allmulticast modes.
Fixes: eacaac7bae ("net/mlx4: restore promisc and allmulti support")
Cc: stable@dpdk.org
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The number of mlx4 present ports is calculated as follows:
conf.ports.present |= (UINT64_C(1) << device_attr.phys_port_cnt) - 1;
That is - all ones sequence (due to -1 subtraction)
When retrieving the number of ports, 1 must be added in order to obtain
the correct number of ports to the power of 2, as follows:
uint32_t ports = rte_log2_u32(conf->ports.present + 1);
If 1 was not added, in the case of one port, the number of ports would
be falsely calculated as 0.
Fixes: 8264279967 ("net/mlx4: check max number of ports dynamically")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This patch is the last patch in the series of patches aimed
to add support for registering and waiting for Rx interrupts
in failsafe PMD. This allows applications to wait for Rx events
from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The Proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
* Every Rx queue in the failsafe (virtual) device will be assigned
* a Linux event file descriptor (efd) and an enable_interrupts flag.
* The failsafe PMD will fill in its rte_intr_handle structure with
the Rx efds assigned previously and register them with the EAL.
* The failsafe driver will create a private epoll fd (epfd) and
* will allocate enough space to handle all the Rx events from all its
subdevices.
* Acting as an application,
for each Rx queue in each active subdevice the failsafe will:
o Register the Rx queue with the EAL.
o Pass the EAL the failsafe private epoll fd as the epfd to
register the Rx queue event on.
o Pass the EAL, as a parameter, the pointer to the failsafe Rx
queue that handles this Rx queue.
o Using the DPDK service callbacks, the failsafe PMD will launch
an Rx proxy service that will Wait on the epoll fd for Rx
events from the sub-devices.
o For each Rx event received the proxy service will
- Retrieve the pointer to failsafe Rx queue that handles
this subdevice Rx queue from the user info returned by the
EAL.
- Trigger a failsafe Rx event on that queue by writing to
the event fd unless interrupts are disabled for that queue.
* The failsafe pmd will also implement the rx_queue_intr_enable
* and rx_queue_intr_disable routines that will enable and disable Rx
interrupts respectively on both on the failsafe and its subdevices.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This commit adds the following functionality to failsafe PMD:
* Register and unregister slaves Rx interrupts.
* Enable and Disable slaves Rx interrupts.
The interrupts events generated by the slaves are not handled in this
commit.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This patch adds registering the Rx queues of the failsafe PMD with EAL
Rx interrupts subsystem.
Each failsafe RX queue is assigned with a unique eventfd and an enable
interrupts flag.
The PMD creates an interrupt vector containing the above eventfds and
Registers it with EAL. The PMD also implements the Rx interrupts enable
and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but it will not
receive one.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
By default both static and shared libraries should be created while
building MUSDK library. It turns out that this will not happen if
host parameter is not explicitly passed to the configure script.
Specifying host makes sure configure will detect support for shared
libraries.
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
Since in DPDK 17.11 port type was changed from uint8_t to uint16_t
the MBUF_INVALID_PORT value became 0xffff but in mrvl_tx_pkt_burst()
when trying to lookup bpool using mbuf port, we check if the port
is invalid according to value 0xff. This causes segmentation fault.
Solution: since the valid port value cannot exceed RTE_MAX_ETHPORTS
(size of bpool lookup table) any other values consider as invalid so
the packet should be returned to DPDK pool.
Fixes: afb4d0d0bf ("net/mrvl: add Rx/Tx support")
Cc: stable@dpdk.org
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
While using RSS, the pool count should be 1.
Fixes: 8103a57ab4 ("net/bnxt: handle Rx multi queue creation properly")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
When the driver is loaded on a 100G NIC, the port speed is not
displayed correctly. Parse the 100G speed before displaying it.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Currently this is implemented entirely in the PMD as there is no
explicit support in the HW. Re-program the RSS Table without this queue
on stop and add it back to the table on start.
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
In certain cases the MAC address of a port could be all zeros.
Catch it early, log a message and fail the initialization.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Register for async events from the FW.
New events we are registering for include Link speed config changes,
PF driver unload and VF config change. Also log a message when the
async event arrives on the completion ring.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
This patch implements driver specific log type doing away with
usage of RTE_LOG() for logging.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
During Tx ring allocation, the actual ring size configured in the HW
ends up being twice the number of txd parameter specified to the driver.
The power of 2 ring size wrongly adds a +1 while sending the ring
create command to the FW.
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.
Fixes: 316f4f1adc ("net/igb: support VF mailbox interrupt for link up/down")
CC: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.
Fixes: 77234603fb ("net/ixgbe: support VF mailbox interrupt for link up/down")
CC: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.
Fixes: ae19955e7c ("i40evf: support reporting PF reset")
CC: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
As UIO doesn't support multiple interrupt, and the interrupt
is occupied by the control plane. PMD should not try to enable
RX queue interrupt. Then APP can know the RX queue interrupt
is not enabled and only choose the polling mode.
Fixes: d6bde6b5ea ("net/avf: enable Rx interrupt")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Use the same kind of loop than in virtio_free_queues() and factorize
common code.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Free the previous queues and the attached mbufs before initializing new
ones.
The function virtio_dev_free_mbufs() is now called when reconfiguring the
device, so we also need to add a check to ensure that it won't crash for
uninitialized queues.
Cc: stable@dpdk.org
Fixes: 60e6f4707e ("net/virtio: reinitialize device when configuring")
Signed-off-by: Zijie Pan <zijie.pan@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
When using vector Rx mode (use_simple_rx = 1), vq->vq_descx[] is not
kept up to date. To properly detach the mbufs in this case, browse
sw_ring[] instead, as it's done in virtqueue_rxvq_flush().
Since we need virtio_get_queue_type(), also move this function in
virtqueue.h as a static inline.
Fixes: fc3d66212f ("virtio: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
On arm32, we were always selecting the simple handler, but it is only
available if neon is present.
This is due to a typo in the name of the config option.
CONFIG_RTE_ARCH_ARM is for Makefiles. One should use RTE_ARCH_ARM.
Fixes: 2d7c37194e ("net/virtio: add NEON based Rx handler")
Cc: stable@dpdk.org
Signed-off-by: Samuel Gauthier <samuel.gauthier@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Since commit 59fe5e17d9 ("vhost: propagate set features handling error"),
vhost does not allow to set different features without reset.
The virtio-user driver fails to reset the device in below commit.
To fix, we send the reset message as stopping the device.
Fixes: c12a26ee20 ("net/virtio-user: fix not properly reset device")
Cc: stable@dpdk.org
Reported-by: Lei Yao <lei.a.yao@intel.com>
Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
The VIRTIO_F_ANY_LAYOUT feature indicates the device accepts arbitrary
descriptor layouts. The vhost-user lib already supports it, but the
feature declaration is missing. This patch fixes the mismatch.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Information about received packet type detected by NIC should be
stored in packet_type field of rte_mbuf. TX L4 offload flags should
not be set in RX path. Only fields that could be set in of_flags
during packet receiving are information if L4 and L3 checksum is
correct.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Reported-by: Matthew Smith <mgsmith@netgate.com>
Signed-off-by: Rafal Kozik <rk@semihalf.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reserving the memory space for the UAR near huge pages helps to
**reduce** the cases where the secondary process cannot start. Those
pages being physical pages they must be mapped at the same virtual
address as in the primary process to have a
working secondary process.
As this remap is almost the latest being done by the processes
(libraries, heaps, stacks are already loaded), similar to huge pages,
there is **no guarantee** this mechanism will always work.
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Since commit f81ec74843 ("net/mlx5: fix memory region lookup") the
Memory Region (MR) are no longer overlaps.
Comparing the end address of the MR should be exclusive, otherwise two
contiguous MRs may cause wrong matching.
Fixes: f81ec74843 ("net/mlx5: fix memory region lookup")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In case Memory Region cache is full, the new mempool will be
inserted in the last index of the array.
Update the last entry being hit to reflect it.
Fixes: b0b0938457 ("net/mlx5: use buffer address for LKEY search")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Memory registration can fail, add the proper warning for such scenario
for it at least to be visible in debug mode.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Verbs structs such as ibv_mr are not accessible from the secondary
process.
Choose to remove the assert in favor of performing more checks on the
critical data path.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Secondary process is not allowed to register mempools on the flight.
The code will return invalid memory key for such case.
Fixes: 87ec44ce16 ("net/mlx5: add operations for secondary process")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The Memory Region (MR) cache contains pointers to mlx5_mr.
The MR cache indexes are filled when a new MR is created. As it is
possible for MR to be created on the flight, an extra validation must be
added to avoid segmentation fault.
Fixes: b0b0938457 ("net/mlx5: use buffer address for LKEY search")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>