Add debug logs to have a trace of unused cores for -c/-l options on
systems with more cores than RTE_MAX_LCORE.
Signed-off-by: David Marchand <david.marchand@redhat.com>
We use this state in control path only for services cores and -c/-l
options.
The value is not updated when using --lcores.
Use the internal helper where needed.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Fix the name of CPU_SETSIZE in hope we can reuse it in other parts of
the dpdk manipulating some rte_cpuset_t.
Fixes: 4dc2b4d2a4cd ("eal/windows: add headers for compatibility")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
In the root table, there is some limitation of total number of header
modify actions, 16 or 8 for each. But in other tables, there is no
such strict limitation. In an IPv6 case, the IP fields modifying
will occupy more actions than that in IPv4, so the total support
number should be increased in order to support as many actions as
possible for an IPv6 + TCP packet.
And in the meanwhile, the memory consumption should also be taken
into consideration because sometimes only several actions are needed.
The root table checking could also be done in low layer driver and
the error code will be returned if the actions number is over the
maximal supported value.
Fixes: 0e9d00027686 ("net/mlx5: check maximum modify actions number")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The meter suffix flow item pointer restore is not correct to decrease
a fixed value. The incorrect operation will cause incorrect match to
the meter suffix flow, the flow create will fail once the magic number
in the wrong offset memory start with RTE_FLOW_ITEM_TYPE_END.
The pointer should decrease the real offset it increases.
Set the decrease value to the real offset the pointer increases to fix
the issue.
Fixes: 9ea9b049a960 ("net/mlx5: split meter flow")
Cc: stable@dpdk.org
Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch adds to MLX5 PMD support of matching on GTP item,
fields msg_type and teid, according to RFC [1].
GTP item validation and translation functions are added and called.
GTP tunnel type is added to supported tunnels.
[1] http://mails.dpdk.org/archives/dev/2019-December/152799.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Previous fix added translation of Rx hash fields to PRM format.
This patch optimizes the fix, to perform value translation only
if value is not zero.
In case value is zero, there is no need to translate it.
Fixes: c3e33304a7f6 ("net/mlx5: fix setting of Rx hash fields")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The free on completion queue keeps the indices of elts array,
all mbuf stored below this index should be freed on arrival
of normal send completion. In debug version it also contains
an index of completed transmitting descriptor (WQE) to check
queues synchronization.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The new software manged entity is introduced in Tx datapath
- free on completion queue. This queue keeps the information
how many buffers stored in elts array must freed on send
completion. Each element of the queue contains transmitting
descriptor index to be in synch with completion entries (in
debug build only) and the index in elts array to free buffers.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This is preparation step, we are going to store the index
of elts to free on completion in the dedicated free on
completion queue, this patch updates the elts freeing routine
and updates Tx error handling routine to be synced with
coming new queue.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The complete request flag is set once per Tx burst call,
the code of appropriate routine moved to the end of sending
loop. This is preparation step to remove WQE reserved field
usage to store index of elts to free.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Add support for reading the firmware version.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add basic, per queue and extended statistics for
RX and TX, both from the adapter and the driver.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add code to manipulate the RSS configuration
used by the adapter.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add support for managing RX filters based on MAC and VLAN.
Hardware cannot provide the list of filters, thus we keep
a local list.
Add support for promisc and allmulticast modes.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add support for port start/stop and handle basic features
including MTU and link up/down.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add support for the notify queue, which is used for events
published by the NIC.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add support for the admin queue, which is used for most
of the NIC configurations.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Doorbell registers are used by the driver to signal to the NIC
that requests are waiting on the message queues.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Initialize LIFs (Logical Interfaces) which represents
external connections. The NIC can multiplex many LIFs
to a single port, but in most setups, LIF0 is the
primary control for the port.
Create a device for each LIF.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add port management commands that apply to the physical
ports associated with the PCI device, which might be
shared among several logical interfaces.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Register the Pensando ionic PMD (net_ionic) and define initial probe
and remove callbacks with adapter initialization.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add debug options to the config file.
Define macros used for logs and make use of config file options
to enable them.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
Add makefile and config file options to compile the Pensando ionic PMD.
Add feature and version map file.
Update maintainers file.
Signed-off-by: Alfredo Cardigliano <cardigliano@ntop.org>
Reviewed-by: Shannon Nelson <snelson@pensando.io>
V1000/R1000 processors are using the same PCI ids for the network
device as SNOWYOWL processor but has altered register definitions
for determining the window settings for the indirect PCS access.
Add support to check for this hardware and if found use the new
register values.
Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
OCTEON TX2 isn't built for gcc 4.8.5 as the compiler emits
"internal compiler error" on aarch64. This causes the following
build error when OCTEON TX2 EP is enabled:
/usr/bin/ld: cannot find -lrte_common_octeontx2
collect2: error: ld returned 1 exit status
Fixes: 56d46d13f736 ("raw/octeontx2_ep: add build infra and device probe")
Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
In the Rx datapath the flags in the newly allocated mbufs
are all explicitly cleared but the EXT_ATTACHED_MBUF must be
preserved. It would allow to use mbuf pools with pre-attached
external data buffers.
The vectorized rx_burst routines are updated in order to
inherit the EXT_ATTACHED_MBUF from mbuf pool private
RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF flag.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The new mbuf pool type is added to testpmd. To engage the
mbuf pool with externally attached data buffers the parameter
"--mp-alloc=xbuf" should be specified in testpmd command line.
The objective of this patch is just to test whether mbuf pool
with externally attached data buffers works OK. The memory for
data buffers is allocated from DPDK memory, so this is not
"true" external memory from some physical device (this is
supposed the most common use case for such kind of mbuf pool).
The user should be aware that not all drivers support the mbuf
with EXT_ATTACHED_BUF flags set in newly allocated mbuf (many
PMDs just overwrite ol_flags field and flag value is getting
lost).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create mbuf pool with data buffers located in
the pinned external memory. The application provides the
external memory description and routine initializes each
mbuf with appropriate virtual and physical buffer address.
It is entirely application responsibility to register
external memory with rte_extmem_register() API, map this
memory, etc.
The new introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in private pool structure, specifying the new special
pool type. The allocated mbufs from pool of this kind will
have the EXT_ATTACHED_MBUF flag set and initialiazed shared
info structure, allowing cloning with regular mbufs (without
attached external buffers of any kind).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Update detach routine to check the mbuf pool type.
Introduce the special internal version of detach routine to handle
the special case of pinned external bufferon mbuf freeing.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The routine rte_pktmbuf_priv_flags is introduced to fetch
the flags from the mbuf memory pool private structure
in unified fashion.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Use new marker typedef available in EAL and remove private marker
typedef.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Use new marker typedef available in EAL.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Introduce EAL typedef for structure 1B, 2B, 4B, 8B alignment marking and
a generic marker for a point in a structure.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In case of too low number of memzone segments user notification
was misleading. This patch improves the description by providing
better explanation about the cause.
Signed-off-by: Artur Trybula <arturx.trybula@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
FreeBSD 13 has changed the definition of vm_page_replace so we need
to have slightly different code paths around this function depending on
the BSD version.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
In (currently unreleased) FreeBSD 13, the CPU_NAND macro has been renamed
to CPU_ANDNOT, so we need to use different DPDK-specific macros depending
on what system-defined ones are present.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Meson versions 0.52 and 0.53 are being overly smart and detecting the path
"/sys/devices/system/cpu/present" in the call to cat in
app/test/meson.build and then adding it as a dependency to the build
configuration. This causes issues on systems where the timestamp of that
file always returns the current time, since it means that the build.ninja
file is always out of date, and therefore needs to be rebuilt.
We can fix this by just using a simple shell script to return the coremask
appropriately for BSD and Linux, and removing that code logic from meson -
thereby hiding the use of the /sys file.
Fixes: c70622ac6f72 ("test: detect number of cores with meson")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Moved RFC4115 APIs to non-experimental as they have been there
since 19.02. Also, these APIs are the same as the non RFC4115 APIs.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Enhance API documentation of rte_pktmbuf_attach_extbuf() to
explain that the attached mbuf is initialized with length = 0.
Link: https://bugs.dpdk.org/show_bug.cgi?id=362
Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The existing optimize_object_size() function address the memory object
alignment constraint on x86 for better performance.
Different (micro) architecture may have different memory alignment
constraint for better performance and it not the same as the existing
optimize_object_size().
Some use, XOR(kind of CRC) scheme to enable DRAM channel distribution
based on the address and some may have a different formula.
Introducing arch_mem_object_align() function to abstract
the difference between different (micro) architectures to avoid
wasting memory for mempool object alignment for the architecture
that it is not required to do so.
Details on the amount of memory saving:
Currently, arm64 based architectures use the default (nchan=4,
nrank=1). The worst case is for an object whose size (including mempool
header) is 2 cache lines, where it is optimized to 3 cache lines (+50%).
Examples for cache lines size = 64:
orig optimized
64 -> 64 +0%
128 -> 192 +50%
192 -> 192 +0%
256 -> 320 +25%
320 -> 320 +0%
384 -> 448 +16%
...
2304 -> 2368 +2.7% (~mbuf size)
Additional details:
https://www.mail-archive.com/dev@dpdk.org/msg149157.html
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Starting from v20.05, rte_mempool_populate_iova() will return 0.
The ABI will be preserved through symbol versioning until 20.11.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
To populate a mempool with a virtual area, the mempool code calls
rte_mempool_populate_iova() for each iova-contiguous area. It happens
(rarely) that this area is too small to store one object. In this case,
rte_mempool_populate_iova() returns an error, which is forwarded by
rte_mempool_populate_virt().
This case should not throw an error in rte_mempool_populate_virt().
Instead, the area that is too small should just be ignored.
To fix this issue, change the return value of
rte_mempool_populate_iova() to 0 when no object can be populated,
so it can be ignored by the caller. As this would be an API/ABI change,
only do this modification internally for now.
Fixes: 354788b60cfd ("mempool: allow populating with unaligned virtual area")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Alvin Zhang <alvinx.zhang@intel.com>
When allocating a mempool which is larger than the largest
available area, it can take a lot of time:
a- the mempool calculate the required memory size, and tries
to allocate it, it fails
b- then it tries to allocate the largest available area (this
does not request new huge pages)
c- add this zone to the mempool, this triggers the allocation
of a mem hdr, which request a new huge page
d- back to a- until mempool is populated or until there is no
more memory
This can take a lot of time to finally fail (several minutes): in step
a- it takes all available hugepages on the system, then release them
after it fails.
The problem appeared with commit eba11e364614 ("mempool: reduce wasted
space on populate"), because smaller chunks are now allowed. Previously,
it had to be at least one page size, which is not the case in step b-.
To fix this, implement our own way to allocate the largest available
area instead of using the feature from memzone: if an allocation fails,
try to divide the size by 2 and retry. When the requested size falls
below min_chunk_size, stop and return an error.
Fixes: eba11e364614 ("mempool: reduce wasted space on populate")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Ali Alnubani <alialnu@mellanox.com>