Introduce a new API to configure Rx offloads.
In the new API, offloads are divided into per-port and per-queue
offloads. The PMD reports capability for each of them.
Offloads are enabled using the existing DEV_RX_OFFLOAD_* flags.
To enable per-port offload, the offload should be set on both device
configuration and queue configuration. To enable per-queue offload, the
offloads can be set only on queue configuration.
Applications should set the ignore_offload_bitfield bit on rxmode
structure in order to move to the new API.
The old Rx offloads API is kept for the meanwhile, in order to enable a
smooth transition for PMDs and application to the new API.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds a new eth_dev layer API function rte_eth_dev_reset(),
which a DPDK application can call to reset a NIC and keep its port id
afterwards. It means that all software resources allocated in the ethdev
layer are kept, and software & hardware resources of the NIC within the
NIC's PMD are reset to a state simular to that obtained by calling the
PCI dev_uninit() and then dev_init(). This effective sequence of
dev_uninit() and dev_init() is packed into a single API function
rte_eth_dev_reset().
Please see the comments before the declaration of rte_eht_dev_reset()
in lib/librte_ether/rte_ethdev.h to get more details on why this
function is needed, what it does, when it should be called
and what an application should do after calling this function.
See also detailed explanations in the programmer's guide.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Reviewed-by: Remy Horton <remy.horton@intel.com>
The name of a device is copied in a provided buffer within
rte_eth_dev_detach(). The current sizeof is done on a pointer instead of
the intended array usually pointed to.
The name field of an rte_device is not assured however to point an
rte_devargs name field. The almost correct length to base this copy over
is thus RTE_DEV_NAME_MAX_LEN.
Almost correct, because unfortunately this function does not allow the
user to pass down a size parameter for the buffer it is meant to write.
This API should be fixed, it is broken by design.
Fixes: a1e7c17555e8 ("ethdev: use device name from device structure")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Make the rte_eth_dev_count() return the number of available devices even
after some are detached by the hotplug API or put in a deferred state.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This device state means that the device is managed externally, by
whichever party has set this state (PMD or application).
Note: this new device state is only an information. The related device
structure and operators are still valid and can be used normally.
It is however made private by device management helpers within ethdev,
making the device invisible to applications.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Introducing the DEV_TX_OFFLOAD_MT_LOCKFREE TX capability flag.
if a PMD advertises DEV_TX_OFFLOAD_MT_LOCKFREE capable, multiple threads
can invoke rte_eth_tx_burst() concurrently on the same tx queue without
SW lock. This PMD feature will be useful in the following use cases and
found in the OCTEON family of NPUs.
1) Remove explicit spinlock in some applications where lcores
to TX queues are not mapped 1:1.
example: OVS has such instance
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L299https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1859
See the the usage of tx_lock spinlock.
2) In the eventdev use case, avoid dedicating a separate TX core for
transmitting and thus enables more scaling as all workers can
send the packets.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The rte_flow feature breaks the monolithic approach for ethdev by
introducing the new rte_flow API to ethdev using a plugin-like approach.
Basically, the rte_flow API is still logically part of ethdev:
- It extends the ethdev functionality: rte_flow is a new feature/
capability of ethdev;
- all its functions work on an Ethernet device: the first parameter of the
rte_flow functions is Ethernet device port ID.
Also, the rte_flow API is a sort of capability plugin for ethdev:
- the rte_flow API functions have their own name space: they are called
rte_flow_operationXYZ() as opposed to rte_eth_dev_flow_operationXYZ());
- the rte_flow API functions are placed in separate files in the same
librte_ether folder as opposed to rte_ethdev.[hc].
The way it works is by using the existing ethdev API function
rte_eth_dev_filter_ctrl() to query the current Ethernet device port ID for
the support of the rte_flow capability and return the pointer to the
rte_flow operations when supported and NULL otherwise:
struct rte_flow_ops *eth_flow_ops;
int rte = rte_eth_dev_filter_ctrl(eth_port_id,
RTE_ETH_FILTER_GENERIC, RTE_ETH_FILTER_GET, ð_flow_ops);
This patch reuses the same approach for ethdev Traffic Management API.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
From documentation it is very unclear how VMDq configuration can be
tweaked, and online search offer very poor results.
This patch will ultimately spawn an online documentation page
for the rte_eth_vmdq_rx_conf struct which will eventually add a bit of
documentation about the rx_mode tag and how to allow e.g. VMDq pools
to receive packets without VLAN tags.
Signed-off-by: Tom Barbette <tom.barbette@ulg.ac.be>
Acked-by: John McNamara <john.mcnamara@intel.com>
In order to be able to replicate a configuration onto a second port,
device configuration should be fully described and available.
Other configuration items (i.e. MAC addresses) are stored within
rte_eth_dev_data, but not this one.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Check that numbers of Rx and Tx descriptors satisfy descriptors limits
from the Ethernet device information, otherwise adjust them to boundaries.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
rte_device->name copied into eth_dev->name, right now size is same for
both but the requirement is not clear.
This patch highlights the relation without changing actual sizes.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch fixes a typo in the eth device API doc, device
config. not stored between calls to rte_eth_dev_start/stop()
should be restored before a call to rte_eth_dev_start()
instead of after a call to rte_eth_dev_start().
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This patch fixes a trivial typo in rte_ethdev.h; it should be
"RX multicast OFF" and not "RX multicast OF".
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Change the rte_eth_dev_callback_process function to return int,
and add a void *ret_param parameter.
The new parameter is used by ixgbe and i40e instead of abusing
the user data of the callback.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
rte_driver->name has the driver name and all physical and virtual
devices has access to it.
Previously it was not possible for virtual ethernet devices to access
rte_driver->name field (because eth_dev used to keep only pci_dev),
and it was required to save driver name in the device private struct.
After re-works on bus and vdev, it is possible for all bus types to
access rte_driver.
It is able to remove the driver name from ethdev device private data and
use eth_dev->device->driver->name.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
Move all bypass functions to ixgbe pmd and remove function
pointers from the eth_dev_ops struct.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Instead of many PMD define their own macro, define a generic one in
ethdev and use that in PMDs.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Currently, 'rte_eth_dev_get_port_by_name' changes transmitted
'port_id' unconditionally. This is undocumented and misleading
behaviour as user may expect unchanged value in case of error.
Otherwise, there is no sense having both return value and
a pointer in the function.
Fixes: 9c5b8d8b9feb ("ethdev: clean port id retrieval when attaching")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Different drivers use internal macros like force_inline for compiler
always inline feature.
Standardizing it through __rte_always_inline macro.
Verified the change by comparing the output binary file.
No difference found in the output binary file with this change.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some customers find adding MAC addr to VF sometimes can fail,
but it is still stored in dev->data->mac_addrs[ ]. So this
can lead to some errors that assumes the non-zero entry in
dev->data->mac_addrs[ ] is valid.
Following acknowledgements are from specific NIC PMD
maintainer for their managing part.
This patch changes the ethdev internal API, it should not be
backported to a stable/LTS release so far.
Fixes: af75078fece3 ("first public release")
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Extended xstats API in ethdev library to allow grouping of stats
logically so they can be retrieved per logical grouping managed
by the application.
Added new functions rte_eth_xstats_get_names_by_id and
rte_eth_xstats_get_by_id using additional arguments (in compare
to rte_eth_xstats_get_names and rte_eth_xstats_get) - array of ids
and array of values.
doc: add description for modified xstats API
Documentation change for new extended statistics API functions.
The old API only allows retrieval of *all* of the NIC statistics
at once. Given this requires a MMIO read PCI transaction per statistic
it is an inefficient way of retrieving just a few key statistics.
Often a monitoring agent only has an interest in a few key statistics,
and the old API forces wasting CPU time and PCIe bandwidth in retrieving
*all* statistics; even those that the application didn't explicitly
show an interest in.
The new, more flexible API allow retrieval of statistics per ID.
If a PMD wishes, it can be implemented to read just the required
NIC registers. As a result, the monitoring application no longer wastes
PCIe bandwidth and CPU time.
Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
Revert patches to provide clear view for
upcoming changes. Reverted patches are listed below:
commit ea85e7d711b6 ("ethdev: retrieve xstats by ID")
commit a954495245c4 ("ethdev: get xstats ID by name")
commit 1223608adb9b ("app/proc-info: support xstats by ID")
commit 25e38f09af9c ("net/e1000: support xstats by ID")
commit 923419333f5a ("net/ixgbe: support xstats by ID")
Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
The RTE_FUNC_*_RET() and RTE_PROC_*_RET() macro definitions in rte_dev.h
require RTE_PMD_DEBUG_TRACE(). This macro is defined as needed by users of
rte_dev.h since its value depends on their own debug settings.
It may be defined multiple times as a result when including files from
various components simultaneously. Worse, these redefinitions may be
inconsistent. This causes the following compilation errors:
In file included from /tmp/check-includes.sh.13890.c:27:0:
build/include/rte_eventdev_pmd.h:58:0: error: "RTE_PMD_DEBUG_TRACE"
redefined [-Werror]
[...]
In file included from build/include/rte_ethdev_pci.h:39:0,
from /tmp/check-includes.sh.13890.c:13:
build/include/rte_ethdev.h:1042:0: note: this is the location of the
previous definition
[...]
In file included from /tmp/check-includes.sh.13890.c:83:0:
build/include/rte_cryptodev_pmd.h:65:0: error: "RTE_PMD_DEBUG_TRACE"
redefined [-Werror]
[...]
In file included from /tmp/check-includes.sh.13890.c:27:0:
build/include/rte_eventdev_pmd.h:58:0: note: this is the location of
the previous definition
[...]
This commit moves the RTE_PMD_DEBUG_TRACE() definition to rte_dev.h where
it is enabled consistently depending on global configuration settings and
removes redundant definitions.
Also when disabled, RTE_PMD_DEBUG_TRACE() is now defined as (void)0 to
avoid empty statements warnings if used outside { } blocks.
Fixes: b974e4a40cb5 ("ethdev: make error checking macros public")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This new API allows reacting to a device removal.
A device removal is the sudden disappearance of a device from its
bus.
PMDs implementing support for this notification guarantee that the removal
of the underlying device does not incur a risk to the application.
In particular, Rx/Tx bursts and all other functions can still be called
(albeit likely returning errors) without triggering a crash, irrespective
of an application handling this event.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Elad Persiko <eladpe@mellanox.com>
There is a new argument --xstats-ids and --xstats-name
in proc_info command line to retrieve statistics given by ids
and by name.
E.g. --xstats-ids="1,3,5,7,8"
E.g. --xstats-name rx_errors
ethdev: mark functions as deprecated
Functions rte_eth_xstats_get_all and rte_eth_xstats_get_names_all
are marked as deprecated
Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Introduced new function: rte_eth_xstats_get_id_by_name
to retrieve xstats ids by its names.
doc: added release note
Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Extended xstats API in ethdev library to allow grouping of stats
logically so they can be retrieved per logical grouping managed
by the application.
Changed existing functions rte_eth_xstats_get_names and
rte_eth_xstats_get to use a new list of arguments: array of ids
and array of values. ABI versioning mechanism was used to
support backward compatibility.
Introduced two new functions rte_eth_xstats_get_all and
rte_eth_xstats_get_names_all which keeps functionality of the
previous ones (respectively rte_eth_xstats_get and
rte_eth_xstats_get_names) but use new API inside.
test-pmd: add support for new xstats API retrieving by id in
testpmd application: xstats_get() and
xstats_get_names() call with modified parameters.
doc: add description for modified xstats API
Documentation change for modified extended statistics API functions.
The old API only allows retrieval of *all* of the NIC statistics
at once. Given this requires a MMIO read PCI transaction per statistic
it is an inefficient way of retrieving just a few key statistics.
Often a monitoring agent only has an interest in a few key statistics,
and the old API forces wasting CPU time and PCIe bandwidth in retrieving
*all* statistics; even those that the application didn't explicitly
show an interest in.
The new, more flexible API allow retrieval of statistics per ID.
If a PMD wishes, it can be implemented to read just the required
NIC registers. As a result, the monitoring application no longer wastes
PCIe bandwidth and CPU time.
Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Signed-off-by: Kuba Kozak <kubax.kozak@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Compilation error seen while compiling mlx5 in debug mode
under RHEL 7.3:
rte_ethdev.h:1670:7: error: type of bit-field 'state' is a GCC extension
[-Werror=pedantic]
Address it by removing the unnecessary bit-field width limitation.
Fixes: d52268a8b24b ("ethdev: expose device states")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Since the PCI functionality has been moved to the PCI specific ethdev
header we don't need to include rte_pci.h from here anymore.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This moves the rte_eth_copy_pci_info() into the PCI specific ethdev
header. As a side effect this also removes it from the list of symbols
exported by the rte_ethdev library.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This removes the now unused struct eth_driver.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This removes the now unused rte_eth_dev_pci_probe() and
rte_eth_dev_pci_remove() functions.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Today eth_dev_attach_secondary is defined as static and can only be
called by pci drivers. However, the functionality is also required for
non-pci drivers - so the patch export the function.
Signed-off-by: Ami Sabo <amis@radware.com>
This iterator helps applications iterate over the device list and skip
holes caused by invalid or detached devices.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The hotplug API introduced multiple states for a device with possible
values defined internally, while the related field in struct rte_eth_dev
was made public.
Exposing those states improves consistency because applications have to
deal with the device list directly.
"DEV_DETACHED" is renamed "RTE_ETH_DEV_UNUSED" to better reflect that
the emptiness of a slot is not necessarily the result of detaching a
device.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Introduce a new API to get the status of a descriptor.
For Rx, it is almost similar to rx_descriptor_done API, except it
differentiates "used" descriptors (which are hold by the driver and not
returned to the hardware).
For Tx, it is a new API.
The descriptor_done() API, and probably the rx_queue_count() API could
be replaced by this new API as soon as it is implemented on all PMDs.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add a new API to force free consumed buffers on Tx ring. API will return
the number of packets freed (0-n) or error code if feature not supported
(-ENOTSUP) or input invalid (-ENODEV).
Signed-off-by: Billy McFall <bmcfall@redhat.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The check of queue_id is done in all drivers implementing
rte_eth_rx_queue_count(). Factorize this check in the generic function.
Note that the nfp driver was doing the check differently, which could
induce crashes if the queue index was too big.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The API comments are not consistent between each other.
The function rte_eth_rx_queue_count() returns the number of used
descriptors on a receive queue.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
remove the following API's:
rte_eth_dev_set_vf_rxmode
rte_eth_dev_set_vf_rx
rte_eth_dev_set_vf_tx
rte_eth_dev_set_vf_vlan_filter
rte_eth_dev_set_vf_rate_limit
Increment LIBABIVER in Makefile
Remove deprecation notice for removing rte_eth_dev_set_vf_* API's.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
This patch adds a new API 'rte_eth_dev_fw_version_get' for
fetching firmware version by a given device.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
If these flags are advertised by a PMD, the NIC supports the MACsec
offload. The incoming MACsec traffics can be offloaded transparently
after the MACsec offload is configured correctly by the application.
And the application can set the PKT_TX_MACSEC flag in mbufs to enable
the MACsec offload for the packets to be transmitted.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
This commit adds a below event type:
- RTE_ETH_EVENT_MACSEC
This event will occur when the PN counter in a MACsec connection
reaches the exhaustion threshold.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Added API for `rte_eth_tx_prepare`
uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
Added fields to the `struct rte_eth_desc_lim`:
uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */
uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU */
These fields can be used to create valid packets according to the
following rules:
* For non-TSO packet, a single transmit packet may span up to
"nb_mtu_seg_max" buffers.
* For TSO packet the total number of data descriptors is "nb_seg_max",
and each segment within the TSO may span up to "nb_mtu_seg_max".
Added functions:
int
rte_validate_tx_offload(struct rte_mbuf *m)
to validate general requirements for tx offload set in mbuf of packet
such a flag completness. In current implementation this function is
called optionaly when RTE_LIBRTE_ETHDEV_DEBUG is enabled.
int rte_net_intel_cksum_prepare(struct rte_mbuf *m)
to prepare pseudo header checksum for TSO and non-TSO tcp/udp packets
before hardware tx checksum offload.
- for non-TSO tcp/udp packets full pseudo-header checksum is
counted and set.
- for TSO the IP payload length is not included.
int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)
this function uses same logic as rte_net_intel_cksum_prepare, but
allows application to choose which offloads should be taken into
account, if full preparation is not required.
PERFORMANCE TESTS
-----------------
This feature was tested with modified csum engine from test-pmd.
The packet checksum preparation was moved from application to Tx
preparation step placed before burst.
We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)
We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.
IMPACT:
1) For unimplemented Tx preparation callback the performance impact is
negligible,
2) For packet condition check without checksum modifications (nb_segs,
available offloads, etc.) is 14626628/14252168 (~2.62% drop),
3) Full support in ixgbe driver (point 2 + packet checksum
initialization) is 14060924/13588094 (~3.48% drop)
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>