Commit Graph

132 Commits

Author SHA1 Message Date
Vipin Varghese
403232b817 net/tap: update tap index to unsigned
Updating the logic to reflect unsigned integer as index for TAP PMD.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-06-14 19:27:50 +02:00
Ophir Munk
42ec78eaeb net/tap: fix keep-alive queue not detached
The TAP keep-alive queue was created in order to keep the TAP device
in Linux even in case all of its Rx/Tx queues are released (in Linux
terminology: even in case all of the TAP device file descriptors are
closed), however, the keep-alive queue itself is attached to the TAP
device like all other Rx/Tx queues and therefore the kernel will
enqueue to it some Rx packets based on the kernel RSS distribution
rules. Those packets are unknown to the application and will remain
lost in the keep-alive queue.
All queues are attached by default to the TAP device after they are
created though TUNSETIFF ioctl call.
The fix is to detach the keep-alive queue after its creation through
TUNSETQUEUE ioctl call.

Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-25 17:07:40 +02:00
Vipin Varghese
d8dc42fab6 net/tap: fix vdev data sharing for tun
Enables TUN PMD sharing by attaching the port from the shared data.

Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Vipin Varghese
d1883bf1de net/tap: fix protocol field
The TX function is shared between TAP and TUN PMD. Checking TUN-TAP
type field will ensure the TAP PMD will always have protocol field
as 0.

Fixes: 204d026a39 ("net/tap: support tun")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
3101191c63 net/tap: fix device removal when no queue exist
TAP device is created following its first queue creation. Multiple
queues can be added or removed over time. In Linux terminology those
are file descriptors which are opened or closed over time. As long as
the number of opened file descriptors is positive - TAP device will
appear as a Linux device. In case all queues are released (the
equivalent of all file descriptors being closed) the TAP device will
be removed. This can lead to abnormalities in different scenarios
where the TAP device should exist even if all its queues are released.
In order to make TAP existence independent of its number of queues -
an extra file descriptor is opened on TAP creation and is closed on
TAP closure. Its only purpose is to serve as a keep-alive mechanism
for the TAP device.

Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
2f5045c51c net/tap: support RSS hash update
Add RSS hash update callback to eth_dev_ops.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 19:07:08 +02:00
Ophir Munk
2ef1c0da89 net/tap: fix isolation mode toggling
Running testpmd command "flow isolae <port> 0" (i.e. disabling flow
isolation) followed by command "flow isolate <port> 1" (i.e. enabling
flow isolation) may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: when disabling flow isolation we keep the local
rule to redirect packets on TX (TAP_REMOTE_TX index) while we add it
again when enabling flow isolation. As a result this rule is added
two times in a row which results in "File exists" error.
The fix is to identify the "File exists" error and silently ignore it.

Another issue occurs when enabling isolation mode several times in a
row in which case the same tc rules are added consecutively and
rte_flow structs are added to a linked list before removing the
previous rte_flow structs.
The fix is to act upon isolation mode command only when there is a
change from "0" to "1" (or vice versa).

Fixes: f503d26948 ("net/tap: support flow API isolated mode")
Cc: stable@dpdk.org

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 16:01:05 +02:00
Vipin Varghese
f5fd98c802 net/tap: add default name to tun
The change adds default name to reflect TUN PMD instance. if option
name is not passed, the default dtun is taken.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:32:23 +01:00
Thomas Monjalon
fbe90cdd77 ethdev: add probing finish function
A new hook function is added and called inside the PMDs at the end
of the device probing:
	- in primary process, after allocating, init and config
	- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Wei Dai
a4996bd89c ethdev: new Rx/Tx offloads API
This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.

This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().

In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:51 +01:00
Ophir Munk
8d54ede738 net/tap: report on supported RSS hash functions
Report on TAP supported RSS functions as part of dev_infos_get
callback: ETH_RSS_IP, ETH_RSS_UDP and ETH_RSS_TCP.
Known limitation: TAP supports all of the above hash functions together
and not in partial combinations.
Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations will be
verified.

[1] commit 8863a1fbfc ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:31:51 +01:00
Stephen Hemminger
1b3b7caeb1 net/tap: convert to dynamic logging
Use new logging macro to convert all calls to RTE_LOG() into
new dynamic log type.

Also fix whitespace.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 18:00:59 +01:00
Ophir Munk
ead63dd318 net/tap: return empty port offload capabilities
Fix internal report on port specific offload capabilities to be 0 (no
capabilities). Before this commit port capabilities were a clone of queue
capabilities, however the current TAP offload capabilities (e.g.
checksum calculation) are per queue and are not specific per port.
This commit fixes an internal validation check for new configured
queue offloads.
The port capability API keeps reporting all queue capabilities as port
capabilities.

Fixes: 95ae196ae1 ("net/tap: use new Rx offloads API")
Fixes: 818fe14a98 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-04-27 18:00:57 +01:00
Adrien Mazarguil
76e9a55b5b ethdev: add transfer attribute to flow API
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
e58638c324 ethdev: fix TPID handling in flow API
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.

Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:

 Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
 rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]

Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:

 Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
 rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
 rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]

While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.

This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:

- bnxt: EtherType matching is supported with and without VLAN, but TPID
  matching is not and triggers an error.

- e1000: EtherType matching is only supported with the ETHERTYPE filter,
  which does not support VLAN matching, therefore no impact.

- enic: same as bnxt.

- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
  values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
  EtherType matching.

- ixgbe: same as e1000, with additional minor change to rely on the new
  E-Tag macro definition.

- mlx4: EtherType/TPID matching is not supported, no impact.

- mlx5: same as bnxt.

- mvpp2: same as bnxt.

- sfc: same as bnxt.

- tap: same as bnxt.

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")
Fixes: 99e7003831 ("net/ixgbe: parse L2 tunnel filter")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
18aee2861a ethdev: add encap level to RSS flow API action
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.

When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.

This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.

Contrary to the change announced by commit 676b605182 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
929e331934 ethdev: add hash function to RSS flow API action
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.

Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
ac8d22de23 ethdev: flatten RSS configuration in flow API
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.

A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.

This patch updates all PMDs and example applications accordingly.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

[1] commit 676b605182 ("doc: announce ethdev API change for RSS
    configuration")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
cc17feb904 ethdev: alter behavior of flow API actions
This patch makes the following changes to flow rule actions:

- List order now matters, they are redefined as performed first to last
  instead of "all simultaneously".

- Repeated actions are now supported (e.g. specifying QUEUE multiple times
  now duplicates traffic among them). Previously only the last action of
  any given kind was taken into account.

- No more distinction between terminating/non-terminating/meta actions.
  Flow rules themselves are now defined as always terminating unless a
  PASSTHRU action is specified.

These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matter when combined.

Previously one would have to do so through multiple flow rules by combining
PASSTRHU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.

This breaks ABI compatibility for the following public functions:

- rte_flow_create()
- rte_flow_validate()

PMDs with rte_flow support are modified accordingly:

- bnxt: no change, implementation already forbids multiple actions and does
  not support PASSTHRU.

- e1000: no change, same as bnxt.

- enic: modified to forbid redundant actions, no support for default drop.

- failsafe: no change needed.

- i40e: no change, implementation already forbids multiple actions.

- ixgbe: same as i40e.

- mlx4: modified to forbid multiple fate-deciding actions and drop when
  unspecified.

- mlx5: same as mlx4, with other redundant actions also forbidden.

- sfc: same as mlx4.

- tap: implementation already complies with the new behavior except for
  the default pass-through modified as a default drop.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Ferruh Yigit
18869f97f1 drivers/net: fix link autoneg value for virtual PMDs
These drivers never attempt link speed negotiation. Change link_autoneg
value to ETH_LINK_FIXED to be more accurate and consistent between PMDs.

Fixes: 1e3a958f40 ("ethdev: fix link autonegotiation value")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-27 17:34:43 +01:00
Ferruh Yigit
6db80eb7c9 net/tap: fix icc build
build error:
.../dpdk/drivers/net/tap/rte_eth_tap.c(598):
error #279: controlling expression is constant
	RTE_ASSERT(!"unsupported request type: must not happen");

Although RTE_ASSERT helps debugging this issue when assert enabled,
constant expression in assert means this path can be taken during
runtime and there is no protection against it when assert is disabled.

Adding error log and error return back, replacing RTE_ASSERT.

Fixes: 7748a4b441 ("net/tap: add debug messages")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2018-04-27 17:34:42 +01:00
Vipin Varghese
8055b534d6 net/tap: fix protocol field for non-IP
When non IP packets are sent on TUN interface, the logic put Ipv6 as
protocol field in header. With the current patch, the check is modified
for ipv4, ipv6 and non ip.

Fixes: 204d026a39 ("net/tap: support tun")

Suggested-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 15:54:55 +01:00
Jianfeng Tan
ee27edbe0c drivers/net: share vdev data to secondary process
dpdk-procinfo, as a secondary process, cannot fetch stats for vdev.

This patch enables that by attaching the port from the shared data.
We also fill the eth dev ops, with only some ops works in secondary
process, for example, stats_get().

Note that, we still cannot Rx/Tx packets on the ports which do not
support multi-process.

Reported-by: Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-04-24 12:37:31 +02:00
Jianfeng Tan
5f19dee604 drivers/net: do not use private ethdev data
We introduced private rte_eth_dev_data to allow vdev to be created
both in primary process and secondary process(es). This is not
friendly to multi-process model, for example, it leads to port id
contention issue if two processes both find the data entry is free.

And to get stats of primary vdev in secondary, we must allocate
from the pre-defined array so that we can find it.

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-04-24 12:33:51 +02:00
Olivier Matz
caccf8b318 ethdev: return diagnostic when setting MAC address
Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
return code is added to notify the caller (librte_ether) if an error
occurred in the PMD.

The new default MAC address is now copied in dev->data->mac_addrs[0]
only if the operation is successful.

The patch also updates all the PMDs accordingly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-14 00:43:30 +02:00
Ferruh Yigit
cd8c7c7ce2 ethdev: replace bus specific struct with generic dev
Public struct rte_eth_dev_info has a "struct rte_pci_device" field in it
although it is common for all ethdev in all buses.

Replacing pci specific struct with generic device struct and updating
places that are using pci device in a way to get this information from
generic device.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-14 00:41:44 +02:00
Vipin Varghese
58f7db4396 net/tap: add tun log and documentation
The changes add TUN|TAP specific logs and documentation support.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-04-14 00:41:44 +02:00
Vipin Varghese
204d026a39 net/tap: support tun
The change adds functional TUN PMD logic to the existing TAP PMD.
TUN PMD can be initialized with 'net_tunX' where 'X' represents unique id.
PMD supports argument interface, while MAC address and remote are not
supported.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-04-14 00:41:44 +02:00
Pavan Nikhilesh
28aa16db4a net/tap: fix memcpy with incorrect size
Fix incorrect sizeof operation being used for getting mac addr size.

Found while compiling with arm64 clang.
drivers/net/tap/rte_eth_tap.c:1410:40: error: argument to 'sizeof' in
    'memcpy' call is the same pointer type 'struct ether_addr *' as the
    destination; expected 'struct ether_addr' or an explicit length
    [-Werror,-Wsizeof-pointer-memaccess]
       rte_memcpy(&pmd->eth_addr, mac_addr, sizeof(mac_addr));
                  ~~~~~~~~~~~~~~            ^~~~~~~~~~~~~~~~

Fixes: bcab6c1d27 ("net/tap: allow user MAC to be passed as args")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
2018-04-14 00:40:21 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Bruce Richardson
c022cb400e convert snprintf to strlcpy
Since we have support for the strlcpy function in DPDK, replace all
instances where a string is copied using snprintf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Vipin Varghese
bcab6c1d27 net/tap: allow user MAC to be passed as args
Allow TAP PMD to pass user desired MAC address as argument.
The argument value is processed as string delimited by  ':',
is parsed and converted to HEX MAC address after validation.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-03-30 14:08:43 +02:00
Ophir Munk
ba2f43464e net/tap: fix promiscuous rules double insertions
Running testpmd command "port stop all" followed by command "port start
all" may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: during the execution of "port start all" command
testpmd calls rte_eth_promiscuous_enable() while during the execution
of "port stop all" command testpmd does not call
rte_eth_promiscuous_disable().
As a result the TAP PMD is trying to add tc (traffic control command)
promiscuous rules to the remote netvsc device consecutively. From the
kernel point of view it is seen as an attempt to add the same rule more
than once. In recent kernels (e.g. version 4.13) this attempt is rejected
with a "File exists" error. In less recent kernels (e.g. version 4.4) the
same rule may have been successfully accepted twice, which is undesirable.

In the corrupted code every tc promiscuous rule included a different
handle number parameter. If instead an identical handle number is
used for all tc promiscuous rules - all kernels will reject the second
identical rule with a "File exists" error, which is easy to identify and
to silently ignore.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-14 15:42:39 +01:00
Ophir Munk
3382eb3a3a net/tap: add CRC stripping capability
CRC stripping is executed in the kernel outside of TAP PMD scope.
There is no prevention that the TAP PMD will report on Rx CRC
stripping capability.
In the corrupted code, TAP PMD did not report on this capability.
The fix enables TAP PMD to report that Rx CRC stripping is supported.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-02-13 18:17:30 +01:00
Moti Haimovsky
cb7e68da63 net/tap: fix cleanup on allocation failure
This patch complements the partial cleanup done inside
eth_dev_tap_create when the routine failed.
Such a failure left a non-functional device attached to the system.

Fixes: 050fe6e9ff ("drivers/net: use ethdev allocation helper for vdev")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-08 13:18:07 +01:00
Ophir Munk
54aad2f231 net/tap: fix eBPF handling of non-RSS flows
The eBPF classifier (section "cls_q" in tap_bpf_program.c) is tracing
marked packets in which skb->cb[1] contains an RSS queue number, and
redirects those packets to the matched queue.
It is expected that skb->cb[1] has been previously set with a valid RSS
queue number during an eBPF action (section "l3_l4" in tap_bpf_program.c).
However, for non-RSS flows, skb->cb[1] may contain a random unset value,
which could falsely be interpreted as a valid RSS queue.
To avoid this potential error, tap_bpf_program.c has been updated as
follows:
1. After calculating the RSS queue number, it is added a unique offset in
order to uniquely identify it as a valid RSS queue number.
2. After matching an RSS queue to a packet, skb->cb[1] is set to 0.

Fixes: cdc07e83bb ("net/tap: add eBPF program file")
Fixes: aabe70df73 ("net/tap: add eBPF bytes code")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-05 19:56:04 +01:00
Ophir Munk
9abaad8d68 net/tap: fix multi segments capability
TAP device is supporting multi segments Tx, however this capability is
not reported when querying the TAP device.
This commit adds this capability report.

Fixes: 818fe14a98 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-05 19:56:04 +01:00
Zhiyong Yang
d98fac4dae net/tap: fix icc build
The following error is reported when compiling 18.02-rc2 using ICC,
"transfer of control bypasses initialization of".
The patch fixes the issue.

Fixes: 1911c5edc6 ("net/tap: fix eBPF RSS map key handling")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
2018-02-01 10:31:18 +01:00
Olivier Matz
acfe2bd440 net/tap: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-02-01 02:33:04 +01:00
Vipin Varghese
22e8c9f341 net/tap: remove speed argument
TAP is a virtual device created on Kernel. The speed of interface is
set by Kernel to a fixed static value. But this does not prevent using
RX or TX to rate limit. Hence removing the option from user arguments.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
1911c5edc6 net/tap: fix eBPF RSS map key handling
This commit addresses a case of RSS and non RSS flows mixture.
In the corrupted code, if a non-RSS flow is destroyed, it has an eBPF map
index 0 (initialized upon flow creation). This might unintentionally remove
RSS entry 0 from eBPF map.
To fix this issue, add an offset to the real index during a KEY_CMD_GET
operation, and subtract this offset during a KEY_CMD_RELEASE operation, in
order to restore the real index.
Thus, if a non RSS flow is falsely trying to release map entry 0 - The
offset subtraction will calculate the real map index as an
out-of-range value, and the release operation will be silently ignored.

Fixes: 036d721a82 ("net/tap: implement RSS using eBPF")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
da8841a713 net/tap: fix eBPF file descriptors leakage
When a user creates an RSS rule, the tap PMD dynamically allocates
a 'flow' data structure, and uploads BPF programs (represented by file
descriptors) to the kernel.
The kernel might reject the rule (due to filters overlap, for example)
in which case, flow memory should be freed and BPF file descriptors
should be closed.
In the corrupted code there were scenarios where BPF file descriptors
were not closed.
The fix is to add a new function - tap_flow_free(), which will make sure
to always close BPF file descriptors before freeing the flow allocated
memory.

Fixes: 036d721a82 ("net/tap: implement RSS using eBPF")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
e56447980e net/tap: fix build on ARM32
This commit adds eBPF system call definitions for ARM architecture.
Old Linux header files may not define eBPF system call numbers.
In order to successful compile eBPF on all Linux platforms - the
missing ARM system call definition is explicitly added.

Fixes: b02d85e1 ("net/tap: add eBPF API")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-01-31 19:14:47 +01:00
Gowrishankar Muthukrishnan
3eaad8e530 net/tap: fix build on ppc
This patch defines __NR_bpf for powerpc architecture and hence,
fixes compiling tap driver for this architecture.

Fixes: b02d85e1 ("net/tap: add eBPF API")

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-30 14:28:14 +01:00
Moti Haimovsky
4870a8cdd9 net/tap: support Rx interrupt
This patch adds support for registering and waiting for Rx interrupts.
This allows applications to wait for Rx events from the PMD using the
DPDK rte_epoll subsystem.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-29 10:45:20 +01:00
Ophir Munk
e011bad02d net/tap: use local eBPF definitions
eBPF has a graceful approach: it must successfully compile on all Linux
distributions. If a specific kernel cannot support eBPF it will gracefully
refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere all the
required definitions for TAP have been locally added instead of including
file <linux/bpf.h>

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
2018-01-24 19:03:29 +01:00
Ferruh Yigit
ffc905f3b8 ethdev: separate driver APIs
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.

There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.

More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-22 01:26:49 +01:00
Ophir Munk
036d721a82 net/tap: implement RSS using eBPF
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.

TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.

TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.

eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.

TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
b02d85e19c net/tap: add eBPF API
This commit include BPF API to be used by TAP.

tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
aabe70df73 net/tap: add eBPF bytes code
File tap_bpf_insns.h was added. It includes  eBPF bytes code
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytes code is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
   clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
   -filetype=obj -o <tap_bpf_program.o>

clang version must be 3.7 and above
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel

2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using an BPF systm call. The external tool that generates the C arrays
will be added in separate commits.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00