numam-dpdk

Author	SHA1	Message	Date
Jerin Jacob	3a6f2c50b9	eal: introduce rte version of fls The function returns the last (most-significant) bit set. Added unit testcase to verify rte_fls_u32(). Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2018-11-12 13:25:01 +01:00
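Note on 3a6f2c50b9: the message says the function returns the last (most-significant) bit set. A minimal portable sketch of that idea follows. The name `sketch_fls_u32` is hypothetical, and the 1-based result with 0 for a zero input is an assumption borrowed from the classic `fls()` convention, not taken from the DPDK source.

```c
#include <stdint.h>

/* Hypothetical helper, not the DPDK implementation.
 * Assumed convention: return the 1-based position of the most-significant
 * set bit, or 0 when no bit is set. */
static inline int sketch_fls_u32(uint32_t x)
{
    int pos = 0;

    /* Shift right until the value is exhausted; the number of shifts
     * equals the position of the highest set bit. */
    while (x != 0) {
        x >>= 1;
        pos++;
    }
    return pos;
}
```

A production version would more likely use a count-leading-zeros builtin; the loop is used here only to keep the sketch compiler-independent.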
Ferruh Yigit	fac66b1295	test: fix build With "make -C test/" command getting following warnings: awk: cmd. line:1: fatal: cannot open file `/cmdline_test/cmdline_test/' for reading (No such file or directory) awk: cmd. line:1: fatal: cannot open file `/test-pipeline/test-pipeline/' for reading (No such file or directory) awk: cmd. line:1: fatal: cannot open file `/test-acl/test-acl/' for reading (No such file or directory) This is because unexpected/invalid MAPFILE param passed to check-experimental-syms.sh There is no easy way to unify MAPFILE for different build options, instead add an input verification to script, and silently ignore wrong values. Fixes: `a6ec31597a` ("mk: add experimental tag check") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2018-11-12 01:38:39 +01:00
Kevin Laatz	fe1570a73e	telemetry: fix shared link with make Currently, telemetry is not working for shared builds in make. The --as-needed flag is preventing telemetry from being linked as there are no direct API calls from the app to telemetry. This is causing the --telemetry option to not be recognized by EAL. Telemetry registers its EAL option using the RTE_INIT constructor. Since EAL's option parsing is done before the plugins init, the --telemetry option isn't registered at the time of parsing, and as a result, the --telemetry option is not being recognized. This patch fixes this issue by explicitly linking telemetry to the application by setting the "--no-as-needed" flag for the library in mk/rte.app.mk. Fixes: `8877ac688b` ("telemetry: introduce infrastructure") Reported-by: Yanjie Xu <yanjie.xu@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-12 01:21:58 +01:00
Thomas Monjalon	6bdf144553	eal/x86: remove unused memcpy file The use of rte_memcpy_ptr was removed in revert below, but it was missing removing the file arch/x86/rte_memcpy.c. Fixes: `d35cc1fe6a` ("eal/x86: revert select optimized memcpy at run-time") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-12 00:11:46 +01:00
Thomas Monjalon	c7ad7754f8	devargs: do not replace already inserted device The devargs of a device can be replaced by a newly allocated one when trying to probe again the same device (multi-process or multi-ports scenarios). This is breaking some pointer references. It can be avoided by copying the new content, freeing the new devargs, and returning the already inserted pointer. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2018-11-12 00:10:21 +01:00
Marko Kovacevic	86ce81af73	examples/fips_validation: fix uninitialized access Fixing a bug raised in coverity using uninitialized value. Coverity issue: 325881 Fixes: `527cbf3d5e` ("examples/fips_validation: support TDES parsing") Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>	2018-11-07 14:53:36 +01:00
Alejandro Lucero	ee0e074f81	mem: fix DMA mask width sanity check Current code has different max DMA mask width values for 32 and 64 bits systems. IOMMU hardware could report a higher supported width than current MAX_DMA_MASK_BITS when RTE_ARCH_64 is not defined. This is actually true with a 32 bits kernel running in a 64 bits server with IOMMU hardware. This could also be a problem with embedded systems using an IOMMU designed for 64 bits in a 32 bits system. This patch leaves a single max DMA mask width which will make sure the mask width is within the range for 64 bits variables used for DMA mask. This also will avoid wrong values because any value higher than 64 bits is likely wrong. Fixes: `223b7f1d5e` ("mem: add function for checking memseg IOVA") Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-07 14:42:28 +01:00
Anatoly Burakov	4531d096d1	mem: fix use after free in legacy mem init Adding an additional failure path in DMA mask check has exposed an issue where `hugepage` pointer may point to memory that has already been unmapped, but pointer value is still not NULL, so failure handler will attempt to unmap it second time if DMA mask check fails. Fix it by setting `hugepage` pointer to NULL once it is no longer needed. Coverity issue: 325730 Fixes: `165c89b845` ("mem: use DMA mask check for legacy memory") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-07 00:06:38 +01:00
Marko Kovacevic	9252e81a9f	examples/fips_validation: fix unitialized variables Fixed compilation issue with variable which may be used uninitialized. Fixes: `527cbf3d5e` ("examples/fips_validation: support TDES parsing") Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>	2018-11-06 12:13:49 +01:00
Thomas Monjalon	c59b06294f	version: 18.11-rc2 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-06 03:27:49 +01:00
Dharmik Thakkar	57ed574e26	test/hash: fix build Enable print_key_info() function compilation always. Compilation error message: 'test_hash.c: In function ‘print_key_info’: test_hash.c:90:15: error: cast discards ‘const’ qualifier from pointer target type [-Werror=cast-qual] uint8_t *p = (uint8_t *)key; ^ cc1: all warnings being treated as errors' Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2018-11-06 03:20:26 +01:00
Naga Suresh Somarowthu	f5a09cf019	test/hash: reduce time for multiwriter test Reduced test duration for hash_multiwriter_autotest. Number of entries and total insertions are reduced such that the duration is less than 10 seconds. Signed-off-by: Naga Suresh Somarowthu <naga.sureshx.somarowthu@intel.com> Acked-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>	2018-11-06 03:20:26 +01:00
Naga Suresh Somarowthu	737f5813a7	test: reduce time for function reentrancy test Reduced test duration for func_reentrancy_autotest. Reduced MAX_LPM_ITER_TIMES, introduced new macro MAX_ITER_ONCE to reduce the unique key check and altered the macro MAX_ITER_TIMES to MAX_ITER_MULTI. Combined for loops thereby reduced snprintf calls and repeated iterations. Such that the duration is less than 10 seconds. Signed-off-by: Naga Suresh Somarowthu <naga.sureshx.somarowthu@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2018-11-06 03:20:26 +01:00
Bruce Richardson	a5563b85f6	test: allow taking extra arguments from environment When running unit tests automatically, either via script, from meson, or otherwise, the same set of options may be used for each run, for example to set a standard coremask to be used for all tests. To facilitate this, this patch adds support for the test binary taking additional EAL parameters from the environment and appending them to the argc/argv list passed to eal init. This allows parameter modification without having to edit test scripts etc. There are now two environment variables which can be used for running tests: * DPDK_TEST - (added previously) passes the test name to be run automatically rather than running the app interactively. Used by "meson test" when running tests individually or as part of a suite. * DPDK_TEST_PARAMS - new parameter to specify the commandline arguments to use with the test binary. For example to run a test, or tests, on only 16 lcores, and to skip pci scan we can set this to "-l 0-15 --no-pci". Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Luca Boccassi <bluca@debian.org>	2018-11-06 03:20:26 +01:00
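Note on a5563b85f6: the message describes taking extra EAL parameters from the environment and appending them to the argc/argv list. The sketch below shows one way this could be done. `sketch_append_args` is a hypothetical helper written for this note, not the code from the patch, and splitting on spaces and tabs only (no quoting support) is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build a new argv holding the original arguments
 * followed by the whitespace-separated tokens of `params`.
 * Returns the new argument count, or -1 on allocation failure.
 * Token storage is left allocated for the life of the process. */
static int sketch_append_args(int argc, char **argv, const char *params,
                              char ***out_argv)
{
    char *copy, *tok, **all;
    int n = argc;

    if (params == NULL) {           /* variable not set: nothing to add */
        *out_argv = argv;
        return argc;
    }

    copy = malloc(strlen(params) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, params);

    /* Upper bound on tokens is len/2 + 1, plus one slot for the NULL. */
    all = malloc((argc + strlen(params) / 2 + 2) * sizeof(*all));
    if (all == NULL) {
        free(copy);
        return -1;
    }
    memcpy(all, argv, argc * sizeof(*all));

    /* strtok() modifies `copy` in place; tokens point into it. */
    for (tok = strtok(copy, " \t"); tok != NULL; tok = strtok(NULL, " \t"))
        all[n++] = tok;
    all[n] = NULL;

    *out_argv = all;
    return n;
}
```

A caller would pass `getenv("DPDK_TEST_PARAMS")` as `params` and hand the resulting count and vector to EAL initialization.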
Ori Kam	9af4eb5657	examples/flow_filtering: remove VLAN item Since the VLAN is not in use and some PMD can't support vlan = 0 this item was removed. Fixes: `4a3ef59a10` ("examples/flow_filtering: add simple demo of flow API") Cc: stable@dpdk.org Signed-off-by: Ori Kam <orika@mellanox.com>	2018-11-06 02:35:39 +01:00
Ori Kam	c82f2f8b4b	examples/flow_filtering: filter out unsupported offloads Some of the requested offloads are not supported by all devices. This patch fixes this issue by setting only the supported offloads. Fixes: `feca6c428a` ("examples/flow_filtering: add Tx queues setup process") Cc: stable@dpdk.org Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Wei Zhao <wei.zhao1@intel.com>	2018-11-06 02:35:34 +01:00
Fan Zhang	a38eafedda	bus/pci: fix config r/w access The recent change to rte_pci_read/write_config() missed uio_pci_generic case. Fixes: `630deed612` ("bus/pci: compare kernel driver instead of interrupt handler") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-06 02:11:25 +01:00
Konstantin Ananyev	b8d5dfd4a5	ip_frag: use key length for key comparison Right now reassembly code relies on src_dst[] being all zeroes to determine whether an entry in the fragments table is free or occupied. This is suboptimal and error prone - user can crash DPDK ip_reassembly app by something like the following scapy script: x=Ether(src=...,dst=...)/IP(dst='0.0.0.0',src='0.0.0.0',id=0)/('X'*1000) frags=fragment(x, fragsize=500) sendp(frags, iface=...) To overcome that issue and reduce overhead of 'key invalidate' and 'key is empty' operations - add key_len into the key comparison procedure. Fixes: `4f1a8f6338` ("ip_frag: add IPv6 reassembly") Cc: stable@dpdk.org Reported-by: Ryan E Hall <ryan.e.hall@intel.com> Reported-by: Alexander V Gutkin <alexander.v.gutkin@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2018-11-06 01:58:11 +01:00
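Note on b8d5dfd4a5: the message explains that an all-zero `src_dst[]` cannot double as the "free slot" marker, because a packet with 0.0.0.0 addresses and id 0 produces exactly that key. The sketch below illustrates the `key_len` approach. The struct layout and all `sketch_*` names are assumptions made for this note, not the library's definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flow key; field names follow the commit text, the layout
 * is assumed for illustration only. */
struct sketch_frag_key {
    uint64_t src_dst[4]; /* source/destination addresses */
    uint32_t id;         /* datagram identifier */
    uint32_t key_len;    /* src_dst[] words in use; 0 marks a free slot */
};

/* "Key is empty" becomes a single integer test. */
static int sketch_key_is_empty(const struct sketch_frag_key *k)
{
    return k->key_len == 0;
}

/* "Key invalidate" becomes a single store. */
static void sketch_key_invalidate(struct sketch_frag_key *k)
{
    k->key_len = 0;
}

/* Keys match only if length, id and the used address words all match. */
static int sketch_key_equal(const struct sketch_frag_key *a,
                            const struct sketch_frag_key *b)
{
    if (a->key_len != b->key_len || a->id != b->id)
        return 0;
    return memcmp(a->src_dst, b->src_dst,
                  a->key_len * sizeof(a->src_dst[0])) == 0;
}
```

With this scheme an all-zero IPv4 key with `key_len == 1` is an occupied entry and can no longer be confused with a free one.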
Konstantin Ananyev	7f0983ee33	ip_frag: check fragment length of incoming packet Under some conditions ill-formed fragments might cause reassembly code to corrupt mbufs and/or crash. For example, the following fragment sequence: <ofs=0,len=100, flags=MF> <ofs=96,len=100, flags=MF> <ofs=200,len=0,flags=MF> <ofs=200,len=100,flags=0> can trigger the problem. To overcome such a situation, added a check that the fragment length of an incoming packet is greater than zero. Fixes: `601e279df0` ("ip_frag: move fragmentation/reassembly headers into a library") Fixes: `4f1a8f6338` ("ip_frag: add IPv6 reassembly") Cc: stable@dpdk.org Reported-by: Ryan E Hall <ryan.e.hall@intel.com> Reported-by: Alexander V Gutkin <alexander.v.gutkin@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2018-11-06 01:58:03 +01:00
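Note on 7f0983ee33: the fix amounts to rejecting zero-length fragments before they reach the reassembly table. A minimal sketch of such a guard, using a hypothetical `struct sketch_frag` that mirrors the `<ofs, len, flags>` notation of the message (it is not the library's data structure):

```c
#include <stdint.h>

/* Hypothetical fragment descriptor mirroring <ofs, len, flags=MF>. */
struct sketch_frag {
    uint16_t ofs;        /* offset of this fragment in the datagram */
    uint16_t len;        /* payload length of this fragment */
    int      more_frags; /* non-zero when the MF flag is set */
};

/* Return 0 if the fragment may be passed on to reassembly, -1 if it
 * must be dropped because it carries no payload. */
static int sketch_frag_check(const struct sketch_frag *f)
{
    if (f->len == 0)
        return -1;
    return 0;
}
```

Applied to the sequence in the message, only the third fragment (`ofs=200, len=0`) is rejected.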
Ferruh Yigit	7b178300ac	vhost: fix possible out of bound access Fixes: `d7280c9fff` ("vhost: support selective datapath") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-06 01:14:23 +01:00
Ferruh Yigit	c8b506e4b6	service: fix possible null access Fixes: `21698354c8` ("service: introduce service cores concept") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2018-11-06 01:14:15 +01:00
Ferruh Yigit	9eb0688412	lib: fix shifting 32-bit signed variable 31 times Fix cppcheck warning by marking variable as unsigned. Fixes: `dc276b5780` ("acl: new library") Fixes: `986ff526fb` ("net: add CRC computation API") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-06 01:14:05 +01:00
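Note on 9eb0688412: with a 32-bit `int`, the expression `1 << 31` shifts a signed value into the sign bit, which C leaves undefined. Making the operand unsigned avoids this. The sketch below shows the pattern; the helper name is hypothetical and is not one of the lines touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return a 32-bit value with only bit `n` set.
 * UINT32_C(1) makes the shifted operand unsigned, so n == 31 is well
 * defined, unlike the signed expression (1 << 31). */
static inline uint32_t sketch_bit_u32(unsigned int n)
{
    assert(n < 32); /* shifting by the full width or more is undefined too */
    return UINT32_C(1) << n;
}
```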
Thomas Monjalon	1ccdc31793	ethdev: remove experimental tag for iterator API After removing the function rte_eth_dev_attach(), there are two replacement solutions possible: one using probe event notification, and one using a new iterator. So the application can get the new probed ports either asynchronously or synchronously. The iterator API is new in DPDK 18.11 so they got the experimental tag by policy. It causes an issue for strict applications which do not use experimental functions, and want to use the synchronous method. The replacement for removed API should not be experimental. That's why the experimental status of the ethdev iterator is removed. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Kevin Traynor <ktraynor@redhat.com> Tested-by: Kevin Traynor <ktraynor@redhat.com>	2018-11-06 01:14:04 +01:00
Thomas Monjalon	d75d132c30	eal: remove experimental tag for probe/remove The functions rte_dev_probe() and rte_dev_remove() are new in DPDK 18.11 so they got the experimental tag by policy. However they are too much basic functions for being skipped by strict applications which do not use experimental functions. The alternative is to use rte_eal_hotplug_add() and rte_eal_hotplug_remove(), but their API requires the application to parse the devargs string in order to provide bus name, device name and driver arguments. The new function rte_dev_probe() is really simpler to use and more flexible by accepting any devargs string. Let's encourage applications to use it. The old functions rte_eal_hotplug_* may be deprecated later. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Kevin Traynor <ktraynor@redhat.com> Tested-by: Kevin Traynor <ktraynor@redhat.com>	2018-11-06 01:14:02 +01:00
Anatoly Burakov	1ccfeb7df7	malloc: fix invalid argument handling When adding memory to an external heap, do not go to unlock failure handler because the memory hotplug lock hasn't been taken out yet. Fixes: `7d75c31014` ("malloc: allow adding memory to named heaps") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-06 01:13:58 +01:00
Stephen Hemminger	7415ad0cdc	net/netvsc: fix VF link update The netvsc device calls VF (if present) to update the link status with the wrong device. This leads to errors in mlx5 device when it can't find the ifindex. Fixes: `dc7680e859` ("net/netvsc: support integrated VF") Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>	2018-11-05 20:10:54 +01:00
Ferruh Yigit	c06aee624d	net/bnxt: fix uninitialized variable access ag_cons is used uninitialized, it is used when DEBUG enabled, remove debug code. Fixes: `0958d8b643` ("net/bnxt: support LRO") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2018-11-05 18:54:48 +01:00
Beilei Xing	054d1be48c	net/i40e: fix Rx instability with vector mode Previously, there is instability during vector Rx if descriptor number is not power of 2, e.g. process hang and some Rx packets are unexpectedly empty. That's because vector Rx mode assumes Rx descriptor number is power of 2 when doing bit mask. This patch allows vector mode only when the number of Rx descriptor is power of 2. Fixes: `8e109464c0` ("i40e: allow vector Rx and Tx usage") Fixes: `a3c83a2527` ("net/i40e: enable runtime queue setup") Cc: stable@dpdk.org Signed-off-by: Beilei Xing <beilei.xing@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2018-11-05 16:50:14 +01:00
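Note on 054d1be48c: the message says vector Rx assumes the descriptor count is a power of 2 when masking. The sketch below shows the usual power-of-2 test and why a mask-based ring index goes wrong otherwise. Both function names are hypothetical and the code is not taken from the i40e driver.

```c
#include <stdint.h>

/* True when n has exactly one bit set. */
static inline int sketch_is_power_of_2(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Mask-based ring advance: equivalent to (idx + 1) % nb_desc only when
 * nb_desc is a power of 2. */
static inline uint32_t sketch_ring_next(uint32_t idx, uint32_t nb_desc)
{
    return (idx + 1) & (nb_desc - 1);
}
```

For a 512-entry ring, index 511 wraps to 0 as intended. For a 768-entry ring, the same expression maps index 767 to 512 instead of 0, which is the kind of inconsistency the patch avoids by enabling vector mode only for power-of-2 descriptor counts.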
Ferruh Yigit	37cf9ea83d	net/avf/base: fix shifting 32-bit signed variable 31 times Fixes: `e5b2a9e957` ("net/avf/base: add base code for avf PMD") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Wei Zhao <wei.zhao1@intel.com>	2018-11-05 15:58:45 +01:00
Tom Barbette	26f0488344	net/mlx5: support Rx queue count API This patch adds support for the rx_queue_count API in mlx5 driver Signed-off-by: Tom Barbette <barbette@kth.se> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
Ophir Munk	a4391f8bae	app/testpmd: set default RSS key as null When creating an RSS rule without specifying a key (see [1]) it is expected that the device will use the default key. A NULL key is used to indicate to a PMD it should use its default key, however testpmd assigns a non-NULL dummy key (see [2]) instead. This does not enable testing any PMD behavior when the RSS key is not specified. This commit fixes this limitation by setting key to NULL. [1] RSS rule example without specifying a key: flow create 0 ingress <pattern> / end actions rss queues 0 1 end / end [2] Testpmd default key assignment: .key= "testpmd's default RSS hash key, " "override it for better balancing" Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-05 15:01:25 +01:00
Dekel Peled	892cfadd59	doc: clarify testpmd guide for flow API The description of prefix for mask creation was misunderstood. I updated the description, so it is clearly understood which mask will be created by a certain prefix. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-05 15:01:25 +01:00
Zhirun Yan	ac60b2de4f	net/igb: update Tx offload mask Tx offload mask is updated in following commit `1037ed842c` ("mbuf: fix Tx offload mask") Currently, the newly added offload flags are not supported in PMD and application will fail to call PMD transmit prepare function. This patch updates IGB_TX_OFFLOAD_MASK. Fixes: `1037ed842c` ("mbuf: fix Tx offload mask") Cc: stable@dpdk.org Signed-off-by: Zhirun Yan <zhirun.yan@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-05 15:01:25 +01:00
Yongseok Koh	c1cfb132fa	net/mlx5: remove flags setting from flow preparation Even though flow_drv_prepare() takes item_flags and action_flags to be filled in, those are not used and will be overwritten by parsing of flow_drv_translate(). There's no reason to keep the flags and fill it. Appropriate notes are added to the documentation of flow_drv_prepare() and flow_drv_translate(). Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2018-11-05 15:01:25 +01:00
Yongseok Koh	9ce75f5122	net/mlx5: fix Direct Verbs flow tunnel 1) Fix layer parsing In translation of tunneled flows, dev_flow->layers must not be used to check tunneled layer as it contains all the layers parsed from flow_drv_prepare(). Checking tunneled layer is needed to distinguish between outer and inner item. This should be based on dynamic parsing. With dev_flow->layers on a tunneled flow, items will always be interpreted as inner as dev_flow->layer already has all the items. Dynamic parsing (item_flags) is added as there's no such code. 2) Refactoring code - flow_dv_create_item() and flow_dv_create_action() are merged into flow_dv_translate() for consistency with Verbs and *_validate(). Fixes: `2466364115` ("net/mlx5: fix flow tunnel handling") Fixes: `d02cb06912` ("net/mlx5: add Direct Verbs translate actions") Fixes: `fc2c498ccb` ("net/mlx5: add Direct Verbs translate items") Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2018-11-05 15:01:25 +01:00
Yongseok Koh	4a78c88e3b	net/mlx5: fix Verbs flow tunnel 1) Fix layer parsing In translation of tunneled flows, dev_flow->layers must not be used to check tunneled layer as it contains all the layers parsed from flow_drv_prepare(). Checking tunneled layer is needed to set IBV_FLOW_SPEC_INNER and it should be based on dynamic parsing. With dev_flow->layers on a tunneled flow, items will always be interpreted as inner as dev_flow->layer already has all the items. 2) Refactoring code It is partly because flow_verbs_translate_item_*() sets layer flag. Same code is repeating in multiple locations and that could be error-prone. - Introduce VERBS_SPEC_INNER() to unify setting IBV_FLOW_SPEC_INNER. - flow_verbs_translate_item_*() doesn't set parsing result - MLX5_FLOW_LAYER_*. - flow_verbs_translate_item_*() doesn't set priority or adjust hashfields but does only item translation. Both have to be done outside. - Make more consistent between Verbs and DV. 3) Remove flow_verbs_mark_update() This code can never be reached as validation prohibits specifying mark and flag actions together. No need to convert flag to mark. Fixes: `84c406e745` ("net/mlx5: add flow translate function") Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2018-11-05 15:01:25 +01:00
Ophir Munk	f1b85a2719	net/mlx5: support default RSS key as null Applications which add RSS rules must supply an RSS key and length. If an application is only interested in default RSS operation it should not care about the exact RSS key. By setting the key to NULL - the PMD will use the default RSS key. In addition if the application does not care about the RSS type it can set it to 0 and the PMD will use the default type (ETH_RSS_IP). Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
Yongseok Koh	e7d2e32b26	net/mlx5: limit priority range for Linux TC flower driver Due to a limitation on driver/FW, priority ranges from 1 to 16 in kernel. Priority in rte_flow attribute starts from 0 and is added by 1 in translation. This is subject to be changed to determine the max priority based on trial-and-error like Verbs driver once the restriction is lifted or the range is extended. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
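Note on e7d2e32b26: the message states that kernel priorities range from 1 to 16 while the rte_flow attribute starts from 0 and is incremented by 1 in translation. A minimal sketch of that mapping with a range check follows. The macro and function names are hypothetical, and returning -1 for out-of-range input is an assumption about error handling.

```c
#include <stdint.h>

/* Kernel-side range quoted in the commit message. */
#define SKETCH_TC_PRIO_MIN 1
#define SKETCH_TC_PRIO_MAX 16

/* Translate a 0-based rte_flow priority into the kernel's 1-based range.
 * Returns 0 and stores the result in *tc_prio, or -1 if the value would
 * fall outside 1..16. */
static int sketch_tc_priority(uint32_t flow_prio, uint16_t *tc_prio)
{
    if (flow_prio > SKETCH_TC_PRIO_MAX - SKETCH_TC_PRIO_MIN)
        return -1;
    *tc_prio = (uint16_t)(flow_prio + SKETCH_TC_PRIO_MIN);
    return 0;
}
```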
Yongseok Koh	09d8b41699	net/mlx5: make vectorized Tx threshold configurable Add txqs_max_vec parameter to configure the maximum number of Tx queues to enable vectorized Tx. And its default value is set according to the architecture and device type. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
Yongseok Koh	f87bfa8eae	net/mlx5: move device spawn configuration to probing When a device is spawned, it does make more sense that the configuration parameters are passed by callee. Furthermore, setting default value for some configuration would need PCIe device ID which can be found in the probe function. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	f420f03d67	net/mlx5: add E-switch VXLAN rule cleanup routines The last part of patchset contains the rule cleanup routines. These are part of the outer interface initialization at the moment of VXLAN VTEP attaching. These routines query the list of attached VXLAN devices, the list of local IP addresses with peer and link scope attribute and the list of permanent neigh rules, then all such items found on the specified outer device are flushed. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	7a2d6d99a4	net/mlx5: add E-Switch VXLAN encapsulation rules VXLAN encap rules are applied to the VF ingress traffic and have the VTEP as actual redirection destinations instead of outer PF. The encapsulation rule should provide: - redirection action VF->PF - VF port ID - some inner network parameters (MACs/IP) - the tunnel outer source IP (v4/v6) - the tunnel outer destination IP (v4/v6). Current - VNI - Virtual Network Identifier No direct way was found to provide the kernel with all required encapsulation header parameters. The encapsulation VTEP is created attached to the outer interface and assumed as default path for egress encapsulated traffic. The outer tunnel IP addresses are assigned to the interface using Netlink, the implicit route is created like this: ip addr add <src_ip> peer <dst_ip> dev <outer> scope link Peer address provides implicit route, and scope link reduces the risk of conflicts. At initialization time all local scope link addresses are flushed from device (see next part of patchset). The destination MAC address is provided via permanent neigh rule: ip neigh add dev <outer> lladdr <dst_mac> to <dst_ip> nud permanent At initialization time all neigh rules of this type are flushed from device (see the next part of patchset). Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	95a464cecc	net/mlx5: add E-switch VXLAN tunnel devices management VXLAN interfaces are dynamically created for each local UDP port of outer networks and then used as targets for TC "flower" filters in order to perform encapsulation. These VXLAN interfaces are system-wide, only one device with a given UDP port can exist in the system (the attempt of creating another device with the same UDP local port returns EEXIST), so PMD should support the shared device instances database for PMD instances. These VXLAN implicitly created devices are called VTEPs (Virtual Tunnel End Points). Creation of the VTEP occurs at the moment of rule applying. The link is set up, root ingress qdisc is also initialized. Encapsulation VTEPs are created on per port basis, the single VTEP is attached to the outer interface and is shared for all encapsulation rules on this interface. The source UDP port is automatically selected in range 30000-60000. For decapsulation one VTEP is created per every unique UDP local port to accept tunnel traffic. The name of created VTEP consists of prefix "vmlx_" and the number of UDP port in decimal digits without leading zeros (vmlx_4789). The VTEP can be created in the system in advance, before the application is launched, which allows UDP ports to be shared between primary and secondary processes. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
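Note on 95a464cecc: the VTEP naming rule (prefix "vmlx_" followed by the UDP port in decimal without leading zeros) can be sketched as below. `sketch_vtep_name` is a hypothetical helper written for this note, not a function from the mlx5 PMD.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the VTEP device name for a UDP port, e.g. 4789 -> "vmlx_4789".
 * Returns 0 on success, -1 if the buffer is too small or on error. */
static int sketch_vtep_name(char *buf, size_t len, uint16_t udp_port)
{
    /* %u prints plain decimal digits with no leading zeros. */
    int n = snprintf(buf, len, "vmlx_%u", (unsigned int)udp_port);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The longest possible name, "vmlx_65535", is 10 characters, which fits within the usual 16-byte Linux interface name limit.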
Viacheslav Ovsiienko	4eaa82256c	net/mlx5: fix E-Switch flow counter deletion The counters for E-Switch rules were erroneously deleted in flow_tcf_remove() routine. The counters deletion is moved to flow_tcf_destroy() routine. Fixes: `e1114ff6a5` ("net/mlx5: support e-switch flow count action") Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	6e74990b34	net/mlx5: update E-Switch VXLAN netlink routines This part of patchset updates Netlink exchange routine. Message sequence numbers became not random ones, the multipart reply messages are supported, not propagating errors to the following socket calls, Netlink replies buffer size is increased to MNL_SOCKET_BUFFER_SIZE and now is preallocated at context creation time instead of stack usage. This update is needed to support Netlink query operations. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	251e8d02cf	net/mlx5: add VXLAN to flow translate routine This part of patchset adds support of VXLAN-related items and actions to the flow translation routine. Later, tunnel types other than VXLAN (e.g. GRE) can be added. No VTEP devices are created at this point, the flow rule is just translated, not applied yet. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	1a02c6781f	net/mlx5: add VXLAN to flow prepare routine The e-switch Flow prepare function is updated to support VXLAN encapsulation/and decapsulation actions. The function calculates buffer size for Netlink message and Flow description structures, including optional ones for tunneling purposes. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	2a3fed2042	net/mlx5: add E-Switch VXLAN to validation routine This patch adds VXLAN support for flow item/action lists validation. The following entities are now supported: - RTE_FLOW_ITEM_TYPE_VXLAN, contains the tunnel VNI - RTE_FLOW_ACTION_TYPE_VXLAN_DECAP, if this action is specified the items in the flow items list are treated as outer network parameters for tunnel outer header match. The ethernet layer addresses are always treated as inner ones. - RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP, contains the item list to build the encapsulation header. In the current implementation the values are subject to some constraints: - outer source MAC address will be always unconditionally set to the one of MAC addresses of outer egress interface - no way to specify source UDP port - all abovementioned parameters are ignored if specified in the rule, warning messages are sent to the log Minimal tunneling support is also added. If VXLAN decapsulation action is specified the ETH item can follow the VXLAN VNI item, the content of this ETH item is treated as inner MAC addresses and type. The outer ETH item for VXLAN decapsulation action is always ignored. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	0decab8ed1	net/mlx5: swap items/actions validations for E-Switch rules The rule validation function for E-Switch checks item list first, then action list is checked. This patch swaps the validation order, now actions are checked first. This is preparation for validation function update with VXLAN tunnel actions. VXLAN decapsulation action requires to check the items in special way. We could do this special check in the single item check pass if the action flags were gathered before. This is the reason to swap the item/actions checking loops. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko	6f2f07e228	net/mlx5: add necessary structures for E-Switch VXLAN This patch introduces the data structures needed to implement VXLAN encapsulation/decapsulation hardware offload support for E-Switch. Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-11-05 15:01:25 +01:00


15859 Commits