The rte_flow API provides the building blocks for vendor-agnostic flow
classification offloads. The rte_flow "patterns" and "actions"
primitives are fine-grained, giving DPDK applications the flexibility
to offload network stacks and complex pipelines. Applications wishing
to offload tunneled traffic are required to use rte_flow primitives,
such as group, meta, mark, tag, and others, to model their high-level
objects.

The hardware model design for high-level software objects is not
trivial. Furthermore, an optimal design is often vendor-specific.

When hardware offloads tunneled traffic in multi-group logic,
partially offloaded packets may arrive at the application after they
were modified in hardware. In this case, the application may need to
restore the original packet headers. Consider the following sequence:
the application decaps a packet in one group and jumps to a second
group where it tries to match on a 5-tuple; that match will miss, and
the packet will be sent to the application. In this case, the
application does not receive the original packet but a modified one.
Also, in this case, the application cannot match on the outer header
fields, such as VXLAN vni and 5-tuple.

There are several possible ways to use rte_flow "patterns" and
"actions" to resolve the issues above. For example:

1. Map the outer headers to hardware registers using the
   rte_flow_action_mark / rte_flow_action_tag / rte_flow_action_set_meta
   objects.

2. Apply the decap action only at the last offload stage, after all
   the "patterns" have been matched and the packet will be fully
   offloaded.

Every approach has its pros and cons and is highly dependent on the
hardware vendor. For example, some hardware may have a limited number
of registers, while other hardware may not support inner actions and
must decap before accessing inner headers.

The tunnel offload model resolves these issues. The model goals are:

1. Provide a unified application API to offload tunneled traffic that
   is capable of matching on outer headers after decap.

2. Allow the application to restore the outer header of partially
   offloaded packets.

The tunnel offload model does not introduce new elements to the
existing RTE flow model and is implemented as a set of helper
functions.

For the application to work with the tunnel offload API, it has to
adjust its flow rules for multi-table tunnel offload in the following
way:

1. Remove the explicit call to the decap action and replace it with
   the PMD actions obtained from the rte_flow_tunnel_decap_set()
   helper.

2. Add the PMD items obtained from the rte_flow_tunnel_match() helper
   to all other rules in the tunnel offload sequence.

VXLAN code example:

Assume the application needs to do inner NAT on VXLAN packets.
The first rule in group 0:

  flow create <port id> ingress group 0
    pattern eth / ipv4 / udp dst is 4789 / vxlan / end
    actions {pmd actions} / jump group 3 / end

The first VXLAN packet that arrives matches the rule in group 0 and
jumps to group 3. In group 3 the packet will miss, since there is no
flow to match, and will be sent to the application. The application
will call rte_flow_get_restore_info() to get the packet's outer
header.
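As a minimal sketch of that miss-handling path (not part of the
original patch; "m" is assumed to be an mbuf just received on
"port_id", and error handling is trimmed):

  struct rte_flow_restore_info info;
  struct rte_flow_error error;

  if (rte_flow_get_restore_info(port_id, m, &info, &error) == 0 &&
      (info.flags & RTE_FLOW_RESTORE_INFO_TUNNEL)) {
          /*
           * info.tunnel describes the original outer headers, e.g.
           * info.tunnel.type == RTE_FLOW_ITEM_TYPE_VXLAN with
           * info.tunnel.tun_id == 10 in this example. When
           * RTE_FLOW_RESTORE_INFO_GROUP_ID is set, info.group_id is
           * the group in which the packet missed.
           */
  }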
The application will then insert a new rule in group 3 to match outer
and inner headers:

  flow create <port id> ingress group 3
    pattern {pmd items} / eth / ipv4 dst is 172.10.10.1 /
            udp dst is 4789 / vxlan vni is 10 /
            ipv4 dst is 184.1.2.3 / end
    actions set_ipv4_dst 186.1.1.1 / queue index 3 / end

The result of these rules is that a VXLAN packet with vni=10, outer
IPv4 dst=172.10.10.1 and inner IPv4 dst=184.1.2.3 will be received
decapped on queue 3, with IPv4 dst=186.1.1.1.

Note: the packet in group 3 is considered decapped. All actions in
that group will be done on the header that was inner before decap.
The application may specify an outer header to be matched on. It is
the PMD's responsibility to translate these items to outer metadata.

API usage:

  /**
   * 1. Initialize an RTE flow tunnel object.
   */
  const struct rte_flow_tunnel tunnel = {
          .type = RTE_FLOW_ITEM_TYPE_VXLAN,
          .tun_id = 10,
  };

  /**
   * 2. Obtain PMD tunnel actions.
   *
   * pmd_actions is an intermediate variable the application uses to
   * compile the rule's actions array.
   */
  struct rte_flow_action *pmd_actions;
  uint32_t num_pmd_actions;

  rte_flow_tunnel_decap_set(port_id, &tunnel, &pmd_actions,
                            &num_pmd_actions, &error);

  /**
   * 3. Offload the first rule: match on VXLAN traffic and jump to
   * group 3 (implicitly decaps the packet).
   */
  app_actions  = jump group 3
  rule_items   = app_items;             /** eth / ipv4 / udp / vxlan */
  rule_actions = { pmd_actions, app_actions };
  attr.group   = 0;
  flow_1 = rte_flow_create(port_id, &attr, rule_items,
                           rule_actions, &error);

  /**
   * 4. After flow creation the application does not need to keep the
   * tunnel action resources.
   */
  rte_flow_tunnel_action_decap_release(port_id, pmd_actions,
                                       num_pmd_actions, &error);

  /**
   * 5. A partially offloaded packet misses in group 3 because there
   * was no matching rule; handle the miss.
   */
  struct rte_flow_restore_info info;

  rte_flow_get_restore_info(port_id, mbuf, &info, &error);

  /**
   * 6. Offload the NAT rule:
   */
  app_items   = { eth / ipv4 dst is 172.10.10.1 / udp dst is 4789 /
                  vxlan vni is 10 / ipv4 dst is 184.1.2.3 }
  app_actions = { set_ipv4_dst 186.1.1.1 / queue index 3 }

  struct rte_flow_item *pmd_items;
  uint32_t num_pmd_items;

  rte_flow_tunnel_match(port_id, &info.tunnel, &pmd_items,
                        &num_pmd_items, &error);
  rule_items   = { pmd_items, app_items };
  rule_actions = app_actions;
  attr.group   = info.group_id;
  flow_2 = rte_flow_create(port_id, &attr, rule_items,
                           rule_actions, &error);

  /**
   * 7. Release the PMD items after rule creation.
   */
  rte_flow_tunnel_item_release(port_id, pmd_items,
                               num_pmd_items, &error);

References

1. https://mails.dpdk.org/archives/dev/2020-June/index.html

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
/* SPDX-License-Identifier: BSD-3-Clause
 * Copyright 2016 6WIND S.A.
 * Copyright 2016 Mellanox Technologies, Ltd
 */

#ifndef RTE_FLOW_DRIVER_H_
#define RTE_FLOW_DRIVER_H_

/**
 * @file
 * RTE generic flow API (driver side)
 *
 * This file provides implementation helpers for internal use by PMDs, they
 * are not intended to be exposed to applications and are not subject to ABI
 * versioning.
 */

#include <stdint.h>

#include "rte_ethdev.h"
#include "rte_ethdev_driver.h"
#include "rte_flow.h"

#ifdef __cplusplus
extern "C" {
#endif

/**
 * Generic flow operations structure implemented and returned by PMDs.
 *
 * To implement this API, PMDs must handle the RTE_ETH_FILTER_GENERIC filter
 * type in their .filter_ctrl callback function (struct eth_dev_ops) as well
 * as the RTE_ETH_FILTER_GET filter operation.
 *
 * If successful, this operation must result in a pointer to a PMD-specific
 * struct rte_flow_ops written to the argument address as described below:
 *
 * \code
 *
 * // PMD filter_ctrl callback
 *
 * static const struct rte_flow_ops pmd_flow_ops = { ... };
 *
 * switch (filter_type) {
 * case RTE_ETH_FILTER_GENERIC:
 *     if (filter_op != RTE_ETH_FILTER_GET)
 *         return -EINVAL;
 *     *(const void **)arg = &pmd_flow_ops;
 *     return 0;
 * }
 *
 * \endcode
 *
 * See also rte_flow_ops_get().
 *
 * These callback functions are not supposed to be used by applications
 * directly, which must rely on the API defined in rte_flow.h.
 *
 * Public-facing wrapper functions perform a few consistency checks so that
 * unimplemented (i.e. NULL) callbacks simply return -ENOTSUP. These
 * callbacks otherwise only differ by their first argument (with port ID
 * already resolved to a pointer to struct rte_eth_dev).
 */
struct rte_flow_ops {
	/** See rte_flow_validate(). */
	int (*validate)
		(struct rte_eth_dev *,
		 const struct rte_flow_attr *,
		 const struct rte_flow_item [],
		 const struct rte_flow_action [],
		 struct rte_flow_error *);
	/** See rte_flow_create(). */
	struct rte_flow *(*create)
		(struct rte_eth_dev *,
		 const struct rte_flow_attr *,
		 const struct rte_flow_item [],
		 const struct rte_flow_action [],
		 struct rte_flow_error *);
	/** See rte_flow_destroy(). */
	int (*destroy)
		(struct rte_eth_dev *,
		 struct rte_flow *,
		 struct rte_flow_error *);
	/** See rte_flow_flush(). */
	int (*flush)
		(struct rte_eth_dev *,
		 struct rte_flow_error *);
	/** See rte_flow_query(). */
	int (*query)
		(struct rte_eth_dev *,
		 struct rte_flow *,
		 const struct rte_flow_action *,
		 void *,
		 struct rte_flow_error *);
	/** See rte_flow_isolate(). */
	int (*isolate)
		(struct rte_eth_dev *,
		 int,
		 struct rte_flow_error *);
	/** See rte_flow_dev_dump(). */
	int (*dev_dump)
		(struct rte_eth_dev *dev,
		 FILE *file,
		 struct rte_flow_error *error);
	/** See rte_flow_get_aged_flows() */
	int (*get_aged_flows)
		(struct rte_eth_dev *dev,
		 void **context,
		 uint32_t nb_contexts,
		 struct rte_flow_error *err);
	/** See rte_flow_shared_action_create() */
	struct rte_flow_shared_action *(*shared_action_create)
		(struct rte_eth_dev *dev,
		 const struct rte_flow_shared_action_conf *conf,
		 const struct rte_flow_action *action,
		 struct rte_flow_error *error);
	/** See rte_flow_shared_action_destroy() */
	int (*shared_action_destroy)
		(struct rte_eth_dev *dev,
		 struct rte_flow_shared_action *shared_action,
		 struct rte_flow_error *error);
	/** See rte_flow_shared_action_update() */
	int (*shared_action_update)
		(struct rte_eth_dev *dev,
		 struct rte_flow_shared_action *shared_action,
		 const struct rte_flow_action *update,
		 struct rte_flow_error *error);
	/** See rte_flow_shared_action_query() */
	int (*shared_action_query)
		(struct rte_eth_dev *dev,
		 const struct rte_flow_shared_action *shared_action,
		 void *data,
		 struct rte_flow_error *error);
	/** See rte_flow_tunnel_decap_set() */
	int (*tunnel_decap_set)
		(struct rte_eth_dev *dev,
		 struct rte_flow_tunnel *tunnel,
		 struct rte_flow_action **pmd_actions,
		 uint32_t *num_of_actions,
		 struct rte_flow_error *err);
	/** See rte_flow_tunnel_match() */
	int (*tunnel_match)
		(struct rte_eth_dev *dev,
		 struct rte_flow_tunnel *tunnel,
		 struct rte_flow_item **pmd_items,
		 uint32_t *num_of_items,
		 struct rte_flow_error *err);
	/** See rte_flow_get_restore_info() */
	int (*get_restore_info)
		(struct rte_eth_dev *dev,
		 struct rte_mbuf *m,
		 struct rte_flow_restore_info *info,
		 struct rte_flow_error *err);
	/** See rte_flow_tunnel_action_decap_release() */
	int (*action_release)
		(struct rte_eth_dev *dev,
		 struct rte_flow_action *pmd_actions,
		 uint32_t num_of_actions,
		 struct rte_flow_error *err);
	/** See rte_flow_tunnel_item_release() */
	int (*item_release)
		(struct rte_eth_dev *dev,
		 struct rte_flow_item *pmd_items,
		 uint32_t num_of_items,
		 struct rte_flow_error *err);
};
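
/*
 * Illustrative sketch only, not part of this file: how a hypothetical
 * PMD might implement the tunnel_decap_set callback and hook it into
 * its rte_flow_ops. All pmd_* names below are invented for
 * illustration; rte_zmalloc() requires rte_malloc.h.
 */
static int
pmd_tunnel_decap_set(struct rte_eth_dev *dev,
		     struct rte_flow_tunnel *tunnel,
		     struct rte_flow_action **pmd_actions,
		     uint32_t *num_of_actions,
		     struct rte_flow_error *err)
{
	struct rte_flow_action *actions;

	RTE_SET_USED(dev);
	if (tunnel->type != RTE_FLOW_ITEM_TYPE_VXLAN)
		return rte_flow_error_set(err, ENOTSUP,
					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
					  NULL, "tunnel type not supported");
	/* Single PMD-private action; a real driver would also attach a
	 * private conf object encoding the tunnel context.
	 */
	actions = rte_zmalloc(NULL, sizeof(*actions), 0);
	if (actions == NULL)
		return rte_flow_error_set(err, ENOMEM,
					  RTE_FLOW_ERROR_TYPE_UNSPECIFIED,
					  NULL, "cannot allocate actions");
	actions[0].type = RTE_FLOW_ACTION_TYPE_VXLAN_DECAP;
	*pmd_actions = actions;
	*num_of_actions = 1;
	return 0;
}

static const struct rte_flow_ops pmd_flow_ops = {
	/* .validate, .create, ... */
	.tunnel_decap_set = pmd_tunnel_decap_set,
};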

/**
 * Get generic flow operations structure from a port.
 *
 * @param port_id
 *   Port identifier to query.
 * @param[out] error
 *   Pointer to flow error structure.
 *
 * @return
 *   The flow operations structure associated with port_id, NULL in case of
 *   error, in which case rte_errno is set and the error structure contains
 *   additional details.
 */
const struct rte_flow_ops *
rte_flow_ops_get(uint16_t port_id, struct rte_flow_error *error);

#ifdef __cplusplus
}
#endif

#endif /* RTE_FLOW_DRIVER_H_ */
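
For context, the generic layer dispatches rte_flow calls through the
ops structure returned by rte_flow_ops_get(). A simplified sketch of
that pattern (assuming the declarations above; not a verbatim copy of
rte_flow.c, and flow_create_via_ops is an invented name):

  static struct rte_flow *
  flow_create_via_ops(uint16_t port_id,
                      const struct rte_flow_attr *attr,
                      const struct rte_flow_item pattern[],
                      const struct rte_flow_action actions[],
                      struct rte_flow_error *error)
  {
          struct rte_eth_dev *dev = &rte_eth_devices[port_id];
          const struct rte_flow_ops *ops = rte_flow_ops_get(port_id, error);

          if (ops == NULL)
                  return NULL; /* rte_flow_ops_get() filled *error */
          if (ops->create == NULL) {
                  /* Unimplemented callback: report ENOSYS-style error. */
                  rte_flow_error_set(error, ENOSYS,
                                     RTE_FLOW_ERROR_TYPE_UNSPECIFIED, NULL,
                                     "flow rule creation not supported");
                  return NULL;
          }
          return ops->create(dev, attr, pattern, actions, error);
  }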