Commit Graph

296 Commits

Author SHA1 Message Date
Thomas Monjalon
4630290af4 mbuf: move pool pointer in first half
According to the Technical Board decision
(http://mails.dpdk.org/archives/dev/2020-November/191859.html),
the mempool pointer in the mbuf struct is moved
from the second to the first half.
It may increase performance in some cases
on systems having 64-byte cache line, i.e. mbuf split in two cache lines.

Due to this change, all fields after "pool" are moved up.
Hopefully no vector data path is impacted.

Moving this field gives more space to dynfield1
while dropping the temporary dynfield0.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    struct rte_mempool *              pool;             /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mbuf *                 next;             /*  64 +  8 */
 9    uint64_t             union        tx_offload;       /*  72 +  8 */
10    struct rte_mbuf_ext_shared_info * shinfo;           /*  80 +  8 */
11    uint16_t                          priv_size;        /*  88 +  2 */
      uint16_t                          timesync;         /*  90 +  2 */
11.5  uint32_t                          dynfield1[9];     /*  92 + 36 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-11-12 16:39:10 +01:00
Morten Brørup
3a35c1c0f6 mbuf: clean up comments and prefix
The mbuf header files had some commenting style errors that affected the
API documentation.
Also, the RTE_ prefix was missing on a macro and a definition.

Note: This patch does not touch the offload and attachment flags that are
also missing the RTE_ prefix.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-05 17:53:15 +01:00
Stephen Hemminger
0429a2e1a4 mbuf: fix dynamic fields and flags with multiprocess
The dynamic flag management is broken if rte_mbuf_dynflag_lookup()
is done in a secondary process because the local pointer to
the memzone is not ever initialized.

Fix it by using the same checks as dynfield_register().
I.e if shared memory zone has not been looked up already,
then discover it.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-04 20:47:14 +01:00
Thomas Monjalon
26fb26f6f5 mbuf: add Tx timestamp registration helper
The function rte_mbuf_dyn_tx_timestamp_register()
can be used to register the required field and flag.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-03 16:21:15 +01:00
Thomas Monjalon
c7e857e16f mbuf: remove deprecated timestamp field
As announced in the deprecation note, the field timestamp
is removed to give more space to the dynamic fields.
The related offload flag PKT_RX_TIMESTAMP is also removed.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          dynfield0[1];     /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-03 16:21:15 +01:00
Thomas Monjalon
9db924cc21 mbuf: add Rx timestamp flag and helpers
There is already a dynamic field for timestamp,
used only for Tx scheduling with the dedicated Tx offload flag.
The same field can be used for Rx timestamp filled by drivers.

A new dynamic flag is defined for Rx usage.
A new function wraps the registration of both field and Rx flag.
The type rte_mbuf_timestamp_t is defined for the API users.

After migrating all Rx timestamp usages, it will be possible
to remove the deprecated timestamp field.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-03 16:21:14 +01:00
David Marchand
79d69c6dcf mbuf: remove seqn field
As announced in the deprecation note, the field seqn is removed to give
more space to the dynamic fields.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          timestamp;        /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    struct rte_mbuf_ext_shared_info * shinfo;           /*  88 +  8 */
12    uint16_t                          priv_size;        /*  96 +  2 */
      uint16_t                          timesync;         /*  98 +  2 */
12.5  uint32_t                          dynfield1[7];     /* 100 + 28 */
16    /* --- END                                             128      */

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-31 22:14:44 +01:00
Thomas Monjalon
5284adad3e mbuf: remove userdata field
As announced in the deprecation note, the field userdata / udata64
is removed to give more space to the dynamic fields.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    uint64_t                          timestamp;        /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mempool *              pool;             /*  64 +  8 */
 9    struct rte_mbuf *                 next;             /*  72 +  8 */
10    uint64_t             union        tx_offload;       /*  80 +  8 */
11    uint16_t                          priv_size;        /*  88 +  2 */
      uint16_t                          timesync;         /*  90 +  2 */
      uint32_t                          seqn;             /*  92 +  4 */
12    struct rte_mbuf_ext_shared_info * shinfo;           /*  96 +  8 */
13    uint64_t                          dynfield1[3];     /* 104 + 24 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-31 16:13:11 +01:00
Thomas Monjalon
34d26d26b7 mbuf: fix typo in dynamic field convention note
Replace "in a in PMD" with "in a PMD".

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2020-10-31 16:13:10 +01:00
Thomas Monjalon
19653eed40 remove config prefix used with make
The config options CONFIG_RTE_* are simple RTE_* defines with meson.
Now that make support is dropped, update the names in logs and comments.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-10-23 19:25:21 +02:00
Bruce Richardson
63b3907833 build: remove library name from version map file name
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2020-10-19 22:13:59 +02:00
Somnath Kotur
ed94631da7 mbuf: extend meaning of QinQ stripped bit
Clarify the documentation of QinQ flags, and extend the meaning of the
flag: if PKT_RX_QINQ_STRIPPED is set and PKT_RX_VLAN_STRIPPED is unset,
only the outer VLAN is removed from packet data, but both tci are saved
in mbuf->vlan_tci (inner) and mbuf->vlan_tci_outer (outer).

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2020-10-15 23:04:55 +02:00
Thomas Monjalon
4be717272e mbuf: remove physical address alias
Remove the deprecated buf_physaddr union field from rte_mbuf.
It is replaced with buf_iova which is at the same offset.

The single field buf_physaddr in rte_kni_mbuf is also renamed.

This concludes a 3-year process of semantic change.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:37 +02:00
Thomas Monjalon
ce627d633b mbuf: remove deprecated function and macro aliases
Remove the deprecated functions
	- rte_mbuf_data_dma_addr
	- rte_mbuf_data_dma_addr_default
which aliased the more recent functions
	- rte_mbuf_data_iova
	- rte_mbuf_data_iova_default

Remove the deprecated macros
	- rte_pktmbuf_mtophys
	- rte_pktmbuf_mtophys_offset
which aliased the more recent macros
	- rte_pktmbuf_iova
	- rte_pktmbuf_iova_offset

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:37 +02:00
Phil Yang
e41d27a68d mbuf: remove atomic reference counters
Remove the deprecated refcnt_atomic union fields in
rte_mbuf and rte_mbuf_ext_shared_info structures.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-15 11:29:23 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Thomas Monjalon
4f86c0ba19 version: 20.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-08-12 11:32:16 +02:00
Phil Yang
db48bae253 mbuf: use C11 atomic builtins for refcnt
Use C11 atomic builtins with explicit ordering instead of rte_atomic
ops which enforce unnecessary barriers on aarch64.

Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Suggested-by: Dodji Seketeli <dodji@redhat.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-07-21 10:30:35 +02:00
Viacheslav Ovsiienko
9da82e8d8b mbuf: introduce accurate packet Tx scheduling
There is the requirement on some networks for precise traffic timing
management. The ability to send (and, generally speaking, receive)
the packets at the very precisely specified moment of time provides
the opportunity to support the connections with Time Division
Multiplexing using the contemporary general purpose NIC without involving
an auxiliary hardware. For example, the supporting of O-RAN Fronthaul
interface is one of the promising features for potentially usage of the
precise time management for the egress packets.

The main objective of this patchset is to specify the way how applications
can provide the moment of time at what the packet transmission must be
started and to describe in preliminary the supporting this feature
from mlx5 PMD side [1].

The new dynamic timestamp field is proposed, it provides some timing
information, the units and time references (initial phase) are not
explicitly defined but are maintained always the same for a given port.
Some devices allow to query rte_eth_read_clock() that will return
the current device timestamp. The dynamic timestamp flag tells whether
the field contains actual timestamp value. For the packets being sent
this value can be used by PMD to schedule packet sending.

The device clock is opaque entity, the units and frequency are
vendor specific and might depend on hardware capabilities and
configurations. If might (or not) be synchronized with real time
via PTP, might (or not) be synchronous with CPU clock (for example
if NIC and CPU share the same clock source there might be no
any drift between the NIC and CPU clocks), etc.

After PKT_RX_TIMESTAMP flag and fixed timestamp field supposed
deprecation and obsoleting, these dynamic flag and field might be
used to manage the timestamps on receiving datapath as well. Having
the dedicated flags for Rx/Tx timestamps allows applications not
to perform explicit flags reset on forwarding and not to promote
received timestamps to the transmitting datapath by default.
The static PKT_RX_TIMESTAMP is considered as candidate to become
the dynamic flag and this move should be discussed.

When PMD sees the "rte_dynfield_timestamp" set on the packet being sent
it tries to synchronize the time of packet appearing on the wire with
the specified packet timestamp. If the specified one is in the past it
should be ignored, if one is in the distant future it should be capped
with some reasonable value (in range of seconds). These specific cases
("too late" and "distant future") can be optionally reported via
device xstats to assist applications to detect the time-related
problems.

There is no any packet reordering according timestamps is supposed,
neither within packet burst, nor between packets, it is an entirely
application responsibility to generate packets and its timestamps
in desired order. The timestamps can be put only in the first packet
in the burst providing the entire burst scheduling.

PMD reports the ability to synchronize packet sending on timestamp
with new offload flag:

This is palliative and might be replaced with new eth_dev API
about reporting/managing the supported dynamic flags and its related
features. This API would break ABI compatibility and can't be introduced
at the moment, so is postponed to 20.11.

For testing purposes it is proposed to update testpmd "txonly"
forwarding mode routine. With this update testpmd application generates
the packets and sets the dynamic timestamps according to specified time
pattern if it sees the "rte_dynfield_timestamp" is registered.

The new testpmd command is proposed to configure sending pattern:

set tx_times <burst_gap>,<intra_gap>

<intra_gap> - the delay between the packets within the burst
              specified in the device clock units. The number
              of packets in the burst is defined by txburst parameter

<burst_gap> - the delay between the bursts in the device clock units

As the result the bursts of packet will be transmitted with specific
delays between the packets within the burst and specific delay between
the bursts. The rte_eth_read_clock is supposed to be engaged to get the
current device clock value and provide the reference for the timestamps.

[1] http://patches.dpdk.org/patch/73714/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-07-11 06:18:54 +02:00
Xiaolong Ye
6248368a9a mbuf: add dump of free dynamic flags
Add support to dump free_flags as below format:

Free bit in mbuf->ol_flags (0 = occupied, 1 = free):
  0000: 0 0 0 0 0 0 0 0
  0008: 0 0 0 0 0 0 0 0
  0010: 0 0 0 0 0 0 0 1
  0018: 1 1 1 1 1 1 1 1
  0020: 1 1 1 1 1 1 1 1
  0028: 1 0 0 0 0 0 0 0
  0030: 0 0 0 0 0 0 0 0
  0038: 0 0 0 0 0 0 0 0

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
d27a4cb3ef mbuf: fix dynamic field dump log
For each mbuf byte, free_space[i] == 0 means the space is occupied,
free_space[i] != 0 means space is free.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
3a7fb882fd mbuf: fix free space update for dynamic field
The value free_space[i] is used to save the size of biggest aligned
element that can fit in the zone, current implementation has one flaw,
for example, if user registers dynfield1 (size = 4, align = 4, req = 124)
first, the free_space would be as below after registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 08 08 08 08 00 00 00 00

Then if user continues to register dynfield2 (size = 4, align = 4),
free_space would become:

  0070: 00 00 00 00 04 04 04 04
  0078: 04 04 04 04 00 00 00 00

Further request dynfield3 (size = 8, align = 8) would fail to register
due to alignment requirement can't be satisfied, though there is enough
space remained in mbuf.

This patch fixes above issue by saving alignment only in aligned zone,
after the fix, above registrations order can be satisfied, free_space
would be like:

After dynfield1 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 04 04 04 04 00 00 00 00

After dynfield2 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 00 00 00 00 00 00 00 00

After dynfield3 registration:

  0070: 00 00 00 00 00 00 00 00
  0078: 00 00 00 00 00 00 00 00

This patch also reduces iterations in process_score() by jumping align
steps in each loop.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
359386e15b mbuf: fix error code in dynamic field/flag registration
Set rte_errno as ENOMEM when allocation failure.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Xiaolong Ye
f8eb26dda8 mbuf: fix boundary check at dynamic field registration
We should make sure off + size < sizeof(struct rte_mbuf) to avoid
possible out-of-bounds access of free_space array, there is no issue
currently due to the low bits of free_flags (which is adjacent to
free_space) are always set to 0. But we shouldn't rely on it since it's
fragile and layout of struct mbuf_dyn_shm may be changed in the future.
This patch adds boundary check explicitly to avoid potential risk of
out-of-bounds access.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-25 23:03:18 +02:00
Tal Shnaiderman
4887a7e234 mbuf: align layout in Windows
Using uint32_t type bit-fields in Windows will pads the
'L2/L3/L4 and tunnel information' union with additional bits.

This padding causes rte_mbuf size misalignment and the total size
increases to 3 cache-lines.

Changed packet_type bit-fields types from uint32_t to uint8_t
to allow unified 2 cache-line structure size.

Added the __extension__ attribute over the modified struct to avoid
the warning:

type of bit-field ... is a GCC extension [-pedantic]

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 16:26:33 +02:00
Alexander Kozyrev
d6eb247371 mbuf: fix external buffer pool boundaries
Memzones are created in testpmd in order to test external data
buffers functionality. Each memzone is 2Mb in size and divided among
the pool of external memory buffers.

Memzone may not always be fully utilized because mbufs size can vary
and some space can be left unused at the tail of a memzone. This is
not handled properly and mbuf can get the address of this leftover
space since this address is still valid (part of memzone), but there
is not enough space to fit the whole packet data. As a result packet
data may overflow and cause the memory corruption.

Take mbuf size into account when distributing memory addresses from
a memzone to external mbufs. Skip the remaining tail in case there
is not enough room for a packet and move to a next memzone instead.

Fixes: 6c8e50c2e5 ("mbuf: create pool with external memory buffers")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 09:51:46 +02:00
Xiaolong Ye
c67a423c53 mbuf: remove unused next member in dynamic flag/field
TAILQ_ENTRY next is not needed in struct mbuf_dynfield_elt and
mbuf_dynflag_elt, since they are actually chained by rte_tailq_entry's
next field when calling TAILQ_INSERT_TAIL(mbuf_dynfield/dynflag_list, te,
next).

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 09:32:43 +02:00
Thomas Monjalon
d1342ea419 mbuf: document guideline for new fields and flags
Since dynamic fields and flags were added in 19.11,
the idea was to use them for new features, not only PMD-specific.

The guideline is made more explicit in doxygen, in the mbuf guide,
and in the contribution design guidelines.

For more information about the original design, see the presentation
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf

This decision was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-June/169667.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-06-11 09:29:15 +02:00
Muhammad Bilal
ec5a905ec8 mbuf: prevent setting mempool ops name empty
Bugzilla ID: 353

Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2020-04-25 22:56:10 +02:00
Thomas Monjalon
f2fc83b40f replace unused attributes
There is a common macro __rte_unused, avoiding warnings,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:30:58 +02:00
Pavan Nikhilesh
acec04c4b2 build: disable experimental API check internally
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-04-14 16:22:34 +02:00
Alexander Kozyrev
759a6a221e mbuf: optimize memory loads during freeing
Introduction of pinned external buffers doubled memory loads in the
rte_pktmbuf_prefree_seg() function. Analysis of the generated assembly
code shows unnecessary load of the pool field of the rte_mbuf structure.
Here is the snippet of the assembly for "if (!RTE_MBUF_DIRECT(m))":
Before the change the code was:
	movq  0x18(%rbx), %rax // load the ol_flags field
	test %r13, %rax	       // check if ol_flags equals to 0x60...0
	jz 0x9a8718 <Block 2>  // jump out to "if (m->next != NULL)"
After the change the code became:
	movq  0x18(%rbx), %rax // load ol_flags
	test %r14, %rax	       // check if ol_flags equals to 0x60...0
	jnz 0x9bea38 <Block 2> // jump in to "if (!RTE_MBUF_HAS_EXTBUF(m)"
	movq  0x48(%rbx), %rax // load the pool field
	jmp 0x9bea78 <Block 7> // jump out to "if (m->next != NULL)"
Look like this absolutely unneeded memory load of the pool field is an
optimization for the external buffer case in GCC (4.8.5), since Clang
generates the same assembly for both before and after the change versions.
Plus, GCC favors the external buffer case over the simple case.
This assembly code layout causes the performance degradation because the
rte_pktmbuf_prefree_seg() function is a part of a very hot path.
Workaround this compilation issue by moving the check for pinned buffer
apart from the check for external buffer and restore the initial code
flow that favors the direct mbuf case over the external one.

Fixes: 6ef1107ad4 ("mbuf: detach mbuf with pinned external buffer")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-03-31 02:57:42 +02:00
Ciara Power
f42c9ac5b6 lib: fix unnecessary double negation
An equality expression already returns either 0 or 1.
There is no need to use double negation for these cases.

Fixes: ea672a8b16 ("mbuf: remove the rte_pktmbuf structure")
Fixes: a0fd91cefc ("mempool: rename functions with confusing names")
Cc: stable@dpdk.org

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-02-14 17:39:46 +01:00
Viacheslav Ovsiienko
545fa736b9 mbuf: fix pinned memory function style
Minor style issue is fixed.

Fixes: 6c8e50c2e5 ("mbuf: create pool with external memory buffers")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-02-06 16:17:14 +01:00
Stephen Hemminger
5b6eaea8ea mbuf: display more fields in dump
The rte_pktmbuf_dump should display offset, refcount, and vlan
info since these are often useful during debugging.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-02-06 16:17:14 +01:00
Viacheslav Ovsiienko
6c8e50c2e5 mbuf: create pool with external memory buffers
The dedicated routine rte_pktmbuf_pool_create_extbuf() is
provided to create mbuf pool with data buffers located in
the pinned external memory. The application provides the
external memory description and routine initializes each
mbuf with appropriate virtual and physical buffer address.
It is entirely application responsibility to register
external memory with rte_extmem_register() API, map this
memory, etc.

The new introduced flag RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF
is set in private pool structure, specifying the new special
pool type. The allocated mbufs from pool of this kind will
have the EXT_ATTACHED_MBUF flag set and initialiazed shared
info structure, allowing cloning with regular mbufs (without
attached external buffers of any kind).

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-01-20 23:36:38 +01:00
Viacheslav Ovsiienko
6ef1107ad4 mbuf: detach mbuf with pinned external buffer
Update detach routine to check the mbuf pool type.
Introduce the special internal version of detach routine to handle
the special case of pinned external bufferon mbuf freeing.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-01-20 23:36:15 +01:00
Viacheslav Ovsiienko
c621a9b66b mbuf: introduce routine to get private mbuf pool flags
The routine rte_pktmbuf_priv_flags is introduced to fetch
the flags from the mbuf memory pool private structure
in unified fashion.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-01-20 23:30:38 +01:00
Jerin Jacob
3e6181b070 mbuf: use structure marker from EAL
Use new marker typedef available in EAL and remove private marker
typedef.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-01-20 21:17:35 +01:00
Jörg Thalheim
a18e79cd3c mbuf: improve API doc for attaching external buffer
Enhance API documentation of rte_pktmbuf_attach_extbuf() to
explain that the attached mbuf is initialized with length = 0.

Link: https://bugs.dpdk.org/show_bug.cgi?id=362

Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2020-01-20 16:37:27 +01:00
Shahaf Shuler
39a19ae03d mbuf: extend mbuf pool private structure
With the API and ABI freeze ahead, it will be good to reserve
some bits on the private structure for future use.

Otherwise we will potentially need to maintain two different
private structure during 2020 period.

There is already one use case for those reserved bits[1]

The reserved field should be set to 0 by the user.

[1] https://patches.dpdk.org/patch/63077/

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-11-25 22:44:46 +01:00
Pawel Modrak
85ff364f3b build: align symbols with global ABI version
Merge all versions in linker version script files to DPDK_20.0.

This commit was generated by running the following command:

:~/DPDK$ buildtools/update-abi.sh 20.0

Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Anatoly Burakov
fbaf943887 build: remove individual library versions
Since the library versioning for both stable and experimental ABI's is
now managed globally, the LIBABIVER and version variables no longer
serve any useful purpose, and can be removed.

The replacement in Makefiles was done using the following regex:

	^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)?

(LIBABIVER := numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

The replacement for meson files was done using the following regex:

	^(#.*\n)?version\s*=\s*\d+\n(\s*\n)?

(version = numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

[David]: those variables are manually removed for the files:
- drivers/common/qat/Makefile
- lib/librte_eal/meson.build
[David]: the LIBABIVER is restored for the external ethtool example
library.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Viacheslav Ovsiienko
9bf26e1318 ethdev: move egress metadata to dynamic field
The dynamic mbuf fields were introduced by [1]. The egress metadata is
good candidate to be moved from statically allocated field tx_metadata to
dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress (if application going to
transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst)
metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

Metadata feature is getting complex. We might have some set of actions
and items that might be supported by PMDs in multiple combinations,
the supported values and masks are the subjects to query by perfroming
trials (with rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-11-08 23:15:05 +01:00
Viacheslav Ovsiienko
e02ecc1324 ethdev: extend flow metadata
Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to use SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

If application is going to engage the metadata feature it registers
the metadata  dynamic fields, then PMD checks the metadata field
availability and handles the appropriate fields in datapath.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in similar way,
but not interacting.

Initially, there were proposed two metadata related actions:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set the special flag in the packet metadata, MARK action
stores some specified value in the metadata storage, and, on the packet
receiving PMD puts the flag and value to the mbuf and applications can
see the packet was threated inside flow engine according to the appropriate
RTE flow(s). MARK and FLAG are like some kind of gateway to transfer some
per-packet information from the flow engine to the application via
receiving datapath. Also, there is the item of type RTE_FLOW_ITEM_TYPE_MARK
provided. It allows us to extend the flow match pattern with the capability
to match the metadata values set by MARK/FLAG actions on other flows.

From the datapath point of view, the MARK and FLAG are related to the
receiving side only. It would useful to have the same gateway on the
transmitting side and there was the feature of type RTE_FLOW_ITEM_TYPE_META
was proposed. The application can fill the field in mbuf and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields are shared because
of MARK and META items belonged to different domains (receiving and
transmitting) and could be vendor-specific.

So far, so good, DPDK proposes some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, the MARK and META means are not symmetric, there is absent
action which would allow us to set META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

The next, applications raise the new requirements for packet metadata.
The flow ngines are getting more complex, internal switches are introduced,
multiple ports might be supported within the same flow engine namespace.
From the DPDK points of view, it means the packets might be sent on one
eth_dev port and received on the other one, and the packet path inside
the flow engine entirely belongs to the same hardware device. The simplest
example is SR-IOV with PF, VFs and the representors. And there is a
brilliant opportunity to provide some out-of-band channel to transfer
some extra data from one port to another one, besides the packet data
itself. And applications would like to use this opportunity.

It is supposed for application to use trials (with rte_flow_validate)
to detect which metadata features (FLAG, MARK, META) actually supported
by PMD and underlying hardware. It might depend on PMD configuration,
system software, hardware settings, etc., and should be detected
in run time.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-11-08 23:15:04 +01:00
Olivier Matz
4958ca3a44 mbuf: support dynamic fields and flags
Many features require to store data inside the mbuf. As the room in mbuf
structure is limited, it is not possible to have a field for each
feature. Also, changing fields in the mbuf structure can break the API
or ABI.

This commit addresses these issues, by enabling the dynamic registration
of fields or flags:

- a dynamic field is a named area in the rte_mbuf structure, with a
  given size (>= 1 byte) and alignment constraint.
- a dynamic flag is a named bit in the rte_mbuf structure.

The typical use case is a PMD that registers space for an offload
feature, when the application requests to enable this feature.  As
the space in mbuf is limited, the space should only be reserved if it
is going to be used (i.e when the application explicitly asks for it).

The registration can be done at any moment, but it is not possible
to unregister fields or flags.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-26 19:08:50 +02:00
Konstantin Ananyev
3eb860b08e mbuf: move definitions into a separate file
Right now inclusion of rte_mbuf.h header can cause inclusion of
some arch/os specific headers.
That prevents it to be included directly by some
non-DPDK (but related) entities: KNI, BPF programs, etc.
To overcome that problem usually a separate definitions of rte_mbuf
structure is created within these entities.
That aproach has a lot of drawbacks: code duplication, error prone, etc.
This patch moves rte_mbuf structure definition (and some related macros)
into a separate file that can be included by both rte_mbuf.h and
other non-DPDK entities.

Note that it doesn't introduce any change for current DPDK code.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Michel Machado <michel@digirati.com.br>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-25 19:30:38 +02:00
Ting Xu
d892768c6d mbuf: add GTP tunnel type
Add GTP tunnel type flag in mbuf for future use in GTP
Tx checksum offload.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-23 16:43:10 +02:00
Morten Brørup
0f824df6f8 mbuf: add bulk free function
Add function for freeing a bulk of mbufs.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-24 02:45:40 +02:00
Stephen Hemminger
c3a90c381d mbuf: add a copy routine
This is a commonly used operation that surprisingly the
DPDK has not supported. The new rte_pktmbuf_copy does a
deep copy of packet. This is a complete copy including
meta-data.

It handles the case where the source mbuf comes from a pool
with larger data area than the destination pool. The routine
also has options for skipping data, or truncating at a fixed
length.

This patch also introduces internal inline to copy the
metadata fields of mbuf.

Add a test for this new function, based of the clone tests.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-16 12:43:53 +02:00