Commit Graph

2733 Commits

Author SHA1 Message Date
Ferruh Yigit
59b36980e4 kni: do not use assignment in if condition
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:08:04 +02:00
Ferruh Yigit
50e25e4049 kni: move trailing statement on next line
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:07:16 +02:00
Ferruh Yigit
fa34b39dfd kni: move comparison constants on the right
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:42 +02:00
Ferruh Yigit
e227435ec0 kni: remove useless return
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:38 +02:00
Ferruh Yigit
0861751c93 kni: prefer unsigned int to unsigned
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:30 +02:00
Ferruh Yigit
1b9190aff3 kni: fix spacing and line lengths
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:31 +02:00
Ferruh Yigit
9afcc5bc74 kni: make static struct const
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:28 +02:00
Ferruh Yigit
fbd71b623f kni: uninitialize global variables
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:20 +02:00
Ferruh Yigit
f60c4df6bb kni: move externs to the header file
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 22:49:41 +02:00
Vladyslav Buslov
93a298b34e kni: support core id parameter in single threaded mode
Allow binding KNI thread to specific core in single threaded mode
by setting core_id and force_bind config parameters.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 22:24:45 +02:00
Wei Dai
f05b0fbe7d lpm: remove redundant check when adding rule
When a rule with depth > 24 is added into an existing
rule with depth <= 24, a new tbl8 is allocated. The existing
rule first fills the whole new tbl8, so the valid field of
each entry in this tbl8 is always true and the depth of each
entry is always <= 24 before the new rule with depth > 24 is added.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-10-13 22:17:00 +02:00
Wei Dai
69ed52dddc lpm: fix freeing unused sub-table on rule delete
When all rules with depth > 24 are deleted from the same sub-table
(tbl8 group) and only a rule with depth <= 24 is left in it,
this sub-table (tbl8 group) should be recycled.

Fixes: dc81ebbaca ("lpm: extend IPv4 next hop field")
Fixes: af75078fec ("first public release")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-10-13 22:13:19 +02:00
John Ousterhout
844bd77c03 log: respect logger configured before EAL init
Before this patch, application-specific loggers could not be
installed before rte_eal_init completed (the initialization process
called rte_openlog_stream, overwriting any previously installed
logger). This made it impossible for an application to capture the
initial log messages generated during rte_eal_init. This patch changes
initialization so that information from a previous call to
rte_openlog_stream is not lost. Specifically:
* The default log stream is now maintained separately from an
  application-specific log stream installed with rte_openlog_stream.
* rte_eal_common_log_init has been renamed to eal_log_set_default,
  since this is all it does. It no longer invokes rte_openlog_stream; it
  just updates the default stream. Also, this method now returns void,
  rather than int, since there are no errors.

This patch also removes the "early log" mechanism and cleans up the
log initialization mechanism:
* The default log stream defaults to stderr on all platforms if
  eal_log_set_default hasn't been invoked (Linux used to use stdout
  during the first part of initialization).
* Removed rte_eal_log_early_init; all of the desired functionality can
  be achieved by calling eal_log_set_default.
* Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one
  function, rte_eal_log_init, which is not needed or invoked for BSD.
* Removed declaration for eal_default_log_stream in rte_log.h (it's now
  private to eal_common_log.c).
* Moved call to rte_eal_log_init earlier in rte_eal_init for Linux, so
  that it starts using the preferred log ASAP.

Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>
2016-10-13 22:13:18 +02:00
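The stream selection described in 844bd77c03 can be modeled in a few lines. This is a minimal Python sketch, not DPDK code: the class and method names are hypothetical, and only the fallback order (application stream first, then the default stream, then stderr) is taken from the commit message.

```python
import io
import sys


class LogStreams:
    """Toy model of the two log streams described in the commit message."""

    def __init__(self):
        self.default_stream = None  # models what eal_log_set_default updates
        self.app_stream = None      # models what rte_openlog_stream installs

    def set_default(self, stream):
        # Only the default is updated; the application stream is untouched.
        self.default_stream = stream

    def open_stream(self, stream):
        self.app_stream = stream

    def current(self):
        # Application stream wins, then the default stream, then stderr.
        if self.app_stream is not None:
            return self.app_stream
        if self.default_stream is not None:
            return self.default_stream
        return sys.stderr


logs = LogStreams()
app_log = io.StringIO()
logs.open_stream(app_log)        # application installs its logger before init
logs.set_default(io.StringIO())  # init later sets the default stream
logs.current().write("EAL: init message\n")
```

Because setting the default no longer overwrites the application stream, the message written during initialization ends up in `app_log`.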
Mauricio Vasquez B
29f1cb4b38 doc: fix file argument of debug functions
Previous patch updated the functions without updating all the comments.

Fixes: 591a9d7985 ("add FILE argument to debug functions")

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-10-13 21:25:53 +02:00
Olivier Matz
6ca3a595e0 mbuf: add flag for LRO
When receiving coalesced packets in virtio, the original size of the
segments is provided. This is useful information because it allows
resegmenting with the same size.

Add a new RX flag in mbuf that can be set when packets are coalesced by
a hardware or virtual driver when the m->tso_segsz field is valid and is
set to the segment size of original packets.

This flag is used in next commits in the virtio pmd.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-13 20:45:54 +02:00
Olivier Matz
5842289a54 mbuf: add new Rx checksum flags
Following discussions in [1] and [2], introduce a new bit to
describe the Rx checksum status in mbuf.

Before this patch, only one flag was available:
  PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK.

And same for L3:
  PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK.

This had 2 issues:
- it was not possible to differentiate "checksum good" from
  "checksum unknown".
- it was not possible for a virtual driver to say "the checksum
  in packet may be wrong, but data integrity is valid".

This patch tries to solve this issue by having 4 states (2 bits)
for the IP and L4 Rx checksums. New values are:

 - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
   -> the application should verify the checksum by sw
 - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
   -> the application can drop the packet without additional check
 - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
   -> the application can accept the packet without verifying the
      checksum by sw
 - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
   data, but the integrity of the L4 data is verified.
   -> the application can process the packet but must not verify the
      checksum by sw. It has to take care to recalculate the cksum
      if the packet is transmitted (either by sw or using tx offload)

  And same for L3 (replace L4 by IP in description above).

This commit tries to be compatible with existing applications that
only check the existing flag (CKSUM_BAD).

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-June/040007.html

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-13 20:45:33 +02:00
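The four L4 checksum states from 5842289a54 can be decoded as follows. This is a hedged Python sketch: the bit positions and the encoding of NONE as "both bits set" are assumptions made for illustration, and the authoritative values are those in rte_mbuf.h. The point it shows is that a 2-bit state has to be compared against the masked value rather than tested one bit at a time.

```python
# Illustrative bit positions (assumed); the real definitions live in rte_mbuf.h.
L4_CKSUM_BAD = 1 << 3
L4_CKSUM_GOOD = 1 << 8
L4_CKSUM_MASK = L4_CKSUM_BAD | L4_CKSUM_GOOD
L4_CKSUM_UNKNOWN = 0
L4_CKSUM_NONE = L4_CKSUM_BAD | L4_CKSUM_GOOD  # assumed encoding


def l4_action(ol_flags):
    """Map the 2-bit L4 checksum state to what the application should do."""
    state = ol_flags & L4_CKSUM_MASK
    if state == L4_CKSUM_UNKNOWN:
        return "verify in software"
    if state == L4_CKSUM_BAD:
        return "drop"
    if state == L4_CKSUM_GOOD:
        return "accept"
    # Remaining state is NONE: data is intact but the checksum field is not.
    return "accept, recompute checksum before transmit"
```

The same decoding would apply to the IP checksum bits, with the IP constants substituted.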
Olivier Matz
c442fed81b net: add function to calculate checksum in mbuf
This function can be used to calculate the checksum of data embedded in
mbuf, that can be composed of several segments.

This function will be used by the virtio pmd in next commits to calculate
the checksum in software in case the protocol is not recognized.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-13 20:44:18 +02:00
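Computing a checksum over data spread across several segments, as c442fed81b describes, mainly requires care when a segment ends on an odd byte. The Python sketch below is a model of the technique (a 16-bit ones' complement sum), not the DPDK function; the function names are hypothetical.

```python
def raw_cksum_segments(segments):
    """16-bit ones' complement sum over data split across several segments."""
    total = 0
    odd = False  # True when an odd number of bytes has been consumed so far
    for seg in segments:
        for byte in seg:
            # Bytes at even offsets form the high half of a 16-bit word.
            total += byte if odd else byte << 8
            odd = not odd
    # Fold the carries back into 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def cksum_segments(segments):
    """Final checksum: the complement of the folded sum."""
    return ~raw_cksum_segments(segments) & 0xFFFF
```

Tracking the byte parity across segment boundaries is what makes the result independent of how the data is split.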
Zhihong Wang
f46f655143 vhost: fix Windows VM hang
This patch fixes a Windows VM compatibility issue in DPDK 16.07 vhost code
which causes the guest to hang once any packets are enqueued when mrg_rxbuf
is turned on by setting the right id and len in the used ring.

As defined in virtio spec 0.95 and 1.0, in each used ring element, id means
index of start of used descriptor chain, and len means total length of the
descriptor chain which was written to. While in 16.07 code, index of the
last descriptor is assigned to id, and the length of the last descriptor is
assigned to len.

How to test?

 1. Start testpmd in the host with a vhost port.

 2. Start a Windows VM image with qemu and connect to the vhost port.

 3. Start io forwarding with tx_first in host testpmd.

For 16.07 code, the Windows VM will hang once any packets are enqueued.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-13 10:29:31 +02:00
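The used ring rule quoted in f46f655143 (id is the index of the head of the descriptor chain, len is the total length written to the chain) can be illustrated with a small model. This Python sketch uses a hypothetical `Desc` tuple and is not vhost code.

```python
from collections import namedtuple

# written: bytes written into this descriptor; next: index of the next
# descriptor in the chain, or None at the end of the chain.
Desc = namedtuple("Desc", "written next")


def used_elem(descs, head):
    """Return the (id, len) pair for the chain starting at index head."""
    total = 0
    idx = head
    while idx is not None:
        total += descs[idx].written
        idx = descs[idx].next
    return head, total


# A two-descriptor chain: 4 -> 7.
chain = {4: Desc(1500, 7), 7: Desc(548, None)}
```

For this chain the correct used element is `(4, 2048)`; the behavior described for the 16.07 code would correspond to `(7, 548)`, the index and length of the last descriptor only.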
Yuanhan Liu
9ba1e744ab vhost: add a flag to enable dequeue zero copy
Dequeue zero copy is disabled by default. Here add a new flag
``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` to explicitly enable it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2016-10-13 10:29:20 +02:00
Yuanhan Liu
b0a985d1f3 vhost: add dequeue zero copy
The basic idea of dequeue zero copy is, instead of copying data from
the desc buf, here we let the mbuf reference the desc buf addr directly.

Doing so, however, has one major issue: we can't update the used ring
at the end of rte_vhost_dequeue_burst. Because we don't do the copy
here, an update of the used ring would let the driver reclaim the
desc buf. As a result, DPDK might reference a stale memory region.

To update the used ring properly, this patch does several tricks:

- When an mbuf references a desc buf, its refcnt is incremented by 1.

  This pins the mbuf, so that an mbuf free from DPDK
  won't actually free it; instead, refcnt is decremented by 1.

- We chain all those mbufs together (by tailq).

  And we check the chain on every entry to rte_vhost_dequeue_burst,
  to see if the mbuf has been freed (when refcnt equals 1). If so,
  we are the last user of this mbuf and it is
  safe to update the used ring.

- "struct zcopy_mbuf" is introduced, to associate an mbuf with the
  right desc idx.

Dequeue zero copy is introduced for performance reasons, and some rough
tests show about a 50% performance boost for a packet size of 1500B. For small
packets (e.g. 64B), it actually slows things down a bit (by up to
15%). That is expected because this patch introduces some extra work,
which outweighs the benefit of saving the copy of a few bytes.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2016-10-12 09:45:14 +02:00
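The deferred used ring update in b0a985d1f3 can be modeled with plain reference counts. The following Python sketch is an illustration under simplifying assumptions (one descriptor per mbuf, a list instead of a tailq); all class and function names are hypothetical.

```python
class Mbuf:
    def __init__(self):
        self.refcnt = 1  # one reference held by whoever receives the mbuf


def mbuf_free(mbuf):
    """Model of an application-side free: drop one reference."""
    mbuf.refcnt -= 1


class ZcopyQueue:
    def __init__(self):
        self.pending = []    # (mbuf, desc_idx) pairs; models the zcopy_mbuf list
        self.used_ring = []  # descriptor indexes handed back to the driver

    def reclaim(self):
        """Update the used ring for every mbuf the application has freed."""
        still_pinned = []
        for mbuf, desc_idx in self.pending:
            if mbuf.refcnt == 1:       # only our pin is left
                self.used_ring.append(desc_idx)
                mbuf.refcnt -= 1       # drop the pin; the mbuf is really free now
            else:
                still_pinned.append((mbuf, desc_idx))
        self.pending = still_pinned

    def dequeue(self, desc_idx):
        self.reclaim()                 # checked on every dequeue entry
        mbuf = Mbuf()
        mbuf.refcnt += 1               # pin: the mbuf points into the desc buf
        self.pending.append((mbuf, desc_idx))
        return mbuf
```

A descriptor index only reaches `used_ring` after the application has freed the corresponding mbuf, so the driver cannot reclaim a buffer that is still referenced.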
Yuanhan Liu
f6be82d725 vhost: introduce last available index for dequeue
So far, we retrieve both the used ring and avail ring idx from the var
last_used_idx; this is not a problem because the used ring is updated
immediately after those avail entries are consumed.

But that's not true when dequeue zero copy is enabled: the used ring is
updated only when the mbuf is consumed. Thus, we need to use another var to
note the last avail ring idx we have consumed.

Therefore, last_avail_idx is introduced.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2016-10-12 09:45:12 +02:00
Yuanhan Liu
e246896178 vhost: get guest/host physical address mappings
So that we can convert a guest physical address to host physical
address, which will be used in later Tx zero copy implementation.

MAP_POPULATE is set while mmapping guest memory regions, to make
sure the page tables are set up so that rte_mem_virt2phy() can
yield a proper physical address.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2016-10-12 09:45:09 +02:00
Yuanhan Liu
552e8fd3d2 vhost: simplify memory regions handling
For historical reasons (vhost-cuse came before vhost-user), some
fields for maintaining the vhost-user memory mappings (such as the mmapped
address and size, which we need in order to unmap on destroy) are kept in the
"orig_region_map" struct, a structure that is defined only in the vhost-user
source file.

The right way to go is to remove the structure and move all those fields
into the virtio_memory_region struct. But we simply couldn't do that before,
because it would have broken the ABI.

Now, thanks to the ABI refactoring, that is no longer a blocking issue.
And here it goes: this patch removes orig_region_map and
redefines virtio_memory_region to include all necessary info.

With that, we can simplify the guest/host address conversion a bit.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2016-10-12 09:44:56 +02:00
Thomas Monjalon
5fc07e3eb7 app/test: fix vdev names
The vdev eth_ring has been renamed to net_ring.
Some unit tests are using the old name and fail.

Fixes also the vdev comments in EAL and ethdev.

Fixes: 2f45703c17 ("drivers: make driver names consistent")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-13 15:55:42 +02:00
Jasvinder Singh
5a99f20868 port: support file descriptor
This patch adds a File Descriptor (FD) port type (e.g. TAP port) to the
packet framework library, which allows interfacing with the kernel network
stack. The FD port APIs allow port creation, and writing packets to and
reading packets from the kernel interface.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-10-13 11:42:37 +02:00
Jasvinder Singh
3275de7be8 port: modify source and sink port structure
The ``file_name`` data type of ``struct rte_port_source_params`` and
``struct rte_port_sink_params`` is changed from ``char *`` to ``const char *``.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-10-12 22:25:24 +02:00
Guruprasad Rao
5a80bf0ae6 table: add cuckoo hash
This patch provides table APIs for the dosig version of cuckoo hash
via rte_table_hash_cuckoo_dosig_ops.

The following APIs are implemented for cuckoo hash:
	rte_table_hash_cuckoo_create
	rte_table_hash_cuckoo_free
	rte_table_hash_cuckoo_entry_add
	rte_table_hash_cuckoo_entry_delete
	rte_table_hash_cuckoo_lookup_dosig
	rte_table_hash_cuckoo_stats_read

Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Signed-off-by: Guruprasad Rao <guruprasadx.rao@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-10-12 22:08:36 +02:00
Pablo de Lara
94eaebad74 hash: fix bucket size usage
Multiwriter insert function was using a fixed value for
the bucket size, instead of using the
RTE_HASH_BUCKET_ENTRIES macro, whose value was changed
recently (making it inconsistent in this case).

Fixes: be856325cb ("hash: add scalable multi-writer insertion with Intel TSX")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-12 18:40:52 +02:00
Pablo de Lara
243e93a504 hash: fix unlimited cuckoo path
When trying to insert a new entry, if its target bucket is full,
the alternative location (bucket) of one of the entries is checked,
to try to find an empty slot, with make_space_bucket.
This function is called every time a new bucket is checked, recursively.
To avoid having a very long insert operation (and to avoid filling up
the stack), a limit in the number of pushes is introduced.

Fixes: 48a3991196 ("hash: replace with cuckoo hash implementation")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-12 18:40:51 +02:00
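The bounded displacement described in 243e93a504 can be sketched as a small cuckoo table. This Python model is an assumption-laden simplification, not the rte_hash implementation: the bucket size, the `MAX_PUSHES` value, the two toy hash functions and the `path` set used to avoid revisiting buckets are all hypothetical.

```python
BUCKET_ENTRIES = 2
MAX_PUSHES = 8  # hypothetical limit, not the value used by the library


class CuckooTable:
    def __init__(self, n_buckets):
        self.n = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]
        self._pushes = 0

    def _alt(self, key, bucket):
        """Return the other candidate bucket of key (toy hash functions)."""
        first, second = key % self.n, (key // self.n) % self.n
        return second if bucket == first else first

    def _make_space(self, bucket, path):
        """Free one slot in bucket; the table is modified only on success."""
        path = path | {bucket}
        for key in list(self.buckets[bucket]):
            alt = self._alt(key, bucket)
            if alt in path:
                continue  # never push back into a bucket we came from
            if len(self.buckets[alt]) < BUCKET_ENTRIES:
                has_room = True
            else:
                self._pushes += 1  # one more recursive push
                has_room = (self._pushes <= MAX_PUSHES
                            and self._make_space(alt, path))
            if has_room:
                self.buckets[bucket].remove(key)
                self.buckets[alt].append(key)
                return True
        return False

    def insert(self, key):
        first = key % self.n
        for bucket in (first, self._alt(key, first)):
            if len(self.buckets[bucket]) < BUCKET_ENTRIES:
                self.buckets[bucket].append(key)
                return True
        self._pushes = 0
        if self._make_space(first, frozenset()):
            self.buckets[first].append(key)
            return True
        return False  # push limit reached or no entry can move
```

The shared push counter bounds both the insertion time and the recursion depth, which is the purpose stated in the commit message.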
Reshma Pattan
1a0c9cb015 pdump: fix created directory permissions
Inside the function pdump_get_socket_path(), pdump socket
directories are created using the mkdir() call with permissions 700,
which was assigning wrong permissions to the directories,
i.e. "d-w-r-xr-T" instead of "drwx------". The reason is that 700 is
not interpreted as an octal value unless 0 is explicitly
added before the value. Because of this, socket creation failure is
observed when a DPDK application is run in non-root user mode.
A DPDK application running in root user mode never reported the issue.

So 0 is prefixed to the value to create directories with
the correct permissions.

Fixes: e4ffa2d3 ("pdump: fix error handlings")
Fixes: bdd8dcc6 ("pdump: fix default socket path")

Reported-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2016-10-12 18:40:49 +02:00
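The permission string quoted in 1a0c9cb015 can be reproduced by treating 700 as a decimal number. The Python sketch below only does the arithmetic (it does not create a directory); the umask of 022 is an assumption, chosen because it is a common default.

```python
import stat

UMASK = 0o022  # assumed umask for this example


def dir_mode_string(mode):
    """Render the permissions a directory created with `mode` would get."""
    return stat.filemode(stat.S_IFDIR | (mode & ~UMASK))


decimal_700 = dir_mode_string(700)    # decimal 700 is octal 1274
octal_700 = dir_mode_string(0o700)    # the intended value
```

Decimal 700 equals octal 1274, which sets the sticky bit and leaves the owner with write permission only, so a non-root owner cannot create the socket inside the directory.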
Olivier Matz
5d4955d3e3 mbuf: add functions to dump offload flags
The functions rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name()
can dump one flag, or a set of flags that are part of the same mask (e.g.
PKT_TX_UDP_CKSUM, part of PKT_TX_L4_MASK). But they are not designed to
dump the list of flags contained in mbuf->ol_flags.

This commit introduces new functions to do that. As with the packet
type dump functions, the goal is to factorize the code that could be
used in several applications and reduce the risk of desynchronization
between the flags and the dump functions.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-12 18:08:40 +02:00
Olivier Matz
8c5cb94993 mbuf: clarify definition of fragment packet types
An IPv4 packet is considered a fragment if:
- the MF (more fragments) bit is set
- or the Fragment_Offset field is non-zero

Update the API documentation of packet types to reflect this.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:22:46 +02:00
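The fragment rule stated in 8c5cb94993 translates into a single mask test on the 16-bit flags/fragment-offset field of the IPv4 header. This is a minimal Python sketch; the constant and function names are hypothetical, and the field is assumed to be already converted to host byte order.

```python
IPV4_MF_FLAG = 0x2000      # "more fragments" bit
IPV4_OFFSET_MASK = 0x1FFF  # 13-bit fragment offset


def ipv4_is_fragment(fragment_field):
    """True if MF is set or the fragment offset is non-zero."""
    return (fragment_field & (IPV4_MF_FLAG | IPV4_OFFSET_MASK)) != 0
```

Note that the DF (don't fragment) bit, 0x4000, is deliberately outside the mask: a packet with only DF set is not a fragment.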
Olivier Matz
288541c8ff mbuf: add functions to dump packet type
Dumping the packet type is useful for debug purposes. Instead
of having each application provide its own function to do that,
introduce functions to do it.

It factorizes the code and reduces the risk of desynchronization between
the new packet types and the dump function.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:22:36 +02:00
Olivier Matz
873dd1e68d net: get packet type for the first layers only
Add a parameter to rte_net_get_ptype() to select which
layers should be parsed. This avoids parsing all layers if
only the first ones are required.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:13 +02:00
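The idea behind 873dd1e68d, stopping the parser once the requested layers are known, can be sketched with a toy Ethernet/IPv4 parser. This Python model is not rte_net_get_ptype(): the layer constants, the function name and the returned list of strings are hypothetical, and only untagged Ethernet with IPv4 is handled.

```python
import struct

LAYER_L2, LAYER_L3, LAYER_L4 = 1, 2, 4
LAYER_ALL = LAYER_L2 | LAYER_L3 | LAYER_L4


def get_ptype(pkt, layers=LAYER_ALL):
    """Return the names of the layers found, parsing only requested layers."""
    found = []
    if not (layers & LAYER_L2) or len(pkt) < 14:
        return found
    found.append("L2_ETHER")
    (ether_type,) = struct.unpack_from("!H", pkt, 12)
    if not (layers & LAYER_L3) or ether_type != 0x0800 or len(pkt) < 34:
        return found
    found.append("L3_IPV4")
    if not (layers & LAYER_L4):
        return found  # caller did not ask for L4: skip the protocol lookup
    l4_name = {6: "L4_TCP", 17: "L4_UDP", 132: "L4_SCTP"}.get(pkt[14 + 9])
    if l4_name is not None:
        found.append(l4_name)
    return found
```

A caller that only needs the L2 and L3 types passes `LAYER_L2 | LAYER_L3` and the parser returns before touching the L4 protocol field.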
Olivier Matz
2c15c5377d net: support NVGRE in software packet type parser
Add support for NVGRE tunnels in rte_net_get_ptype(). At the same
time, as NVGRE transports Ethernet, we need to add support for inner
VLAN, QinQ, and MPLS.

Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:13 +02:00
Olivier Matz
d21d855464 net: support GRE in software packet type parser
Add support for GRE tunnels in rte_net_get_ptype().

Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:13 +02:00
Olivier Matz
894f71a380 net: add GRE header structure
Add the GRE header structure in librte_net. It will be used by the next
patches that add support for GRE tunnels in the software packet type
parser.

The extended headers (checksum, key or sequence number) are not defined.

Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:13 +02:00
Olivier Matz
b32159dcca net: support IP tunnels in software packet type parser
Add support of IP and IP6 tunnels in rte_net_get_ptype().

We need to duplicate some code because the packet types do not have the
same value for a given protocol between inner and outer.

Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:12 +02:00
Olivier Matz
218a163efd net: support QinQ in software packet type parser
Add a new RTE_PTYPE_L2_ETHER_QINQ packet type, and its support in
rte_net_get_ptype().

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:12 +02:00
Olivier Matz
eb173c8def net: support VLAN in software packet type parser
Add a new RTE_PTYPE_L2_ETHER_VLAN packet type, and its support in
rte_net_get_ptype().

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:12 +02:00
Olivier Matz
ab09e04d2c net: add function to get packet type from data
Introduce the function rte_net_get_ptype() that parses a mbuf and
returns its packet type. For now, the following packet types are parsed:
   L2: Ether
   L3: IPv4, IPv6
   L4: TCP, UDP, SCTP

The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd in next commits, allowing
its result to be compared with the value given by the hardware.

This function will also be useful when implementing Rx offload support
in virtio pmd. Indeed, the virtio protocol gives the csum start and
offset, but it does not give the L4 protocol, nor does it tell whether the
checksum is relevant for inner or outer. This information has to be
known to properly set the ol_flags in mbuf.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:17:09 +02:00
Olivier Matz
b25c2a8c69 net: introduce net library
Previously, librte_net only contained header files. Add a C file
(empty for now) and generate a library. It will contain network helpers
like checksum calculation, software packet type parser, ...

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:16:22 +02:00
Olivier Matz
a3917f2218 mbuf: move packet type definitions in a new file
The file rte_mbuf.h starts to be quite big, and next commits
will introduce more functions related to packet types. Let's
move them in a new file.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:16:22 +02:00
Olivier Matz
57668ed7bc net: move ethernet definitions to the net library
The proper place for rte_ether.h is in librte_net because it defines
network headers.

Moving it will also prevent circular references in the following
patches that will require the Ethernet header definition in rte_mbuf.c.
By the way, fix minor checkpatch issues.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:16:22 +02:00
Olivier Matz
b84110e7ba mbuf: add function to read packet data
Introduce a new function to read the packet data from an mbuf chain. It
linearizes the data if required, and also ensures that the mbuf is large
enough.

This function is used in next commits that add a software parser to
retrieve the packet type.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-11 18:16:10 +02:00
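The behavior described in b84110e7ba (return the data directly when it is contiguous, copy it into a caller-provided buffer when it spans segments, fail when the packet is too short) can be modeled as follows. This Python sketch is not the DPDK function; `pkt_read` and its parameters are hypothetical, and segments are plain `bytes` objects.

```python
def pkt_read(segments, offset, length, scratch):
    """Return `length` bytes starting at `offset`, or None if out of range.

    A range inside one segment is returned as a zero-copy memoryview of
    that segment; a range spanning segments is copied into `scratch`.
    """
    total = sum(len(seg) for seg in segments)
    if offset < 0 or length <= 0 or offset + length > total:
        return None
    # Find the segment that contains the first requested byte.
    idx = 0
    while offset >= len(segments[idx]):
        offset -= len(segments[idx])
        idx += 1
    if offset + length <= len(segments[idx]):
        return memoryview(segments[idx])[offset:offset + length]
    # Linearize: gather the pieces into the scratch buffer.
    copied = 0
    while copied < length:
        chunk = segments[idx][offset:offset + length - copied]
        scratch[copied:copied + len(chunk)] = chunk
        copied += len(chunk)
        offset = 0
        idx += 1
    return memoryview(scratch)[:length]
```

The caller only pays for a copy when the requested header actually straddles a segment boundary.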
David Marchand
10313a4699 ethdev: fix vendor id in debug message
Fixes: af75078fec ("first public release")

Signed-off-by: David Marchand <david.marchand@6wind.com>
2016-10-10 11:53:31 +02:00
David Marchand
ad606c9f6a ethdev: fix hotplug attach
If a pci probe operation creates a port but, for any reason, fails to
finish this operation and decides to delete the newly created port, then
the last created port id can not be trusted anymore and any subsequent
attach operations will fail.

This problem was noticed while working on a VM that had a virtio-net
management interface bound to the virtio-net kernel driver and no port
whitelisted on the command line:

root@ubuntu1404:~/dpdk# ./build/app/testpmd -c 0x6 --
	 -i --total-num-mbufs=2049
EAL: Detected 3 lcore(s)
EAL: Probing VFIO support...
EAL: Debug logs available - lower performance
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using
	unreliable clock cycles !
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL:   probe driver: 1af4:1000 (null)
rte_eth_dev_pci_probe: driver (null): eth_dev_init(vendor_id=0x6900
	device_id=0x1000) failed
EAL: No probed ethernet devices
     ^
     |
     Here, rte_eth_dev_pci_probe() fails since vtpci_init() reports an
     error. This results in a rte_eth_dev_release_port() right after a
     rte_eth_dev_allocate().

Then, if we try to attach a port using rte_eth_dev_attach:

testpmd> port attach net_ring0
Attaching a new port...
PMD: Initializing pmd_ring for net_ring0
PMD: Creating rings-backed ethdev on numa socket 0

Two solutions:
- either update the last created port index to something invalid
  (when freeing an ethdev port),
- or rely on the port count, before and after the eal attach.

The latter solution seems (well not really more robust but at least)
less fragile than the former.
We still have some issues with drivers that create multiple ethdev
ports with a single probe operation, but this was already the case.

Fixes: b0fb266855 ("ethdev: convert to EAL hotplug")

Reported-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2016-10-10 11:52:50 +02:00
Jianfeng Tan
c59faf3fe8 net/i40e: support TSO on tunneling packet
To enable Tx side offload on tunneling packets, the driver should set
the correct tunneling parameters: (1) EIPT, External IP header type;
(2) EIPLEN, External IP header length; (3) L4TUNT; (4) L4TUNLEN. This parsing
behavior is based on (ol_flag & PKT_TX_TUNNEL_MASK). And when
it's a tunneling packet, MACLEN defines the outer L2 header.

Also, we define TSO on each kind of tunneling type as a capability.
Now only i40e declares support for them.

Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-10-09 23:19:15 +02:00
Jianfeng Tan
63c0d74daa mbuf: add Tx side tunneling type
To support tunneling packet offload capabilities on the Tx side, PMDs
(e.g., i40e) need to know the tunneling type of the packet.
Instead of analyzing the packet itself, we depend on applications to
correctly set the tunneling type. These flags are defined inside
rte_mbuf.ol_flags.

Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-10-09 23:18:48 +02:00
Pablo de Lara
afc5dffaa0 cryptodev: fix build on Suse 11 SP2
This commit fixes the following build error, which happens on SUSE 11 SP2
with gcc 4.5.1:

In file included from lib/librte_cryptodev/rte_cryptodev.c:70:0:
lib/librte_cryptodev/rte_cryptodev.h:772:7:
error: flexible array member in otherwise empty struct

Fixes: 347a1e037f ("lib: use C99 syntax for zero-size arrays")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-10-08 17:54:38 +02:00
Slawomir Mrozowicz
d61f70b4c9 crypto/libcrypto: add driver for OpenSSL library
This code provides the initial implementation of the libcrypto
poll mode driver. All cryptography operations use the OpenSSL
library crypto API. Each algorithm uses the EVP_ interface from the
OpenSSL API, which is recommended by the OpenSSL maintainers.

This patch adds libcrypto poll mode driver support to librte_cryptodev
library.

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-08 17:54:37 +02:00
Pablo de Lara
cf7685d68f crypto/zuc: add driver for ZUC library
Added new SW PMD which makes use of the libsso SW library,
which provides wireless algorithms ZUC EEA3 and EIA3
in software.

This PMD supports cipher-only, hash-only and chained operations
("cipher then hash" and "hash then cipher") of the following
algorithms:
- RTE_CRYPTO_SYM_CIPHER_ZUC_EEA3
- RTE_CRYPTO_SYM_AUTH_ZUC_EIA3

The ZUC hash and cipher algorithms enabled by this crypto PMD
are implemented by Intel's libsso software
library.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
2016-10-08 17:53:10 +02:00
Pablo de Lara
af9f6afb14 crypto: rename all KASUMI references
KASUMI algorithm has all uppercase letters,
but some references of it had some lowercase letters.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
2016-10-04 20:41:09 +02:00
Pablo de Lara
6aef763816 crypto: rename some SNOW 3G references
SNOW 3G algorithm has all uppercase letters in its name
and a space between SNOW and 3G, but some references of it
had some lowercase letters or no space.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
2016-10-04 20:41:09 +02:00
Arek Kusztal
2508c30912 cryptodev: update GMAC API comments
In file rte_crypto_sym.h, GMAC API comments need to be changed
to comply with the GMAC specification. Main areas of change are
aad pointer and aad len, which now will be used to
provide plaintext.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
2016-10-04 20:41:09 +02:00
Shreyansh Jain
50a3345fa9 vdev: rename init/uninit ops to probe/remove
In line with PCI probe and remove, VDEV probe and remove hooks provide
uniform naming.
PCI probe represents scan and driver initialization. For VDEV, it will
represent argument parsing and initialization.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-06 16:02:14 +02:00
Olivier Matz
11fd19f764 mem: fix build with -O1
When compiled with EXTRA_CFLAGS="-O1", the compiler is not
able to detect that size is always initialized when used, and
issues a wrong warning:

  eal_memory.c: In function ‘rte_eal_hugepage_attach’:
  eal_memory.c:1684:3: error: ‘size’ may be used uninitialized in this
                       function [-Werror=maybe-uninitialized]
     munmap(hp, size);
     ^

Work around this issue by initializing size to 0.
Seen on gcc (Debian 5.4.1-1) 5.4.1 20160803.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-06 15:23:53 +02:00
Panu Matilainen
dad7734e66 ip_frag: fix missing dependency on hash library
Not sure what exactly changed and where, but I've started getting
build failures on Fedora rawhide i386:
    lib/librte_ip_frag/ip_frag_internal.c:36:23: fatal error:
	    rte_jhash.h: No such file or directory
     #include <rte_jhash.h>
                       ^
Looking at librte_ip_frag, it clearly depends on librte_hash so
it's probably more a question of something commonly masking the issue.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
2016-10-05 15:47:23 +02:00
Ferruh Yigit
094be9925d kni: remove unnecessary ethtool files
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2016-10-05 15:45:21 +02:00
Ferruh Yigit
7550f1201f kni: remove unused ethtool files
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2016-10-05 15:45:11 +02:00
Maxime Coquelin
febf2bb46d mbuf: add function to reset headroom
Some applications use the rte_mbuf_raw_alloc() function to improve
performance by not resetting mbuf's fields to their default state.

This can be however problematic for mbuf consumers that need some
headroom, meaning that data_off field gets decremented after
allocation. When the mbuf is re-used afterwards, there might not
be enough room for the consumer to prepend anything, if the data_off
field is not reset to its default value.

This patch adds a new rte_pktmbuf_reset_headroom() function that
applications can call to reset the data_off field.
This patch also replaces the current data_off assignments in the mbuf
lib with a call to this function.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-05 15:13:37 +02:00
Olivier Matz
4e8739e9bb mbuf: fix error handling on pool creation
On error, the mempool object has to be freed, and rte_errno should be a
positive value.

Fixes: 152ca51790 ("mbuf: use default mempool handler from config")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-05 14:21:05 +02:00
Byron Marohn
ff15d9c0ba hash: modify lookup bulk pipeline
This patch replaces the pipelined rte_hash lookup mechanism with a
loop-and-jump model, which performs significantly better,
especially for smaller table sizes and smaller table occupancies.

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
2016-10-05 12:10:49 +02:00
Byron Marohn
58017c98ed hash: add vectorized comparison
In lookup bulk function, the signatures of all entries
are compared against the signature of the key that is being looked up.
Now that all the signatures are together, they can be compared
with vector instructions (SSE, AVX2), achieving higher lookup performance.

Also, entries per bucket are increased to 8 when using processors
with AVX2, as 256 bits can be compared at once, which is the size of
8x32-bit signatures.
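
The idea can be sketched as follows, assuming a bucket of four 32-bit
signatures and SSE2 (with a scalar fallback so the sketch builds
elsewhere). All names are hypothetical; this is not the code from the
patch:

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical bucket width: four 32-bit signatures fill 128 bits. */
#define SKETCH_BUCKET_ENTRIES 4

/* Return a bitmask with bit i set when sigs[i] equals sig. */
static inline unsigned int
sketch_match_mask(const uint32_t sigs[SKETCH_BUCKET_ENTRIES], uint32_t sig)
{
#if defined(__SSE2__)
	/* Load all four stored signatures at once (unaligned load). */
	__m128i stored = _mm_loadu_si128((const __m128i *)sigs);
	/* Broadcast the looked-up signature to every 32-bit lane. */
	__m128i wanted = _mm_set1_epi32((int)sig);
	/* Lane-wise equality: matching lanes become all ones. */
	__m128i eq = _mm_cmpeq_epi32(stored, wanted);
	/* Collect the top bit of each lane into a 4-bit mask. */
	return (unsigned int)_mm_movemask_ps(_mm_castsi128_ps(eq));
#else
	unsigned int mask = 0;
	unsigned int i;

	for (i = 0; i < SKETCH_BUCKET_ENTRIES; i++)
		mask |= (unsigned int)(sigs[i] == sig) << i;
	return mask;
#endif
}
```

Each set bit in the returned mask marks a candidate entry whose full
key still has to be compared.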

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
2016-10-05 12:09:50 +02:00
Byron Marohn
8a9f542f32 hash: reorganize bucket structure
Move current signatures of all entries together in the bucket
and same with all alternative signatures, instead of having
current and alternative signatures together per entry in the bucket.
This will be beneficial in the next commits, where a vectorized
comparison will be performed, achieving better performance.

The alternative signatures have been moved away from
the current signatures, to make the key indices be consecutive
to the current signatures, as these two fields are used by lookup,
so they are in the same cache line.
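
The layout change can be pictured with the two hypothetical structures
below. Field names, the entry count and the omission of any other
bucket members are assumptions made for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKET_ENTRIES 4

/* Before: current and alternative signature interleaved per entry. */
struct sketch_sig_pair {
	uint32_t current;
	uint32_t alt;
};

struct sketch_bucket_old {
	struct sketch_sig_pair sigs[SKETCH_BUCKET_ENTRIES];
	uint32_t key_idx[SKETCH_BUCKET_ENTRIES];
};

/*
 * After: all current signatures are contiguous, the key indices follow
 * directly, and the alternative signatures come last.
 */
struct sketch_bucket_new {
	uint32_t sig_current[SKETCH_BUCKET_ENTRIES];
	uint32_t key_idx[SKETCH_BUCKET_ENTRIES];
	uint32_t sig_alt[SKETCH_BUCKET_ENTRIES];
};
```

With this ordering the two arrays read by lookup occupy the first
32 bytes of the bucket, and the contiguous signatures can be loaded
into one vector register.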

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
2016-10-05 12:08:56 +02:00
Pablo de Lara
02a08eb355 hash: reorder hash structure
In order to optimize lookup performance, hash structure
is reordered, so all fields used for lookup will be
in the first cache line.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
2016-10-05 12:08:04 +02:00
Karmarkar Suyash
0778cfe864 timer: fix lag delay
For periodic timers, if lag was introduced, the previous code added
further delay when the next periodic timer was initialized, because it
did not take the accumulated delay into account. With this fix, the
next occurrence of the timer is scheduled taking the lag into account.
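
The difference between the two behaviors can be sketched with plain
arithmetic. The helper names are hypothetical, and the actual patch may
compute the next deadline differently:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Drift-prone: the next deadline is measured from the moment the
 * callback actually ran, so every late run shifts all later ones.
 */
static inline uint64_t
sketch_next_from_now(uint64_t now, uint64_t period)
{
	return now + period;
}

/*
 * Lag-aware: the next deadline is measured from the previous deadline,
 * so a late run does not push back the following occurrences.
 */
static inline uint64_t
sketch_next_from_deadline(uint64_t prev_expire, uint64_t period)
{
	return prev_expire + period;
}
```

With a period of 100 cycles, a deadline at 1000 and the callback
serviced at 1030, the first helper schedules 1130 while the second
keeps the timer on its original grid at 1100.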

Fixes: 9b15ba89 ("timer: use a skip list")

Signed-off-by: Karmarkar Suyash <skarmarkar@sonusnet.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
2016-10-05 12:02:53 +02:00
Jean Tourrilhes
db8c96c551 mem: fix hugepage mapping error messages
Running secondary is tricky due to the need to map the memory region
at the right place in VM, which is whatever primary has chosen. If the
base address for primary happens to be already mapped in the
secondary, we will hit precisely these error messages (depending on
whether we fail on the config region or the hugepages). This is why there is
already a comment about ASLR.

The issue is that in most cases, remapping does not happen and "errno"
is not changed and therefore stale. In our case, we got a "permission
denied", which sent us down the wrong track. It's such a common error
for secondary that I feel this error message should be unambiguous and
helpful.
The call to close was also moved because close() may override errno.
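
A minimal sketch of the errno handling pattern described here, using a
hypothetical file-reading helper rather than the hugepage mapping code:
errno is captured before close() can overwrite it, and a failure that
does not set errno gets a fixed code instead of a stale value.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the first byte of a file.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_read_first_byte(const char *path, char *out)
{
	int fd, saved;
	ssize_t n;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, out, 1);
	if (n != 1) {
		/*
		 * Capture the error first: close() below may overwrite errno.
		 * A short read (n == 0) does not set errno at all, so report
		 * a fixed code rather than whatever errno held before.
		 */
		saved = (n < 0) ? errno : EIO;
		close(fd);
		return -saved;
	}

	close(fd);
	return 0;
}
```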

Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
2016-10-05 11:42:45 +02:00
Konstantin Ananyev
fd4015e98e eal: fix C++ link of delay function pointer
When compiling with C++, the compiler treats
void (*rte_delay_us)(unsigned int us);
as a definition of the global variable,
so linking with librte_eal then fails.
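
A sketch of the usual remedy, with hypothetical names: the header-side
line carries 'extern', which makes it a pure declaration in both C and
C++, and exactly one source file provides the definition.

```c
#include <assert.h>

/*
 * Header side: with 'extern' this is only a declaration, so every
 * translation unit including it refers to the same single object.
 */
extern void (*sketch_delay_us)(unsigned int us);

/* Source side: bookkeeping for the stand-in delay routine. */
static unsigned int sketch_total_us;

/* Stand-in implementation: records the request instead of waiting. */
static void sketch_delay_block(unsigned int us)
{
	sketch_total_us += us;
}

/* Source side: the one and only definition of the function pointer. */
void (*sketch_delay_us)(unsigned int us) = sketch_delay_block;
```

Without 'extern', C treats the header line as a tentative definition,
which toolchains have traditionally merged across objects, whereas C++
treats it as a full definition in every file that includes the header.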

Fixes: b4d63fb622 ("eal: customize delay function")

Steps to reproduce:

$ cat rttm1.cpp

#include <iostream>
#include <rte_eal.h>
#include <rte_cycles.h>

using namespace std;

int main(int argc, char *argv[])
{
        int ret = rte_eal_init(argc, argv);
        rte_delay_us(1);
        cout << "return code ";
        cout << ret;
        return ret;
}

$ g++ -m64 -I/${RTE_SDK}/${RTE_TARGET}/include -c  -o rttm1.o rttm1.cpp
$ gcc -m64 -pthread -o rttm1 rttm1.o -ldl -Wl,-lstdc++ \
  -L/${RTE_SDK}/${RTE_TARGET}/lib -Wl,-lrte_eal
.../librte_eal.a(eal_common_timer.o):
(.bss+0x0): multiple definition of `rte_delay_us'
rttm1.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status

$ nm rttm1.o | grep rte_delay_us
0000000000000092 t _GLOBAL__sub_I_rte_delay_us
0000000000000000 B rte_delay_us

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-10-05 11:16:28 +02:00
Maxime Coquelin
2304dd73d2 vhost: support indirect Tx descriptors
Indirect descriptors are usually supported by virtio-net devices,
allowing a larger number of requests to be dispatched.

When the virtio device sends a packet using indirect descriptors,
only one slot is used in the ring, even for large packets.

The main effect is to improve the 0% packet loss benchmark.
A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
(fwd io for host, macswap for VM) on DUT shows a +50% gain for
zero loss.

On the downside, a micro-benchmark using testpmd txonly in the VM and
rxonly on the host shows a loss between 1 and 4%. Depending on
the needs, the feature can be disabled at VM boot time by passing the
indirect_desc=off argument to the vhost-user device in Qemu.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-09-28 02:18:33 +02:00
Matthias Gatto
255c4829ad vhost: remove obsolete comment
As new_device and destroy_device use an int instead of a
"struct virtio_net *", the comment about setting VIRTIO_DEV_RUNNING
doesn't make sense anymore. Also, if I've understood the code
correctly, the drivers take care of setting the flag before calling the
callbacks, so this comment is obsolete and I've removed it.

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-09-13 05:25:09 +02:00
Yuanhan Liu
484e42d46f vhost: simplify features set/get
No need to use a pointer to store/retrieve features.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:09 +02:00
Yuanhan Liu
bbd7e83520 vhost: get device once
Invoke get_device() at the beginning of vhost_user_msg_handler, so that
we check the return value only once. This saves many duplicate
get-and-check device calls.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Yuanhan Liu
fc2a9b5b64 vhost: unify function names
Some functions have the prefix "user_", while others have "vhost_".
Make them all start with "vhost_user_" to unify the function names.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Yuanhan Liu
de854fd95d vhost: fold common message handlers
For historical reasons (we used to have 2 vhost implementations), some
messages are handled in two calls: the vhost-specific implementation
handles a message first and then invokes the common one for further handling.

We have only one implementation now, so we can write one method for
each message. Fold those common handlers into the corresponding
vhost-user handlers.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Yuanhan Liu
a277c71598 vhost: refactor code structure
The code structure is a bit messy now. For example, vhost-user message
handling is spread to three different files:

    vhost-net-user.c  virtio-net.c  virtio-net-user.c

Here, vhost-net-user.c is the entry point that handles all those messages
and then invokes the right method for a specific message. Some of them
are stored in virtio-net.c, while others are stored in virtio-net-user.c.

The truth is all of them should be in one file, vhost_user.c.

So this patch refactors the source code structure, mainly by renaming
files and moving code from one file to another file that is more suitable
for storing it. Thus, no functional changes are made.

After the refactor, the code structure becomes:

- socket.c      handles all vhost-user socket file related stuff, such
                as, socket file creation for server mode, reconnection
                for client mode.

- vhost.c       mainly handles vhost device creation/destroy/reset.
                Most of the vhost API implementation is there, too.

- vhost_user.c  all stuff about vhost-user messages handling goes there.

- virtio_net.c  all stuff about virtio-net should go there. It has virtio
                net Rx/Tx implementation only so far: it's just a rename
                from vhost_rxtx.c

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Yuanhan Liu
8394025f54 vhost: remove sub-directory
We now have one vhost implementation; no sub source dir is needed.
Remove it by moving the files to the upper dir.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Yuanhan Liu
466d914b01 vhost: remove vhost-cuse
Remove vhost-cuse code, including the eventfd_link kernel module that
is for vhost-cuse only.

The lib/virt/qemu-wrap.py is also removed, as it's mainly for vhost-cuse
usage.

As we have one vhost implementation now, only one vhost config option is
needed. Thus, CONFIG_RTE_LIBRTE_VHOST_USER is removed.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-09-13 05:25:08 +02:00
Pablo de Lara
5230bc4c77 hash: fix free slot check
In function rte_hash_cuckoo_insert_mw_tm, while looking for
an empty slot, only the first entry in the bucket was being checked,
as key_idx array was not being iterated.
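
The corrected search can be sketched as a loop over every entry of the
bucket. The structure, the entry count and the empty-slot marker are
hypothetical stand-ins for the real ones:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKET_ENTRIES 4
#define SKETCH_EMPTY_SLOT 0u  /* assumed marker for "no key stored" */

/* Hypothetical, reduced bucket: only the key index array. */
struct sketch_bucket {
	uint32_t key_idx[SKETCH_BUCKET_ENTRIES];
};

/* Return the index of the first empty entry, or -1 if the bucket is full. */
static inline int sketch_find_free_slot(const struct sketch_bucket *bkt)
{
	int i;

	assert(bkt != NULL);
	/* Index with i on every iteration, not only the first entry. */
	for (i = 0; i < SKETCH_BUCKET_ENTRIES; i++) {
		if (bkt->key_idx[i] == SKETCH_EMPTY_SLOT)
			return i;
	}
	return -1;
}
```

A check that only ever inspects entry 0 would report a bucket such as
{5, 9, 0, 3} as full even though its third entry is free.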

Fixes: 5fc74c2e14 ("hash: check if slot is empty with key index")

Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-10-04 11:40:40 +02:00
Olivier Matz
faaf69adb9 ethdev: clarify API comment for imissed stats
The "imissed" stats represent RX packets dropped by the HW,
so we should not talk about mbufs as the hardware is not aware
of this structure. Buffer seems to be a better word.

Fixes: 4eadb8ba11 ("ethdev: do not deprecate imissed counter")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-04 11:35:23 +02:00
Ferruh Yigit
722500498f mempool: fix comments for no contiguous flag
Fixes: ce94a51ff0 ("mempool: add flag for removing phys contiguous constraint")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-04 11:16:01 +02:00
Ferruh Yigit
8e0437473d mempool: fix comments of create functions
Fixes: 85226f9c52 ("mempool: introduce a function to create an empty pool")
Fixes: d1d914ebbc ("mempool: allocate in several memory chunks by default")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-04 11:15:19 +02:00
Jerin Jacob
f91bcbb2d9 eal/armv8: use high-resolution cycle counter
The existing cntvct_el0 based rte_rdtsc() provides a portable
means to get the wall clock counter in user space. Typically
it runs at <= 100MHz.

The alternative method to enable rte_rdtsc() for a high resolution
wall clock counter is through the armv8 PMU subsystem.
The PMU cycle counter runs at CPU frequency. However,
access to the PMU cycle counter from user space is not enabled
by default in the arm64 linux kernel.
It is possible to enable user space access to the cycle counter
by configuring the PMU from the privileged mode (kernel space).

By default, the rte_rdtsc() implementation uses the portable
cntvct_el0 scheme. Applications can choose the PMU based
implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2016-10-04 10:43:44 +02:00
Yangchao Zhou
6edfa69ba6 pci: fix memory leak when detaching device
Fixes: dbe6b4b61b ("pci: probe or close device")

Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-04 10:05:51 +02:00
Jan Viktorin
13a1317d3b pci: create device list and fallback on its members
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.

As of now no one is using this list, but it can be used for iterating over
all devices (pdev/vdev/Xdev) and performing bulk actions (like cleanup).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:34:03 +02:00
Jan Viktorin
a000b58662 eal: introduce generalized device
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:34:02 +02:00
Jan Viktorin
0a4f4001db eal: register drivers explicitly
To register both vdev and pci drivers into the list of all rte_driver,
we have to call rte_eal_driver_register explicitly.
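
One reading of this, sketched with hypothetical types and two simple
linked lists: the type-specific registration keeps its own list and
additionally makes an explicit call to add the embedded generic driver
to the list of all drivers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic driver, kept in a list of all drivers. */
struct sketch_driver {
	const char *name;
	struct sketch_driver *next;
};

/* Hypothetical vdev driver embedding the generic one. */
struct sketch_vdev_driver {
	struct sketch_driver driver;
	struct sketch_vdev_driver *next;
};

static struct sketch_driver *sketch_all_drivers;
static struct sketch_vdev_driver *sketch_vdev_drivers;

/* Add a driver to the head of the generic list. */
static void sketch_driver_register(struct sketch_driver *drv)
{
	assert(drv != NULL);
	drv->next = sketch_all_drivers;
	sketch_all_drivers = drv;
}

/* Add a vdev driver to its own list and, explicitly, to the generic one. */
static void sketch_vdev_register(struct sketch_vdev_driver *vdrv)
{
	assert(vdrv != NULL);
	vdrv->next = sketch_vdev_drivers;
	sketch_vdev_drivers = vdrv;
	/* Nothing else links the embedded base, so do it here. */
	sketch_driver_register(&vdrv->driver);
}
```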

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:59 +02:00
Jan Viktorin
2f3193cf0f pci: inherit common driver in PCI driver
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.

Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver have been removed.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:55 +02:00
Jan Viktorin
9df1ae8a88 eal: rename and move PCI resource structure
There is no need to have a custom memory resource representation for
each infrastructure (PCI, ...) as it would always have the same members.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:53 +02:00
Jan Viktorin
8a4764a466 eal: include dev headers in place of PCI headers
Further refactoring and generalization of PCI infrastructure will
require access to the rte_dev.h contents.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:52 +02:00
Jan Viktorin
2695c6df69 eal: remove unused PMD types
- All devices register themselves by calling a kind of DRIVER_REGISTER_XXX.
  The PMD_REGISTER_DRIVER is not used anymore.
- PMD_VDEV type is also not being used - can be removed from all VDEVs.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:51 +02:00
Jan Viktorin
fe363dd425 drivers: use vdev registration
All PMD_VDEV drivers can now use rte_vdev_driver instead of the
rte_driver (which is embedded in the rte_vdev_driver).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:48 +02:00
Jan Viktorin
7d299777ec eal: remove PCI/vdev unused code
- Remove checks for VDEV from rte_eal_vdev_(init/uninit) as all devices
  are inherently virtual here.
- PDEVs perform PCI specific inits - rte_eal_dev_init() need not call
  rte_driver->init();

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:45 +02:00
Jan Viktorin
a160404539 eal: extract vdev infra
Move all PMD_VDEV-specific code into a separate module and header
file to not pollute the generic code anymore. There is now a list
of virtual devices available.

The rte_vdev_driver integrates the original rte_driver inside
(C inheritance). The rte_driver will however be changed in the
future to serve as a common base for all other types of drivers.
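
"C inheritance" here means embedding the base structure as a member of
the derived one. A minimal sketch with hypothetical types, including
the usual pointer arithmetic to get from the base back to the derived
structure:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical base type shared by all driver kinds. */
struct sketch_driver {
	const char *name;
};

/* Hypothetical derived type: the base is embedded as a member. */
struct sketch_vdev_driver {
	struct sketch_driver driver;
	int (*init)(const char *name, const char *args);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* "Downcast" from the generic driver to the vdev driver that embeds it. */
static inline struct sketch_vdev_driver *
sketch_to_vdev(struct sketch_driver *drv)
{
	assert(drv != NULL);
	return sketch_container_of(drv, struct sketch_vdev_driver, driver);
}
```

Generic code can keep lists of the base type only, while vdev-specific
code converts back when it needs the extra members.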

The existing PMDs (PMD_VDEV) are to be modified later (there is
no change for them at the moment).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:42 +02:00
David Marchand
6751f6deb7 ethdev: get rid of device type
Now that hotplug has been moved to eal, there is no reason to keep the
device type in this layer.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:39 +02:00
David Marchand
b0fb266855 ethdev: convert to EAL hotplug
Remove bus logic from ethdev hotplug by using eal for this.

Current api is preserved:
- the last port that has been created is tracked to return it to the
  application when attaching,
- the internal device name is reused when detaching.

We cannot get rid of ethdev hotplug yet since we still need some
mechanism to inform applications of port creation/removal to substitute
for ethdev hotplug api.

dev_type field in struct rte_eth_dev and rte_eth_dev_allocate are kept as
is, but this information is not needed anymore and is removed in the
following commit.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:36 +02:00
David Marchand
e8c528cb92 eal: add hotplug operations for PCI and vdev
Hotplug invocations, which deal with devices, should come from the layer
that already handles them, i.e. EAL.

For both attach and detach operations, 'name' is used to select the bus
that will handle the request.
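
A sketch of how a name could select the bus, under the assumption that
a name which parses as a full PCI address (domain:bus:device.function)
goes to PCI and anything else is treated as a virtual device. The enum
and the function are hypothetical, not the EAL implementation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

enum sketch_bus {
	SKETCH_BUS_PCI,
	SKETCH_BUS_VDEV
};

/* Pick the bus from the device name alone. */
static inline enum sketch_bus sketch_select_bus(const char *name)
{
	unsigned int domain, bus, devid, func;
	char trailing;

	assert(name != NULL);
	/*
	 * Exactly four hexadecimal fields and nothing after them: the
	 * trailing %c only converts if extra characters follow, in which
	 * case the result is 5 and the name is not a plain PCI address.
	 */
	if (sscanf(name, "%x:%x:%x.%x%c",
		   &domain, &bus, &devid, &func, &trailing) == 4)
		return SKETCH_BUS_PCI;

	return SKETCH_BUS_VDEV;
}
```

The same decision is then used for both attach and detach, so callers
only ever pass a name.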

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:32 +02:00
David Marchand
284576e3f6 ethdev: do not scan all PCI devices on attach
No need to scan all devices, we only need to update the device being
attached.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:29 +02:00
David Marchand
affe1cdc00 pci: introduce helpers for device name parsing/update
- Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c to
  common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
  method, it can be used across crypto/net PCI PMDs.
- Remove the crypto-specific routine and fall back to the common name function.
- Introduce an EAL-private update function for PCI device naming.

Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Merge crypto/pci helper patches]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:26 +02:00
David Marchand
c830cb2954 drivers: use PCI registration macro
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.

Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
  steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.

Update documentation, replacing an example referring to
PMD_REGISTER_DRIVER.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:23 +02:00