Commit Graph

3220 Commits

Author SHA1 Message Date
Yuanhan Liu
abd53c16b6 vhost: add features changed callback
Features could be changed after the feature negotiation. For example,
VHOST_F_LOG_ALL will be set/cleared at the start/end of live migration,
respecitively. Thus, we need a new callback to inform the application
on such change.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
cb04355743 vhost: rename virtio-net to vhost
Rename "virtio-net" to "vhost" in the API comments and vhost prog guide.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
7c12903746 vhost: rename device ops struct
rename "virtio_net_device_ops" to "vhost_device_ops", to not let it
be virtio-net specific.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
aca49772f6 vhost: do not include net specific headers
Include it internally, at vhost.h.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
f53cf83980 vhost: drop the Rx and Tx queue macro
They are virtio-net specific and should be defined inside the virtio-net
driver.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
c0674b1bc8 vhost: move the device ready check at proper place
Currently, we check vq->desc, vq->kickfd and vq->callfd to know whether
a virtio device is ready or not. However, we only do it when handling
SET_VRING_KICK message, which could be wrong if a vhost-user frontend
send SET_VRING_KICK first and SET_VRING_CALL later.

To work for all possible vhost-user frontend implementations, we could
move the ready check at the end of vhost-user message handler.

Meanwhile, since we do the check more often than before, the "virtio
not ready" message is dropped, to not flood the screen.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
b50a203986 vhost: export the number of vrings
We used to use rte_vhost_get_queue_num() for telling how many vrings.
However, the return value is the number of "queue pairs", which is
very virtio-net specific. To make it generic, we should return the
number of vrings instead, and let the driver do the proper translation.
Say, virtio-net driver could turn it to the number of queue pairs by
dividing 2.

Meanwhile, mark rte_vhost_get_queue_num as deprecated.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
ab4d7b9f1a vhost: turn queue pair to vring
The queue pair is very virtio-net specific, other devices don't have
such concept. To make it generic, we should log the number of vrings
instead of the number of queue pairs.

This patch just does a simple convert, a later patch would export the
number of vrings to applications.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
dba9bf127b vhost: export API to translate gpa to vva
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
40ef286f23 vhost: export vhost vring info
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:42:44 +02:00
Yuanhan Liu
ca33faf9ef vhost: introduce API to fetch negotiated features
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:41:54 +02:00
Yuanhan Liu
eb32247457 vhost: export guest memory regions
Some vhost-user driver may need this info to setup its own page tables
for GPA (guest physical addr) to HPA (host physical addr) translation.
SPDK (Storage Performance Development Kit) is one example.

Besides, by exporting this memory info, we could also export the
gpa_to_vva() as an inline function, which helps for performance.
Otherwise, it has to be referenced indirectly by a "vid".

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:40:13 +02:00
Yuanhan Liu
93433b639d vhost: make notify ops per vhost driver
Assume there is an application both support vhost-user net and
vhost-user scsi, the callback should be different. Making notify
ops per vhost driver allow application define different set of
callbacks for different driver.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:40:13 +02:00
Yuanhan Liu
0917f9d1f0 vhost: use new APIs to handle features
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:40:13 +02:00
Yuanhan Liu
5fbb3941da vhost: introduce driver features related APIs
Introduce few APIs to set/get/enable/disable driver features.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 10:40:13 +02:00
Yuanhan Liu
65388b43f5 vhost: fix fd leaks for vhost-user server mode
A vhost-user server socket could have many connections, thus many connfd.
However, we currently just use one single int var to store it. Meaning,
it will get overwritten every time a new connection is created.

While this will not create fatal issue as it sounds (since the correct
connfd is closured to the event loop thread by fdset_add), it may cause
fd leaks if a user invokes rte_vhost_driver_unregister before shutting
down all connections: it just closes the recent connfd.

A simple example that should be able to reproduce this leaks issues is,
del the ovs vhost-user port while the connected VMs are still alive. (Note
that it's suggested to use one socket for one VM, which makes the issue
not that fatal as it sounds again).

Since we already use a struct "vhost_user_connection" to track all info
about one connection, it's obvious that we should put the connfd there.
Then we could build a connection list inside the vhost_user_socket struct,
to represent all connections belong that socket file.

Fixes: 164fd39678 ("vhost: fix unregistering in client mode")
Cc: stable@dpdk.org
Cc: Ilya Maximets <i.maximets@samsung.com>

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Jianfeng Tan
99998feec9 eal/linux: add interrupt type for vdev
A new interrupt type, RTE_INTR_HANDLE_VDEV, is added to support lsc and rxq
interrupt for vdev.

For lsc interrupt, except from original EPOLLIN events, we also listen for
socket peer closed connection event (EPOLLRDHUP and EPOLLHUP).

For rxq interrupt, add a precondition to avoid invoking any vfio and uio
code.

For intr_handle initialization, let each vdev driver to do that.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-04-01 10:36:17 +02:00
Kevin Traynor
4e9474141e vhost: fix false sharing
The broadcast_rarp field in the virtio_net struct is checked in the
dequeue datapath regardless of whether descriptors are available or not.

As it is checked with cmpset leading to a write, false sharing on the
virtio_net struct can happen between enqueue and dequeue datapaths
regardless of whether a RARP is requested. In OVS, the issue can cause
a uni-directional performance drop of up to 15%.

Fix that by only performing the cmpset if a read of broadcast_rarp
indicates that the cmpset is likely to succeed.

Fixes: a66bcad322 ("vhost: arrange struct fields for better cache sharing")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Maxime Coquelin
72e8543093 vhost: add API to get MTU value
This patch implements the function for the application to
get the MTU value.

rte_vhost_get_mtu() fills the mtu parameter with the MTU value
set in QEMU if VIRTIO_NET_F_MTU has been negotiated and returns 0,
-ENOTSUP otherwise.

The function returns -EAGAIN if Virtio feature negotiation
didn't happened yet.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Maxime Coquelin
4c5d8459d2 vhost: add new ready status flag
This patch adds a new status flag indicating the Virtio device
is ready to operate.

This is required to be able to call rte_vhost_mtu_get() in the
.new_device() callback, as rte_vhost_mtu_get needs that the
negotiation is done, but it is too early to rely on running status
flag, which is set just after .new_device() returns.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Maxime Coquelin
23f1e756ca vhost: support MTU protocol feature
This patch implements the vhost-user MTU protocol feature support.
When VIRTIO_NET_F_MTU is negotiated, QEMU notifies the vhost-user
backend with the configured MTU if dedicated protocol feature is
supported.

The value can be used by the application to ensure consistency with
value set by the user.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Maxime Coquelin
3d3c6590b5 vhost: enable virtio MTU feature
This patch enables the new VIRTIO_NET_F_MTU feature,
which makes possible for the host to advise the guest
with its maximum supported MTU.

MTU value is set via QEMU parameters, either via Libvirt XML, or
directly in virtio-net device command line arguments.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:17 +02:00
Yuanhan Liu
160cbc815b vhost: remove a hack on queue allocation
We used to allocate queues based on the index from SET_VRING_CALL
request: if corresponding queue hasn't been allocated, allocate it.

Though it's pratically right (it's the first per-vring request we
will get from QEMU for vhost-user negotiation), but it's not technically
right: it's not documented in the vhost-user spec that it will always
be the first per-vring request. For example, SET_VRING_ADDR could also
be the first per-vring request.

Thus, we should not depend the SET_VRING_CALL on queue allocation.
Instead, we could catch all the per-vring messages at the entrance of
request handler, and allocate one if it hasn't been allocated before.

By that, we could remove a hack.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 10:36:06 +02:00
Yuanhan Liu
de8e1fdcec vhost: fix max queues
0x8000 is the max virito-net queue pairs the virtio 1.0 spec claims to
support. While for vhost-user, it's a different story: the max vring
index could be passed by the vhost-user spec is 0xff, masked by the
VHOST_USER_VRING_IDX_MASK.

That said, the max queue pairs could vhost-user could supported is 0x80.
If user are asking more, I think the vhost-user need be extended.

Fixes: b09b198bfb ("vhost-user: announce queue number in message")
Cc: stable@dpdk.org

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 08:58:54 +02:00
Yuanhan Liu
8d286dbeb8 vhost: fix multiple queue not enabled for old kernels
Some macros (say VIRTIO_NET_F_MQ) are needed for enabling multiple queue,
however they are introduced since kernel v3.8, meaning build error happens
if we build DPDK vhost on those platforms.

71dfdbe66a ("vhost: fix build with kernel < 3.8") meant to fix it, but
in a wrong way: it completely disables the MQ features for those kernels.
However, the MQ feature doesn't depend on the kernel at all (except the
macros dependency stated above), that we could still enable the MQ feature
even the host kernel has no such support.

The right fix is to define the macro if it's not defined.

Fixes: 71dfdbe66a ("vhost: fix build with kernel < 3.8")
Cc: stable@dpdk.org

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-04-01 08:58:54 +02:00
Ilya Maximets
29b851e8de vhost: change log levels in client mode
Inability to connect to socket is a normal situation
in client mode because, in common case, server isn't
started yet. RTE_LOG_WARNING should be suitable for
the case of some unusual errors.
Message about reconnection is not an error at all.

Fixes: e623e0c6d8 ("vhost: add reconnect ability")
Cc: stable@dpdk.org

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-04-01 08:58:54 +02:00
Matthias Gatto
1b815b8959 vhost: try to shrink pfdset when fdset_add fails
fdset_add increments pfdset->num, but fdset_del doesn't decrement
pfdset->num, so if we call fdset_add then fdset_del in a loop without
calling fdset_shrink, we can easily exceed MAX_FDS with only a few
number of fds used.

So my solution is simply to call fdset_shrink in fdset_add when it
exceeds MAX_FDS.

Because fdset_shrink and fdset_add locks pfdset->fd_mutex we can't
call fdset_shrink inside fdset_add because that would cause a dead
lock, so this patch split fdset_shrink in two, fdset_shrink and
fdset_shrink_nolock.

Fixes: 59317cef24 ("vhost: allow many vhost-user ports")
Cc: stable@dpdk.org

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
2017-04-01 08:58:54 +02:00
Yongseok Koh
6f60ca5e5e ethdev: remove requirement of aligned RETA size
In rte_eth_check_reta_mask(), it is required to align the size of the RETA
table to RTE_RETA_GROUP_SIZE but as the size can be less than the limit,
this should be removed. The change is also applied to a command of testpmd.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2017-04-04 19:03:02 +02:00
Beilei Xing
7cd048321d ethdev: add MPLS and GRE flow API items
This patch adds MPLS and GRE items to generic rte flow.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-04-04 19:02:58 +02:00
Shahaf Shuler
49e2f374e4 eal/linux: support external Rx interrupt
Prior to this patch only UIO/VFIO interrupt handlers types were supported.
This patch adds support for the external interrupt handler type, allowing
external drivers to set their own fds with specific interrupt handlers.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2017-04-04 18:59:39 +02:00
Nirmoy Das
2972254ce1 kni: fix build on Suse 12 SP3
Add support for SLES12SP3, which uses kernel 4.4,
but backported features from newer kernels.

Signed-off-by: Nirmoy Das <ndas@suse.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 17:11:26 +02:00
Allain Legacy
216079fb1d cfgfile: support empty value
This commit adds support to the cfgfile library for parsing a key=value
line that has no value string specified (e.g., "key=").  This can be used
to override a configuration attribute that has a default value or default
list of values to set it back to an undefined value to disable
functionality.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-04-04 16:32:06 +02:00
Joseph Richard
3f3d51ebc8 cfgfile: fix parsing of long fields
When parsing a ini file with a "key = value" line that has both "key" and
"value" sized to the maximum allowed length causes a parsing failure.  The
internal "buffer" variable should be sized at least as large as the maximum
for both fields.  This commit updates the local array to be sized to hold
the max name, max value, " = ", and the nul terminator.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-04-04 16:32:06 +02:00
Allain Legacy
8eaff74f22 cfgfile: constrain string search
The call to memchr() uses the absolute length of the string buffer instead
of the actual length of the string returned by fgets().  This causes the
search to go beyond the '\n' character and find ';' characters in random
garbage on the stack.  This then causes the 'len' variable to be updated
and the subsequent search for the '=' character to potentially find one
beyond the first newline character.

Since this bug relies on ';' and '=' characters appearing in random places
in the 'buffer' variable it is intermittently reproducible at best.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-04-04 16:32:06 +02:00
Allain Legacy
f3b1a6981f cfgfile: support configurable comment character
The current cfgfile comment character is hardcoded to ';'.  This commit a
new API to allow the user to specify which comment character to use while
parsing the file.

This is to ease adoption by applications that have an existing
configuration file which may use a different comment character.  For
instance, an application may already have a configuration file that uses
the '#' as the comment character.

The approach of using a new API with an extensible parameters structure was
used rather than simply adding a new argument to the existing API to allow
for additional arguments to be introduced in the future.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-04-04 16:32:06 +02:00
Allain Legacy
1a5efe7499 cfgfile: support global properties section
The current implementation of the cfgfile library requires that all
key=value pairs be within [SECTION] definitions.  The ini file standard
allows for key=value pairs in an unnamed section.

   https://en.wikipedia.org/wiki/INI_file#Global_properties

This commit adds the capability of parsing key=value pairs from such an
unnamed section. The CFG_FLAG_GLOBAL_SECTION flag must be passed to the
rte_cfgfile_load() API to enable this functionality.  Any key=value pairs
found before the first section can be accessed in the section named
"GLOBAL".

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-04-04 16:32:06 +02:00
David Hunt
7c37da5a9a distributor: fix creation error checks
Coverity issue 143258: not freeing distributor instance
Coverity issue 143254: not checking return code from malloc
Fixes: 775003ad2f ("distributor: add new burst-capable library")

Signed-off-by: David Hunt <david.hunt@intel.com>
2017-04-04 14:58:49 +02:00
Jerin Jacob
9256eed78a eal/linux: fix build with glibc 2.25
glibc 2.25 is warning about if applications depend on
sys/types.h for makedev macro, it expects to be included
from <sys/sysmacros.h>

Found this error while testing with GCC 6.3.1 on archlinux.

lib/librte_eal/linuxapp/eal/eal_pci_uio.c: In function ‘pci_mknod_uio_dev’:
lib/librte_eal/linuxapp/eal/eal_pci_uio.c:134:13:
error: In the GNU C Library, "makedev" is defined
by <sys/sysmacros.h>. For historical compatibility, it is
currently defined by <sys/types.h> as well, but we plan to
remove this soon. To use "makedev", include <sys/sysmacros.h>
directly. If you did not intend to use a system-defined macro
"makedev", you should undefine it after including <sys/types.h>. [-Werror]
 dev = makedev(major, minor);
             ^~~~~~~~~~~~~~~~~

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-04-04 14:52:06 +02:00
Bruce Richardson
8415429e43 nic_uio: fix device binding at boot
When loading nic_uio from /boot/loader.conf as specified in the Getting
Started Guide doc, the NIC devices were not bound at boot. Unloading the
nic_uio driver and reloading it would cause them to be bound, however.

The root cause appears to be the fact that when the module is loaded at
boot, the call to find the pci device when parsing the b:d:f parameter
fails to return the device. That means that later on when the device
is probed as part of a PCI scan, no action is taken as it's not recorded
as a device to be used.

We fix this by having the b:d:f string parsed again on probe if the
initial check to see if it's an already-known device fails. In my tests,
this causes the NIC devices to be successfully bound at boot time, as
well as leaving things working as before in the case the module is loaded
post-boot.

Fixes: 764bf26873 ("add FreeBSD support")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-04-04 12:28:03 +02:00
Jianfeng Tan
dd18a2f0b2 vfio: fix secondary process start
When binding with vfio-pci, secondary process cannot be started with
an error message:

    cannot find TAILQ entry for PCI device.

It's due to: struct rte_pci_addr is padded with 1 byte for alignment
by compiler. Then below comparison in commit 2f4adfad0a
("vfio: add multiprocess support") will fail if the last byte is not
initialized.

    memcmp(&vfio_res->pci_addr, &dev->addr, sizeof(dev->addr)

And commit cdc242f260 ("eal/linux: support running as unprivileged user")
just triggers this bug by using a stack un-initialized variable.

The fix is to use rte_eal_compare_pci_addr() for pci addr comparison.

Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Fixes: cdc242f260 ("eal/linux: support running as unprivileged user")
Cc: stable@dpdk.org

Reported-by: Pawel Rutkowski <pawelx.rutkowski@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-04-04 11:58:57 +02:00
Anatoly Burakov
9fa5993f10 vfio: fix build
Some compilers require definition of vfio_iommu_spapr_tce_ddw_info
before its use in vfio_iommu_spapr_tce_info, so move tce_info
definition below tce_ddw_info.

Fixes: 468f42cc26 ("vfio: fix build on old kernel")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-04-03 20:00:23 +02:00
Ferruh Yigit
21be6fb6af igb_uio: fix build with kernel < 3.2
Recently added "dma_zalloc_coherent()" call is causing build error
for Linux kernels < 3.2.

compile error:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
  In function ‘igbuio_pci_probe’:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:434:2:
error: implicit declaration of function ‘dma_zalloc_coherent’
  [-Werror=implicit-function-declaration]
  map_addr = dma_zalloc_coherent(&dev->dev, 1024,
  ^

dma_zalloc_coherent() introduced with Linux kernel 3.2, with commit
Linux: 842fa69f3e0c ("include/linux/dma-mapping.h: add dma_zalloc_coherent()")
Since it does not exist for older kernels, causing a build error.

Switched to dma_alloc_coherent() API to prevent build error.

Fixes: d287e4d41b ("igb_uio: map dummy DMA forcing IOMMU domain attachment")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-03 19:49:58 +02:00
Shreyansh Jain
1263b426ff mempool: move stack handler as a driver
Moved from lib/librte_mempool, stack mempool handler is an independent
driver.
Shared builds would now require to link in librte_mempool_stack for
"stack" mempool handler.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-04-03 19:45:45 +02:00
Shreyansh Jain
9a8e9b57f5 mempool: move ring handler as a driver
Moved from lib/librte_mempool, ring mempool is now an independent
driver.
Shared builds would now need to add librte_mempool_ring for:
* ring_mp_mc
* ring_sp_sc
* ring_sp_mc
* ring_mp_sc

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-04-03 19:45:45 +02:00
Shreyansh Jain
44cebef721 mempool: fix crash when handler not found
In case the stack or ring mempool handler are compiled as shared
library and not linked in with test binary, segfault is reported.
This is because return value of rte_mempool_set_ops_byname is not
being checked in rte_mempool_ops_alloc.

This patch handles error returned from rte_mempool_set_ops_byname
when a mempool is not found.

Fixes: 449c49b93a ("mempool: support handler operations")

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-04-03 18:53:10 +02:00
Andriy Berestovskyy
240e992f74 mempool: fix typos
Signed-off-by: Andriy Berestovskyy <andriy.berestovskyy@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-04-03 18:53:10 +02:00
Gage Eads
8477f2f50d mempool: update non-EAL thread note
Commit 30e6399892 ("mempool: support non-EAL thread") added the
capability for non-EAL threads to use the mempool library. This commit
removes the note indicating that the mempool library cannot be used safely
by non-EAL threads, and replaces it with a more up-to-date note.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-04-03 18:53:10 +02:00
David Su
f0d1896fa1 igb_uio: use non-threaded ISR
This eliminates the overhead of a task switch when an interrupt arrives.

Signed-off-by: David Su <david.w.su@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-03-30 22:26:07 +02:00
Alejandro Lucero
d287e4d41b igb_uio: map dummy DMA forcing IOMMU domain attachment
For using a DPDK app when iommu is enabled, it requires to
add iommu=pt to the kernel command line. But using igb_uio driver
makes DMAR errors because the device has not an IOMMU domain.

Since kernel 3.15, iommu=pt requires to use the internal kernel
DMA API for attaching the device to the IOMMU 1:1 mapping, aka
si_domain. Previous versions did attach the device to that
domain when intel iommu notifier was called.

This is not a problem if the driver does later some call to the
DMA API because the mapping can be done then. But DPDK apps do
not use that DMA API at all.

Doing this dma map and unmap is harmless even when iommu is not
enabled at all.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-03-30 22:26:07 +02:00
Alejandro Lucero
94c0776b1b vfio: support hotplug
Current device hotplug is just supported by UIO managed devices.
This patch adds same functionality with VFIO.

It has been validated through tests using IOMMU and also with
VFIO and no-iommu mode.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-03-30 18:40:15 +02:00
Nikhil Rao
bb7927fd21 vfio: fix disabling INTx
The flags member of irq_set should be ORed with VFIO_IRQ_SET_ACTION_MASK
and not VFIO_IRQ_SET_ACTION_UNMASK. The bug was found by code inspection.

Fixes: 5c782b3928 ("vfio: interrupts")
Cc: stable@dpdk.org

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-03-30 16:59:49 +02:00
Anatoly Burakov
468f42cc26 vfio: fix build on old kernel
Fixing compile failures for kernels without sPAPR IOMMU support.

Fixes: 0fe9830b53 ("eal/ppc: support sPAPR IOMMU for vfio-pci")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-03-30 16:55:57 +02:00
Ferruh Yigit
d4d2380cbb kni: fix build with kernel 4.11
compile error:
.../build/build/lib/librte_eal/linuxapp/kni/kni_net.c:124:6:
error: implicit declaration of function ‘signal_pending’
[-Werror=implicit-function-declaration]
  if (signal_pending(current) || ret_val <= 0) {
      ^~~~~~~~~~~~~~

Linux 4.11 moves signal function declarations to its own header file:
Linux: 174cd4b1e5fb ("sched/headers: Prepare to move signal wakeup &
sigpending methods from <linux/sched.h> into <linux/sched/signal.h>")

Use new header file "linux/sched/signal.h" to fix the build error.

Cc: stable@dpdk.org

Reported-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Pankaj Gupta <pagupta@redhat.com>
2017-03-30 16:45:36 +02:00
Olivier Matz
93e32ea349 mk: fix dependencies to optional configs
In rte.lib.mk, the list of libraries passed to the link
command (LDLIBS) is generated from the DEPDIRS-xxx variables.
If a library is not compiled because it is disabled in
configuration, it should not appear in DEPDIRS-xxx.

- librte_port depends on librte_kni only if it is enabled.
- librte_table depends on librte_acl only if it is enabled.

Fixes: feb9f680cd ("mk: optimize directory dependencies")

Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-03-30 15:38:43 +02:00
Olivier Matz
b1b700ce7d ethdev: add descriptor status API
Introduce a new API to get the status of a descriptor.

For Rx, it is almost similar to rx_descriptor_done API, except it
differentiates "used" descriptors (which are hold by the driver and not
returned to the hardware).

For Tx, it is a new API.

The descriptor_done() API, and probably the rx_queue_count() API could
be replaced by this new API as soon as it is implemented on all PMDs.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2017-03-30 15:02:05 +02:00
Bruce Richardson
a6619414e0 ring: make struct and macros type agnostic
Modify the enqueue and dequeue macros to support copying any type of
object by passing in the exact object type. Rather than using the "ring"
structure member of rte_ring, which is of type "array of void *", instead
have the macros take the start of the ring a a pointer value, thereby
leaving the rte_ring structure as purely a header value. This allows it
to be reused by other future ring types which can add on extra fields if
they want, or even to have the actual ring elements, of whatever type
stored separate from the ring header.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:20 +02:00
Bruce Richardson
6a68df7f23 ring: create common function for updating tail index
Both producer and consumer use the same logic for updating the tail
index so merge into a single function.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:20 +02:00
Bruce Richardson
0dfc98c507 ring: separate out head index manipulation
We can write a single common function for head manipulation for enq
and a common one for deq, allowing us to have a single worker function
for enq and deq, rather than two of each. Update all other inline
functions to use the new functions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:20 +02:00
Bruce Richardson
3fe963a85a ring: reduce scope of local variables
The local variable i is only used for loop control so define it in
the enqueue and dequeue blocks directly, rather than at the function
level.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:20 +02:00
Bruce Richardson
ecaed092b6 ring: return remaining entry count when dequeuing
Add an extra parameter to the ring dequeue burst/bulk functions so that
those functions can optionally return the amount of remaining objs in the
ring. This information can be used by applications in a number of ways,
for instance, with single-consumer queues, it provides a max
dequeue size which is guaranteed to work.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:20 +02:00
Bruce Richardson
14fbffb0aa ring: return free space when enqueuing
Add an extra parameter to the ring enqueue burst/bulk functions so that
those functions can optionally return the amount of free space in the
ring. This information can be used by applications in a number of ways,
for instance, with single-producer queues, it provides a max
enqueue size which is guaranteed to work. It can also be used to
implement watermark functionality in apps, replacing the older
functionality with a more flexible version, which enables apps to
implement multiple watermark thresholds, rather than just one.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:32:04 +02:00
Bruce Richardson
cfa7c9e6fc ring: make bulk and burst return values consistent
The bulk fns for rings returns 0 for all elements enqueued and negative
for no space. Change that to make them consistent with the burst functions
in returning the number of elements enqueued/dequeued, i.e. 0 or N.
This change also allows the return value from enq/deq to be used directly
without a branch for error checking.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:37 +02:00
Bruce Richardson
77dd306427 ring: remove watermark support
Remove the watermark support. A future commit will add support for having
enqueue functions return the amount of free space in the ring, which will
allow applications to implement their own watermark checks, while also
being more useful to the app.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:34 +02:00
Bruce Richardson
82cb88375c ring: remove the yield when waiting for tail update
There was a compile time setting to enable a ring to yield when
it entered a loop in mp or mc rings waiting for the tail pointer update.
Build time settings are not recommended for enabling/disabling features,
and since this was off by default, remove it completely. If needed, a
runtime enabled equivalent can be used.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:29 +02:00
Bruce Richardson
8c82198978 ring: remove debug setting
The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove this as a
compile time option, and feature, simplifying the code.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:27 +02:00
Bruce Richardson
d1e138e1b0 ring: eliminate duplication of size and mask fields
The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of that into the top level
structure so they are not duplicated.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:23 +02:00
Bruce Richardson
8526571400 ring: create common structure for prod and cons metadata
create a common structure to hold the metadata for the producer and
the consumer, since both need essentially the same information - the
head and tail values, the ring size and mask.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:25:13 +02:00
Bruce Richardson
d9f0d3a1ff ring: remove split cacheline build setting
Users compiling DPDK should not need to know or care about the arrangement
of cachelines in the rte_ring structure.  Therefore just remove the build
option and set the structures to be always split. On platforms with 64B
cachelines, for improved performance use 128B rather than 64B alignment
since it stops the producer and consumer data being on adjacent cachelines.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-29 22:21:51 +02:00
David Hunt
d55362bd87 distributor: add symbol versioning
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
David Hunt
c0de0eb82e distributor: switch over to new API
This is the main switch over between the legacy API and the new
burst API. We rename all the functions in rte_distributor.c to remove
the _v1705, and we add in _v20 in the rte_distributor_v20.c

We also rename the rte_distributor_next.h as rte_distributor.h, as
this is now the public header.

At the same time, we need the autotests and sample app to compile
properly, hence those changes are in this patch also.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
David Hunt
6690f105dc distributor: add SIMD flow matching
Add an optimised version of the in-flight flow matching algorithm
using SIMD instructions. This should give up to 1.5x over the scalar
versions performance.

Falls back to scalar version if SSE4.2 not available

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
David Hunt
775003ad2f distributor: add new burst-capable library
This patch includes the code for new burst-capable distributor library.

It also includes the rte_distributor_next.h file which will
be used as the public header once we add in the symbol versioning
for v20 and v1705 APIs, at which stage we will rename it to
rte_distributor.h.

The new distributor code contains a very similar API to the legacy code,
but now sends bursts of up to 8 mbufs to each worker. Flow ID's are
reduced to 15 bits for an optimal flow matching algorithm.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
David Hunt
66ec3e8bd2 distributor: create private header file
We'll be adding internal implementation definitions in here
that are common to both burst and legacy APIs.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
David Hunt
73f08e03c9 distributor: rename legacy files
Move files out of the way so that we can replace with new
versions of the distributor library. Files are named in
such a way as to match the symbol versioning that we will
apply for backward ABI compatibility.

Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-29 16:46:57 +02:00
Bruce Richardson
6f6d2a66f8 eal/bsd: query the cpu count only once
Rather than querying the number of CPUs on the system multiple times, and
printing out the number each time, just query the value from sysctl once
and store it for future reuse.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 23:56:58 +02:00
Olivier Matz
feb9f680cd mk: optimize directory dependencies
Before this patch, the management of dependencies between directories
had several issues:

- the generation of .depdirs, done at configuration is slow: it can take
  more than one minute on some slow targets (usually ~10s on a standard
  PC without -j).

- for instance, it is possible to express a dependency like:
  - app/foo depends on lib/librte_foo
  - and lib/librte_foo depends on app/bar
  But this won't work because the directories are traversed with a
  depth-first algorithm, so we have to choose between doing 'app' before
  or after 'lib'.

- the script depdirs-rule.sh is too complex.

- we cannot use "make -d" for debug, because the output of make is used for
  the generation of .depdirs.

This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.

After this commit, "make config" is almost immediate.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-03-27 23:28:43 +02:00
Billy McFall
44a718c457 ethdev: add API to free consumed buffers in Tx ring
Add a new API to force free consumed buffers on Tx ring. API will return
the number of packets freed (0-n) or error code if feature not supported
(-ENOTSUP) or input invalid (-ENODEV).

Signed-off-by: Billy McFall <bmcfall@redhat.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-03-27 17:17:33 +02:00
Aaron Conole
ac71108d64 eal: add info about various init error codes
The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:59:53 +02:00
Aaron Conole
1908008f5d eal: do not panic on bus probe/scan failure
For now, exit the init.  It's likely that even aborting the initialization
is premature in this case, as it may be possible to proceed even if one
bus or another is not available.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:59:06 +02:00
Aaron Conole
e2c0413f2d eal: do not panic on vdev init failure
Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:58:28 +02:00
Aaron Conole
10f6c93cea eal: do not panic on PCI failures
Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail.  Still,
better to continue attempts at probes.

Since PCI isn't neccessarily required, it may be possible to simply log
the error and continue on letting the user check the logs and restart
the application when things have failed.

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.

For linux, it is safe to re-init the system here, so allow the
application to take corrective action and reinit.

For BSD, this is not the case, for other reasons, including hugepage
allocation has already happened, and needs to be properly uninitialized.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:58:00 +02:00
Aaron Conole
4fe1d33987 eal: do not panic if plugins fail to init
Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:57:13 +02:00
Aaron Conole
c050e5abae eal: do not panic on interrupt thread init
There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_error.

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a change to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:56:59 +02:00
Aaron Conole
330bed86d3 eal: do not panic on timer init failure
After code inspection, there is no way for eal_timer_init() to fail.  It
simply returns 0 in all cases.  As such, this test could either go-away
or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:55:49 +02:00
Aaron Conole
7d5c430f69 eal: do not panic on a number of conditions
When log initialization fails, it's generally because the fopencookie
failed.  While this is rare in practice, it could happen, and it is
likely because of memory pressure.  So, flag the error, and allow the
user to retry.

Memory init can only fail when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

There are some theoretical racy conditions in the system that _could_
cause early tailq init to fail;  however, no need to panic the
application.  While it can't continue using DPDK, it could make better
alerts to the user.

rte_eal_alarm_init() call uses the linux timerfd framework to create a
poll()-able timer using standard posix file operations.  This could fail
for a few reasons given in the man-pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:54:49 +02:00
Aaron Conole
8f113d9818 eal: set errno when exiting for already initialized
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:53:46 +02:00
Aaron Conole
ce3bede01e eal: do not panic on memzone init failure
When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:53:06 +02:00
Aaron Conole
a0222a4679 eal: do not panic on argument parsing error
It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
a better logging.  Exiting this early prevents any useful information
gathering from the application layer.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:52:08 +02:00
Aaron Conole
547a61af71 eal: do not panic on hugepage info init
When attempting to scan hugepages, signal to the eal that an error has
occurred, rather than performing a panic.

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:50:37 +02:00
Aaron Conole
37e97ad2c5 eal: do not panic when CPU is not supported
This adds a new API to check for the eal cpu versions.

It's now possible to gracefully exit the application, or for
applications which support non-dpdk datapaths working in concert with
DPDK datapaths, there no longer is the possibility of exiting for
unsupported CPUs.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:50:09 +02:00
Aaron Conole
647644e51f eal: do not panic on CPU detection
There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-27 15:47:10 +02:00
Ben Walker
24a5357968 pci: fix device registration on FreeBSD
The FreeBSD implementation wasn't registering new devices
with the device framework on start up. However, common
code attempts to unregister them on shutdown which causes
a SEGFAULT. This fix makes the FreeBSD code do the same
thing as the Linux code for registration.

Fixes: 13a1317d3b ("pci: create device list and fallback on its members")
Cc: stable@dpdk.org

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2017-03-27 12:07:53 +02:00
Vladyslav Buslov
d89a5bce1d lpm6: extend next hop field
This patch extend next_hop field from 8-bits to 21-bits in LPM library
for IPv6.

Added versioning symbols to functions and updated
library and applications that have a dependency on LPM library.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-15 18:49:41 +01:00
Matt Peters
b61befb48c igb_uio: support devices with only I/O BAR
Allow the BAR setup to succeed if a device has at least 1 BAR region
defined.  Previously, the device probe would only succeed if at least one
memory BAR existed, but there are devices that have only port I/O BARs.

For example, on Virtual Box a virtio device has only a single I/O BAR
because by default MSI-X is not enabled.  While in qemu/kvm the virtio
device has MSI-X enabled and therefore has both an I/O and Memory BAR.

The following are excerpts from "lspci -nnvvvv -s 00:09.0" on both types of
systems.

Virtual Box:

    Region 0: I/O ports at d260 [size=32]
    Capabilities: [80] #00 [0000]

QEMU/KVM:

    Region 0: I/O ports at c060 [size=32]
    Region 1: Memory at febd1000 (32-bit, non-prefetchable) [size=4K]
    Expansion ROM at feb80000 [disabled] [size=256K]
    Capabilities: [40] MSI-X: Enable+ Count=3 Masked-
            Vector table: BAR=1 offset=00000000
            PBA: BAR=1 offset=00000800

Signed-off-by: Matt Peters <matt.peters@windriver.com>
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-03-15 14:02:41 +01:00
Hemant Agrawal
5a11168d9b mbuf: use pktmbuf helper to create the pool
When possible, replace the uses of rte_mempool_create() with
the helper provided in librte_mbuf: rte_pktmbuf_pool_create().

This is the preferred way to create a mbuf pool.

This also updates the documentation.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-15 13:48:02 +01:00
Thomas Monjalon
31123211bd remove unmaintained TILE-Gx architecture
The TILE-Gx architecture and its driver mpipe are not maintained.
The code is removed to avoid confusion.

A last update has been done in 17.05 before removal.
It can be built with the updated toolchain:
	http://www.mellanox.com/repository/solutions/tile-scm/
and libgxio:
	http://www.mellanox.com/repository/solutions/tile-scm/libgxio-1.0.tar.xz

Quote from http://dpdk.org/ml/archives/dev/2017-February/057940.html
"
Mellanox agrees to remove TILE-Gx support from DPDK.org, but will continue
to support customers using DPDK.
Customer that needs support should contact Mellanox directly.
"

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-03-15 11:40:57 +01:00
Olivier Matz
0ef850c4f6 ethdev: move a queue id check to generic layer
The check of queue_id is done in all drivers implementing
rte_eth_rx_queue_count(). Factorize this check in the generic function.

Note that the nfp driver was doing the check differently, which could
induce crashes if the queue index was too big.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-09 19:29:51 +01:00
Olivier Matz
44e93f4a34 ethdev: clarify API comments of Rx queue count
The API comments are not consistent between each other.

The function rte_eth_rx_queue_count() returns the number of used
descriptors on a receive queue.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-03-09 19:27:40 +01:00
Gowrishankar Muthukrishnan
0fe9830b53 eal/ppc: support sPAPR IOMMU for vfio-pci
Below changes adds pci probing support for vfio-pci devices in power8.

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
2017-03-09 18:39:45 +01:00
Ben Walker
cdc242f260 eal/linux: support running as unprivileged user
For Linux kernel 4.0 and newer, the ability to obtain
physical page frame numbers for unprivileged users from
/proc/self/pagemap was removed. Instead, when an IOMMU
is present, simply choose our own DMA addresses instead.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-03-09 17:08:46 +01:00
Bruce Richardson
03437f2947 ring: add a function to return the ring size
Applications and other libraries should not be reading inside the
rte_ring structure directly to get the ring size. Instead add a fn
to allow it to be queried.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-03-08 16:05:19 +01:00
Jan Blunck
b2fba63690 eal: ensure constness of container_of target
This adds a check to ensure that the container_of() macro is not used to
cast away (remove) constness.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-08 14:04:29 +01:00
Jan Blunck
7cfd280578 eal: fix container_of macro for const members
This fixes the usage of structure members that are declared const to get
a pointer to the embedding parent structure.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-03-08 13:48:36 +01:00
Chris Metcalf
dd0eedb1cf tile: fix build
Re-enable CONFIG_RTE_LIBRTE_SCHED, since it is needed to build
correctly.

Fix a few warnings when compiling mpipe_tilegx.c.

Remove an empty rte_cpu_feature_table[] array using a bogus type.

Properly set RTE_OBJCOPY_{TARGET,ARCH} in mk/arch/tile/rte.vars.mk.

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
2017-02-27 16:44:32 +01:00
Chris Metcalf
f80468b680 eal/tile: avoid use of non-upstreamed header
It's trivial to directly invoke a read of the special-purpose
register that holds the clock cycle counter, so just do that.

Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
2017-02-27 16:44:23 +01:00
Olivier Matz
93092a5610 mempool: remove deprecated get and put functions
As announced in the deprecation notice, remove the functions for
single/multi producer/consumer enqueue/dequeue.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-02-21 12:05:46 +01:00
Olivier Matz
f3bc028909 mempool: remove deprecated count functions
As announced in the deprecation notice, remove these functions.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-02-21 12:05:46 +01:00
Thomas Monjalon
420195e6af log: remove old symbols from map
When removing log history functions, the map has not been updated.

Fixes: d7e61ad3ae ("log: remove deprecated history dump")

Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-02-21 11:43:45 +01:00
Ferruh Yigit
aa0d7c2d32 kni: remove KNI vhost support
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-02-21 11:43:07 +01:00
Thomas Monjalon
d450914ab8 version: 17.05-rc0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-02-17 12:17:39 +01:00
Thomas Monjalon
b9ebab26d9 version: 17.02.0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-02-14 22:17:45 +01:00
Pablo de Lara
b09efeb9d5 doc: add thread-safety information about EFD library
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2017-02-14 21:48:36 +01:00
Dmitriy Yakovlev
e7ee2ca1c9 cfgfile: fix uninitialized variable on load error
Uninitialized scalar variable.
Using uninitialized value cfg->sections[curr_section]->num_entries
when calling rte_cfgfile_close.
And memory in variables cfg->sections[curr_section],
sect->entries[curr_entry] maybe not equal NULL.
We must decrement counters curr_section, curr_entry when failed to realloc.

Fixes: eaafbad419 ("cfgfile: library to interpret config files")

Signed-off-by: Dmitriy Yakovlev <bombermag@gmail.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-02-14 18:13:48 +01:00
Qi Zhang
2eed820fd4 vfio: fix maximum number of interrupt for MSI-X
The max number of interrupt request is possible
be changed after rte_intr_callback_register, so
in get_max_intr, we need to check if necessary to
update the max_intr.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2017-02-13 22:25:04 +01:00
Andrew Rybchenko
549b3587f5 ethdev: fix typo in UDP tunnel API description
Fixes: 1cbe755fef ("ethdev: rename UDP tunnel port functions")

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2017-02-13 22:22:19 +01:00
Thomas Monjalon
47aa9d4e0d version: 17.02-rc3
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-02-10 17:15:32 +01:00
Slawomir Mrozowicz
b75a76d354 cryptodev: fix crash when querying device by name
This patch fixes a segmentation fault in function
rte_cryptodev_devices_get(), due to incorrect driver name path.
It reworks the function to use correct types and clean up
for visibility.

Coverity issue: 141067
Fixes: 38227c0e3a ("cryptodev: retrieve device info")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-02-10 15:57:29 +01:00
Jingjing Wu
64c1375b83 mbuf: fix bitmask of Tx offload flags
Add missed PKT_TX_MACSEC and PKT_TX_IEEE1588_TMST flags to bitmask of
all supported packet Tx offload features flags.

Fixes: 4fb7e803eb ("ethdev: add Tx preparation")

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-02-10 12:25:49 +01:00
Yong Wang
511a4c74b8 pci: fix UIO interrupt file descriptor check before close
The "dev->intr_handle.fd" is possibly a negative value while it is
passed as an argument to function "close". Fix the check to the fd.

Fixes: 5a60a7ffc8 ("pci: introduce functions to alloc and free uio resource")

Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
2017-02-10 14:23:27 +01:00
Alan Dewar
3b780b9e9e sched: fix crash when freeing port
Prevent a segmentation fault in rte_sched_port_free by only accessing
the port structure after the NULL pointer check has been made.

Fixes: 7b3c4f35 ("sched: fix releasing enqueued packets")
Cc: stable@dpdk.org

Signed-off-by: Alan Dewar <adewar@brocade.com>
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-02-09 18:46:52 +01:00
Patrick MacArthur
811b6b2506 vfio: fix file descriptor leak in multi-process
When a secondary process wants access to the VFIO container file
descriptor, the primary process calls vfio_get_container_fd() which
always opens an entirely new file descriptor on /dev/vfio/vfio.
However, once the file descriptor has been passed to the subprocess, it
is effectively duplicated, meaning that the copy of the file descriptor
in the primary process is no longer needed.  However, the primary
process does not close the duplicate fd, which results in a resource
leak.

This can be reproduced by starting a primary process with a small
RLIMIT_NOFILE limit configured to use VFIO for at least one device, and
repeatedly launching secondary processes until the file descriptor limit
is exceeded.

Fix the resource leak by closing the local vfio container file
descriptor after passing it to the secondary process.

Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Cc: stable@dpdk.org

Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-02-09 18:39:30 +01:00
Thomas Monjalon
5b243cbab2 version: 17.02-rc2
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-30 23:47:11 +01:00
Emmanuel Roullit
68759bbe73 vhost: remove unneeded variable assignment
Found with clang static analysis:
lib/librte_vhost/vhost_user.c:996:3: warning:
Value stored to 'ret' is never read
        ret = vhost_user_get_vring_base(dev, &msg.payload.state);
        ^     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-30 13:47:20 +01:00
Emmanuel Roullit
5c1f70daaf vhost: do not GSO when no header is present
Found with clang static analysis:
lib/librte_vhost/virtio_net.c:723:17: warning:
Access to field 'data_off' results in a dereference of a null pointer
(loaded from variable 'tcp_hdr')
        m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
                     ^~~~~~~~~~~~~~~~~

Fixes: d0cf91303d ("vhost: add Tx offload capabilities")
Cc: stable@dpdk.org

Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-30 13:46:57 +01:00
Yuanhan Liu
b8b992e93f vhost: fix long stall of negotiation
Setting up the mapping from GPA (guest physical address) to HPA (guest
physical address) could be very time consuming when the guest memory is
backened with small pages (4K). The bigger the guest memory, the longer
it takes. This could lead a very long vhost-user negotiation.

Since the mapping is only needed in zero copy mode so far, we could
avoid such time consuming settup when zero copy is turned off (which is
the default case).

It's actually a workaround, a right fix might be to start a new thread,
and hide the big latency there.

Fixes: e246896178 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-01-28 14:25:40 +01:00
Yuanhan Liu
cc7301908c vhost: fix dead loop in enqueue path
If a malicious guest forges a dead loop desc chain (let desc->next point
to itself) and desc->len is zero, this could lead to a dead loop in
copy_mbuf_to_desc(following is a simplified code to show this issue
clearly):

    while (mbuf_is_not_totally_consumed) {
        if (desc_avail == 0) {
            desc = &descs[desc->next];
            desc_avail = desc->len;
        }

        COPY(desc, mbuf, desc_avail);
    }

I have actually fixed a same issue before: commit a436f53ebf ("vhost:
avoid dead loop chain"); it fixes the dequeue path though, leaving the
enqueue path still vulnerable.

The fix is the same. Add a var nr_desc to avoid the dead loop.

Fixes: f1a519ad98 ("vhost: fix enqueue/dequeue to handle chained vring descriptors")
Cc: stable@dpdk.org

Reported-by: Xieming Katty <katty.xieming@huawei.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-01-28 14:25:23 +01:00
Slawomir Mrozowicz
38227c0e3a cryptodev: retrieve device info
This patch adds helper functions for new performance application which
provide identifiers and number of crypto device and
provide and check capabilities available for defined device and algorithm.
The performance application can be used to measure throughput and latency
of cryptography operation performed by crypto device.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-01-30 17:46:36 +01:00
Fan Zhang
547017d80a cryptodev: add scheduler PMD name and type
This patch adds the cryptodev scheduler PMD name and type identifier to
librte_cryptodev.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-01-30 17:23:33 +01:00
Hemant Agrawal
3b84878c4b cryptodev: decouple from PCI device
This makes struct rte_cryptodev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.

This is inline with the recent changes in ethdev

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2017-01-30 17:23:33 +01:00
Declan Doherty
b815872e70 cryptodev: uninline some functions
rte_cryptodev_pmd_get_dev, rte_cryptodev_pmd_get_named_dev,
rte_cryptodev_pmd_is_valid_dev were incorrectly marked as inline and
therefore not useable from crypto PMDs when built as shared
libraries as they accessed the global rte_cryptodev_globals device
structure.

Fixes: d11b0f30 ("cryptodev: introduce API and framework for crypto devices")

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
2017-01-30 17:23:33 +01:00
Andrew Rybchenko
370621b5a1 pdump: fix coverage warning
Fix GCC 4.8.2 20140120 (Red Hat 4.8.2-16) (RHEL 7.0) false warning
when build with EXTRA_CFLAGS='--coverage'.

Fixes: 278f945402 ("pdump: add new library for packet capture")

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2017-01-30 17:04:46 +01:00
Michał Mirosław
aad0c999b3 acl: fix flow data comments
Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-01-30 11:15:11 +01:00
Michał Mirosław
c6c7a8d7e6 acl: allow zero verdict
This enables ACL matches to return 0 where the distinction
from no-match case is not needed.

Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-01-30 11:08:47 +01:00
Olivier Matz
cb8fac62be efd: fix build by removing dependency to libmath
When we compile the dpdk with:
  CONFIG_RTE_LIBRTE_EFD=y
  CONFIG_RTE_LIBRTE_NFP_PMD=n
  CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD=n
  CONFIG_RTE_LIBRTE_SCHED=n
  CONFIG_RTE_LIBRTE_METER=n

The linker gives the following error:
  lib/librte_efd.a(rte_efd.o): In function `rte_efd_create':
  lib/librte_efd/rte_efd.c:560: undefined reference to `log2'
  collect2: error: ld returned 1 exit status

This is because the '-lm' is missing in mk/rte.app.mk.

An alternative, which is proposed by this patch, is to use the compiler
builtin rte_bsf32() to process log2 instead of the libmath log2() that
requires to include math.h and link with -lm.

Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-01-30 10:58:40 +01:00
Emmanuel Roullit
7eb8895d40 ethdev: remove useless pointer initialization
Found with clang static analysis:
lib/librte_ether/rte_ethdev.c:2467:22:
warning: Value stored to 'dev' during its initialization is never read
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
                    ^~~   ~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 88ac4396ad ("ethdev: add VMDq support")

Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
2017-01-30 10:36:18 +01:00
Steve Shin
9bdfc1e596 ethdev: fix MAC address replay
This patch fixes a bug in replaying MAC address to the hardware
in rte_eth_dev_config_restore() routine. Added default MAC replay as well.

Fixes: 4bdefaade6 ("ethdev: VMDQ enhancements")

Signed-off-by: Steve Shin <jonshin@cisco.com>
Reviewed-by: Igor Ryzhov <iryzhov@nfware.com>
2017-01-30 10:15:57 +01:00
Ilya V. Matveychikov
7c93195224 mbuf: remove redundant assignment when attaching
mi->next will be assigned to NULL few lines later, trivial patch

Fixes: ea672a8b16 ("mbuf: remove the rte_pktmbuf structure")

Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-30 10:08:54 +01:00
Olivier Matz
e09ff22d53 mempool: fix stack handler dequeue
The return value of the stack handler is wrong: it should be 0 on
success, not the number of objects dequeued.

This could lead to memory leaks depending on how the caller checks the
return value (ret < 0 or ret != 0). This was also breaking autotests
with debug enabled, because the debug cookies are only updated when the
function returns 0, so the cookies were not updated, leading to
an abort().

Fixes: 295a530b0844 ("mempool: add stack mempool handler")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-29 23:38:33 +01:00
Emmanuel Roullit
3cdfdf2a33 devargs: reset driver name pointer on parsing failure
The pointer set by strdup() needs to be cleared on failure to avoid a
potential double-free from the caller.

Found with clang static analysis:
lib/librte_eal/common/eal_common_devargs.c:123:2:
warning: Attempt to free released memory
        free(buf);
        ^~~~~~~~~

Fixes: 0fe11ec592 ("eal: add vdev init and uninit")

Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
2017-01-29 23:34:07 +01:00
Olivier Matz
ec7df18bb7 eal: fix warning about debug log at startup
The log "Debug logs available - lower performance" should
now only be displayed when dataplane debug logs are enabled.

The issue occurs only if the default log level (CONFIG_RTE_LOG_LEVEL) is
set to DEBUG in the configuration, which is not the case by default.

Fixes: 5d8f0baf69 ("log: do not drop debug logs at compile time")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-29 22:58:14 +01:00
Michał Mirosław
d613f57dd0 kni: guard against unterminated name oops
If the name is too long, it triggers BUG in alloc_netdev().

Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-29 22:50:28 +01:00
Michał Mirosław
4d0db6df1c kni: set interface name source as user-space
Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-29 22:47:30 +01:00
Ferruh Yigit
b2b0f85182 kni: add build option for ethtool support
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-29 22:36:26 +01:00
Yuanhan Liu
61207d014f ethdev: fix data reset when allocating port
Fix an silly error by auto-complete while managing the merge conflicts.
It's the eth_dev_data (but not eth_dev) entry should be memset.

Fixes: d948f596fe ("ethdev: fix port data mismatched in multiple process model")

Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-20 19:03:57 +01:00
Pablo de Lara
2ee926f1fd eal: fix FreeBSD build
rte_bus_scan() and rte_bus_probe() have been introduced
in eal.c, but it is missing the rte_bus.h header file,
for BSD systems.

Fixes: f44abbc12f ("bus: add scanning")
Fixes: c3cec1d807 ("bus: add probing")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-01-19 15:29:45 +01:00
Thomas Monjalon
6818a7f480 version: 17.02-rc1
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-19 05:54:41 +01:00
Shreyansh Jain
c3cec1d807 bus: add probing
Bus implementations can implement a probe handler to match the devices
scanned against the drivers registered.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-19 04:58:17 +01:00
Shreyansh Jain
f44abbc12f bus: add scanning
Scan for bus discovers the devices available on the bus and adds them
to a bus specific device list. Each bus mandatorily implements this
method.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-19 04:58:12 +01:00
Shreyansh Jain
a97725791e bus: introduce bus abstraction
This patch introduces the rte_bus abstraction for EAL.
The model is:
 - One or more devices are connected to a Bus
 - Drivers are running instances which manage one or more devices
 - Bus is responsible for identifying devices (and interrupt propogation)
 - Driver is responsible for initializing the device

This patch adds a 'rte_bus' base class which would be extended for
specific implementations. It also introduces Bus registration and
deregistration functions.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-19 04:57:18 +01:00
Zbigniew Bodek
c2fec02245 cryptodev: introduce ARM-specific feature flags
Add two new feature flags:
* RTE_CRYPTODEV_FF_CPU_NEON
  represents ARM NEON (TM) instructions
* RTE_CRYPTODEV_FF_CPU_ARM_CE
  represents ARM crypto extensions

Add them to both cryptodev library, documentation and relevant
PMD driver for ARMv8.

Signed-off-by: Zbigniew Bodek <zbigniew.bodek@caviumnetworks.com>
2017-01-19 01:00:55 +01:00
Zbigniew Bodek
169ca3db55 crypto/armv8: add PMD optimized for ARMv8 processors
This patch introduces crypto poll mode driver
using ARMv8 cryptographic extensions.
CPU compatibility with this driver is detected in
run-time and virtual crypto device will not be
created if CPU doesn't provide:
AES, SHA1, SHA2 and NEON.

This PMD is optimized to provide performance boost
for chained crypto operations processing,
such as encryption + HMAC generation,
decryption + HMAC validation. In particular,
cipher only or hash only operations are
not provided.

The driver currently supports AES-128-CBC
in combination with: SHA256 HMAC and SHA1 HMAC
and relies on the external armv8_crypto library:
https://github.com/caviumnetworks/armv8_crypto

Build ARMv8 crypto PMD if compiling for ARM64
and CONFIG_RTE_LIBRTE_PMD_ARMV8_CRYPTO option
is enable in the configuration file.
ARMV8_CRYPTO_LIB_PATH environment variable will
point to the appropriate library directory.

Signed-off-by: Zbigniew Bodek <zbigniew.bodek@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-19 01:00:55 +01:00
Fan Zhang
d803b4439d cryptodev: add user defined name for vdev
This patch adds a user defined name initializing parameter to cryptodev
library.

Originally, for software cryptodev PMD, the vdev name parameter is
treated as the driver identifier, and will create an unique name for each
device automatically, which is not necessarily as same as the vdev
parameter.

This patch allows the user to either create a unique name for his software
cryptodev, or by default, let the system creates a unique one. This should
help the user managing the created cryptodevs easily.

Examples:
CLI command fragment 1: --vdev "crypto_aesni_gcm_pmd"
The above command will result in creating a AESNI-GCM PMD with name of
"crypto_aesni_gcm_X", where postfix X is the number assigned by the system,
starting from 0. This fragment can be placed in the same CLI command
multiple times, resulting the postfixs incremented by one for each new
device.

CLI command fragment 2: --vdev "crypto_aesni_gcm_pmd,name=gcm1"
The above command will result in creating a AESNI-GCM PMD with name of
"gcm1". This fragment can be placed in the same CLI command multiple
times, as long as each having a unique name value.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2017-01-18 21:48:56 +01:00
Tomasz Kulasek
2d03ec6abd crypto: support scatter-gather in software drivers
This patch introduces RTE_CRYPTODEV_FF_MBUF_SCATTER_GATHER feature flag
informing that selected crypto device supports segmented mbufs natively
and doesn't need to be coalesced before crypto operation.

While using segmented buffers in crypto devices may have unpredictable
results, for PMDs which doesn't support it natively, additional check is
made for debug compilation.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2017-01-18 21:48:56 +01:00
Fan Zhang
c5b70818ad cryptodev: fix loop in device query
This patch fixes the dev value update problem in
rte_cryptodev_pmd_get_named_dev, orginally, dev won't be updated
after the initial step in the loop.

Fixes: d11b0f30df ("cryptodev: introduce API and framework for crypto devices")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2017-01-18 21:48:56 +01:00
Declan Doherty
84d7965866 crypto/aesni_mb: support AVX512
Release v0.44 of Intel(R) Multi-Buffer Crypto for IPsec library adds
support for AVX512 instructions. This patch enables the new AVX512
accelerated functions from the aesni_mb_pmd crypto poll mode driver.

This patch set requires that the aesni_mb_pmd is linked against the
version 0.44 or greater of the Multi-Buffer Crypto for IPsec library.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-01-18 21:48:56 +01:00
Arek Kusztal
d59626753c cryptodev: add DES CBC cipher algorithm
This commit adds DES CBC ciper algorithm to available algorithms

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
2017-01-18 21:45:15 +01:00
Jerin Jacob
53a3ba0c36 cryptodev: fix crash on null dereference
crypodev->data->name will be null when
rte_cryptodev_get_dev_id() invoked without a valid
crypto device instance.

Fixes: d11b0f30df ("cryptodev: introduce API and framework for crypto devices")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
2017-01-18 21:45:15 +01:00
Fiona Trahe
50d7f314de cryptodev: remove unused digest-appended feature
The cryptodev API had specified that if the digest address field was
left empty on an authentication operation, then the PMD would assume
the digest was appended to the source or destination data.
This case was not handled at all by most PMDs and incorrectly handled
by the QAT PMD.
As no bugs were raised, it is assumed to be not needed, so this patch
removes it, rather than add handling for the case on all PMDs.
The digest can still be appended to the data, but its
address must now be provided in the op.

Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: John Griffin <john.griffin@intel.com>
2017-01-18 21:45:15 +01:00
Pablo de Lara
86d8989688 efd: add AVX2 vector lookup function
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Christian Maciocco <christian.maciocco@intel.com>
2017-01-18 20:53:45 +01:00
Pablo de Lara
56b6ef874f efd: new Elastic Flow Distributor library
Elastic Flow Distributor (EFD) is a distributor library that uses
perfect hashing to determine a target/value for a given incoming flow key.
It has the following advantages:

- First, because it uses perfect hashing, it does not store
  the key itself and hence lookup performance is not dependent
  on the key size.

- Second, the target/value can be any arbitrary value hence
  the system designer and/or operator can better optimize service rates
  and inter-cluster network traffic locating.

- Third, since the storage requirement is much smaller than a hash-based
  flow table (i.e. better fit for CPU cache), EFD can scale to
  millions of flow keys.
  Finally, with current optimized library implementation performance
  is fully scalable with number of CPU cores.

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Christian Maciocco <christian.maciocco@intel.com>
2017-01-18 20:53:28 +01:00
Jerin Jacob
b15740bd43 eal/arm64: change barrier definitions to macros
Change rte_*wb definitions to macros in order to
keep consistent with other barrier definitions in
the file.

Suggested-by: Jianbo Liu <jianbo.liu@linaro.org>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 17:18:26 +01:00
Jerin Jacob
e783b81d44 eal/arm64: override I/O device read/write access
Override the generic I/O device memory read/write access and implement it
using armv8 instructions for arm64.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 17:18:26 +01:00
Jerin Jacob
a2c4cd8648 eal: let all architectures use generic I/O implementation
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 17:17:28 +01:00
Jerin Jacob
4540fbb2ae eal: add generic I/O device read/write implementation
This patch implements the generic version of rte_read[b/w/l/q]_[relaxed]
and rte_write[b/w/l/q]_[relaxed] using rte_io_wmb() and rte_io_rmb()

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 17:12:05 +01:00
Jerin Jacob
69736db1d8 eal: introduce I/O device memory read/write operations
This commit introduces 8-bit, 16-bit, 32bit, 64bit I/O device
memory read/write operations along with the relaxed versions.

The weakly-ordered machine like ARM needs additional I/O barrier for
device memory read/write access over PCI bus.
By introducing the eal abstraction for I/O device memory read/write access,
The drivers can access I/O device memory in architecture agnostic manner.

The relaxed version does not have additional I/O memory barrier, useful in
accessing the device registers of integrated controllers which
implicitly strongly ordered with respect to memory access.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
2cf953cfd8 eal/arm64: define I/O device memory barriers
CC: Jianbo Liu <jianbo.liu@linaro.org>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
67ce81bd3d eal/arm64: define SMP barrier
dmb instruction based barrier is used for smp version of memory barrier.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
84733fd0d7 eal/arm64: fix memory barrier definition
dsb instruction based barrier is used for non smp
version of memory barrier.

Fixes: d708f01b71 ("eal/arm: add atomic operations for ARMv8")
Cc: stable@dpdk.org

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-01-18 16:57:11 +01:00
Jerin Jacob
b41508b7a4 eal/armv7: define I/O device memory barriers
The patch does not provide any functional change for ARMv7.
I/O barriers are mapped to existing smp barriers.

CC: Jan Viktorin <viktorin@rehivetech.com>
CC: Jianbo Liu <jianbo.liu@linaro.org>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
38b636b7cc eal/arm: separate SMP barrier definition for ARMv7 and ARMv8
Separate the smp barrier definition for arm and arm64 for fine
control on smp barrier definition for each architecture.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
da07a35be5 eal/ppc64: define I/O device memory barriers
The patch does not provide any functional change for ppc_64.
I/O barriers are mapped to existing smp barriers.

CC: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
dff90714b1 eal/tile: define I/O device memory barriers
The patch does not provide any functional change for tile.
I/O barriers are mapped to existing smp barriers.

CC: Zhigang Lu <zlu@ezchip.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
e802521171 eal/x86: define I/O device memory barriers
The patch does not provide any functional change for IA.
I/O barriers are mapped to existing smp barriers.

CC: Bruce Richardson <bruce.richardson@intel.com>
CC: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Jerin Jacob
1ea155733e eal: introduce I/O device memory barriers
This commit introduce rte_io_mb(), rte_io_wmb() and rte_io_rmb(), in
order to enable memory barriers between I/O device and CPU.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-01-18 16:57:11 +01:00
Wei Zhao
99e7003831 net/ixgbe: parse L2 tunnel filter
check if the rule is a L2 tunnel rule, and get the L2 tunnel info.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
2017-01-17 19:41:43 +01:00
Michał Mirosław
d75547718c net/i40e: return errno when interrupt setup fails
Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Reviewed-by: Jingjing Wu <jingjing.wu@intel.com>
2017-01-17 19:41:42 +01:00
Bernard Iremonger
5e823a4512 ethdev: remove some VF functions
remove the following API's:

rte_eth_dev_set_vf_rxmode
rte_eth_dev_set_vf_rx
rte_eth_dev_set_vf_tx
rte_eth_dev_set_vf_vlan_filter
rte_eth_dev_set_vf_rate_limit

Increment LIBABIVER in Makefile
Remove deprecation notice for removing rte_eth_dev_set_vf_* API's.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
2017-01-17 19:40:50 +01:00
Qiming Yang
2191347120 ethdev: add firmware version get
This patch adds a new API 'rte_eth_dev_fw_version_get' for
fetching firmware version by a given device.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2017-01-17 22:34:35 +01:00
Zhihong Wang
f5472703c0 eal: optimize aligned memcpy on x86
This patch optimizes rte_memcpy for well aligned cases, where both
dst and src addr are aligned to maximum MOV width. It introduces a
dedicated function called rte_memcpy_aligned to handle the aligned
cases with simplified instruction stream. The existing rte_memcpy
is renamed as rte_memcpy_generic. The selection between them 2 is
done at the entry of rte_memcpy.

The existing rte_memcpy is for generic cases, it handles unaligned
copies and make store aligned, it even makes load aligned for micro
architectures like Ivy Bridge. However alignment handling comes at
a price: It adds extra load/store instructions, which can cause
complications sometime.

DPDK Vhost memcpy with Mergeable Rx Buffer feature as an example:
The copy is aligned, and remote, and there is header write along
which is also remote. In this case the memcpy instruction stream
should be simplified, to reduce extra load/store, therefore reduce
the probability of load/store buffer full caused pipeline stall, to
let the actual memcpy instructions be issued and let H/W prefetcher
goes to work as early as possible.

This patch is tested on Ivy Bridge, Haswell and Skylake, it provides
up to 20% gain for Virtio Vhost PVP traffic, with packet size ranging
from 64 to 1500 bytes.

The test can also be conducted without NIC, by setting loopback
traffic between Virtio and Vhost. For example, modify the macro
TXONLY_DEF_PACKET_LEN to the requested packet size in testpmd.h,
rebuild and start testpmd in both host and guest, then "start" on
one side and "start tx_first 32" on the other.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
2017-01-17 16:40:05 +01:00
Yuanhan Liu
d948f596fe ethdev: fix port data mismatched in multiple process model
Assume we have two virtio ports, 00:03.0 and 00:04.0. The first one is
managed by the kernel driver, while the later one is managed by DPDK.

Now we start the primary process. 00:03.0 will be skipped by DPDK virtio
PMD driver (since it's being used by the kernel). 00:04.0 would be
successfully initiated by DPDK virtio PMD (if nothing abnormal happens).
After that, we would get a port id 0, and all the related info needed
by virtio (virtio_hw) is stored at rte_eth_dev_data[0].

Then we start the secondary process. As usual, 00:03.0 will be firstly
probed. It firstly tries to get a local eth_dev structure for it (by
rte_eth_dev_allocate):

        port_id = rte_eth_dev_find_free_port();
        ...

        eth_dev = &rte_eth_devices[port_id];
        eth_dev->data = &rte_eth_dev_data[port_id];
        ...

        return eth_dev;

Since it's a first PCI device, port_id will be 0. eth_dev->data would
then point to rte_eth_dev_data[0]. And here things start going wrong,
as rte_eth_dev_data[0] actually stores the virtio_hw for 00:04.0.

That said, in the secondary process, DPDK will continue to drive PCI
device 00.03.0 (despite the fact it's been managed by kernel), with
the info from PCI device 00:04.0. Which is wrong.

The fix is to attach the port already registered by the primary process.
That is, iterate the rte_eth_dev_data[], and get the port id who's PCI
ID matches the current PCI device.

This would let us maintain same port ID for the same PCI device, keeping
the chance of referencing to wrong data minimal.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-17 09:20:18 +01:00
Jan Wickbom
59317cef24 vhost: allow many vhost-user ports
Currently select() is used to monitor file descriptors for vhostuser
ports. This limits the number of ports possible to create since the
fd number is used as index in the fd_set and we have seen fds > 1023.
This patch changes select() to poll(). This way we can keep an
packed (pollfd) array for the fds, e.g. as many fds as the size of
the array.

Also see:
http://dpdk.org/ml/archives/dev/2016-April/037024.html

Reported-by: Patrik Andersson <patrik.r.andersson@ericsson.com>
Signed-off-by: Jan Wickbom <jan.wickbom@ericsson.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-17 09:20:18 +01:00
Maxime Coquelin
73c8f9f69c vhost: introduce reply ack feature
REPLY_ACK features provide a generic way for QEMU to ensure both
completion and success of a request.

As described in vhost-user spec in QEMU repository, QEMU sets
VHOST_USER_NEED_REPLY flag (bit 3) when expecting a reply_ack from
the backend. Backend must reply with 0 for success or non-zero
otherwise when flag is set.

Currently, only VHOST_USER_SET_MEM_TABLE request implements reply_ack,
in order to synchronize mapping updates.

This patch enables REPLY_ACK feature generally, but only checks error
code for VHOST_USER_SET_MEM_TABLE.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-17 09:20:18 +01:00
Yong Wang
5731d6acc7 vhost: fix memory leak
In function vhost_new_device(), current code dose not free 'dev'
in "i == MAX_VHOST_DEVICE" condition statements. It will lead to a
memory leak.

Fixes: 45ca9c6f7b ("vhost: get rid of linked list for devices")
Cc: stable@dpdk.org

Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-17 09:20:17 +01:00
Haifeng Lin
8c33fc10f6 vhost: fix guest/host physical address mapping
When reg_size < page_size the function read in
rte_mem_virt2phy would not return, because
host_user_addr is invalid.

Fixes: e246896178 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org

Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2017-01-17 09:20:17 +01:00
Tomasz Kulasek
1feda4d8fc mbuf: add a function to linearize a packet
This patch adds function rte_pktmbuf_linearize to let crypto PMD coalesce
chained mbuf before crypto operation and extend their capabilities to
support segmented mbufs when device cannot handle them natively.

Included unit tests for rte_pktmbuf_linearize functionality:

 1) Creates banch of segmented mbufs with different size and number of
    segments.
 2) Fills noncontigouos mbuf with sequential values.
 3) Uses rte_pktmbuf_linearize to coalesce segmented buffer into one
    contiguous.
 4) Verifies data in linearized buffer.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-15 19:30:00 +01:00
Tiwei Bie
375008544b ethdev: add MACsec capability flags
If these flags are advertised by a PMD, the NIC supports the MACsec
offload. The incoming MACsec traffics can be offloaded transparently
after the MACsec offload is configured correctly by the application.
And the application can set the PKT_TX_MACSEC flag in mbufs to enable
the MACsec offload for the packets to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
2017-01-15 19:16:19 +01:00
Tiwei Bie
8609a7a86c ethdev: add MACsec event type
This commit adds a below event type:

- RTE_ETH_EVENT_MACSEC

This event will occur when the PN counter in a MACsec connection
reaches the exhaustion threshold.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
2017-01-15 19:16:03 +01:00
Tiwei Bie
223d629f8c mbuf: add MACsec flag
Add a new Tx flag in mbuf, that can be set by applications to
enable the MACsec offload for a packet to be transmitted.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-15 19:15:51 +01:00
Bruce Richardson
0615355ffa kvargs: make pointers in string arrays const
Change the parameters of functions from const char *valid[] to
const char * const valid[]. This additional const is needed to
allow us to fix some checkpatch warnings, as well as being good
programming practice.

For the checkpatch warnings, if we have a set of command line
args that we want to check defined as:
	static const char *args[] = { "arg1", "arg2", NULL };
	kvlist = rte_kvargs_parse(params, args);

checkpatch will complain:
	WARNING:STATIC_CONST_CHAR_ARRAY: static const char *
	array should probably be static const char * const

Adding the additional const to the definition of the args
will then trigger a compiler error in the absence of this
change to the kvargs library, as we lose the const in the
call to kvargs_parse.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-13 19:28:26 +01:00
Wenfeng Liu
454a0a7009 mempool: use cache in single producer or consumer mode
Currently we will check mempool flags when we put/get objects from
mempool. However, this makes cache useless when mempool is SC|SP,
SC|MP, MC|SP cases.
This patch makes cache available in above cases and improves performance.

Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2017-01-13 16:38:09 +01:00
Ben Walker
b0d3e3f73b pci: use address struct in function arguments
Instead of passing domain, bus, devid, func, just pass
an rte_pci_addr.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2017-01-12 15:55:16 +01:00
Ben Walker
22dda618c0 pci: separate detaching ethernet ports from PCI devices
Attaching and detaching ethernet ports from an application
is not the same thing as physically removing a PCI device,
so clarify the flags indicating support. All PCI devices
are assumed to be physically removable, so no flag is
necessary in the PCI layer.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-12 15:48:54 +01:00
Ben Walker
e84ad157b7 pci: unmap resources if probe fails
If resources were mapped prior to probe, unmap them
if probe fails.

This does not handle the case where the kernel driver was
forcibly unbound prior to probe.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2017-01-12 15:47:45 +01:00
Adrien Mazarguil
6de5c0f130 ethdev: define default item masks in flow API
Leaving default pattern item mask values up for interpretation by PMDs is
an undefined behavior that applications might find difficult to use in the
wild. It also needlessly complicates PMD implementation.

This commit addresses this by defining consistent default masks for each
item type.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-01-11 16:54:52 +01:00
Adrien Mazarguil
f4e7c8bc84 ethdev: clarify RSS action in flow API
Contrary to the current description, mbuf RSS hash result storage does not
overlap with the returned MARK value (hash.fdir.lo vs. hash.fdir.hi), and
both may be combined.

Reflect this change by allowing testpmd to display both values
simultaneously.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-01-11 16:54:47 +01:00
Adrien Mazarguil
ebaa064e2c ethdev: clarify MARK and FLAG actions in flow API
Both actions share the PKT_RX_FDIR mbuf flag, as a result there is no way
to tell them apart. Moreover, the maximum allowed value for the MARK action
may not necessarily cover the entire 32-bit space.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-01-11 16:54:22 +01:00
Adrien Mazarguil
a757583e55 ethdev: modify flow API error function
Based on initial PMD implementations of the flow API, returning the error
structure which may be NULL is useless and always discarded.

Returning the error code instead appears to be much more convenient.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-01-11 16:53:07 +01:00
Nelio Laranjeiro
86c743cf91 eal: define generic vector types
Add common vector type definitions to all CPU architectures.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
2017-01-04 21:50:19 +01:00
Thomas Monjalon
c6dab2a873 tools: move to usertools
Rename tools/ into usertools/ to differentiate from buildtools/
and devtools/ while making clear these scripts are part of
DPDK runtime.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-04 21:17:32 +01:00
Tomasz Kulasek
4fb7e803eb ethdev: add Tx preparation
Added API for `rte_eth_tx_prepare`

uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
	struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

Added fields to the `struct rte_eth_desc_lim`:

	uint16_t nb_seg_max;
		/**< Max number of segments per whole packet. */

	uint16_t nb_mtu_seg_max;
		/**< Max number of segments per one MTU */

These fields can be used to create valid packets according to the
following rules:

 * For non-TSO packet, a single transmit packet may span up to
   "nb_mtu_seg_max" buffers.

 * For TSO packet the total number of data descriptors is "nb_seg_max",
   and each segment within the TSO may span up to "nb_mtu_seg_max".

Added functions:

int
rte_validate_tx_offload(struct rte_mbuf *m)

  to validate general requirements for tx offload set in mbuf of packet
  such a flag completness. In current implementation this function is
  called optionaly when RTE_LIBRTE_ETHDEV_DEBUG is enabled.

int rte_net_intel_cksum_prepare(struct rte_mbuf *m)

  to prepare pseudo header checksum for TSO and non-TSO tcp/udp packets
  before hardware tx checksum offload.
   - for non-TSO tcp/udp packets full pseudo-header checksum is
     counted and set.
   - for TSO the IP payload length is not included.

int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)

  this function uses same logic as rte_net_intel_cksum_prepare, but
  allows application to choose which offloads should be taken into
  account, if full preparation is not required.

PERFORMANCE TESTS
-----------------

This feature was tested with modified csum engine from test-pmd.

The packet checksum preparation was moved from application to Tx
preparation step placed before burst.

We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)

We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.

IMPACT:

1) For unimplemented Tx preparation callback the performance impact is
   negligible,
2) For packet condition check without checksum modifications (nb_segs,
   available offloads, etc.) is 14626628/14252168 (~2.62% drop),
3) Full support in ixgbe driver (point 2 + packet checksum
   initialization) is 14060924/13588094 (~3.48% drop)

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2017-01-04 20:40:15 +01:00
Olivier Matz
6d52d1d4af ethdev: clarify extended statistics documentation
Reword the API documentation of ethdev xstats.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2017-01-04 19:08:21 +01:00
Olivier Matz
513c78ae3f ethdev: fix extended statistics name index
The function rte_eth_xstats_get() return an array of tuples (id,
value). The value is the statistic counter, while the id references a
name in the array returned by rte_eth_xstats_get_name().

Today, each 'id' returned by rte_eth_xstats_get() is equal to the index
in the returned array, making this value useless. It also prevents a
driver from having different indexes for names and value, like in the
example below:

  rte_eth_xstats_get_name() returns:
    0: "rx0_stat"
    1: "rx1_stat"
    2: ...
    7: "rx7_stat"
    8: "tx0_stat"
    9: "tx1_stat"
    ...
    15: "tx7_stat"

  rte_eth_xstats_get() returns:
    0: id=0, val=<stat>    ("rx0_stat")
    1: id=1, val=<stat>    ("rx1_stat")
    2: id=8, val=<stat>    ("tx0_stat")
    3: id=9, val=<stat>    ("tx1_stat")

This patch fixes the drivers to set the 'id' in their ethdev->xstats_get()
(except e1000 which was already doing it), and fixes ethdev by not setting
the 'id' field to the index of the table for pmd-specific stats: instead,
they should just be shifted by the max number of generic statistics.

Fixes: bd6aa172cf ("ethdev: fetch extended statistics with integer ids")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2017-01-04 19:04:30 +01:00
Jan Blunck
eac901ce29 ethdev: decouple from PCI device
This makes struct rte_eth_dev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-12-25 23:30:19 +01:00
Jan Blunck
ae34410a8a ethdev: move info filling of PCI into drivers
Only the drivers itself can decide if it could fill PCI information fields
of dev_info.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-12-25 23:25:42 +01:00
Jan Blunck
0e1b45a284 ethdev: decouple interrupt handling from PCI device
The struct rte_intr_handle is an abstraction layer for different types of
interrupt mechanisms. It is embedded in the low-level device (e.g. PCI).
On allocation of a struct rte_eth_dev a reference to the intr_handle
should be stored for devices supporting interrupts.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-12-25 23:25:14 +01:00
Stephen Hemminger
71ff78a7f7 eal: make driver pointer const in device struct
The info in rte_device about driver is immutable and
shouldn't change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-12-24 18:46:40 +01:00
Jan Blunck
bb30369dc1 eal: allow passing const interrupt handle
Both register/unregister and enable/disable don't necessarily require the
rte_intr_handle to be modifiable. Therefore lets constify it.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2016-12-24 18:45:50 +01:00
Jan Blunck
9b815f2141 eal: define container_of macro
This macro is based on Jan Viktorin's original patch but also checks the
type of the passed pointer against the type of the member.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
[jblunck@infradead.org: add type checking and __extension__]
Signed-off-by: Jan Blunck <jblunck@infradead.org>
2016-12-24 18:45:01 +01:00
Adrien Mazarguil
1ad44ac13c cmdline: add alignment constraint
This prevents sigbus errors on architectures that cannot handle unexpected
unaligned accesses to the output buffer.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
2016-12-23 10:19:11 +01:00
Adrien Mazarguil
4fffc05a2b cmdline: support dynamic tokens
Considering tokens must be hard-coded in a list part of the instruction
structure, context-dependent tokens cannot be expressed.

This commit adds support for building dynamic token lists through a
user-provided function, which is called when the static token list is empty
(a single NULL entry).

Because no structures are modified (existing fields are reused), this
commit has no impact on the current ABI.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
2016-12-23 10:18:25 +01:00
Adrien Mazarguil
b1a4b4cbc0 ethdev: introduce generic flow API
This new API supersedes all the legacy filter types described in
rte_eth_ctrl.h. It is slightly higher level and as a result relies more on
PMDs to process and validate flow rules.

Benefits:

- A unified API is easier to program for, applications do not have to be
  written for a specific filter type which may or may not be supported by
  the underlying device.

- The behavior of a flow rule is the same regardless of the underlying
  device, applications do not need to be aware of hardware quirks.

- Extensible by design, API/ABI breakage should rarely occur if at all.

- Documentation is self-standing, no need to look up elsewhere.

Existing filter types will be deprecated and removed in the near future.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
2016-12-23 10:11:07 +01:00
Ferruh Yigit
15925897e7 ethdev: cleanup device ops struct whitespace
- Grouped related items using empty lines
- Aligned arguments to same column
- All item comments that doesn't fit same line are placed blow the item
  itself
- Moved some comments to same line if overall line < 100 chars

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-12-22 15:59:05 +01:00
Ferruh Yigit
a4435629a9 ethdev: remove invalid function from version map
Fixes: 9d41beed24 ("lib: provide initial versioning")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-12-22 15:48:13 +01:00
Jan Blunck
5c51e40445 ethdev: add internal reset function
This is a helper for DPDK internal users to force a reconfiguration of a
device.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2016-12-21 18:47:53 +01:00
Jan Blunck
c3d3ba9892 ethdev: free queue array after releasing all queues
If all queues are released lets also free up the dev->data->rx/tx_queues
to be able to properly reinitialize.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2016-12-21 18:40:09 +01:00
Jan Blunck
d00d7cc883 ethdev: release queue before setting up
If a queue has been setup before lets release it before we setup.
Otherwise we might leak resources.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
2016-12-21 18:27:58 +01:00
Jan Blunck
75aca7997e ethdev: initialize more fields on allocation
This moves the non-PCI related initialization of the link state interrupt
callback list and the setting of the default MTU to rte_eth_dev_allocate()
so that drivers only need to set non-default values.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-12-21 17:32:17 +01:00
Jan Blunck
7f95f78a8a ethdev: clear data when allocating device
Lets clear the eth_dev->data when allocating a new rte_eth_dev so that
drivers only need to set non-zero values.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-12-21 17:30:27 +01:00
Stephen Hemminger
3030127eb2 pci: remove unused unbind support
No device driver sets the unbind flag in current public code base.
Therefore it is good time to remove the unused dead code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2016-12-21 16:13:46 +01:00
Jerin Jacob
f4ce209a8c eal: postpone vdev initialization
Some platform like octeontx may use pci and
vdev based combined device to represent a logical
dpdk functional device.In such case, postponing the
vdev initialization after pci device
initialization will provide the better view of
the pci device resources in the system in
vdev's probe function, and it allows better
functional subsystem registration in vdev probe
function.

As a bonus, This patch fixes a bond device
initialization use case.

example command to reproduce the issue:
./testpmd -c 0x2  --vdev 'eth_bond0,mode=0,
slave=0000:02:00.0,slave=0000:03:00.0' --
--port-topology=chained

root cause:
In existing case(vdev initialization and then pci
initialization), creates three Ethernet ports with
following port ids
0 - Bond device
1 - PCI device 0
2 - PCI devive 1

Since testpmd, calls the configure/start on all the ports on
start up,it will translate to following illegal setup sequence

1)bond device configure/start
1.1) pci device0 stop/configure/start
1.2) pci device1 stop/configure/start
2)pci device 0 configure(illegal setup case,
as device in start state)

The fix changes the initialization sequence and
allow initialization in following valid setup order
1) pcie device 0 configure/start
2) pcie device 1 configure/start
3) bond device 2 configure/start
3.1) pcie device 0/stop/configure/start
3.2) pcie device 1/stop/configure/start

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-12-21 15:41:17 +01:00
Jianfeng Tan
2eba8d21f3 eal: restrict cores auto detection
This patch uses pthread_getaffinity_np() to narrow down used
cores when none of below options is specified:
  * coremask (-c)
  * corelist (-l)
  * and coremap (--lcores)

The purpose of this patch is to leave out these core related options
when DPDK applications are deployed under container env, so that
users do not need decide the core related parameters when developing
applications. Instead, when applications are deployed in containers,
use cpu-set to constrain which cores can be used inside this container
instance. And DPDK application inside containers just rely on this
auto detect mechanism to start polling threads.

Note: previously, some users are using isolated CPUs, which could
be excluded by default. Please add commands like taskset to use
those cores.

Test example:
$ taskset 0xc0000 ./examples/helloworld/build/helloworld -m 1024

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-12-21 15:31:16 +01:00
Aaron Conole
12b9b2ee70 eal: clarify the argc and argv documentation
It's been a source of confusion in the past, and even with this update
may continue to be a source of confusion.  However, the original
language seems to imply that the DPDK EAL will take ownership of the
array passed in.  Loosening the language up a bit might give a better
understanding for what is actually happening.

Signed-off-by: Aaron Conole <aconole@redhat.com>
2016-12-21 15:25:29 +01:00
Olivier Matz
0880c40113 drivers: advertise kmod dependencies in pmdinfo
Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.

Today, most PCI drivers require uio/vfio.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-12-20 18:26:00 +01:00
Anatoly Burakov
c431384c8f vdev: fix detaching with alias
Fixes: d63eed6b2d ("eal: add driver name alias")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-12-07 21:14:43 +01:00
Anatoly Burakov
f9ae888b1e ethdev: fix port lookup if none
Aside from avoiding doing useless work, this also fixes a segfault
when calling rte_eth_dev_get_port_by_name() whenever no devices
were found yet, and therefore rte_eth_dev_data wasn't yet allocated.

Fixes: 9c5b8d8b9f ("ethdev: clean port id retrieval when attaching")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-12-07 19:04:41 +01:00
Robin Jarry
51c0c01b70 kni: avoid using lsb_release script
The lsb_release script is part of an optional package which is not
always installed. On the other hand, /etc/lsb-release is always present
even on minimal Ubuntu installations.

    root@ubuntu1604:~# dpkg -S /etc/lsb-release
    base-files: /etc/lsb-release

Read the file if present and use the variables defined in it.

Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-12-07 18:36:05 +01:00
Thomas Monjalon
65df6d069d config: remove insecure warnings
There was an option CONFIG_RTE_INSECURE_FUNCTION_WARNING (disabled by
default), which prevents from using some libc functions:
sprintf, snprintf, vsnprintf, strcpy, strncpy, strcat, strncat, sscanf,
strtok, strsep and strlen.

It's all about using them at the right place with the right precautions.
However, it is neither really possible nor a good advice to disable them.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-12-07 18:34:02 +01:00
Reshma Pattan
6d059e9980 doc: add pdump library to API doxygen
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-12-06 15:43:13 +01:00
Yong Wang
247bde5231 doc: fix typos in code comments
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-12-06 15:25:01 +01:00
Olivier Matz
8ee94d8aee mempool: fix API documentation
A previous commit changed the local_cache table into a
pointer, reducing the size of the rte_mempool structure.

Fix the API comment of rte_mempool_create() related to
this modification.

Fixes: 213af31e09 ("mempool: reduce structure size if no cache needed")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-12-06 15:18:51 +01:00
Wei Zhao
580db2a5ce mempool: remove a redundant word in comment
There is a redundant repetition word "for" in comment line of the
file rte_mempool.h after the definition of RTE_MEMPOOL_OPS_NAMESIZE.
The word "for" appears twice in line 359 and 360. One of them is
redundant, so delete it.

Fixes: 449c49b93a ("mempool: support handler operations")

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-12-06 15:18:51 +01:00
Wei Zhao
e8453b9f3c mempool: remove redundant socket id assignment
There is a redundant repetition mempool socket_id assignment in the
file rte_mempool.c in function rte_mempool_create_empty. The
statement "mp->socket_id = socket_id;"appear twice in line 821
and 824. One of them is redundant, so delete it.

Fixes: 85226f9c52 ("mempool: introduce a function to create an empty pool")

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-12-06 15:18:51 +01:00
Bert van Leeuwen
e0ccf33800 ethdev: check maximum number of queues for statistics
Arrays inside rte_eth_stats have size=RTE_ETHDEV_QUEUE_STAT_CNTRS.
Some devices report more queues than that and this code blindly uses
the reported number of queues by the device to fill those arrays up.
This patch fixes the problem using MIN between the reported number of
queues and RTE_ETHDEV_QUEUE_STAT_CNTRS.

Fixes: ce757f5c9a ("ethdev: new method to retrieve extended statistics")

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
2016-12-06 14:18:26 +01:00
Wei Dai
f1d54e6dfb pci: fix check of mknod
In function pci_mknod_uio_dev() in lib/librte_eal/eal/eal_pci_uio.c,
The return value of mknod() is ret, not f got by fopen().
So the value of ret should be checked for mknod().

Fixes: f7f97c1604 ("pci: add option --create-uio-dev to run without hotplug")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-12-06 11:59:39 +01:00
Olivier Matz
5d8f0baf69 log: do not drop debug logs at compile time
Today, all logs whose level is lower than INFO are dropped at
compile-time. This prevents from enabling debug logs at runtime using
--log-level=8.

The rationale was to remove debug logs from the data path at
compile-time, avoiding a test at run-time.

This patch changes the behavior of RTE_LOG() to avoid the compile-time
optimization, and introduces the RTE_LOG_DP() macro that has the same
behavior than the previous RTE_LOG(), for the rare cases where debug
logs are in the data path.

So it is now possible to enable debug logs at run-time by just
specifying --log-level=8. Some drivers still have special compile-time
options to enable more debug log. Maintainers may consider to
remove/reduce them.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-12-01 18:09:13 +01:00
Thomas Monjalon
3549dd6715 version: 17.02-rc0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-11-14 15:16:46 +01:00
Thomas Monjalon
d3bfeaaabf version: 16.11.0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-11-13 15:28:12 +01:00
Reshma Pattan
c9c7758dde pdump: fix log messages
The ethdev Rx/Tx remove callback apis doesn't set rte_errno during
failures, instead they just return negative error number, so using
that number in logs instead of rte_errno upon Rx and Tx callback
removal failures.

Fixes: 278f9454 ("pdump: add new library for packet capture")

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-11-12 22:27:09 +01:00
Nipun Gupta
f5e9ed5c4e mempool: fix leak if populate fails
This patch fixes the issue of memzone not being freed incase the
rte_mempool_populate_phys fails in the rte_mempool_populate_default

This issue was identified when testing with OVS ~2.6
- configure the system with low memory (e.g. < 500 MB)
- add bridge and dpdk interfaces
- delete brigde
- keep on repeating the above sequence.

Fixes: d1d914ebbc ("mempool: allocate in several memory chunks by default")

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-11-12 22:27:09 +01:00
Thomas Monjalon
a02a9135dc version: 16.11-rc3
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-11-07 22:13:27 +01:00
Alain Leon
8035bdde78 doc: fix typos
Fixes typos present in the documentation and code comments.

Signed-off-by: Alain Leon <xerebz@gmail.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-11-07 21:50:27 +01:00
Wei Dai
083e2f2600 ethdev: comment statistics field support
Add comments to describe that not all statistics fields in
struct rte_eth_stats are supported by any type of network
interface card. If any statistics field is not supported,
its value is 0.

Signed-off-by: Wei Dai <wei.dai@intel.com>
2016-11-07 21:33:38 +01:00
Wenzhuo Lu
91cd905af9 ip_frag: fix IP reassembly regression
After changing pkt[0] to pkt[], the example IP reassembly is not working.
It's weird because this change is fine. There should be no difference
between them.
As a workaround, revert this change.

Fixes: 347a1e037f ("lib: use C99 syntax for zero-size arrays")

Reported-by: Huilong Xu <huilongx.xu@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2016-11-07 21:27:50 +01:00
Yuanhan Liu
164a601b52 vhost: remove references to vhost-cuse
vhost-cuse is removed, update corresponding comments that are still
referencing it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-11-07 15:59:21 +01:00
Igor Ryzhov
25c62ca4eb pci: fix probing error if no driver found
The rte_eal_pci_probe_one function could return false positive result if
no driver is found for the device.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-11-07 14:50:47 +01:00
David Marchand
d82eeefe7a pci: unset driver if probe fails
dev->driver should be set only if a driver did take the device.

Signed-off-by: David Marchand <david.marchand@6wind.com>
2016-11-07 14:50:47 +01:00
Fiona Trahe
59ab9a97c8 cryptodev: clarify how operations affect buffers
Updated comments on API to clarify which parts of mbufs are
copied or changed by crypto operations.

Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-11-07 00:22:58 +01:00
Wei Dai
c7e9d6a6ed lpm: fix freeing memory
The memory pointed by lpm->rules_tbl should also be freed
when memory malloc for tbl8 fails in rte_lpm_create_v1604( ).
And the memory pointed by lpm->tbl8 should also be freed
when the lpm object is freed in rte_lpm_free_v1604( ).

Fixes: f1f7261838 ("lpm: add a new config structure for IPv4")

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Wei Dai <wei.dai@intel.com>
2016-11-06 23:46:03 +01:00
Dmitriy Yakovlev
bf39fb4c5b cfgfile: fix API comments
Fixed style and return values in Doxygen comments.

Fixes: eaafbad419 ("cfgfile: library to interpret config files")

Signed-off-by: Dmitriy Yakovlev <bombermag@gmail.com>
2016-11-06 23:34:40 +01:00
Ferruh Yigit
20dbd33a36 net: remove mempool dependency
Fixes: b25c2a8c69 ("net: introduce net library")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-11-06 23:25:20 +01:00
Ben Walker
d9dd34bc8a pci: probe only devices without any driver
If the user asks to probe multiple times, the probe
callback should only be called on devices that don't have
a driver already loaded.

This is useful if a driver is registered after the
execution of a program has started and the list of devices
needs to be re-scanned.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2016-11-06 22:58:08 +01:00
Jianbo Liu
545f16daf6 eal/ppc: fix file descriptor leak when getting CPU features
Close the file descriptor after finish using it.

Fixes: 9ae15538 ("eal/ppc: cpu flag checks for IBM Power")

Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
2016-11-06 22:42:06 +01:00
Jianbo Liu
a1ed637873 eal/arm: fix file descriptor leak when getting CPU features
Close the file descriptor after finish using it.

Fixes: b94e5c94 ("eal/arm: add CPU flags for ARMv7")
Fixes: 97523f82 ("eal/arm: add CPU flags for ARMv8")

Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
2016-11-06 22:41:51 +01:00
Thomas Monjalon
e65b5069fd ethdev: rename library for consistency
The library was named libethdev without rte_ prefix.
It is now fixed, the library namespace is consistent.

Note: the ABI version has already been changed in this release cycle.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-11-06 20:53:23 +01:00
Shreyansh Jain
6ba1affa54 lib: fix ABI version after device model rework
rte_device/driver generalization patches [1] were merged without a change
in the LIBABIVER variable. This patches bumps the macro of affected libs:

- libcryptodev and libetherdev have been bumped
- librte_eal version changed in
  d7e61ad3ae ("log: remove deprecated history dump")

Details of ABI/API changes:
- EAL [version already bumped in: d7e61ad3ae]
  |- type field was removed from rte_driver
  |- rte_pci_device now embeds rte_device
  |- rte_pci_resource renamed to rte_mem_resource
  |- numa_node and devargs of rte_pci_driver is moved to rte_driver
  |- APIs for device hotplug (attach/detach) moved into EAL
  |- API rte_eal_pci_device_name added for PCI device naming
  |- vdev registration API introduced (rte_eal_vdrv_register,
  |  rte_eal_vdrv_unregister

- librte_crypto (v 1=>2)
  |- removed rte_cryptodev_create_unique_device_name API
  |- moved device naming to EAL

- librte_ethdev (v 4=>5)
  |- rte_eth_dev_type is removed
  |- removed dev_type from rte_eth_dev_allocate API
  |- removed API rte_eth_dev_get_device_type
  |- removed API rte_eth_dev_get_addr_by_port
  |- removed API rte_eth_dev_get_port_by_addr
  |- removed rte_cryptodev_create_unique_device_name API
  |- moved device naming to EAL

Also, deprecation notice from 16.07 has been removed and release notes for
16.11 added.

[1] http://dpdk.org/ml/archives/dev/2016-September/047087.html

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-11-06 20:53:23 +01:00
Thomas Monjalon
ca41215c05 version: 16.11-rc2
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-10-26 23:51:42 +02:00
Reshma Pattan
bb900072ff pdump: revert PCI device name conversion
Earlier ethdev library created the device names in the
"bus:device.func" format hence pdump library implemented
its own conversion method for changing the user passed
device name format "domain🚌device.func" to "bus:device.func"
for finding the port id using device name using ethdev library
calls. Now after ethdev and eal rework
http://dpdk.org/dev/patchwork/patch/15855/,
the device names are created in the format "domain🚌device.func",
so pdump library conversion is not needed any more, hence removed
the corresponding code.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
2016-10-26 21:54:38 +02:00
Slawomir Mrozowicz
8a9867a635 crypto/openssl: rename libcrypto to openssl
This patch replaces name "libcrypto" to "openssl" from file directories,
symbol prefixes and sub-names connected with old name.
Renamed poll mode driver files, test files, and documentations.
It is done to better name association with library because
the cryptography operations are using Openssl library crypto API.

Fixes: d61f70b4c9 ("crypto/libcrypto: add driver for OpenSSL library")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
2016-10-26 14:58:37 +02:00
Maxime Coquelin
2a51b1091c vhost: support indirect descriptor in non-mergeable Rx
Linux virtio-net kernel driver uses indirect descriptors when
mergeable buffers are not used.

This patch adds its support, fixing the use of indirect
descriptors with these guests.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-26 13:39:09 +02:00
Maxime Coquelin
1be4ebb1c4 vhost: support indirect descriptor in mergeable Rx
Windows virtio-net driver uses indirect descriptors with
mergeable buffers.

This patch adds its support, fixing the use of indirect
descriptors with these guests.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-26 13:39:09 +02:00
Yuanhan Liu
54524e0391 vhost: fix use after free
Fix the coverity USE_AFTER_FREE issue.

Coverity issue: 137884
Fixes: a277c71598 ("vhost: refactor code structure")

Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-26 13:39:09 +02:00
Yuanhan Liu
f2f5bc8f30 vhost: retrieve available head once
There is no need to retrieve the latest avail head every time we enqueue
a packet in the mereable Rx path by

    avail_idx = *((volatile uint16_t *)&vq->avail->idx);

Instead, we could just retrieve it once at the beginning of the enqueue
path. This could diminish the cache penalty slightly, because the virtio
driver could be updating it while vhost is reading it (for each packet).

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Yuanhan Liu
45847f015d vhost: prefetch available ring
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Zhihong Wang
f689586bc0 vhost: shadow used ring update
The basic idea is to shadow the used ring update: update them into a
local buffer first, and then flush them all to the virtio used vring
at once in the end.

And since we do avail ring reservation before enqueuing data, we would
know which and how many descs will be used. Which means we could update
the shadow used ring at the reservation time. It also introduce another
slight advantage: we don't need access the desc->flag any more inside
copy_mbuf_to_desc_mergeable().

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Yuanhan Liu
fcdbe1fe1a vhost: use last available index for ring reservation
shadow_used_ring will be introduced later. Since then last avail
idx will not be updated together with last used idx.

So, here we use last_avail_idx for avail ring reservation.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Yuanhan Liu
3f9e48f7da vhost: simplify mergeable Rx vring reservation
Let it return "num_buffers" we reserved, so that we could re-use it
with copy_mbuf_to_desc_mergeable() directly, instead of calculating
it again there.

Meanwhile, the return type of copy_mbuf_to_desc_mergeable is changed
to "int". -1 will be return on error.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Zhihong Wang
e7ef688562 vhost: optimize cache access
This patch reorders the code to delay virtio header write to improve
cache access efficiency for cases where the mrg_rxbuf feature is turned
on. CPU pipeline stall cycles can be significantly reduced.

Virtio header write and mbuf data copy are all remote store operations
which takes a long time to finish. It's a good idea to put them together
to remove bubbles in between, to let as many remote store instructions
as possible go into store buffer at the same time to hide latency, and
to let the H/W prefetcher goes to work as early as possible.

On a Haswell machine, about 100 cycles can be saved per packet by this
patch alone. Taking 64B packets traffic for example, this means about 60%
efficiency improvement for the enqueue operation.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Zhihong Wang
b24fb48886 vhost: remove useless volatile
last_used_idx is a local var, there is no need to decorate it
by "volatile".

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Jianbo Liu <jianbo.liu@linaro.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2016-10-26 13:39:09 +02:00
Maxime Coquelin
c843af3aa1 vhost: access header only if offloading is supported
If offloading features are not negotiated, parsing the virtio header
is not needed.

Micro-benchmark with testpmd shows that the gain is +4% with indirect
descriptors, +1% when using direct descriptors.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-26 13:39:09 +02:00
E. Scott Daniels
dbffbf80e9 ethdev: prevent duplicate event callback
This change prevents the attempt to add a structure which is
already on the callback list. If a struct with matching
parameters is found on the list, then no action is taken.

Fixes: ac2f69c ("ethdev: fix crash if malloc of user callback fails")

Signed-off-by: E. Scott Daniels <daniels@research.att.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2016-10-25 23:30:23 +02:00
Wei Dai
1e975578bc mempool: fix search of maximum contiguous pages
paddr[i] + pg_sz always points to the start physical address of the
2nd page after pddr[i], so only up to 2 pages can be combinded to
be used. With this revision, more than 2 pages can be used.

Fixes: 84121f1971 ("mempool: store memory chunks in a list")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-10-25 23:18:47 +02:00
Jan Blunck
d63eed6b2d eal: add driver name alias
This adds infrastructure for drivers to allow being requested by an alias
so that a renamed driver can still get loaded by its legacy name.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-10-25 18:49:18 +02:00
Ferruh Yigit
6445198f80 kni: fix build with kernel 4.9
compile error:
  CC [M]  .../lib/librte_eal/linuxapp/kni/igb_main.o
.../lib/librte_eal/linuxapp/kni/igb_main.c:2317:21:
error: initialization from incompatible pointer type
	[-Werror=incompatible-pointer-types]
  .ndo_set_vf_vlan = igb_ndo_set_vf_vlan,
                     ^~~~~~~~~~~~~~~~~~~

Linux kernel 4.9 updates API for ndo_set_vf_vlan:
Linux: 79aab093a0b5 ("net: Update API for VF vlan protocol 802.1ad support")

Use new API for Linux kernels >= 4.9

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-10-25 16:20:43 +02:00
Ferruh Yigit
1735110509 kni: fix build with kernel < 3.1
compile error:
  CC [M]  .../lib/librte_eal/linuxapp/kni/kni_misc.o
cc1: warnings being treated as errors
.../lib/librte_eal/linuxapp/kni/kni_misc.c: In function ‘kni_exit_net’:
.../lib/librte_eal/linuxapp/kni/kni_misc.c:113:18:
error: unused variable ‘knet’

For kernel versions < v3.1 mutex_destroy() is a macro and does nothing,
this cause an unused variable warning for knet which used in the
mutex_destroy()

mutex_destroy() converted into static inline function with commit:
Linux: 4582c0a4866e ("mutex: Make mutex_destroy() an inline function")

To fix the warning unused attribute added to the knet variable.

Fixes: 93a298b34e ("kni: support core id parameter in single threaded mode")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-14 22:21:28 +02:00
Thomas Monjalon
fed622dfd9 version: 16.11-rc1
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-10-14 02:34:11 +02:00
Bernard Iremonger
c1ceaf3ad0 ethdev: add an argument to internal callback function
add cb_arg parameter to the _rte_eth_dev_callback_process function.

Adding a parameter to this function allows passing information
to the application when an eth device event occurs such as
a VF to PF message.
This allows the application to decide if a particular function
is permitted.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Alex Zelezniak <alexz@att.com>
2016-10-14 02:01:52 +02:00
Shreyansh Jain
01f1922786 drivers: rename register macro prefix
All macros related to driver registeration renamed from DRIVER_*
to RTE_PMD_*

This includes:

 DRIVER_REGISTER_PCI -> RTE_PMD_REGISTER_PCI
 DRIVER_REGISTER_PCI_TABLE -> RTE_PMD_REGISTER_PCI_TABLE
 DRIVER_REGISTER_VDEV -> RTE_PMD_REGISTER_VDEV
 DRIVER_REGISTER_PARAM_STRING -> RTE_PMD_REGISTER_PARAM_STRING
 DRIVER_EXPORT_* -> RTE_PMD_EXPORT_*

Fix PMDINFOGEN tool to look for matches of RTE_PMD_REGISTER_*.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-10-14 01:49:32 +02:00
Ferruh Yigit
473c4957a4 kni: move kernel version ifdefs to compat header
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:12:26 +02:00
Ferruh Yigit
dd6d3570c8 kni: prefer uint32_t to unsigned int
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:12:11 +02:00
Ferruh Yigit
e4728a1cfa kni: update log messages
Remove some function entrance logs and changed log level of some logs.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:12:00 +02:00
Ferruh Yigit
f13fbc033a kni: remove compile time debug configuration
Since switched to kernel dynamic debugging it is possible to remove
compile time debug log configuration.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:11:26 +02:00
Ferruh Yigit
dec1ffcae7 kni: move functions to eliminate declarations
Function implementations kept same.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:11:12 +02:00
Ferruh Yigit
d2d7d6fc5f kni: remove unnecessary messages for out of memory
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:09:12 +02:00
Ferruh Yigit
dd3e4e36d4 kni: update kernel logging
Switch to dynamic logging functions. Depending kernel configuration this
may cause previously visible logs disappear.

How to enable dynamic logging:
https://www.kernel.org/doc/Documentation/dynamic-debug-howto.txt

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:09:02 +02:00
Ferruh Yigit
05788ff054 kni: prefer ether_addr_copy to memcpy
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:08:44 +02:00
Ferruh Yigit
e479813505 kni: prefer min_t to min
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:08:40 +02:00
Ferruh Yigit
88b7a2a7bd kni: enclose macros with complex values in parens
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:08:14 +02:00
Ferruh Yigit
59b36980e4 kni: do not use assignment in if condition
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:08:04 +02:00
Ferruh Yigit
50e25e4049 kni: move trailing statement on next line
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:07:16 +02:00
Ferruh Yigit
fa34b39dfd kni: move comparison constants on the right
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:42 +02:00
Ferruh Yigit
e227435ec0 kni: remove useless return
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:38 +02:00
Ferruh Yigit
0861751c93 kni: prefer unsigned int to unsigned
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:06:30 +02:00
Ferruh Yigit
1b9190aff3 kni: fix spacing and line lenghts
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:31 +02:00
Ferruh Yigit
9afcc5bc74 kni: make static struct const
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:28 +02:00
Ferruh Yigit
fbd71b623f kni: uninitialize global variables
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 23:05:20 +02:00
Ferruh Yigit
f60c4df6bb kni: move externs to the header file
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 22:49:41 +02:00
Vladyslav Buslov
93a298b34e kni: support core id parameter in single threaded mode
Allow binding KNI thread to specific core in single threaded mode
by setting core_id and force_bind config parameters.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-10-13 22:24:45 +02:00
Wei Dai
f05b0fbe7d lpm: remove redundant check when adding rule
When a rule with depth > 24 is added into an existing
rule with depth <=24, a new tbl8 is allocated, the existing
rule first fulfill whole new tbl8, so the filed valid of
each entry in this tbl8 is always true and depth of each
entry is always <= 24 before adding the new rule with depth > 24.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-10-13 22:17:00 +02:00
Wei Dai
69ed52dddc lpm: fix freeing unused sub-table on rule delete
When all rules with depth > 24 are deleted in a same sub-table
(tlb8 group) and only a rule with depth <=24 is left in it,
this sub-table (tlb8 group) should be recycled.

Fixes: dc81ebbaca ("lpm: extend IPv4 next hop field")
Fixes: af75078fec ("first public release")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-10-13 22:13:19 +02:00
John Ousterhout
844bd77c03 log: respect logger configured before EAL init
Before this patch, application-specific loggers could not be
installed before rte_eal_init completed (the initialization process
called rte_openlog_stream, overwriting any previously installed
logger). This made it impossible for an application to capture the
initial log messages generated during rte_eal_init. This patch changes
initialization so that information from a previous call to
rte_openlog_stream is not lost. Specifically:
* The default log stream is now maintained separately from an
  application-specific log stream installed with rte_openlog_stream.
* rte_eal_common_log_init has been renamed to eal_log_set_default,
  since this is all it does. It no longer invokes rte_openlog_stream; it
  just updates the default stream. Also, this method now returns void,
  rather than int, since there are no errors.

This patch also removes the "early log" mechanism and cleans up the
log initialization mechanism:
* The default log stream defaults to stderr on all platforms if
  eal_log_set_default hasn't been invoked (Linux used to use stdout
  during the first part of initialization).
* Removed rte_eal_log_early_init; all of the desired functionality can
  be achieved by calling eal_log_set_default.
* Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one
  function, rte_eal_log_init, which is not needed or invoked for BSD.
* Removed declaration for eal_default_log_stream in rte_log.h (it's now
  private to eal_common_log.c).
* Moved call to rte_eal_log_init earlier in rte_eal_init for Linux, so
  that it starts using the preferrred log ASAP.

Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>
2016-10-13 22:13:18 +02:00