numam-dpdk

Author	SHA1	Message	Date
Anatoly Burakov	5b61c62cfd	fbarray: add internal tailq for mapped areas Currently, there are numerous reliability issues with fbarray, such as: - There is no way to prevent attaching to overlapping memory areas - There is no way to prevent double-detach - Failed destroy leaves fbarray in an invalid state (fbarray itself is valid, but its backing memory area is already detached) In addition, on FreeBSD, doing mmap() on a file descriptor does not keep the lock, so we also need to store the fd in order to keep the lock. This patch improves upon fbarray to address both of these issues by adding an internal tailq to track allocated areas and their respective file descriptors. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:28:50 +01:00
Nikhil Rao	db9f4430c2	service: fix parameter type for attribute The type of value parameter to rte_service_attr_get should be uint64_t *, since the attributes are of type uint64_t. Fixes: `4d55194d76` ("service: add attribute get function") Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Rami Rosen <ramirose@gmail.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2019-03-28 21:07:48 +01:00
Ruifeng Wang	90fefe78bf	hash: optimize signature compare for Arm NEON Implemented signature compare function based on neon intrinsic. Hash bulk lookup had 3% - 6% performance gain after optimization. Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-03-28 19:54:21 +01:00
Joyce Kong	ca49b92079	ticketlock: enable generic ticketlock on all arch Let all architectures use generic ticketlock implementation. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 15:00:11 +01:00
Joyce Kong	184104fc61	ticketlock: introduce fair ticket based locking The spinlock implementation is unfair: some threads may take locks aggressively while leaving other threads starving for a long time. This patch introduces ticketlock, which gives each waiting thread a ticket so that threads take the lock one by one, first come, first served. This avoids long starvation and is more predictable. Suggested-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 14:58:49 +01:00
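The ticket idea in the entry above can be sketched in a few lines of C. This is a minimal illustration assuming GCC/Clang `__atomic` builtins; the type and function names (`sketch_ticketlock_*`) are hypothetical and are not the DPDK API.

```c
#include <stdint.h>

/* Hypothetical type for illustration only; not DPDK's ticketlock. */
typedef struct {
    uint16_t next;    /* next ticket to hand out */
    uint16_t current; /* ticket that currently owns the lock */
} sketch_ticketlock_t;

static inline void sketch_ticketlock_init(sketch_ticketlock_t *tl)
{
    tl->next = 0;
    tl->current = 0;
}

static inline void sketch_ticketlock_lock(sketch_ticketlock_t *tl)
{
    /* Take a ticket. Relaxed is enough here: ordering for the critical
     * section comes from the acquire load in the wait loop. */
    uint16_t me = __atomic_fetch_add(&tl->next, 1, __ATOMIC_RELAXED);

    /* Wait until our ticket is served: first come, first served. */
    while (__atomic_load_n(&tl->current, __ATOMIC_ACQUIRE) != me)
        ;
}

static inline void sketch_ticketlock_unlock(sketch_ticketlock_t *tl)
{
    /* Only the lock holder writes `current`, so a relaxed read is safe. */
    uint16_t cur = __atomic_load_n(&tl->current, __ATOMIC_RELAXED);

    /* Release publishes the critical section to the next ticket holder. */
    __atomic_store_n(&tl->current, (uint16_t)(cur + 1), __ATOMIC_RELEASE);
}
```

Because every waiter spins on its own ticket number, the lock is handed over in arrival order, which is the fairness property the commit message describes.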
Joyce Kong	e8af2f1f11	rwlock: reimplement with atomic builtins The __sync builtin based implementation generates full memory barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way barriers. Here is the assembly code of __sync_compare_and_swap builtin. __sync_bool_compare_and_swap(dst, exp, src); 0x000000000090f1b0 <+16>: e0 07 40 f9 ldr x0, [sp, #8] 0x000000000090f1b4 <+20>: e1 0f 40 79 ldrh w1, [sp, #6] 0x000000000090f1b8 <+24>: e2 0b 40 79 ldrh w2, [sp, #4] 0x000000000090f1bc <+28>: 21 3c 00 12 and w1, w1, #0xffff 0x000000000090f1c0 <+32>: 03 7c 5f 48 ldxrh w3, [x0] 0x000000000090f1c4 <+36>: 7f 00 01 6b cmp w3, w1 0x000000000090f1c8 <+40>: 61 00 00 54 b.ne 0x90f1d4 <rte_atomic16_cmpset+52> // b.any 0x000000000090f1cc <+44>: 02 fc 04 48 stlxrh w4, w2, [x0] 0x000000000090f1d0 <+48>: 84 ff ff 35 cbnz w4, 0x90f1c0 <rte_atomic16_cmpset+32> 0x000000000090f1d4 <+52>: bf 3b 03 d5 dmb ish 0x000000000090f1d8 <+56>: e0 17 9f 1a cset w0, eq // eq = none Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Joyce Kong <joyce.kong@arm.com> Tested-by: Joyce Kong <joyce.kong@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 11:47:05 +01:00
Gavin Hu	453d8f7366	spinlock: reimplement with atomic one-way barrier The __sync builtin based implementation generates full memory barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way barriers. Here is the assembly code of __sync_compare_and_swap builtin. __sync_bool_compare_and_swap(dst, exp, src); 0x000000000090f1b0 <+16>: e0 07 40 f9 ldr x0, [sp, #8] 0x000000000090f1b4 <+20>: e1 0f 40 79 ldrh w1, [sp, #6] 0x000000000090f1b8 <+24>: e2 0b 40 79 ldrh w2, [sp, #4] 0x000000000090f1bc <+28>: 21 3c 00 12 and w1, w1, #0xffff 0x000000000090f1c0 <+32>: 03 7c 5f 48 ldxrh w3, [x0] 0x000000000090f1c4 <+36>: 7f 00 01 6b cmp w3, w1 0x000000000090f1c8 <+40>: 61 00 00 54 b.ne 0x90f1d4 <rte_atomic16_cmpset+52> // b.any 0x000000000090f1cc <+44>: 02 fc 04 48 stlxrh w4, w2, [x0] 0x000000000090f1d0 <+48>: 84 ff ff 35 cbnz w4, 0x90f1c0 <rte_atomic16_cmpset+32> 0x000000000090f1d4 <+52>: bf 3b 03 d5 dmb ish 0x000000000090f1d8 <+56>: e0 17 9f 1a cset w0, eq // eq = none The benchmarking results showed constant improvements on all available platforms: 1. Cavium ThunderX2: 126% performance; 2. Hisilicon 1616: 30%; 3. Qualcomm Falkor: 13%; 4. Marvell ARMADA 8040 with A72 cores on macchiatobin: 3.7% Here is the example test result on TX2: $sudo ./build/app/test -l 16-27 -- i RTE>>spinlock_autotest * spinlock_autotest without this patch * Test with lock on 12 cores... Core [16] Cost Time = 53886 us Core [17] Cost Time = 53605 us Core [18] Cost Time = 53163 us Core [19] Cost Time = 49419 us Core [20] Cost Time = 34317 us Core [21] Cost Time = 53408 us Core [22] Cost Time = 53970 us Core [23] Cost Time = 53930 us Core [24] Cost Time = 53283 us Core [25] Cost Time = 51504 us Core [26] Cost Time = 50718 us Core [27] Cost Time = 51730 us Total Cost Time = 612933 us * spinlock_autotest with this patch * Test with lock on 12 cores... 
Core [16] Cost Time = 18808 us Core [17] Cost Time = 29497 us Core [18] Cost Time = 29132 us Core [19] Cost Time = 26150 us Core [20] Cost Time = 21892 us Core [21] Cost Time = 24377 us Core [22] Cost Time = 27211 us Core [23] Cost Time = 11070 us Core [24] Cost Time = 29802 us Core [25] Cost Time = 15793 us Core [26] Cost Time = 7474 us Core [27] Cost Time = 29550 us Total Cost Time = 270756 us In the tests on ThunderX2, with more cores contending, the performance gain was even higher, indicating the __atomic implementation scales up better than __sync. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 09:19:39 +01:00
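The one-way-barrier approach described in the rwlock and spinlock entries above can be sketched as follows. This is an illustration assuming GCC/Clang `__atomic` builtins, with hypothetical names (`sketch_spinlock_*`); it is not a copy of the DPDK implementation.

```c
/* Hypothetical spinlock for illustration only. */
typedef struct {
    int locked; /* 0 = free, 1 = taken */
} sketch_spinlock_t;

static inline int sketch_spinlock_trylock(sketch_spinlock_t *sl)
{
    int expected = 0;

    /* Acquire on success, relaxed on failure: a one-way barrier instead
     * of the full barrier that the __sync builtins imply. */
    return __atomic_compare_exchange_n(&sl->locked, &expected, 1, 0,
                                       __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
}

static inline void sketch_spinlock_lock(sketch_spinlock_t *sl)
{
    while (!sketch_spinlock_trylock(sl)) {
        /* Spin on plain loads until the lock looks free, then retry. */
        while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED))
            ;
    }
}

static inline void sketch_spinlock_unlock(sketch_spinlock_t *sl)
{
    /* Release makes the critical section visible to the next owner. */
    __atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE);
}
```

On Arm the acquire/release pair lets the compiler emit load-acquire and store-release instructions rather than a trailing `dmb ish`, which is the effect the commit messages attribute to the change.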
Gavin Hu	85cffb2ecc	ring: enforce reading tail before slots In weak memory models, like arm64, reading prod.tail may get reordered after reading the ring slots, which corrupts the ring and causes stale data to be observed. This issue was reported by NXP on an 8-A72 DPAA2 board. The problem is most likely caused by missing acquire semantics when reading prod.tail (in SC dequeue), which makes it possible to read a stale value from the ring slots. For the MP (and MC) case, rte_atomic32_cmpset() already provides the required ordering. For the SP case, the control dependency between the if-statement (which depends on the read of r->cons.tail) and the later stores to the ring slots makes an RMB unnecessary. About the control dependency, read more at: https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf This patch adds the required read barrier to prevent reads of the ring slots from being reordered before the read of prod.tail in the SC case. Fixes: `c9fb3c6289` ("ring: move code in a new header file") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Tested-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 01:22:04 +01:00
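The ordering requirement in the entry above can be illustrated with a toy single-producer/single-consumer ring. Note the swap: the patch adds a read barrier, whereas this sketch gets the same "tail before slots" ordering from acquire loads via GCC/Clang `__atomic` builtins. All names (`sketch_ring`, `sketch_sp_enqueue`, `sketch_sc_dequeue`) are hypothetical, and the enqueue side uses an acquire load for simplicity even though the commit notes that a control dependency suffices there.

```c
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* must be a power of two */

/* Toy ring for illustration only; not the rte_ring layout. */
struct sketch_ring {
    uint32_t prod_tail; /* written by the producer */
    uint32_t cons_tail; /* written by the consumer */
    void *slots[SKETCH_RING_SIZE];
};

static int sketch_sp_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t tail = r->prod_tail; /* the producer owns this field */
    uint32_t cons = __atomic_load_n(&r->cons_tail, __ATOMIC_ACQUIRE);

    if (tail - cons == SKETCH_RING_SIZE)
        return -1; /* ring full */

    r->slots[tail & (SKETCH_RING_SIZE - 1)] = obj;
    /* Release: the slot store must be visible before the new tail. */
    __atomic_store_n(&r->prod_tail, tail + 1, __ATOMIC_RELEASE);
    return 0;
}

static int sketch_sc_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t head = r->cons_tail; /* the consumer owns this field */
    /* Acquire: the slot read below cannot be hoisted above this load,
     * so a stale slot value cannot be paired with a fresh tail. */
    uint32_t prod = __atomic_load_n(&r->prod_tail, __ATOMIC_ACQUIRE);

    if (prod == head)
        return -1; /* ring empty */

    *obj = r->slots[head & (SKETCH_RING_SIZE - 1)];
    __atomic_store_n(&r->cons_tail, head + 1, __ATOMIC_RELEASE);
    return 0;
}
```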
Pavan Nikhilesh	5cbd14b3e5	eal: roundup TSC frequency when estimating When estimating the TSC frequency using sleep/gettime, round it up to the nearest multiple of 10 MHz for more accuracy. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Keith Wiles <keith.wiles@intel.com>	2019-03-28 00:45:16 +01:00
Pavan Nikhilesh	f56e551485	eal: add macro to align value to the nearest multiple Add macro to align value to the nearest multiple of the given value, resultant value might be greater than or less than the first parameter whichever difference is the lowest. Update unit test to include the new macro. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2019-03-28 00:45:00 +01:00
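A small sketch of "align to the nearest multiple", as described in the entry above. The function name `sketch_align_mul_near` is hypothetical, and the tie-breaking rule (round up when the value is exactly halfway) is an assumption, since the commit message does not specify it.

```c
#include <stdint.h>

/* Return the multiple of `mul` closest to `v`; may be above or below `v`.
 * Assumption for this sketch: an exact tie rounds up. `mul` must be non-zero. */
static inline uint64_t sketch_align_mul_near(uint64_t v, uint64_t mul)
{
    uint64_t floor = (v / mul) * mul;               /* multiple at or below v */
    uint64_t ceil = (floor == v) ? v : floor + mul; /* multiple at or above v */

    return (ceil - v) > (v - floor) ? floor : ceil;
}
```

For instance, a hypothetical measured TSC estimate of 2194567890 Hz aligned to 10 MHz gives 2190000000 Hz, because the lower multiple is closer.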
Jerin Jacob	55878866eb	use appropriate EAL macro for constructors Use eal's RTE_INIT abstraction for defining constructors. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-03-27 23:10:57 +01:00
Jakub Grajciar	0c7ce182a7	eal: add pending interrupt callback unregister Use case: if a callback is used to receive messages from a socket and the message received is a disconnect/error, this callback needs to be unregistered, but it cannot be because it is still active. With this patch it is possible to mark the callback to be unregistered once the interrupt process is done with this interrupt source. Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>	2019-03-27 18:53:47 +01:00
Kevin Traynor	c0d9052afb	eal/linux: fix log levels for pagemap reading failure Commit `cdc242f260` says: For Linux kernel 4.0 and newer, the ability to obtain physical page frame numbers for unprivileged users from /proc/self/pagemap was removed. Instead, when an IOMMU is present, simply choose our own DMA addresses instead. In this case the user still sees error messages, so adjust the log levels. Later, other checks will ensure that errors are logged in the appropriate cases. Fixes: `cdc242f260` ("eal/linux: support running as unprivileged user") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com>	2019-03-27 14:54:40 +01:00
Anatoly Burakov	929a91e99c	malloc: fix documentation of realloc function The documentation for rte_realloc claims that the resized area will always reside on the same NUMA node. This is not actually the case - while resized area will be on the same NUMA node, if resizing the area is not possible, then the memory will be reallocated using rte_malloc(), which can allocate memory on another NUMA node, depending on which lcore rte_realloc() was called from and which NUMA nodes have memory available. Fix the API doc to match the actual code of rte_realloc(). Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-27 12:15:04 +01:00
Stephen Hemminger	24aa4f0fba	mem: poison memory when freed DPDK malloc library allows broken programs to work because the semantics of zmalloc and malloc are the same. This patch enables a more secure model which will catch (and crash) programs that reuse memory already freed if RTE_MALLOC_DEBUG is enabled. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-27 10:53:41 +01:00
Andrius Sirvys	cd6683331d	acl: fix compiler flags with meson and AVX2 runtime When compiling the ACL library on a system without AVX2 support, the flags used to compile the AVX2-specific code for later run-time use were not based on the regular cflags for the rest of the library. This can cause errors due to symbols being missed/undefined due to incorrect flags. For example, when testing compilation on Alpine linux, we got: error: unknown type name 'cpu_set_t' due to _GNU_SOURCE not being defined in the cflags. This issue can be fixed by appending "-mavx2" to the cflags rather than replacing them with it. Fixes: `5b9656b157` ("lib: build with meson") Cc: stable@dpdk.org Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-27 10:38:06 +01:00
Bruce Richardson	88f591d1db	eal: remove unneeded version logic The version number in the DPDK_VERSION file will never have an offset that needs to be subtracted, so remove that logic from the version string generation. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-03-27 09:43:54 +01:00
Bruce Richardson	d320fe56bd	build: use version number from config file Since we have the version number in a separate file at the root level, we should not need to duplicate this in rte_version.h too. The best approach here is to move the macros for specifying the year/month/etc. parts from the version header file to the build config file, leaving the other utility macros, e.g. for printing the version string, where they are. For "make", this is done by having a little bit of awk parse the version file and pass the results through to the preprocessor for the config generation stage. For "meson", this is done by parsing the version and adding it to the standard dpdk_conf object. In both cases, we need to append a large number - in this case "99", previously 16 in original code - to the version number when we want to do version number comparisons. Without this, the release version e.g. 19.05.0 will compare as less than its RCs e.g. 19.05.0-rc4. With it, the comparison is correct as "19.05.0.99 > 19.05.0-rc4.99". Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-27 09:43:47 +01:00
Fiona Trahe	866bc6742c	compressdev: add flag to specify where processing is done A new device feature flag, RTE_COMPDEV_FF_OP_DONE_IN_DEQUEUE is added. A PMD should set this if the bulk of the processing is done during the dequeue. It should leave it cleared if the bulk of the processing is done during the enqueue (default). Applications can use this as a hint for tuning. Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-03-22 15:54:24 +01:00
Fan Zhang	51acc16b51	ipsec: support 3DES-CBC This patch adds triple-des CBC mode cipher algorithm to ipsec library. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 15:54:24 +01:00
Fan Zhang	3975d5cb1d	ipsec: support AES-CTR This patch adds AES-CTR cipher algorithm support to ipsec library. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-22 15:54:24 +01:00
Damian Nowak	a76e869f66	cryptodev: remove XTS comment duplication This patch removes duplicated text about AES-XTS mode. Signed-off-by: Damian Nowak <damianx.nowak@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 15:54:24 +01:00
Anoob Joseph	2382aa8c8f	cryptodev: fix driver name comparison The string compare to the length of driver name might give false positives when there are drivers with similar names (one being the subset of another). Following is such a naming which could result in false positive. 1. crypto_driver 2. crypto_driver1 When strncmp with len = strlen("crypto_driver") is done, it could give a false positive when compared against "crypto_driver1". For such cases, 'strlen + 1' is done, so that the NULL termination also would be considered for the comparison. Fixes: `d11b0f30df` ("cryptodev: introduce API and framework for crypto devices") Cc: stable@dpdk.org Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com> Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 14:27:46 +01:00
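The false positive described in the entry above is easy to reproduce with standard `strncmp`. The helper names below are hypothetical and exist only to contrast the two comparison lengths.

```c
#include <string.h>

/* Compares only strlen(wanted) bytes: matches any name that merely
 * starts with `wanted`, which is the false positive. */
static int sketch_prefix_match(const char *wanted, const char *candidate)
{
    return strncmp(wanted, candidate, strlen(wanted)) == 0;
}

/* Compares strlen(wanted) + 1 bytes: the terminating NUL is part of the
 * comparison, so the candidate must end where `wanted` ends. */
static int sketch_exact_match(const char *wanted, const char *candidate)
{
    return strncmp(wanted, candidate, strlen(wanted) + 1) == 0;
}
```

With `wanted = "crypto_driver"` and `candidate = "crypto_driver1"`, the first helper reports a match and the second does not.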
Arek Kusztal	83a6cb03bc	cryptodev: add result field to mod exp and inv This commit adds a result field to be used when a modular exponentiation or modular multiplicative inverse operation is used. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-03-22 14:27:46 +01:00
Konstantin Ananyev	27e71c7fdc	cryptodev: restore crypto op alignment and layout In 18.08 a new cache-aligned structure, rte_crypto_asym_op, was introduced. As it was also included in rte_crypto_op, it caused an implicit change in rte_crypto_op layout and alignment: rte_crypto_op is now cache-line aligned and has a hole of 40/104 bytes between phys_addr and the sym/asym op. This looks like an unintended ABI breakage, and such a change can have negative performance effects: - status and sym[0].m_src now lie on different cache lines, so post-process code needs an extra cache-line read. - the new alignment increases the space requirements and cache-line reads/updates for structures that contain rte_crypto_op. As there seems to be no actual need to have rte_crypto_asym_op cache-line aligned, and rte_crypto_asym_op is not intended to be used on its own, the simplest fix is to remove its cache-line alignment. As an immediate positive effect, on IA ipsec-secgw performance increased by 5-10% (depending on the crypto-dev and algo used). My guess is that on machines with a 128B cache line and lookaside-protocol capable crypto devices the impact will be even more noticeable. Fixes: `26008aaed1` ("cryptodev: add asymmetric xform and op definitions") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 14:27:46 +01:00
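How an aligned member silently changes the layout of its container, as described in the entry above, can be shown with a toy pair of structures. The types here are hypothetical stand-ins, not the real `rte_crypto_op` definitions, and the hole size (56 bytes) is specific to this toy layout with a 64-byte alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an inner op declared cache-line aligned (64 bytes assumed). */
struct sketch_asym_aligned { _Alignas(64) uint32_t xform; };

/* The same inner op with natural alignment. */
struct sketch_asym_natural { uint32_t xform; };

/* Outer op embedding the aligned inner op: the union is pushed to
 * offset 64, leaving a hole after phys_addr, and the whole structure
 * becomes 64-byte aligned. */
struct sketch_op_before {
    uint64_t phys_addr;
    union {
        uint64_t sym;
        struct sketch_asym_aligned asym;
    } u;
};

/* Outer op after dropping the alignment: no hole, natural alignment. */
struct sketch_op_after {
    uint64_t phys_addr;
    union {
        uint64_t sym;
        struct sketch_asym_natural asym;
    } u;
};
```

In this sketch `offsetof(struct sketch_op_before, u)` is 64 while `offsetof(struct sketch_op_after, u)` is 8, so fields that used to share a cache line with the header no longer do.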
Stephen Hemminger	0366137722	ethdev: check for invalid device name Do not allow creating an Ethernet device with a name over the allowed maximum (or zero length). This is safer than silently truncating which is what happens now. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ali Alnubani <alialnu@mellanox.com>	2019-03-21 19:27:51 +01:00
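The name check described in the entry above amounts to rejecting empty and over-long names instead of truncating them. A minimal sketch follows, in which the limit `SKETCH_NAME_MAX_LEN`, the function name, and the returned error codes are all assumptions for illustration rather than the ethdev API.

```c
#include <errno.h>
#include <string.h>

/* Assumed limit for this sketch, including the terminating NUL. */
#define SKETCH_NAME_MAX_LEN 64

/* Return 0 if the name is usable, a negative errno-style value otherwise. */
static int sketch_check_dev_name(const char *name)
{
    size_t len = strlen(name);

    if (len == 0)
        return -EINVAL;       /* zero-length name */
    if (len >= SKETCH_NAME_MAX_LEN)
        return -ENAMETOOLONG; /* would not fit; refuse rather than truncate */
    return 0;
}
```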
Andrew Rybchenko	b6950cc79d	ethdev: highlight that all-multicast is retained on restart All-multicast is a part of receive mode configuration and it is better to mention explicitly that it is retained across restart. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Andrew Rybchenko	8010be2a12	ethdev: advertise default MAC as retained on restart The documentation says MAC addresses array is retained and it is logical to assume that default MAC address is retained as well. Also some PMDs do not allow to change the default MAC in running state (see RTE_ETH_DEV_NOLIVE_MAC_ADDR). Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Andrew Rybchenko	189f554647	ethdev: advertise MTU as retained across stop/start Changing MTU in running state may return -EBUSY saying that MTU cannot be changed when the port is running. It assumes that changes may be done in stopped and started (but some PMDs may reject it) state and it is logical to require that changes done in any of these states are retained. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	3e0396166b	vhost: support requests only handled by external backend External backends may have specific requests to handle, and so we don't want the vhost-user lib to handle these requests as errors. This patch also changes the experimental API by introducing RTE_VHOST_MSG_RESULT_NOT_HANDLED so that vhost-user lib can report an error if a message is handled neither by the vhost-user library nor by the external backend. The logic changes a bit so that if the callback returns with ERR, OK or REPLY, the message is considered handled by the external backend, so it won't be handled by the vhost-user library. It is still possible for an external backend to listen to requests that have to be handled by the vhost-user library like SET_MEM_TABLE, but the callback has to return NOT_HANDLED in that case. Vhost-crypto backend is also adapted to this API change. Suggested-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	9401b80327	vhost: add API to set protocol features flags rte_vhost_driver_set_protocol_features API is to be used by external backends to advertise the vhost-user protocol features they support. It has to be called after rte_vhost_driver_register() and before rte_vhost_driver_start(). Example of usage to advertise VHOST_USER_PROTOCOL_F_FOOBAR protocol feature: const char *path = "/tmp/vhost-user"; uint64_t protocol_features; rte_vhost_driver_register(path, 0); rte_vhost_driver_get_protocol_features(path, &protocol_features); protocol_features |= VHOST_USER_PROTOCOL_F_FOOBAR; rte_vhost_driver_set_protocol_features(path, protocol_features); rte_vhost_driver_start(path); Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Jiayu Hu	2f706027c8	vhost: fix interrupt suppression for the split ring The VIRTIO_RING_F_EVENT_IDX feature of split ring might be broken, as the value of signalled_used is invalid after live migration, start up and virtio driver reload. This patch fixes it by using signalled_used_valid. In addition, this patch makes the VIRTIO_RING_F_EVENT_IDX implementation of split ring match kernel backend to suppress more interrupts. Fixes: `e37ff95440` ("vhost: support virtqueue interrupt/notification suppression") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	11d5253a3e	vhost: prevent disabled rings to be processed with zero-copy The vhost-user spec says that once the vring is disabled, the client has to stop processing it. But it can happen when dequeue zero-copy is enabled if outstanding descriptors buffers are still being processed by an external NIC or another guest. The fix consists in draining the zmbufs list to ensure no more descriptors buffers are in the wild. Note that this fix is only working in the case REPLY_ACK protocol feature is enabled, which is not the case by default for now (it is only enabled when IOMMU feature is enabled in the vhost library). Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Rami Rosen	7065dff68e	ethdev: fix method name in doxygen comment This patch fixes rte_ethdev header file to use the correct method name, namely to use rte_eth_dev_info_get() instead of rte_eth_dev_infos_get(). Fixes: `a4996bd89c` ("ethdev: new Rx/Tx offloads API") Fixes: `4f5701f28b` ("examples: fix RSS hash function configuration") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>	2019-03-20 18:15:42 +01:00
Stephen Hemminger	ddf5a9c6d7	ethdev: replace snprintf with strlcpy on init Don't need to use snprintf for simple name copy. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Rami Rosen <ramirose@gmail.com>	2019-03-20 18:15:42 +01:00
Stephen Hemminger	e97f83e452	ethdev: replace snprintf with strlcpy for owner The set_port_owner was copying a string between structures of the same type, therefore the name could never be truncated (unless source string was not null terminated). Use strlcpy which does it better. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-03-20 18:15:42 +01:00
Tomasz Jozwiak	a7cece2ead	malloc: add NUMA-aware realloc function Currently, rte_realloc will not respect original allocation's NUMA node when memory cannot be resized, and there is no NUMA-aware equivalent of rte_realloc. This patch adds such a function. The new API will ensure that reallocated memory stays on requested NUMA node, as well as allow moving allocated memory to a different NUMA node. Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-23 16:54:50 +01:00
Pavan Nikhilesh	1534cc6ab1	doc: add notes about eventdev producer/consumer dependency The EventDev, i.e. the consumer, needs to be started before starting the event producers. Update documentation of EventDev and EventDev adapters. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Reviewed-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>	2019-03-15 06:46:50 +01:00
Bruce Richardson	218c4e68c1	mk: use linux and freebsd in config names Rather than using linuxapp and bsdapp everywhere, we can change things to use the, more readable, terms "linux" and "freebsd" in our build configs. Rather than renaming the configs we can just duplicate the existing ones with the new names using symlinks, and use the new names exclusively internally. ["make showconfigs" also only shows the new names to keep the list short] The result is that backward compatibility is kept fully but any new builds or development can be done using the newer names, i.e. both "make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc" work. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 23:05:06 +01:00
Bruce Richardson	5fbc1d498f	build/freebsd: rename macro BSDAPP to FREEBSD Rename the macro and all instances in DPDK code, but keep a copy of the old macro defined for legacy code linking against DPDK. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 23:01:14 +01:00
Bruce Richardson	742bde12f3	build/linux: rename macro from LINUXAPP to LINUX Rename the macro to make things shorter and more comprehensible. For both meson and make builds, keep the old macro around for backward compatibility. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:31:22 +01:00
Bruce Richardson	91d7846ce6	eal/linux: rename linuxapp to linux The term "linuxapp" is a legacy one, but just calling the subdirectory "linux" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:31:13 +01:00
Bruce Richardson	25c99fbd68	eal/bsd: rename bsdapp to freebsd The term "bsdapp" is a legacy one, but just calling the subdirectory "freebsd" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:30:20 +01:00
Arek Kusztal	9f9d23d428	cryptodev: rework modexp and modinv comments This patch changes modular exponentiation and modular multiplicative inverse API comments to make it more precise. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-06 17:30:43 +01:00
David Marchand	1598c72959	eal: fix core list validation with disabled cores -l and -c options are two ways to select the cores used by DPDK. Their format differs, but the checks on the selected cores are the same. Use an intermediate array to separate the specific parsing checks from the common consistency checks. The parsing functions now concentrate on validating the passed string and do nothing more. We can report all invalid core indexes rather than only the first error. In the error log message, reporting [0, cfg->lcore_count - 1] as a valid range is then wrong when the core list is not continuous. Example on my 8 cpus laptop with core 2 and 6 disabled. echo 0 > /sys/devices/system/cpu/cpu2/online echo 0 > /sys/devices/system/cpu/cpu6/online Before: ./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048 EAL: Detected 6 lcore(s) EAL: Detected 1 NUMA nodes EAL: invalid core list, please check core numbers are in [0, 5] range ... After: ./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048 EAL: Detected 6 lcore(s) EAL: Detected 1 NUMA nodes EAL: lcore 2 unavailable EAL: lcore 6 unavailable EAL: invalid core list, please check specified cores are part of 0-1,3-5,7 ... Fixes: `d888cb8b96` ("eal: add core list input format") Fixes: `b38693b612` ("eal: fix core number validation") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-03-07 21:22:53 +01:00
David Marchand	33df941d79	eal: remove dead code in core list parsing We don't need to look for trailing spaces. This is a copy/paste block from eal_parse_coremask(). Remove it and the associated comment. Fixes: `d888cb8b96` ("eal: add core list input format") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-03-07 21:22:48 +01:00
David Marchand	c3568ea376	eal: restrict control threads to startup CPU affinity Spawning the ctrl threads on anything that is not part of the eal coremask is not that polite to the rest of the system, especially when you took good care to pin your processes on cpu resources with tools like taskset (linux) / cpuset (freebsd). Rather than introduce yet another eal options to control on which cpu those ctrl threads are created, let's take the startup cpu affinity as a reference and remove the eal coremask from it. If no cpu is left, then we default to the master core. The cpuset is computed once at init before the original cpu affinity is lost. Introduced a RTE_CPU_AND macro to abstract the differences between linux and freebsd respective macros. Examples in a 4 cores FreeBSD vm: $ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1057 PID TID COMM TDNAME CPU CSID CPU MASK 1057 100131 testpmd - 2 1 2 1057 100140 testpmd eal-intr-thread 1 1 0-1 1057 100141 testpmd rte_mp_handle 1 1 0-1 1057 100142 testpmd lcore-slave-3 3 1 3 $ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1061 PID TID COMM TDNAME CPU CSID CPU MASK 1061 100131 testpmd - 2 2 2 1061 100144 testpmd eal-intr-thread 1 2 1 1061 100145 testpmd rte_mp_handle 1 2 1 1061 100147 testpmd lcore-slave-3 3 2 3 $ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1065 PID TID COMM TDNAME CPU CSID CPU MASK 1065 100131 testpmd - 2 2 2 1065 100148 testpmd eal-intr-thread 2 2 2 1065 100149 testpmd rte_mp_handle 2 2 2 1065 100150 testpmd lcore-slave-3 3 2 3 Fixes: `d651ee4919` ("eal: set affinity for control threads") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-03-07 19:21:28 +01:00
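The rule in the entry above (startup affinity minus the EAL coremask, falling back to the master core when nothing is left) can be sketched with plain bitmasks. This uses a 64-bit mask instead of `cpu_set_t` to stay portable, and the function name is hypothetical.

```c
#include <stdint.h>

/* Bit n set means CPU n is in the set. Supports CPUs 0..63 only. */
static uint64_t sketch_ctrl_thread_cpus(uint64_t startup_affinity,
                                        uint64_t eal_coremask,
                                        unsigned int master_core)
{
    /* Keep the CPUs we were started on, minus those used as lcores. */
    uint64_t cpus = startup_affinity & ~eal_coremask;

    /* Nothing left: fall back to the master core. */
    if (cpus == 0)
        cpus = UINT64_C(1) << master_core;
    return cpus;
}
```

Applied to the three FreeBSD runs in the commit message (`-l 2,3`, so the coremask is 0xC and the master core is 2): a startup affinity of CPUs 0-3 yields CPUs 0-1, CPUs 1-3 yields CPU 1, and CPUs 2-3 yields CPU 2, matching the CPU MASK column of the control threads.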
David Marchand	759b9be661	eal: fix control threads pinning pthread_setaffinity_np returns a >0 value on error. We could end up leaving the ctrl threads on the current process cpu affinity. Fixes: `d651ee4919` ("eal: set affinity for control threads") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-03-07 19:13:48 +01:00
David Marchand	b206376438	eal: fix check when retrieving current CPU affinity pthread_getaffinity_np returns a >0 value when failing. This is mainly for the sake of correctness. The only case where it could fail is when passing an incorrect cpuset size with respect to the kernel. Fixes: `2eba8d21f3` ("eal: restrict cores auto detection") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Rami Rosen <ramirose@gmail.com>	2019-03-07 16:37:14 +01:00
Stephen Hemminger	e7d798172f	eal: remove legacy PMD log macro The RTE_PMD_DEBUG_TRACE was only enabled for EVENTDEV_DEBUG and that configuration is now handled by RTE_EDEV_LOG macros. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Stephen Hemminger	b5af9abaa7	eventdev: use same log macro for all unsupported calls The driver already has RTE_EDEV_XXX log macros so use them in two more places. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Stephen Hemminger	e37aad5ed3	eal: drop unused macros for primary process check No usage in current DPDK code base. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Darek Stojaczyk	c19429844c	vhost: remove vhost-net requirements from generic APIs The rte_vhost API to put data into virtqueues operates on mbufs and hence it is strictly vhost-net specific. External backends need to implement virtqueue handling from scratch and that's just not possible without APIs to get/set vring base addresses. Those relevant APIs are there, but they have a check that prevents them from working with any non-vhost-net device. This patch removes those checks. rte_vhost_get_log_base() is not necessarily needed for external backends, as other, higher level vhost APIs for live migration are available and could be used instead. We remove the extra check from it anyway for consistency. Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	2a2904fa9c	vhost: fix potential use-after-free for memory region Reclaim outstanding zmbufs first before freeing memory regions, otherwise there could be use-after-free. Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	d767436ee5	vhost: fix potential use-after-free for zero copy mbuf Don't free the zero copy mbufs before they have been consumed, otherwise there could be use-after-free. Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	041d37b2ef	vhost: restore mbuf first when freeing zmbuf The mbufs should also be restored in free_zmbufs(). Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Fixes: `3ebd930588` ("vhost: fix mbuf free") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:35 +01:00
Luca Boccassi	a9933bb1de	build: improve libbsd dependency handling Use dependency() instead of manual append to ldflags. Move libbsd inclusion to librte_eal, so that all other libraries and PMDs will inherit it. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:28:03 +01:00
Luca Boccassi	8990c8c0d0	build: reorder cmdline library after EAL Most libraries and PMDs depend on eal, and eal depends only on kvargs, so reorder the list in Meson to reflect this and take advantage of this dependency chain. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:27:58 +01:00
Luca Boccassi	e30b4e566f	build: improve dependency handling Whenever possible (if the library ships a pkg-config file) use meson's dependency() function to look for it, as it will automatically add it to the Requires.private list if needed, to allow for static builds to succeed for reverse dependencies of DPDK. Otherwise the recursive dependencies are not parsed, and users doing static builds have to resolve them manually by themselves. When using this API avoid additional checks that are superfluous and take extra time, and avoid adding the linker flag manually which causes it to be duplicated. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:13:54 +01:00
Bruce Richardson	d23e141ffa	build: set RTE_ARCH_64 based on pointer size Rather than relying on the target machine architecture, use the size of a pointer from the compiler to determine if we are 64-bits or not. This allows correct behaviour when you pass -m32 as a compile option. It also allows us to use this value repeatedly throughout the repo rather than continually testing for the sizeof(void*). Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Luca Boccassi <bluca@debian.org> Acked-by: Luca Boccassi <bluca@debian.org>	2019-02-26 18:34:28 +01:00
Bruce Richardson	b543d1a715	compat: merge compat library into EAL Since compat library is only a single header, we can easily move it into the EAL common headers instead of tracking it separately. The downside of this is that it becomes a little more difficult to have any libs that are built before EAL depend on it. Thankfully, this is not a major problem as the only library which uses rte_compat.h and is built before EAL (kvargs) already has the path to the compat.h header file explicitly called out as an include path. However, to ensure that we don't hit problems later with this, we can add EAL common headers folder to the global include list in the meson build which means that all common headers can be safely used by all libraries, no matter what their build order. As a side-effect, this patch also fixes an issue with building on BSD using meson, due to compat lib no longer needing to be listed as a dependency. Fixes: `a8499f65a1` ("log: add missing experimental tag") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: David Marchand <david.marchand@redhat.com> Tested-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-02-25 16:03:31 +01:00
Pallantla Poornima	7c7b756225	vhost: fix sprintf with snprintf sprintf function is not secure as it doesn't check the length of string. More secure function snprintf is used. Fixes: `d7280c9fff` ("vhost: support selective datapath") Cc: stable@dpdk.org Signed-off-by: Pallantla Poornima <pallantlax.poornima@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2019-02-22 14:39:49 +01:00
Wenjie Sun	054617fd82	vhost: fix deadlock in driver unregister In rte_vhost_driver_unregister(), the connection fd is removed from the fdset using fdset_try_del(). Call to this function may fail if the corresponding fd is in busy state, indicating that event dispatcher is executing the read or write callback on this fd. When it happens, rte_vhost_driver_unregister() keeps trying to remove the fd from the set until it is no more busy. This situation is causing a deadlock, because rte_vhost_driver_unregister() keeps trying to remove the fd from the set with vhost_user.mutex held, while the callback executed by the dispatcher, vhost_user_read_cb(), also takes this mutex at numerous places. The fix consists in releasing vhost_user.mutex between each retry in vhost_driver_unregister(). Fixes: `8b4b949144` ("vhost: fix dead lock on closing in server mode") Cc: stable@dpdk.org Signed-off-by: Wenjie Sun <findtheonlyway@gmail.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-02-22 14:39:49 +01:00
Thomas Monjalon	a3a4aba4b5	ethdev: remove unused variable When removing the old attach function, the racy variable for getting the last port id became unused. Fixes: `c9cce42876` ("ethdev: remove deprecated attach/detach functions") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-02-20 16:41:43 +01:00
Rami Rosen	b13baac8d5	mbuf: fix a typo This trivial patch fixes a typo in rte_mbuf.h. Fixes: `f20b50b946` ("mbuf: optimize refcnt update") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com>	2019-02-12 14:32:01 +01:00
Bruce Richardson	146e57627f	eal: support strlcat function Add the strlcat function to DPDK to exist alongside the strlcpy one. While strncat is generally safe for use for concatenation, the API for the strlcat function is perhaps a little nicer to use, and supports truncation detection. See commit `5364de644a` ("eal: support strlcpy function") for more details on the function selection logic, since we only should be using the DPDK-provided version when no system-provided version is present. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-02-12 10:04:28 +01:00
Darek Stojaczyk	0b8572a0c1	vhost: add external message handling to the API External message callbacks are used e.g. by vhost crypto to parse crypto-specific vhost-user messages. We are now publishing the API to register those callbacks, so that other backends outside of DPDK can use them as well. Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-02-08 19:27:07 +01:00
Xiao Wang	b172129583	vhost: remove vDPA available ring relay helper We don't need to relay available ring and check the desc, vdpa device can access the available ring in the guest directly. With this patch, we can achieve better throughput and lower CPU usage. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-02-08 19:27:07 +01:00
Vladimir Medvedkin	1b421d195c	hash: replace license text with SPDX tag for Toeplitz algo Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2019-02-06 22:08:04 +01:00
Thomas Monjalon	cae0d722d6	version: 19.05-rc0 Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: John McNamara <john.mcnamara@intel.com>	2019-02-06 11:20:06 +01:00
Thomas Monjalon	8b937bae24	version: 19.02.0 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2019-02-01 15:25:17 +01:00
Thomas Monjalon	a2f9c0d417	version: 19.02-rc4 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2019-01-28 02:53:53 +01:00
Congwen Zhang	1459d335d0	lpm: remove recursively included header file includes itself: rte_lpm.h -> rte_lpm_sse.h -> rte_lpm.h includes itself: rte_lpm.h -> rte_lpm_neon.h -> rte_lpm.h Signed-off-by: Congwen Zhang <zhang.congwen@zte.com.cn>	2019-01-28 01:47:23 +01:00
Ed Czeck	14ff7fb97e	mbuf: fix struct initialization with C++ g++ reports "error: missing initializer for member" Fixes: `5d3f721009` ("mbuf: implement generic format for sched field") Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>	2019-01-28 00:57:57 +01:00
Tiwei Bie	4800639000	vhost: fix access for indirect descriptors Fix a possible out of bound access which may happen when handling indirect descs in split ring. Fixes: `1be4ebb1c4` ("vhost: support indirect descriptor in mergeable Rx") Cc: stable@dpdk.org Reported-by: Haiyue Wang <haiyue.wang@intel.com> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-24 10:08:31 +01:00
Ilya Maximets	0a703f0f36	eal/linux: fix parsing zero socket memory and limits Modern memory mode allows not reserving any memory via the '--socket-mem' option, i.e. it should be possible to specify zero preallocated memory like '--socket-mem 0'. Also, it should be possible to configure unlimited memory allocations by '--socket-limit 0'. Both cases are impossible now and block starting the DPDK application: ./dpdk-app --socket-limit 0 <...> EAL: invalid parameters for --socket-limit EAL: Invalid 'command line' arguments. Unable to initialize DPDK: Invalid argument Fixes: `6b42f75632` ("eal: enable non-legacy memory mode") Cc: stable@dpdk.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-23 23:02:07 +01:00
Andrew Rybchenko	727663e88b	gso: fix VxLAN/GRE tunnel checks Tunnel type is an enum in PKT_TX_TUNNEL_MASK bits. As the result, for example, IPIP or MPLSinUDP tunnel packets may be incorrectly treated internally as VXLAN. Fixes: `b058d92ea9` ("gso: support VxLAN GSO") Fixes: `70e737e448` ("gso: support GRE GSO") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Jiayu Hu <jiayu.hu@intel.com>	2019-01-22 17:08:21 +01:00
Andrew Rybchenko	8e0c8dcf35	ethdev: fix errno to have positive value rte_errno should be set to positive value from errno.h plus few RTE-specific values. Fixes: `4fb7e803eb` ("ethdev: add Tx preparation") Fixes: `439a90b5f2` ("ethdev: reorder inline functions") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-01-22 17:03:10 +01:00
Anatoly Burakov	47f4fe0595	vfio: allow secondary process to query IOMMU type It is only possible to know IOMMU type of a given VFIO container by attempting to initialize it. Since secondary process never attempts to set up VFIO container itself (because they're shared between primary and secondary), it never knows which IOMMU type the container is using, and never sets up the appropriate config structures. This results in inability to perform DMA mappings in secondary process. Fix this by allowing secondary process to query IOMMU type of primary's default container at device initialization. Note that this fix is assuming we're only interested in default container. Bugzilla ID: 174 Fixes: `6bcb7c95fe` ("vfio: share default container in multi-process") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2019-01-21 16:13:59 +01:00
Thomas Monjalon	84a1d4a873	version: 19.02-rc3 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2019-01-20 22:39:20 +01:00
Hari Kumar Vemula	c749c03225	efd: fix tail queue leak In rte_efd_create() allocated memory for tail queue entry but not freed. Added freeing the tail queue entry. Fixes: `56b6ef874f` ("efd: new Elastic Flow Distributor library") Cc: stable@dpdk.org Signed-off-by: Hari Kumar Vemula <hari.kumarx.vemula@intel.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2019-01-19 00:07:52 +01:00
Nithin Dabilpuram	e1e1c08a53	ethdev: report error on ring name truncation Currently this api doesn't report error if name is truncated and so user is not sure about uniqueness of name. This change reports error to help user. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-01-18 09:47:26 +01:00
Tiwei Bie	e1c0834f95	vhost: fix memory leak on realloc failure When realloc() fails, the original block isn't freed. Fixes: `e246896178` ("vhost: get guest/host physical address mappings") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-18 09:47:26 +01:00
Xiaolong Ye	9303de0c94	vhost: remove unused function prototype vhost_user_host_notifier_ctrl no longer exists, so its declaration in the header file should be removed accordingly. Fixes: `43f34e3566` ("vhost: provide helper for host notifier ctrl") Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-01-18 09:47:26 +01:00
Xiaolong Ye	9f90145128	vhost: configure vDPA device after set vring call message As qemu will only send the VHOST_USER_SET_VRING_ENABLE message for guest enabled vrings (only the first queue pair will be enabled at the initialization stage), this will cause trouble for the multiqueue case: vDPA's dev_conf callback will get no chance to be invoked. Decoupling the dev_conf callback from VHOST_USER_SET_VRING_ENABLE solves this issue. Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-18 09:47:26 +01:00
Jiayu Hu	7ccc7a05d6	gro: check invalid TCP header length When the TCP header length of input packets is invalid (i.e., less than 20 bytes or greater than 60 bytes), check_seq_option() will access illegal memory area when compare TCP Options, which may cause a segmentation fault. This patch adds missing invalid TCP header length check to avoid illegal memory accesses. Fixes: `0d2cbe59b7` ("lib/gro: support TCP/IPv4") Fixes: `9e0b9d2ec0` ("gro: support VxLAN GRO") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-01-17 22:38:54 +01:00
Liang Ma	7c06d9258a	power: fix file descriptor leak Coverity issue: 328528 Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Signed-off-by: Liang Ma <liang.j.ma@intel.com> Reviewed-by: Lei Yao <lei.a.yao@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com>	2019-01-17 19:20:02 +01:00
Ilya Maximets	6406d70561	eal: fix clang build with intrinsics forced This fixes x86_64-native-linuxapp-clang build with CONFIG_RTE_FORCE_INTRINSICS=y: include/generic/rte_atomic.h:218:9: error: implicit declaration of function '__atomic_exchange_2' is invalid in C99 [-Werror,-Wimplicit-function-declaration] include/generic/rte_atomic.h:501:9: error: implicit declaration of function '__atomic_exchange_4' is invalid in C99 [-Werror,-Wimplicit-function-declaration] include/generic/rte_atomic.h:783:9: error: implicit declaration of function '__atomic_exchange_8' is invalid in C99 [-Werror,-Wimplicit-function-declaration] We didn't catch this issue previously on other platforms because CONFIG_RTE_FORCE_INTRINSICS is enabled by default only for armv8. Fixes: `7bdccb9307` ("eal: fix ARM build with clang") Cc: stable@dpdk.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com>	2019-01-17 18:39:55 +01:00
Anatoly Burakov	2383d8e909	eal: check string parameter lengths When specifying parameters such as hugefile prefix from the command-line, it is possible to supply an empty string. This may lead to various problems: for example, if hugefile prefix is empty, the runtime config path construction may end up looking like "/var/run/dpdk//_config", which will technically work, but is wrong and places files in the wrong place. To fix it, check lengths of such user-specified parameters for hugefile prefix, as well as hugepage dir and user-specified mbuf pool ops string. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-17 18:39:55 +01:00
David Marchand	7b55015e14	eal: fix out of bound access when no CPU available In the unlikely case when the dpdk application is started with no cpu available in the [0, RTE_MAX_LCORE - 1] range, the master_lcore is automatically chosen as RTE_MAX_LCORE which triggers an out of bound access. Either you have a crash then, or the initialisation fails later when trying to pin the master thread on it. In my test, with RTE_MAX_LCORE == 2: $ taskset -c 2 ./master/app/testpmd --no-huge -m 512 --log-level *:debug [...] EAL: pthread_setaffinity_np failed PANIC in eal_thread_init_master(): cannot set affinity 7: [./master/app/testpmd() [0x47f629]] Bugzilla ID: 19 Fixes: `2eba8d21f3` ("eal: restrict cores auto detection") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-01-17 18:39:55 +01:00
Hari Kumar Vemula	b38693b612	eal: fix core number validation When incorrect core value or range provided, as part of -l command line option, a crash occurs. Added valid range checks to fix the crash. Added ut check for negative core values. Added unit test case for invalid core number range. Fixes: `d888cb8b96` ("eal: add core list input format") Cc: stable@dpdk.org Signed-off-by: Hari Kumar Vemula <hari.kumarx.vemula@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-01-17 17:22:04 +01:00
Thomas Monjalon	05853e1784	version: 19.02-rc2 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2019-01-15 03:08:43 +01:00
David Hunt	ad514edf71	power: fix frequency list return code The power_pstate_cpufreq_freqs() function was returning -1 in an unsigned int, causing buffer over-runs when the results were being processed. This function should be returning zero for all error conditions, similar to its acpi relation, power_acpi_cpufreq_freqs(). Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Signed-off-by: David Hunt <david.hunt@intel.com>	2019-01-15 02:40:41 +01:00
David Hunt	2e6ccdb4e0	power: fix frequency list to handle null buffer This patch fixes a segfault in the case where a null buffer is passed to the following functions: power_acpi_cpufreq_freqs() power_pstate_cpufreq_freqs() Fixes: `445c6528b5` ("power: common interface for guest and host") Signed-off-by: David Hunt <david.hunt@intel.com>	2019-01-15 02:40:41 +01:00
David Hunt	de394915df	power: fix error handling on setting governor In the power_set_governor_*() functions, we are using fputs() on /sys filesystem. However, we also need to call fflush() to ensure that the write completes successfully. Otherwise the attempt to set the power governor fails and the function returns as if it has succeeded. This patch adds an fflush to ensure that the write succeeds, otherwise returns an error. Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Signed-off-by: David Hunt <david.hunt@intel.com>	2019-01-15 02:40:40 +01:00
Konstantin Ananyev	423aa5e130	ipsec: fix build with gcc 6 gcc 6.2 and 6.4 fail to compile lib/librte_ipsec/sa.c with the following errors: lib/librte_ipsec/sa.c: In function inline_outb_tun_pkt_process: x86_64-native-linuxapp-gcc/include/rte_memcpy.h:337:2: error: array subscript is above array bounds [-Werror=array-bounds] rte_mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ... It complains about the following lines of code: esp_outb_tun_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t .... /* update spi, seqn and iv */ esph = (struct esp_hdr *)(ph + sa->hdr_len); iv = (uint64_t *)(esph + 1); rte_memcpy(iv, ivp, sa->iv_len); While I believe it is a false positive, it is excessive to use rte_memcpy() here, as IV length could be only 0/8/16 bytes. So introduce a small helper function to copy IV and use it instead of rte_memcpy(). Fixes: `4d7ea3e145` ("ipsec: implement SA data-path API") Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	68c4768d36	eal: return error when option register fails Make rte_option_register return a negative value when an error occurs. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	e48839afff	eal: improve option API documentation Use doxygen to describe the main structure and describe a little more why it exists. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	d3bdefef22	eal: fix log level of error in option register INFO is not correct when logging an error. Fixes: `2395332798` ("eal: add option register infrastructure") Cc: stable@dpdk.org Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	f87471c3f1	eal: check against common option on register Not only check against other registered options, but also common EAL options. This will mitigate user confusion. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	42f6dbda09	eal: rename option name field option->opt_* is redundant. The field should also be constant. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	b8fe14b7cf	eal: add option usage string Add a usage string field in rte_option, allowing to display help to the user and describe which options are currently available. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	ce6448fa01	eal: do not use static option iterator This is rather weird. Someone should have caught that during review. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Gaetan Rivet	4c3bf26c19	eal: use bare option string as name Current options name can be passed with arbitrary format. Force the use of "--" prefix and thus POSIX long options format. This restricts the ability to introduce surprising options and will help future additional checks. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-01-15 02:40:40 +01:00
Yongseok Koh	c277b34c1b	mbuf: add function returning buffer address This patch introduces two new functions - rte_mbuf_buf_addr() and rte_mbuf_data_addr_default(). rte_mbuf_buf_addr() returns the buffer address of given mbuf which comes after mbuf structure and private data. rte_mbuf_data_addr_default() returns the default address of mbuf data taking the headroom into account. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-01-15 02:40:40 +01:00
David Marchand	d4dca8fe43	mbuf: add a non fatal sanity check helper Let's add a little helper that does the same as rte_mbuf_sanity_check but without the panic. Signed-off-by: David Marchand <david.marchand@6wind.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-01-15 02:40:40 +01:00
David Marchand	dcd2a33d3c	mbuf: add sanity checks on segment metadata Add some basic checks on the segments offset and length metadata: always funny to have a < 0 tailroom cast to uint16_t ;-). Signed-off-by: David Marchand <david.marchand@6wind.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-01-15 02:40:40 +01:00
David Marchand	094eb27507	ethdev: declare Tx prepare API as not experimental The Tx prepare API was introduced at a time when the experimental API tag mechanism did not exist yet and is missing the tag. However, this API has been there since 17.02 and did not suffer any change since its introduction. Let's just remove the warning from the documentation. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-01-14 17:44:29 +01:00
Fan Zhang	16d2e718b8	vhost/crypto: fix possible out of bound access This patch fixes an out of bound access possibility in vhost crypto. Originally the incorrect next descriptor index may cause the library to read invalid memory content and crash the application. Fixes: `3bb595ecd6` ("vhost/crypto: add request handler") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Fan Zhang	c7e7244b82	vhost/crypto: fix possible dead loop This patch fixes a possible infinite loop caused by incorrect descriptor chain created by the driver. Fixes: `3bb595ecd6` ("vhost/crypto: add request handler") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	61ec8f58b0	vhost: ensure event idx is mapped when negotiated Fixes: `30920b1e2b` ("vhost: ensure all range is mapped when translating QVAs") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	450539b47e	vhost: fix possible dead loop in vector filling Fix a possible dead loop which may happen, e.g. when driver created a loop in the desc list and lens in descs are zero. Fixes: `fd68b4739d` ("vhost: use buffer vectors in dequeue path") Fixes: `2f3225a7d6` ("vhost: add vector filling support for packed ring") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	06fc8545fd	vhost: fix possible out of bound access in vector filling Fixes: `7f74b95c44` ("vhost: pre update used ring for Tx and Rx") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	4e282bc6c5	vhost: fix possible dead loop in relay helpers Fix a possible dead loop which may happen, e.g. when driver created a loop in the desc list. Fixes: `b13ad2decc` ("vhost: provide helpers for virtio ring relay") Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	85936f0546	vhost: fix possible out of bound access in relay helpers Fixes: `b13ad2decc` ("vhost: provide helpers for virtio ring relay") Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Tiwei Bie	e218fa09f4	vhost: fix desc access in relay helpers Descs in desc table should be indexed using the desc idx instead of the idx of avail ring and used ring. Fixes: `b13ad2decc` ("vhost: provide helpers for virtio ring relay") Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-01-14 17:44:29 +01:00
Ilya Maximets	9726aa9907	eal: fix build of external app with clang on armv8 In case DPDK is built using GCC, RTE_TOOLCHAIN_CLANG is not defined. But 'rte_atomic.h' is a generic header that is included by external apps like OVS when building with DPDK. As a result, clang build of OVS fails on armv8 if DPDK is built using gcc: include/generic/rte_atomic.h:215:9: error: implicit declaration of function '__atomic_exchange_2' is invalid in C99 include/generic/rte_atomic.h:494:9: error: implicit declaration of function '__atomic_exchange_4' is invalid in C99 include/generic/rte_atomic.h:772:9: error: implicit declaration of function '__atomic_exchange_8' is invalid in C99 We need to check for current compiler, not the compiler used for DPDK build. Fixes: `7bdccb9307` ("eal: fix ARM build with clang") Cc: stable@dpdk.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-01-14 19:49:48 +01:00
Yongseok Koh	bf78d4dc2b	mbuf: remove experimental tag for external attachment Remove the experimental tag of rte_pktmbuf_attach_extbuf() which was introduced in 18.05. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-01-14 16:37:59 +01:00
Yongseok Koh	952f4cf5f0	mbuf: remove deprecated macro RTE_MBUF_INDIRECT() is replaced with RTE_MBUF_CLONED() and removed. This macro was deprecated in release 18.05 when EXT_ATTACHED_MBUF was introduced. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-01-14 16:37:36 +01:00
Harry van Haaren	6af649a44c	mbuf: fix C++ compatibility by making sched struct visible Although C compilation works with the struct rte_mbuf_sched declared inside the struct rte_mbuf namespace, C++ fails to compile. This fix moves the rte_mbuf_sched struct up to the global namespace, instead of declaring it inside the struct mbuf namespace. The struct rte_mbuf_sched is being used on the stack in rte_mbuf_sched_get() and as a cast in _set(). For this reason, it must be exposed as an available type. Fixes: `5d3f721009` ("mbuf: implement generic format for sched field") Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-01-14 16:03:28 +01:00
Anatoly Burakov	ba07193e03	mem: fix storing old policy The original code was supposed to overwrite the value pointed to by the pointer, but the new one is instead overwriting the pointer value itself, which has no effect outside that function. Fix it by adding a pointer dereference. Fixes: `582bed1e1d` ("mem: support mapping hugepages at runtime") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-14 15:50:52 +01:00
Anatoly Burakov	199629022c	mem: fix variable shadowing A local variable ``flags`` was shadowing another variable from outer scope. Fix this by renaming the variable and make it const. Fixes: `c127be93f6` ("mem: support using memfd segments for in-memory mode") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-14 15:42:40 +01:00
Anatoly Burakov	c0f8d50d1c	vfio: do not unregister callback in secondary process Callbacks are only registered in the primary, so do not attempt to unregister callbacks in secondary processes. Fixes: `43e4631371` ("vfio: support memory event callbacks") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-14 15:31:51 +01:00
Anatoly Burakov	97257eee2d	eal/bsd: remove clean up of files at startup On FreeBSD, closing the file descriptor drops the lock even if the file descriptor was mmap'ed. This leads to the cleanup at the end of EAL init to remove fbarray files that are still in use by the process itself. However, instead of working around this issue, we can take advantage of the fact that FreeBSD doesn't really create any per-process files in the first place, so no cleanup is actually needed. Fixes: `0a529578f1` ("eal: clean up unused files on initialization") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-14 15:23:12 +01:00
Anatoly Burakov	66d9f61de0	eal: fix strdup usages in internal config Currently, we use strdup in a few places to store command-line parameter values for certain internal config values. There are several issues with that. First of all, they're never freed, so memory ends up leaking either after EAL exit, or when these command-line options are supplied multiple times. Second of all, they're defined as `const char *`, so they cannot be freed even if we wanted to. Finally, strdup may return NULL, which will be stored in the config. For most fields, NULL is a valid value, but for the default prefix, the value is always expected to be valid. To fix all of this, three things are done. First, we change the definitions of these values to `char *` as opposed to `const char *`. This does not break the ABI, and previous code assumes constness (which is more restrictive), so it's safe to do so. Then, fix all usages of strdup to check return value, and add a cleanup function that will free the memory occupied by these strings, as well as freeing them before assigning a new value to prevent leaks when parameter is specified multiple times. And finally, add an internal API to query hugefile prefix, so that, absent of a valid value, a default value will be returned, and also fix up all usages of hugefile prefix to use this API instead of accessing hugefile prefix directly. Bugzilla ID: 108 Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-01-14 15:05:19 +01:00
Konstantin Ananyev	b73cec26cd	ipsec: fix assert condition fix invalid RTE_ASSERT condition in rsn_update_finish() Fixes: `c0308cd895` ("ipsec: rework SA replay window/SQN for MT environment") Reported-by: Ferruh Yigit <ferruh.yigit@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-01-14 14:45:02 +01:00
Bruce Richardson	efa8088663	build: fix variable name in dependency error message The variable name in the error message had an extra '_' which caused an actual meson error when the message would otherwise be printed to give meaningful information about what was going wrong. Fixes: `203b61dc5e` ("build: improve error message for missing dependency") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-01-14 12:24:57 +01:00
Konstantin Ananyev	f901d9c826	ipsec: add helpers to group completed crypto-ops Introduce helper functions to process completed crypto-ops and group related packets by sessions they belong to. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	c0308cd895	ipsec: rework SA replay window/SQN for MT environment With these changes functions: - rte_ipsec_pkt_crypto_prepare - rte_ipsec_pkt_process can be safely used in MT environment, as long as the user can guarantee that they obey multiple readers/single writer model for SQN+replay_window operations. To be more specific: for outbound SA there are no restrictions. for inbound SA the caller has to guarantee that at any given moment only one thread is executing rte_ipsec_pkt_process() for given SA. Note that it is caller responsibility to maintain correct order of packets to be processed. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	4d7ea3e145	ipsec: implement SA data-path API Provide implementation for rte_ipsec_pkt_crypto_prepare() and rte_ipsec_pkt_process(). Current implementation: - supports ESP protocol tunnel mode. - supports ESP protocol transport mode. - supports ESN and replay window. - supports algorithms: AES-CBC, AES-GCM, HMAC-SHA1, NULL. - covers all currently defined security session types: - RTE_SECURITY_ACTION_TYPE_NONE - RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO - RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL - RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL For first two types SQN check/update is done by SW (inside the library). For last two type it is HW/PMD responsibility. Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	1e0ad1e36d	ipsec: add SA data-path API Introduce Security Association (SA-level) data-path API Operates at SA level, provides functions to: - initialize/teardown SA object - process inbound/outbound ESP/AH packets associated with the given SA (decrypt/encrypt, authenticate, check integrity, add/remove ESP/AH related headers and data, etc.). Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	9f7b43141c	lib: introduce IPsec library Introduce librte_ipsec library. The library is supposed to utilize existing DPDK crypto-dev and security API to provide application with transparent IPsec processing API. That initial commit provides some base API to manage IPsec Security Association (SA) object. Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	19b08e5406	net: add ESP trailer structure definition define esp_tail structure. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Konstantin Ananyev	58a8e49a98	security: add opaque userdata pointer into security session Add 'uint64_t opaque_data' inside struct rte_security_session. That allows upper layer to easily associate some user defined data with the session. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	b1d978fc7b	cryptodev: add opaque data field to symmetric session This patch adds a opaque data field to cryptodev symmetric session. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	5d6c73dd59	cryptodev: add reference count to session private data This patch adds a refcnt field to every session private data in the cryptodev symmetric session. The counter is used to prevent freeing a symmetric session blindly before it has been cleared by every type of crypto device in use. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	9e5f5ecb5e	cryptodev: add user data size to symmetric session This patch adds a user_data_sz field to cryptodev symmetric session. The field is used to check if reading or writing the session's user data field is eligible. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	e764cd72a9	cryptodev: update symmetric session structure This patch updates the rte_cryptodev_sym_session structure for cryptodev library. The updates include a changed session private data array and an added nb_drivers field. They are used to calculate the correct session header size and ensure safe access of the session private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	0b60386ac3	cryptodev: add sym session header size function This patch adds a new API in Cryptodev Framework. The API is used to get the header size for the created symmetric Cryptodev session. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	ac5e42daca	vhost/crypto: use separate session mempools This patch uses the two session mempool approach to vhost crypto. One mempool is for session header objects, and the other is for session private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	1d6f89885e	cryptodev: add sym session mempool create This patch adds a new API "rte_cryptodev_sym_session_pool_create()" to cryptodev library. All applications are required to use this API to create sym session mempool as it adds private data and nb_drivers information to the mempool private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	725d2a7fbf	cryptodev: change queue pair configure structure This patch changes the cryptodev queue pair configure structure to enable two mempools to be passed into the cryptodev PMD simultaneously. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Eelco Chaudron	655796d2b5	meter: support RFC4115 trTCM This patch adds support for RFC4115 trTCM meters. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-01-10 00:34:09 +01:00
Thomas Monjalon	7637518249	version: 19.02-rc1 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-12-23 00:21:13 +01:00
Tonghao Zhang	03b7fd7e54	sched: fix memory leak on init failure In some cases, a sched port may be created dynamically; if an error occurs during creation, memory will leak. Fixes: `de3cfa2c98` ("sched: initial import") Cc: stable@dpdk.org Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>	2018-12-22 00:22:57 +01:00
Reshma Pattan	5d3f721009	mbuf: implement generic format for sched field This patch implements the changes proposed in the deprecation notes [1][2]. librte_mbuf changes: The mbuf->hash.sched field is updated to support generic definition in line with the ethdev traffic manager and meter APIs. The new generic format contains: queue ID, traffic class, color. Added public APIs to set and get these new fields to and from mbuf. librte_sched changes: In addition, the following API functions of the sched library have been modified with an additional parameter of type struct rte_sched_port to accommodate the changes made to mbuf sched field. (i) rte_sched_port_pkt_write() (ii) rte_sched_port_pkt_read_tree_path() librte_pipeline, qos_sched UT, qos_sched app are updated to make use of new changes. Also mbuf->hash.txadapter has been added for eventdev txq, rte_event_eth_tx_adapter_txq_set and rte_event_eth_tx_adapter_txq_get() are updated to use mbuf->hash.txadapter.txq. doc: Release notes updated. Removed deprecation notice for mbuf->hash.sched and sched API. [1] http://mails.dpdk.org/archives/dev/2018-February/090651.html [2] https://mails.dpdk.org/archives/dev/2018-November/119051.html Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>	2018-12-22 00:22:44 +01:00
Reshma Pattan	c712b01326	meter: unify packet color definition Added new rte_color definition in librte_meter to consolidate color definition which is currently replicated in various places such as rte_meter.h, rte_tm.h and rte_mtr.h Created aliases for rte_tm_color, rte_mtr_color and rte_meter_color to use new rte_color values. The definitions of rte_tm_color, rte_mtr_color and rte_meter_color will be deprecated in future. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2018-12-20 19:00:10 +01:00
Bruce Richardson	fff6df7bf5	telemetry: fix using ports of different types Different NIC ports can have different numbers of xstats on them, which means that we can't just use the xstats list from the first port registered in the telemetry library. Instead, we need to check the type of each port - by checking its ops structure pointer - and register each port type once with the metrics lib. Fixes: `fdbdb3f9ce` ("telemetry: add initial connection socket") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2018-12-22 03:23:06 +01:00
Maxime Coquelin	b473ec1131	vhost: batch used descs chains write-back with packed ring Instead of writing back descriptors chains in order, let's write the first chain flags last in order to improve batching. Also, move the write barrier in logging cache sync, so that it is done only when logging is enabled. It means there is now one more barrier for split ring when logging is enabled. With Kernel's pktgen benchmark, ~3% performance gain is measured. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	815814c4ff	vhost: remove useless prefetch for packed ring descriptor This prefetch does not show any performance improvement. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	aaf8979d6f	vhost: prefetch descriptor after the read barrier This patch moves the prefetch after the available index is read to avoid prefetching a descriptor not available yet. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	33e12d63d1	vhost: enforce desc flags and content read ordering A read barrier is required to ensure that the ordering between descriptor's flags and content reads is enforced. 1. read flags = desc->flags if (flags & AVAIL_BIT) 2. read desc->id There is a control dependency between steps 1 and step 2. 2 could be speculatively executed before 1, which could result in 'id' to not be updated yet. Fixes: `2f3225a7d6` ("vhost: add vector filling support for packed ring") Cc: stable@dpdk.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	d4ff2135eb	vhost: enforce avail index and desc read ordering A read barrier is required to ensure the ordering between available index and the descriptor reads is enforced. 1. read avail_head = avail->idx 2. read cur_idx = last_avail_idx if (cur_idx != avail_head) { 3. read idx = avail->ring[cur_idx] 4. read desc[idx] } There is a control dependency between step 1 and steps 3 & 4, 3 could be speculatively executed before 1, which could result in 'idx' to not being updated yet. Fixes: `4796ad63ba` ("examples/vhost: import userspace vhost application") Cc: stable@dpdk.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Bruce Richardson	8743d499a5	net: fix underflow for checksum of invalid IPv4 packets If we receive a packet with an invalid IP header, where the total packet length is reported as less than the IP header length, we would end up getting an underflow in the length subtraction. This could cause us to checksum e.g. 4GB of data in the case where the result of the subtraction was -1. We fix this by having the function return 0 - an invalid sum - when the length is less than the header length. Fixes: `af75078fec` ("first public release") Fixes: `6006818cfb` ("net: new checksum functions") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2018-12-21 16:22:41 +01:00
Xiao Wang	b13ad2decc	vhost: provide helpers for virtio ring relay This patch provides two helpers for vdpa device driver to perform a relay between the guest virtio ring and a mediated virtio ring. The available ring relay will synchronize the available entries, and help to do desc validity checking. The used ring relay will synchronize the used entries from mediated ring to guest ring, and help to do dirty page logging for live migration. The later patch will leverage these two helpers. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Xiao Wang	43f34e3566	vhost: provide helper for host notifier ctrl VDPA driver can decide if it needs to enable/disable the host notifier mapping, so exposing a API can allow flexibility. A later patch will base on this. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Xiao Wang	02e3b285d4	vhost: remove unused function vhost_detach_vdpa_device() is internally defined but not used, remove it in this patch. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Matthias Gatto	276d63505b	vhost: fix race condition when adding fd in the fdset fdset_add can call fdset_shrink_nolock, which calls fdset_move concurrently with the poll that is called in fdset_event_dispatch. This patch adds a mutex to prevent poll from being called at the same time as fdset_add calls fdset_shrink_nolock. Fixes: `1b815b8959` ("vhost: try to shrink pfdset when fdset_add fails") Cc: stable@dpdk.org Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Anatoly Burakov	ba731ea1dd	malloc: fix deadlock when reading stats Currently, malloc statistics and external heap creation code use memory hotplug lock as a way to synchronize accesses to heaps (as in, locking the hotplug lock to prevent list of heaps from changing under our feet). At the same time, malloc statistics code will also lock the heap because it needs to access heap data and does not want any other thread to allocate anything from that heap. In such scheme, it is possible to enter a deadlock with the following sequence of events: thread 1: rte_malloc(), take heap lock, failed to allocate, attempt to take hotplug lock; thread 2: rte_malloc_dump_stats(), take hotplug lock, attempt to take heap lock. Neither thread will be able to continue, as both of them are waiting for the other one to drop the lock. Adding an additional lock will require an ABI change, so instead of that, make malloc statistics calls thread-unsafe with respect to creating/destroying heaps. Fixes: `72cf92b318` ("malloc: index heaps using heap ID rather than NUMA node") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 15:26:43 +01:00
Honnappa Nagarahalli	d5c677db89	hash: fix out-of-bound write while freeing key slot Add a debug check for out-of-bound write while freeing the key slot. Coverity issue: 325733 Fixes: `e605a1d36c` ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2018-12-21 01:53:33 +01:00
Jeff Shaw	0f48ca429b	hash: fix return of bulk lookup The __rte_hash_lookup_bulk() function returns void, and therefore should not return with an expression. This commit fixes the following compiler warning when attempting to compile with "-pedantic -std=c11". warning: ISO C forbids ‘return’ with expression, in function returning void [-Wpedantic] Fixes: `9eca8bd7a6` ("hash: separate lock-free and r/w lock lookup") Cc: stable@dpdk.org Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2018-12-21 01:41:18 +01:00
Liang Ma	e6c6dc0f96	power: add p-state driver compatibility Previously, in order to use the power library, it was necessary for the user to disable the intel_pstate driver by adding “intel_pstate=disable” to the kernel command line for the system, which causes the acpi_cpufreq driver to be loaded in its place. This patch adds the ability for the power library use the intel-pstate driver. It adds a new suite of functions behind the current power library API, and will seamlessly set up the user facing API function pointers to the relevant functions depending on whether the system is running with acpi_cpufreq kernel driver, intel_pstate kernel driver or in a guest, using kvm. The library API and ABI is unchanged. Signed-off-by: Liang Ma <liang.j.ma@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2018-12-21 01:33:59 +01:00
Qi Zhang	85d6815fa6	eal: close multi-process socket during cleanup When a secondary process quits, the mp_socket* file still exists, which causes rte_mp_request_sync to fail when it tries to send a message on a floating socket. The patch fixes the issue by introducing a function rte_mp_channel_cleanup. This function will be called by rte_eal_cleanup and it will close the mp socket and delete the mp_socket* file. Fixes: `bacaa27540` ("eal: add channel for multi-process communication") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2018-12-21 01:15:41 +01:00
Anatoly Burakov	9d65053761	eal: add 64-bit log2 function Add missing implementation for 64-bit log2 function, and extend the unit test to test this new function. Also, remove duplicate reimplementation of this function from testpmd and memalloc. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:23:49 +01:00
Anatoly Burakov	43c9e6c205	eal: add 64-bit fls function Add missing implementation for 64-bit fls function, and extend unit test to test the new function as well. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:17:43 +01:00
Anatoly Burakov	4e261f5519	eal: add 64-bit bsf and 32-bit safe bsf functions Add an rte_bsf64 function that follows the convention of existing rte_bsf32 function. Also, add missing implementation for safe version of rte_bsf32, and implement unit tests for all recently added bsf varieties. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:00:58 +01:00
Anatoly Burakov	cc7ddb00da	bitmap: remove deprecated 64-bit bsf function The function rte_bsf64 was deprecated in a previous release, so remove the function, and the deprecation notice associated with it. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 23:44:56 +01:00
Anatoly Burakov	307315d457	eal: fix runtime directory cleanup in noshconf mode When using --no-shconf or --in-memory modes, there is no runtime directory to be created, so there is no point in attempting to clean it. Fixes: `0a529578f1` ("eal: clean up unused files on initialization") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 23:27:35 +01:00
Anatoly Burakov	c75f535ac5	mem: use memfd for no-huge mode When running in no-huge mode, we anonymously allocate our memory. While this works for regular NICs and vdev's, it's not suitable for memory sharing scenarios such as virtio with vhost_user backend. To fix this, allocate no-huge memory using memfd, and register it with memalloc just like any other memseg fd. This will enable using rte_memseg_get_fd() API with --no-huge EAL flag. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:58:25 +01:00
Anatoly Burakov	df7722c75b	mem: allow setting up segment list fd Currently, only segment fd's for multi-file segments are supported, while for memfd-backed no-huge memory we need single-file segments mode. Add support for single-file segments in the internal API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:55:56 +01:00
Anatoly Burakov	d75eea3145	mem: check for memfd support in segment fd API If memfd support was not compiled, or hugepage memfd support is not available at runtime, the API will now return proper error code, indicating that this API is unsupported. This changes the API, so document the changes. Fixes: `41dbdb6872` ("mem: add external API to retrieve page fd") Fixes: `3a44687139` ("mem: allow querying offset into segment fd") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:54:37 +01:00
Anatoly Burakov	525670756a	mem: fix segment fd API error code for external segment Segment fd API does not support getting segment fd's from externally allocated memory, so return proper error code on any attempts to do so. This changes API behavior, so document the change as well. Fixes: `5282bb1c36` ("mem: allow memseg lists to be marked as external") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:51:49 +01:00
Anatoly Burakov	bed7941886	mem: allow usage of non-heap external memory in multiprocess Add multiprocess support for externally allocated memory areas that are not added to DPDK heap (and add relevant doc sections). Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:14:55 +01:00
Anatoly Burakov	950e8fb4e1	mem: allow registering external memory areas The general use-case of using external memory is well covered by existing external memory API's. However, certain use cases require manual management of externally allocated memory areas, so this memory should not be added to the heap. It should, however, be added to DPDK's internal structures, so that API's like ``rte_virt2memseg`` would work on such external memory segments. This commit adds such an API to DPDK. The new functions will allow to register and unregister externally allocated memory areas, as well as documentation for them. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:14:55 +01:00
Anatoly Burakov	39ff94e71c	malloc: separate destroying memseg list and heap data Currently, destroying external heap chunk and its memseg list is part of one process. When we will gain the ability to unregister external memory from DPDK that doesn't have any heap structures associated with it, we need to be able to find and destroy memseg lists as well as heap data separately. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:10:08 +01:00
Anatoly Burakov	0f526d674f	malloc: separate creating memseg list and malloc heap Currently, creating external malloc heap involves also creating a memseg list backing that malloc heap. We need to have them as separate functions, to allow creating memseg lists without creating a malloc heap. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:09:55 +01:00
Anatoly Burakov	646e5260ee	malloc: make alignment requirements more stringent The external heaps API already implicitly expects start address of the external memory area to be page-aligned, but it is not enforced or documented. Fix this by implementing additional parameter checks at memory add call, and document the page alignment requirement explicitly. Fixes: `7d75c31014` ("malloc: allow adding memory to named heaps") Cc: stable@dpdk.org Suggested-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 15:34:03 +01:00
Anatoly Burakov	b3e735e16e	malloc: fix duplicate mem event notification We already trigger a mem event notification inside the walk function, no need to do it twice. Fixes: `f32c7c9de9` ("malloc: enable event callbacks for external memory") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:28:55 +01:00
Seth Howell	fba0ca2274	malloc: notify primary process about hotplug in secondary When secondary process hotplugs memory, it sends a request to primary, which then performs the real mmap() and sends sync requests to all secondary processes. Upon receiving such sync request, each secondary process will notify the upper layers of hotplugged memory (and will call all locally registered event callbacks). In the end we'll end up with memory event callbacks fired in all the processes except the primary, which is a bug. This gets critical if memory is hotplugged while a VFIO device is attached, as the VFIO memory registration - which is done from a memory event callback present in the primary process only - is never called. After this patch, a primary process fires memory event callbacks before secondary processes start their synchronizations - both for hotplug and hotremove. Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Seth Howell <seth.howell@intel.com> Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:25:34 +01:00
Yongseok Koh	6d09256148	malloc: fix finding maximum contiguous IOVA size malloc_elem_find_max_iova_contig() could return invalid size due to a missing sanity check. The following gdb output shows how 'cur_size' can be invalid in find_biggest_element(). (gdb) p/x cur_size $4 = 0xffffffffffe42900 (gdb) p elem $1 = (struct malloc_elem *) 0x12e842000 (gdb) p *elem $2 = {heap = 0x7ffff7ff387c, prev = 0x12e831fc0, next = 0x12e842900, free_list = {le_next = 0x109538000, le_prev = 0x7ffff7ff3894}, msl = 0x7ffff7ff107c, state = ELEM_FREE, pad = 0, size = 2304} (gdb) p *elem->msl $5 = {{base_va = 0x100200000, addr_64 = 4297064448}, page_sz = 2097152, socket_id = 0, version = 790, len = 17179869184, external = 0, memseg_arr = {name = "memseg-2048k-0-0", '\000' <repeats 47 times>, count = 493, len = 8192, elt_sz = 48, data = 0x10002e000, rwlock = {cnt = 0}}} Fixes: `9fe6bceafd` ("malloc: add finding biggest free IOVA-contiguous element") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:17:48 +01:00
Jim Harris	476c847ab6	malloc: add option --match-allocations SPDK uses the rte_mem_event_callback_register API to create RDMA memory regions (MRs) for newly allocated regions of memory. This is used in both the SPDK NVMe-oF target and the NVMe-oF host driver. DPDK creates internal malloc_elem structures for these allocated regions. As users malloc and free memory, DPDK will sometimes merge malloc_elems that originated from different allocations that were notified through the registered mem_event callback routine. This results in subsequent allocations that can span across multiple RDMA MRs. This requires SPDK to check each DPDK buffer to see if it crosses an MR boundary, and if so, would have to add considerable logic and complexity to describe that buffer before it can be accessed by the RNIC. It is somewhat analogous to rte_malloc returning a buffer that is not IOVA-contiguous. As a malloc_elem gets split and some of these elements get freed, it can also result in DPDK sending an RTE_MEM_EVENT_FREE notification for a subset of the original RTE_MEM_EVENT_ALLOC notification. This is also problematic for RDMA memory regions, since unregistering the memory region is all-or-nothing. It is not possible to unregister part of a memory region. To support these types of applications, this patch adds a new --match-allocations EAL init flag. When this flag is specified, malloc elements from different hugepage allocations will never be merged. Memory will also only be freed back to the system (with the requisite memory event callback) exactly as it was originally allocated. Since part of this patch is extending the size of struct malloc_elem, we also fix up the malloc autotests so they do not assume its size exactly fits in one cacheline. Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 13:01:08 +01:00
Gao Feng	cc80353223	memzone: fix unlock on initialization failure The RTE_PROC_PRIMARY error handler lost the unlock statement in the current code. Now unlock and return in one place to fix it. Fixes: `49df3db848` ("memzone: replace memzone array with fbarray") Cc: stable@dpdk.org Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 12:24:14 +01:00
Gao Feng	32fa7f8913	eal: check peer allocation in multi-process request Add the check for null peer pointer like the bundle pointer in the mp request handler. They should follow same style. And add some logs for nomem cases. Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 00:01:28 +01:00
Gao Feng	e14bc93e8f	eal: fix leak on multi-process request error When rte_eal_alarm_set failed, need to free the bundle mem in the error handler of handle_primary_request and handle_secondary_request. Fixes: `244d513071` ("eal: enable hotplug on multi-process") Fixes: `ac9e4a1737` ("eal: support attach/detach shared device from secondary") Cc: stable@dpdk.org Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 00:01:28 +01:00
Gaetan Rivet	c9b413c3b1	eal: fix detection of duplicate option register Missing brackets around the if means that the loop will end at its first iteration. Fixes: `2395332798` ("eal: add option register infrastructure") Cc: stable@dpdk.org Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2018-12-20 00:01:28 +01:00
Keith Wiles	e3b090f3da	eal: fix missing newline in a log Add a missing newline to a RTE_LOG message. Fixes: `2395332798` ("eal: add option register infrastructure") Cc: stable@dpdk.org Signed-off-by: Keith Wiles <keith.wiles@intel.com>	2018-12-20 00:01:28 +01:00
Chas Williams	7a838c8798	ip_frag: fix IPv6 when MTU sizes not aligned to 8 bytes The same issue was fixed for the ipv4 version of this routine in commit `8d4d3a4f73` ("ip_frag: handle MTU sizes not aligned to 8 bytes"). Briefly, the size of an ipv6 header is always 40 bytes. With an MTU of 1500, this will never produce a multiple of 8 bytes for the frag_size and this routine can never succeed. Since RTE_ASSERTS are disabled by default, this failure is typically ignored. To fix this, round down to the nearest 8 bytes and use this when producing the fragments. Fixes: `0aa31d7a59` ("ip_frag: add IPv6 fragmentation support") Cc: stable@dpdk.org Signed-off-by: Chas Williams <chas3@att.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2018-12-19 22:40:08 +01:00
Konstantin Ananyev	d5b46fc363	rwlock: introduce try semantics Introduce rte_rwlock_read_trylock() and rte_rwlock_write_trylock(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2018-12-19 20:56:11 +01:00
Erik Gabriel Carrillo	7079e29f7f	timer: fix race condition rte_timer_manage() adds expired timers to a "run list", and walks the list, transitioning each timer from the PENDING to the RUNNING state. If another lcore resets or stops the timer at precisely this moment, the timer state would instead be set to CONFIG by that other lcore, which would cause timer_manage() to skip over it. This is expected behavior. However, if a timer expires quickly enough, there exists the following race condition that causes the timer_manage() routine to misinterpret a timer in CONFIG state, resulting in lost timers: - Thread A: - starts a timer with rte_timer_reset() - the timer is moved to CONFIG state - the spinlock associated with the appropriate skiplist is acquired - timer is inserted into the skiplist - the spinlock is released - Thread B: - executes rte_timer_manage() - find above timer as expired, add it to run list - walk run list, see above timer still in CONFIG state, unlink it from run list and continue on - Thread A: - move timer to PENDING state - return from rte_timer_reset() - timer is now in PENDING state, but not actually linked into a pending list or a run list and will never get processed further by rte_timer_manage() This commit fixes this race condition by only releasing the spinlock after the timer state has been transitioned from CONFIG to PENDING, which prevents rte_timer_manage() from seeing an incorrect state. Fixes: `9b15ba895b` ("timer: use a skip list") Cc: stable@dpdk.org Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com>	2018-12-19 20:56:09 +01:00
Amr Mokhtar	56b878b0ba	bbdev: add missing experimental tags and map entries - add missing APIs to map file - add experimental tag to all bbdev APIs Signed-off-by: Amr Mokhtar <amr.mokhtar@intel.com>	2018-12-19 19:36:53 +01:00
Kamil Chalupnik	0b98d574e3	bbdev: enhance throughput test Improvements added to throughput test: - test is run in loop (number of iterations is specified by TEST_REPETITIONS define) which ensures more accurate results - length of input data is calculated based on amount of CBs in TB - maximum number of decoding iterations is gathered from results - added new functions responsible for printing results - small fixes for memory management Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>	2018-12-19 11:19:10 +01:00
Kamil Chalupnik	9fa6ebde8e	bbdev: enhance offload cost test Offload cost test was improved in order to collect more accurate results. Signed-off-by: Kamil Chalupnik <kamilx.chalupnik@intel.com> Acked-by: Amr Mokhtar <amr.mokhtar@intel.com>	2018-12-19 11:19:10 +01:00
Lee Daly	9d3e1cb135	compressdev: fix structure comment Fixes incorrect comment on compressdev rte_comp_op structure element. Comment needed to be updated to be compliant with the use of chained mbufs. Fixes: `f87bdc1ddc` ("compressdev: add compression specific data") Cc: stable@dpdk.org Signed-off-by: Lee Daly <lee.daly@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com>	2018-12-19 11:19:10 +01:00
Fiona Trahe	5eb0d610a5	compressdev: add bulk free operation API There's an API to bulk allocate operations, this adds a corresponding bulk free API. Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shally.verma@caviumnetworks.com> Acked-by: Lee Daly <lee.daly@intel.com>	2018-12-19 11:19:10 +01:00
Nikhil Rao	5bd4ae2d77	eventdev: fix eth Tx adapter queue count checks rte_event_eth_tx_adapter_queue_add() - add a check that returns an error if the ethdev has zero Tx queues configured. rte_event_eth_tx_adapter_queue_del() - remove the checks for ethdev queue count, instead check for queues added to the adapter which may be different from the current ethdev queue count. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>	2018-12-17 20:25:10 +01:00
Gage Eads	1f7a110269	eventdev: fix xstats documentation typo The eventdev extended stats documentation referred to two non-existent functions, rte_eventdev_xstats_get and rte_eventdev_get_xstats_by_name. Fixes: `3ed7fc039a` ("eventdev: add extended stats") Cc: stable@dpdk.org Signed-off-by: Gage Eads <gage.eads@intel.com>	2018-12-16 18:28:07 +01:00
Erik Gabriel Carrillo	ac0fc54a49	eventdev: remove redundant timer adapter function prototypes Fixes: `6750b21bd6` ("eventdev: add default software timer adapter") Cc: stable@dpdk.org Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2018-12-16 17:22:14 +01:00
Nikhil Rao	91c1667da0	eventdev: fix error log in eth Rx adapter strerror() input parameter should be > 0. Coverity issue: 302864 Fixes: `3810ae4357` ("eventdev: add interrupt driven queues to Rx adapter") Cc: stable@dpdk.org Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-12-16 17:22:14 +01:00
Jiayu Hu	f8a05885e7	gro: fix overflow of payload length calculation When the packet length is smaller than the header length, the calculated payload length will be overflowed and result in incorrect reassembly behaviors. Fixes: `1e4cf4d6d4` ("gro: cleanup") Fixes: `9e0b9d2ec0` ("gro: support VxLAN GRO") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>	2018-12-19 04:29:57 +01:00
Anatoly Burakov	0a529578f1	eal: clean up unused files on initialization When creating process data structures, EAL will create many files in EAL runtime directory. Because we allow multiple secondary processes to run, each secondary process gets their own unique file. With many secondary processes running and exiting on the system, runtime directory will, over time, create enormous amounts of sockets, fbarray files and other stuff that just sits there unused because the process that allocated it has died a long time ago. This may lead to exhaustion of disk (or RAM) space in the runtime directory. Fix this by removing every unlocked file at initialization that matches either socket or fbarray naming convention. We cannot be sure of any other files, so we'll leave them alone. Also, remove similar code from mp socket code. We do it at the end of init, rather than at the beginning, because secondary process will use primary process' data structures even if the primary itself has died, and we don't want to remove those before we lock them. Bugzilla ID: 106 Cc: stable@dpdk.org Reported-by: Vipin Varghese <vipin.varghese@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-19 04:12:30 +01:00
David Marchand	a8499f65a1	log: add missing experimental tag When rte_log_register_type_and_pick_level() has been introduced, it has been correctly added to the EXPERIMENTAL section of the eal map and the symbol itself has been marked at its definition. However, the declaration of this symbol in rte_log.h is missing the __rte_experimental tag. Because of this, a user can try to call this symbol without being aware this is an experimental api (neither compilation nor link warning). Fixes: `b22e77c026` ("eal: register log type and pick level from args") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2018-12-19 02:30:02 +01:00
Jeff Shaw	68687daff2	eal: remove unnecessary dirent.h include Prior to this patch, the two affected .c files include <dirent.h> unnecessarily. This commit removes the include lines. Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Reviewed-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-12-19 01:29:36 +01:00
Tiwei Bie	e9436f54af	pdump: remove deprecated APIs We already changed to use generic IPC in pdump since below commit: commit `660098d61f` ("pdump: use generic multi-process channel") The `rte_pdump_set_socket_dir()`, the `path` parameter of `rte_pdump_init()` and the `enum rte_pdump_socktype` have been deprecated since then. This commit removes these deprecated APIs and also bumps the pdump ABI. Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2018-12-19 01:25:56 +01:00
Ilya Maximets	48cae0bfa6	vhost: fix double read of descriptor flags Flags could be updated in a separate process leading to the inconsistent check. Additionally, read marked as 'volatile' to highlight the shared nature of the variable and avoid such issues in the future. Fixes: `d3211c98c4` ("vhost: add helpers for packed virtqueues") Cc: stable@dpdk.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-13 18:17:42 +00:00
Maxime Coquelin	cf14478d77	vhost: fix crash after mmap failure If mmap() call fails in vhost_user_set_mem_table, dev->mem is set to NULL. If later, qva_to_vva() is called, a segfault occurs. Fixes: `8f972312b8` ("vhost: support vhost-user") Cc: stable@dpdk.org Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2018-12-13 17:56:21 +00:00
Yaroslav Brustinov	b4b896fcfe	ethdev: fix typo in queue setup error log '=' should be '>=' for '[rt]x_desc_lim.nb_min' check. Fixes: `386c993e95` ("ethdev: add a missing sanity check for Tx queue setup") Fixes: `80a1deb4c7` ("ethdev: add API to retrieve queue information") Cc: stable@dpdk.org Signed-off-by: Yaroslav Brustinov <ybrustin@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-12-13 17:45:59 +00:00
Thomas Monjalon	37d800031d	version: 19.02-rc0 Start version numbering for a new release cycle, and introduce a template file for release notes. The release notes comments are updated to mandate a scope label for API and ABI changes. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2018-11-30 16:20:33 +00:00
Thomas Monjalon	0da7f445df	version: 18.11.0 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-27 00:36:00 +01:00
Thomas Monjalon	c5f21bdae4	fix indentation in symbol maps Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Allain Legacy <allain.legacy@windriver.com>	2018-11-26 20:16:46 +01:00
Anatoly Burakov	e45088b1e1	mem: fix division by zero in no-NUMA mode When RTE_EAL_NUMA_AWARE_HUGEPAGES is set to "n", not all memtypes will be valid, because we skip some due to not supporting other NUMA nodes, leading to a division by zero error down the line because the necessary memtype fields weren't populated. Fix it by limiting number of memtypes to number of memtypes we have actually created. Fixes: `1dd342d0fd` ("mem: improve segment list preallocation") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2018-11-26 15:35:46 +01:00
Thomas Monjalon	6cff3183c2	version: 18.11-rc5 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-25 21:19:19 +01:00
Darek Stojaczyk	161419983d	eal: fix devargs reference after probing failure Even if a device failed to plug, it's still a device object that references the devargs. Those devargs will be freed automatically together with the device, but freeing them any earlier - like it's done in the hotplug error handling path right now - will give us a dangling pointer and a segfault scenario. Consider the following case: * secondary process receives the hotplug request IPC message * devargs are either created or updated * the bus is scanned * a new device object is created with the latest devargs * the device can't be plugged for whatever reason, bus->plug returns error * the devargs are freed, even though they're still referenced by the device object on the bus For PCI devices, the generic device name comes from a buffer within the devargs. Freeing those will make EAL segfault whenever the device name is checked. This patch just prevents the hotplug error handling path from removing the devargs when there's a device that references them. This is done by simply exiting early from the hotplug function. As mentioned in the beginning, those devargs will be freed later, together with the device itself. Fixes: `7e8b266501` ("eal: fix hotplug add / remove") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-25 13:45:35 +01:00
Darek Stojaczyk	29bf7e93ba	eal: fix devargs leak on multi-process detach request Device detach triggered through IPC leaked some memory. It allocated a devargs object just to use it for parsing the devargs string in order to retrieve the device name. Those devargs weren't passed anywhere and were never freed. First of all, let's put those devargs on the stack, so they don't need to be freed. Then free the additional arguments string as soon as it's allocated, because we won't need it. Fixes: `ac9e4a1737` ("eal: support attach/detach shared device from secondary") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2018-11-25 13:32:01 +01:00
Darek Stojaczyk	494db286f3	eal: fix multi-process hotplug if attached in secondary Consider the following scenario: 1) primary process (A) starts, probes the bus 2) a secondary process (B) starts, probes the bus 3) yet another secondary process (C) starts 4) (C) registers the pci driver and hotplugs the device * an IPC attach req is sent to the primary (A) * (A) ignores the -EEXIST from process-local probe * (A) propagates the request to all secondary processes * (B) responds with -EEXIST * (A) replies to the original request with the -EEXIST return code * the -EEXIST is returned back to the user, although the device was successfully attached both locally and in all other processes This patch makes the primary process reply with rc=0 even if there was another secondary process with the device already attached. The primary process already didn't reply with -EEXIST when the device was attached locally, so now this behavior is even more consistent. Looking at the code, this seems to be the originally intended behavior. Fixes: `ac9e4a1737` ("eal: support attach/detach shared device from secondary") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>	2018-11-25 13:27:17 +01:00
Darek Stojaczyk	d27eed3139	eal: fix multi-process hotplug if already probed When primary process receives an IPC attach request of a device that's already locally-attached, it doesn't setup its variables properly and is prone to segfaulting on a subsequent rollback. `ret = local_dev_probe(req->devargs, &dev)` The above function will set `dev` pointer to the proper device unless it returns with error. One of those errors is -EEXIST, which the hotplug function explicitly ignores. For -EEXIST, it proceeds with attaching the device and expects the dev pointer to be valid. This patch makes `local_dev_probe` set the dev pointer even if it returns -EEXIST. Fixes: `ac9e4a1737` ("eal: support attach/detach shared device from secondary") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2018-11-25 13:22:51 +01:00
Darek Stojaczyk	5d36bf2bcd	eal: fix multi-process hotplug rollback If a device fails to attach before it's plugged, the subsequent rollback will still try to detach it, causing a segfault. Unplugging a device that wasn't plugged isn't really supported, so this patch adds an extra error check to prevent that from happening. While here, fix this also for normal (non-rollback) detach, which could also theoretically segfault on non-plugged device. Fixes: `244d513071` ("eal: enable hotplug on multi-process") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2018-11-25 13:15:34 +01:00
Ilya Maximets	9e8b90fc6d	eal/bsd: fix possible IOPL fd leak If rte_eal_iopl_init() will be called more than once we'll leak the file descriptor. Fixes: `b46fe31862` ("eal/bsd: fix virtio on FreeBSD") Cc: stable@dpdk.org Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-25 11:44:25 +01:00
Maxime Coquelin	5a12b67e74	vhost: fix packed ring constants declaration The packed ring defines were declared only if kernel header does not declare them. The problem is that they are not applied in upstream kernel, and some changes in the names have been required. This patch declares the defines unconditionally, which fixes potential build issues. Fixes: `297b1e7350` ("vhost: add virtio packed virtqueue defines") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-22 23:06:26 +01:00
Ferruh Yigit	8461a5bb70	ethdev: remove unused deferred device state The DEFERRED state was replaced by the ownership concept and is no longer used, as the code comment states. The ethdev ABI is broken in this release, so use this opportunity to remove the DEFERRED state. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Matan Azrad <matan@mellanox.com>	2018-11-21 16:11:14 +01:00
Akhil Goyal	f63ffee26f	security: restore experimental tag for unimplemented APIs Following APIs are not currently implemented by any of the drivers, so marking them as rte_experimental again. Fixes: `1a81dce780` ("security: remove experimental tag") rte_security_get_userdata; rte_security_session_stats_get; rte_security_session_update; Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>	2018-11-23 02:03:33 +01:00
Nikhil Rao	e846cfdec3	eventdev: fix unlock in Rx adapter In the eth Rx adapter SW service function, move the return to after the spinlock unlock. Coverity issue: 302857 Fixes: `a66a837446` ("eventdev: fix Rx SW adapter stop") Cc: stable@dpdk.org Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2018-11-23 02:03:33 +01:00
Thomas Monjalon	6b8d9a4b4c	eventdev: fix possible uninitialized variable When compiling with -O1, this error can appear: lib/librte_eventdev/rte_event_eth_tx_adapter.c:705:6: error: ‘ret’ may be used uninitialized in this function If tx_queue_id is -1 and nb_queues is 0, then ret is returned without being initialized. It is fixed by setting 0 as initial value. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-23 01:43:42 +01:00
Thomas Monjalon	a17842c142	kni: fix possible uninitialized variable This error can be raised: lib/librte_kni/rte_kni.c:531:15: error: 'req' may be used uninitialized in this function It should not happen because kni_fifo_get() would return 0 if req is not initialized, so the function would return before using req. But GCC complains about it in -O1 optimization, and a NULL initialization is harmless here. Fixes: `3fc5ca2f63` ("kni: initial import") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-23 01:43:35 +01:00
Thomas Monjalon	e357e8ebd9	eal: fix build with -O1 In case of optimized compilation, RTE_BUILD_BUG_ON use an external variable which is neither defined, nor used. It seems not optimized out in case of OPDL compiled with clang -O1: opdl_ring.c: undefined reference to `RTE_BUILD_BUG_ON_detected_error' clang-6.0: fatal error: linker command failed with exit code 1 Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-23 01:43:32 +01:00
Anatoly Burakov	509cc88513	eal: deprecate and rename bsf64 function Rename rte_bsf64 to rte_bsf64_safe (this is a "safe" version in that it prevents undefined behavior by checking if incoming parameter is zero) and move it to common header. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-23 01:43:31 +01:00
Anatoly Burakov	816c924e9e	eal: remove useless code in bsf64 function RTE_BITMAP_OPTIMIZATIONS was never set to 0 and makes no sense anyway, so remove all code related to it. Also, drop the "likely" for bsf64 code, because it's a generic function and we cannot make any assumptions about likely values of incoming arguments. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2018-11-23 01:43:26 +01:00
Anatoly Burakov	615fcf55d2	ipc: fix access after async request failure Previous fix for rte_panic has moved setting of alarm before sending the message. This means that whether or not we send a message, the alarm would still trigger. The comment noted that cleanup would happen in the alarm handler, but that's not what actually happened - instead, in the event of failed send we freed the memory in-place, before putting the request on the queue. This works OK when the message is sent, but when sending the message fails, the alarm would still trigger with a pointer argument that points to non-existent memory, and cause memory corruption. There probably is a "proper" fix for this issue, with correct handling of sent vs. unsent requests, however it would be simpler just to sacrifice the sent request in the (extremely unlikely) event of alarm set failing. The other process would still send a response, but it will be ignored by the sender. Fixes: `45e5f49e87` ("ipc: remove panic in async request") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-23 01:43:24 +01:00
Thomas Monjalon	d82e5db6f6	version: 18.11-rc4 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-19 01:40:54 +01:00
Akhil Goyal	1a81dce780	security: remove experimental tag rte_security has been experimental since DPDK 17.11 release. Now the library has matured and the experimental tag is removed in this patch. Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Anoob Joseph <anoob.joseph@caviumnetworks.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Boris Pismenny <borisp@mellanox.com>	2018-11-18 22:31:30 +01:00
Jeff Guo	c48407e8af	eal: fix deadlock in hot-unplug When a device is hot-unplugged, the hot-unplug handler is invoked by the uio remove event and the device is detached; the kernel then sends another pci remove event. So if any unlock is missed, it will cause a deadlock. This patch adds the missing unlock to the hot-unplug handler. Fixes: `0fc54536b1` ("eal: add failure handling for hot-unplug") Signed-off-by: Jeff Guo <jia.guo@intel.com>	2018-11-18 17:16:40 +01:00
Chaitanya Babu Talluri	f493119397	efd: fix write unlock during ring creation In rte_efd_create() write lock has already been unlocked before ring creation itself. So second unlock after the ring creation has been removed. Fixes: `56b6ef874f` ("efd: new Elastic Flow Distributor library") Cc: stable@dpdk.org Signed-off-by: Chaitanya Babu Talluri <tallurix.chaitanya.babu@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-11-18 15:46:02 +01:00
David Wilder	6b062d56bc	mem: fix anonymous mapping on Power9 Removed the use of MAP_HUGETLB for anonymous mapping on ppc64. The MAP_HUGETLB had previously been added to workaround issues on IBM Power8 systems when mapping /dev/zero. In the current code the MAP_HUGETLB flag will cause the anonymous mapping to fail on Power9. Note, Power8 is currently failing to correctly mmap Hugepages, with and without this change. Fixes: `284ae3e9ff` ("eal/ppc: fix mmap for memory initialization") Signed-off-by: David Wilder <dwilder@us.ibm.com> Reviewed-by: Pradeep Satyanarayana <pradeep@us.ibm.com>	2018-11-18 14:42:18 +01:00
Anatoly Burakov	71aae4b421	malloc: fix adjacency check to also include segment list It may so happen that two memory locations may be adjacent in virtual memory, but belong to different segment lists. With current code, such segments will be concatenated. Fix the adjacency checking code to also check if the adjacent malloc elements belong to the same memseg list. Fixes: `66cc45e293` ("mem: replace memseg with memseg lists") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-18 14:15:04 +01:00
Anatoly Burakov	32fc0fa00e	mem: check for contiguousness in external segments For IOVA as VA mode, we assume that memory is contiguous. However, for external segments that assumption may not necessarily hold. Fix the code to not assume that external memory segments are contiguous even in IOVA as VA mode. Fixes: `5282bb1c36` ("mem: allow memseg lists to be marked as external") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-18 14:12:20 +01:00
Kevin Laatz	2ddd89c3c6	eal: fix duplicate function declaration The rte_eal_get_runtime_dir() function is currently being declared in two header files. This API was made public in commit `6911c9fd8f` ("eal: export function to get runtime directory"), adding it to rte_eal.h. To make it public, the 'rte' prefix was added to the function so it needed to be modified in the original location of the declaration, eal_filesystem.h. By only modifying, and not removing the declaration, it is now a duplicate. This patch removes the declaration from eal_filesystem.h. Fixes: `6911c9fd8f` ("eal: export function to get runtime directory") Reported-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-18 13:40:26 +01:00
Thomas Monjalon	3e42b6ce06	version: 18.11-rc3 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-11-14 05:05:29 +01:00
Fan Zhang	1c25cf4a1c	pipeline: fix logically dead code This patch fixes the Coverity issue of logically dead code. Coverity issue: 323523 Fixes: `96303217a6` ("pipeline: add symmetric crypto table action") Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>	2018-11-12 17:45:23 +01:00
Ferruh Yigit	68b931bff2	ethdev: eliminate interim variable `local_conf` variable was needed for offload conversions but no more required. No functional difference, only interim variable eliminated. Fixes: `ab3ce1e0c1` ("ethdev: remove old offload API") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2018-11-14 00:35:53 +01:00
Wenzhuo Lu	1a411a6fdb	ethdev: fix device info getting The device information cannot be gotten correctly before the configuration is set. Because on some NICs the information has dependence on the configuration. Fixes: `3be82f5cc5` ("ethdev: support PMD-tuned Tx/Rx parameters") Cc: stable@dpdk.org Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2018-11-14 00:35:53 +01:00
Wenzhuo Lu	aa28ec5d27	ethdev: fix invalid configuration after failure The new configuration is stored during the rte_eth_dev_configure() API but the API may fail. After failure stored configuration will be invalid since it is not fully applied to the device. We better roll the configuration back after failure. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2018-11-14 00:35:53 +01:00
Tiwei Bie	0541588a44	vhost: remove unneeded null pointer check The caller will guarantee that msg won't be null. Remove the unneeded null pointer check which caused a Coverity warning. Coverity issue: 323484 Fixes: `8f972312b8` ("vhost: support vhost-user") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-14 00:35:53 +01:00
Fan Zhang	cd1e8f03ab	vhost/crypto: fix packet copy in chaining mode This patch fixes the incorrect packet content copy in the chaining mode. Originally the content before cipher offset is overwritten by all zeros. This patch fixes the problem by making sure the correct write back source and destination settings during set up. Fixes: `3bb595ecd6` ("vhost/crypto: add request handler") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-14 00:35:53 +01:00
Tiwei Bie	30affaeebc	vhost: fix IOVA access for packed ring We should apply for RO access when receiving packets from the VM and apply for RW access when sending packets to the VM. Fixes: `a922401f35` ("vhost: add Rx support for packed ring") Fixes: `ae999ce49d` ("vhost: add Tx support for packed ring") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-11-14 00:35:53 +01:00
Bruce Richardson	f98a95102d	eal/x86: move header to standard BSD license This updates the license on the rte_rtm.h file to be the standard BSD-3-Clause license used for the rest of DPDK, thus bringing the file in compliance with the DPDK licensing policy. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2018-11-14 01:44:14 +01:00
Bruce Richardson	e5f9a65147	eal/x86: reduce contention when retrying TSX When TSX transactions abort, it is generally worth retrying a number of times before falling back to the traditional locking path, as the parallelism benefits from TSX can be worth it when a transaction does succeed. For cases with multiple threads and high contention rates, it can be useful to have increasing delays between retry attempts, so as to avoid having the same threads repeatedly collide. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2018-11-14 01:03:21 +01:00
Yipeng Wang	606bd11736	hash: fix TSX aborts with newer gcc gcc 7 and 8 with O3 will generate vzeroupper from rte_memcpy into TSX region which may abort the TSX transaction. This fix changes rte_memcpy to memcpy which will not insert extra vzeroupper into the library. Fixes: `f2e3001b53` ("hash: support read/write concurrency") Cc: stable@dpdk.org Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2018-11-14 01:02:07 +01:00
Anatoly Burakov	45e5f49e87	ipc: remove panic in async request EAL should not crash when setting alarm fails. Also, remove the profanity in error message. Fixes: `daf9bfca71` ("ipc: remove thread for async requests") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-11-14 00:01:38 +01:00
Konstantin Ananyev	95df7307a7	bpf: fix x86 JIT for immediate loads x86 JIT can generate invalid code for (BPF_LD | BPF_IMM | EBPF_DW) instructions, when immediate value is bigger than INT32_MAX. Fixes: `cc752e43e0` ("bpf: add JIT compilation for x86_64 ISA") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2018-11-13 23:18:53 +01:00
Thomas Monjalon	31f19a9beb	pci: fix parsing of address without function number If the last part of the PCI address (function number) is missing, the parsing was successful, assuming function 0. The call to strtoul is not returning an error in such a case, so an explicit check is inserted before. This bug has always been there in older parsing macros: - GET_PCIADDR_FIELD - GET_BLACKLIST_FIELD Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Reported-by: Wisam Jaddo <wisamm@mellanox.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-11-13 17:59:42 +01:00
Honnappa Nagarahalli	9eca8bd7a6	hash: separate lock-free and r/w lock lookup The lock-free algorithm has caused significant lookup performance regression for certain use cases. The regression is attributed to the use of non-relaxed memory orderings. Two versions of the lookup functions are created: one that uses the RW lock and one that is lock-free. This restores the performance for use cases that used the RW lock version of the lookup function. Fixes: `e605a1d36` ("hash: add lock-free r/w concurrency") Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2018-11-13 17:34:44 +01:00

5391 Commits