numam-spdk
Alexey Marchuk 8a0f9cf3a7 nvme_tcp: Fix icreq/icresp handing with zcopy enabled.
There is a problem with TCP zcopy enabled:
1. TCP initiator sends icreq and starts polling a qpair. Polling of the qpair
actively calls the nvme_tcp_read_pdu function.
2. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_CH state,
it reads 8 bytes of common PDU header. It determines the type of the PDU
and finds the size of PDU_PSH header.
3. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state.
It should read 120 bytes of icresp PDU. The number of bytes that need to be
read is pdu->psh_len - pdu->psh_valid_bytes. qpair receives 120 bytes
(the full PDU) and calls nvme_tcp_pdu_psh_handle -> nvme_tcp_icresp_handle.
Here we check that we haven't yet received buffer reclaim notification and
simply return from this function. At the same time we continue to poll the qpair.
4. nvme_tcp_read_pdu: qpair is in NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_PSH state
and tries to read data from a socket again. The number of bytes is
pdu->psh_len - pdu->psh_valid_bytes. But now pdu->psh_len == pdu->psh_valid_bytes,
so we call nvme_tcp_read_data with zero length.
readv with zero length is commonly used to check for errors on the socket,
but in our case there are no errors and readv returns 0.
5. nvme_tcp_read_data treats zero as an error and returns NVME_TCP_CONNECTION_FATAL.

The fix is to handle icresp, but leave the qpair in INITIALIZING state until
we receive acknowledgement for icreqsend_ack. We also move the qpair to
NVME_TCP_PDU_RECV_STATE_AWAIT_PDU_READY recv_state so recv_pdu
will be zeroed and the qpair will try to read a common PDU header.
But since it is not initialized yet, it won't receive anything
from the target.

Fixes issue #1633

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4969 (master)

(cherry picked from commit d296fcd8d9)
Change-Id: I22cedefe530a8ac3b51495988ed6265d8fad15bb
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4976
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-10-30 14:51:12 +00:00
.githooks githooks/prepush: remove clang 2020-06-10 13:56:32 +00:00
.github/ISSUE_TEMPLATE github: provide link to CVE process submission 2020-04-06 07:48:20 +00:00
app event: remove printing legacy config for apps 2020-10-21 20:44:47 +00:00
build/lib build: consolidate library outputs in build/lib 2016-11-17 13:15:09 -07:00
doc event: add scheduler_set RPC 2020-10-23 22:35:53 +00:00
dpdk@7d8b8e4efe dpdk: move submodule to build rte_rcu library 2020-10-29 15:43:25 +00:00
dpdkbuild dpdkbuild: don't re-install unchanged shared libs 2020-10-02 07:13:53 +00:00
examples examples/nvme_fio_plugin: update ZNS section of README 2020-10-30 12:52:55 +00:00
go go: empty Go package 2018-06-28 18:15:51 +00:00
include nvme: continue probing ctrlrs even if one fails 2020-10-29 15:43:38 +00:00
intel-ipsec-mb@93c2ddf877 intel-ipsec-mb: update submodule to v0.54 2020-09-22 11:40:50 +00:00
ipsecbuild Makefile: don't override MAKEFLAGS in submake 2020-02-21 09:33:45 +00:00
isa-l@806b55ee57 isa-l: update submodule to v2.29.0 2020-08-27 08:36:29 +00:00
isalbuild Makefile: don't override MAKEFLAGS in submake 2020-02-21 09:33:45 +00:00
lib nvme_tcp: Fix icreq/icresp handing with zcopy enabled. 2020-10-30 14:51:12 +00:00
mk accel: Move non-engine specific batch to the accel_fw layer 2020-10-22 22:43:28 +00:00
module bdev_aio: fix interrupt mode notify error 2020-10-27 15:44:16 +00:00
ocf@02f6fc3a71 ocf: update submodule to v20.03.1 2020-08-27 08:36:29 +00:00
pkg sock/vpp: remove VPP implementation 2020-08-17 08:19:46 +00:00
scripts Modifications for using universal qcow2 image in tests 2020-10-29 08:09:37 +00:00
shared_lib build: add SO_SUFFIX for combined spdk.so library 2020-10-02 07:13:53 +00:00
test nvme: break completion loop when ctrlr is invalid 2020-10-29 15:44:01 +00:00
.astylerc astyle: change "add-braces" to "j" for compatibility 2017-12-13 21:23:27 -05:00
.gitignore add .gitreview to .gitignore 2020-10-12 08:26:27 +00:00
.gitmodules ocf: add ocf submodule 2019-02-27 17:26:51 +00:00
autobuild.sh autobuild.sh: exclude custom dpdk from scanbuild path 2020-10-23 13:47:43 +00:00
autopackage.sh autopackage.sh: add flag for release builds 2020-10-20 08:54:40 +00:00
autorun_post.py post_process: clearly delineate the beginning os script output. 2020-06-17 07:21:44 +00:00
autorun.sh test/common: consolidate test params for running with external DPDK 2020-08-17 11:56:32 +00:00
autotest.sh Modifications for using universal qcow2 image in tests 2020-10-29 08:09:37 +00:00
CHANGELOG.md changelog: Add information regarding scheduler implementation 2020-10-29 15:43:52 +00:00
CONFIG vhost: deprecate internal vhost library support 2020-10-20 02:42:16 +00:00
configure configure: Fix check against NASM version 2020-10-22 15:01:09 +00:00
CONTRIBUTING.md Add CONTRIBUTING.md 2017-09-05 13:25:45 -04:00
LICENSE Remove year from copyright headers. 2016-01-28 08:54:18 -07:00
Makefile mk: Remove the content of build/lib in "clean" target 2020-09-03 07:43:38 +00:00
README.md readme: add dpdk shared library note to LD_LIBRARY_PATH 2020-10-01 07:12:46 +00:00

Storage Performance Development Kit

The Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high performance, scalable, user-mode storage applications. It achieves high performance by moving all of the necessary drivers into userspace and operating in a polled mode instead of relying on interrupts, which avoids kernel context switches and eliminates interrupt handling overhead.

The development kit currently includes:

Documentation

Doxygen API documentation is available, as well as a Porting Guide for porting SPDK to different frameworks and operating systems.

Source Code

git clone https://github.com/spdk/spdk
cd spdk
git submodule update --init

Prerequisites

The dependencies can be installed automatically by scripts/pkgdep.sh, which installs the bare minimum dependencies required to build SPDK. Use --help to see information on installing dependencies for optional components.

./scripts/pkgdep.sh

Build

Linux:

./configure
make

FreeBSD:

Note: Make sure you have the matching kernel source in /usr/src/. The CONFIG_COVERAGE option is not currently available for FreeBSD builds.

./configure
gmake

Unit Tests

./test/unit/unittest.sh

You will see several error messages when running the unit tests, but they are part of the test suite. The final message at the end of the script indicates success or failure.

Vagrant

A Vagrant setup is also provided to create a Linux VM with a virtual NVMe controller to get up and running quickly. Currently this has been tested on macOS, Ubuntu 16.04.2 LTS and Ubuntu 18.04.3 LTS with the VirtualBox and Libvirt providers. The VirtualBox Extension Pack or Vagrant Libvirt (https://github.com/vagrant-libvirt/vagrant-libvirt) must also be installed in order to get the required NVMe support.

Details on the Vagrant setup can be found in the SPDK Vagrant documentation.

AWS

The following setup is known to work on AWS:

Image: Ubuntu 18.04

Before running setup.sh, run modprobe vfio-pci, then:

DRIVER_OVERRIDE=vfio-pci ./setup.sh

Advanced Build Options

Optional components and other build-time configuration are controlled by settings in the Makefile configuration file in the root of the repository. CONFIG contains the base settings for the configure script. This script generates a new file, mk/config.mk, that contains final build settings. For advanced configuration, a number of additional options may be passed to the configure script, or mk/config.mk can simply be created and edited by hand. A description of all possible options is located in CONFIG.

Boolean (on/off) options are configured with a 'y' (yes) or 'n' (no). For example, this line of CONFIG controls whether the optional RDMA (libibverbs) support is enabled:

CONFIG_RDMA?=n

To enable RDMA, this line may be added to mk/config.mk with a 'y' instead of 'n'. For the majority of options this can be done using the configure script. For example:

./configure --with-rdma

Additionally, CONFIG options may also be overridden on the make command line:

make CONFIG_RDMA=y

Users may wish to use a version of DPDK different from the submodule included in the SPDK repository. Note, this includes the ability to build not only from DPDK sources, but also just with the includes and libraries installed via the dpdk and dpdk-devel packages. To specify an alternate DPDK installation, run configure with the --with-dpdk option. For example:

Linux:

./configure --with-dpdk=/path/to/dpdk/x86_64-native-linuxapp-gcc
make

FreeBSD:

./configure --with-dpdk=/path/to/dpdk/x86_64-native-bsdapp-clang
gmake

The options specified on the make command line take precedence over the values in mk/config.mk. This can be useful if, for example, you generate a mk/config.mk using the configure script and then have one or two options (e.g. debug builds) that you wish to turn on and off frequently.
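
The precedence rule can be tried without touching an SPDK tree. The following is a minimal sketch, assuming GNU make (gmake on FreeBSD) and no CONFIG_RDMA variable in the environment. It writes a scratch makefile that stands in for the configuration files; show_rdma is a hypothetical helper name and is not part of SPDK.

```shell
#!/bin/sh
# Sketch only: a scratch makefile stands in for SPDK's configuration files.
# show_rdma is a hypothetical helper, not an SPDK script.
show_rdma() {
    tmp=$(mktemp) || return 1
    # "?=" assigns a value only when the variable is not already set.
    printf '%s\n' \
        'CONFIG_RDMA?=n' \
        '$(info CONFIG_RDMA=$(CONFIG_RDMA))' \
        'all: ;' > "$tmp"
    # Extra arguments (for example CONFIG_RDMA=y) are passed through to make.
    make -s -f "$tmp" "$@" | grep '^CONFIG_RDMA='
    rm -f "$tmp"
}

show_rdma                 # value taken from the makefile
show_rdma CONFIG_RDMA=y   # value taken from the command line
```

Under those assumptions the first call reports CONFIG_RDMA=n and the second reports CONFIG_RDMA=y, because a variable given on the make command line overrides assignments made inside makefiles.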

Shared libraries

By default, the build of the SPDK yields static libraries against which the SPDK applications and examples are linked. The configure option --with-shared provides the ability to produce SPDK shared libraries, in addition to the default static ones. Use of this flag also results in the SPDK executables being linked to the shared versions of the libraries. By default, SPDK shared libraries are located in ./build/lib. This includes the single SPDK shared lib encompassing all of the SPDK static libs (libspdk.so) as well as individual SPDK shared libs corresponding to each of the SPDK static ones.

In order to start an SPDK app linked with SPDK shared libraries, make sure to do the following steps:

  • run ldconfig specifying the directory containing SPDK shared libraries
  • provide proper LD_LIBRARY_PATH

If DPDK shared libraries are used, you may also need to add DPDK shared libraries to LD_LIBRARY_PATH.

Linux:

./configure --with-shared
make
ldconfig -v -n ./build/lib
LD_LIBRARY_PATH=./build/lib/:./dpdk/build/lib/ ./build/bin/spdk_tgt
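
If LD_LIBRARY_PATH is already set in the environment, the one-line assignment above replaces it for that command. As a small sketch, the hypothetical helper lib_path below (not part of SPDK) joins the given directories and appends any existing value, so earlier entries are kept.

```shell
#!/bin/sh
# Hypothetical helper, not part of SPDK: join the given directories with ':'
# and append the current LD_LIBRARY_PATH, if one is set and non-empty.
lib_path() {
    path=""
    for dir in "$@" ${LD_LIBRARY_PATH:+"$LD_LIBRARY_PATH"}; do
        path="${path:+$path:}$dir"
    done
    printf '%s\n' "$path"
}

# Print the search path that would be used for the SPDK and DPDK libraries.
lib_path ./build/lib ./dpdk/build/lib
```

It could then be used as LD_LIBRARY_PATH=$(lib_path ./build/lib ./dpdk/build/lib) ./build/bin/spdk_tgt, with the directories taken from the example above.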

Hugepages and Device Binding

Before running an SPDK application, some hugepages must be allocated and any NVMe and I/OAT devices must be unbound from the native kernel drivers. SPDK includes a script to automate this process on both Linux and FreeBSD. This script should be run as root.

sudo scripts/setup.sh

Users may wish to configure a specific memory size. Below is an example of configuring 8192MB memory.

sudo HUGEMEM=8192 scripts/setup.sh
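
To relate a HUGEMEM value to a number of hugepages, the arithmetic can be sketched as below. This assumes HUGEMEM is given in megabytes, as in the example above, and a 2048 kB hugepage size unless another size is passed; on Linux the actual size is reported on the Hugepagesize line of /proc/meminfo. The helper hugepages_needed is hypothetical and is not taken from scripts/setup.sh.

```shell
#!/bin/sh
# Hypothetical helper, not taken from scripts/setup.sh.
# $1: requested memory in MB (the HUGEMEM value)
# $2: hugepage size in kB (assumed to be 2048 when omitted)
hugepages_needed() {
    mem_kb=$(( $1 * 1024 ))
    page_kb=${2:-2048}
    # Round up so the allocation is never smaller than the request.
    echo $(( (mem_kb + page_kb - 1) / page_kb ))
}

hugepages_needed 8192            # 2 MB pages: prints 4096
hugepages_needed 8192 1048576    # 1 GB pages: prints 8
```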

Example Code

Example code is located in the examples directory. The examples are compiled automatically as part of the build process. Simply call any of the examples with no arguments to see the help output. You'll likely need to run the examples as a privileged user (root) unless you've done additional configuration to grant your user permission to allocate huge pages and map devices through vfio.

Contributing

For additional details on how to get more involved in the community, including contributing code and participating in discussions and other activities, please refer to spdk.io.