Compare commits

69 Commits:

3d1bbb273b
43a94514af
f71ccc5691
74c9fd40fd
e46860f591
b2808069e3
9b83caf89f
fb56c3214e
f8d26843fc
239eae6000
8bfa974cc2
ed016fdbfb
ce4da6c39a
ae161cdec6
7f9ea53d35
6fe32e3e17
b8446bb66d
3f5f09db46
4faf9fc37b
cbb1d099ff
d42b332ae6
930d91f479
0acac18cfa
2381516ecc
90501268d6
ae0db495fb
9bcc0ea8e8
fab97f2aac
d635d6d297
e8d8cef0fd
062da7a08a
a7f7b1955e
d4d3e76aed
09377fc41f
b90630a465
1ffa3c3f08
136c0771ad
7976cae3b4
01a942dc6b
b030befb1d
77a53c2c00
cfc2cd611c
4e4bb7f822
a763a7263a
52c7d46a3c
776d45b0e3
83edd2f716
374d2a2f64
1a4dec353a
c55906a30d
400316351f
1dfd2cf594
3fcc5a9ec1
f7730adbf0
88b5d6d183
fad91fb911
afc6fb5e1a
02dda9731a
cc02904e82
5ffffe9d96
6edcf515d6
a0f982bb0a
a1e730b460
2ad2b27149
7c06ec7247
06e7d22c06
54714eae1a
cc9c0e6922
9cfd844f5f
CHANGELOG.md (227 changes)
@@ -1,6 +1,74 @@
 # Changelog

-## v20.01: (Upcoming Release)
+## v20.01.3: (Upcoming Release)
+
+## v20.01.2:
+
+### dpdk
+
+Updated DPDK submodule to DPDK 19.11.2, which includes fixes for DPDK vulnerabilities:
+CVE-2020-10722, CVE-2020-10723, CVE-2020-10724, CVE-2020-10725, CVE-2020-10726.
+
+### env_dpdk
+
+A new function, `spdk_mem_reserve`, has been added to reserve a memory region in SPDK's
+memory maps. It pre-allocates data structures to hold memory address translations
+without populating the region.
+
+### rpc
+
+A new RPC, `bdev_rbd_resize`, has been added to resize the Ceph RBD bdev.
+
+## v20.01.1:
+
+## v20.01:
+
+### bdev
+
+A new function, `spdk_bdev_set_timeout`, has been added to set per descriptor I/O timeouts.
+
+A new class of functions, `spdk_bdev_compare*`, has been added to allow native bdev support
+of block comparisons and compare-and-write.
+
+A new class of bdev events, `SPDK_BDEV_EVENT_MEDIA_MANAGEMENT`, has been added to allow bdevs
+which expose raw media to alert all I/O channels of pending media management events.
+
+A new API, `spdk_bdev_io_get_aux_buf`, was added, allowing the caller to request
+an auxiliary buffer for its own private use. The API is used in the same manner that
+`spdk_bdev_io_get_buf` is used, and the length of the buffer is always the same as the
+bdev_io primary buffer. `spdk_bdev_io_put_aux_buf` frees the allocated auxiliary
+buffer.
+
+### blobfs
+
+Added a boolean return value to the function `spdk_fs_set_cache_size` to indicate its operation result.
+
+Added the `blobfs_set_cache_size` RPC method to set the cache size for the blobstore filesystem.
+
+### blobstore
+
+Added a new `use_extent_table` option to `spdk_blob_opts` for creating blobs with an Extent Table descriptor.
+Using this metadata format dramatically decreases the number of writes required to persist each cluster allocation
+for thin provisioned blobs. The Extent Table descriptor is enabled by default.
+See the [Blobstore Programmer's Guide](https://spdk.io/doc/blob.html#blob_pg_cluster_layout) for more details.
+
+### dpdk
+
+Updated DPDK submodule to DPDK 19.11.
+
+### env_dpdk
+
+`spdk_env_dpdk_post_init` now takes a boolean, `legacy_mem`, as an argument.
+
+A new function, `spdk_env_dpdk_dump_mem_stats`, prints information about the memory consumed by DPDK to a file specified by
+the user. A new utility, `scripts/dpdk_mem_info.py`, wraps this function and prints the output in an easy to read way.
+
+### event
+
+The functions `spdk_reactor_enable_framework_monitor_context_switch()` and
+`spdk_reactor_framework_monitor_context_switch_enabled()` have been changed to
+`spdk_framework_enable_context_switch_monitor()` and
+`spdk_framework_context_switch_monitor_enabled()`, respectively.
+
 ### ftl
@@ -18,43 +86,6 @@ parameter.

 `spdk_ftl_punit_range` and `ftl_module_init_opts` structures were removed.

-### nvmf
-
-Support for custom NVMe admin command handlers and admin command passthru
-in the NVMF subsystem.
-
-It is now possible to set a custom handler for a specific NVMe admin command.
-For example, vendor specific admin commands can now be intercepted by implementing
-a function handling the command.
-Further NVMe admin commands can be forwarded straight to an underlying NVMe bdev.
-
-The functions `spdk_nvmf_set_custom_admin_cmd_hdlr` and `spdk_nvmf_set_passthru_admin_cmd`
-in `spdk_internal/nvmf.h` expose this functionality. There is an example custom admin handler
-for the NVMe IDENTIFY CTRLR in `lib/nvmf/custom_cmd_hdlr.c`. This handler gets the SN, MN, FR, IEEE, FGUID
-attributes from the first NVMe drive in the NVMF subsystem and returns it to the NVMF initiator (sn and mn attributes
-specified during NVMF subsystem creation RPC will be overwritten).
-This handler can be enabled via the `nvmf_set_config` RPC.
-Note: In a future version of SPDK, this handler will be enabled by default.
-
-### bdev
-
-A new API was added `spdk_bdev_io_get_aux_buf` allowing the caller to request
-an auxiliary buffer for its own private use. The API is used in the same manner that
-`spdk_bdev_io_get_buf` is used and the length of the buffer is always the same as the
-bdev_io primary buffer. `spdk_bdev_io_put_aux_buf` frees the allocated auxiliary
-buffer.
-
-### sock
-
-Added spdk_sock_writev_async for performing asynchronous writes to sockets. This call will
-never return EAGAIN, instead queueing internally until the data has all been sent. This can
-simplify many code flows that create pollers to continue attempting to flush writes
-on sockets.
-
-Added `impl_name` parameter in spdk_sock_listen and spdk_sock_connect functions. Users may now
-specify the sock layer implementation they'd prefer to use. Valid implementations are currently
-"vpp" and "posix" and NULL, where NULL results in the previous behavior of the functions.
-
 ### isa-l

 Updated ISA-L submodule to commit f3993f5c0b6911 which includes implementation and
@@ -62,16 +93,30 @@ optimization for aarch64.

 Enabled ISA-L on aarch64 by default in addition to x86.

-### thread
+### nvme

-`spdk_thread_send_msg` now returns int indicating if the message was successfully
-sent.
+`delayed_pcie_doorbell` parameter in `spdk_nvme_io_qpair_opts` was renamed to `delay_cmd_submit`
+to allow reuse in other transports.

-### blobfs
+Added RDMA WR batching to NVMf RDMA initiator. Send and receive WRs are chained together
+and posted with a single call to ibv_post_send(receive) in the next call to qpair completion
+processing function. Batching is controlled by the `delay_cmd_submit` qpair option.

-Added boolean return value for function spdk_fs_set_cache_size to indicate its operation result.
+The NVMe-oF initiator now supports plugging out of tree NVMe-oF transports. In order
+to facilitate this feature, several small API changes have been made:

-Added `blobfs_set_cache_size` RPC method to set cache size for blobstore filesystem.
+The `spdk_nvme_transport_id` struct now contains a trstring member used to identify the transport.
+A new function, `spdk_nvme_transport_available_by_name`, has been added.
+A function table, `spdk_nvme_transport_ops`, and macro, `SPDK_NVME_TRANSPORT_REGISTER`, have been added which
+enable registering out of tree transports.
+
+A new function, `spdk_nvme_ns_supports_compare`, allows a user to check whether a given namespace supports the compare
+operation.
+
+A new family of functions, `spdk_nvme_ns_compare*`, gives the user access to submitting compare commands to NVMe namespaces.
+
+A new function, `spdk_nvme_ctrlr_cmd_get_log_page_ext`, gives users more granular control over the command dwords sent in
+log page requests.
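As an aside on the compare additions above: the check function gates the submission call. A minimal sketch of the intended pairing follows; the exact parameter list mirrors the other `spdk_nvme_ns_cmd_*` submission calls, so treat the signature details here as an assumption and consult `include/spdk/nvme.h` for the authoritative declarations.

~~~c
#include <errno.h>
#include "spdk/nvme.h"

/* Sketch: submit a compare of one block against `expected` only if the
 * namespace supports the compare command. io_flags of 0 means no special
 * flags; cb/cb_arg follow the usual SPDK completion-callback convention. */
static int
compare_one_block(struct spdk_nvme_ns *ns, struct spdk_nvme_qpair *qpair,
                  void *expected, uint64_t lba, spdk_nvme_cmd_cb cb, void *cb_arg)
{
	if (!spdk_nvme_ns_supports_compare(ns)) {
		return -ENOTSUP;	/* controller lacks the compare command */
	}

	return spdk_nvme_ns_cmd_compare(ns, qpair, expected, lba, 1, cb, cb_arg, 0);
}
~~~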
@@ -91,45 +136,71 @@ Add `spdk_nvmf_tgt_stop_listen()` that can be used to stop listening for
 incoming connections for specified target and trid. Listener is not stopped
 implicitly upon destruction of a subsystem any more.

+A custom NVMe admin command handler has been added which allows the user to use the real drive
+attributes from one of the target NVMe drives when reporting drive attributes to the initiator.
+This handler can be enabled via the `nvmf_set_config` RPC.
+Note: In a future version of SPDK, this handler will be enabled by default.
+
+The SPDK target and initiator both now include compare-and-write functionality, with one caveat. If using the RDMA transport,
+the target expects the initiator to send both the compare command and write command either with, or without, inline data. The
+SPDK initiator currently respects this requirement, but this note is included as a flag for other initiators attempting
+compatibility with this version of SPDK.
+
+### rpc
+
+A new RPC, `bdev_zone_block_create`, enables creating an emulated zoned bdev on top of a standard block device.
+
+A new RPC, `bdev_ocssd_create`, enables creating an emulated zoned bdev on top of an Open Channel SSD.
+
+A new RPC, `blobfs_set_cache_size`, enables managing the blobfs cache size.
+
+A new RPC, `env_dpdk_get_mem_stats`, has been added to facilitate reading DPDK related memory
+consumption stats. Please see the env_dpdk section above for more details.
+
+A new RPC, `framework_get_reactors`, has been added to retrieve a list of all reactors.
+
+`bdev_ftl_create` now takes a `base_bdev` argument in lieu of `trtype`, `traddr`, and `punits`.
+
+`bdev_nvme_set_options` now allows users to disable I/O submission batching with the `-d` flag.
+
+`bdev_nvme_cuse_register` now accepts a `name` parameter.
+
+`bdev_uring_create` now takes arguments for `bdev_name` and `block_size`.
+
+`nvmf_set_config` now takes an argument to enable passthru of identify commands to base NVMe devices.
+Please see the nvmf section above for more details.
+
+### scsi
+
+`spdk_scsi_lun_get_dif_ctx` now takes an additional argument of type `spdk_scsi_task`.
+
+### sock
+
+Added `spdk_sock_writev_async` for performing asynchronous writes to sockets. This call will
+never return EAGAIN, instead queueing internally until the data has all been sent. This can
+simplify many code flows that create pollers to continue attempting to flush writes
+on sockets.
+
+Added an `impl_name` parameter to the `spdk_sock_listen` and `spdk_sock_connect` functions. Users may now
+specify the sock layer implementation they'd prefer to use. Valid implementations are currently
+"vpp", "posix", and NULL, where NULL results in the previous behavior of the functions.
+
+### thread
+
+`spdk_thread_send_msg` now returns int indicating if the message was successfully
+sent.
+
+A new function, `spdk_thread_send_critical_msg`, has been added to support sending a single message from
+a context that may be interrupted, e.g. a signal handler.
+
+Two new functions, `spdk_poller_pause` and `spdk_poller_resume`, have been added to give greater control
+of pollers to the application owner.
+
+### util
+
+`spdk_pipe`, a new utility for buffering data from sockets or files for parsing,
+has been added. The public API is available at `include/spdk/pipe.h`.
+
-### nvme
-
-`delayed_pcie_doorbell` parameter in `spdk_nvme_io_qpair_opts` was renamed to `delay_cmd_submit`
-to allow reuse in other transports.
-
-Added RDMA WR batching to NVMf RDMA initiator. Send and receive WRs are chained together
-and posted with a single call to ibv_post_send(receive) in the next call to qpair completion
-processing function. Batching is controlled by 'delay_cmd_submit' qpair option.
-
-The NVMe-oF initiator now supports plugging out of tree NVMe-oF transports. In order
-to facilitate this feature, several small API changes have been made:
-
-The `spdk_nvme_transport_id` struct now contains a trstring member used to identify the transport.
-A new function, `spdk_nvme_transport_available_by_name`, has been added.
-A function table, `spdk_nvme_transport_ops`, and macro, `SPDK_NVME_TRANSPORT_REGISTER`, have been added which
-enable registering out of tree transports.
-
-### rpc
-
-Added optional 'delay_cmd_submit' parameter to 'bdev_nvme_set_options' RPC method.
-
-A new RPC `framework_get_reactors` has been added to retrieve a list of all reactors.
-
-### dpdk
-
-Updated DPDK submodule to DPDK 19.11.
-
-### event
-
-The functions `spdk_reactor_enable_framework_monitor_context_switch()` and
-`spdk_reactor_framework_monitor_context_switch_enabled()` have been changed to
-`spdk_framework_enable_context_switch_monitor()` and
-`spdk_framework_context_switch_monitor_enabled()`, respectively.
-
 ### bdev

 Added spdk_bdev_io_get_nvme_fused_status function for translating bdev_io status to NVMe status
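A short illustration of the aux-buffer flow described in the changelog's bdev entry. The changelog states the API is used "in the same manner" as `spdk_bdev_io_get_buf`, so the callback shape below (including the `spdk_io_channel` argument) is an assumption modeled on that API; check `include/spdk/bdev.h` for the exact typedefs.

~~~c
#include "spdk/bdev.h"

/* Sketch: request a private scratch buffer tied to an in-flight bdev_io.
 * The aux buffer has the same length as the bdev_io primary buffer and
 * must be returned with spdk_bdev_io_put_aux_buf() when done. */
static void
aux_buf_cb(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io, void *aux_buf)
{
	/* ... use aux_buf as private scratch space here ... */

	spdk_bdev_io_put_aux_buf(bdev_io, aux_buf);	/* return the buffer */
}

static void
request_aux_buf(struct spdk_bdev_io *bdev_io)
{
	spdk_bdev_io_get_aux_buf(bdev_io, aux_buf_cb);
}
~~~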
@@ -144,7 +144,7 @@ def confirmPerPatchTests(test_list, skiplist):
         exit(1)


-def aggregateCompletedTests(output_dir, repo_dir):
+def aggregateCompletedTests(output_dir, repo_dir, skip_confirm=False):
     test_list = {}
     test_completion_table = []

@@ -172,14 +172,15 @@ def aggregateCompletedTests(output_dir, repo_dir):
     printListInformation("Tests", test_list)
     generateTestCompletionTables(output_dir, test_completion_table)
     skipped_tests = getSkippedTests(repo_dir)
-    confirmPerPatchTests(test_list, skipped_tests)
+    if not skip_confirm:
+        confirmPerPatchTests(test_list, skipped_tests)


-def main(output_dir, repo_dir):
+def main(output_dir, repo_dir, skip_confirm=False):
     generateCoverageReport(output_dir, repo_dir)
     collectOne(output_dir, 'doc')
     collectOne(output_dir, 'ut_coverage')
-    aggregateCompletedTests(output_dir, repo_dir)
+    aggregateCompletedTests(output_dir, repo_dir, skip_confirm)


 if __name__ == "__main__":
@@ -188,5 +189,7 @@ if __name__ == "__main__":
                         help="The location of your build's output directory")
     parser.add_argument("-r", "--repo_directory", type=str, required=True,
                         help="The location of your spdk repository")
+    parser.add_argument("-s", "--skip_confirm", required=False, action="store_true",
+                        help="Do not check if all autotest.sh tests were executed.")
     args = parser.parse_args()
-    main(args.directory_location, args.repo_directory)
+    main(args.directory_location, args.repo_directory, args.skip_confirm)
doc/bdev.md
@@ -119,6 +119,12 @@ To remove a block device representation use the bdev_rbd_delete command.

 `rpc.py bdev_rbd_delete Rbd0`

+To resize a bdev use the bdev_rbd_resize command.
+
+`rpc.py bdev_rbd_resize Rbd0 4096`
+
+This command will resize the Rbd0 bdev to 4096 MiB.
+
 # Compression Virtual Bdev Module {#bdev_config_compress}

 The compression bdev module can be configured to provide compression/decompression
doc/blob.md (18 changes)
@@ -318,6 +318,24 @@ form a linked list. The first page in the list will be written in place on update, while all others will
 be written to fresh locations. This requires the backing device to support an atomic write size greater than
 or equal to the page size to guarantee that the operation is atomic. See the section on atomicity for details.

+### Blob cluster layout {#blob_pg_cluster_layout}
+
+Each blob is an ordered list of clusters, where the starting LBA of a cluster is called an extent. A blob can be
+thin provisioned, resulting in no extent for some of the clusters. When a first write operation occurs
+on an unallocated cluster, a new extent is chosen. This information is stored in RAM and on disk.
+
+There are two on-disk extent representations, depending on the `use_extent_table` (default: true) option used
+when creating a blob.
+
+* **use_extent_table=true**: The EXTENT_PAGE descriptor is not part of the linked list of pages. It contains extents
+that are not run-length encoded. Each extent page is referenced by an EXTENT_TABLE descriptor, which is serialized
+as part of the linked list of pages. The extent table run-length encodes all unallocated extent pages.
+Every new cluster allocation updates a single extent page when that extent page was previously allocated,
+and otherwise additionally incurs serializing the whole linked list of pages for the blob.
+
+* **use_extent_table=false**: The EXTENT_RLE descriptor is serialized as part of the linked list of pages.
+Extents pointing to contiguous LBAs are run-length encoded, including unallocated extents represented by 0.
+Every new cluster allocation incurs serializing the whole linked list of pages for the blob.
+
 ### Sequences and Batches

 Internally Blobstore uses the concepts of sequences and batches to submit IO to the underlying device in either
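The `use_extent_table` option described above is selected per blob at creation time. A minimal sketch of opting out of the extent table (falling back to EXTENT_RLE), using the v20.01-era `spdk_blob_opts_init`/`spdk_bs_create_blob_ext` signatures; `bs`, `blob_create_done`, and `ctx` are assumed to exist in the caller:

~~~c
#include "spdk/blob.h"

/* Sketch: create a blob with the legacy EXTENT_RLE layout by disabling the
 * extent table. Leaving opts.use_extent_table at its default (true) keeps
 * the new Extent Table descriptor format. */
static void
create_rle_blob(struct spdk_blob_store *bs,
                spdk_blob_op_with_id_complete blob_create_done, void *ctx)
{
	struct spdk_blob_opts opts;

	spdk_blob_opts_init(&opts);
	opts.use_extent_table = false;	/* default is true */

	spdk_bs_create_blob_ext(bs, &opts, blob_create_done, ctx);
}
~~~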
doc/jsonrpc.md
@@ -1853,6 +1853,49 @@ Example response:
 }
 ~~~

+## bdev_rbd_resize {#rpc_bdev_rbd_resize}
+
+Resize @ref bdev_config_rbd bdev
+
+This method is available only if SPDK was built with Ceph RBD support.
+
+### Result
+
+`true` if bdev with provided name was resized or `false` otherwise.
+
+### Parameters
+
+Name                    | Optional | Type        | Description
+----------------------- | -------- | ----------- | -----------
+name                    | Required | string      | Bdev name
+new_size                | Required | int         | New bdev size for resize operation in MiB
+
+### Example
+
+Example request:
+
+~~~
+{
+  "params": {
+    "name": "Rbd0",
+    "new_size": 4096
+  },
+  "jsonrpc": "2.0",
+  "method": "bdev_rbd_resize",
+  "id": 1
+}
+~~~
+
+Example response:
+
+~~~
+{
+  "jsonrpc": "2.0",
+  "id": 1,
+  "result": true
+}
+~~~
+
 ## bdev_delay_create {#rpc_bdev_delay_create}

 Create delay bdev. This bdev type redirects all IO to its base bdev and inserts a delay on the completion
dpdk (submodule, 2 changes)
@@ -1 +1 @@
-Subproject commit fdb511332624e28631f553a226abb1dc0b35b28a
+Subproject commit ef71bfaface10cc19b75e45d3158ab71a788e3a9
dpdkbuild/Makefile
@@ -140,6 +140,18 @@ endif
 # Allow users to specify EXTRA_DPDK_CFLAGS if they want to build DPDK using unsupported compiler versions
 DPDK_CFLAGS += $(EXTRA_DPDK_CFLAGS)

+ifeq ($(CC_TYPE),gcc)
+GCC_MAJOR = $(shell echo __GNUC__ | $(CC) -E -x c - | tail -n 1)
+ifeq ($(shell test $(GCC_MAJOR) -ge 10 && echo 1), 1)
+# 1. gcc 10 complains about operations with zero size arrays in rte_cryptodev.c, so
+# disable this warning
+# 2. gcc 10 disables fcommon by default and complains about multiple definitions of the
+# aesni_mb_logtype_driver symbol, which is defined in a header file and present in several
+# translation units
+DPDK_CFLAGS += -Wno-stringop-overflow -fcommon
+endif
+endif
+
 $(SPDK_ROOT_DIR)/dpdk/build: $(SPDK_ROOT_DIR)/mk/cc.mk $(SPDK_ROOT_DIR)/include/spdk/config.h
 	$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build
 	$(Q)$(MAKE) -C $(SPDK_ROOT_DIR)/dpdk config T=$(DPDK_CONFIG) $(DPDK_OPTS)
@@ -11,6 +11,7 @@ iodepth=128
 rw=randrw
 bs=16k
 verify=md5
+verify_backlog=32

 [test]
 numjobs=1
include/spdk/env.h
@@ -1241,6 +1241,20 @@ int spdk_mem_register(void *vaddr, size_t len);
  */
 int spdk_mem_unregister(void *vaddr, size_t len);

+/**
+ * Reserve the address space specified in all memory maps.
+ *
+ * This pre-allocates the necessary space in the memory maps such that
+ * future calls to spdk_mem_register() on that region require no
+ * internal memory allocations.
+ *
+ * \param vaddr Virtual address to reserve
+ * \param len Length in bytes of vaddr
+ *
+ * \return 0 on success, negated errno on failure.
+ */
+int spdk_mem_reserve(void *vaddr, size_t len);
+
 #ifdef __cplusplus
 }
 #endif
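The new declaration above pairs naturally with `spdk_mem_register()`. A minimal sketch of the intended call sequence, assuming the caller already has a 2 MB aligned region (where the buffer comes from is left out here):

~~~c
#include "spdk/env.h"

/* Sketch: pre-allocate translation structures for a region, then register it
 * later without risking an allocation inside spdk_mem_register(). vaddr and
 * len must satisfy the 2 MB alignment constraints spdk_mem_reserve() checks. */
static int
reserve_then_register(void *vaddr, size_t len)
{
	int rc;

	rc = spdk_mem_reserve(vaddr, len);	/* allocates map entries only */
	if (rc != 0) {
		return rc;	/* e.g. -EBUSY if part of the range is registered */
	}

	/* ... later, once the memory is actually backed ... */
	return spdk_mem_register(vaddr, len);
}
~~~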
include/spdk/nvme.h
@@ -2781,6 +2781,14 @@ struct spdk_nvme_rdma_hooks {
 	 * \return Infiniband remote key (rkey) for this buf
 	 */
 	uint64_t (*get_rkey)(struct ibv_pd *pd, void *buf, size_t size);
+
+	/**
+	 * \brief Put back keys obtained from get_rkey.
+	 *
+	 * \param key The Infiniband remote key (rkey) obtained from get_rkey
+	 *
+	 */
+	void (*put_rkey)(uint64_t key);
 };

 /**
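For context, a sketch of how an application might fill in this hook table. `spdk_nvme_rdma_init_hooks()` is the existing registration entry point in this header; the two lookup helpers are hypothetical stand-ins for an application's own rkey cache:

~~~c
#include "spdk/nvme.h"

/* Hypothetical application helpers; a real implementation would consult its
 * own memory-registration cache keyed by (pd, buf). */
extern uint64_t app_lookup_rkey(struct ibv_pd *pd, void *buf, size_t size);
extern void app_release_rkey(uint64_t key);

static uint64_t
my_get_rkey(struct ibv_pd *pd, void *buf, size_t size)
{
	return app_lookup_rkey(pd, buf, size);
}

static void
my_put_rkey(uint64_t key)
{
	app_release_rkey(key);
}

/* Unset members of the hook table stay NULL. */
static struct spdk_nvme_rdma_hooks g_hooks = {
	.get_rkey = my_get_rkey,
	.put_rkey = my_put_rkey,
};

/* Register once at startup, before attaching any RDMA controllers. */
void
init_rdma_hooks(void)
{
	spdk_nvme_rdma_init_hooks(&g_hooks);
}
~~~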
include/spdk/nvme_spec.h
@@ -1156,7 +1156,7 @@ enum spdk_nvme_generic_command_status_code {
 enum spdk_nvme_command_specific_status_code {
 	SPDK_NVME_SC_COMPLETION_QUEUE_INVALID = 0x00,
 	SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER = 0x01,
-	SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED = 0x02,
+	SPDK_NVME_SC_INVALID_QUEUE_SIZE = 0x02,
 	SPDK_NVME_SC_ABORT_COMMAND_LIMIT_EXCEEDED = 0x03,
 	/* 0x04 - reserved */
 	SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED = 0x05,
include/spdk/version.h
@@ -54,7 +54,7 @@
  * Patch level is incremented on maintenance branch releases and reset to 0 for each
  * new major.minor release.
  */
-#define SPDK_VERSION_PATCH 0
+#define SPDK_VERSION_PATCH 3

 /**
  * Version string suffix.
include/spdk_internal/sock.h
@@ -77,6 +77,12 @@ struct spdk_sock_group_impl {
 	struct spdk_net_impl			*net_impl;
 	TAILQ_HEAD(, spdk_sock)			socks;
 	STAILQ_ENTRY(spdk_sock_group_impl)	link;
+	/* List of removed sockets. refreshed each time we poll the sock group. */
+	int					num_removed_socks;
+	/* Unfortunately, we can't just keep a tailq of the sockets in case they are freed
+	 * or added to another poll group later.
+	 */
+	uintptr_t				removed_socks[MAX_EVENTS_PER_POLL];
 };

 struct spdk_net_impl {
lib/bdev/scsi_nvme.c
@@ -165,7 +165,7 @@ spdk_scsi_nvme_translate(const struct spdk_bdev_io *bdev_io, int *sc, int *sk,
 		*ascq = SPDK_SCSI_ASCQ_CAUSE_NOT_REPORTABLE;
 		break;
 	case SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER:
-	case SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED:
+	case SPDK_NVME_SC_INVALID_QUEUE_SIZE:
 	case SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED:
 	case SPDK_NVME_SC_INVALID_FIRMWARE_SLOT:
 	case SPDK_NVME_SC_INVALID_FIRMWARE_IMAGE:
lib/blob/blobstore.c
@@ -247,6 +247,7 @@ _spdk_blob_alloc(struct spdk_blob_store *bs, spdk_blob_id id)

 	TAILQ_INIT(&blob->xattrs);
 	TAILQ_INIT(&blob->xattrs_internal);
+	TAILQ_INIT(&blob->pending_persists);

 	return blob;
 }
@@ -268,6 +269,7 @@ static void
 _spdk_blob_free(struct spdk_blob *blob)
 {
 	assert(blob != NULL);
+	assert(TAILQ_EMPTY(&blob->pending_persists));

 	free(blob->active.extent_pages);
 	free(blob->clean.extent_pages);
@@ -1520,6 +1522,7 @@ struct spdk_blob_persist_ctx {
 	spdk_bs_sequence_t		*seq;
 	spdk_bs_sequence_cpl		cb_fn;
 	void				*cb_arg;
+	TAILQ_ENTRY(spdk_blob_persist_ctx) link;
 };

 static void
@@ -1540,22 +1543,34 @@ spdk_bs_batch_clear_dev(struct spdk_blob_persist_ctx *ctx, spdk_bs_batch_t *batch,
 	}
 }

+static void _spdk_blob_persist_check_dirty(struct spdk_blob_persist_ctx *ctx);
+
 static void
 _spdk_blob_persist_complete(spdk_bs_sequence_t *seq, void *cb_arg, int bserrno)
 {
 	struct spdk_blob_persist_ctx	*ctx = cb_arg;
+	struct spdk_blob_persist_ctx	*next_persist;
 	struct spdk_blob		*blob = ctx->blob;

 	if (bserrno == 0) {
 		_spdk_blob_mark_clean(blob);
 	}

+	assert(ctx == TAILQ_FIRST(&blob->pending_persists));
+	TAILQ_REMOVE(&blob->pending_persists, ctx, link);
+
+	next_persist = TAILQ_FIRST(&blob->pending_persists);
+
 	/* Call user callback */
 	ctx->cb_fn(seq, ctx->cb_arg, bserrno);

 	/* Free the memory */
 	spdk_free(ctx->pages);
 	free(ctx);
+
+	if (next_persist != NULL) {
+		_spdk_blob_persist_check_dirty(next_persist);
+	}
 }

 static void
@@ -2060,6 +2075,25 @@ _spdk_blob_persist_dirty(spdk_bs_sequence_t *seq, void *cb_arg, int bserrno)
 	_spdk_bs_write_super(seq, ctx->blob->bs, ctx->super, _spdk_blob_persist_dirty_cpl, ctx);
 }

+static void
+_spdk_blob_persist_check_dirty(struct spdk_blob_persist_ctx *ctx)
+{
+	if (ctx->blob->bs->clean) {
+		ctx->super = spdk_zmalloc(sizeof(*ctx->super), 0x1000, NULL,
+					  SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
+		if (!ctx->super) {
+			ctx->cb_fn(ctx->seq, ctx->cb_arg, -ENOMEM);
+			free(ctx);
+			return;
+		}
+
+		spdk_bs_sequence_read_dev(ctx->seq, ctx->super, _spdk_bs_page_to_lba(ctx->blob->bs, 0),
+					  _spdk_bs_byte_to_lba(ctx->blob->bs, sizeof(*ctx->super)),
+					  _spdk_blob_persist_dirty, ctx);
+	} else {
+		_spdk_blob_persist_start(ctx);
+	}
+}
+
 /* Write a blob to disk */
 static void
@@ -2070,7 +2104,7 @@ _spdk_blob_persist(spdk_bs_sequence_t *seq, struct spdk_blob *blob,

 	_spdk_blob_verify_md_op(blob);

-	if (blob->state == SPDK_BLOB_STATE_CLEAN) {
+	if (blob->state == SPDK_BLOB_STATE_CLEAN && TAILQ_EMPTY(&blob->pending_persists)) {
 		cb_fn(seq, cb_arg, 0);
 		return;
 	}
@@ -2086,21 +2120,15 @@ _spdk_blob_persist(spdk_bs_sequence_t *seq, struct spdk_blob *blob,
 	ctx->cb_arg = cb_arg;
 	ctx->next_extent_page = 0;

-	if (blob->bs->clean) {
-		ctx->super = spdk_zmalloc(sizeof(*ctx->super), 0x1000, NULL,
-					  SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
-		if (!ctx->super) {
-			cb_fn(seq, cb_arg, -ENOMEM);
-			free(ctx);
-			return;
-		}
-
-		spdk_bs_sequence_read_dev(seq, ctx->super, _spdk_bs_page_to_lba(blob->bs, 0),
-					  _spdk_bs_byte_to_lba(blob->bs, sizeof(*ctx->super)),
-					  _spdk_blob_persist_dirty, ctx);
-	} else {
-		_spdk_blob_persist_start(ctx);
+	/* Multiple blob persists can affect one another, via blob->state or
+	 * blob mutable data changes. To prevent it, queue up the persists. */
+	if (!TAILQ_EMPTY(&blob->pending_persists)) {
+		TAILQ_INSERT_TAIL(&blob->pending_persists, ctx, link);
+		return;
 	}
+	TAILQ_INSERT_HEAD(&blob->pending_persists, ctx, link);
+
+	_spdk_blob_persist_check_dirty(ctx);
 }

 struct spdk_blob_copy_cluster_ctx {
@@ -5129,6 +5157,9 @@ _spdk_bs_create_blob(struct spdk_blob_store *bs,
 	}

 	blob->use_extent_table = opts->use_extent_table;
+	if (blob->use_extent_table) {
+		blob->invalid_flags |= SPDK_BLOB_EXTENT_TABLE;
+	}

 	if (!internal_xattrs) {
 		_spdk_blob_xattrs_init(&internal_xattrs_default);
@@ -6179,7 +6210,14 @@ _spdk_delete_snapshot_sync_clone_cpl(void *cb_arg, int bserrno)
 			ctx->snapshot->active.clusters[i] = 0;
 		}
 	}
+	for (i = 0; i < ctx->snapshot->active.num_extent_pages &&
+	     i < ctx->clone->active.num_extent_pages; i++) {
+		if (ctx->clone->active.extent_pages[i] == ctx->snapshot->active.extent_pages[i]) {
+			ctx->snapshot->active.extent_pages[i] = 0;
+		}
+	}

 	_spdk_blob_set_thin_provision(ctx->snapshot);
 	ctx->snapshot->state = SPDK_BLOB_STATE_DIRTY;

 	if (ctx->parent_snapshot_entry != NULL) {
@@ -6212,6 +6250,12 @@ _spdk_delete_snapshot_sync_snapshot_xattr_cpl(void *cb_arg, int bserrno)
 			ctx->clone->active.clusters[i] = ctx->snapshot->active.clusters[i];
 		}
 	}
+	for (i = 0; i < ctx->snapshot->active.num_extent_pages &&
+	     i < ctx->clone->active.num_extent_pages; i++) {
+		if (ctx->clone->active.extent_pages[i] == 0) {
+			ctx->clone->active.extent_pages[i] = ctx->snapshot->active.extent_pages[i];
+		}
+	}

 	/* Delete old backing bs_dev from clone (related to snapshot that will be removed) */
 	ctx->clone->back_bs_dev->destroy(ctx->clone->back_bs_dev);
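The persist change above serializes concurrent persists on one blob: only the head of `pending_persists` runs, and each completion kicks off the next queued entry. A stripped-down sketch of that pattern in generic C (not the SPDK types), for readers who want the mechanism in isolation:

~~~c
#include <sys/queue.h>
#include <stdbool.h>
#include <stdlib.h>

struct op_ctx {
	TAILQ_ENTRY(op_ctx) link;
	void (*start)(struct op_ctx *ctx);	/* begins the async operation */
};

TAILQ_HEAD(op_queue, op_ctx);

/* Submit: only the first queued op starts; later ones wait their turn. */
static void
op_submit(struct op_queue *q, struct op_ctx *ctx)
{
	bool idle = TAILQ_EMPTY(q);

	TAILQ_INSERT_TAIL(q, ctx, link);
	if (idle) {
		ctx->start(ctx);
	}
}

/* Completion: pop the finished head, then start the next op, if any. */
static void
op_complete(struct op_queue *q, struct op_ctx *ctx)
{
	struct op_ctx *next;

	TAILQ_REMOVE(q, ctx, link);
	next = TAILQ_FIRST(q);
	free(ctx);
	if (next != NULL) {
		next->start(next);
	}
}
~~~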
lib/blob/blobstore.h
@@ -166,6 +166,9 @@ struct spdk_blob {
 	bool		extent_table_found;
 	bool		use_extent_table;

+	/* A list of pending persist operations */
+	TAILQ_HEAD(, spdk_blob_persist_ctx) pending_persists;
+
 	/* Number of data clusters retrieved from extent table,
 	 * that many have to be read from extent pages. */
 	uint64_t	remaining_clusters_in_et;
@@ -331,7 +334,8 @@ struct spdk_blob_md_descriptor_extent_page {

 #define SPDK_BLOB_THIN_PROV (1ULL << 0)
 #define SPDK_BLOB_INTERNAL_XATTR (1ULL << 1)
-#define SPDK_BLOB_INVALID_FLAGS_MASK	(SPDK_BLOB_THIN_PROV | SPDK_BLOB_INTERNAL_XATTR)
+#define SPDK_BLOB_EXTENT_TABLE (1ULL << 2)
+#define SPDK_BLOB_INVALID_FLAGS_MASK	(SPDK_BLOB_THIN_PROV | SPDK_BLOB_INTERNAL_XATTR | SPDK_BLOB_EXTENT_TABLE)

 #define SPDK_BLOB_READ_ONLY (1ULL << 0)
 #define SPDK_BLOB_DATA_RO_FLAGS_MASK	SPDK_BLOB_READ_ONLY
lib/env_dpdk/Makefile
@@ -34,6 +34,10 @@
 SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
 include $(SPDK_ROOT_DIR)/mk/spdk.common.mk

+SO_VER := 2
+SO_MINOR := 1
+SO_SUFFIX := $(SO_VER).$(SO_MINOR)
+
 CFLAGS += $(ENV_CFLAGS)
 C_SRCS = env.c memory.c pci.c init.c threads.c
 C_SRCS += pci_nvme.c pci_ioat.c pci_virtio.c pci_vmd.c
lib/env_dpdk/env.mk
@@ -78,6 +78,11 @@ ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_bus_pci.*))
 DPDK_LIB_LIST += rte_bus_pci
 endif

+# DPDK 20.05 eal dependency
+ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_telemetry.*))
+DPDK_LIB_LIST += rte_telemetry
+endif
+
 # There are some complex dependencies when using crypto, reduce or both so
 # here we add the feature specific ones and set a flag to add the common
 # ones after that.
lib/env_dpdk/memory.c
@@ -36,7 +36,6 @@
 #include "env_internal.h"

 #include <rte_config.h>
-#include <rte_malloc.h>
 #include <rte_memory.h>
 #include <rte_eal_memconfig.h>

@@ -343,11 +342,7 @@ spdk_mem_map_free(struct spdk_mem_map **pmap)
 	}

 	for (i = 0; i < sizeof(map->map_256tb.map) / sizeof(map->map_256tb.map[0]); i++) {
-		if (g_legacy_mem) {
-			rte_free(map->map_256tb.map[i]);
-		} else {
-			free(map->map_256tb.map[i]);
-		}
+		free(map->map_256tb.map[i]);
 	}

 	pthread_mutex_destroy(&map->mutex);
@@ -508,6 +503,57 @@ spdk_mem_unregister(void *vaddr, size_t len)
 	return 0;
 }

+int
+spdk_mem_reserve(void *vaddr, size_t len)
+{
+	struct spdk_mem_map *map;
+	void *seg_vaddr;
+	size_t seg_len;
+	uint64_t reg;
+
+	if ((uintptr_t)vaddr & ~MASK_256TB) {
+		DEBUG_PRINT("invalid usermode virtual address %p\n", vaddr);
+		return -EINVAL;
+	}
+
+	if (((uintptr_t)vaddr & MASK_2MB) || (len & MASK_2MB)) {
+		DEBUG_PRINT("invalid %s parameters, vaddr=%p len=%ju\n",
+			    __func__, vaddr, len);
+		return -EINVAL;
+	}
+
+	if (len == 0) {
+		return 0;
+	}
+
+	pthread_mutex_lock(&g_spdk_mem_map_mutex);
+
+	/* Check if any part of this range is already registered */
+	seg_vaddr = vaddr;
+	seg_len = len;
+	while (seg_len > 0) {
+		reg = spdk_mem_map_translate(g_mem_reg_map, (uint64_t)seg_vaddr, NULL);
+		if (reg & REG_MAP_REGISTERED) {
+			pthread_mutex_unlock(&g_spdk_mem_map_mutex);
+			return -EBUSY;
+		}
+		seg_vaddr += VALUE_2MB;
+		seg_len -= VALUE_2MB;
+	}
+
+	/* Simply set the translation to the memory map's default. This allocates the space in the
+	 * map but does not provide a valid translation. */
+	spdk_mem_map_set_translation(g_mem_reg_map, (uint64_t)vaddr, len,
+				     g_mem_reg_map->default_translation);
+
+	TAILQ_FOREACH(map, &g_spdk_mem_maps, tailq) {
+		spdk_mem_map_set_translation(map, (uint64_t)vaddr, len, map->default_translation);
+	}
+
+	pthread_mutex_unlock(&g_spdk_mem_map_mutex);
+	return 0;
+}
+
 static struct map_1gb *
 spdk_mem_map_get_map_1gb(struct spdk_mem_map *map, uint64_t vfn_2mb)
 {
@@ -527,23 +573,7 @@ spdk_mem_map_get_map_1gb(struct spdk_mem_map *map, uint64_t vfn_2mb)
 	/* Recheck to make sure nobody else got the mutex first. */
 	map_1gb = map->map_256tb.map[idx_256tb];
 	if (!map_1gb) {
-		/* Some of the existing apps use TCMalloc hugepage
-		 * allocator and register this tcmalloc allocated
-		 * hugepage memory with SPDK in the mmap hook. Since
-		 * this function is called in the spdk_mem_register
-		 * code path we can't do a malloc here otherwise that
-		 * would cause a livelock. So we use the dpdk provided
-		 * allocator instead, which avoids this cyclic
-		 * dependency. Note this is only guaranteed to work when
-		 * DPDK dynamic memory allocation is disabled (--legacy-mem),
-		 * which then is a requirement for anyone using TCMalloc in
-		 * this way.
-		 */
-		if (g_legacy_mem) {
-			map_1gb = rte_malloc(NULL, sizeof(struct map_1gb), 0);
-		} else {
-			map_1gb = malloc(sizeof(struct map_1gb));
-		}
+		map_1gb = malloc(sizeof(struct map_1gb));
 		if (map_1gb) {
 			/* initialize all entries to default translation */
 			for (i = 0; i < SPDK_COUNTOF(map_1gb->map); i++) {
@@ -778,14 +808,23 @@ static TAILQ_HEAD(, spdk_vtophys_pci_device) g_vtophys_pci_devices =
 	TAILQ_HEAD_INITIALIZER(g_vtophys_pci_devices);

 static struct spdk_mem_map *g_vtophys_map;
+static struct spdk_mem_map *g_phys_ref_map;

 #if SPDK_VFIO_ENABLED
 static int
 vtophys_iommu_map_dma(uint64_t vaddr, uint64_t iova, uint64_t size)
 {
 	struct spdk_vfio_dma_map *dma_map;
+	uint64_t refcount;
 	int ret;

+	refcount = spdk_mem_map_translate(g_phys_ref_map, iova, NULL);
+	assert(refcount < UINT64_MAX);
+	if (refcount > 0) {
+		spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount + 1);
+		return 0;
+	}
+
 	dma_map = calloc(1, sizeof(*dma_map));
 	if (dma_map == NULL) {
 		return -ENOMEM;
@@ -832,6 +871,7 @@ vtophys_iommu_map_dma(uint64_t vaddr, uint64_t iova, uint64_t size)
 out_insert:
 	TAILQ_INSERT_TAIL(&g_vfio.maps, dma_map, tailq);
 	pthread_mutex_unlock(&g_vfio.mutex);
+	spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount + 1);
 	return 0;
 }

@@ -839,6 +879,7 @@ static int
 vtophys_iommu_unmap_dma(uint64_t iova, uint64_t size)
 {
 	struct spdk_vfio_dma_map *dma_map;
+	uint64_t refcount;
 	int ret;

 	pthread_mutex_lock(&g_vfio.mutex);
@@ -854,6 +895,18 @@ vtophys_iommu_unmap_dma(uint64_t iova, uint64_t size)
 		return -ENXIO;
 	}

+	refcount = spdk_mem_map_translate(g_phys_ref_map, iova, NULL);
+	assert(refcount < UINT64_MAX);
+	if (refcount > 0) {
+		spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount - 1);
+	}
+
+	/* We still have outstanding references, don't clear it. */
+	if (refcount > 1) {
+		pthread_mutex_unlock(&g_vfio.mutex);
+		return 0;
+	}
+
 	/** don't support partial or multiple-page unmap for now */
 	assert(dma_map->map.size == size);

@@ -1383,10 +1436,21 @@ spdk_vtophys_init(void)
 		.are_contiguous = vtophys_check_contiguous_entries,
 	};

+	const struct spdk_mem_map_ops phys_ref_map_ops = {
+		.notify_cb = NULL,
+		.are_contiguous = NULL,
+	};
+
 #if SPDK_VFIO_ENABLED
 	spdk_vtophys_iommu_init();
 #endif

+	g_phys_ref_map = spdk_mem_map_alloc(0, &phys_ref_map_ops, NULL);
+	if (g_phys_ref_map == NULL) {
+		DEBUG_PRINT("phys_ref map allocation failed.\n");
+		return -ENOMEM;
+	}
+
 	g_vtophys_map = spdk_mem_map_alloc(SPDK_VTOPHYS_ERROR, &vtophys_map_ops, NULL);
 	if (g_vtophys_map == NULL) {
 		DEBUG_PRINT("vtophys map allocation failed\n")
lib/env_dpdk/pci.c
@@ -62,15 +62,7 @@ spdk_map_bar_rte(struct spdk_pci_device *device, uint32_t bar,
 	struct rte_pci_device *dev = device->dev_handle;

 	*mapped_addr = dev->mem_resource[bar].addr;
-	if (*mapped_addr == NULL) {
-		return -1;
-	}
-
 	*phys_addr = (uint64_t)dev->mem_resource[bar].phys_addr;
-	if (*phys_addr == 0) {
-		return -1;
-	}
-
 	*size = (uint64_t)dev->mem_resource[bar].len;

 	return 0;
@@ -141,8 +133,8 @@ spdk_detach_rte(struct spdk_pci_device *dev)
 	dev->internal.pending_removal = true;
 	if (spdk_process_is_primary() && !pthread_equal(g_dpdk_tid, pthread_self())) {
 		rte_eal_alarm_set(1, spdk_detach_rte_cb, rte_dev);
-		/* wait up to 20ms for the cb to start executing */
-		for (i = 20; i > 0; i--) {
+		/* wait up to 2s for the cb to finish executing */
+		for (i = 2000; i > 0; i--) {

 			spdk_delay_us(1000);
 			pthread_mutex_lock(&g_pci_mutex);
@@ -157,7 +149,7 @@ spdk_detach_rte(struct spdk_pci_device *dev)
 		/* besides checking the removed flag, we also need to wait
 		 * for the dpdk detach function to unwind, as it's doing some
 		 * operations even after calling our detach callback. Simply
-		 * cancell the alarm - if it started executing already, this
+		 * cancel the alarm - if it started executing already, this
 		 * call will block and wait for it to finish.
 		 */
 		rte_eal_alarm_cancel(spdk_detach_rte_cb, rte_dev);
@@ -171,6 +163,8 @@ spdk_detach_rte(struct spdk_pci_device *dev)
 		if (!removed) {
 			fprintf(stderr, "Timeout waiting for DPDK to remove PCI device %s.\n",
 				rte_dev->name);
+			/* If we reach this state, then the device couldn't be removed and most likely
+			   a subsequent hot add of a device in the same BDF will fail */
 		}
 	} else {
 		spdk_detach_rte_cb(rte_dev);
lib/iscsi/iscsi.c
@@ -1138,6 +1138,11 @@ iscsi_conn_login_pdu_success_complete(void *arg)
 {
 	struct spdk_iscsi_conn *conn = arg;

+	if (conn->state >= ISCSI_CONN_STATE_EXITING) {
+		/* Connection is being exited before this callback is executed. */
+		SPDK_DEBUGLOG(SPDK_LOG_ISCSI, "Connection is already exited.\n");
+		return;
+	}
 	if (conn->full_feature) {
 		if (iscsi_conn_params_update(conn) != 0) {
 			return;
lib/nvme/nvme.c
@@ -34,6 +34,7 @@
 #include "spdk/nvmf_spec.h"
 #include "nvme_internal.h"
 #include "nvme_io_msg.h"
+#include "nvme_uevent.h"

 #define SPDK_NVME_DRIVER_NAME "spdk_nvme_driver"

@@ -350,10 +351,19 @@ nvme_robust_mutex_init_shared(pthread_mutex_t *mtx)
 int
 nvme_driver_init(void)
 {
+	static pthread_mutex_t g_init_mutex = PTHREAD_MUTEX_INITIALIZER;
 	int ret = 0;
 	/* Any socket ID */
 	int socket_id = -1;

+	/* Use a special process-private mutex to ensure the global
+	 * nvme driver object (g_spdk_nvme_driver) gets initialized by
+	 * only one thread. Once that object is established and its
+	 * mutex is initialized, we can unlock this mutex and use that
+	 * one instead.
+	 */
+	pthread_mutex_lock(&g_init_mutex);
+
 	/* Each process needs its own pid. */
 	g_spdk_nvme_pid = getpid();

@@ -366,6 +376,7 @@ nvme_driver_init(void)
 	if (spdk_process_is_primary()) {
 		/* The unique named memzone already reserved. */
 		if (g_spdk_nvme_driver != NULL) {
+			pthread_mutex_unlock(&g_init_mutex);
 			return 0;
 		} else {
 			g_spdk_nvme_driver = spdk_memzone_reserve(SPDK_NVME_DRIVER_NAME,
@@ -375,7 +386,7 @@ nvme_driver_init(void)

 		if (g_spdk_nvme_driver == NULL) {
 			SPDK_ERRLOG("primary process failed to reserve memory\n");
-
+			pthread_mutex_unlock(&g_init_mutex);
 			return -1;
 		}
 	} else {
@@ -393,15 +404,16 @@ nvme_driver_init(void)
 		}
 		if (g_spdk_nvme_driver->initialized == false) {
 			SPDK_ERRLOG("timeout waiting for primary process to init\n");
-
+			pthread_mutex_unlock(&g_init_mutex);
 			return -1;
 		}
 	} else {
 		SPDK_ERRLOG("primary process is not started yet\n");
-
+		pthread_mutex_unlock(&g_init_mutex);
 		return -1;
 	}

+	pthread_mutex_unlock(&g_init_mutex);
 	return 0;
 }

@@ -415,12 +427,21 @@ nvme_driver_init(void)
 	if (ret != 0) {
 		SPDK_ERRLOG("failed to initialize mutex\n");
 		spdk_memzone_free(SPDK_NVME_DRIVER_NAME);
+		pthread_mutex_unlock(&g_init_mutex);
 		return ret;
 	}

+	/* The lock in the shared g_spdk_nvme_driver object is now ready to
+	 * be used - so we can unlock the g_init_mutex here.
+	 */
+	pthread_mutex_unlock(&g_init_mutex);
 	nvme_robust_mutex_lock(&g_spdk_nvme_driver->lock);

 	g_spdk_nvme_driver->initialized = false;
+	g_spdk_nvme_driver->hotplug_fd = spdk_uevent_connect();
+	if (g_spdk_nvme_driver->hotplug_fd < 0) {
+		SPDK_DEBUGLOG(SPDK_LOG_NVME, "Failed to open uevent netlink socket\n");
+	}

 	TAILQ_INIT(&g_spdk_nvme_driver->shared_attached_ctrlrs);

@@ -594,6 +615,7 @@ spdk_nvme_probe_internal(struct spdk_nvme_probe_ctx *probe_ctx,
 	int rc;
 	struct spdk_nvme_ctrlr *ctrlr, *ctrlr_tmp;

+	spdk_nvme_trid_populate_transport(&probe_ctx->trid, probe_ctx->trid.trtype);
 	if (!spdk_nvme_transport_available_by_name(probe_ctx->trid.trstring)) {
 		SPDK_ERRLOG("NVMe trtype %u not available\n", probe_ctx->trid.trtype);
 		return -1;
@@ -741,7 +763,7 @@ void
 spdk_nvme_trid_populate_transport(struct spdk_nvme_transport_id *trid,
 				  enum spdk_nvme_transport_type trtype)
 {
-	const char *trstring;
+	const char *trstring = "";

 	trid->trtype = trtype;
 	switch (trtype) {
@@ -760,7 +782,8 @@ spdk_nvme_trid_populate_transport(struct spdk_nvme_transport_id *trid,
 	case SPDK_NVME_TRANSPORT_CUSTOM:
 	default:
 		SPDK_ERRLOG("don't use this for custom transports\n");
-		break;
+		assert(0);
+		return;
 	}
 	snprintf(trid->trstring, SPDK_NVMF_TRSTRING_MAX_LEN, "%s", trstring);
 }
lib/nvme/nvme_ctrlr.c
@@ -2618,6 +2618,7 @@ nvme_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)

 	SPDK_DEBUGLOG(SPDK_LOG_NVME, "Prepare to destruct SSD: %s\n", ctrlr->trid.traddr);

+	spdk_nvme_qpair_process_completions(ctrlr->adminq, 0);
 	nvme_transport_admin_qpair_abort_aers(ctrlr->adminq);

 	TAILQ_FOREACH_SAFE(qpair, &ctrlr->active_io_qpairs, tailq, tmp) {
lib/nvme/nvme_cuse.c
@@ -60,6 +60,7 @@ struct cuse_device {
 	TAILQ_ENTRY(cuse_device)	tailq;
 };

+static pthread_mutex_t g_cuse_mtx = PTHREAD_MUTEX_INITIALIZER;
 static TAILQ_HEAD(, cuse_device) g_ctrlr_ctx_head = TAILQ_HEAD_INITIALIZER(g_ctrlr_ctx_head);
 static struct spdk_bit_array *g_ctrlr_started;

@@ -700,13 +701,14 @@ cuse_nvme_ns_start(struct cuse_device *ctrlr_device, uint32_t nsid, const char *dev_path)
 	if (rv < 0) {
 		SPDK_ERRLOG("Device name too long.\n");
 		free(ns_device);
-		return -1;
+		return -ENAMETOOLONG;
 	}

-	if (pthread_create(&ns_device->tid, NULL, cuse_thread, ns_device)) {
+	rv = pthread_create(&ns_device->tid, NULL, cuse_thread, ns_device);
+	if (rv != 0) {
 		SPDK_ERRLOG("pthread_create failed\n");
 		free(ns_device);
-		return -1;
+		return -rv;
 	}

 	TAILQ_INSERT_TAIL(&ctrlr_device->ns_devices, ns_device, tailq);
@@ -811,7 +813,7 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
 		g_ctrlr_started = spdk_bit_array_create(128);
 		if (g_ctrlr_started == NULL) {
 			SPDK_ERRLOG("Cannot create bit array\n");
-			return -1;
+			return -ENOMEM;
 		}
 	}

@@ -843,9 +845,10 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
 	snprintf(ctrlr_device->dev_name, sizeof(ctrlr_device->dev_name), "spdk/nvme%d",
 		 ctrlr_device->index);

-	if (pthread_create(&ctrlr_device->tid, NULL, cuse_thread, ctrlr_device)) {
+	rv = pthread_create(&ctrlr_device->tid, NULL, cuse_thread, ctrlr_device);
+	if (rv != 0) {
 		SPDK_ERRLOG("pthread_create failed\n");
-		rv = -1;
+		rv = -rv;
 		goto err3;
 	}
 	TAILQ_INSERT_TAIL(&g_ctrlr_ctx_head, ctrlr_device, tailq);
@@ -857,10 +860,10 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
 			continue;
 		}

-		if (cuse_nvme_ns_start(ctrlr_device, nsid, ctrlr_device->dev_name) < 0) {
+		rv = cuse_nvme_ns_start(ctrlr_device, nsid, ctrlr_device->dev_name);
+		if (rv < 0) {
 			SPDK_ERRLOG("Cannot start CUSE namespace device.");
 			cuse_nvme_ctrlr_stop(ctrlr_device);
-			rv = -1;
 			goto err3;
 		}
 	}
@@ -877,10 +880,10 @@ err2:
 	return rv;
 }

-static void
-nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
+static struct cuse_device *
+nvme_cuse_get_cuse_ctrlr_device(struct spdk_nvme_ctrlr *ctrlr)
 {
-	struct cuse_device *ctrlr_device;
+	struct cuse_device *ctrlr_device = NULL;

 	TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
 		if (ctrlr_device->ctrlr == ctrlr) {
@@ -888,12 +891,46 @@ nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
 		}
 	}

+	return ctrlr_device;
+}
+
+static struct cuse_device *
+nvme_cuse_get_cuse_ns_device(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid)
+{
+	struct cuse_device *ctrlr_device = NULL;
+	struct cuse_device *ns_device = NULL;
+
+	ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
+	if (!ctrlr_device) {
+		return NULL;
+	}
+
+	TAILQ_FOREACH(ns_device, &ctrlr_device->ns_devices, tailq) {
+		if (ns_device->nsid == nsid) {
+			break;
+		}
+	}
+
+	return ns_device;
+}
+
+static void
+nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
+{
+	struct cuse_device *ctrlr_device;
+
+	pthread_mutex_lock(&g_cuse_mtx);
+
+	ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
 	if (!ctrlr_device) {
 		SPDK_ERRLOG("Cannot find associated CUSE device\n");
+		pthread_mutex_unlock(&g_cuse_mtx);
 		return;
 	}

 	cuse_nvme_ctrlr_stop(ctrlr_device);
+
+	pthread_mutex_unlock(&g_cuse_mtx);
 }

 static struct nvme_io_msg_producer cuse_nvme_io_msg_producer = {
@@ -911,18 +948,35 @@ spdk_nvme_cuse_register(struct spdk_nvme_ctrlr *ctrlr)
 		return rc;
 	}

+	pthread_mutex_lock(&g_cuse_mtx);
+
 	rc = nvme_cuse_start(ctrlr);
 	if (rc) {
 		nvme_io_msg_ctrlr_unregister(ctrlr, &cuse_nvme_io_msg_producer);
 	}

+	pthread_mutex_unlock(&g_cuse_mtx);
+
 	return rc;
 }

 void
 spdk_nvme_cuse_unregister(struct spdk_nvme_ctrlr *ctrlr)
 {
-	nvme_cuse_stop(ctrlr);
+	struct cuse_device *ctrlr_device;
+
+	pthread_mutex_lock(&g_cuse_mtx);
+
+	ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
+	if (!ctrlr_device) {
+		SPDK_ERRLOG("Cannot find associated CUSE device\n");
+		pthread_mutex_unlock(&g_cuse_mtx);
+		return;
+	}
+
+	cuse_nvme_ctrlr_stop(ctrlr_device);
+
+	pthread_mutex_unlock(&g_cuse_mtx);
+
 	nvme_io_msg_ctrlr_unregister(ctrlr, &cuse_nvme_io_msg_producer);
 }
@@ -932,20 +986,15 @@ spdk_nvme_cuse_get_ctrlr_name(struct spdk_nvme_ctrlr *ctrlr)
 {
 	struct cuse_device *ctrlr_device;

-	if (TAILQ_EMPTY(&g_ctrlr_ctx_head)) {
-		return NULL;
-	}
-
-	TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
-		if (ctrlr_device->ctrlr == ctrlr) {
-			break;
-		}
-	}
+	pthread_mutex_lock(&g_cuse_mtx);

+	ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
 	if (!ctrlr_device) {
+		pthread_mutex_unlock(&g_cuse_mtx);
 		return NULL;
 	}

+	pthread_mutex_unlock(&g_cuse_mtx);
 	return ctrlr_device->dev_name;
 }

@@ -953,31 +1002,15 @@ char *
 spdk_nvme_cuse_get_ns_name(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid)
 {
 	struct cuse_device *ns_device;
-	struct cuse_device *ctrlr_device;

-	if (TAILQ_EMPTY(&g_ctrlr_ctx_head)) {
-		return NULL;
-	}
-
-	TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
-		if (ctrlr_device->ctrlr == ctrlr) {
-			break;
-		}
-	}
-
-	if (!ctrlr_device) {
-		return NULL;
-	}
-
-	TAILQ_FOREACH(ns_device, &ctrlr_device->ns_devices, tailq) {
-		if (ns_device->nsid == nsid) {
-			break;
-		}
-	}
+	pthread_mutex_lock(&g_cuse_mtx);

+	ns_device = nvme_cuse_get_cuse_ns_device(ctrlr, nsid);
 	if (!ns_device) {
+		pthread_mutex_unlock(&g_cuse_mtx);
 		return NULL;
 	}

+	pthread_mutex_unlock(&g_cuse_mtx);
 	return ns_device->dev_name;
 }
lib/nvme/nvme_internal.h
@@ -767,6 +767,9 @@ struct nvme_driver {

 	bool			initialized;
 	struct spdk_uuid	default_extended_host_id;
+
+	/** netlink socket fd for hotplug messages */
+	int			hotplug_fd;
 };

 extern struct nvme_driver *g_spdk_nvme_driver;
lib/nvme/nvme_io_msg.c
@@ -64,6 +64,7 @@ nvme_io_msg_send(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid, spdk_nvme_io_msg_fn fn,
 	rc = spdk_ring_enqueue(ctrlr->external_io_msgs, (void **)&io, 1, NULL);
 	if (rc != 1) {
 		assert(false);
+		free(io);
 		pthread_mutex_unlock(&ctrlr->external_io_msgs_lock);
 		return -ENOMEM;
 	}
@@ -106,6 +107,20 @@ spdk_nvme_io_msg_process(struct spdk_nvme_ctrlr *ctrlr)
 	return count;
 }

+static bool
+nvme_io_msg_is_producer_registered(struct spdk_nvme_ctrlr *ctrlr,
+				   struct nvme_io_msg_producer *io_msg_producer)
+{
+	struct nvme_io_msg_producer *tmp;
+
+	STAILQ_FOREACH(tmp, &ctrlr->io_producers, link) {
+		if (tmp == io_msg_producer) {
+			return true;
+		}
+	}
+	return false;
+}
+
 int
 nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
 			   struct nvme_io_msg_producer *io_msg_producer)
@@ -115,6 +130,10 @@ nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
 		return -EINVAL;
 	}

+	if (nvme_io_msg_is_producer_registered(ctrlr, io_msg_producer)) {
+		return -EEXIST;
+	}
+
 	if (!STAILQ_EMPTY(&ctrlr->io_producers) || ctrlr->is_resetting) {
 		/* There are registered producers - IO messaging already started */
 		STAILQ_INSERT_TAIL(&ctrlr->io_producers, io_msg_producer, link);
@@ -136,7 +155,8 @@ nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
 	if (ctrlr->external_io_msgs_qpair == NULL) {
 		SPDK_ERRLOG("spdk_nvme_ctrlr_alloc_io_qpair() failed\n");
 		spdk_ring_free(ctrlr->external_io_msgs);
-		return -1;
+		ctrlr->external_io_msgs = NULL;
+		return -ENOMEM;
 	}

 	STAILQ_INSERT_TAIL(&ctrlr->io_producers, io_msg_producer, link);
@@ -157,6 +177,7 @@ nvme_io_msg_ctrlr_detach(struct spdk_nvme_ctrlr *ctrlr)

 	if (ctrlr->external_io_msgs) {
 		spdk_ring_free(ctrlr->external_io_msgs);
+		ctrlr->external_io_msgs = NULL;
 	}

 	if (ctrlr->external_io_msgs_qpair) {
@@ -173,6 +194,10 @@ nvme_io_msg_ctrlr_unregister(struct spdk_nvme_ctrlr *ctrlr,
 {
 	assert(io_msg_producer != NULL);

+	if (!nvme_io_msg_is_producer_registered(ctrlr, io_msg_producer)) {
+		return;
+	}
+
 	STAILQ_REMOVE(&ctrlr->io_producers, io_msg_producer, nvme_io_msg_producer, link);
 	if (STAILQ_EMPTY(&ctrlr->io_producers)) {
 		nvme_io_msg_ctrlr_detach(ctrlr);
@@ -212,7 +212,6 @@ static int nvme_pcie_qpair_destroy(struct spdk_nvme_qpair *qpair);
 
 __thread struct nvme_pcie_ctrlr *g_thread_mmio_ctrlr = NULL;
 static uint16_t g_signal_lock;
 static bool g_sigset = false;
-static int g_hotplug_fd = -1;
 
 static void
 nvme_sigbus_fault_sighandler(int signum, siginfo_t *info, void *ctx)
@@ -271,7 +270,11 @@ _nvme_pcie_hotplug_monitor(struct spdk_nvme_probe_ctx *probe_ctx)
 	union spdk_nvme_csts_register csts;
 	struct spdk_nvme_ctrlr_process *proc;
 
-	while (spdk_get_uevent(g_hotplug_fd, &event) > 0) {
+	if (g_spdk_nvme_driver->hotplug_fd < 0) {
+		return 0;
+	}
+
+	while (spdk_get_uevent(g_spdk_nvme_driver->hotplug_fd, &event) > 0) {
 		if (event.subsystem == SPDK_NVME_UEVENT_SUBSYSTEM_UIO ||
 		    event.subsystem == SPDK_NVME_UEVENT_SUBSYSTEM_VFIO) {
 			if (event.action == SPDK_NVME_UEVENT_ADD) {
@@ -768,14 +771,7 @@ nvme_pcie_ctrlr_scan(struct spdk_nvme_probe_ctx *probe_ctx,
 
 	/* Only the primary process can monitor hotplug. */
 	if (spdk_process_is_primary()) {
-		if (g_hotplug_fd < 0) {
-			g_hotplug_fd = spdk_uevent_connect();
-			if (g_hotplug_fd < 0) {
-				SPDK_DEBUGLOG(SPDK_LOG_NVME, "Failed to open uevent netlink socket\n");
-			}
-		} else {
-			_nvme_pcie_hotplug_monitor(probe_ctx);
-		}
+		_nvme_pcie_hotplug_monitor(probe_ctx);
 	}
 
 	if (enum_ctx.has_pci_addr == false) {
@@ -828,7 +824,6 @@ struct spdk_nvme_ctrlr *nvme_pcie_ctrlr_construct(const struct spdk_nvme_transpo
 
 	pctrlr->is_remapped = false;
 	pctrlr->ctrlr.is_removed = false;
-	spdk_nvme_trid_populate_transport(&pctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_PCIE);
 	pctrlr->devhandle = devhandle;
 	pctrlr->ctrlr.opts = *opts;
 	memcpy(&pctrlr->ctrlr.trid, trid, sizeof(pctrlr->ctrlr.trid));
@@ -997,7 +992,8 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
 	volatile uint32_t *doorbell_base;
 	uint64_t offset;
 	uint16_t num_trackers;
-	size_t page_align = VALUE_2MB;
+	size_t page_align = sysconf(_SC_PAGESIZE);
+	size_t queue_align, queue_len;
 	uint32_t flags = SPDK_MALLOC_DMA;
 	uint64_t sq_paddr = 0;
 	uint64_t cq_paddr = 0;
@@ -1035,7 +1031,7 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
 	/* cmd and cpl rings must be aligned on page size boundaries. */
 	if (ctrlr->opts.use_cmb_sqs) {
 		if (nvme_pcie_ctrlr_alloc_cmb(ctrlr, pqpair->num_entries * sizeof(struct spdk_nvme_cmd),
-					      sysconf(_SC_PAGESIZE), &offset) == 0) {
+					      page_align, &offset) == 0) {
 			pqpair->cmd = pctrlr->cmb_bar_virt_addr + offset;
 			pqpair->cmd_bus_addr = pctrlr->cmb_bar_phys_addr + offset;
 			pqpair->sq_in_cmb = true;
@@ -1049,9 +1045,9 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
 	/* To ensure physical address contiguity we make each ring occupy
 	 * a single hugepage only. See MAX_IO_QUEUE_ENTRIES.
 	 */
-	pqpair->cmd = spdk_zmalloc(pqpair->num_entries * sizeof(struct spdk_nvme_cmd),
-				   page_align, NULL,
-				   SPDK_ENV_SOCKET_ID_ANY, flags);
+	queue_len = pqpair->num_entries * sizeof(struct spdk_nvme_cmd);
+	queue_align = spdk_max(spdk_align32pow2(queue_len), page_align);
+	pqpair->cmd = spdk_zmalloc(queue_len, queue_align, NULL, SPDK_ENV_SOCKET_ID_ANY, flags);
 	if (pqpair->cmd == NULL) {
 		SPDK_ERRLOG("alloc qpair_cmd failed\n");
 		return -ENOMEM;
@@ -1072,9 +1068,9 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
 	if (pqpair->cq_vaddr) {
 		pqpair->cpl = pqpair->cq_vaddr;
 	} else {
-		pqpair->cpl = spdk_zmalloc(pqpair->num_entries * sizeof(struct spdk_nvme_cpl),
-					   page_align, NULL,
-					   SPDK_ENV_SOCKET_ID_ANY, flags);
+		queue_len = pqpair->num_entries * sizeof(struct spdk_nvme_cpl);
+		queue_align = spdk_max(spdk_align32pow2(queue_len), page_align);
+		pqpair->cpl = spdk_zmalloc(queue_len, queue_align, NULL, SPDK_ENV_SOCKET_ID_ANY, flags);
 	if (pqpair->cpl == NULL) {
 		SPDK_ERRLOG("alloc qpair_cpl failed\n");
 		return -ENOMEM;
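The alignment change above is compact but easy to misread. As an illustration only, the following standalone sketch (hypothetical queue depth and page size, plain libc instead of the SPDK env layer) mirrors the rounding helper and shows why aligning a ring to the next power of two at least as large as the ring itself keeps it from straddling a hugepage boundary.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the behavior of spdk_align32pow2(): round up to the next power of two. */
static uint32_t
align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

int
main(void)
{
	uint32_t num_entries = 1024;	/* hypothetical queue depth */
	uint32_t entry_size = 64;	/* size of one submission queue entry */
	uint32_t queue_len = num_entries * entry_size;	/* 65536 bytes */
	uint32_t page_align = 4096;	/* typical sysconf(_SC_PAGESIZE) result */
	uint32_t queue_align = align32pow2(queue_len);

	if (queue_align < page_align) {
		queue_align = page_align;
	}

	/* A 65536-byte ring at 65536-byte alignment starts at a multiple of its
	 * own length, so it cannot cross a 2 MiB hugepage boundary. */
	printf("queue_len=%u queue_align=%u\n", queue_len, queue_align);
	assert(queue_align % page_align == 0);
	return 0;
}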
@@ -207,7 +207,7 @@ static const struct nvme_string generic_status[] = {
 static const struct nvme_string command_specific_status[] = {
 	{ SPDK_NVME_SC_COMPLETION_QUEUE_INVALID, "INVALID COMPLETION QUEUE" },
 	{ SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER, "INVALID QUEUE IDENTIFIER" },
-	{ SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED, "MAX QUEUE SIZE EXCEEDED" },
+	{ SPDK_NVME_SC_INVALID_QUEUE_SIZE, "INVALID QUEUE SIZE" },
 	{ SPDK_NVME_SC_ABORT_COMMAND_LIMIT_EXCEEDED, "ABORT CMD LIMIT EXCEEDED" },
 	{ SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED, "ASYNC LIMIT EXCEEDED" },
 	{ SPDK_NVME_SC_INVALID_FIRMWARE_SLOT, "INVALID FIRMWARE SLOT" },
@@ -575,6 +575,7 @@ nvme_qpair_deinit(struct spdk_nvme_qpair *qpair)
 {
 	struct nvme_error_cmd *cmd, *entry;
 
+	nvme_qpair_abort_queued_reqs(qpair, 1);
 	nvme_qpair_complete_error_reqs(qpair);
 
 	TAILQ_FOREACH_SAFE(cmd, &qpair->err_cmd_head, link, entry) {
@@ -127,6 +127,12 @@ struct spdk_nvme_recv_wr_list {
 	struct ibv_recv_wr *last;
 };
 
+/* Memory regions */
+union nvme_rdma_mr {
+	struct ibv_mr *mr;
+	uint64_t key;
+};
+
 /* NVMe RDMA qpair extensions for spdk_nvme_qpair */
 struct nvme_rdma_qpair {
 	struct spdk_nvme_qpair qpair;
@@ -143,18 +149,19 @@ struct nvme_rdma_qpair {
 
 	uint16_t num_entries;
 
+	bool delay_cmd_submit;
+
 	/* Parallel arrays of response buffers + response SGLs of size num_entries */
 	struct ibv_sge *rsp_sgls;
 	struct spdk_nvme_cpl *rsps;
 
 	struct ibv_recv_wr *rsp_recv_wrs;
 
-	bool delay_cmd_submit;
 	struct spdk_nvme_send_wr_list sends_to_post;
 	struct spdk_nvme_recv_wr_list recvs_to_post;
 
 	/* Memory region describing all rsps for this qpair */
-	struct ibv_mr *rsp_mr;
+	union nvme_rdma_mr rsp_mr;
 
 	/*
 	 * Array of num_entries NVMe commands registered as RDMA message buffers.
@@ -163,7 +170,7 @@ struct nvme_rdma_qpair {
 	struct spdk_nvmf_cmd *cmds;
 
 	/* Memory region describing all cmds for this qpair */
-	struct ibv_mr *cmd_mr;
+	union nvme_rdma_mr cmd_mr;
 
 	struct spdk_nvme_rdma_mr_map *mr_map;
 
@@ -174,8 +181,19 @@ struct nvme_rdma_qpair {
 	struct rdma_cm_event *evt;
 };
 
+enum NVME_RDMA_COMPLETION_FLAGS {
+	NVME_RDMA_SEND_COMPLETED = 1u << 0,
+	NVME_RDMA_RECV_COMPLETED = 1u << 1,
+};
+
 struct spdk_nvme_rdma_req {
-	int id;
+	uint16_t id;
+	uint16_t completion_flags: 2;
+	uint16_t reserved: 14;
+	/* if completion of RDMA_RECV received before RDMA_SEND, we will complete nvme request
+	 * during processing of RDMA_SEND. To complete the request we must know the index
+	 * of nvme_cpl received in RDMA_RECV, so store it in this field */
+	uint16_t rsp_idx;
 
 	struct ibv_send_wr send_wr;
@@ -184,8 +202,6 @@ struct spdk_nvme_rdma_req {
 	struct ibv_sge send_sgl[NVME_RDMA_DEFAULT_TX_SGE];
 
 	TAILQ_ENTRY(spdk_nvme_rdma_req) link;
-
-	bool request_ready_to_put;
 };
 
 static const char *rdma_cm_event_str[] = {
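The `completion_flags` bits above replace the old `request_ready_to_put` boolean: a request is finished only after both its RDMA send completion and the matching receive completion have arrived, in either order. A minimal sketch of that rendezvous, lifted out of any RDMA context (all names here are hypothetical):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum {
	SEND_COMPLETED = 1u << 0,
	RECV_COMPLETED = 1u << 1,
};

struct request {
	uint16_t completion_flags: 2;
	uint16_t rsp_idx;
};

/* Record one completion; returns true only when it is the second of the pair. */
static bool
on_completion(struct request *req, uint16_t flag)
{
	req->completion_flags |= flag;
	return (req->completion_flags & (SEND_COMPLETED | RECV_COMPLETED)) ==
	       (SEND_COMPLETED | RECV_COMPLETED);
}

int
main(void)
{
	struct request req = {0};

	/* The receive may beat the send completion; only the second event
	 * "reaps" the request, exactly once. */
	printf("recv first: %d\n", on_completion(&req, RECV_COMPLETED)); /* prints 0 */
	printf("then send:  %d\n", on_completion(&req, SEND_COMPLETED)); /* prints 1 */
	return 0;
}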
@@ -210,6 +226,26 @@ static const char *rdma_cm_event_str[] = {
 static LIST_HEAD(, spdk_nvme_rdma_mr_map) g_rdma_mr_maps = LIST_HEAD_INITIALIZER(&g_rdma_mr_maps);
 static pthread_mutex_t g_rdma_mr_maps_mutex = PTHREAD_MUTEX_INITIALIZER;
 
+static inline void *
+nvme_rdma_calloc(size_t nmemb, size_t size)
+{
+	if (!g_nvme_hooks.get_rkey) {
+		return calloc(nmemb, size);
+	} else {
+		return spdk_zmalloc(nmemb * size, 0, NULL, SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
+	}
+}
+
+static inline void
+nvme_rdma_free(void *buf)
+{
+	if (!g_nvme_hooks.get_rkey) {
+		free(buf);
+	} else {
+		spdk_free(buf);
+	}
+}
+
 int nvme_rdma_ctrlr_delete_io_qpair(struct spdk_nvme_ctrlr *ctrlr,
 				    struct spdk_nvme_qpair *qpair);
 
@@ -244,7 +280,8 @@ nvme_rdma_req_get(struct nvme_rdma_qpair *rqpair)
 static void
 nvme_rdma_req_put(struct nvme_rdma_qpair *rqpair, struct spdk_nvme_rdma_req *rdma_req)
 {
-	rdma_req->request_ready_to_put = false;
+	rdma_req->completion_flags = 0;
+	rdma_req->req = NULL;
 	TAILQ_REMOVE(&rqpair->outstanding_reqs, rdma_req, link);
 	TAILQ_INSERT_HEAD(&rqpair->free_reqs, rdma_req, link);
 }
@@ -614,23 +651,66 @@ nvme_rdma_post_recv(struct nvme_rdma_qpair *rqpair, uint16_t rsp_idx)
 	return nvme_rdma_qpair_queue_recv_wr(rqpair, wr);
 }
 
+static int
+nvme_rdma_reg_mr(struct rdma_cm_id *cm_id, union nvme_rdma_mr *mr, void *mem, size_t length)
+{
+	if (!g_nvme_hooks.get_rkey) {
+		mr->mr = rdma_reg_msgs(cm_id, mem, length);
+		if (mr->mr == NULL) {
+			SPDK_ERRLOG("Unable to register mr: %s (%d)\n",
+				    spdk_strerror(errno), errno);
+			return -1;
+		}
+	} else {
+		mr->key = g_nvme_hooks.get_rkey(cm_id->pd, mem, length);
+	}
+
+	return 0;
+}
+
+static void
+nvme_rdma_dereg_mr(union nvme_rdma_mr *mr)
+{
+	if (!g_nvme_hooks.get_rkey) {
+		if (mr->mr && rdma_dereg_mr(mr->mr)) {
+			SPDK_ERRLOG("Unable to de-register mr\n");
+		}
+	} else {
+		if (mr->key) {
+			g_nvme_hooks.put_rkey(mr->key);
+		}
+	}
+	memset(mr, 0, sizeof(*mr));
+}
+
+static uint32_t
+nvme_rdma_mr_get_lkey(union nvme_rdma_mr *mr)
+{
+	uint32_t lkey;
+
+	if (!g_nvme_hooks.get_rkey) {
+		lkey = mr->mr->lkey;
+	} else {
+		lkey = *((uint64_t *) mr->key);
+	}
+
+	return lkey;
+}
+
 static void
 nvme_rdma_unregister_rsps(struct nvme_rdma_qpair *rqpair)
 {
-	if (rqpair->rsp_mr && rdma_dereg_mr(rqpair->rsp_mr)) {
-		SPDK_ERRLOG("Unable to de-register rsp_mr\n");
-	}
-	rqpair->rsp_mr = NULL;
+	nvme_rdma_dereg_mr(&rqpair->rsp_mr);
 }
 
 static void
 nvme_rdma_free_rsps(struct nvme_rdma_qpair *rqpair)
 {
-	free(rqpair->rsps);
+	nvme_rdma_free(rqpair->rsps);
 	rqpair->rsps = NULL;
-	free(rqpair->rsp_sgls);
+	nvme_rdma_free(rqpair->rsp_sgls);
 	rqpair->rsp_sgls = NULL;
-	free(rqpair->rsp_recv_wrs);
+	nvme_rdma_free(rqpair->rsp_recv_wrs);
 	rqpair->rsp_recv_wrs = NULL;
 }
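Taken together, the three helpers above give callers one code path whether memory keys come from libibverbs or from the `g_nvme_hooks` callbacks. As a hedged usage sketch only (the function and buffer names here are hypothetical), mirroring what `nvme_rdma_register_rsps()` does further down:

/* Sketch: assumes the union nvme_rdma_mr helpers above are in scope, plus a
 * connected rdma_cm_id, a response array, and a caller-provided SGE array. */
static int
register_rsp_buffers(struct rdma_cm_id *cm_id, union nvme_rdma_mr *mr,
		     struct spdk_nvme_cpl *rsps, uint16_t num_entries,
		     struct ibv_sge *sgl)
{
	uint32_t lkey;
	uint16_t i;

	if (nvme_rdma_reg_mr(cm_id, mr, rsps, num_entries * sizeof(*rsps)) < 0) {
		return -1;
	}

	/* One lkey covers the whole registered region, so every SGE shares it. */
	lkey = nvme_rdma_mr_get_lkey(mr);
	for (i = 0; i < num_entries; i++) {
		sgl[i].addr = (uint64_t)&rsps[i];
		sgl[i].length = sizeof(rsps[i]);
		sgl[i].lkey = lkey;
	}

	return 0;
}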
@@ -640,20 +720,19 @@ nvme_rdma_alloc_rsps(struct nvme_rdma_qpair *rqpair)
 	rqpair->rsps = NULL;
 	rqpair->rsp_recv_wrs = NULL;
 
-	rqpair->rsp_sgls = calloc(rqpair->num_entries, sizeof(*rqpair->rsp_sgls));
+	rqpair->rsp_sgls = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsp_sgls));
 	if (!rqpair->rsp_sgls) {
 		SPDK_ERRLOG("Failed to allocate rsp_sgls\n");
 		goto fail;
 	}
 
-	rqpair->rsp_recv_wrs = calloc(rqpair->num_entries,
-				      sizeof(*rqpair->rsp_recv_wrs));
+	rqpair->rsp_recv_wrs = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsp_recv_wrs));
 	if (!rqpair->rsp_recv_wrs) {
 		SPDK_ERRLOG("Failed to allocate rsp_recv_wrs\n");
 		goto fail;
 	}
 
-	rqpair->rsps = calloc(rqpair->num_entries, sizeof(*rqpair->rsps));
+	rqpair->rsps = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsps));
 	if (!rqpair->rsps) {
 		SPDK_ERRLOG("can not allocate rdma rsps\n");
 		goto fail;
@@ -668,22 +747,25 @@ fail:
 static int
 nvme_rdma_register_rsps(struct nvme_rdma_qpair *rqpair)
 {
-	int i, rc;
+	uint16_t i;
+	int rc;
+	uint32_t lkey;
 
-	rqpair->rsp_mr = rdma_reg_msgs(rqpair->cm_id, rqpair->rsps,
-				       rqpair->num_entries * sizeof(*rqpair->rsps));
-	if (rqpair->rsp_mr == NULL) {
-		rc = -errno;
-		SPDK_ERRLOG("Unable to register rsp_mr: %s (%d)\n", spdk_strerror(errno), errno);
+	rc = nvme_rdma_reg_mr(rqpair->cm_id, &rqpair->rsp_mr,
+			      rqpair->rsps, rqpair->num_entries * sizeof(*rqpair->rsps));
+
+	if (rc < 0) {
 		goto fail;
 	}
 
+	lkey = nvme_rdma_mr_get_lkey(&rqpair->rsp_mr);
+
 	for (i = 0; i < rqpair->num_entries; i++) {
 		struct ibv_sge *rsp_sgl = &rqpair->rsp_sgls[i];
 
 		rsp_sgl->addr = (uint64_t)&rqpair->rsps[i];
 		rsp_sgl->length = sizeof(rqpair->rsps[i]);
-		rsp_sgl->lkey = rqpair->rsp_mr->lkey;
+		rsp_sgl->lkey = lkey;
 
 		rqpair->rsp_recv_wrs[i].wr_id = i;
 		rqpair->rsp_recv_wrs[i].next = NULL;
@@ -711,10 +793,7 @@ fail:
 static void
 nvme_rdma_unregister_reqs(struct nvme_rdma_qpair *rqpair)
 {
-	if (rqpair->cmd_mr && rdma_dereg_mr(rqpair->cmd_mr)) {
-		SPDK_ERRLOG("Unable to de-register cmd_mr\n");
-	}
-	rqpair->cmd_mr = NULL;
+	nvme_rdma_dereg_mr(&rqpair->cmd_mr);
 }
 
 static void
@@ -724,25 +803,25 @@ nvme_rdma_free_reqs(struct nvme_rdma_qpair *rqpair)
 		return;
 	}
 
-	free(rqpair->cmds);
+	nvme_rdma_free(rqpair->cmds);
 	rqpair->cmds = NULL;
 
-	free(rqpair->rdma_reqs);
+	nvme_rdma_free(rqpair->rdma_reqs);
 	rqpair->rdma_reqs = NULL;
 }
 
 static int
 nvme_rdma_alloc_reqs(struct nvme_rdma_qpair *rqpair)
 {
-	int i;
+	uint16_t i;
 
-	rqpair->rdma_reqs = calloc(rqpair->num_entries, sizeof(struct spdk_nvme_rdma_req));
+	rqpair->rdma_reqs = nvme_rdma_calloc(rqpair->num_entries, sizeof(struct spdk_nvme_rdma_req));
 	if (rqpair->rdma_reqs == NULL) {
 		SPDK_ERRLOG("Failed to allocate rdma_reqs\n");
 		goto fail;
 	}
 
-	rqpair->cmds = calloc(rqpair->num_entries, sizeof(*rqpair->cmds));
+	rqpair->cmds = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->cmds));
 	if (!rqpair->cmds) {
 		SPDK_ERRLOG("Failed to allocate RDMA cmds\n");
 		goto fail;
@@ -785,16 +864,20 @@ static int
 nvme_rdma_register_reqs(struct nvme_rdma_qpair *rqpair)
 {
 	int i;
+	int rc;
+	uint32_t lkey;
 
-	rqpair->cmd_mr = rdma_reg_msgs(rqpair->cm_id, rqpair->cmds,
-				       rqpair->num_entries * sizeof(*rqpair->cmds));
-	if (!rqpair->cmd_mr) {
-		SPDK_ERRLOG("Unable to register cmd_mr\n");
+	rc = nvme_rdma_reg_mr(rqpair->cm_id, &rqpair->cmd_mr,
+			      rqpair->cmds, rqpair->num_entries * sizeof(*rqpair->cmds));
+
+	if (rc < 0) {
 		goto fail;
 	}
 
+	lkey = nvme_rdma_mr_get_lkey(&rqpair->cmd_mr);
+
 	for (i = 0; i < rqpair->num_entries; i++) {
-		rqpair->rdma_reqs[i].send_sgl[0].lkey = rqpair->cmd_mr->lkey;
+		rqpair->rdma_reqs[i].send_sgl[0].lkey = lkey;
 	}
 
 	return 0;
@@ -804,35 +887,6 @@ fail:
 	return -ENOMEM;
 }
 
-static int
-nvme_rdma_recv(struct nvme_rdma_qpair *rqpair, uint64_t rsp_idx, int *reaped)
-{
-	struct spdk_nvme_rdma_req *rdma_req;
-	struct spdk_nvme_cpl *rsp;
-	struct nvme_request *req;
-
-	assert(rsp_idx < rqpair->num_entries);
-	rsp = &rqpair->rsps[rsp_idx];
-	rdma_req = &rqpair->rdma_reqs[rsp->cid];
-
-	req = rdma_req->req;
-	nvme_rdma_req_complete(req, rsp);
-
-	if (rdma_req->request_ready_to_put) {
-		(*reaped)++;
-		nvme_rdma_req_put(rqpair, rdma_req);
-	} else {
-		rdma_req->request_ready_to_put = true;
-	}
-
-	if (nvme_rdma_post_recv(rqpair, rsp_idx)) {
-		SPDK_ERRLOG("Unable to re-post rx descriptor\n");
-		return -1;
-	}
-
-	return 0;
-}
-
 static int
 nvme_rdma_resolve_addr(struct nvme_rdma_qpair *rqpair,
 		       struct sockaddr *src_addr,
@@ -1023,9 +1077,9 @@ nvme_rdma_register_mem(struct nvme_rdma_qpair *rqpair)
 		}
 	}
 
-	mr_map = calloc(1, sizeof(*mr_map));
+	mr_map = nvme_rdma_calloc(1, sizeof(*mr_map));
 	if (mr_map == NULL) {
-		SPDK_ERRLOG("calloc() failed\n");
+		SPDK_ERRLOG("Failed to allocate mr_map\n");
 		pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
 		return -1;
 	}
@@ -1035,7 +1089,8 @@ nvme_rdma_register_mem(struct nvme_rdma_qpair *rqpair)
 	mr_map->map = spdk_mem_map_alloc((uint64_t)NULL, &nvme_rdma_map_ops, pd);
 	if (mr_map->map == NULL) {
 		SPDK_ERRLOG("spdk_mem_map_alloc() failed\n");
-		free(mr_map);
+		nvme_rdma_free(mr_map);
+
 		pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
 		return -1;
 	}
@@ -1067,7 +1122,7 @@ nvme_rdma_unregister_mem(struct nvme_rdma_qpair *rqpair)
 	if (mr_map->ref == 0) {
 		LIST_REMOVE(mr_map, link);
 		spdk_mem_map_free(&mr_map->map);
-		free(mr_map);
+		nvme_rdma_free(mr_map);
 	}
 
 	pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
@@ -1517,6 +1572,7 @@ nvme_rdma_req_init(struct nvme_rdma_qpair *rqpair, struct nvme_request *req,
 	struct spdk_nvme_ctrlr *ctrlr = rqpair->qpair.ctrlr;
 	int rc;
 
+	assert(rdma_req->req == NULL);
 	rdma_req->req = req;
 	req->cmd.cid = rdma_req->id;
 
@@ -1569,7 +1625,7 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
 	struct spdk_nvme_qpair *qpair;
 	int rc, retry_count = 0;
 
-	rqpair = calloc(1, sizeof(struct nvme_rdma_qpair));
+	rqpair = nvme_rdma_calloc(1, sizeof(struct nvme_rdma_qpair));
 	if (!rqpair) {
 		SPDK_ERRLOG("failed to get create rqpair\n");
 		return NULL;
@@ -1587,6 +1643,7 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
 	SPDK_DEBUGLOG(SPDK_LOG_NVME, "rc =%d\n", rc);
 	if (rc) {
 		SPDK_ERRLOG("Unable to allocate rqpair RDMA requests\n");
+		nvme_rdma_free(rqpair);
 		return NULL;
 	}
 	SPDK_DEBUGLOG(SPDK_LOG_NVME, "RDMA requests allocated\n");
@@ -1595,6 +1652,8 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
 	SPDK_DEBUGLOG(SPDK_LOG_NVME, "rc =%d\n", rc);
 	if (rc < 0) {
 		SPDK_ERRLOG("Unable to allocate rqpair RDMA responses\n");
+		nvme_rdma_free_reqs(rqpair);
+		nvme_rdma_free(rqpair);
 		return NULL;
 	}
 	SPDK_DEBUGLOG(SPDK_LOG_NVME, "RDMA responses allocated\n");
@@ -1686,7 +1745,7 @@ nvme_rdma_ctrlr_delete_io_qpair(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_
 
 	nvme_rdma_free_reqs(rqpair);
 	nvme_rdma_free_rsps(rqpair);
-	free(rqpair);
+	nvme_rdma_free(rqpair);
 
 	return 0;
 }
@@ -1718,20 +1777,19 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
 	struct ibv_device_attr dev_attr;
 	int i, flag, rc;
 
-	rctrlr = calloc(1, sizeof(struct nvme_rdma_ctrlr));
+	rctrlr = nvme_rdma_calloc(1, sizeof(struct nvme_rdma_ctrlr));
 	if (rctrlr == NULL) {
 		SPDK_ERRLOG("could not allocate ctrlr\n");
 		return NULL;
 	}
 
-	spdk_nvme_trid_populate_transport(&rctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_RDMA);
 	rctrlr->ctrlr.opts = *opts;
 	memcpy(&rctrlr->ctrlr.trid, trid, sizeof(rctrlr->ctrlr.trid));
 
 	contexts = rdma_get_devices(NULL);
 	if (contexts == NULL) {
 		SPDK_ERRLOG("rdma_get_devices() failed: %s (%d)\n", spdk_strerror(errno), errno);
-		free(rctrlr);
+		nvme_rdma_free(rctrlr);
 		return NULL;
 	}
 
@@ -1743,7 +1801,7 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
 	if (rc < 0) {
 		SPDK_ERRLOG("Failed to query RDMA device attributes.\n");
 		rdma_free_devices(contexts);
-		free(rctrlr);
+		nvme_rdma_free(rctrlr);
 		return NULL;
 	}
 	rctrlr->max_sge = spdk_min(rctrlr->max_sge, (uint16_t)dev_attr.max_sge);
@@ -1754,13 +1812,13 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
 
 	rc = nvme_ctrlr_construct(&rctrlr->ctrlr);
 	if (rc != 0) {
-		free(rctrlr);
+		nvme_rdma_free(rctrlr);
 		return NULL;
 	}
 
 	STAILQ_INIT(&rctrlr->pending_cm_events);
 	STAILQ_INIT(&rctrlr->free_cm_events);
-	rctrlr->cm_events = calloc(NVME_RDMA_NUM_CM_EVENTS, sizeof(*rctrlr->cm_events));
+	rctrlr->cm_events = nvme_rdma_calloc(NVME_RDMA_NUM_CM_EVENTS, sizeof(*rctrlr->cm_events));
 	if (rctrlr->cm_events == NULL) {
 		SPDK_ERRLOG("unable to allocat buffers to hold CM events.\n");
 		nvme_rdma_ctrlr_destruct(&rctrlr->ctrlr);
@@ -1834,7 +1892,7 @@ nvme_rdma_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)
 
 	STAILQ_INIT(&rctrlr->free_cm_events);
 	STAILQ_INIT(&rctrlr->pending_cm_events);
-	free(rctrlr->cm_events);
+	nvme_rdma_free(rctrlr->cm_events);
 
 	if (rctrlr->cm_channel) {
 		rdma_destroy_event_channel(rctrlr->cm_channel);
@@ -1843,7 +1901,7 @@ nvme_rdma_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)
 
 	nvme_ctrlr_destruct_finish(ctrlr);
 
-	free(rctrlr);
+	nvme_rdma_free(rctrlr);
 
 	return 0;
 }
@@ -1945,6 +2003,14 @@ nvme_rdma_qpair_check_timeout(struct spdk_nvme_qpair *qpair)
 	}
 }
 
+static inline int
+nvme_rdma_request_ready(struct nvme_rdma_qpair *rqpair, struct spdk_nvme_rdma_req *rdma_req)
+{
+	nvme_rdma_req_complete(rdma_req->req, &rqpair->rsps[rdma_req->rsp_idx]);
+	nvme_rdma_req_put(rqpair, rdma_req);
+	return nvme_rdma_post_recv(rqpair, rdma_req->rsp_idx);
+}
+
 #define MAX_COMPLETIONS_PER_POLL 128
 
 int
@@ -1954,10 +2020,12 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
 	struct nvme_rdma_qpair *rqpair = nvme_rdma_qpair(qpair);
 	struct ibv_wc wc[MAX_COMPLETIONS_PER_POLL];
 	int i, rc = 0, batch_size;
-	uint32_t reaped;
+	uint32_t reaped = 0;
+	uint16_t rsp_idx;
 	struct ibv_cq *cq;
 	struct spdk_nvme_rdma_req *rdma_req;
 	struct nvme_rdma_ctrlr *rctrlr;
+	struct spdk_nvme_cpl *rsp;
 
 	if (spdk_unlikely(nvme_rdma_qpair_submit_sends(rqpair) ||
 			  nvme_rdma_qpair_submit_recvs(rqpair))) {
@@ -1982,7 +2050,6 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
 
 	cq = rqpair->cq;
 
-	reaped = 0;
 	do {
 		batch_size = spdk_min((max_completions - reaped),
 				      MAX_COMPLETIONS_PER_POLL);
@@ -2012,20 +2079,32 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
 				goto fail;
 			}
 
-			if (nvme_rdma_recv(rqpair, wc[i].wr_id, &reaped)) {
-				SPDK_ERRLOG("nvme_rdma_recv processing failure\n");
-				goto fail;
+			assert(wc[i].wr_id < rqpair->num_entries);
+			rsp_idx = (uint16_t)wc[i].wr_id;
+			rsp = &rqpair->rsps[rsp_idx];
+			rdma_req = &rqpair->rdma_reqs[rsp->cid];
+			rdma_req->completion_flags |= NVME_RDMA_RECV_COMPLETED;
+			rdma_req->rsp_idx = rsp_idx;
+
+			if ((rdma_req->completion_flags & NVME_RDMA_SEND_COMPLETED) != 0) {
+				if (spdk_unlikely(nvme_rdma_request_ready(rqpair, rdma_req))) {
+					SPDK_ERRLOG("Unable to re-post rx descriptor\n");
+					goto fail;
+				}
+				reaped++;
 			}
 			break;
 
 		case IBV_WC_SEND:
 			rdma_req = (struct spdk_nvme_rdma_req *)wc[i].wr_id;
+			rdma_req->completion_flags |= NVME_RDMA_SEND_COMPLETED;
 
-			if (rdma_req->request_ready_to_put) {
+			if ((rdma_req->completion_flags & NVME_RDMA_RECV_COMPLETED) != 0) {
+				if (spdk_unlikely(nvme_rdma_request_ready(rqpair, rdma_req))) {
+					SPDK_ERRLOG("Unable to re-post rx descriptor\n");
+					goto fail;
+				}
 				reaped++;
-				nvme_rdma_req_put(rqpair, rdma_req);
-			} else {
-				rdma_req->request_ready_to_put = true;
 			}
 			break;
@@ -236,6 +236,11 @@ nvme_tcp_ctrlr_disconnect_qpair(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_
 	struct nvme_tcp_qpair *tqpair = nvme_tcp_qpair(qpair);
 	struct nvme_tcp_pdu *pdu;
 
+	if (nvme_qpair_get_state(qpair) == NVME_QPAIR_DISABLED) {
+		/* Already disconnecting */
+		return;
+	}
+
 	nvme_qpair_set_state(qpair, NVME_QPAIR_DISABLED);
 	spdk_sock_close(&tqpair->sock);
 
@@ -1620,7 +1625,6 @@ struct spdk_nvme_ctrlr *nvme_tcp_ctrlr_construct(const struct spdk_nvme_transpor
 
 	tctrlr->ctrlr.opts = *opts;
 	tctrlr->ctrlr.trid = *trid;
-	spdk_nvme_trid_populate_transport(&tctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_TCP);
 
 	rc = nvme_ctrlr_construct(&tctrlr->ctrlr);
 	if (rc != 0) {
@@ -2496,6 +2496,11 @@ spdk_nvmf_ctrlr_process_io_fused_cmd(struct spdk_nvmf_request *req, struct spdk_
 		/* save request of first command to generate response later */
 		req->first_fused_req = first_fused_req;
 		req->qpair->first_fused_req = NULL;
+	} else {
+		SPDK_ERRLOG("Invalid fused command fuse field.\n");
+		rsp->status.sct = SPDK_NVME_SCT_GENERIC;
+		rsp->status.sc = SPDK_NVME_SC_INVALID_FIELD;
+		return SPDK_NVMF_REQUEST_EXEC_STATUS_COMPLETE;
 	}
 
 	rc = spdk_nvmf_bdev_ctrlr_compare_and_write_cmd(bdev, desc, ch, req->first_fused_req, req);
@@ -2,7 +2,7 @@
  * BSD LICENSE
  *
  * Copyright (c) Intel Corporation. All rights reserved.
- * Copyright (c) 2018-2019 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2018-2020 Mellanox Technologies LTD. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -359,11 +359,17 @@ spdk_rpc_nvmf_subsystem_started(struct spdk_nvmf_subsystem *subsystem,
 			       void *cb_arg, int status)
 {
 	struct spdk_jsonrpc_request *request = cb_arg;
-	struct spdk_json_write_ctx *w;
 
-	w = spdk_jsonrpc_begin_result(request);
-	spdk_json_write_bool(w, true);
-	spdk_jsonrpc_end_result(request, w);
+	if (!status) {
+		struct spdk_json_write_ctx *w = spdk_jsonrpc_begin_result(request);
+
+		spdk_json_write_bool(w, true);
+		spdk_jsonrpc_end_result(request, w);
+	} else {
+		spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
+						     "Subsystem %s start failed",
+						     subsystem->subnqn);
+		spdk_nvmf_subsystem_destroy(subsystem);
+	}
 }
 
 static void
@@ -371,72 +377,77 @@ spdk_rpc_nvmf_create_subsystem(struct spdk_jsonrpc_request *request,
 			       const struct spdk_json_val *params)
 {
 	struct rpc_subsystem_create *req;
-	struct spdk_nvmf_subsystem *subsystem;
+	struct spdk_nvmf_subsystem *subsystem = NULL;
 	struct spdk_nvmf_tgt *tgt;
+	int rc = -1;
 
 	req = calloc(1, sizeof(*req));
 	if (!req) {
-		goto invalid;
+		SPDK_ERRLOG("Memory allocation failed\n");
+		spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
+						 "Memory allocation failed");
+		return;
 	}
 
 	if (spdk_json_decode_object(params, rpc_subsystem_create_decoders,
 				    SPDK_COUNTOF(rpc_subsystem_create_decoders),
 				    req)) {
 		SPDK_ERRLOG("spdk_json_decode_object failed\n");
-		goto invalid;
+		spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS, "Invalid parameters");
+		goto cleanup;
 	}
 
 	tgt = spdk_nvmf_get_tgt(req->tgt_name);
 	if (!tgt) {
-		spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
-						 "Unable to find a target.");
-		goto invalid_custom_response;
+		SPDK_ERRLOG("Unable to find target %s\n", req->tgt_name);
+		spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
+						     "Unable to find target %s", req->tgt_name);
+		goto cleanup;
 	}
 
 	subsystem = spdk_nvmf_subsystem_create(tgt, req->nqn, SPDK_NVMF_SUBTYPE_NVME,
 					       req->max_namespaces);
 	if (!subsystem) {
-		goto invalid;
+		SPDK_ERRLOG("Unable to create subsystem %s\n", req->nqn);
+		spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
+						     "Unable to create subsystem %s", req->nqn);
+		goto cleanup;
 	}
 
 	if (req->serial_number) {
 		if (spdk_nvmf_subsystem_set_sn(subsystem, req->serial_number)) {
 			SPDK_ERRLOG("Subsystem %s: invalid serial number '%s'\n", req->nqn, req->serial_number);
-			goto invalid;
+			spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS,
+							     "Invalid SN %s", req->serial_number);
+			goto cleanup;
 		}
 	}
 
 	if (req->model_number) {
 		if (spdk_nvmf_subsystem_set_mn(subsystem, req->model_number)) {
 			SPDK_ERRLOG("Subsystem %s: invalid model number '%s'\n", req->nqn, req->model_number);
-			goto invalid;
+			spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS,
+							     "Invalid MN %s", req->model_number);
+			goto cleanup;
 		}
 	}
 
 	spdk_nvmf_subsystem_set_allow_any_host(subsystem, req->allow_any_host);
 
-	free(req->nqn);
-	free(req->tgt_name);
-	free(req->serial_number);
-	free(req->model_number);
-	free(req);
-
-	spdk_nvmf_subsystem_start(subsystem,
-				  spdk_rpc_nvmf_subsystem_started,
-				  request);
-
-	return;
-
-invalid:
-	spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS, "Invalid parameters");
-invalid_custom_response:
-	if (req) {
-		free(req->nqn);
-		free(req->tgt_name);
-		free(req->serial_number);
-		free(req->model_number);
-		free(req);
-	}
+	rc = spdk_nvmf_subsystem_start(subsystem,
+				       spdk_rpc_nvmf_subsystem_started,
+				       request);
+
+cleanup:
+	free(req->nqn);
+	free(req->tgt_name);
+	free(req->serial_number);
+	free(req->model_number);
+	free(req);
+
+	if (rc && subsystem) {
+		spdk_nvmf_subsystem_destroy(subsystem);
+	}
 }
 SPDK_RPC_REGISTER("nvmf_create_subsystem", spdk_rpc_nvmf_create_subsystem, SPDK_RPC_RUNTIME)
 SPDK_RPC_REGISTER_ALIAS_DEPRECATED(nvmf_create_subsystem, nvmf_subsystem_create)
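The rewrite above funnels every failure through one `cleanup:` label so the decoded strings are freed exactly once, and uses `rc` plus the `subsystem` pointer to decide whether a half-created subsystem must be torn down. As a hedged, standalone sketch of that single-exit idiom (the names below are stand-ins, not SPDK APIs):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* parse/create/start are stand-ins for the JSON decode and subsystem calls. */
static int
create_object(const char *input)
{
	char *name = NULL;
	void *object = NULL;
	int rc = -1;

	name = strdup(input);
	if (name == NULL) {
		return -1;		/* nothing to clean up yet */
	}

	object = malloc(64);		/* stand-in for subsystem_create() */
	if (object == NULL) {
		goto cleanup;
	}

	rc = 0;				/* stand-in for subsystem_start() succeeding */

cleanup:
	free(name);			/* always freed, on every path */
	if (rc && object) {
		free(object);		/* only torn down if start never took ownership */
	}
	return rc;
}

int
main(void)
{
	printf("rc=%d\n", create_object("nqn.2020-01.io.spdk:cnode1"));
	return 0;
}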
lib/nvmf/rdma.c (185 lines changed)
@@ -2,7 +2,7 @@
  * BSD LICENSE
  *
  * Copyright (c) Intel Corporation. All rights reserved.
- * Copyright (c) 2019 Mellanox Technologies LTD. All rights reserved.
+ * Copyright (c) 2019, 2020 Mellanox Technologies LTD. All rights reserved.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
@@ -2961,14 +2961,30 @@ static const char *CM_EVENT_STR[] = {
 };
 #endif /* DEBUG */
 
+static void
+nvmf_rdma_disconnect_qpairs_on_port(struct spdk_nvmf_rdma_transport *rtransport,
+				    struct spdk_nvmf_rdma_port *port)
+{
+	struct spdk_nvmf_rdma_poll_group *rgroup;
+	struct spdk_nvmf_rdma_poller *rpoller;
+	struct spdk_nvmf_rdma_qpair *rqpair;
+
+	TAILQ_FOREACH(rgroup, &rtransport->poll_groups, link) {
+		TAILQ_FOREACH(rpoller, &rgroup->pollers, link) {
+			TAILQ_FOREACH(rqpair, &rpoller->qpairs, link) {
+				if (rqpair->listen_id == port->id) {
+					spdk_nvmf_rdma_start_disconnect(rqpair);
+				}
+			}
+		}
+	}
+}
+
 static bool
 nvmf_rdma_handle_cm_event_addr_change(struct spdk_nvmf_transport *transport,
 				      struct rdma_cm_event *event)
 {
 	struct spdk_nvme_transport_id trid;
-	struct spdk_nvmf_rdma_qpair *rqpair;
-	struct spdk_nvmf_rdma_poll_group *rgroup;
-	struct spdk_nvmf_rdma_poller *rpoller;
 	struct spdk_nvmf_rdma_port *port;
 	struct spdk_nvmf_rdma_transport *rtransport;
 	uint32_t ref, i;
@@ -2986,27 +3002,41 @@ nvmf_rdma_handle_cm_event_addr_change(struct spdk_nvmf_transport *transport,
 		}
 	}
 	if (event_acked) {
-		TAILQ_FOREACH(rgroup, &rtransport->poll_groups, link) {
-			TAILQ_FOREACH(rpoller, &rgroup->pollers, link) {
-				TAILQ_FOREACH(rqpair, &rpoller->qpairs, link) {
-					if (rqpair->listen_id == port->id) {
-						spdk_nvmf_rdma_start_disconnect(rqpair);
-					}
-				}
-			}
-		}
+		nvmf_rdma_disconnect_qpairs_on_port(rtransport, port);
 
 		for (i = 0; i < ref; i++) {
 			spdk_nvmf_rdma_stop_listen(transport, &trid);
 		}
-		for (i = 0; i < ref; i++) {
+		while (ref > 0) {
 			spdk_nvmf_rdma_listen(transport, &trid, NULL, NULL);
+			ref--;
 		}
 	}
 	return event_acked;
 }
 
+static void
+nvmf_rdma_handle_cm_event_port_removal(struct spdk_nvmf_transport *transport,
+				       struct rdma_cm_event *event)
+{
+	struct spdk_nvmf_rdma_port *port;
+	struct spdk_nvmf_rdma_transport *rtransport;
+	uint32_t ref, i;
+
+	port = event->id->context;
+	rtransport = SPDK_CONTAINEROF(transport, struct spdk_nvmf_rdma_transport, transport);
+	ref = port->ref;
+
+	SPDK_NOTICELOG("Port %s:%s is being removed\n", port->trid.traddr, port->trid.trsvcid);
+
+	nvmf_rdma_disconnect_qpairs_on_port(rtransport, port);
+
+	rdma_ack_cm_event(event);
+
+	for (i = 0; i < ref; i++) {
+		spdk_nvmf_rdma_stop_listen(transport, &port->trid);
+	}
+}
+
 static void
 spdk_nvmf_process_cm_event(struct spdk_nvmf_transport *transport, new_qpair_fn cb_fn, void *cb_arg)
 {
@@ -3024,68 +3054,87 @@ spdk_nvmf_process_cm_event(struct spdk_nvmf_transport *transport, new_qpair_fn c
 	while (1) {
 		event_acked = false;
 		rc = rdma_get_cm_event(rtransport->event_channel, &event);
-		if (rc == 0) {
-			SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Acceptor Event: %s\n", CM_EVENT_STR[event->event]);
+		if (rc) {
+			if (errno != EAGAIN && errno != EWOULDBLOCK) {
+				SPDK_ERRLOG("Acceptor Event Error: %s\n", spdk_strerror(errno));
+			}
+			break;
+		}
+
+		SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Acceptor Event: %s\n", CM_EVENT_STR[event->event]);
 
 		spdk_trace_record(TRACE_RDMA_CM_ASYNC_EVENT, 0, 0, 0, event->event);
 
 		switch (event->event) {
 		case RDMA_CM_EVENT_ADDR_RESOLVED:
 		case RDMA_CM_EVENT_ADDR_ERROR:
 		case RDMA_CM_EVENT_ROUTE_RESOLVED:
 		case RDMA_CM_EVENT_ROUTE_ERROR:
 			/* No action required. The target never attempts to resolve routes. */
 			break;
 		case RDMA_CM_EVENT_CONNECT_REQUEST:
 			rc = nvmf_rdma_connect(transport, event, cb_fn, cb_arg);
 			if (rc < 0) {
 				SPDK_ERRLOG("Unable to process connect event. rc: %d\n", rc);
 				break;
 			}
 			break;
 		case RDMA_CM_EVENT_CONNECT_RESPONSE:
 			/* The target never initiates a new connection. So this will not occur. */
 			break;
 		case RDMA_CM_EVENT_CONNECT_ERROR:
 			/* Can this happen? The docs say it can, but not sure what causes it. */
 			break;
 		case RDMA_CM_EVENT_UNREACHABLE:
 		case RDMA_CM_EVENT_REJECTED:
 			/* These only occur on the client side. */
 			break;
 		case RDMA_CM_EVENT_ESTABLISHED:
 			/* TODO: Should we be waiting for this event anywhere? */
 			break;
 		case RDMA_CM_EVENT_DISCONNECTED:
-		case RDMA_CM_EVENT_DEVICE_REMOVAL:
 			rc = nvmf_rdma_disconnect(event);
 			if (rc < 0) {
 				SPDK_ERRLOG("Unable to process disconnect event. rc: %d\n", rc);
 				break;
 			}
 			break;
+		case RDMA_CM_EVENT_DEVICE_REMOVAL:
+			/* In case of device removal, kernel IB part triggers IBV_EVENT_DEVICE_FATAL
+			 * which triggers RDMA_CM_EVENT_DEVICE_REMOVAL on all cma_id’s.
+			 * Once these events are sent to SPDK, we should release all IB resources and
+			 * don't make attempts to call any ibv_query/modify/create functions. We can only call
+			 * ibv_destory* functions to release user space memory allocated by IB. All kernel
+			 * resources are already cleaned. */
+			if (event->id->qp) {
+				/* If rdma_cm event has a valid `qp` pointer then the event refers to the
+				 * corresponding qpair. Otherwise the event refers to a listening device */
+				rc = nvmf_rdma_disconnect(event);
+				if (rc < 0) {
+					SPDK_ERRLOG("Unable to process disconnect event. rc: %d\n", rc);
+					break;
+				}
+			} else {
+				nvmf_rdma_handle_cm_event_port_removal(transport, event);
+				event_acked = true;
+			}
+			break;
 		case RDMA_CM_EVENT_MULTICAST_JOIN:
 		case RDMA_CM_EVENT_MULTICAST_ERROR:
 			/* Multicast is not used */
 			break;
 		case RDMA_CM_EVENT_ADDR_CHANGE:
 			event_acked = nvmf_rdma_handle_cm_event_addr_change(transport, event);
 			break;
 		case RDMA_CM_EVENT_TIMEWAIT_EXIT:
 			/* For now, do nothing. The target never re-uses queue pairs. */
 			break;
 		default:
 			SPDK_ERRLOG("Unexpected Acceptor Event [%d]\n", event->event);
 			break;
 		}
 		if (!event_acked) {
 			rdma_ack_cm_event(event);
 		}
-		} else {
-			if (errno != EAGAIN && errno != EWOULDBLOCK) {
-				SPDK_ERRLOG("Acceptor Event Error: %s\n", spdk_strerror(errno));
-			}
-			break;
-		}
 	}
 }
@@ -3450,7 +3499,9 @@ spdk_nvmf_rdma_poll_group_destroy(struct spdk_nvmf_transport_poll_group *group)
 	}
 
 	if (poller->srq) {
-		nvmf_rdma_resources_destroy(poller->resources);
+		if (poller->resources) {
+			nvmf_rdma_resources_destroy(poller->resources);
+		}
 		ibv_destroy_srq(poller->srq);
 		SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Destroyed RDMA shared queue %p\n", poller->srq);
 	}
@@ -332,6 +332,14 @@ spdk_sock_writev_async(struct spdk_sock *sock, struct spdk_sock_request *req)
 int
 spdk_sock_flush(struct spdk_sock *sock)
 {
+	if (sock == NULL) {
+		return -EBADF;
+	}
+
+	if (sock->flags.closed) {
+		return -EBADF;
+	}
+
 	return sock->net_impl->flush(sock);
 }
 
@@ -396,6 +404,7 @@ spdk_sock_group_create(void *ctx)
 		if (group_impl != NULL) {
 			STAILQ_INSERT_TAIL(&group->group_impls, group_impl, link);
 			TAILQ_INIT(&group_impl->socks);
+			group_impl->num_removed_socks = 0;
 			group_impl->net_impl = impl;
 		}
 	}
@@ -492,6 +501,9 @@ spdk_sock_group_remove_sock(struct spdk_sock_group *group, struct spdk_sock *soc
 	rc = group_impl->net_impl->group_impl_remove_sock(group_impl, sock);
 	if (rc == 0) {
 		TAILQ_REMOVE(&group_impl->socks, sock, link);
+		assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL);
+		group_impl->removed_socks[group_impl->num_removed_socks] = (uintptr_t)sock;
+		group_impl->num_removed_socks++;
 		sock->group_impl = NULL;
 		sock->cb_fn = NULL;
 		sock->cb_arg = NULL;
@@ -518,6 +530,9 @@ spdk_sock_group_impl_poll_count(struct spdk_sock_group_impl *group_impl,
 		return 0;
 	}
 
+	/* The number of removed sockets should be reset for each call to poll. */
+	group_impl->num_removed_socks = 0;
+
 	num_events = group_impl->net_impl->group_impl_poll(group_impl, max_events, socks);
 	if (num_events == -1) {
 		return -1;
@@ -525,10 +540,21 @@ spdk_sock_group_impl_poll_count(struct spdk_sock_group_impl *group_impl,
 
 	for (i = 0; i < num_events; i++) {
 		struct spdk_sock *sock = socks[i];
+		int j;
+		bool valid = true;
+
+		for (j = 0; j < group_impl->num_removed_socks; j++) {
+			if ((uintptr_t)sock == group_impl->removed_socks[j]) {
+				valid = false;
+				break;
+			}
+		}
 
-		assert(sock->cb_fn != NULL);
-		sock->cb_fn(sock->cb_arg, group, sock);
+		if (valid) {
+			assert(sock->cb_fn != NULL);
+			sock->cb_fn(sock->cb_arg, group, sock);
+		}
 	}
 
 	return num_events;
 }
@@ -919,6 +919,7 @@ vhost_blk_get_config(struct spdk_vhost_dev *vdev, uint8_t *config,
 	uint32_t blk_size;
 	uint64_t blkcnt;
 
+	memset(&blkcfg, 0, sizeof(blkcfg));
 	bvdev = to_blk_dev(vdev);
 	assert(bvdev != NULL);
 	bdev = bvdev->bdev;
@@ -949,7 +950,6 @@ vhost_blk_get_config(struct spdk_vhost_dev *vdev, uint8_t *config,
 		}
 	}
 
-	memset(&blkcfg, 0, sizeof(blkcfg));
 	blkcfg.blk_size = blk_size;
 	/* minimum I/O size in blocks */
 	blkcfg.min_io_size = 1;
@@ -266,7 +266,7 @@ LINK_CXX=\
 #
 # Variables to use for versioning shared libs
 #
-SO_VER := 1
+SO_VER := 2
 SO_MINOR := 0
 SO_SUFFIX_ALL := $(SO_VER).$(SO_MINOR)
@@ -37,7 +37,11 @@ include $(SPDK_ROOT_DIR)/mk/spdk.lib_deps.mk
 SPDK_MAP_FILE = $(SPDK_ROOT_DIR)/shared_lib/spdk.map
 LIB := $(call spdk_lib_list_to_static_libs,$(LIBNAME))
 SHARED_LINKED_LIB := $(subst .a,.so,$(LIB))
+ifdef SO_SUFFIX
+SHARED_REALNAME_LIB := $(subst .so,.so.$(SO_SUFFIX),$(SHARED_LINKED_LIB))
+else
 SHARED_REALNAME_LIB := $(subst .so,.so.$(SO_SUFFIX_ALL),$(SHARED_LINKED_LIB))
+endif
 
 ifeq ($(CONFIG_SHARED),y)
 DEP := $(SHARED_LINKED_LIB)
@@ -131,6 +131,7 @@ uint8_t g_number_of_claimed_volumes = 0;
 /* Specific to AES_CBC. */
 #define AES_CBC_IV_LENGTH 16
 #define AES_CBC_KEY_LENGTH 16
+#define AESNI_MB_NUM_QP 64
 
 /* Common for suported devices. */
 #define IV_OFFSET (sizeof(struct rte_crypto_op) + \
@@ -368,6 +369,7 @@ vbdev_crypto_init_crypto_drivers(void)
 	struct device_qp *dev_qp;
 	unsigned int max_sess_size = 0, sess_size;
 	uint16_t num_lcores = rte_lcore_count();
+	char aesni_args[32];
 
 	/* Only the first call, via RPC or module init should init the crypto drivers. */
 	if (g_session_mp != NULL) {
@@ -375,7 +377,8 @@ vbdev_crypto_init_crypto_drivers(void)
 	}
 
 	/* We always init AESNI_MB */
-	rc = rte_vdev_init(AESNI_MB, NULL);
+	snprintf(aesni_args, sizeof(aesni_args), "max_nb_queue_pairs=%d", AESNI_MB_NUM_QP);
+	rc = rte_vdev_init(AESNI_MB, aesni_args);
 	if (rc) {
 		SPDK_ERRLOG("error creating virtual PMD %s\n", AESNI_MB);
 		return -EINVAL;
@@ -328,7 +328,9 @@ _bdev_nvme_reset_complete(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, int rc)
 		SPDK_NOTICELOG("Resetting controller successful.\n");
 	}
 
-	__atomic_clear(&nvme_bdev_ctrlr->resetting, __ATOMIC_RELAXED);
+	pthread_mutex_lock(&g_bdev_nvme_mutex);
+	nvme_bdev_ctrlr->resetting = false;
+	pthread_mutex_unlock(&g_bdev_nvme_mutex);
 	/* Make sure we clear any pending resets before returning. */
 	spdk_for_each_channel(nvme_bdev_ctrlr,
 			      _bdev_nvme_complete_pending_resets,
@@ -425,7 +427,20 @@ bdev_nvme_reset(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, struct nvme_bdev_io *bi
 	struct spdk_io_channel *ch;
 	struct nvme_io_channel *nvme_ch;
 
-	if (__atomic_test_and_set(&nvme_bdev_ctrlr->resetting, __ATOMIC_RELAXED)) {
+	pthread_mutex_lock(&g_bdev_nvme_mutex);
+	if (nvme_bdev_ctrlr->destruct) {
+		/* Don't bother resetting if the controller is in the process of being destructed. */
+		if (bio) {
+			spdk_bdev_io_complete(spdk_bdev_io_from_ctx(bio), SPDK_BDEV_IO_STATUS_FAILED);
+		}
+		pthread_mutex_unlock(&g_bdev_nvme_mutex);
+		return 0;
+	}
+
+	if (!nvme_bdev_ctrlr->resetting) {
+		nvme_bdev_ctrlr->resetting = true;
+	} else {
+		pthread_mutex_unlock(&g_bdev_nvme_mutex);
 		SPDK_NOTICELOG("Unable to perform reset, already in progress.\n");
 		/*
 		 * The internal reset calls won't be queued. This is on purpose so that we don't
@@ -442,6 +457,7 @@ bdev_nvme_reset(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, struct nvme_bdev_io *bi
 		return 0;
 	}
 
+	pthread_mutex_unlock(&g_bdev_nvme_mutex);
 	/* First, delete all NVMe I/O queue pairs. */
 	spdk_for_each_channel(nvme_bdev_ctrlr,
 			      _bdev_nvme_reset_destroy_qpair,
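The reset path now takes `g_bdev_nvme_mutex` instead of relying on `__atomic_test_and_set`, because the `resetting` flag has to be checked together with `destruct` under the same lock. A hedged, self-contained sketch of that test-and-set-under-mutex pattern (globals here are illustrative stand-ins for the controller fields):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t g_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool g_destruct = false;
static bool g_resetting = false;

/* Returns 0 if this caller won the right to run the reset. */
static int
try_start_reset(void)
{
	int rc;

	pthread_mutex_lock(&g_mutex);
	if (g_destruct) {
		rc = -1;	/* controller is going away: fail the reset */
	} else if (!g_resetting) {
		g_resetting = true;
		rc = 0;		/* we own the reset now */
	} else {
		rc = 1;		/* someone else is already resetting */
	}
	pthread_mutex_unlock(&g_mutex);
	return rc;
}

int
main(void)
{
	printf("first caller:  %d\n", try_start_reset());	/* prints 0 */
	printf("second caller: %d\n", try_start_reset());	/* prints 1 */
	return 0;
}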
@@ -83,8 +83,8 @@ spdk_rpc_nvme_cuse_register(struct spdk_jsonrpc_request *request,
 
 	rc = spdk_nvme_cuse_register(bdev_ctrlr->ctrlr);
 	if (rc) {
-		SPDK_ERRLOG("Failed to register CUSE devices\n");
-		spdk_jsonrpc_send_error_response(request, -rc, spdk_strerror(rc));
+		SPDK_ERRLOG("Failed to register CUSE devices: %s\n", spdk_strerror(-rc));
+		spdk_jsonrpc_send_error_response(request, rc, spdk_strerror(-rc));
 		goto cleanup;
 	}
@@ -130,10 +130,20 @@ nvme_bdev_unregister_cb(void *io_device)
 	free(nvme_bdev_ctrlr);
 }
 
-void
+int
 nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr)
 {
 	assert(nvme_bdev_ctrlr->destruct);
+	pthread_mutex_lock(&g_bdev_nvme_mutex);
+	if (nvme_bdev_ctrlr->resetting) {
+		nvme_bdev_ctrlr->destruct_poller =
+			spdk_poller_register((spdk_poller_fn)nvme_bdev_ctrlr_destruct, nvme_bdev_ctrlr, 1000);
+		pthread_mutex_unlock(&g_bdev_nvme_mutex);
+		return 1;
+	}
+	pthread_mutex_unlock(&g_bdev_nvme_mutex);
+
+	spdk_poller_unregister(&nvme_bdev_ctrlr->destruct_poller);
 	if (nvme_bdev_ctrlr->opal_dev) {
 		if (nvme_bdev_ctrlr->opal_poller != NULL) {
 			spdk_poller_unregister(&nvme_bdev_ctrlr->opal_poller);
@@ -149,6 +159,7 @@ nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr)
 	}
 
 	spdk_io_device_unregister(nvme_bdev_ctrlr, nvme_bdev_unregister_cb);
+	return 1;
 }
 
 void
@@ -94,6 +94,7 @@ struct nvme_bdev_ctrlr {
 	struct spdk_poller *opal_poller;
 
 	struct spdk_poller *adminq_timer_poller;
+	struct spdk_poller *destruct_poller;
 
 	struct ocssd_bdev_ctrlr *ocssd_ctrlr;
 
@@ -150,7 +151,7 @@ struct nvme_bdev_ctrlr *nvme_bdev_next_ctrlr(struct nvme_bdev_ctrlr *prev);
 void nvme_bdev_dump_trid_json(struct spdk_nvme_transport_id *trid,
 			      struct spdk_json_write_ctx *w);
 
-void nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr);
+int nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr);
 void nvme_bdev_attach_bdev_to_ns(struct nvme_bdev_ns *nvme_ns, struct nvme_bdev *nvme_disk);
 void nvme_bdev_detach_bdev_from_ns(struct nvme_bdev *nvme_disk);
@@ -328,7 +328,9 @@ bdev_rbd_flush(struct bdev_rbd *disk, struct spdk_io_channel *ch,
 	       struct spdk_bdev_io *bdev_io, uint64_t offset, uint64_t nbytes)
 {
 	struct bdev_rbd_io_channel *rbdio_ch = spdk_io_channel_get_ctx(ch);
+	struct bdev_rbd_io *rbd_io = (struct bdev_rbd_io *)bdev_io->driver_ctx;
 
+	rbd_io->num_segments++;
 	return bdev_rbd_start_aio(rbdio_ch->image, bdev_io, NULL, offset, nbytes);
 }
 
@@ -783,6 +785,44 @@ spdk_bdev_rbd_delete(struct spdk_bdev *bdev, spdk_delete_rbd_complete cb_fn, voi
 	spdk_bdev_unregister(bdev, cb_fn, cb_arg);
 }
 
+int
+spdk_bdev_rbd_resize(struct spdk_bdev *bdev, const uint64_t new_size_in_mb)
+{
+	struct spdk_io_channel *ch;
+	struct bdev_rbd_io_channel *rbd_io_ch;
+	int rc;
+	uint64_t new_size_in_byte;
+	uint64_t current_size_in_mb;
+
+	if (bdev->module != &rbd_if) {
+		return -EINVAL;
+	}
+
+	current_size_in_mb = bdev->blocklen * bdev->blockcnt / (1024 * 1024);
+	if (current_size_in_mb > new_size_in_mb) {
+		SPDK_ERRLOG("The new bdev size must be lager than current bdev size.\n");
+		return -EINVAL;
+	}
+
+	ch = bdev_rbd_get_io_channel(bdev);
+	rbd_io_ch = spdk_io_channel_get_ctx(ch);
+	new_size_in_byte = new_size_in_mb * 1024 * 1024;
+
+	rc = rbd_resize(rbd_io_ch->image, new_size_in_byte);
+	if (rc != 0) {
+		SPDK_ERRLOG("failed to resize the ceph bdev.\n");
+		return rc;
+	}
+
+	rc = spdk_bdev_notify_blockcnt_change(bdev, new_size_in_byte / bdev->blocklen);
+	if (rc != 0) {
+		SPDK_ERRLOG("failed to notify block cnt change.\n");
+		return rc;
+	}
+
+	return rc;
+}
+
 static int
 bdev_rbd_library_init(void)
 {
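`spdk_bdev_rbd_resize()` takes the new size in MiB while librbd and the bdev layer work in bytes and blocks, so the conversion above carries the whole RPC. A quick worked example of that arithmetic (the 512-byte blocklen is a hypothetical value for illustration):

#include <stdint.h>
#include <stdio.h>

int
main(void)
{
	uint64_t new_size_in_mb = 2048;		/* requested size, in MiB */
	uint32_t blocklen = 512;		/* hypothetical bdev block size */
	uint64_t new_size_in_byte = new_size_in_mb * 1024 * 1024;
	uint64_t new_blockcnt = new_size_in_byte / blocklen;

	/* 2048 MiB -> 2147483648 bytes -> 4194304 blocks of 512 B */
	printf("%llu bytes, %llu blocks\n",
	       (unsigned long long)new_size_in_byte,
	       (unsigned long long)new_blockcnt);
	return 0;
}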
@@ -57,4 +57,12 @@ int spdk_bdev_rbd_create(struct spdk_bdev **bdev, const char *name, const char *
 void spdk_bdev_rbd_delete(struct spdk_bdev *bdev, spdk_delete_rbd_complete cb_fn,
 			  void *cb_arg);
 
+/**
+ * Resize rbd bdev.
+ *
+ * \param bdev Pointer to rbd bdev.
+ * \param new_size_in_mb The new size in MiB for this bdev.
+ */
+int spdk_bdev_rbd_resize(struct spdk_bdev *bdev, const uint64_t new_size_in_mb);
+
 #endif /* SPDK_BDEV_RBD_H */
@@ -197,3 +197,56 @@ cleanup:
 }
 SPDK_RPC_REGISTER("bdev_rbd_delete", spdk_rpc_bdev_rbd_delete, SPDK_RPC_RUNTIME)
 SPDK_RPC_REGISTER_ALIAS_DEPRECATED(bdev_rbd_delete, delete_rbd_bdev)
+
+struct rpc_bdev_rbd_resize {
+	char *name;
+	uint64_t new_size;
+};
+
+static const struct spdk_json_object_decoder rpc_bdev_rbd_resize_decoders[] = {
+	{"name", offsetof(struct rpc_bdev_rbd_resize, name), spdk_json_decode_string},
+	{"new_size", offsetof(struct rpc_bdev_rbd_resize, new_size), spdk_json_decode_uint64}
+};
+
+static void
+free_rpc_bdev_rbd_resize(struct rpc_bdev_rbd_resize *req)
+{
+	free(req->name);
+}
+
+static void
+spdk_rpc_bdev_rbd_resize(struct spdk_jsonrpc_request *request,
+			 const struct spdk_json_val *params)
+{
+	struct rpc_bdev_rbd_resize req = {};
+	struct spdk_bdev *bdev;
+	struct spdk_json_write_ctx *w;
+	int rc;
+
+	if (spdk_json_decode_object(params, rpc_bdev_rbd_resize_decoders,
+				    SPDK_COUNTOF(rpc_bdev_rbd_resize_decoders),
+				    &req)) {
+		spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
+						 "spdk_json_decode_object failed");
+		goto cleanup;
+	}
+
+	bdev = spdk_bdev_get_by_name(req.name);
+	if (bdev == NULL) {
+		spdk_jsonrpc_send_error_response(request, -ENODEV, spdk_strerror(ENODEV));
+		goto cleanup;
+	}
+
+	rc = spdk_bdev_rbd_resize(bdev, req.new_size);
+	if (rc) {
+		spdk_jsonrpc_send_error_response(request, rc, spdk_strerror(-rc));
+		goto cleanup;
+	}
+
+	w = spdk_jsonrpc_begin_result(request);
+	spdk_json_write_bool(w, true);
+	spdk_jsonrpc_end_result(request, w);
+cleanup:
+	free_rpc_bdev_rbd_resize(&req);
+}
+SPDK_RPC_REGISTER("bdev_rbd_resize", spdk_rpc_bdev_rbd_resize, SPDK_RPC_RUNTIME)
@@ -462,7 +462,7 @@ spdk_posix_sock_close(struct spdk_sock *_sock)
 }
 
 #ifdef SPDK_ZEROCOPY
-static void
+static int
 _sock_check_zcopy(struct spdk_sock *sock)
 {
 	struct spdk_posix_sock *psock = __posix_sock(sock);
@@ -483,7 +483,7 @@ _sock_check_zcopy(struct spdk_sock *sock)
 
 	if (rc < 0) {
 		if (errno == EWOULDBLOCK || errno == EAGAIN) {
-			return;
+			return 0;
 		}
 
 		if (!TAILQ_EMPTY(&sock->pending_reqs)) {
@@ -491,19 +491,19 @@ _sock_check_zcopy(struct spdk_sock *sock)
 		} else {
 			SPDK_WARNLOG("Recvmsg yielded an error!\n");
 		}
-		return;
+		return 0;
 	}
 
 	cm = CMSG_FIRSTHDR(&msgh);
 	if (cm->cmsg_level != SOL_IP || cm->cmsg_type != IP_RECVERR) {
 		SPDK_WARNLOG("Unexpected cmsg level or type!\n");
-		return;
+		return 0;
 	}
 
 	serr = (struct sock_extended_err *)CMSG_DATA(cm);
 	if (serr->ee_errno != 0 || serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) {
 		SPDK_WARNLOG("Unexpected extended error origin\n");
-		return;
+		return 0;
 	}
 
 	/* Most of the time, the pending_reqs array is in the exact
@@ -521,7 +521,7 @@ _sock_check_zcopy(struct spdk_sock *sock)
 
 			rc = spdk_sock_request_put(sock, req, 0);
 			if (rc < 0) {
-				return;
+				return rc;
 			}
 
 		} else if (found) {
@@ -531,6 +531,8 @@ _sock_check_zcopy(struct spdk_sock *sock)
 
 		}
 	}
+
+	return 0;
 }
 #endif
 
@@ -959,14 +961,22 @@ spdk_posix_sock_group_impl_poll(struct spdk_sock_group_impl *_group, int max_eve
 
 	for (i = 0, j = 0; i < num_events; i++) {
 #if defined(__linux__)
+		sock = events[i].data.ptr;
+
 #ifdef SPDK_ZEROCOPY
 		if (events[i].events & EPOLLERR) {
-			_sock_check_zcopy(events[i].data.ptr);
+			rc = _sock_check_zcopy(sock);
+			/* If the socket was closed or removed from
+			 * the group in response to a send ack, don't
+			 * add it to the array here. */
+			if (rc || sock->cb_fn == NULL) {
+				continue;
+			}
 		}
 #endif
 
 		if (events[i].events & EPOLLIN) {
-			socks[j++] = events[i].data.ptr;
+			socks[j++] = sock;
 		}
 
 #elif defined(__FreeBSD__)
@@ -38,6 +38,13 @@ C_SRCS += vpp.c
 CFLAGS += -Wno-sign-compare -Wno-error=old-style-definition
 CFLAGS += -Wno-error=strict-prototypes -Wno-error=ignored-qualifiers
 
+GCC_VERSION=$(shell $(CC) -dumpversion | cut -d. -f1)
+
+# disable packed member unalign warnings
+ifeq ($(shell test $(GCC_VERSION) -ge 9 && echo 1), 1)
+CFLAGS += -Wno-error=address-of-packed-member
+endif
+
 LIBNAME = sock_vpp
 
 include $(SPDK_ROOT_DIR)/mk/spdk.lib.mk
@@ -2,12 +2,12 @@
 %bcond_with doc
 
 Name: spdk
-Version: master
+Version: 20.01.x
 Release: 0%{?dist}
 Epoch: 0
 URL: http://spdk.io
 
-Source: https://github.com/spdk/spdk/archive/master.tar.gz
+Source: https://github.com/spdk/spdk/archive/v20.01.x.tar.gz
 Summary: Set of libraries and utilities for high performance user-mode storage
 
 %define package_version %{epoch}:%{version}-%{release}
@@ -541,6 +541,20 @@ if __name__ == "__main__":
         p.add_argument('name', help='rbd bdev name')
         p.set_defaults(func=bdev_rbd_delete)
 
+    def bdev_rbd_resize(args):
-        print_json(rpc.bdev.bdev_rbd_resize(args.client,
-                                            name=args.name,
-                                            new_size=int(args.new_size)))
+        rpc.bdev.bdev_rbd_resize(args.client,
+                                 name=args.name,
+                                 new_size=int(args.new_size))
+
+    p = subparsers.add_parser('bdev_rbd_resize',
+                              help='Resize a rbd bdev')
+    p.add_argument('name', help='rbd bdev name')
+    p.add_argument('new_size', help='new bdev size for resize operation. The unit is MiB')
+    p.set_defaults(func=bdev_rbd_resize)
+
     def bdev_delay_create(args):
         print_json(rpc.bdev.bdev_delay_create(args.client,
                                               base_bdev_name=args.base_bdev_name,
@@ -585,6 +585,20 @@ def bdev_rbd_delete(client, name):
     return client.call('bdev_rbd_delete', params)
 
 
+def bdev_rbd_resize(client, name, new_size):
+    """Resize rbd bdev in the system.
+
+    Args:
+        name: name of rbd bdev to resize
+        new_size: new bdev size of resize operation. The unit is MiB
+    """
+    params = {
+        'name': name,
+        'new_size': new_size,
+    }
+    return client.call('bdev_rbd_resize', params)
+
+
 @deprecated_alias('construct_error_bdev')
 def bdev_error_create(client, base_name):
     """Construct an error injection block device.
@@ -41,6 +41,27 @@
 static char g_path[256];
 static struct spdk_poller *g_poller;
 
+struct ctrlr_entry {
+	struct spdk_nvme_ctrlr *ctrlr;
+	struct ctrlr_entry *next;
+};
+
+static struct ctrlr_entry *g_controllers = NULL;
+
+static void
+cleanup(void)
+{
+	struct ctrlr_entry *ctrlr_entry = g_controllers;
+
+	while (ctrlr_entry) {
+		struct ctrlr_entry *next = ctrlr_entry->next;
+
+		spdk_nvme_detach(ctrlr_entry->ctrlr);
+		free(ctrlr_entry);
+		ctrlr_entry = next;
+	}
+}
+
 static void
 usage(char *executable_name)
 {
@@ -70,6 +91,17 @@ static void
 attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
 	  struct spdk_nvme_ctrlr *ctrlr, const struct spdk_nvme_ctrlr_opts *opts)
 {
+	struct ctrlr_entry *entry;
+
+	entry = malloc(sizeof(struct ctrlr_entry));
+	if (entry == NULL) {
+		fprintf(stderr, "Malloc error\n");
+		exit(1);
+	}
+
+	entry->ctrlr = ctrlr;
+	entry->next = g_controllers;
+	g_controllers = entry;
 }
 
 static int
@@ -163,6 +195,8 @@ main(int argc, char **argv)
 	opts.shutdown_cb = stub_shutdown;
 
 	ch = spdk_app_start(&opts, stub_start, (void *)(intptr_t)opts.shm_id);
+
+	cleanup();
 	spdk_app_fini();
 
 	return ch;
@@ -45,7 +45,14 @@ pushd $DB_BENCH_DIR
 if [ -z "$SKIP_GIT_CLEAN" ]; then
 	git clean -x -f -d
 fi
-$MAKE db_bench $MAKEFLAGS $MAKECONFIG DEBUG_LEVEL=0 SPDK_DIR=$rootdir
+
+EXTRA_CXXFLAGS=""
+GCC_VERSION=$(cc -dumpversion | cut -d. -f1)
+if (( GCC_VERSION >= 9 )); then
+	EXTRA_CXXFLAGS+="-Wno-deprecated-copy -Wno-pessimizing-move"
+fi
+
+$MAKE db_bench $MAKEFLAGS $MAKECONFIG DEBUG_LEVEL=0 SPDK_DIR=$rootdir EXTRA_CXXFLAGS="$EXTRA_CXXFLAGS"
 popd

 timing_exit db_bench_build
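The `cut -d. -f1` is there because `cc -dumpversion` may print either a bare major number ("9") or a dotted version ("7.4.0"), depending on how the compiler was built. A small Python sketch of the same gate, for illustration (the flag list is copied from the hunk; everything else is assumed):

```python
import subprocess

# Mirror of the shell logic: take the major component of `cc -dumpversion`.
out = subprocess.run(["cc", "-dumpversion"], capture_output=True, text=True, check=True)
major = int(out.stdout.strip().split(".")[0])

# GCC 9 promoted warnings (e.g. -Wdeprecated-copy) that the RocksDB sources
# of that era tripped over, hence the suppression for newer compilers.
extra_cxxflags = "-Wno-deprecated-copy -Wno-pessimizing-move" if major >= 9 else ""
print(major, repr(extra_cxxflags))
```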
test/env/memory/memory_ut.c (vendored, 14 changed lines)
@@ -52,20 +52,6 @@ DEFINE_STUB(rte_mem_event_callback_register, int,
 	    (const char *name, rte_mem_event_callback_t clb, void *arg), 0);
 DEFINE_STUB(rte_mem_virt2iova, rte_iova_t, (const void *virtaddr), 0);

-void *
-rte_malloc(const char *type, size_t size, unsigned align)
-{
-	CU_ASSERT(type == NULL);
-	CU_ASSERT(align == 0);
-	return malloc(size);
-}
-
-void
-rte_free(void *ptr)
-{
-	free(ptr);
-}
-
 static int
 test_mem_map_notify(void *cb_ctx, struct spdk_mem_map *map,
 		    enum spdk_mem_map_notify_action action,
@@ -107,6 +107,7 @@ function start_vpp() {
 	# for VPP side maximal size of MTU for TCP is 1460 and tests doesn't work
 	# stable with larger packets
 	MTU=1460
+	MTU_W_HEADER=$((MTU+20))
 	ip link set dev $INITIATOR_INTERFACE mtu $MTU
 	ethtool -K $INITIATOR_INTERFACE tso off
 	ethtool -k $INITIATOR_INTERFACE
@@ -131,7 +132,7 @@ function start_vpp() {
 	xtrace_disable
 	counter=40
 	while [ $counter -gt 0 ] ; do
-		vppctl show version &> /dev/null && break
+		vppctl show version | grep -E "vpp v[0-9]+\.[0-9]+" && break
 		counter=$(( counter - 1 ))
 		sleep 0.5
 	done
@@ -140,37 +141,47 @@ function start_vpp() {
 		return 1
 	fi

-	# Setup host interface
-	vppctl create host-interface name $TARGET_INTERFACE
-	VPP_TGT_INT="host-$TARGET_INTERFACE"
-	vppctl set interface state $VPP_TGT_INT up
-	vppctl set interface ip address $VPP_TGT_INT $TARGET_IP/24
-	vppctl set interface mtu $MTU $VPP_TGT_INT
-
-	vppctl show interface
+	# Below VPP commands are masked with "|| true" for the sake of
+	# running the test in the CI system. For reasons unknown when
+	# run via CI these commands result in 141 return code (pipefail)
+	# even despite producing valid output.
+	# Using "|| true" does not impact the "-e" flag used in test scripts
+	# because vppctl cli commands always return with 0, even if
+	# there was an error.
+	# As a result - grep checks on command outputs must be used to
+	# verify vpp configuration and connectivity.
+
+	# Setup host interface
+	vppctl create host-interface name $TARGET_INTERFACE || true
+	VPP_TGT_INT="host-$TARGET_INTERFACE"
+	vppctl set interface state $VPP_TGT_INT up || true
+	vppctl set interface ip address $VPP_TGT_INT $TARGET_IP/24 || true
+	vppctl set interface mtu $MTU $VPP_TGT_INT || true
+
+	vppctl show interface | tr -s " " | grep -E "host-$TARGET_INTERFACE [0-9]+ up $MTU/0/0/0"

 	# Disable session layer
 	# NOTE: VPP net framework should enable it itself.
-	vppctl session disable
+	vppctl session disable || true

 	# Verify connectivity
-	vppctl show int addr
+	vppctl show int addr | grep -E "$TARGET_IP/24"
 	ip addr show $INITIATOR_INTERFACE
 	ip netns exec $TARGET_NAMESPACE ip addr show $TARGET_INTERFACE
 	sleep 3
 	# SC1010: ping -M do - in this case do is an option not bash special word
 	# shellcheck disable=SC1010
 	ping -c 1 $TARGET_IP -s $(( MTU - 28 )) -M do
-	vppctl ping $INITIATOR_IP repeat 1 size $(( MTU - (28 + 8) )) verbose
+	vppctl ping $INITIATOR_IP repeat 1 size $(( MTU - (28 + 8) )) verbose | grep -E "$MTU_W_HEADER bytes from $INITIATOR_IP"
 }

 function kill_vpp() {
-	vppctl delete host-interface name $TARGET_INTERFACE
+	vppctl delete host-interface name $TARGET_INTERFACE || true

 	# Dump VPP configuration before kill
-	vppctl show api clients
-	vppctl show session
-	vppctl show errors
+	vppctl show api clients || true
+	vppctl show session || true
+	vppctl show errors || true

 	killprocess $vpp_pid
 }
@@ -40,6 +40,15 @@ $rpc_py iscsi_create_portal_group $PORTAL_TAG $TARGET_IP:$ISCSI_PORT
 $rpc_py iscsi_create_initiator_group $INITIATOR_TAG $INITIATOR_NAME $NETMASK
 rbd_bdev="$($rpc_py bdev_rbd_create $RBD_POOL $RBD_NAME 4096)"
 $rpc_py bdev_get_bdevs
+
+$rpc_py bdev_rbd_resize $rbd_bdev 2000
+num_block=$($rpc_py bdev_get_bdevs | grep num_blocks | sed 's/[^[:digit:]]//g')
+# get the bdev size in MiB.
+total_size=$(( num_block * 4096 / 1048576 ))
+if [ $total_size != 2000 ]; then
+	echo "resize failed."
+	exit 1
+fi
 # "Ceph0:0" ==> use Ceph0 blockdev for LUN0
 # "1:2" ==> map PortalGroup1 to InitiatorGroup2
 # "64" ==> iSCSI queue depth 64
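The MiB-to-blocks conversion in that check is easy to get backwards, so here is the same arithmetic spelled out in Python (all values taken from the test above):

```python
# Worked example of the rbd.sh size check.
block_size = 4096                 # block size passed to bdev_rbd_create
new_size_mib = 2000               # size passed to bdev_rbd_resize, in MiB

# After a successful resize the bdev should report this many blocks:
expected_blocks = new_size_mib * 1024 * 1024 // block_size
assert expected_blocks == 512000

# rbd.sh inverts the calculation: num_block * 4096 / 1048576 == 2000 MiB.
assert expected_blocks * block_size // (1024 * 1024) == new_size_mib
```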
@@ -93,6 +93,7 @@ set -e

 for i in {1..10}; do
 	if [ -f "${KERNEL_OUT}.${i}" ] && [ -f "${CUSE_OUT}.${i}" ]; then
+		sed -i "s/${nvme_name}/nvme0/g" ${KERNEL_OUT}.${i}
 		diff --suppress-common-lines ${KERNEL_OUT}.${i} ${CUSE_OUT}.${i}
 	fi
 done
@@ -22,6 +22,13 @@ function tgt_init()
 }

 nvmftestinit
+# There is an intermittent error relating to this test and Soft-RoCE. For now, just
+# skip this test if we are using rxe. TODO: get to the bottom of GitHub issue #1165
+if [ $TEST_TRANSPORT == "rdma" ] && check_ip_is_soft_roce $NVMF_FIRST_TARGET_IP; then
+	echo "Using software RDMA, skipping the host bdevperf tests."
+	exit 0
+fi
+

 tgt_init
@@ -63,6 +63,7 @@ DEFINE_STUB(nvme_transport_ctrlr_construct, struct spdk_nvme_ctrlr *,
 DEFINE_STUB_V(nvme_io_msg_ctrlr_detach, (struct spdk_nvme_ctrlr *ctrlr));
 DEFINE_STUB(spdk_nvme_transport_available, bool,
 	    (enum spdk_nvme_transport_type trtype), true);
+DEFINE_STUB(spdk_uevent_connect, int, (void), 1);

 static bool ut_destruct_called = false;
@@ -462,11 +462,11 @@ test_build_contig_hw_sgl_request(void)
 	CU_ASSERT(req.cmd.dptr.sgl1.address == tr.prp_sgl_bus_addr);
 	CU_ASSERT(req.cmd.dptr.sgl1.unkeyed.length == 2 * sizeof(struct spdk_nvme_sgl_descriptor));
 	CU_ASSERT(tr.u.sgl[0].unkeyed.type == SPDK_NVME_SGL_TYPE_DATA_BLOCK);
-	CU_ASSERT(tr.u.sgl[0].unkeyed.length = 60);
-	CU_ASSERT(tr.u.sgl[0].address = 0xDEADBEEF);
+	CU_ASSERT(tr.u.sgl[0].unkeyed.length == 60);
+	CU_ASSERT(tr.u.sgl[0].address == 0xDEADBEEF);
 	CU_ASSERT(tr.u.sgl[1].unkeyed.type == SPDK_NVME_SGL_TYPE_DATA_BLOCK);
-	CU_ASSERT(tr.u.sgl[1].unkeyed.length = 40);
-	CU_ASSERT(tr.u.sgl[1].address = 0xDEADBEEF);
+	CU_ASSERT(tr.u.sgl[1].unkeyed.length == 40);
+	CU_ASSERT(tr.u.sgl[1].address == 0xDEADBEEF);

 	MOCK_CLEAR(spdk_vtophys);
 	g_vtophys_size = 0;
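This hunk fixes assertions that could never fail: in C, `CU_ASSERT(x = 60)` assigns 60 to x and then checks that the result is nonzero, so the test passed regardless of what the SGL builder produced. Python 3.8+ assignment expressions permit the same slip, which makes for a compact illustration (an analogy, not SPDK code):

```python
# The buggy pattern: the first assert checks that the *assigned* value is
# truthy, so it passes even when `length` previously held the wrong number.
length = 0
assert (length := 60)   # always passes for any nonzero right-hand side
assert length == 60     # the intended comparison, which can actually fail
```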