numam-spdk/module
Kefu Chai a7d174e2ef bdev/null: call spdk_bdev_module_fini_done() even if not registered
in the bdev subsystem, if any bdev module fails to initialize in
bdev_modules_init(), that function stops immediately. in general,
the non-zero rc is then passed to the callback func given to spdk_subsystem_init().
if the spdk application is built on top of spdk app, it's very
likely that app_start_rpc() is this very callback func.
in this case, app_start_rpc() just passes the `rc` to spdk_app_stop(),
which tears down all subsystems one after another.

bdev tears itself down by calling all its modules' module_fini(),
including those whose .module_init never got called. the problem is,
if a bdev module sets its `.async_fini` to true and calls
spdk_bdev_module_fini_done() only from the callback passed to
spdk_io_device_unregister(), then a failure to initialize leaves the
spdk application hanging, because the teardown never completes.
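the hang described above can be illustrated with a small, self-contained
model. this is only a sketch and not the spdk implementation: every
`mock_*` name and the `struct mock_module` layout are hypothetical
stand-ins, and the only behavior assumed is what this message states,
namely that teardown of an `.async_fini` module resumes only once the
module reports completion.

```c
#include <assert.h>   /* not used here; handy for callers checking results */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a bdev module descriptor. */
struct mock_module {
    const char *name;
    bool async_fini;            /* module reports completion itself */
    void (*module_fini)(void);  /* teardown hook, may be NULL */
};

static struct mock_module *g_mods;
static size_t g_mod_count;
static size_t g_mod_next;
static bool g_teardown_complete;

/* Walk the module list; pause after an async module until it signals. */
static void mock_fini_iter(void)
{
    while (g_mod_next < g_mod_count) {
        struct mock_module *m = &g_mods[g_mod_next++];

        if (m->module_fini != NULL) {
            m->module_fini();
        }
        if (m->async_fini) {
            /* Resumed only by mock_fini_done(); if the module never
             * calls it, the teardown stalls here for good. */
            return;
        }
    }
    g_teardown_complete = true;
}

/* Stand-in for the "fini done" notification: resume the walk. */
static void mock_fini_done(void)
{
    mock_fini_iter();
}

/* A synchronous module: nothing to report. */
static void mock_sync_fini(void)
{
}

/* A well-behaved async module: always reports completion. */
static void mock_async_fini_signals(void)
{
    mock_fini_done();
}

/* An async module that forgets to report completion. */
static void mock_async_fini_silent(void)
{
}

/* Run a teardown over `mods`; returns true if it ran to completion. */
static bool mock_teardown(struct mock_module *mods, size_t count)
{
    g_mods = mods;
    g_mod_count = count;
    g_mod_next = 0;
    g_teardown_complete = false;
    mock_fini_iter();
    return g_teardown_complete;
}
```

calling mock_teardown() on a list that contains a module using
mock_async_fini_silent returns false: the walk stops at that module and
nothing ever resumes it, which is the model's equivalent of the hang.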

a typical sequence of log messages looks like:

[2022-02-27 20:47:13.766578] bdev.c:1438:spdk_bdev_initialize: *ERROR*: bdev modules init failed
[2022-02-27 20:47:13.766622] subsystem.c: 169:spdk_subsystem_init_next: *ERROR*: Init subsystem bdev failed
[2022-02-27 20:47:13.766638] app.c: 691:spdk_app_stop: *WARNING*: spdk_app_stop'd on non-zero
[2022-02-27 20:47:13.766658] thread.c:2050:spdk_io_device_unregister: *ERROR*: io_device 0x10d3c30 not found

this is exactly the case we run into when a bdev module fails to
initialize: bdev_null is unable to call spdk_bdev_module_fini_done()
when being torn down, because spdk_io_device_unregister() refuses
to call the callback if the I/O device was never registered.

since `g_null_read_buf` is set in bdev_null_initialize(), in this change,
this pointer is checked for NULL before calling spdk_io_device_unregister().
if it is NULL, spdk_bdev_module_fini_done() is called directly instead
of calling spdk_io_device_unregister(). this addresses the
hanging issue.
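the guard can be sketched as follows. this is a minimal, self-contained
model rather than the actual patch: all `mock_*` functions are
hypothetical stand-ins for the spdk calls named above, the buffer size is
arbitrary, and the only assumed behavior is the one described in this
message (the unregister call skips its callback when the I/O device was
never registered).

```c
#include <assert.h>   /* not used here; handy for callers checking results */
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static void *g_null_read_buf;       /* set only by a successful init */
static bool g_device_registered;    /* models the I/O device registry */
static int g_fini_done_calls;       /* counts completion notifications */

/* Stand-in for spdk_bdev_module_fini_done(). */
static void mock_module_fini_done(void)
{
    g_fini_done_calls++;
}

/* Unregister callback: release the buffer, then report completion. */
static void mock_null_finish_cb(void)
{
    free(g_null_read_buf);
    g_null_read_buf = NULL;
    mock_module_fini_done();
}

/* Stand-in for spdk_io_device_unregister(): the callback is skipped
 * when the device was never registered ("io_device ... not found"). */
static void mock_io_device_unregister(void (*cb)(void))
{
    if (!g_device_registered) {
        return;
    }
    g_device_registered = false;
    cb();
}

/* Stand-in for bdev_null_initialize(): allocate and register. */
static int mock_null_initialize(void)
{
    g_null_read_buf = malloc(4096);
    if (g_null_read_buf == NULL) {
        return -1;
    }
    g_device_registered = true;
    return 0;
}

/* Behavior before the change: completion depends on the callback. */
static void mock_null_finish_unguarded(void)
{
    mock_io_device_unregister(mock_null_finish_cb);
}

/* Behavior after the change: report completion directly when the
 * module was never initialized. */
static void mock_null_finish_guarded(void)
{
    if (g_null_read_buf == NULL) {
        mock_module_fini_done();
        return;
    }
    mock_io_device_unregister(mock_null_finish_cb);
}
```

without a prior mock_null_initialize(), mock_null_finish_unguarded()
leaves g_fini_done_calls untouched, while mock_null_finish_guarded()
increments it once; after a successful init both paths go through the
unregister callback.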

Signed-off-by: Kefu Chai <tchaikov@gmail.com>
Change-Id: I3a41fcd2f1c986e416dacecd5ca352dfd1e379b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/11750
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2022-03-02 08:39:40 +00:00
..
accel so_ver: increase all major versions 2022-01-31 15:29:56 +00:00
bdev bdev/null: call spdk_bdev_module_fini_done() even if not registered 2022-03-02 08:39:40 +00:00
blob so_ver: increase all major versions 2022-01-31 15:29:56 +00:00
blobfs so_ver: increase all major versions 2022-01-31 15:29:56 +00:00
env_dpdk so_ver: increase all major versions 2022-01-31 15:29:56 +00:00
event nvmf: remove deprecated conn_sched parameter 2022-02-15 14:38:37 +00:00
scheduler so_ver: increase all major versions 2022-01-31 15:29:56 +00:00
sock uring: fix bug when inserting sock into pending_recv list 2022-02-04 20:57:53 +00:00
Makefile scheduler: create public API and subsystem for scheduler/governor 2021-09-07 07:33:03 +00:00