# BlobFS (Blobstore Filesystem) {#blobfs}

# BlobFS Getting Started Guide {#blobfs_getting_started}

# RocksDB Integration {#blobfs_rocksdb}

Clone and build the SPDK repository as described at https://github.com/spdk/spdk

~~~{.sh}
git clone https://github.com/spdk/spdk.git
cd spdk
./configure
make
~~~

Clone the RocksDB repository from the SPDK GitHub fork into a separate directory.
Make sure you check out the `spdk-v5.6.1` branch.

~~~{.sh}
cd ..
git clone -b spdk-v5.6.1 https://github.com/spdk/rocksdb.git
~~~

Build RocksDB. Only the `db_bench` benchmarking tool is integrated with BlobFS.
(Note: add `DEBUG_LEVEL=0` to the `make` command for a release build.)

~~~{.sh}
cd rocksdb
make db_bench SPDK_DIR=path/to/spdk
~~~

Create an NVMe section in the configuration file using SPDK's `gen_nvme.sh` script, run from the SPDK repository directory.

~~~{.sh}
scripts/gen_nvme.sh > /usr/local/etc/spdk/rocksdb.conf
~~~

Verify that the configuration file specifies the correct NVMe SSD.
If there are any NVMe SSDs you do not wish to use for RocksDB/SPDK testing, remove them from the configuration file.

Make sure you have at least 5GB of memory allocated for huge pages.
By default, the SPDK `setup.sh` script only allocates 2GB.
The following will allocate 5GB of huge page memory (in addition to binding the NVMe devices to uio/vfio).

~~~{.sh}
HUGEMEM=5120 scripts/setup.sh
~~~
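
As a rough cross-check of the allocation above, the following sketch converts a `HUGEMEM` value (in MB) into a huge page count and, on Linux, prints the kernel's own huge page counters. The helper name `hugemem_to_pages` is hypothetical, and the 2048 kB page size is an assumption (a common x86_64 default); neither comes from SPDK itself.

```sh
#!/usr/bin/env bash
# Hypothetical helper: convert a HUGEMEM value in MB into a number of huge pages.
# Usage: hugemem_to_pages <hugemem_mb> <hugepage_size_kb>
hugemem_to_pages() {
    local hugemem_mb=$1 page_kb=$2
    # Round up so the result never covers less memory than requested.
    echo $(( (hugemem_mb * 1024 + page_kb - 1) / page_kb ))
}

# 5120 MB expressed in 2048 kB pages (assumed page size).
hugemem_to_pages 5120 2048

# On Linux, show what the kernel currently reports for comparison.
if [ -r /proc/meminfo ]; then
    grep -E '^(HugePages_Total|HugePages_Free|Hugepagesize):' /proc/meminfo || true
fi
```

If `Hugepagesize` on your system differs from 2048 kB, pass that value as the second argument instead.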

Create an empty SPDK blobfs for testing.

~~~{.sh}
test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.conf Nvme0n1
~~~

At this point, RocksDB is ready for testing with SPDK. Three `db_bench` parameters are used to configure SPDK:

1. `spdk` - Defines the name of the SPDK configuration file. If omitted, RocksDB will use the default PosixEnv implementation
instead of SpdkEnv. (Required)
2. `spdk_bdev` - Defines the name of the SPDK block device which contains the BlobFS to be used for testing. (Required)
3. `spdk_cache_size` - Defines the amount of userspace cache memory used by SPDK. Specified in terms of megabytes (MB).
Default is 4096 (4GB). (Optional)

SPDK has a set of scripts which will run `db_bench` against a variety of workloads and capture performance and profiling
data. The primary script is `test/blobfs/rocksdb/run_tests.sh`.
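
For a manual run outside those scripts, the three parameters listed above might be combined as in the sketch below. It only prints the command line for review. The `--name=value` spelling is an assumption based on the usual `db_bench` flag syntax, `--benchmarks=fillseq` is just an example workload, and the variable names and the `db_bench_cmd` helper are hypothetical.

```sh
#!/usr/bin/env bash
# Values taken from the earlier steps of this guide; override via the environment.
SPDK_CONF="${SPDK_CONF:-/usr/local/etc/spdk/rocksdb.conf}"
SPDK_BDEV="${SPDK_BDEV:-Nvme0n1}"
SPDK_CACHE_MB="${SPDK_CACHE_MB:-4096}"   # documented default, in MB

# Hypothetical helper: print a db_bench command line that selects SpdkEnv.
db_bench_cmd() {
    echo ./db_bench \
        "--spdk=${SPDK_CONF}" \
        "--spdk_bdev=${SPDK_BDEV}" \
        "--spdk_cache_size=${SPDK_CACHE_MB}" \
        "--benchmarks=fillseq"
}

# Print the command; run it from the rocksdb directory once it looks right.
db_bench_cmd
```

Leaving out the `spdk` parameter would make `db_bench` fall back to PosixEnv, so it is worth confirming the printed command includes it before benchmarking.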

# FUSE

BlobFS provides a FUSE plug-in to mount an SPDK BlobFS as a kernel filesystem for inspection or debug purposes.
The FUSE plug-in requires fuse3 and will be built automatically when fuse3 is detected on the system.

~~~{.sh}
test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.conf Nvme0n1 /mnt/fuse
~~~

Note that the FUSE plug-in has some limitations; see the list below.
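
Once the plug-in is running, the mount can be inspected with ordinary tools, typically from a second terminal. The sketch below assumes the `/mnt/fuse` mount point used above and the `mountpoint` utility from util-linux; the `is_mounted` helper is hypothetical. The commented-out `fusermount3 -u` line shows how a fuse3 mount is usually detached, but whether that is the preferred way to stop the BlobFS plug-in is not stated in this guide.

```sh
#!/usr/bin/env bash
MOUNT_DIR="${MOUNT_DIR:-/mnt/fuse}"

# Hypothetical helper: succeed only if the directory is currently a mount point.
is_mounted() {
    mountpoint -q "$1" 2>/dev/null
}

if is_mounted "$MOUNT_DIR"; then
    # BlobFS has a flat namespace, so a plain listing shows every file.
    ls -l "$MOUNT_DIR"
    # To detach the mount afterwards (assumption, see note above):
    # fusermount3 -u "$MOUNT_DIR"
else
    echo "$MOUNT_DIR is not mounted"
fi
```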

# Limitations

* BlobFS has primarily been tested with RocksDB so far, so any use cases different from how RocksDB uses a filesystem
  may run into issues. BlobFS will be tested in a broader range of use cases after this initial release.
* Only a synchronous API is currently supported. An asynchronous API has been developed but has not yet been thoroughly
  tested, so it is not part of the public interface. This will be added in a future release.
* File renames are not atomic. This will be fixed in a future release.
* BlobFS currently supports only a flat namespace for files with no directory support. Filenames are currently stored
  as xattrs in each blob. This means that filename lookup is an O(n) operation. An SPDK btree implementation is
  underway which will be the underpinning for BlobFS directory support in a future release.
* Writes to a file must always append to the end of the file. Support for writes to any location within the file
  will be added in a future release.