thread: Update main threading documentation
Change-Id: I47b69efb0e3794bfc6150ae0c8457c637233fe28
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470521
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
# Theory

One of the primary aims of SPDK is to scale linearly with the addition of
hardware. This can mean many things in practice. For instance, moving from one
SSD to two should double the number of I/Os per second. Or doubling the number
of CPU cores should double the amount of computation possible. Or even doubling
the number of NICs should double the network throughput. To achieve this, the
software's threads of execution must be independent from one another as much as
possible. In practice, that means avoiding software locks and even atomic
instructions.

Traditionally, software achieves concurrency by placing some shared data onto
the heap, protecting it with a lock, and then having all threads of execution
acquire the lock only when accessing the data. This model has many great
properties:

* It's easy to convert single-threaded programs to multi-threaded programs
  because you don't have to change the data model from the single-threaded
  version. You add a lock around the data.
* You can write your program as a synchronous, imperative list of statements
  that you read from top to bottom.
* The scheduler can interrupt threads, allowing for efficient time-sharing
  of CPU resources.
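
For illustration, the lock-around-shared-data model described above amounts to
something like the following (a generic pthreads sketch, not SPDK code; the
counter is just a stand-in for any shared state):

```c
#include <pthread.h>
#include <stdint.h>

/* Shared data sits on the heap or in globals, guarded by a single lock. */
static pthread_mutex_t g_stats_lock = PTHREAD_MUTEX_INITIALIZER;
static uint64_t g_io_count;

/* Every thread funnels through the same lock whenever it touches the data. */
void
record_io_completion(void)
{
	pthread_mutex_lock(&g_stats_lock);
	g_io_count++;
	pthread_mutex_unlock(&g_stats_lock);
}
```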

Unfortunately, as the number of threads scales up, contention on the lock around
the shared data does too. More granular locking helps, but then also increases
the complexity of the program. Even then, beyond a certain number of contended
locks, threads will spend most of their time attempting to acquire the locks and
the program will not benefit from more CPU cores.

SPDK takes a different approach altogether. Instead of placing shared data in a
global location that all threads access after acquiring a lock, SPDK will often
assign that data to a single thread. When other threads want to access the data,
they pass a message to the owning thread to perform the operation on their
behalf. This strategy, of course, is not at all new. For instance, it is one of
the core design principles of
[Erlang](http://erlang.org/download/armstrong_thesis_2003.pdf) and is the main
concurrency mechanism in [Go](https://tour.golang.org/concurrency/2). A message
in SPDK consists of a function pointer and a pointer to some context. Messages
are passed between threads using a
[lockless ring](http://dpdk.org/doc/guides/prog_guide/ring_lib.html). Message
passing is often much faster than most software developers' intuition leads them
to believe due to caching effects. If a single core is accessing the same data
(on behalf of all of the other cores), then that data is far more likely to be
in a cache closer to that core. It's often most efficient to have each core work
on a small set of data sitting in its local cache and then hand off a small
message to the next core when done.
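
Conceptually, a message is nothing more than a function pointer plus a context
pointer. The sketch below is illustrative only (the `msg` struct and
`process_messages()` are hypothetical, not SPDK's internal representation); in
SPDK the pending messages sit on a lockless ring rather than a plain array:

```c
#include <stddef.h>

/* A message is just a function pointer plus a pointer to some context. */
typedef void (*msg_fn)(void *ctx);

struct msg {
	msg_fn	 fn;
	void	*ctx;
};

/*
 * The owning thread drains its pending messages and executes each one against
 * the data it owns. In SPDK the pending messages arrive on a lockless ring
 * shared with the sending threads; a plain array stands in for that here.
 */
static void
process_messages(struct msg *msgs, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		msgs[i].fn(msgs[i].ctx);
	}
}
```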

In more extreme cases where even message passing may be too costly, each thread
may make a local copy of the data. The thread will then only reference its local
copy. To mutate the data, threads will send a message to each other thread
telling them to perform the update on their local copy. This is great when the
data isn't mutated very often, but is read very frequently, and is often
employed in the I/O path. This of course trades memory size for computational
efficiency, so it is used in only the most critical code paths.
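
A rough sketch of that per-thread-copy pattern (hypothetical names, not an SPDK
API): readers on the hot path touch only a thread-local copy, and an update is
propagated by sending the same small function to every thread:

```c
#include <stdint.h>

/*
 * Hypothetical example: each thread keeps its own copy of a rarely-changing
 * value (say, a queue depth limit). Reads on the I/O path touch only the
 * thread-local copy, so they never take a lock.
 */
static _Thread_local uint32_t g_local_max_queue_depth = 128;

/*
 * Sent as a message to every thread when the value changes. Each thread runs
 * it against its own copy, so the update also needs no locking.
 */
static void
update_max_queue_depth(void *ctx)
{
	g_local_max_queue_depth = *(uint32_t *)ctx;
}
```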
# Message Passing Infrastructure

SPDK provides several layers of message passing infrastructure. The most
fundamental libraries in SPDK, for instance, don't do any message passing on
their own and instead enumerate rules about when functions may be called in
their documentation (e.g. @ref nvme). Most libraries, however, depend on SPDK's
abstraction, located in `libspdk_thread.a`. The thread abstraction provides a
basic message passing framework and defines a few key primitives.

First, `spdk_thread` is an abstraction for a lightweight, stackless thread of
execution. A lower level framework can execute an `spdk_thread` for a single
timeslice by calling `spdk_thread_poll()`. A lower level framework is allowed to
move an `spdk_thread` between system threads at any time, as long as there is
only a single system thread executing `spdk_thread_poll()` on that
`spdk_thread` at any given time. New lightweight threads may be created at any
time by calling `spdk_thread_create()` and destroyed by calling
`spdk_thread_destroy()`. The lightweight thread is the foundational abstraction
for threading in SPDK.
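
A lower level framework might drive a lightweight thread roughly like this (a
sketch, assuming the `spdk_thread_lib_init()`, `spdk_thread_create()`, and
`spdk_thread_poll()` declarations in `include/spdk/thread.h`, whose exact
signatures have shifted slightly across SPDK releases; error handling omitted):

```c
#include <stdbool.h>

#include "spdk/thread.h"

/* Hypothetical flag the application sets when it is time to exit. */
static volatile bool g_shutdown = false;

/*
 * One system thread's event loop: create a lightweight thread, then keep
 * handing it timeslices until shutdown.
 */
static void
run_lightweight_thread(void)
{
	struct spdk_thread *thread;

	spdk_thread_lib_init(NULL, 0);
	thread = spdk_thread_create("worker", NULL);

	while (!g_shutdown) {
		/* Execute any pending messages and pollers on this spdk_thread. */
		spdk_thread_poll(thread, 0, 0);
	}

	spdk_thread_destroy(thread);
}
```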

There are then a few additional abstractions layered on top of the
`spdk_thread`. One is the `spdk_poller`, which is an abstraction for a
function that should be repeatedly called on the given thread. Another is an
`spdk_msg_fn`, which is a function pointer and a context pointer, that can
be sent to a thread for execution via `spdk_thread_send_msg()`.
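
For example, sending a message and registering a poller look roughly like this
(a sketch assuming the `spdk_thread_send_msg()` and `spdk_poller_register()`
declarations in `include/spdk/thread.h`; the function names are hypothetical):

```c
#include <stdio.h>

#include "spdk/thread.h"

/* An spdk_msg_fn: runs on the thread it was sent to, with the given context. */
static void
say_hello(void *ctx)
{
	printf("hello from %s\n", spdk_thread_get_name(spdk_get_thread()));
}

/* An spdk_poller_fn: called repeatedly on the thread that registered it. */
static int
heartbeat_poll(void *ctx)
{
	/* ... check for completions, expired timers, etc. ... */
	return 0;
}

static void
example(struct spdk_thread *target)
{
	struct spdk_poller *poller;

	/* Ask `target` to run say_hello(NULL) during one of its timeslices. */
	spdk_thread_send_msg(target, say_hello, NULL);

	/* Run heartbeat_poll on the current thread every 1000 microseconds. */
	poller = spdk_poller_register(heartbeat_poll, NULL, 1000);
	(void)poller;	/* eventually: spdk_poller_unregister(&poller); */
}
```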

The library also defines two additional abstractions: `spdk_io_device` and
`spdk_io_channel`. In the course of implementing SPDK we noticed the same
pattern emerging in a number of different libraries. In order to implement a
message passing strategy, the code would describe some object with global state
and also some per-thread context associated with that object that was accessed
in the I/O path to avoid locking on the global state. The pattern was clearest
in the lowest layers where I/O was being submitted to block devices. These
devices often expose multiple queues that can be assigned to threads and then
accessed without a lock to submit I/O. To abstract that, we generalized the
device to `spdk_io_device` and the thread-specific queue to `spdk_io_channel`.
Over time, however, the pattern has appeared in a huge number of places that
don't fit quite so nicely with the names we originally chose. In today's code
`spdk_io_device` is any pointer whose uniqueness is predicated only on its
memory address, and `spdk_io_channel` is the per-thread context associated with
a particular `spdk_io_device`.
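
A typical use of the pattern looks roughly like the following (a sketch
assuming the `spdk_io_device_register()`, `spdk_get_io_channel()`, and
`spdk_io_channel_get_ctx()` declarations in `include/spdk/thread.h`; the
`nvme_dev` structures are hypothetical):

```c
#include "spdk/thread.h"

struct nvme_dev {			/* global state: the io_device */
	int	num_queues;
};

struct nvme_dev_channel {		/* per-thread context: the io_channel */
	int	qid;			/* e.g. a queue owned by one thread */
};

static int
dev_channel_create(void *io_device, void *ctx_buf)
{
	/* Allocate a queue for the calling thread; no locking needed later. */
	return 0;
}

static void
dev_channel_destroy(void *io_device, void *ctx_buf)
{
	/* Release the per-thread queue. */
}

static void
register_dev(struct nvme_dev *dev)
{
	/* Any unique pointer can serve as the io_device handle. */
	spdk_io_device_register(dev, dev_channel_create, dev_channel_destroy,
				sizeof(struct nvme_dev_channel), "nvme_dev");
}

static void
submit_from_this_thread(struct nvme_dev *dev)
{
	struct spdk_io_channel *ch = spdk_get_io_channel(dev);
	struct nvme_dev_channel *dev_ch = spdk_io_channel_get_ctx(ch);

	/* ... submit I/O using dev_ch, this thread's private queue ... */
	(void)dev_ch;
	spdk_put_io_channel(ch);
}
```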

The threading abstraction provides functions to send a message to any other
thread, to send a message to all threads one by one, and to send a message to
all threads for which there is an io_channel for a given io_device.
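
For the last of those, the thread library provides `spdk_for_each_channel()`.
A sketch of how it might be used (assuming its declaration in
`include/spdk/thread.h`; the `reset_*` functions are hypothetical):

```c
#include "spdk/thread.h"

/* Runs once on every thread that holds an io_channel for the io_device. */
static void
reset_channel(struct spdk_io_channel_iter *i)
{
	struct spdk_io_channel *ch = spdk_io_channel_iter_get_channel(i);

	/* ... quiesce or reset this thread's channel ... */
	(void)ch;
	spdk_for_each_channel_continue(i, 0);
}

/* Runs on the originating thread after every channel has been visited. */
static void
reset_done(struct spdk_io_channel_iter *i, int status)
{
}

static void
reset_all_channels(void *io_device)
{
	spdk_for_each_channel(io_device, reset_channel, NULL, reset_done);
}
```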

Most critically, the thread abstraction does not actually spawn any system level
threads of its own. Instead, it relies on the existence of some lower level
framework that spawns system threads and sets up event loops. Inside those event
loops, the threading abstraction simply requires the lower level framework to
repeatedly call `spdk_thread_poll()` on each `spdk_thread` that exists. This
makes SPDK very portable to a wide variety of asynchronous, event-based
frameworks such as [Seastar](https://www.seastar.io) or [libuv](https://libuv.org/).
# The event Framework

The SPDK project didn't want to officially pick an asynchronous, event-based
framework for all of the example applications it shipped with, in the interest
of supporting the widest variety of frameworks possible. But the applications do
of course require something that implements an asynchronous event loop in order
to run, so enter the `event` framework located in `lib/event`. This framework
includes things like spawning one thread per core, pinning each thread to a
unique core, polling and scheduling the lightweight threads, installing signal
handlers to shut down cleanly, and basic command line option parsing. When
started through `spdk_app_start()`, the library automatically spawns all of the
threads requested, pins them, and is ready for lightweight threads to be
created. This makes it much easier to implement a brand new SPDK application and
is the recommended method for those starting out. Only established applications
should consider directly integrating the lower level libraries.
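
A minimal application built on the event framework looks roughly like this (a
sketch assuming the `spdk_app_opts`, `spdk_app_start()`, and `spdk_app_fini()`
interfaces in `include/spdk/event.h`, whose exact signatures have varied a
little between releases; `hello_app` and the callback are hypothetical):

```c
#include "spdk/event.h"

/* Called on the first lightweight thread once the framework is running. */
static void
app_start_cb(void *ctx)
{
	/* ... register io_devices, create pollers, start submitting I/O ... */

	/* When the application is finished, ask the framework to tear down. */
	spdk_app_stop(0);
}

int
main(int argc, char **argv)
{
	struct spdk_app_opts opts;
	int rc;

	spdk_app_opts_init(&opts);
	opts.name = "hello_app";

	/* Spawns one pinned system thread per core, sets up the message
	 * passing infrastructure, then runs app_start_cb on an spdk_thread. */
	rc = spdk_app_start(&opts, app_start_cb, NULL);

	spdk_app_fini();
	return rc;
}
```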
# Limitations of the C Language