doc: added scheduler framework documentation
Added changelog entry for dynamic scheduler, along with general information on
scheduler framework and behaviour of particular scheduler implementations.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
(cherry picked from commit a691353a2a8d99984875c550af037df983996093)
Change-Id: I9fcef56323c4be136b6b531297b070562981eee5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6185
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
parent 68eb748759
commit edb5cd988a
@@ -75,6 +75,10 @@ The `--pci-blacklist` command line option has been deprecated, replaced with
The `--pci-whitelist/-W` command line options have been deprecated, replaced with
`--pci-allowed/-A`.

Added new experimental `dynamic` scheduler that rebalances idle threads, adjusts CPU frequency
using dpdk_governor and switches idle reactor cores to interrupt mode. Please see
[scheduler documentation](https://www.spdk.io/doc/scheduler.html) for details.

## ioat

The PCI BDF whitelist option has been removed from the `ioat_scan_accel_engine` RPC.

@@ -835,6 +835,7 @@ INPUT += \
peer_2_peer.md \
pkgconfig.md \
porting.md \
scheduler.md \
shfmt.md \
spdkcli.md \
spdk_top.md \

@@ -1,5 +1,6 @@
# General Information {#general}

- @subpage event
- @subpage scheduler
- @subpage logical_volumes
- @subpage accel_fw

doc/scheduler.md (normal file, 82 lines)
@@ -0,0 +1,82 @@
# Scheduler {#scheduler}

SPDK's event/application framework (`lib/event`) now supports scheduling of
lightweight threads. Schedulers are provided as plugins, called
implementations. A default implementation is provided, but users may wish to
write their own scheduler to integrate into broader code frameworks or meet
their performance needs.

This feature should be considered experimental and is disabled by default. When
enabled, the scheduler framework gathers data for each `spdk_thread` and reactor
and passes it to a scheduler implementation to perform one of the following
actions.

## Actions

### Move a thread

`spdk_thread`s can be moved to another reactor. Schedulers can examine the
suggested cpu_mask value for each lightweight thread to see if the user has
requested specific reactors, or choose a reactor using whatever algorithm they
deem fit.
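
As an illustration of the choice described above, the following is a minimal
sketch of one possible policy: pick the least-loaded reactor among those the
thread's cpu_mask allows. The function `pick_reactor`, the integer bitmask and
the `reactor_load` dictionary are hypothetical stand-ins for this example and
are not SPDK data structures or APIs.

```python
def pick_reactor(thread_cpu_mask, reactor_load):
    """Return the least-loaded reactor core permitted by the cpu_mask.

    thread_cpu_mask: int bitmask, bit N set means core N is acceptable
                     (hypothetical representation).
    reactor_load:    dict mapping reactor core id -> thread count
                     (hypothetical representation).
    Returns None when no reactor matches, leaving the decision to the caller.
    """
    allowed = [core for core in reactor_load if thread_cpu_mask & (1 << core)]
    if not allowed:
        return None
    # Ties are broken by the lower core id to keep the result deterministic.
    return min(allowed, key=lambda core: (reactor_load[core], core))


load = {0: 3, 1: 1, 2: 0, 3: 2}
print(pick_reactor(0b1010, load))  # cores 1 and 3 are allowed; prints 1
```

A real scheduler implementation would likely weigh busy time rather than a
plain thread count; the count is used here only to keep the sketch short.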

### Switch reactor mode

Reactors by default run in a mode that constantly polls for new actions for the
most efficient processing. Schedulers can switch a reactor into a mode that
instead waits for an event on a file descriptor. On Linux, this is implemented
using epoll. This results in reduced CPU usage but may be less responsive when
events occur. A reactor cannot enter this mode if any `spdk_thread`s are
currently scheduled to it. This limitation is expected to be lifted in the
future, allowing `spdk_thread`s to enter interrupt mode.
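
The difference between polling and waiting on a file descriptor can be shown
with a small, SPDK-independent sketch. It uses Python's standard `selectors`
module (backed by epoll on Linux) and a pipe as the event source. The helper
`wait_for_event` and the pipe are assumptions made for illustration only; they
do not reflect how SPDK reactors are implemented.

```python
import os
import selectors


def wait_for_event(sel, timeout):
    """Sleep until a registered descriptor is readable or the timeout expires.

    Returns the list of ready file descriptors (empty on timeout).
    """
    return [key.fd for key, _events in sel.select(timeout)]


sel = selectors.DefaultSelector()  # epoll-backed on Linux
read_fd, write_fd = os.pipe()
sel.register(read_fd, selectors.EVENT_READ)

# Nothing has been written yet, so a zero timeout returns immediately.
idle = wait_for_event(sel, timeout=0)

# An event arrives: the waiter wakes up without having spun on the CPU.
os.write(write_fd, b"x")
woken = wait_for_event(sel, timeout=1)
os.read(read_fd, 1)  # drain the event

sel.unregister(read_fd)
sel.close()
os.close(read_fd)
os.close(write_fd)
```

The trade-off matches the text above: the waiter consumes no CPU while
blocked, at the cost of a wake-up latency that a busy-polling loop avoids.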

### Set frequency of CPU core

The frequency of CPU cores can be modified by the scheduler in response to
load. Only CPU cores that match the application cpu_mask may be modified. The
mechanism for controlling CPU frequency is pluggable and the default provided
implementation is called `dpdk_governor`, based on the `rte_power` library from
DPDK.

#### Known limitation

When SMT (Hyper-Threading) is enabled, the two logical CPU cores sharing a single
physical CPU core must run at the same frequency. If one of the two logical
CPU cores is outside the application cpu_mask, the policy and frequency on that
core have to be managed by the administrator.
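
To make the limitation concrete, the sketch below lists the logical cores that
share a physical core with an application core but fall outside the
application cpu_mask. These are the cores an administrator would have to
manage manually. The function name, the bitmask representation and the
hard-coded sibling pairs are assumptions for this example; on a real system
the sibling topology would have to be discovered from the platform.

```python
def siblings_outside_mask(app_mask, sibling_pairs):
    """Return logical cores whose SMT sibling is in app_mask but which are not.

    app_mask:      int bitmask of cores used by the application (hypothetical).
    sibling_pairs: iterable of (core_a, core_b) tuples sharing a physical core
                   (hypothetical, hard-coded input).
    """
    unmanaged = []
    for core_a, core_b in sibling_pairs:
        a_in = bool(app_mask & (1 << core_a))
        b_in = bool(app_mask & (1 << core_b))
        # Exactly one sibling inside the mask: the other needs manual handling.
        if a_in != b_in:
            unmanaged.append(core_b if a_in else core_a)
    return unmanaged


# Application runs on cores 0 and 1; cores 4 and 5 are their SMT siblings.
print(siblings_outside_mask(0b0011, [(0, 4), (1, 5), (2, 6)]))  # prints [4, 5]
```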

## Scheduler implementations

The scheduler in use may be controlled by JSON-RPC. Please use the
[framework_set_scheduler](jsonrpc.md#rpc_framework_set_scheduler) RPC to
switch between schedulers or change their options.
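
As a rough sketch of what such a call might look like on the wire, the snippet
below builds a JSON-RPC 2.0 request for `framework_set_scheduler`. The method
name comes from the text above; the `name` parameter and the helper
`build_set_scheduler_request` are assumptions for illustration, so the JSON-RPC
documentation remains the reference for the actual parameter list.

```python
import json


def build_set_scheduler_request(name, request_id=1):
    """Serialize a JSON-RPC 2.0 request selecting a scheduler by name.

    The "name" parameter is an assumption made for this sketch.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "framework_set_scheduler",
        "params": {"name": name},
    }
    return json.dumps(request)


print(build_set_scheduler_request("dynamic"))
```

The resulting string would then be sent to the application's RPC socket by
whatever client is in use; transport details are left out of this sketch.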

[spdk_top](spdk_top.md#spdk_top) is a useful tool to observe the behavior of
schedulers in different scenarios and workloads.

### static [default]

The `static` scheduler is the default scheduler and does no dynamic scheduling.
Lightweight threads are distributed round-robin among reactors, respecting
their requested cpu_mask, and then they are never moved. This is equivalent to
the previous behavior of the SPDK event/application framework.
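
The round-robin placement described above can be sketched as follows. This is
a simplified model under stated assumptions, not the SPDK implementation: the
function `assign_round_robin`, the bitmask representation of cpu_mask and the
rule that threads without any matching reactor are simply skipped are all
hypothetical.

```python
def assign_round_robin(thread_masks, reactors):
    """Place each thread on the next reactor, in rotation, that its mask allows.

    thread_masks: dict thread name -> int cpu_mask (hypothetical representation).
    reactors:     ordered list of reactor core ids.
    Threads whose mask matches no reactor are left out of the result.
    """
    placement = {}
    next_idx = 0
    for name, mask in thread_masks.items():
        for offset in range(len(reactors)):
            idx = (next_idx + offset) % len(reactors)
            core = reactors[idx]
            if mask & (1 << core):
                placement[name] = core
                next_idx = (idx + 1) % len(reactors)  # continue after this one
                break
    return placement


threads = {"a": 0b111, "b": 0b111, "c": 0b001, "d": 0b111}
print(assign_round_robin(threads, [0, 1, 2]))
# prints {'a': 0, 'b': 1, 'c': 0, 'd': 1}
```

Once placed, threads stay where they are; the function is called only once in
this model, which mirrors the "never moved" property of the `static` scheduler.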

### dynamic

The `dynamic` scheduler is designed for power saving and reduction of CPU
utilization, especially in cases where workloads show large variations over
time.

Active threads are distributed equally among reactors, taking cpu_mask into
account. All idle threads are moved to the main core. Once an idle thread becomes
active, it is redistributed again.

When a reactor has no scheduled `spdk_thread`s it is switched into interrupt
mode and stops actively polling. After enough threads become active, the
reactor is switched back into poll mode and threads are assigned to it again.
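
The behavior described in the two paragraphs above can be modeled with a short
sketch: idle threads go to the main core, active threads are spread over the
reactors their cpu_mask allows, and reactors left without threads are marked
for interrupt mode. Everything here (the function `rebalance`, the
active/idle flag, the bitmask and the mode strings) is a hypothetical
simplification; in particular, the sketch ignores execution time entirely.

```python
def rebalance(threads, reactors, main_core):
    """Return (placement, modes) for one scheduling pass.

    threads:   dict name -> (is_active, cpu_mask), a hypothetical input format.
    reactors:  list of reactor core ids.
    main_core: core id that collects all idle threads.
    """
    placement = {}
    active_count = {core: 0 for core in reactors}
    for name, (is_active, cpu_mask) in threads.items():
        if not is_active:
            placement[name] = main_core  # idle threads gather on the main core
            continue
        allowed = [c for c in reactors if cpu_mask & (1 << c)]
        if not allowed:
            raise ValueError(f"no reactor matches the cpu_mask of {name}")
        # Spread active threads evenly: choose the reactor with the fewest.
        core = min(allowed, key=lambda c: (active_count[c], c))
        placement[name] = core
        active_count[core] += 1
    occupied = set(placement.values())
    modes = {c: "poll" if c in occupied else "interrupt" for c in reactors}
    return placement, modes


threads = {
    "t0": (True, 0b1110),
    "t1": (True, 0b1110),
    "t2": (False, 0b1111),
    "t3": (False, 0b1000),
}
placement, modes = rebalance(threads, reactors=[0, 1, 2, 3], main_core=0)
print(placement)  # {'t0': 1, 't1': 2, 't2': 0, 't3': 0}
print(modes)      # {0: 'poll', 1: 'poll', 2: 'poll', 3: 'interrupt'}
```

Running the pass again after `t3` becomes active would move it to core 3 and
switch that reactor back to poll mode, which corresponds to the
redistribution step in the text.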

The main core can contain active threads only when their execution time does
not exceed the sum of the execution times of all idle threads. When no active
threads are present on the main core, the frequency of that CPU core will
decrease as the load decreases. All CPU cores corresponding to the other
reactors remain at maximum frequency.