Since in most configurations CTL serves as network service, we found
that this change improves local system interactivity under heavy load.
Priority of main threads is set slightly higher then worker taskqueues
to make them quickly sort incoming requests not creating bottlenecks,
while plenty of worker taskqueues should be less sensitive to latency.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
ioctl frontend ports.
This revision introduces two changes to CTL:
- Changes the way options are passed to CTL_LUN_REQ and CTL_PORT_REQ ioctls.
Removes ctl_be_arg structure and associated logic and replaces it with
nv(3)-based logic for passing in and out arguments.
- Allows creating multiple ioctl frontend ports using either ctladm(8) or
ctld(8).
New frontend ports are represented by /dev/cam/ctl<pp>.<vp> nodes, eg /dev/cam/ctl5.3.
Those device nodes respond only to CTL_IO ioctl.
New command-line options for ctladm:
# creates new ioctl frontend port with using free pp and vp=0
ctladm port -c
# creates new ioctl frontend port with pp=10 and vp=0
ctladm port -c -O pp=10
# creates new ioctl frontend port with pp=11 and vp=12
ctladm port -c -O pp=11 -O vp=12
# removes port with number 4 (it's a "targ_port" number, not pp number)
ctladm port -r -p 4
New syntax for ctl.conf:
target ... {
port ioctl/<pp>
...
}
target ... {
port ioctl/<pp>/<vp>
...
Note: Most of this work was made by jceel@, thank you.
Submitted by: jceel
Reworked by: myself
Reviewed by: mav (earlier versions and recently during the rework)
Obtained from: FreeNAS and TrueOS
Relnotes: Yes
Sponsored by: iXsystems Inc.
Differential Revision: https://reviews.freebsd.org/D9299
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
It's awkward to have spaces in CAM device serial numbers. That leads to
such things as device nodes named "/dev/diskid/MYSERIAL%20%20%201". Better
to replace the spaces with "0"s. This change only affects the default
serial numbers for users who don't provide their own.
Reviewed by: ken, mav
MFC after: Never
Relnotes: Yes
Sponsored by: Spectra Logic Corp
Differential Revision: https://reviews.freebsd.org/D12263
It is only a first step and not perfect, but better then nothing.
The main blocker is CAM target frontend, that can not be unloaded,
since CAM does not have mechanism to unregister periph driver now.
MFC after: 2 weeks
It seems like kern_data_resid was never really implemented. This change
finally does it. Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.
CAM target frontend still does not properly handle overruns due to CAM API
limitations. We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).
MFC after: 2 weeks
- Since I/Os are allocates from per-port pools, make allocations store
pointer to CTL softc there, and use it where needed instead of global.
- Created bunch of helper macros to access LUN, port and CTL softc.
MFC after: 2 weeks
- Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect
at the moment, but will be needed for some future changes.
- Don't hardcode the module component of the probe identifier. This is
set automatically by the SDT framework.
MFC after: 1 week
I am not sure what for it was done. Now open routine should automatically
fall back to read-only if open for writing is impossible. In such case
attempt to upgrade to write sounds strange.
MFC after: 1 week
Previously, with serseq enabled, next command was unblocked only after
previous completed. With this change, for read operations, next command
is unblocked as soon as last media read completed. This is important
for frontends that actually wait for data move completion (like camtgt),
or when data are moved through the HA link, or especially when both.
All requests arriving for processing after OFFLINE flag set are rejected
with BUSY status. Races around OFFLINE flag setting are closed by calling
taskqueue_drain_all().
CTL HA functionality was originally implemented by Copan many years ago,
but large part of the sources was never published. This change includes
clean room implementation of the missing code and fixes for many bugs.
This code supports dual-node HA with ALUA in four modes:
- Active/Unavailable without interlink between nodes;
- Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
- Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
- Active/Active with second node working as proxy, transfering all
commands to the first node for execution through the interlink.
Unlike original Copan's implementation, depending on specific hardware,
this code uses simple custom TCP-based protocol for interlink. It has
no authentication, so it should never be enabled on public interfaces.
The code may still need some polishing, but generally it is functional.
Relnotes: yes
Sponsored by: iXsystems, Inc.
This is preparation for possibility to open/close media several times
per LUN life cycle. While there, rename variables to reduce confusion.
As additional bonus this allows to open read-only media, such as ZFS
snapshots.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had consumer for that functionality,
while its implementation polluted many unrelated places..
At this point IMMED flag is translated to MNT_NOWAIT flag of VOP_FSYNC(),
hoping that file system implements that (ZFS seems doesn't).
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Previously several places were doing it on its own, partially
incorrectly (e.g. without the filedesc locked) or even actively harmful
by populating jdir or assigning rootvnode without vrefing it.
Reviewed by: kib
This change by 2-3 times improves performance of misaligned XCOPY and WUT
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
This change by 2-3 times improves performance of misaligned WRITE SAME
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
While in most cases CTL should correctly fetch those values from backing
storages, there are some initiators (like MS SQL), that may not like large
physical block sizes, even if they are true. For such cases allow override
fetched values with supported ones (like 4K).
MFC after: 1 week
Technically read requests can be executed in any order or simultaneously
since they are not changing any data. But ZFS prefetcher goes crasy when
it receives consecutive requests from different threads. Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.
This patch is more workaround then a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4x times
in some configurations. On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.
MFC after: 2 weeks
Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too. Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system. Pool thresholds are still not implemented in this case.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.