Commit Graph

41 Commits

Author SHA1 Message Date
Marcelo Araujo
8951f05525 Rework CTL frontend & backend options to use nv(3), allow creating multiple
ioctl frontend ports.

This revision introduces two changes to CTL:
- Changes the way options are passed to CTL_LUN_REQ and CTL_PORT_REQ ioctls.
  Removes ctl_be_arg structure and associated logic and replaces it with
  nv(3)-based logic for passing in and out arguments.
- Allows creating multiple ioctl frontend ports using either ctladm(8) or
  ctld(8).
  New frontend ports are represented by /dev/cam/ctl<pp>.<vp> nodes, eg /dev/cam/ctl5.3.
  Those device nodes respond only to CTL_IO ioctl.

New command-line options for ctladm:
# creates new ioctl frontend port with using free pp and vp=0
ctladm port -c
# creates new ioctl frontend port with pp=10 and vp=0
ctladm port -c -O pp=10
# creates new ioctl frontend port with pp=11 and vp=12
ctladm port -c -O pp=11 -O vp=12
# removes port with number 4 (it's a "targ_port" number, not pp number)
ctladm port -r -p 4

New syntax for ctl.conf:
target ... {
    port ioctl/<pp>
    ...
}

target ... {
    port ioctl/<pp>/<vp>
    ...

Note: Most of this work was made by jceel@, thank you.

Submitted by:	jceel
Reworked by:	myself
Reviewed by:	mav (earlier versions and recently during the rework)
Obtained from:  FreeNAS and TrueOS
Relnotes:	Yes
Sponsored by:	iXsystems Inc.
Differential Revision:	https://reviews.freebsd.org/D9299
2018-05-10 03:50:20 +00:00
Pedro F. Giffuni
f24882eca5 SPDX: finish tagging sys/cam. 2018-01-16 23:19:57 +00:00
Alexander Motin
9ff948d05f Polish handling of different reset flavours.
The biggest change is that ctl_remove_initiator() now generates I_T NEXUS
LOSS event, cleaning part of LUs state related to the initiator.

MFC after:	2 weeks
2017-02-27 14:59:00 +00:00
Alexander Motin
46511441fb Change XCOPY memory allocations.
Before this change XCOPY code could allocate memory in chunks up to 16-32MB
(VMware does XCOPY in 4MB chunks by default), that could be difficult for
VM subsystem to do due to KVA fragmentation, that sometimes created huge
allocation delays, blocking any I/O for respective LU for that time.

This change limits allocations down to TPC_MAX_IO_SIZE, which is 1MB now.
1MB is also not a cookie, but ZFS also can do that for large blocks, so
it should be less dramatic.  As drawback this increases CPU overhead, but
it still look acceptable comparing to time consumed by ZFS read/write.

MFC after:	1 week
2017-02-18 06:03:16 +00:00
Alexander Motin
640603fbf6 Remove writing 'residual' field of struct ctl_scsiio.
This field has no practical use and never readed.  Initiators already
receive respective residual size from frontends.  Removed field had
different semantics, which looks useless, and was never passed through
by any frontend.

While there, fix kern_data_resid field support in case of HA, missed in
r312291.

MFC after:	13 days
2017-01-17 18:32:47 +00:00
Alexander Motin
eb6ac6f9db Make CTL frontends report kern_data_resid for under-/overruns.
It seems like kern_data_resid was never really implemented.  This change
finally does it.  Now frontends update this field while transferring data,
while CTL/backends getting it can more flexibly handle the result.
At this point behavior should not change significantly, still reporting
errors on write overrun, but that may be changed later, if we decide so.

CAM target frontend still does not properly handle overruns due to CAM API
limitations.  We may need to add some fields to struct ccb_accept_tio to
pass information about initiator requested transfer size(s).

MFC after:	2 weeks
2017-01-16 16:19:55 +00:00
Alexander Motin
9cbbfd2f5c Improve use of I/O's private area.
- Since I/Os are allocates from per-port pools, make allocations store
pointer to CTL softc there, and use it where needed instead of global.
 - Created bunch of helper macros to access LUN, port and CTL softc.

MFC after:	 2 weeks
2016-12-29 15:09:34 +00:00
Alexander Motin
4124315924 Remove CTL_MAX_LUNS from places where it is not required.
MFC after:	2 weeks
2016-12-25 13:34:02 +00:00
Alexander Motin
a3dd837892 Improve third-party copy error reporting.
For EXTENDED COPY:
 - improve parameters checking to report some errors before copy start;
 - forward sense data from copy target as descriptor in case of error;
 - report which CSCD reported error in sense key specific information.
For WRITE USING TOKEN:
 - pass through real sense data from copy target instead of reporting
our copy error, since for initiator its a "simple" write, not a copy.

MFC after:	2 weeks
2016-12-25 09:40:44 +00:00
Alexander Motin
32920cbfa3 When reporting "Logical block address out of range" error, report the LBA
in sense data INFORMATION field.

MFC after:	2 weeks
2016-12-19 19:00:03 +00:00
Alexander Motin
8fadf66094 Fix previous commit to report proper error code.
MFC after:	2 weeks
2016-05-10 08:37:41 +00:00
Alexander Motin
38618bf430 Validate XCOPY range offsets and lengths.
MFC after:	2 weeks
2016-05-10 08:28:16 +00:00
Alexander Motin
e13f4248db More XCOPY parameters validation.
MFC after:	2 weeks
2016-05-10 08:08:39 +00:00
Alexander Motin
3eb7651aad Improve validation of some POPULATE TOKEN parameters.
MFC after:	2 weeks
2016-05-10 07:14:49 +00:00
Alexander Motin
0952a19f7d More aggressively fill WUT read pipeline.
On some tests I've measured 5% copy speedup from this.
2015-10-01 19:07:15 +00:00
Alexander Motin
6ac1446d0e Make zero WUT use WRITE SAME with recently allowed NDOB flag. 2015-10-01 16:30:20 +00:00
Alexander Motin
862aedb0d6 Fix arguments order. 2015-09-29 12:53:41 +00:00
Alexander Motin
042e9bdc41 Report number of failed XCOPY segment. 2015-09-17 14:22:52 +00:00
Alexander Motin
a65a997fd9 Improve XCOPY error reporting. 2015-09-12 16:30:01 +00:00
Alexander Motin
238b6b7c75 Report that we have no limit on POPULATE TOKEN segment size. 2015-09-12 14:20:11 +00:00
Alexander Motin
7ac58230ea Reimplement CTL High Availability.
CTL HA functionality was originally implemented by Copan many years ago,
but large part of the sources was never published.  This change includes
clean room implementation of the missing code and fixes for many bugs.

This code supports dual-node HA with ALUA in four modes:
 - Active/Unavailable without interlink between nodes;
 - Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
 - Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
 - Active/Active with second node working as proxy, transfering all
commands to the first node for execution through the interlink.

Unlike original Copan's implementation, depending on specific hardware,
this code uses simple custom TCP-based protocol for interlink.  It has
no authentication, so it should never be enabled on public interfaces.

The code may still need some polishing, but generally it is functional.

Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2015-09-10 12:40:31 +00:00
Alexander Motin
2f444d157b Drop "internal" CTL frontend.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had consumer for that functionality,
while its implementation polluted many unrelated places..
2015-08-15 13:34:38 +00:00
Alexander Motin
73942c5ce0 Issue all reads of single XCOPY segment simultaneously.
During vMotion and Clone VMware by default runs multiple sequential 4MB
XCOPY requests same time.  If CTL issues reads sequentially in 1MB chunks
for each XCOPY command, reads from different commands are not detected
as sequential by serseq option code and allowed to execute simultaneously.
Such read pattern confused ZFS prefetcher, causing suboptimal disk access.
Issuing all reads same time make serseq code work properly, serializing
reads both within each XCOPY command and between them.

My tests with ZFS pool of 14 disks in RAID10 shows prefetcher efficiency
improved from 37% to 99.7%, copying speed improved by 10-60%, average
read latency reduced twice on HDD layer and by five times on zvol layer.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-08-05 13:46:15 +00:00
Alexander Motin
2d8b28765c Introduce separate lock for tokens to reduce ctl_lock scope. 2015-06-20 11:20:25 +00:00
Alexander Motin
fee04ef7a9 Make XCOPY and WUT commands respect physical block size/offset.
This change by 2-3 times improves performance of misaligned XCOPY and WUT
commands by avoiding unneeded read-modify-write cycles inside ZFS.

MFC after:	1 week
2015-02-12 15:46:44 +00:00
Alexander Motin
117f1bc17f Fix wrong LUN reference in XCOPY block-to-block operation.
This could cause data corruption due to accessing wrong LUN in case of
retries on write errors.  Failed writes were retried to read LUN.

MFC after:	3 days
2015-01-24 15:40:52 +00:00
Alexander Motin
9602f43616 Reduce number of places where global control_softc is used.
At some point we may want to have several CTL instances, and that is not
really impossible.

MFC after:	2 weeks
2014-12-19 20:35:06 +00:00
Alexander Motin
2a72b5936d Plug memory leaks on UNMAP and XCOPY with invalid parameters.
MFC after:	1 week
2014-12-03 08:25:41 +00:00
Alexander Motin
f7241cceb0 Coalesce last data move and command status for read commands.
Make CTL core and block backend set success status before initiating last
data move for read commands.  Make CAM target and iSCSI frontends detect
such condition and send command status together with data.  New I/O flag
allows to skip duplicate status sending on later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS.  For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-25 17:53:35 +00:00
Alexander Motin
1251a76b12 Replace home-grown CTL IO allocator with UMA.
Old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is lack of reliable preallocation, that could guarantee
successful allocation in non-sleepable environments.  But careful code
review shown, that only CAM target frontend really has that requirement.
Fix that making that frontend preallocate and statically bind CTL I/O for
every ATIO/INOT it preallocates any way.  That allows to avoid allocations
in hot I/O path.  Other frontends either may sleep in allocation context
or can properly handle allocation errors.

On 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS!  Yay! :)

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-24 11:37:27 +00:00
Alexander Motin
0b060244ab Fix couple issues with ROD tokens content.
MFC after:	3 days
2014-10-01 11:30:20 +00:00
Alexander Motin
4ab4d6879c Fix tpc_create_token() introduced in r269497 to encode CREATOR LOGICAL UNIT
DESCRIPTOR field as Identification Descriptor CSCD descriptor, not just as
Identification Descriptor.

MFC after:	3 days
2014-09-17 07:08:59 +00:00
Alexander Motin
2ac1d5afc8 Fix lock recursion on LUN shutdown, introduced on r269497.
MFC after:	3 days
2014-08-19 17:04:18 +00:00
Alexander Motin
e3e592bb7d Reimplement WRITE USING TOKEN with Block Zero token using WRITE SAME.
On my ZVOL of SSDs that increases speed of zero writing in that way from
1 to 2.5GB/s by reducing CPU overhead.
MFC after:	2 weeks
2014-08-05 15:01:30 +00:00
Alexander Motin
25eee848cd Add support for Windows dialect of EXTENDED COPY command, aka Microsoft ODX.
This allows to avoid extra network traffic when copying files on NTFS iSCSI
disks within one storage host by drag'n'dropping them in Windows Explorer
of Windows 8/2012.  It should also accelerate Hyper-V VM operations, etc.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-08-04 01:16:20 +00:00
Alexander Motin
c5c6059574 Rework r269444 to work also for lists without IDs.
MFC after:	3 days
2014-08-02 23:20:43 +00:00
Alexander Motin
475267eff1 Plug EXTENDED COPY request data memory leak.
MFC after:	3 days
2014-08-02 20:15:00 +00:00
Alexander Motin
a7c09f5c12 Fix some bugs in RECEIVE COPY STATUS data.
MFC after:	3 days
2014-08-02 19:59:19 +00:00
Alexander Motin
43d2d71953 Add missing comparisons to make list IDs in EXTENDED COPY per-initiator,
as they should be.  Wrap it into a function to not duplicate the code.

MFC after:	3 days
2014-08-02 19:51:10 +00:00
Alexander Motin
8cbf9eae6f Increase maximal number of SCSI ports in CTL from 32 to 128.
After I gave each iSCSI target its own port, the old limit appeared to be
not so big.  This change almost proportionally increases per-LUN memory
use, but it is still three times better then it was before r268807.

MFC after:	2 weeks
2014-07-17 21:16:52 +00:00
Alexander Motin
984a2ea91f Add support for VMWare dialect of EXTENDED COPY command, aka VAAI Clone.
This allows to clone VMs and move them between LUNs inside one storage
host without generating extra network traffic to the initiator and back,
and without being limited by network bandwidth.

LUNs participating in copy operation should have UNIQUE NAA or EUI IDs set.
For LUNs without these IDs VMWare will use traditional copy operations.

Beware: the above LUN IDs explicitly set to values non-unique from the VM
cluster point of view may cause data corruption if wrong LUN is addressed!

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-07-16 15:57:17 +00:00