Commit Graph

44 Commits

Author SHA1 Message Date
Alexander Motin
920c6cbadc CTL LUN mapping rewrite.
Replace iSCSI-specific LUN mapping mechanism with new one, working for any
ports.  By default all ports are created without LUN mapping, exposing all
CTL LUNs as before.  But, if needed, LUN mapping can be manually set on
per-port basis via ctladm.  For its iSCSI ports ctld does it via ioctl(2).
The next step will be to teach ctld to work with FibreChannel ports also.

Respecting additional flexibility of the new mechanism, ctl.conf now allows
alternative syntax for LUN definition.  LUNs can now be defined in global
context, and then referenced from targets by unique name, as needed.  It
allows same LUN to be exposed several times via multiple targets.

While there, increase limit for LUNs per target in ctld from 256 to 1024.
Some initiators do not support LUNs above 255, but that is not our problem.

Discussed with:	trasz
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2015-02-01 21:50:28 +00:00
Alexander Motin
bfbfc4a3cb Count consecutive read requests as blocking in CTL for files and ZVOLs.
Technically read requests can be executed in any order or simultaneously
since they are not changing any data.  But ZFS prefetcher goes crasy when
it receives consecutive requests from different threads.  Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.

This patch is more workaround then a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4x times
in some configurations.  On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.

MFC after:	2 weeks
2014-12-06 20:39:25 +00:00
Alexander Motin
ef8daf3fed Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-04 11:34:19 +00:00
Alexander Motin
9e52565344 Do not pre-allocate UNIT ATTENTIONs storage for every possible initiator.
Abusing ability of major UAs cover minor ones we may not account UAs for
inactive ports.  Allocate UAs storage for port and start accounting only
after some initiator from that port fetched its first POWER ON OCCURRED.

This reduces per-LUN CTL memory usage from >1MB to less then 100K.

MFC after:	1 month
2014-12-03 15:16:18 +00:00
Alexander Motin
411598df7a Do not pre-allocate reservation keys memory for every possible initiator.
In configurations with many ports, like iSCSI, each LUN is typically
accessed only by limited subset of ports.  Allocating that memory on
demand allows to reduce CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.

MFC after:	1 month
2014-12-03 09:05:53 +00:00
Alexander Motin
77a06f9db0 Convert persis_offset from global variable to softc field. 2014-12-02 12:38:22 +00:00
Alexander Motin
1251a76b12 Replace home-grown CTL IO allocator with UMA.
Old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is lack of reliable preallocation, that could guarantee
successful allocation in non-sleepable environments.  But careful code
review shown, that only CAM target frontend really has that requirement.
Fix that making that frontend preallocate and statically bind CTL I/O for
every ATIO/INOT it preallocates any way.  That allows to avoid allocations
in hot I/O path.  Other frontends either may sleep in allocation context
or can properly handle allocation errors.

On 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS!  Yay! :)

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-24 11:37:27 +00:00
Alexander Motin
23b30f5600 Partially reconstruct Active/Standby clusting.
In this mode one head is in Active state, supporting all commands, while
another is in Standby state, supporting only minimal LUN discovery subset.

It is still incomplete since Standby state requires reservation support,
which is impossible to do right without having interlink between heads.
But it allows to run some basic experiments.
2014-11-21 06:27:37 +00:00
Alexander Motin
3275003e03 Synchronize medium rotation rate in legacy Rigid Disk Drive Geometry mode
page with modern Block Device Characteristics VPD page.

MFC after:	1 week
2014-11-07 00:10:07 +00:00
Alexander Motin
c3e7ba3e6d Add to CTL support for logical block provisioning threshold notifications.
For ZVOL-backed LUNs this allows to inform initiators if storage's used or
available spaces get above/below the configured thresholds.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-11-06 00:48:36 +00:00
Alexander Motin
115dc0c762 Reduce code duplication around Write Exclusive persistent reservation.
While there, allow some more commands to pass persistent reservation.

MFC after:	1 week
2014-10-27 09:26:24 +00:00
Alexander Motin
b491e75b4b Allocate buffer for READ BUFFER/WRITE BUFFER commands on demand.
These commands are rare, but consume additional 256KB RAM per LUN.

MFC after:	1 week
2014-10-26 23:25:42 +00:00
Alexander Motin
218d5d4be2 Implement more functional CTL debug logging.
Setting bits in kern.cam.ctl.debug allows to log errors, commands and some
commands data respectively.

MFC after:	1 week
2014-10-16 08:42:17 +00:00
Alexander Motin
9a0190c9a1 Remove couple Copan's vendor-specific mode pages.
Those pages are highly system-/hardware-specific, the code is incomplete,
and so they hardly can be useful for anybody else.
2014-10-14 11:28:25 +00:00
Alexander Motin
523f047ea2 Some groundwork for later Informational Exceptions support.
This includes support for:
 - Read-Write Error Recovery mode page;
 - Informational Exceptions Control mode page;
 - Logical Block Provisioning mode page;
 - LOG SENSE command.

No real Informational Exceptions features yet. This is only a placeholder.

Sponsored by:	iXsystems, Inc.
2014-10-14 10:14:14 +00:00
Alexander Motin
d70698b372 Add support for READ DEFECT DATA (10/12) commands.
SPC-4 r2 allows to return empty defect list if the list is not supported.
We don't reallu support defect data lists, but this suppresses some errors.

MFC after:	1 week
2014-10-13 14:48:49 +00:00
Alexander Motin
0ca2fdeca2 Store persistent reservation keys as uint64_t instead of uint8_t[8].
This allows to simplify the code and save 512KB of RAM per LUN (8%)
by removing no longer needed "registered" keys flags.
2014-10-10 12:38:53 +00:00
Alexander Motin
5396f4d279 Implement software (mode page) and hardware (config) write protection. 2014-10-08 12:24:24 +00:00
Alexander Motin
2f872218e7 Simplify legacy reservation handling. Drop it on I_T nexus loss. 2014-09-22 07:59:25 +00:00
Alexander Motin
ab55ae255a Implement control over command reordering via options and control mode page.
It allows to bypass range checks between UNMAP and READ/WRITE commands,
which may introduce additional delays while waiting for UNMAP parameters.
READ and WRITE commands are always processed in safe order since their
range checks are almost free.
2014-09-13 10:34:23 +00:00
Alexander Motin
bfa005503c Make ctl_port_mask an array to support more then 32 ports.
Overflow reported by Coverity.

CID:		1229894
MFC after:	3 days
2014-09-10 07:16:17 +00:00
Alexander Motin
55551d0542 Improve cache control support, including DPO/FUA flags and the mode page.
At this moment it works only for files and ZVOLs in device mode since BIOs
have no respective respective cache control flags (DPO/FUA).

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-09-09 11:38:29 +00:00
Alexander Motin
25eee848cd Add support for Windows dialect of EXTENDED COPY command, aka Microsoft ODX.
This allows to avoid extra network traffic when copying files on NTFS iSCSI
disks within one storage host by drag'n'dropping them in Windows Explorer
of Windows 8/2012.  It should also accelerate Hyper-V VM operations, etc.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-08-04 01:16:20 +00:00
Alexander Motin
8cbf9eae6f Increase maximal number of SCSI ports in CTL from 32 to 128.
After I gave each iSCSI target its own port, the old limit appeared to be
not so big.  This change almost proportionally increases per-LUN memory
use, but it is still three times better then it was before r268807.

MFC after:	2 weeks
2014-07-17 21:16:52 +00:00
Alexander Motin
38afa8f733 Reduce per-LUN memory usage from 18MB to 1.8MB.
CTL never had use for CA support code since SPI has gone, and there is no
even frontends supporting that.  But it still was reserving 256 bytes of
memory per LUN per every possible initiator on every possible port.

Wrap unused code with ifdef's in case somebody even need it.

MFC after:	2 weeks
2014-07-17 20:28:51 +00:00
Alexander Motin
984a2ea91f Add support for VMWare dialect of EXTENDED COPY command, aka VAAI Clone.
This allows to clone VMs and move them between LUNs inside one storage
host without generating extra network traffic to the initiator and back,
and without being limited by network bandwidth.

LUNs participating in copy operation should have UNIQUE NAA or EUI IDs set.
For LUNs without these IDs VMWare will use traditional copy operations.

Beware: the above LUN IDs explicitly set to values non-unique from the VM
cluster point of view may cause data corruption if wrong LUN is addressed!

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-07-16 15:57:17 +00:00
Alexander Motin
4d877c4148 Merge several equal serialization indexes. 2014-07-13 06:01:23 +00:00
Alexander Motin
604e257984 Teach ctl_add_initiator() to dynamically allocate IIDs from pool.
If port passed negative IID value, the function will try to allocate IID
from the pool of unused, based on passed wwpn or name arguments.  It does
all its best to make IID unique and persistent across reconnects.

This makes persistent reservation properly work for iSCSI.  Previously,
in case of reconnects, reservation could be unexpectedly lost, or even
migrate between intiators.
2014-07-07 09:37:22 +00:00
Alexander Motin
69d7b87790 Make REPORT TARGET PORT GROUPS command report realistic data instead of
hardcoded garbage.
2014-07-06 07:02:36 +00:00
Alexander Motin
027e5269c9 Burry devid port method, which was a gross hack.
Instead make ports provide wanted port and target IDs, and LUNs provide
wanted LUN IDs.  After that core Device ID VPD code only had to link all
of them together and add relative port and port group numbers.

LUN ID for iSCSI LUNs no longer created by CTL, but by ctld, and passed
to CTL as "scsiname" LUN option.  This makes LUNs to report the same set
of IDs, independently from the port through which it is accessed, as
required by SCSI specifications.
2014-07-05 19:30:20 +00:00
Alexander Motin
92168f4c01 Separate concepts of frontend and port.
Before iSCSI implementation CTL had no knowledge about frontend drivers,
it had only frontends, which really were ports (alike to LUNs, if comparing
to backends).  But iSCSI added there ioctl() method, which does not belong
to frontend as a port, but belongs to a frontend driver.
2014-07-04 19:27:06 +00:00
Alexander Motin
25c9d5e593 Add support for REPORT TIMESTAMP command.
MFC after:	2 weeks
2014-07-01 16:52:41 +00:00
Alexander Motin
1b08cb4ee7 Add more formal and strict command parsing and validation.
For every supported command define CDB length and mask of bits that are
allowed to be set.  This allows to remove bunch of checks through the code
and still make the validation more strict.  To properly do it for commands
supporting multiple service actions, formalize their parsing by adding
subtables for each of such commands.

As visible effect, this change allows to add support for REPORT SUPPORTED
OPERATION CODES command, reporting to client all the data about supported
SCSI commands, except timeouts.

MFC after:	2 weeks
2014-07-01 15:05:23 +00:00
Alexander Motin
85165a3f70 Add READ BUFFER and improve WRITE BUFFER SCSI commands support.
This gives some use to 512KB per-LUN buffers, allocated for Copan-specific
processor code and not used.  It allows, for example, to test transport
performance and/or correctness without accessing the media, as supported
by Linux version of sg3_utils.

MFC after:	2 weeks
2014-06-26 08:56:36 +00:00
Alexander Motin
3a8ce4a36b Introduce fine-grained CTL locking to improve SMP scalability.
Split global ctl_lock, historically protecting most of CTL context:
 - remaining ctl_lock now protects lists of fronends and backends;
 - per-LUN lun_lock(s) protect LUN-specific information;
 - per-thread queue_lock(s) protect request queues.
This allows to radically reduce congestion on ctl_lock.

Create multiple worker threads, depending on number of CPUs, and assign
each LUN to one of them.  This allows to spread load between multiple CPUs,
still avoiging congestion on queues and LUNs locks.

On 40-core server, exporting 5 LUNs, each backed by gstripe of SATA SSDs,
accessed via 6 iSCSI connections, this change improves peak request rate
from 250K to 680K IOPS.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-25 17:02:01 +00:00
Alexander Motin
50fe38b6b8 Execute task management request directly in ctl_queue() context.
From one side it allows to remove CTL_FLAG_TASK_PENDING flag, handling of
which significantly complicates fine-grained locking.  From the other side
it reduces task management requests latency even below then that flag could.
As downside, it denies task management code to sleep, but that is not needed
any way now.

Discussed with:	ken
2014-06-19 13:19:35 +00:00
Alexander Motin
11b569f7cb Add support for VERIFY(10/12/16) and COMPARE AND WRITE SCSI commands.
Make data_submit backends method support not only read and write requests,
but also two new ones: verify and compare.  Verify just checks readability
of the data in specified location without transferring them outside.
Compare reads the specified data and compares them to received data,
returning error if they are different.

VERIFY(10/12/16) commands request either verify or compare from backend,
depending on BYTCHK CDB field.  COMPARE AND WRITE command executed in two
stages: first it requests compare, and then, if succeesed, requests write.
Atomicity of operation is guarantied by CTL request ordering code.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-16 11:00:14 +00:00
Alexander Motin
ad9cb3314a Remove non-functional remnants of control LUN -- 18MB of RAM for nothing. 2014-06-14 20:25:14 +00:00
Alexander Motin
004008d6e6 Introduce new serialization type CTL_SERIDX_UNMAP.
Unfortunately we can't check range collisions for UNMAP commands alike
to writes, because they include multiple ranges, which are also passed
in data block, not in CDB.  As result, UNMAP commands have to be treated
as colliding with any other command accessing the media.

From the other side all UNMAPs are equal (we don't support ANCHOR flag),
so we can execute several UNMAPs same time.
2014-04-09 10:58:52 +00:00
Alexander Motin
ee7f31c068 Add support for SCSI UNMAP commands to CTL.
This patch adds support for three new SCSI commands: UNMAP, WRITE SAME(10)
and WRITE SAME(16).  WRITE SAME commands support both normal write mode
and UNMAP flag.  To properly report UNMAP capabilities this patch also adds
support for reporting two new VPD pages: Block limits and Logical Block
Provisioning.

UNMAP support can be enabled per-LUN by adding "-o unmap=on" to `ctladm
create` command line or "option unmap on" to lun sections of /etc/ctl.conf.

At this moment UNMAP supported for ramdisks and device-backed block LUNs.
It was tested to work great with ZFS ZVOLs.  For file-backed LUNs UNMAP
support is unfortunately missing due to absence of respective VFS KPI.

Reviewed by:	ken
MFC after:	1 month
Sponsored by:	iXsystems, Inc
2014-04-08 20:50:48 +00:00
Alexander Motin
8c6d5f8282 Introduce seperate mutex lock to protect protect CTL I/O pools, slightly
reducing global CTL lock scope and congestion.

While there, simplify CTL I/O pools KPI, hiding implementation details.
2013-11-11 08:27:20 +00:00
Kenneth D. Merry
bf8f8f340e Change the SCSI INQUIRY peripheral qualifier that CTL reports for LUNs
that don't exist.

Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline).  However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior.  (The previous behavior was to
return 001b, or LUN offline.)

ctl.c:		Change the default inquiry peripheral qualifier to 011b,
		and add a sysctl and tunable to allow the user to change
		it back to 001b if needed.

		Don't insert a Copan copyright statement in the inquiry
		data.  The copyright statements on the files are
		sufficient.

ctl_private.h:	Add sysctl variable context to the CTL softc.

ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c:	Include sys/sysctl.h.

MFC after:	3 days
2012-04-06 22:23:13 +00:00
Dimitry Andric
c995195062 Use a better way to silence unneeded internal declaration warnings in
several sys/cam/ctl files.

Suggested by:	ed
Reviewed by:	ken
MFC after:	1 week
2012-02-23 21:34:14 +00:00
Kenneth D. Merry
130f4520cb Add the CAM Target Layer (CTL).
CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003.  It has been shipping in
Copan (now SGI) products since 2005.

It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
(who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
available under a BSD-style license.  The intent behind the agreement was
that Spectra would work to get CTL into the FreeBSD tree.

Some CTL features:

 - Disk and processor device emulation.
 - Tagged queueing
 - SCSI task attribute support (ordered, head of queue, simple tags)
 - SCSI implicit command ordering support.  (e.g. if a read follows a mode
   select, the read will be blocked until the mode select completes.)
 - Full task management support (abort, LUN reset, target reset, etc.)
 - Support for multiple ports
 - Support for multiple simultaneous initiators
 - Support for multiple simultaneous backing stores
 - Persistent reservation support
 - Mode sense/select support
 - Error injection support
 - High Availability support (1)
 - All I/O handled in-kernel, no userland context switch overhead.

(1) HA Support is just an API stub, and needs much more to be fully
    functional.

ctl.c:			The core of CTL.  Command handlers and processing,
			character driver, and HA support are here.

ctl.h:			Basic function declarations and data structures.

ctl_backend.c,
ctl_backend.h:		The basic CTL backend API.

ctl_backend_block.c,
ctl_backend_block.h:	The block and file backend.  This allows for using
			a disk or a file as the backing store for a LUN.
			Multiple threads are started to do I/O to the
			backing device, primarily because the VFS API
			requires that to get any concurrency.

ctl_backend_ramdisk.c:	A "fake" ramdisk backend.  It only allocates a
			small amount of memory to act as a source and sink
			for reads and writes from an initiator.  Therefore
			it cannot be used for any real data, but it can be
			used to test for throughput.  It can also be used
			to test initiators' support for extremely large LUNs.

ctl_cmd_table.c:	This is a table with all 256 possible SCSI opcodes,
			and command handler functions defined for supported
			opcodes.

ctl_debug.h:		Debugging support.

ctl_error.c,
ctl_error.h:		CTL-specific wrappers around the CAM sense building
			functions.

ctl_frontend.c,
ctl_frontend.h:		These files define the basic CTL frontend port API.

ctl_frontend_cam_sim.c:	This is a CTL frontend port that is also a CAM SIM.
			This frontend allows for using CTL without any
			target-capable hardware.  So any LUNs you create in
			CTL are visible in CAM via this port.

ctl_frontend_internal.c,
ctl_frontend_internal.h:
			This is a frontend port written for Copan to do
			some system-specific tasks that required sending
			commands into CTL from inside the kernel.  This
			isn't entirely relevant to FreeBSD in general,
			but can perhaps be repurposed.

ctl_ha.h:		This is a stubbed-out High Availability API.  Much
			more is needed for full HA support.  See the
			comments in the header and the description of what
			is needed in the README.ctl.txt file for more
			details.

ctl_io.h:		This defines most of the core CTL I/O structures.
			union ctl_io is conceptually very similar to CAM's
			union ccb.

ctl_ioctl.h:		This defines all ioctls available through the CTL
			character device, and the data structures needed
			for those ioctls.

ctl_mem_pool.c,
ctl_mem_pool.h:		Generic memory pool implementation used by the
			internal frontend.

ctl_private.h:		Private data structres (e.g. CTL softc) and
			function prototypes.  This also includes the SCSI
			vendor and product names used by CTL.

ctl_scsi_all.c,
ctl_scsi_all.h:		CTL wrappers around CAM sense printing functions.

ctl_ser_table.c:	Command serialization table.  This defines what
			happens when one type of command is followed by
			another type of command.

ctl_util.c,
ctl_util.h:		CTL utility functions, primarily designed to be
			used from userland.  See ctladm for the primary
			consumer of these functions.  These include CDB
			building functions.

scsi_ctl.c:		CAM target peripheral driver and CTL frontend port.
			This is the path into CTL for commands from
			target-capable hardware/SIMs.

README.ctl.txt:		CTL code features, roadmap, to-do list.

usr.sbin/Makefile:	Add ctladm.

ctladm/Makefile,
ctladm/ctladm.8,
ctladm/ctladm.c,
ctladm/ctladm.h,
ctladm/util.c:		ctladm(8) is the CTL management utility.
			It fills a role similar to camcontrol(8).
			It allow configuring LUNs, issuing commands,
			injecting errors and various other control
			functions.

usr.bin/Makefile:	Add ctlstat.

ctlstat/Makefile
ctlstat/ctlstat.8,
ctlstat/ctlstat.c:	ctlstat(8) fills a role similar to iostat(8).
			It reports I/O statistics for CTL.

sys/conf/files:		Add CTL files.

sys/conf/NOTES:		Add device ctl.

sys/cam/scsi_all.h:	To conform to more recent specs, the inquiry CDB
			length field is now 2 bytes long.

			Add several mode page definitions for CTL.

sys/cam/scsi_all.c:	Handle the new 2 byte inquiry length.

sys/dev/ciss/ciss.c,
sys/dev/ata/atapi-cam.c,
sys/cam/scsi/scsi_targ_bh.c,
scsi_target/scsi_cmds.c,
mlxcontrol/interface.c:	Update for 2 byte inquiry length field.

scsi_da.h:		Add versions of the format and rigid disk pages
			that are in a more reasonable format for CTL.

amd64/conf/GENERIC,
i386/conf/GENERIC,
ia64/conf/GENERIC,
sparc64/conf/GENERIC:	Add device ctl.

i386/conf/PAE:		The CTL frontend SIM at least does not compile
			cleanly on PAE.

Sponsored by:	Copan Systems, SGI and Spectra Logic
MFC after:	1 month
2012-01-12 00:34:33 +00:00