275 Commits

Author SHA1 Message Date
mav
8f6eba03a8 MFC r278625: Make XCOPY and WUT commands respect physical block size/offset.
This change improves performance of misaligned XCOPY and WUT commands
2-3 times by avoiding unneeded read-modify-write cycles inside ZFS.
2015-02-19 14:36:03 +00:00
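
The alignment idea behind this commit can be sketched as follows. This is only an illustration under assumptions, not the CTL code: the helper `split_on_physical_blocks()`, the struct `aligned_split`, and the convention that sizes are counted in logical blocks are all hypothetical. It splits a request into a misaligned head, a body covering whole physical blocks, and a tail, so that only the head and tail can need read-modify-write.

```c
#include <stdint.h>

/* Hypothetical result type: all counts are in logical blocks. */
struct aligned_split {
	uint64_t head;	/* blocks before the first physical block boundary */
	uint64_t body;	/* blocks that cover whole physical blocks */
	uint64_t tail;	/* blocks after the last physical block boundary */
};

/*
 * Split the range [lba, lba + len) on physical block boundaries.
 * pb is the physical block size in logical blocks (must be > 0);
 * pboff is the LBA at which a physical block starts.
 */
static struct aligned_split
split_on_physical_blocks(uint64_t lba, uint64_t len, uint64_t pb,
    uint64_t pboff)
{
	struct aligned_split s = { 0, 0, 0 };
	uint64_t rel, head;

	/* Position of lba inside its physical block. */
	rel = (lba % pb + pb - pboff % pb) % pb;
	head = (rel == 0) ? 0 : pb - rel;

	if (head >= len) {
		/* The range never reaches the next boundary. */
		s.head = len;
		return (s);
	}
	s.head = head;
	s.body = (len - head) / pb * pb;
	s.tail = len - head - s.body;
	return (s);
}
```

With 8 logical blocks per physical block and no offset, a 20-block request starting at LBA 3 splits into a 5-block head, an 8-block aligned body, and a 7-block tail.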
mav
2116079c78 MFC r278161: Bring some more order into iSCSI portal group tags support.
While ctld(8) still does not allow multiple portal groups per target
to be configured, kernel should now be able to handle it.

Sponsored by:	iXsystems, Inc.
2015-02-19 14:33:46 +00:00
mav
aa6a7df89f MFC r278037: CTL LUN mapping rewrite.
Replace iSCSI-specific LUN mapping mechanism with new one, working for any
ports.  By default all ports are created without LUN mapping, exposing all
CTL LUNs as before.  But, if needed, LUN mapping can be manually set on
per-port basis via ctladm.  For its iSCSI ports ctld does it via ioctl(2).
The next step will be to teach ctld to work with FibreChannel ports also.

Respecting additional flexibility of the new mechanism, ctl.conf now allows
alternative syntax for LUN definition.  LUNs can now be defined in global
context, and then referenced from targets by unique name, as needed.  It
allows same LUN to be exposed several times via multiple targets.

While there, increase limit for LUNs per target in ctld from 256 to 1024.
Some initiators do not support LUNs above 255, but that is not our problem.

Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2015-02-19 14:31:16 +00:00
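
As a hedged illustration of the alternative syntax described in this commit, a ctl.conf fragment might look like the one below. The LUN name, backing path, and target names are invented for the example; ctl.conf(5) remains the authority on the exact grammar for a given release.

```
# Hypothetical names and paths, for illustration only.
lun shared_0 {
	path /dev/zvol/tank/shared_0
}

target iqn.2012-06.com.example:target0 {
	lun 0 shared_0
}

target iqn.2012-06.com.example:target1 {
	lun 0 shared_0
}
```

Here the LUN is defined once in global context and then exposed as LUN 0 through two different targets.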
mav
8fc5fb008d MFC r278619: Make WRITE SAME commands respect physical block size.
This change improves performance of misaligned WRITE SAME commands
2-3 times by avoiding unneeded read-modify-write cycles inside ZFS.
2015-02-19 13:06:38 +00:00
mav
28f489d0bd MFC r278500: Do not abort already aborted tasks.
This fixes abort of new tasks with the same tags as previously aborted,
but still remaining on the queue.
2015-02-17 17:34:45 +00:00
mav
a131bc6ccd MFC r277917 (by ken), r278598:
Improve SCSI Extended Inquiry VPD page (0x86) support.

sys/cam/scsi/scsi_all.h:
        In struct scsi_extended_inquiry_data:
        - Increase the length field to 2 bytes, as it is 2 bytes in SPC-4.
        - Add bit definitions for the various Activate Microcode actions.
        - Add the Sequential Access Logical Block Protection support bit,
          since we need that in the sa(4) driver.  (For modifications
          that will come later.)
        - Add definitions for the various Multi I_T Nexus Microcode
          Download modes.

sys/cam/ctl/ctl.c:
        As of SPC-4, a single report of "REPORTED LUNS DATA HAS CHANGED"
        is to be given per I_T nexus.  Once it is reported, the unit
        attention condition should be cleared for all LUNS attached to
        an I_T nexus.

        Previously that only happened when a REPORT LUNS command was
        processed.

        This behavior may be different (according to SAM-5) when the
        UA_INTLCK_CTRL bits are non-zero in the control mode page but
        CTL does not currently support that.

        So, in view of the spec, whenever we report a LUN inventory
        change unit attention, clear it on all LUNs for that
        particular I_T nexus.

        Add a new function, ctl_clear_ua() that will clear a unit
        attention on all LUNs for the given I_T nexus.

        One field in the extended inquiry data that we could potentially
        report at some point is the maximum supported sense data length.
        To do that, we would need the SIM to report (via path inquiry
        perhaps) how much sense data it is able to send.

        Add comments to explain some of the bits that are set in the
        Extended Inquiry VPD page.

        Add a few comments to make it more clear which functions handle
        various VPD pages.
2015-02-15 08:52:09 +00:00
mav
0a6db6b09e MFC r277247: Don't count status as sent until CTIO completes successfully.
If we aggregated status sending with data move and got an error, allow status
to be updated and resent separately.  Without this, a command may get stuck
without status sent at all.
2015-01-30 09:05:43 +00:00
mav
aba44db36e MFC r277529: Don't count requests with status sent as overlapping.
While those requests are still in the target OOA queue, for the initiator
they are already completed, so tags can be reused.
2015-01-30 09:04:20 +00:00
mav
67407ee0df MFC r277647: Fix wrong LUN reference in XCOPY block-to-block operation.
This could cause data corruption due to accessing wrong LUN in case of
retries on write errors.  Failed writes were retried to read LUN.
2015-01-27 19:41:24 +00:00
mav
57b1ef3ed0 MFC r274036 (by trasz):
s/icl_pdu_new_bhs/icl_pdu_new/; no functional changes, just a little
nicer code.
2015-01-03 13:36:56 +00:00
mav
1fe3d1bb1e MFC r276141: Hide block device VPD pages for non-block devices. 2015-01-03 13:12:47 +00:00
mav
6bf3be1ed2 MFC r275953: Replace ctl_min() macro with MIN(). 2015-01-03 13:11:39 +00:00
mav
853d60fa73 MFC r275943: Constify some static data. 2015-01-03 13:10:23 +00:00
mav
94e2b4d8f8 MFC r275942: Reduce number of places where global control_softc is used.
At some point we may want to have several CTL instances, and that is not
really impossible.
2015-01-03 13:09:32 +00:00
mav
2952feb3ad MFC r275864: Make sequence numbers checks more strict.
While we don't support MCS, hole in received sequence numbers may mean
only PDU loss.  While we don't support lost PDU recovery, terminate the
connection to avoid stuck commands.

While there, improve handling of sequence numbers wrap after 2^32 PDUs.
2015-01-03 13:08:08 +00:00
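
One common way to make sequence number checks survive the wrap after 2^32 PDUs is serial number arithmetic on 32-bit values. The sketch below is an assumption about the general technique, not the code from this commit; `seq_lt()`, `seq_check()`, and the `seq_verdict` enum are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a precedes b in 32-bit serial number order.  The unsigned
 * subtraction wraps modulo 2^32; the cast to int32_t relies on the
 * usual two's complement conversion.
 */
static bool
seq_lt(uint32_t a, uint32_t b)
{
	return ((int32_t)(a - b) < 0);
}

enum seq_verdict {
	SEQ_OK,		/* exactly the expected number */
	SEQ_DUPLICATE,	/* older than expected, already seen */
	SEQ_HOLE	/* newer than expected, something was lost */
};

/* Classify a received sequence number against the expected one. */
static enum seq_verdict
seq_check(uint32_t expected, uint32_t received)
{
	if (received == expected)
		return (SEQ_OK);
	if (seq_lt(received, expected))
		return (SEQ_DUPLICATE);
	return (SEQ_HOLE);
}
```

With this comparison, expecting 0xFFFFFFFF and receiving 0 is classified as a hole rather than as a very old PDU, which is the case where a target without lost PDU recovery would terminate the connection.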
mav
8b298a2978 MFC r275920, r276127: Pass real optimal transfer size supported by backend.
For files and ZVOLs that is 1MB now, not 128K.
2014-12-26 09:44:32 +00:00
mav
5d78bffe2b MFC r275865:
Add configuration options to override physical and UNMAP blocks geometry.

While in most cases CTL should correctly fetch those values from backing
storages, there are some initiators (like MS SQL), that may not like large
physical block sizes, even if they are true.  For such cases, allow overriding
the fetched values with supported ones (like 4K).
2014-12-24 13:49:40 +00:00
mav
9cec814411 MFC r275959: Report initiator id in portlist XML in more formalized way. 2014-12-23 12:45:29 +00:00
mav
69ac340493 MFC r275842: Do not count RCTD bit set as an error.
We can not really implement it, but specification tells that it "shall"
work, so it can be safely ignored.
2014-12-23 12:41:28 +00:00
mav
a0e66849d7 MFC r275568:
Count consecutive read requests as blocking in CTL for files and ZVOLs.

Technically read requests can be executed in any order or simultaneously
since they are not changing any data.  But ZFS prefetcher goes crazy when
it receives consecutive requests from different threads.  Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.

This patch is more of a workaround than a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4 times
in some configurations.  On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.
2014-12-18 08:46:53 +00:00
mav
fe777840de MFC r275512:
In addition to r275481, allow threshold notifications to work without UNMAP.

While without UNMAP support there is not much initiator can do about it,
the administrator still better be notified about the storage overflow.

Sponsored by:   iXsystems, Inc.
2014-12-18 08:45:28 +00:00
mav
57c600c3f5 MFC r275481:
Add to CTL support for threshold notifications for file-backed LUNs.

Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too.  Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system.  Pool thresholds are still not implemented in this case.

Sponsored by:   iXsystems, Inc.
2014-12-18 08:43:36 +00:00
mav
4ff47ae9ab MFC r275474: Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.

Sponsored by:   iXsystems, Inc.
2014-12-18 08:38:07 +00:00
mav
251a95deec MFC r275461:
Increase CTL ports limit from 128 to 256 and LUNs limit from 256 to 1024.

After recent optimizations this change is no longer blocked by CTL memory
consumption.  Those limits are still not free, but much cheaper now.

Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2014-12-18 08:37:09 +00:00
mav
e99b295765 MFC r275459: Unify function names after r275458. 2014-12-18 08:32:56 +00:00
mav
510c6d695e MFC r275458:
Do not pre-allocate UNIT ATTENTIONs storage for every possible initiator.

Abusing the ability of major UAs to cover minor ones, we may skip accounting
UAs for inactive ports.  Allocate UAs storage for port and start accounting only
after some initiator from that port fetched its first POWER ON OCCURRED.

This reduces per-LUN CTL memory usage from >1MB to less than 100K.
2014-12-18 08:32:06 +00:00
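
The on-demand allocation pattern described in this commit can be sketched roughly as below. This is a userland illustration under assumptions, not the CTL data layout: `NUM_PORTS`, `INITS_PER_PORT`, `struct lun_ua`, `ua_slot()`, and `ua_ports_allocated()` are hypothetical, and `calloc()` stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

#define NUM_PORTS	4	/* hypothetical limits for the sketch */
#define INITS_PER_PORT	8

/* Per-LUN unit attention storage, one lazily allocated array per port. */
struct lun_ua {
	uint32_t *pending[NUM_PORTS];	/* NULL until the port is used */
};

/*
 * Return the UA slot for (port, init), allocating the per-port array
 * on first use.  Returns NULL on bad indexes or allocation failure.
 */
static uint32_t *
ua_slot(struct lun_ua *lun, unsigned port, unsigned init)
{
	if (port >= NUM_PORTS || init >= INITS_PER_PORT)
		return (NULL);
	if (lun->pending[port] == NULL) {
		lun->pending[port] = calloc(INITS_PER_PORT, sizeof(uint32_t));
		if (lun->pending[port] == NULL)
			return (NULL);
	}
	return (&lun->pending[port][init]);
}

/* Count how many ports currently have storage allocated. */
static unsigned
ua_ports_allocated(const struct lun_ua *lun)
{
	unsigned i, n = 0;

	for (i = 0; i < NUM_PORTS; i++)
		if (lun->pending[i] != NULL)
			n++;
	return (n);
}
```

A LUN touched by a single port pays for one array instead of `NUM_PORTS` arrays, which is where the memory savings in this kind of scheme come from.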
mav
e998495ddf MFC r275455: Remove some unused code. 2014-12-18 08:31:13 +00:00
mav
ef698a8b3d MFC r275447:
Do not pre-allocate reservation keys memory for every possible initiator.

In configurations with many ports, like iSCSI, each LUN is typically
accessed only by a limited subset of ports.  Allocating that memory on
demand reduces CTL memory usage from 5.3MB/LUN to 1.3MB/LUN.
2014-12-18 08:30:28 +00:00
mav
a4403fd4b3 MFC r275405: Convert persis_offset from global variable to softc field. 2014-12-18 08:28:44 +00:00
mav
be8a551758 MFC r275404: Reduce code duplication by creating ctl_set_res_ua() helper. 2014-12-18 08:27:46 +00:00
mav
f47cf8ca42 MFC r275403: Remove unused variable and unify some names. 2014-12-18 08:27:00 +00:00
mav
08d09659c7 MFC r275365: Move ctlfe_onoffline() out of lock to let it sleep when needed.
Do some more other polishing while there.
2014-12-18 08:26:11 +00:00
mav
19cc556c2a MFC r275058: Coalesce last data move and command status for read commands.
Make CTL core and block backend set success status before initiating last
data move for read commands.  Make CAM target and iSCSI frontends detect
such condition and send command status together with data.  New I/O flag
allows to skip duplicate status sending on later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS.  For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

Sponsored by:   iXsystems, Inc.
2014-12-18 08:25:00 +00:00
mav
67f35b59b5 MFC r275032: Decouple datamove/done logic from CTL status set. 2014-12-18 08:23:59 +00:00
mav
347eb16c88 MFC r275009: Use ctl_set_success() instead of direct inlining. 2014-12-18 08:23:04 +00:00
mav
91695c330f MFC r274962: Replace home-grown CTL IO allocator with UMA.
Old allocator created significant lock congestion protecting its lists
of preallocated I/Os, while UMA provides much better SMP scalability.
The downside of UMA is lack of reliable preallocation, that could guarantee
successful allocation in non-sleepable environments.  But careful code
review showed that only the CAM target frontend really has that requirement.
Fix that by making that frontend preallocate and statically bind CTL I/O for
every ATIO/INOT it preallocates anyway.  That allows to avoid allocations
in hot I/O path.  Other frontends either may sleep in allocation context
or can properly handle allocation errors.

On 40-core server with 6 ZVOL-backed LUNs and 7 iSCSI client connections
this change increases peak performance from ~700K to >1M IOPS!  Yay! :)

Sponsored by:	iXsystems, Inc.
2014-12-18 08:22:16 +00:00
mav
cc4fa4df69 MFC r275478: Swap resource count scopes for used/available space.
Used count should be reported as per-LUN, while available should not.
2014-12-11 00:25:26 +00:00
mav
4875c0a205 MFC r275446: Plug memory leaks on UNMAP and XCOPY with invalid parameters. 2014-12-10 08:52:47 +00:00
mav
c2a2522fc4 MFC r274805:
Make cfiscsi_offline() synchronous, waiting for connections termination
before return.  This should make ctld restart more clean and predictable.
2014-12-05 07:25:02 +00:00
mav
0d78a1c549 MFC r274795:
Close race between cfiscsi_offline() and new connection arrival.

Incoming connection should be either rejected or accepted and terminated.
2014-12-05 07:24:17 +00:00
mav
cc2f0f4af5 MFC r274785: Partially reconstruct Active/Standby clustering.
In this mode one head is in Active state, supporting all commands, while
another is in Standby state, supporting only minimal LUN discovery subset.

It is still incomplete since Standby state requires reservation support,
which is impossible to do right without having interlink between heads.
But it allows to run some basic experiments.
2014-12-05 07:23:25 +00:00
trasz
ea68673881 MFC r274703:
Fix typo.

Sponsored by:	The FreeBSD Foundation
2014-12-03 08:22:13 +00:00
trasz
eafe9cc2c3 MFC r273918:
Change the default log level for iSCSI target from 3 to 1.  It should
have been 1 from the beginning; not sure how it ended up at 3.

Sponsored by:	The FreeBSD Foundation
2014-11-30 10:36:29 +00:00
mav
8eec24be06 MFC r274840, r274940:
Make iSCSI frontend less chatty while waiting for tasks termination.
2014-11-28 08:56:37 +00:00
mav
2e80dc504a MFC r274790: Remove bunch of unused lun variables. 2014-11-28 08:54:43 +00:00
mav
6625927f31 MFC r274789: Reduce race between LUN destruction and request arrival. 2014-11-28 08:53:44 +00:00
mav
6f5634bed9 MFC r274786: Log errors for absent LUNs too. 2014-11-28 08:52:38 +00:00
mav
6414b04c27 MFC r274154, r274163:
Add to CTL support for logical block provisioning threshold notifications.

For ZVOL-backed LUNs this allows informing initiators if the storage's used or
available space gets above/below the configured thresholds.

Sponsored by:	iXsystems, Inc.
2014-11-20 01:55:12 +00:00
mav
5966236aa7 MFC r274333: Handle PREEMPT AND ABORT service action equal to PREEMPT.
With command serialization used in CTL, there are no other commands to abort
when PREEMPT AND ABORT gets to run, so it is practically equal to PREEMPT.
2014-11-16 01:47:43 +00:00
mav
862a9d976c MFC r274206:
Synchronize medium rotation rate in legacy Rigid Disk Drive Geometry mode
page with modern Block Device Characteristics VPD page.
2014-11-14 00:25:10 +00:00