Improve cache control support, including DPO/FUA flags and the mode page.
At this moment it works only for files and ZVOLs in device mode since BIOs
have no respective respective cache control flags (DPO/FUA).
When reporting some major UNIT ATTENTION condition, like POWER ON OCCURRED
or I_T NEXUS LOSS, clear all minor UAs for the LUN, redundant in this case.
All SAM specifications tell that target MAY do it, but libiscsi initiator
seems require it to be done, terminating connection with error if some more
UAs happen to be reported during iSCSI connection.
Approved by: re (gjb)
Using pointer from the cdev directly is dangerous since we have no
reference on it, and it may change any time. That caused panic if
device has gone.
While there, report capacity change only if it really changed.
Approved by: re (dephij)
Fix tpc_create_token() introduced in r269497 to encode CREATOR LOGICAL
UNIT DESCRIPTOR field as Identification Descriptor CSCD descriptor, not
just as Identification Descriptor.
Approved by: re (gjb)
Make it possible to disable NOP-In PDUs by the iSCSI initiator by setting
kern.cam.ctl.iscsi.ping_timeout to 0. This fixes interoperability with
some initiators that don't properly support NOP-Ins, namely iPXE/gPXE.
Approved by: re (kib)
None of existing STEC devices need UNMAP or even support it well,
having many limitations and even hanging sometimes executing those
commands. New devices that may use UNMAP going to be released under
HGST name.
Approved by: re (delphij)
Improve ZFS N-way mirror read performance by using load and locality
information.
MFC r260713:
Fix ZFS mirror code for handling multiple DVA's
Also make the addition of the d_rotation_rate binary compatible. This allows
storage drivers compiled for 10.0 to work by preserving the ABI for disks.
Approved by: re (gjb)
Sponsored by: Multiplay
r270327 | imp | 2014-08-22 07:15:59 -0600 (Fri, 22 Aug 2014) | 6 lines
We should never enter the PROBE_SETAN phase if we're not ATAPI, since
that's ATAPI specific. Instead, skip to PROBE_SET_MULTI instead for
non ATAPI protocols. The prior code incorrectly terminated the probe
with a break, rather than arranging for probedone to get called. This
caused panics or worse on some systems.
r270249 | imp | 2014-08-20 16:58:12 -0600 (Wed, 20 Aug 2014) | 13 lines
Turns out that IDENTIFY DEVICE and IDENTIFY PACKET DEVICE return data
that's only mostly similar. Specifically word 78 bits are defined for
IDENTIFY DEVICE as
5 Supports Hardware Feature Control
while a IDENTIFY PACKET DEVICE defines them as
5 Asynchronous notification supported
Therefore, only pay attention to bit 5 when we're talking to ATAPI
devices (we don't use the hardware feature control at this time).
Ignore it for ATA devices. Remove kludge that papered over this issue
for Samsung SATA SSDs, since Micron drives also have the bit set and
the error was caused by this bad interpretation of the spec (which is
quite easy to do, since bits aren't normally overlapping like this).
Sponsored by: Netflix (the merge and the original work)
Reduce reported additional INQUIRY data length.
sizeof(struct scsi_inquiry_data) of 256 bytes combined with off-by-one
error in the changed code gave total INQUIRY data length above 255 bytes,
that was maximal INQUIRY length in SPC-2. While SPC-3 increased the
maximal length to 64K, at least sg3_utils are still confused by that.
Reimplement WRITE USING TOKEN with Block Zero token using WRITE SAME.
On my ZVOL of SSDs that increases speed of zero writing in that way from
1 to 2.5GB/s by reducing CPU overhead.
Add support for Windows dialect of EXTENDED COPY command, aka Microsoft ODX.
This allows to avoid extra network traffic when copying files on NTFS iSCSI
disks within one storage host by drag'n'dropping them in Windows Explorer
of Windows 8/2012. It should also accelerate Hyper-V VM operations, etc.
Fix breakage introduced by r256843: removing the SA_CCB_WAITING bit
left some of the decisions based on its counterpart, SA_CCB_BUFFER_IO
being random. As a result, propagation of the residual information
for the SPACE command was broken, so the number of filemarks
encountered during a SPACE operation was miscalculated. Consequently,
systems relying on properly tracked filemark counters (like Bacula)
fell apart.
The change also removes a switch/case in sadone() which r256843
degraded to a single remaining case label.
PR: 192285
Implement separate I/O dispatch method for ZVOLs in "dev" mode.
Unlike disk devices ZVOLs process all requests synchronously. That makes
impossible sending multiple requests to them from single thread. From the
other side ZVOLs have real d_read/d_write methods, which unlike d_strategy
can handle uio scatter/gather and have no strict I/O size limitations.
So, if ZVOL in "dev" mode is detected, use of d_read/d_write methods instead
of d_strategy allows to avoid pointless splitting of large requests into
MAXPHYS (128K) sized chunks.
Increase maximal number of SCSI ports in CTL from 32 to 128.
After I gave each iSCSI target its own port, the old limit appeared to be
not so big. This change almost proportionally increases per-LUN memory
use, but it is still three times better then it was before r268807.
Reduce per-LUN memory usage from 18MB to 1.8MB.
CTL never had use for CA support code since SPI has gone, and there is no
even frontends supporting that. But it still was reserving 256 bytes of
memory per LUN per every possible initiator on every possible port.
Wrap unused code with ifdef's in case somebody ever need it.
Add support for VMWare dialect of EXTENDED COPY command, aka VAAI Clone.
This allows to clone VMs and move them between LUNs inside one storage
host without generating extra network traffic to the initiator and back,
and without being limited by network bandwidth.
LUNs participating in copy operation should have UNIQUE NAA or EUI IDs set.
For LUNs without these IDs VMWare will use traditional copy operations.
Beware: the above LUN IDs explicitly set to values non-unique from the VM
cluster point of view may cause data corruption if wrong LUN is addressed!
Sponsored by: iXsystems, Inc.
Fix ctl(4) kldload failure that manifested like this:
link_elf_obj: symbol icl_pdu_new_bhs undefined
PR: 192031
Submitted by: Nils Beyer (earlier version)
Sponsored by: FreeBSD Foundation
>r267118 | imp | 2014-06-05 11:13:42 -0600 (Thu, 05 Jun 2014) | 9 lines
>The code that combines adjacent ranges for BIO_DELETEs to optimize
>trims to the device assumes the list is sorted. Don't apply the
>optimization of not sorting the queue when we have SSDs to the
>delete_queue, since it causes more discard traffic to the drive. While
>one could argue that the higher levels should coalesce the trims,
>that's not done today, so some optimization at this level is needed.
>CR: https://phabric.freebsd.org/D142
>r268205 | imp | 2014-07-02 23:22:13 -0600 (Wed, 02 Jul 2014) | 9 lines
>Rework the BIO_DELETE code slightly. Always queue the BIO_DELETE
>requests on the trim_queue, even for the CFA ERASE. This allows us, in
>the future, to collapse adjacent requests. Since CFA ERASE is only for
>CF cards, and it is so restrictive in what it can do, the collapse
>code is not presently here. This also brings the ada driver more in
>line with the da driver's treatment of BIO_DELETEs.
Add persistent reservation support to camcontrol(8).
camcontrol(8) now supports a new 'persist' subcommand that allows users to
issue SCSI PERSISTENT RESERVE IN / OUT commands.
Enable TAS feature: notify initiator if its command was aborted by other.
That should make operation more kind to multi-initiator environment.
Without this, other initiators may find out that something bad happened
to their commands only via command timeout.
Teach ctl_add_initiator() to dynamically allocate IIDs from pool.
If port passed negative IID value, the function will try to allocate IID
from the pool of unused, based on passed wwpn or name arguments. It does
all its best to make IID unique and persistent across reconnects.
This makes persistent reservation properly work for iSCSI. Previously,
in case of reconnects, reservation could be unexpectedly lost, or even
migrate between intiators.
When new connection comes in, check whether we already have session from
the same intiator (Name+ISID). If so -- terminate the old session and let
the new one take its place, as required by iSCSI RFC.
Implement ABORT TASK SET and I_T NEXUS RESET task management functions.
Use the last one to terminate active commands on iSCSI session termination.
Previous code was aborting only commands doing some data moves.
Close race in r268291 between port destruction, delayed by sessions
teardown, and new port creation during `service ctld restart`.
Close it by returning iSCSI port internal state, that allows to identify
dying ports, which should not be counted as existing, from really alive.
Move lun_map() method from command nexus to port.
Previous implementation made impossible to do some things, such as calling
it for ports other then one through which command arrived.