Commit Graph

88 Commits

Author SHA1 Message Date
Mark Johnston
3616095801 Fix style issues around existing SDT probes.
- Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect
  at the moment, but will be needed for some future changes.
- Don't hardcode the module component of the probe identifier. This is
  set automatically by the SDT framework.

MFC after:	1 week
2015-12-16 23:39:27 +00:00
Alexander Motin
5124012aae Make some panic strings more informative. 2015-10-21 15:31:26 +00:00
Alexander Motin
361e885315 Remove lock upgrade attempt from ctl_be_block_open_file().
I am not sure why it was done.  The open routine should now automatically
fall back to read-only if opening for writing is impossible.  In that case
an attempt to upgrade to write seems strange.

MFC after:	1 week
2015-10-11 08:28:49 +00:00
Alexander Motin
3d5cb709bd Add missing vnode lock in case of file modify request.
Submitted by:	Richard Kojedzinszky
MFC after:	1 week
2015-10-08 07:34:30 +00:00
Alexander Motin
66b6967686 Really implement PREVENT ALLOW MEDIUM REMOVAL command. 2015-09-29 15:12:40 +00:00
Alexander Motin
648dfc1a29 Implement media load/eject support for removable devices.
For the block backend, eject actually closes the backing store, while
load tries to reopen it.  A failed store open is reported as no media.
2015-09-28 20:54:18 +00:00
Alexander Motin
91be33dc78 Add to CTL initial support for CDROMs and removable devices.
Relnotes:	yes
2015-09-27 13:47:28 +00:00
Alexander Motin
9c887a4f86 Remove some duplicate, legacy, dead and questionable code. 2015-09-26 11:28:45 +00:00
Alexander Motin
c30a4c1871 Remove some dead code found by Clang analyzer. 2015-09-25 18:15:34 +00:00
Alexander Motin
67cc546dfc Remove stale comments and some excessive empty lines. 2015-09-25 16:34:59 +00:00
Alexander Motin
e675024a02 Switch I/O time accounting from system time to uptime.
While there, make num_dmas accounted independently of CTL_TIME_IO.
2015-09-25 10:14:39 +00:00
Alexander Motin
6c2acea564 Allow WRITE SAME with NDOB bit set but without UNMAP.
This combination was originally forbidden, but allowed at spc4r3.
2015-09-24 15:59:08 +00:00
Alexander Motin
4ce7a0868c Remove duplicate and incomplete code handling LUN modify.
Instead reuse code from LUN creation.  This allows most of LUN media
options to be changed live with modify request without full restart.
2015-09-22 10:45:50 +00:00
Alexander Motin
b22213694e Remove a couple of excess SGLIST I/O flags.
Those flags duplicated respective (sg_entries > 0) values.
2015-09-20 10:40:30 +00:00
Alexander Motin
75a3108e13 Relax serseq option operation for reads.
Previously, with serseq enabled, the next command was unblocked only after
the previous one completed.  With this change, for read operations, the next
command is unblocked as soon as the last media read completes.  This is
important for frontends that actually wait for data move completion (like
camtgt), or when data are moved through the HA link, or especially when both.
2015-09-18 19:43:14 +00:00
Alexander Motin
7f7bb97a0f Report proper medium error code for VERIFY commands. 2015-09-17 12:52:18 +00:00
Alexander Motin
83981e319d Fix reading after end of file for file-backed LUNs.
If the backing file is smaller than the LUN size, we have to explicitly clear
the rest of the buffer to not leak some random data from previous I/Os.
2015-09-16 21:43:51 +00:00
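The short-read handling described in the commit above can be illustrated with a minimal sketch. This is not the CTL code (which is kernel C); `read_into` and the reused `bytearray` are hypothetical names chosen only to show why the tail of the buffer must be cleared when the backing file ends early.

```python
import io

def read_into(backing, offset, buf):
    """Fill `buf` from `backing` at `offset`; zero whatever the file cannot supply.

    Hypothetical helper for illustration only. Returns the number of bytes
    that actually came from the backing file.
    """
    backing.seek(offset)
    n = backing.readinto(buf)
    # A short read means the file ended before the requested range did.
    # Clear the remainder so stale data from a previous I/O is not returned.
    buf[n:] = bytes(len(buf) - n)
    return n

# A buffer that still holds data from an earlier request.
buf = bytearray(b"XXXXXX")
got = read_into(io.BytesIO(b"abcd"), 2, buf)
```

Without the clearing step, the last four bytes of `buf` would still contain the earlier `X` bytes, which is the kind of leak the commit message describes.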
Alexander Motin
d6043e4643 Make COMPARE AND WRITE report offset of difference. 2015-09-16 18:33:04 +00:00
Alexander Motin
6187d4722a Improve read-only support. 2015-09-13 16:49:41 +00:00
Alexander Motin
ee4ad294d2 Close races between device close and request processing.
All requests arriving for processing after the OFFLINE flag is set are
rejected with BUSY status.  Races around setting the OFFLINE flag are closed
by calling taskqueue_drain_all().
2015-09-11 14:33:05 +00:00
Alexander Motin
3236151ea8 Reference/release devices on every I/O, rather than on open/close.
While this may be slower, it allows device destruction to complete,
rather than block waiting for an indefinitely long time.
2015-09-11 12:50:52 +00:00
Alexander Motin
7ac58230ea Reimplement CTL High Availability.
CTL HA functionality was originally implemented by Copan many years ago,
but a large part of the sources was never published.  This change includes
a clean room implementation of the missing code and fixes for many bugs.

This code supports dual-node HA with ALUA in four modes:
 - Active/Unavailable without interlink between nodes;
 - Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
 - Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
 - Active/Active with second node working as proxy, transferring all
commands to the first node for execution through the interlink.

Unlike Copan's original implementation, which depended on specific hardware,
this code uses a simple custom TCP-based protocol for the interlink.  It has
no authentication, so it should never be enabled on public interfaces.

The code may still need some polishing, but generally it is functional.

Relnotes:	yes
Sponsored by:	iXsystems, Inc.
2015-09-10 12:40:31 +00:00
Alexander Motin
a3977bea20 Allow LUN options modification via CTL_LUNREQ_MODIFY.
Not all changes take effect, but that is a different question.
2015-09-06 11:23:01 +00:00
Alexander Motin
0bcd4ab6ba Move setting of media parameters inside open routines.
This is preparation for the possibility of opening/closing media several
times per LUN life cycle.  While there, rename variables to reduce confusion.
As an additional bonus, this allows opening read-only media, such as ZFS
snapshots.
2015-09-06 09:54:56 +00:00
Alexander Motin
bd236ba5c0 Remove some dead code. 2015-09-04 09:19:01 +00:00
Alexander Motin
f6295033c1 Fix type bug introduced at r286811. 2015-08-27 21:16:24 +00:00
Alexander Motin
a15bbf1508 Polish sizes processing. 2015-08-15 18:22:16 +00:00
Alexander Motin
2f444d157b Drop "internal" CTL frontend.
The idea was for it to be a simple initiator that executes several commands
from kernel level, but FreeBSD never had a consumer for that functionality,
while its implementation polluted many unrelated places.
2015-08-15 13:34:38 +00:00
Alexander Motin
7d0d4342e3 Pass SYNCHRONIZE CACHE command parameters to backends.
At this point the IMMED flag is translated to the MNT_NOWAIT flag of
VOP_FSYNC(), hoping that the file system implements it (ZFS does not seem to).

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2015-08-05 22:24:49 +00:00
Mateusz Guzik
8a08cec166 Create a dedicated function for ensuring that cdir and rdir are populated.
Previously several places were doing it on their own, some of them
incorrectly (e.g. without the filedesc locked) or even in an actively harmful
way, by populating jdir or assigning rootvnode without vrefing it.

Reviewed by:	kib
2015-07-11 16:22:48 +00:00
Alexander Motin
b9b4269c1d Fix a couple of panics on forced unmount of backing file.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2015-07-02 12:53:22 +00:00
Alexander Motin
0631de4a79 Handle EDQUOT backend storage errors the same as ENOSPC.
MFC after:	1 week
2015-05-06 19:47:31 +00:00
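As a rough illustration of the commit above, the sketch below treats a quota error like a full-disk error when classifying a failed write. The function name and the string labels are hypothetical; the real CTL code is kernel C and reports SCSI status rather than strings.

```python
import errno

# Both errors mean "no room to store this write" from the initiator's point
# of view, so they are grouped together (labels here are illustrative only).
OUT_OF_SPACE_ERRNOS = {errno.ENOSPC, errno.EDQUOT}

def classify_write_error(err):
    """Map a backend errno to a coarse, hypothetical error class."""
    if err in OUT_OF_SPACE_ERRNOS:
        return "out-of-space"
    return "generic-failure"
```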
Alexander Motin
fbc8d4ff38 Teach CTL to ask GEOM devices about BIO_DELETE support.
MFC after:	1 week
2015-02-13 13:26:23 +00:00
Alexander Motin
fee04ef7a9 Make XCOPY and WUT commands respect physical block size/offset.
This change improves performance of misaligned XCOPY and WUT commands by
2-3 times by avoiding unneeded read-modify-write cycles inside ZFS.

MFC after:	1 week
2015-02-12 15:46:44 +00:00
Alexander Motin
93b8c96cfd Make WRITE SAME commands respect physical block size.
This change improves performance of misaligned WRITE SAME commands by
2-3 times by avoiding unneeded read-modify-write cycles inside ZFS.

MFC after:	1 week
2015-02-12 10:28:45 +00:00
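One way to picture "respecting physical block size" is to split a misaligned range so that the bulk of the work starts and ends on physical block boundaries, leaving only a small head and tail that may need read-modify-write. The sketch below is an assumption about the general technique, not the actual CTL logic; `split_aligned`, `pblock` and `poffset` are hypothetical names, with all values counted in logical blocks.

```python
def split_aligned(lba, count, pblock, poffset=0):
    """Split [lba, lba + count) into (start, length) pieces.

    The middle piece, if any, begins and ends on physical block boundaries,
    i.e. positions b where (b - poffset) % pblock == 0.
    """
    end = lba + count
    first = lba + (poffset - lba) % pblock   # first boundary at or after lba
    last = end - (end - poffset) % pblock    # last boundary at or before end
    if first >= last:
        # No whole physical block fits inside the range; leave it unsplit.
        return [(lba, count)]
    parts = [(lba, first - lba), (first, last - first), (last, end - last)]
    return [(start, n) for start, n in parts if n > 0]

# 18 logical blocks starting at LBA 3, with 8 logical blocks per physical block.
pieces = split_aligned(3, 18, 8)
```

Here `pieces` holds a 5-block head, one 8-block aligned body, and a 5-block tail; only the head and tail touch partially covered physical blocks.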
Alexander Motin
e7038eb747 Replace ctl_min() macro with MIN().
MFC after:	1 week
2014-12-20 13:33:31 +00:00
Alexander Motin
cb8727e23a Pass real optimal transfer size supported by backend.
For files and ZVOLs that is 1MB now, not 128K.

MFC after:	1 week
2014-12-18 22:32:22 +00:00
Alexander Motin
34961f407d Add configuration options to override physical and UNMAP blocks geometry.
While in most cases CTL should correctly fetch those values from the backing
storage, there are some initiators (like MS SQL) that may not like large
physical block sizes, even if they are true.  For such cases, allow overriding
the fetched values with supported ones (like 4K).

MFC after:	1 week
2014-12-17 17:30:54 +00:00
Alexander Motin
bfbfc4a3cb Count consecutive read requests as blocking in CTL for files and ZVOLs.
Technically read requests can be executed in any order or simultaneously
since they are not changing any data.  But the ZFS prefetcher goes crazy when
it receives consecutive requests from different threads.  Since the prefetcher
works at the level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.

This patch is more a workaround than a real fix, and it does not fix all of
the prefetcher problems, but it improves sequential read speed by 3-4 times
in some configurations.  On the other hand, it may hurt performance if
some backing store has no prefetch, which is why it is disabled by default
for raw devices.

MFC after:	2 weeks
2014-12-06 20:39:25 +00:00
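The idea of treating consecutive reads as blocking can be sketched as a simple range check: a new read waits if it starts exactly where an outstanding read ends. This is a simplified assumption about the technique, not the CTL serialization code; `is_consecutive` and `must_wait` are hypothetical names, and ranges are `(lba, length)` pairs.

```python
def is_consecutive(prev, nxt):
    """True if `nxt` starts exactly where `prev` ends."""
    return prev[0] + prev[1] == nxt[0]

def must_wait(outstanding_reads, new_read):
    """A new read is held back while a directly preceding read is in flight."""
    return any(is_consecutive(r, new_read) for r in outstanding_reads)

# The figure from the commit message: two 128K requests seen as 8K blocks.
blocks_seen_by_prefetcher = 2 * (128 // 8)
```

Holding back the second of two back-to-back reads keeps the requests in order when they reach the backing store, at the cost of some parallelism, which is why the commit leaves it off for raw devices.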
Alexander Motin
53c146de18 Add to CTL support for threshold notifications for file-backed LUNs.
Previously it was supported only for ZVOL-backed LUNs, but now it should work
for file-backed LUNs too.  The used value in this case is the space occupied
by the backing file, while the available value is the available space on the
file system.  Pool thresholds are still not implemented in this case.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-04 18:37:42 +00:00
Alexander Motin
ef8daf3fed Add GET LBA STATUS command support to CTL.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-12-04 11:34:19 +00:00
Alexander Motin
f7241cceb0 Coalesce last data move and command status for read commands.
Make the CTL core and block backend set success status before initiating the
last data move for read commands.  Make the CAM target and iSCSI frontends
detect this condition and send command status together with the data.  A new
I/O flag allows skipping duplicate status sending on a later fe_done() call.

For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS.  For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.

MFC after:	1 month
Sponsored by:	iXsystems, Inc.
2014-11-25 17:53:35 +00:00
Alexander Motin
3f829b0c9c Fix LUN resize broken by r272911 commit.
MFC after:	3 days
2014-11-07 20:42:15 +00:00
Alexander Motin
c3e7ba3e6d Add to CTL support for logical block provisioning threshold notifications.
For ZVOL-backed LUNs this allows informing initiators when the storage's used
or available space rises above or falls below the configured thresholds.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-11-06 00:48:36 +00:00
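A threshold check of the kind described above might look like the following sketch. The function and parameter names are hypothetical, and how CTL actually stores thresholds and notifies initiators is not shown in this log; the sketch only shows the comparison of used and available space against configured limits.

```python
def crossed_thresholds(used, available, used_max=None, avail_min=None):
    """Return which configured thresholds are currently exceeded.

    `used_max` triggers when used space rises above it; `avail_min` triggers
    when available space falls below it. Unset thresholds are ignored.
    """
    events = []
    if used_max is not None and used > used_max:
        events.append("used")
    if avail_min is not None and available < avail_min:
        events.append("available")
    return events

# 900 units used, 100 available, with limits of 800 used and 200 available.
events = crossed_thresholds(900, 100, used_max=800, avail_min=200)
```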
Alexander Motin
4fc18ff9bb Implement better handling for ENOSPC error for both CTL and CAM.
This makes the VMware VAAI Thin Provisioning Stun primitive activate, pausing
the virtual machine, when the backing storage (ZFS pool) runs out of space.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2014-10-29 03:14:29 +00:00
Alexander Motin
20a28d6cee Report physical block size for file-backed LUNs, using vattr.va_blocksize.
MFC after:	1 week
2014-10-13 11:00:58 +00:00
Alexander Motin
19720f4113 Make ctld start even if some LUNs are unable to open backing storage.
Such LUNs will be visible to initiators, but return "not ready" status
on media access commands.  If the backing storage becomes available later,
`ctladm modify ...` or `service ctld reload` can trigger its reopen.
2014-10-10 19:41:09 +00:00
Alexander Motin
8a41675372 Add support for WRITE ATOMIC (16) command and report SBC-4 compliance.
Atomic writes are only supported for ZVOLs in "dev" mode.  In other cases
atomicity cannot be guaranteed and so the command is blocked.
2014-10-08 07:48:36 +00:00
Alexander Motin
64c5167c91 Add support for "no Data-Out Buffer" (NDOB) flag of WRITE SAME (16) command. 2014-09-18 21:39:00 +00:00
Alexander Motin
71d8e97e35 When updating device media size use cached cdevsw pointer.
Using the pointer from the cdev directly is dangerous since we have no
reference on it, and it may change at any time.  That caused a panic if the
device was gone.

While there, report capacity change only if it really changed.

MFC after:	3 days
2014-09-18 17:25:20 +00:00