- Use SDT_PROBE<N>() instead of SDT_PROBE(). This has no functional effect
at the moment, but will be needed for some future changes.
- Don't hardcode the module component of the probe identifier. This is
set automatically by the SDT framework.
MFC after: 1 week
I am not sure what for it was done. Now open routine should automatically
fall back to read-only if open for writing is impossible. In such case
attempt to upgrade to write sounds strange.
MFC after: 1 week
Previously, with serseq enabled, next command was unblocked only after
previous completed. With this change, for read operations, next command
is unblocked as soon as last media read completed. This is important
for frontends that actually wait for data move completion (like camtgt),
or when data are moved through the HA link, or especially when both.
All requests arriving for processing after OFFLINE flag set are rejected
with BUSY status. Races around OFFLINE flag setting are closed by calling
taskqueue_drain_all().
CTL HA functionality was originally implemented by Copan many years ago,
but large part of the sources was never published. This change includes
clean room implementation of the missing code and fixes for many bugs.
This code supports dual-node HA with ALUA in four modes:
- Active/Unavailable without interlink between nodes;
- Active/Standby with second node handling only basic LUN discovery and
reservation, synchronizing with the first node through the interlink;
- Active/Active with both nodes processing commands and accessing the
backing storage, synchronizing with the first node through the interlink;
- Active/Active with second node working as proxy, transfering all
commands to the first node for execution through the interlink.
Unlike original Copan's implementation, depending on specific hardware,
this code uses simple custom TCP-based protocol for interlink. It has
no authentication, so it should never be enabled on public interfaces.
The code may still need some polishing, but generally it is functional.
Relnotes: yes
Sponsored by: iXsystems, Inc.
This is preparation for possibility to open/close media several times
per LUN life cycle. While there, rename variables to reduce confusion.
As additional bonus this allows to open read-only media, such as ZFS
snapshots.
Its idea was to be a simple initiator and execute several commands from
kernel level, but FreeBSD never had consumer for that functionality,
while its implementation polluted many unrelated places..
At this point IMMED flag is translated to MNT_NOWAIT flag of VOP_FSYNC(),
hoping that file system implements that (ZFS seems doesn't).
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Previously several places were doing it on its own, partially
incorrectly (e.g. without the filedesc locked) or even actively harmful
by populating jdir or assigning rootvnode without vrefing it.
Reviewed by: kib
This change by 2-3 times improves performance of misaligned XCOPY and WUT
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
This change by 2-3 times improves performance of misaligned WRITE SAME
commands by avoiding unneeded read-modify-write cycles inside ZFS.
MFC after: 1 week
While in most cases CTL should correctly fetch those values from backing
storages, there are some initiators (like MS SQL), that may not like large
physical block sizes, even if they are true. For such cases allow override
fetched values with supported ones (like 4K).
MFC after: 1 week
Technically read requests can be executed in any order or simultaneously
since they are not changing any data. But ZFS prefetcher goes crasy when
it receives consecutive requests from different threads. Since prefetcher
works on level of separate blocks, instead of two consecutive 128K requests
it may receive 32 8K requests in mixed order.
This patch is more workaround then a real fix, and it does not fix all of
prefetcher problems, but it improves sequential read speed by 3-4x times
in some configurations. On the other side it may hurt performance if
some backing store has no prefetch, that is why it is disabled by default
for raw devices.
MFC after: 2 weeks
Previously it was supported only for ZVOL-backed LUNs, but now should work
for file-backed LUNs too. Used value in this case is a space occupied by
the backing file, while available value is an available space on file
system. Pool thresholds are still not implemented in this case.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
It is implemented for LUNs backed by ZVOLs in "dev" mode and files.
GEOM has no such API, so for LUNs backed by raw devices all LBAs will
be reported as mapped/unknown.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
Make CTL core and block backend set success status before initiating last
data move for read commands. Make CAM target and iSCSI frontends detect
such condition and send command status together with data. New I/O flag
allows to skip duplicate status sending on later fe_done() call.
For Fibre Channel this change saves one of three interrupts per read command,
increasing performance from 126K to 160K IOPS. For iSCSI this change saves
one of three PDUs per read command, increasing performance from 1M to 1.2M
IOPS.
MFC after: 1 month
Sponsored by: iXsystems, Inc.
For ZVOL-backed LUNs this allows to inform initiators if storage's used or
available spaces get above/below the configured thresholds.
MFC after: 2 weeks
Sponsored by: iXsystems, Inc.
This makes VMWare VAAI Thin Provisioning Stun primitive activate, pausing
the virtual machine, when backing storage (ZFS pool) is getting overflowed.
MFC after: 1 week
Sponsored by: iXsystems, Inc.
Such LUNs will be visible to initiators, but return "not ready" status
on media access commands. If backing storage become available later,
`ctladm modify ...` or `service ctld reload` can trigger its reopen.
Using pointer from the cdev directly is dangerous since we have no reference
on it, and it may change any time. That caused panic if device has gone.
While there, report capacity change only if it really changed.
MFC after: 3 days