Pass through iSCSI session ISID from LOGIN request to the CTL frontend.
ISID is an important part of initiator transport ID for iSCSI. It is not
used now, but should be to properly implement persistent reservation.
Burry devid port method, which was a gross hack.
Instead make ports provide wanted port and target IDs, and LUNs provide
wanted LUN IDs. After that core Device ID VPD code only had to link all
of them together and add relative port and port group numbers.
LUN ID for iSCSI LUNs no longer created by CTL, but by ctld, and passed
to CTL as "scsiname" LUN option. This makes LUNs to report the same set
of IDs, independently from the port through which it is accessed, as
required by SCSI specifications.
Create separate CTL port for every iSCSI target (and maybe portal group).
Having single port for all iSCSI connections makes problematic implementing
some more advanced SCSI functionality in CTL, that require proper ports
enumeration and identification.
This change extends CTL iSCSI API, making ctld daemon to control list of
iSCSI ports in CTL. When new target is defined in config fine, ctld will
create respective port in CTL. When target is removed -- port will be
also removed after all active commands through that port properly aborted.
This change require ctld to be rebuilt to match the kernel.
As a minor side effect, this allows to have iSCSI targets without LUNs.
While that may look odd and not very useful, that is not incorrect.
Separate concepts of frontend and port.
Before iSCSI implementation CTL had no knowledge about frontend drivers,
it had only frontends, which really were ports (alike to LUNs, if comparing
to backends). But iSCSI added there ioctl() method, which does not belong
to frontend as a port, but belongs to a frontend driver.
Remove targ_enable()/targ_disable() frontend methods.
Those methods were never implemented, and I believe that their concept is
wrong, since single frontend (SCSI port) can not handle several targets.
Add more formal and strict command parsing and validation.
For every supported command define CDB length and mask of bits that are
allowed to be set. This allows to remove bunch of checks through the code
and still make the validation more strict. To properly do it for commands
supporting multiple service actions, formalize their parsing by adding
subtables for each of such commands.
As visible effect, this change allows to add support for REPORT SUPPORTED
OPERATION CODES command, reporting to client all the data about supported
SCSI commands, except timeouts.
Increase CTL_DEVID_LEN from 16 to 64 bytes.
SPC-4 recommends T10 vendor ID based LUN ID was created by concatenating
product name and serial number (and istgt follows that). But product name
is 16 bytes long by itself, so 16 bytes total length is clearly not enough
to fit both.
To keep compatibility with existing configurations, pad short device IDs
to old length of 16, same as before.
This change probably breaks CTL user-level ABI, so control tools should
be rebuilt after this change.
Introduce fine-grained CTL locking to improve SMP scalability.
Split global ctl_lock, historically protecting most of CTL context:
- remaining ctl_lock now protects lists of fronends and backends;
- per-LUN lun_lock(s) protect LUN-specific information;
- per-thread queue_lock(s) protect request queues.
This allows to radically reduce congestion on ctl_lock.
Create multiple worker threads, depending on number of CPUs, and assign
each LUN to one of them. This allows to spread load between multiple CPUs,
still avoiging congestion on queues and LUNs locks.
On 40-core server, exporting 5 LUNs, each backed by gstripe of SATA SSDs,
accessed via 6 iSCSI connections, this change improves peak request rate
from 250K to 680K IOPS.
Sponsored by: iXsystems, Inc.
Simplify statistics calculation.
Instead of trying to guess size of disk I/O operations (it just won't work
that way for newly added commands, and is equal to data move size for old
ones), account data move traffic. If disk I/Os are that interesting, then
backends have to account and provide that information.
Block backend already exports the information about disk I/Os via devstat,
so having it here too is excessive.
Allow MODE SENSE commands through Write Exclusive persistent reservation,
as required by SPC-4.
Report that fact in persistent reservation capabilities.
Add READ BUFFER and improve WRITE BUFFER SCSI commands support.
This gives some use to 512KB per-LUN buffers, allocated for Copan-specific
processor code and not used. It allows, for example, to test transport
performance and/or correctness without accessing the media, as supported
by Linux version of sg3_utils.
Add some more CTL_FLAG_ABORT check points.
This should allow to abort commands doing mostly disk I/O, such as VERIFY
or WRITE SAME. Before this change CTL_FLAG_ABORT was only checked around
data moves, which for these commands may not happen for a very long time.
Rework session termination in iSCSI target to actually wait
for any outstanding commands to be properly aborted by CTL.
Without it, in some cases (such as files backing the LUNs
stored on failing disk drives), terminating a busy session
would result in panic.
- Add support for SG_GET_SG_TABLESIZE IOCTL to report that we don't support
scatter/gather lists.
- Return error for still unsupported SG 3.x API read/write calls.
Add support for VERIFY(10/12/16) and COMPARE AND WRITE SCSI commands.
Make data_submit backends method support not only read and write requests,
but also two new ones: verify and compare. Verify just checks readability
of the data in specified location without transferring them outside.
Compare reads the specified data and compares them to received data,
returning error if they are different.
VERIFY(10/12/16) commands request either verify or compare from backend,
depending on BYTCHK CDB field. COMPARE AND WRITE command executed in two
stages: first it requests compare, and then, if succeesed, requests write.
Atomicity of operation is guarantied by CTL request ordering code.
Sponsored by: iXsystems, Inc.
Make backends track completion by processed number of sectors instead of
total transfer size.
Commands such as VERIFY or COMPARE AND WRITE may have transfer size not
matching directly to number of sectors.
Format Portal Group Tag same as istgt does -- %4.4x instead of %x.
SPC-4 spec tells it should be "two or more hexadecimal digits".
RFC3720 tells it is 16-bit value.
Allow to use iSCSI immediate data by several ctl_datamove() calls.
While for FreeBSD client that is only a minor optimization, VMWare client
doesn't support additional data requests after all data being sent once as
immediate.
Overhaul CAM SG driver IOCTL interfaces.
Make it really work for native FreeBSD programs. Before this it was broken
for years due to different number of pointer dereferences in Linux and
FreeBSD IOCTL paths, permanently returning errors to FreeBSD programs.
This change breaks the driver FreeBSD IOCTL ABI, making it more strict,
but since it was not working any way -- who bother.
Add shims for 32-bit programs on 64-bit host, translating the argument
of the SG_IO IOCTL for both FreeBSD and Linux ABIs.
With this change I was able to run 32-bit Linux sg3_utils tools and simple
32 and 64-bit FreeBSD test tools on both 32 and 64-bit FreeBSD systems.
Don't denounce peripherals on system shutdown. Together with r267321
(MFCed to stable/10 in r267775), we're now back to the pre-r228483
level of default verbosity. This in turn again typically allows for
reading information that userland might have printed on the screen
before initiating a halt, but still permits to debug potential device
shutdown problems on system shutdown via CAM_DEBUG etc.
Reviewed by: mav
Sponsored by: Bally Wulff Games & Entertainment GmbH
Respect MAXIMUM TRANSFER LENGTH field of Block Limits VPD page.
Nobody yet reported disk supporting I/Os less then our MAXPHYS value, but
since we any way have code to read Block Limits VPD page, that is easy.
Do not reread SCSI disk VPD pages on every device open.
Instead of rereading VPD pages on every device open, do it only on initial
device probe, and in cases when device reported via UNIT ATTENTIONs that
something has changed. Capacity is still rereaded on every open because
it is more critical for operation and more probable to change in run time.
On my tests with Intel 530 SSDs on mps(4) HBA this change reduces time
GEOM needs to retaste the device (that includes few open/close cycles)
from ~150ms to ~30ms.
Remove limits on size of READ/WRITE operations.
Instead of allocating up to 16MB or RAM at once to handle whole I/O,
allocate up to 1MB at a time, but do multiple ctl_datamove() and storage
I/Os if needed.
Make CAM target CTL frontend respect SIM I/O size limitations.
If datamove size is bigger then SIM can handle, or it has more segments
then this code can handle -- split it into several CTIO requests.
Add support for SCSI UNMAP commands to CTL.
This patch adds support for three new SCSI commands: UNMAP, WRITE SAME(10)
and WRITE SAME(16). WRITE SAME commands support both normal write mode
and UNMAP flag. To properly report UNMAP capabilities this patch also adds
support for reporting two new VPD pages: Block limits and Logical Block
Provisioning.
UNMAP support can be enabled per-LUN by adding "-o unmap=on" to `ctladm
create` command line or "option unmap on" to lun sections of /etc/ctl.conf.
At this moment UNMAP supported for ramdisks and device-backed block LUNs.
It was tested to work great with ZFS ZVOLs. For file-backed LUNs UNMAP
support is unfortunately missing due to absence of respective VFS KPI.
Sponsored by: iXsystems, Inc
Replace several instances of -1 with appropriate CAM_*_WILDCARD and types.
It was equal before r259397, but for good or bad, not any more for LUNs.
This change fixes at least CAM debugging.
Properly identify target portal when running in proxy mode. While here,
remove CTL_ISCSI_CLOSE, it wasn't used or implemented anyway.
Sponsored by: The FreeBSD Foundation
Add some stuff to make it easier to figure out for the system administrator
whether the ICL_KERNEL_PROXY stuff got compiled in correctly.
Sponsored by: The FreeBSD Foundation
Make it possible for the iSCSI target side to operate in both normal
and ICL_KERNEL_PROXY mode, and fix some bit rot so the latter actually
works again.
Sponsored by: The FreeBSD Foundation
Instead of "icltx" and "iclrx", use thread names with prefix from upper
layer, so that one can see which side of the stack the threads are for.
Sponsored by: The FreeBSD Foundation
Get rid of ICL lock; use upper-layer (initiator or target) lock instead.
This avoids extra locking in icl_pdu_queue(); the upper layer needs to call
it while holding its own lock anyway, to avoid sending PDUs out of order.
Sponsored by: The FreeBSD Foundation
Remove the homegrown ctl_be_block_io allocator, replacing it with UMA.
There is no performance difference.
Reviewed by: mav@
Sponsored by: The FreeBSD Foundation
Hide CTL messages about SCSI error responses. Too many users take
them for actual target errors. They can be enabled back by setting
kern.cam.ctl.verbose=1, or booting with bootverbose.
Sponsored by: The FreeBSD Foundation
Remove support of LUN-based CD changers from cd(4) driver.
This code was heavily broken few months ago during CAM locking changes.
Fixing it would require almost complete rewrite. Since there are no
known devices on market using this interface younger then ~15 years, and
they are CD, not even DVD, I don't see much reason to rewrite it.
This change does not mean those devices won't work. They will just work
slower due to inefficient disks load/unload schedule if several LUNs
accessed same time.
Fix support for increased logical sector size (4K-native drives).
- Logical sector size is measured in words, not bytes.
- If physical sector is not bigger then logical sector, it does not mean
it should be set equal to 512 bytes, but set to logical sector.
PR: misc/187269
Submitted by: Ravi Pokala <rpokala@panasas.com>
Move xpt_run_devq() call before request completion callback where it was
originally.
I am not sure why exactly have I moved it during one of many refactorings
during camlock project, but obviously it opens race window that may cause
use after free panics during SIM (in reported cases umass(4)) detach.
Take additional reference on SCSI probe periph to cover its freeze count.
Otherwise periph may be invalidated and freed before single-stepping freeze
is dropped, causing use after free panic.
locking support for CAM
r256826:
Fix several target mode SIMs to not blindly clear ccb_h.flags field of
ATIO CCBs. Not all CCB flags there belong to them.
r256836:
Remove hard limit on number of BIOs handled with one ATA TRIM request.
r256843:
Merge CAM locking changes from the projects/camlock branch to radically
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
r256888:
Unconditionally acquire periph reference on CCB allocation failure.
r256895:
Fix memory and references leak due to unfreed path.
r256960:
Move CAM_UNQUEUED_INDEX setting to the last moment and under the periph lock.
This fixes race condition with cam_periph_ccbwait(), causing use-after-free.
r256975:
Minor (mostly cosmetical) addition to r256960.
r257054:
Some microoptimizations for da and ada drivers:
- Replace ordered_tag_count counter with single flag;
- From da remove outstanding_cmds counter, duplicating pending_ccbs list;
- From da_softc remove unused links field.
r257482:
Fix lock recursion, triggered by `smartctl -a /dev/adaX`.
r257501:
Make getenv_*() functions and respectively TUNABLE_*_FETCH() macros not
allocate memory and so not require sleepable environment. getenv() has
already used on-stack temporary storage, so just use it more rationally.
getenv_string() receives buffer as argument, so don't need another one.
r257914:
Some CAM locks polishing:
- Fix LOR and possible lock recursion when handling high-power commands.
Introduce new lock to protect left power quota and list of frozen devices.
- Correct locking around xpt periph creation.
- Remove seems never used XPT_FLAG_OPEN xpt periph flag.
Again, Netflix assisted with testing the merge, but all of the credit goes
to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
r256603:
Introduce new function devstat_end_transaction_bio_bt(), adding new argument
to specify present time. Use this function to move binuptime() out of lock,
substantially reducing lock congestion when slow timecounter is used.
r256606:
Move g_io_deliver() out of the lock, as required for direct dispatch.
Move g_destroy_bio() out too to reduce lock scope even more.
r256607:
Fix passing uninitialized bio_resid argument to g_trace().
r256610:
Add unmapped I/O support to GEOM RAID.
r256830:
Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodonne() when unmapping
temporary mapped buffer. That fixes double unmap if biodone() called twice
for the same BIO (but with different done methods).
r256880:
Merge GEOM direct dispatch changes from the projects/camlock branch.
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.
r259247:
Fix bug introduced at r256607. We have to recalculate bp_resid here since
sizes of original and completed requests may differ due to end of media.
Testing of the stable/10 merge was done by Netflix, but all of the credit
goes to Alexander and iX Systems.
Submitted by: mav
Sponsored by: iX Systems
When comparing device IDs, make sure that they have the same type
(like NAA assigned) and identify the same entity (like device or port).
Otherwise there can be false positives since at least some models of
Seagate disks use same IDs for the whole device and one of its ports.
Properly report an error instead of panicing when user tries to create
LUN backed by non-disk device, e.g. /dev/null.
Sponsored by: The FreeBSD Foundation
Implement extended LUN support. If PIM_EXTLUNS is set by a SIM, encode
the upper 32-bits of the LUN, if possible, into the target_lun field as
passed directly from the REPORT LUNs response. This allows extended LUN
support to work for all LUNs with zeros in the lower 32-bits, which covers
most addressing modes without breaking KBI. Behavior for drivers not
setting PIM_EXTLUNS is unchanged. No user-facing interfaces are modified.
Extended LUNs are stored with swizzled 16-bit word order so that, for
devices implementing LUN addressing (like SCSI-2), the numerical
representation of the LUN is identical with and without PIM_EXTLUNS. Thus
setting PIM_EXTLUNS keeps most behavior, and user-facing LUN IDs, unchanged.
This follows the strategy used in Solaris. A macro (CAM_EXTLUN_BYTE_SWIZZLE)
is provided to transform a lun_id_t into a uint64_t ordered for the wire.
This is the second part of work for full 64-bit extended LUN support and is
designed to a bridge for stable/10 to the final 64-bit LUN code. The
third and final part will involve widening lun_id_t to 64 bits and will
not be MFCed. This third part will break the KBI but will keep the KPI
unchanged so that all drivers that will care about this can be updated now
and not require code changes between HEAD and stable/10.
Reviewed by: scottl
Unhide "Serial Number" lines from bootverbose. That information may
be useful for system administration to have in hard copy (in logs) if
one of several devices suddenly dies.
Approved by: re (hrs)
registered simultaneously. Due to topology unlock between the ID allocation
and the bus registration there is a chance that two buses may get the
same IDs. That is supposed reason of lock assertion panic in CAM during
initial bus scanning after new iscsid initiates two sessions same time.
Reported by: trasz
Approved by: re (glebus, marius)
MFC after: 2 weeks