1611 Commits

Author SHA1 Message Date
mav
6c91ce3da6 Fix use-after-free on XPT_RESET_BUS.
That command is not queued, so does not use later status update.
2014-07-08 16:56:21 +00:00
mav
5f1a6650c5 Enable TAS feature: notify initiator if its command was aborted by other.
That should make operation more kind to multi-initiator environment.
Without this, other initiators may find out that something bad happened
to their commands only via command timeout.
2014-07-08 16:38:05 +00:00
mav
57e388ec90 Fix typo in r267873. 2014-07-08 13:28:37 +00:00
mav
a19a69842c Do not return statuses for aborted iSCSI commands. 2014-07-08 12:16:28 +00:00
mav
3d17a2a3d2 Return task management requests to queued execution, but differently.
Testing has shown that both the original queued design with a separate task queue,
and the recent direct execution design had a significant flaw: if an abort request
arrives just after the victim, the latter may not be in the ooa_queue
yet, and so is invisible to the task management function.

Unlike original queued implementation, use same queue for all SCSI and
TASK requests from the same initiator. That avoids races between them:
task functions are always executed in proper time, relatively to other
requests.
2014-07-08 12:15:15 +00:00
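The ordering argument in 3d17a2a3d2 can be illustrated with a small model. This is a sketch in Python, not CTL code: the `Initiator` class, its `incoming` FIFO, and the `ooa` dictionary are hypothetical stand-ins for the per-initiator queue and the ooa_queue named in the commit message. Because SCSI and TASK requests share one FIFO, an abort is only processed after its victim has been made visible.

```python
from collections import deque


class Initiator:
    """Hypothetical model of one initiator's request stream."""

    def __init__(self):
        self.incoming = deque()  # one FIFO shared by SCSI and TASK requests
        self.ooa = {}            # tag -> state, stands in for the ooa_queue

    def submit(self, kind, tag):
        self.incoming.append((kind, tag))

    def run(self):
        """Drain the FIFO in arrival order; return (tag, found) per abort."""
        results = []
        while self.incoming:
            kind, tag = self.incoming.popleft()
            if kind == "scsi":
                # The command becomes visible to task management here.
                self.ooa[tag] = "active"
            elif kind == "abort":
                found = tag in self.ooa
                if found:
                    self.ooa[tag] = "aborted"
                results.append((tag, found))
        return results
```

With a separate task queue, the abort in this model could be examined before the `scsi` entry had been moved into `ooa`; with a single FIFO that ordering cannot occur.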
mav
f71ed2c8ee Fix task management functions status: task not found is not an error,
while not implemented function is.
2014-07-08 08:34:34 +00:00
mav
aef4aec0ff Fix "use after free" on port creation error in r268291. 2014-07-07 11:52:22 +00:00
mav
db5360cb99 Add support for READ FULL STATUS action of PERSISTENT RESERVE IN command. 2014-07-07 11:05:04 +00:00
mav
784dfeee39 Teach ctl_add_initiator() to dynamically allocate IIDs from pool.
If port passed negative IID value, the function will try to allocate IID
from the pool of unused, based on passed wwpn or name arguments.  It does
its best to make the IID unique and persistent across reconnects.

This makes persistent reservation properly work for iSCSI.  Previously,
in case of reconnects, reservation could be unexpectedly lost, or even
migrate between initiators.
2014-07-07 09:37:22 +00:00
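One way to picture the allocation described in 784dfeee39 is the sketch below. It is Python rather than kernel C, and the `Port` class, the slot layout, and the three-step search order are assumptions made for illustration; the commit message only states that a negative IID triggers allocation from the pool of unused IDs, keyed by WWPN or name, with an effort to keep the IID stable across reconnects.

```python
class Port:
    """Hypothetical model of a port's initiator ID (IID) pool."""

    def __init__(self, max_initiators):
        self.slots = [None] * max_initiators    # last identity seen per IID
        self.in_use = [False] * max_initiators  # whether the IID is active

    def add_initiator(self, iid, identity):
        """Return the assigned IID, or -1 if none is available."""
        if iid < 0:
            iid = self._allocate(identity)
            if iid < 0:
                return -1
        elif iid >= len(self.slots):
            return -1
        self.slots[iid] = identity
        self.in_use[iid] = True
        return iid

    def remove_initiator(self, iid):
        # Keep the identity so that a reconnect can get the same IID back.
        self.in_use[iid] = False

    def _allocate(self, identity):
        # 1) Prefer the slot this identity (WWPN or name) used before.
        for i, ident in enumerate(self.slots):
            if ident is not None and ident == identity:
                return i
        # 2) Otherwise take a slot that has never been used.
        for i, ident in enumerate(self.slots):
            if ident is None:
                return i
        # 3) As a last resort, recycle an idle slot of another identity.
        for i, used in enumerate(self.in_use):
            if not used:
                return i
        return -1
```

Preferring never-used slots over recycled ones is what keeps a disconnected initiator's IID, and with it any persistent reservation tied to that IID, from being handed to a different initiator while free slots remain.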
mav
25e7dba03e Fix bugs for PERSISTENT RESERVE OUT bits in r268096. 2014-07-07 08:58:36 +00:00
mav
c964df7230 Correction to r268356: collide only sessions to the same target. 2014-07-07 06:17:07 +00:00
mav
17f8e3065c When new connection comes in, check whether we already have session from
the same initiator (Name+ISID).  If so -- terminate the old session and let
the new one take its place, as required by iSCSI RFC.
2014-07-07 05:48:11 +00:00
mav
c28c880da7 Implement ABORT TASK SET and I_T NEXUS RESET task management functions.
Use the last one to terminate active commands on iSCSI session termination.
Previous code was aborting only commands doing some data moves.
2014-07-07 03:10:56 +00:00
andreast
67a3cc14e4 Make gcc happy, init idlen2. 2014-07-06 20:09:23 +00:00
mav
ac68a20a27 Close race in r268291 between port destruction, delayed by sessions
teardown, and new port creation during `service ctld restart`.

Close it by returning iSCSI port internal state, that allows to identify
dying ports, which should not be counted as existing, from really alive.
2014-07-06 17:57:59 +00:00
mav
832ad256f9 Add support for SCSI Ports (88h) VPD page. 2014-07-06 07:34:18 +00:00
mav
30f8b78b35 Make REPORT TARGET PORT GROUPS command report realistic data instead of
hardcoded garbage.
2014-07-06 07:02:36 +00:00
mav
a50500ead5 Move lun_map() method from command nexus to port.
The previous implementation made it impossible to do some things, such as calling
it for ports other than the one through which the command arrived.
2014-07-06 06:21:34 +00:00
mav
1bf007c808 Relax some bit checks for INQUIRY command.
FreeBSD still tries to put the LUN number in the second byte until it gets the device
protocol version, even though that was obsoleted about 20 years ago.
2014-07-06 06:12:29 +00:00
mav
e3cec6db55 Pass through iSCSI session ISID from LOGIN request to the CTL frontend.
ISID is an important part of initiator transport ID for iSCSI.  It is not
used now, but should be to properly implement persistent reservation.
2014-07-05 21:18:33 +00:00
mav
dd9568e892 Bury devid port method, which was a gross hack.
Instead make ports provide wanted port and target IDs, and LUNs provide
wanted LUN IDs.  After that core Device ID VPD code only had to link all
of them together and add relative port and port group numbers.

LUN ID for iSCSI LUNs is no longer created by CTL, but by ctld, and passed
to CTL as "scsiname" LUN option.  This makes LUNs report the same set
of IDs, independently of the port through which they are accessed, as
required by SCSI specifications.
2014-07-05 19:30:20 +00:00
mav
4e932574fb Create separate CTL port for every iSCSI target (and maybe portal group).
Having a single port for all iSCSI connections makes it problematic to implement
some more advanced SCSI functionality in CTL that requires proper port
enumeration and identification.

This change extends the CTL iSCSI API, making the ctld daemon control the list of
iSCSI ports in CTL.  When a new target is defined in the config file, ctld will
create the respective port in CTL.  When a target is removed, its port will
also be removed after all active commands through that port are properly aborted.
This change requires ctld to be rebuilt to match the kernel.

As a minor side effect, this allows to have iSCSI targets without LUNs.
While that may look odd and not very useful, that is not incorrect.
2014-07-05 18:15:00 +00:00
mav
6a3d6f3982 Improve CTL_BEARG_* flags support, including optional values copyout. 2014-07-05 14:32:42 +00:00
mav
cd2bf77221 Implement and use ctl_frontend_find(). 2014-07-05 13:50:05 +00:00
mav
28432b0ce5 Introduce new IOCTL CTL_PORT_LIST reporting in more flexible XML format.
Leave old CTL_GET_PORT_LIST in place so far.  Garbage-collect it later.
2014-07-05 05:44:26 +00:00
mav
1aa291ed88 Improve readability of XML generated by CTL_LUN_LIST. 2014-07-05 04:10:24 +00:00
mav
5ec7bb54ef Make options KPI more generic to allow it to be used for ports too,
not only for LUNs.
2014-07-05 03:34:52 +00:00
mav
c3a321909b Use proper links field for ports linking. 2014-07-05 01:24:06 +00:00
mav
43424a0972 Separate concepts of frontend and port.
Before iSCSI implementation CTL had no knowledge about frontend drivers,
it had only frontends, which really were ports (alike to LUNs, if comparing
to backends).  But iSCSI added there ioctl() method, which does not belong
to frontend as a port, but belongs to a frontend driver.
2014-07-04 19:27:06 +00:00
mav
c9aabd0ff9 Remove targ_enable()/targ_disable() frontend methods.
Those methods were never implemented, and I believe that their concept is
wrong, since single frontend (SCSI port) can not handle several targets.
2014-07-04 19:19:03 +00:00
ken
ea871d446f Add persistent reservation support to camcontrol(8).
camcontrol(8) now supports a new 'persist' subcommand that allows users to
issue SCSI PERSISTENT RESERVE IN / OUT commands.

sbin/camcontrol/Makefile:
	Add persist.c.

sbin/camcontrol/persist.c:
	New persistent reservation support for camcontrol(8).

	We have support for all known operation modes for PERSISTENT RESERVE
	IN and PERSISTENT RESERVE OUT.

sbin/camcontrol/camcontrol.8:
	Document the new 'persist' subcommand.

	In the section on the Transport ID (-I) option, explain what
	Transport IDs for each protocol should look like.  At some point
	some of this information could probably get moved off in a
	separate man page, either on Transport IDs alone or a man page
	documenting the Transport ID parsing code.

	Add a number of examples of persistent reservation commands.
	Persistent Reservations are complex enough that the average user
	probably won't be able to get the commands exactly right by just
	reading the man page.  These examples show a few basic and
	advanced examples of how to use persistent reservations.

sbin/camcontrol/camcontrol.h:
	Move the definition for camcontrol_optret here, so we can use it
	for the persistent reservation code.

	Add a definition for the new scsipersist() function.

sbin/camcontrol/camcontrol.c:
	Add 'persist' to the list of subcommands.

	Document 'persist' in the help text.

sys/cam/scsi/scsi_all.c:
	Add the scsi_persistent_reserve_in() and
	scsi_persistent_reserve_out() CCB building functions.

	Add a new function, scsi_transportid_sbuf().  This takes a
	SCSI Transport ID (documented in SPC-4), and prints it to
	an sbuf(9).  There are some transports (like ATA, USB, and
	SSA) for which there is no transport defined.  We need to
	come up with a reasonable thing to do if we're presented
	with a Transport ID that claims to be for one of those
	protocols.

	Add new routines scsi_get_nv() and scsi_nv_to_str().

	These functions do a table lookup to go between a string and an
	integer.  There are lots of table lookups needed in the
	persistent reservation code in camcontrol(8).

	Add a new function, scsi_parse_transportid(), along with leaf node
	functions to parse:
	FC, 1394 and SAS (scsi_parse_transportid_64bit())
	iSCSI (scsi_parse_transportid_iscsi())
	SPI (scsi_parse_transportid_spi())
	RDMA (scsi_parse_transportid_rdma())
	PCIe (scsi_parse_transportid_sop())

	Transport IDs.  Given a string with the general form proto,id these
	functions create a SCSI Transport ID structure.

sys/cam/scsi/scsi_all.h:
	Update the various persistent reservation data structures to
	SPC4r36l, but also rename some fields that were previously
	obsolete with the proper names from older SCSI specs.  This
	allows using older, obsolete persistent reservation types when
	desired.

	Add function prototypes for the new persistent reservation CCB
	building functions.

	Add a data structure for the READ FULL STATUS service action
	of the PERSISTENT RESERVE IN command.

	Add Transport ID structures for all protocols described in SPC-4.

	Add a new series of SCSI_PROTO_XXX definitions, and
	redefine other defines in terms of these new definitions.

	Add a prototype for scsi_transportid_sbuf().

	Change a couple of "obsolete" persistent reservation data
	structure fields into something more meaningful, based on
	what the field was called when it was defined in the spec.
	(e.g. SPC, SPC-2, etc.)

	Create a new define, SPRI_MAX_LEN, for the maximum allocation
	length allowed for the PERSISTENT RESERVE IN command.

	Add data structures and enumerations for the new name/value
	translation functions.

	Add data structures for SCSI over PCIe Routing IDs.

	Bring the PERSISTENT RESERVE OUT Register and Move parameter list
	structure (struct scsi_per_res_out_parms) up to date with SPC-4.

	Add a data structure for the transport IDs that can optionally be
	appended to the basic PERSISTENT RESERVE OUT parameter list.

	Move SCSI protocol macro definitions out of the VPD page 0x83
	definition and combine them with the more up to date protocol
	definitions higher in the file.

	Add function prototypes for scsi_nv_to_str(), scsi_get_nv(),
	scsi_parse_transportid_64bit(), scsi_parse_transportid_spi(),
	scsi_parse_transportid_rdma(), scsi_parse_transportid_iscsi(),
	scsi_parse_transportid_sop(), and scsi_parse_transportid().

Sponsored by:	Spectra Logic Corporation
MFC after:	1 week
2014-07-03 23:09:44 +00:00
imp
41f8871f5e Rework the BIO_DELETE code slightly. Always queue the BIO_DELETE
requests on the trim_queue, even for the CFA ERASE. This allows us, in
the future, to collapse adjacent requests. Since CFA ERASE is only for
CF cards, and it is so restrictive in what it can do, the collapse
code is not presently here. This also brings the ada driver more in
line with the da driver's treatment of BIO_DELETEs.

Reviewed by: mav@
2014-07-03 05:22:13 +00:00
mav
83d76f3c46 Use separate memory type M_CTLIO for I/Os.
CTL allocates a large amount of RAM.  This change gives some more stats.

MFC after:	2 weeks
2014-07-03 04:26:53 +00:00
mav
7eb84da710 Add support for REPORT TIMESTAMP command.
MFC after:	2 weeks
2014-07-01 16:52:41 +00:00
mav
68790c5590 Add more formal and strict command parsing and validation.
For every supported command define CDB length and mask of bits that are
allowed to be set.  This allows to remove bunch of checks through the code
and still make the validation more strict.  To properly do it for commands
supporting multiple service actions, formalize their parsing by adding
subtables for each of such commands.

As visible effect, this change allows to add support for REPORT SUPPORTED
OPERATION CODES command, reporting to client all the data about supported
SCSI commands, except timeouts.

MFC after:	2 weeks
2014-07-01 15:05:23 +00:00
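The table-driven validation described in 68790c5590 can be sketched as follows. This is an illustrative Python model, not the CTL implementation: the opcodes (0x00 TEST UNIT READY, 0x12 INQUIRY) are standard SCSI values, but `CDB_TABLE`, `validate_cdb`, and the specific mask bytes are hypothetical. Each entry gives, per CDB byte, the bits a client is allowed to set; the entry's length doubles as the expected CDB length.

```python
# Hypothetical table: opcode -> per-byte masks of bits that may be set.
# The masks below are illustrative and are not CTL's actual values.
CDB_TABLE = {
    0x00: bytes([0xFF, 0x00, 0x00, 0x00, 0x00, 0x07]),  # TEST UNIT READY
    0x12: bytes([0xFF, 0x01, 0xFF, 0xFF, 0xFF, 0x07]),  # INQUIRY, EVPD only
}


def validate_cdb(cdb):
    """Return 'ok', 'invalid opcode', 'bad length', or 'invalid field'."""
    if not cdb or cdb[0] not in CDB_TABLE:
        return "invalid opcode"
    mask = CDB_TABLE[cdb[0]]
    if len(cdb) != len(mask):
        return "bad length"
    for value, allowed in zip(cdb, mask):
        # Any bit set outside the allowed mask fails validation.
        if value & ~allowed & 0xFF:
            return "invalid field"
    return "ok"
```

A single loop over the mask replaces the per-command checks scattered through handlers, and the same table can be walked to answer a query about which commands and bits are supported.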
hselasky
35b126e324 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
mav
be666c404b Remove odd practice of inverting error codes.
-EPERM is equal to ERESTART, returning which from ioctl() handler causes
infinite syscall restart.

MFC after:	2 weeks
2014-06-27 22:28:14 +00:00
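The collision described in be666c404b follows from the numeric values in FreeBSD's sys/errno.h, where EPERM is 1 and ERESTART is -1. The sketch below is Python used for illustration only; `dispatch` and the two handlers are hypothetical stand-ins for the syscall return path and an ioctl() handler, and the restart cap exists only so that the example terminates.

```python
# Values as defined in FreeBSD's <sys/errno.h>.
EPERM = 1
ERESTART = -1  # kernel-internal pseudo-error meaning "restart the syscall"


def dispatch(handler, max_restarts=3):
    """Hypothetical stand-in for the syscall return path.

    Calls handler until it returns something other than ERESTART.
    The real kernel has no restart cap; max_restarts is only here so
    that this sketch terminates. Returns (error, restarts), with error
    set to None if the cap was hit.
    """
    restarts = 0
    while True:
        error = handler()
        if error != ERESTART:
            return error, restarts
        if restarts == max_restarts:
            return None, restarts
        restarts += 1


def buggy_ioctl():
    return -EPERM  # inverted error code, indistinguishable from ERESTART


def fixed_ioctl():
    return EPERM
```

Here `dispatch(buggy_ioctl)` keeps restarting until the artificial cap is reached, while `dispatch(fixed_ioctl)` returns EPERM on the first call.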
gjb
fc21f40567 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
hselasky
bd1ed65f0f Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating children of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, since some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, since malloc()/free() are
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
mav
689a02cb4f Fix typo in r267481.
MFC after:	3 days
2014-06-27 06:52:37 +00:00
mav
3b1508e471 Simplify statistics calculation.
Instead of trying to guess size of disk I/O operations (it just won't work
that way for newly added commands, and is equal to data move size for old
ones), account data move traffic.  If disk I/Os are that interesting, then
backends have to account and provide that information.

Block backend already exports the information about disk I/Os via devstat,
so having it here too is excessive.

MFC after:	2 weeks
2014-06-26 20:06:37 +00:00
mav
3fa0a3d2c6 Allow MODE SENSE commands through Write Exclusive persistent reservation,
as required by SPC-4.

Report that fact in persistent reservation capabilities.

MFC after:	2 weeks
2014-06-26 09:42:00 +00:00
mav
36b6236db2 Add READ BUFFER and improve WRITE BUFFER SCSI commands support.
This gives some use to 512KB per-LUN buffers, allocated for Copan-specific
processor code and not used.  It allows, for example, to test transport
performance and/or correctness without accessing the media, as supported
by Linux version of sg3_utils.

MFC after:	2 weeks
2014-06-26 08:56:36 +00:00
mav
52ca2df270 Lock devstat updates in block backend to make it usable. Polish lock names.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-25 17:54:36 +00:00
mav
d4f8a83dc3 Introduce fine-grained CTL locking to improve SMP scalability.
Split global ctl_lock, historically protecting most of CTL context:
 - remaining ctl_lock now protects lists of frontends and backends;
 - per-LUN lun_lock(s) protect LUN-specific information;
 - per-thread queue_lock(s) protect request queues.
This allows to radically reduce congestion on ctl_lock.

Create multiple worker threads, depending on number of CPUs, and assign
each LUN to one of them.  This allows to spread load between multiple CPUs,
still avoiding congestion on queues and LUNs locks.

On 40-core server, exporting 5 LUNs, each backed by gstripe of SATA SSDs,
accessed via 6 iSCSI connections, this change improves peak request rate
from 250K to 680K IOPS.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-25 17:02:01 +00:00
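The lock split in d4f8a83dc3 can be modeled roughly as below. This is a Python sketch under stated assumptions, not CTL code: `Worker`, `Pool`, and the modulo mapping from LUN to worker are hypothetical; the commit message says only that each LUN is assigned to one of several worker threads and that each thread's queue has its own queue_lock.

```python
import threading
from collections import deque


class Worker:
    """One worker thread's state: a queue guarded by its own lock."""

    def __init__(self):
        self.queue_lock = threading.Lock()  # protects only this queue
        self.queue = deque()


class Pool:
    def __init__(self, num_workers):
        self.workers = [Worker() for _ in range(num_workers)]

    def worker_for(self, lun_id):
        # Assumption: a simple modulo mapping. The commit only says that
        # each LUN is assigned to one of the workers.
        return self.workers[lun_id % len(self.workers)]

    def enqueue(self, lun_id, request):
        worker = self.worker_for(lun_id)
        with worker.queue_lock:
            worker.queue.append((lun_id, request))
```

Requests for LUNs mapped to different workers take different locks, so they do not contend with each other, while all requests for one LUN stay on one queue and keep their arrival order.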
mav
f8bcf3a156 Allow to use iSCSI immediate data by several ctl_datamove() calls.
While for FreeBSD client that is only a minor optimization, VMWare client
doesn't support additional data requests after all data being sent once as
immediate.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2014-06-25 16:12:14 +00:00
mav
85412f13ad Execute task management request directly in ctl_queue() context.
On one hand, it allows removing the CTL_FLAG_TASK_PENDING flag, handling of
which significantly complicates fine-grained locking.  On the other hand,
it reduces task management request latency even below what that flag could.
As a downside, it prevents task management code from sleeping, but that is not needed
anyway now.

Discussed with:	ken
2014-06-19 13:19:35 +00:00
mav
281c52cb4a Add some more CTL_FLAG_ABORT check points.
This should allow to abort commands doing mostly disk I/O, such as VERIFY
or WRITE SAME.  Before this change CTL_FLAG_ABORT was only checked around
data moves, which for these commands may not happen for a very long time.

MFC after:	2 weeks
2014-06-19 12:43:41 +00:00
mav
6ec07a92b2 Increase CTL_DEVID_LEN from 16 to 64 bytes.
SPC-4 recommends that a T10 vendor ID based LUN ID be created by concatenating
product name and serial number (and istgt follows that).  But product name
is 16 bytes long by itself, so 16 bytes total length is clearly not enough
to fit both.

To keep compatibility with existing configurations, pad short device IDs
to old length of 16, same as before.

This change probably breaks CTL user-level ABI, so control tools should
be rebuilt after this change.

MFC after:	2 weeks
2014-06-19 09:46:43 +00:00
marius
c1a033988a Don't denounce peripherals on system shutdown. Together with r267321,
we're now back to the pre-r228483 level of default verbosity. This in
turn again typically allows for reading information that userland might
have printed on the screen before initiating a halt, but still permits
to debug potential device shutdown problems on system shutdown via
CAM_DEBUG etc.

Reviewed by:	mav
MFC after:	3 days
Sponsored by:	Bally Wulff Games & Entertainment GmbH
2014-06-19 09:08:20 +00:00