1374 Commits

Author SHA1 Message Date
Glen Barber
37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky
3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating the children of a static/extern SYSCTL
node. This operation should probably be factored out into a
common macro, since some device drivers use it. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection with removing unneeded
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, because malloc()/free() are
not yet available when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.
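
The tunable lookup described above can be modelled outside the kernel. The following is a minimal Python sketch, not the kernel code: the Oid class, the register() helper, the kenv dictionary and the "hw.example.limit" name are all hypothetical and exist only to show how a parent OID pointer makes the sysctl path cheap to build and how a CTLFLAG_NOFETCH-style flag suppresses the automatic fetch.

```python
class Oid:
    """Hypothetical stand-in for a sysctl OID that keeps a pointer to its parent."""

    def __init__(self, name, parent=None, tunable=False, nofetch=False, value=None):
        self.name = name
        self.parent = parent    # parent OID pointer, not only the parent's child list
        self.tunable = tunable  # models CTLFLAG_TUN
        self.nofetch = nofetch  # models CTLFLAG_NOFETCH
        self.value = value

    def path(self):
        # Walk the parent pointers up to the root to build "a.b.c".
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return ".".join(reversed(parts))


def register(oid, kenv):
    """Seed a tunable OID from the environment at registration time.

    kenv is a plain dict standing in for the kernel environment (assumption).
    """
    if oid.tunable and not oid.nofetch:
        raw = kenv.get(oid.path())
        if raw is not None:
            oid.value = int(raw)
    return oid


# Hypothetical tree and environment, for illustration only.
kenv = {"hw.example.limit": "3", "hw.example.custom": "7"}
hw = Oid("hw")
example = Oid("example", parent=hw)
limit = register(Oid("limit", parent=example, tunable=True, value=0), kenv)
custom = register(
    Oid("custom", parent=example, tunable=True, nofetch=True, value=0), kenv
)
```

Here limit picks up its value from the environment, while custom keeps its default because its owner is assumed to run its own initialisation function.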

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Alexander Motin
1d6f7db544 Fix typo in r267481.
MFC after:	3 days
2014-06-27 06:52:37 +00:00
Alexander Motin
b88b05216a Simplify statistics calculation.
Instead of trying to guess size of disk I/O operations (it just won't work
that way for newly added commands, and is equal to data move size for old
ones), account data move traffic.  If disk I/Os are that interesting, then
backends have to account and provide that information.

Block backend already exports the information about disk I/Os via devstat,
so having it here too is excessive.

MFC after:	2 weeks
2014-06-26 20:06:37 +00:00
Alexander Motin
f82388fd84 Allow MODE SENSE commands through Write Exclusive persistent reservation,
as required by SPC-4.

Report that fact in persistent reservation capabilities.

MFC after:	2 weeks
2014-06-26 09:42:00 +00:00
Alexander Motin
85165a3f70 Add READ BUFFER and improve WRITE BUFFER SCSI commands support.
This gives some use to the 512KB per-LUN buffers, allocated for Copan-specific
processor code and not used.  It makes it possible, for example, to test transport
performance and/or correctness without accessing the media, as supported
by the Linux version of sg3_utils.

MFC after:	2 weeks
2014-06-26 08:56:36 +00:00
Alexander Motin
75c7a1d357 Lock devstat updates in block backend to make it usable. Polish lock names.
MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-25 17:54:36 +00:00
Alexander Motin
3a8ce4a36b Introduce fine-grained CTL locking to improve SMP scalability.
Split global ctl_lock, historically protecting most of CTL context:
 - remaining ctl_lock now protects lists of frontends and backends;
 - per-LUN lun_lock(s) protect LUN-specific information;
 - per-thread queue_lock(s) protect request queues.
This radically reduces congestion on ctl_lock.

Create multiple worker threads, depending on the number of CPUs, and assign
each LUN to one of them.  This spreads load between multiple CPUs
while still avoiding congestion on queue and LUN locks.

On 40-core server, exporting 5 LUNs, each backed by gstripe of SATA SSDs,
accessed via 6 iSCSI connections, this change improves peak request rate
from 250K to 680K IOPS.
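
The locking split can be illustrated with a small user-space model. This is only a sketch in Python under stated assumptions: the Lun and Worker classes and the run_requests() helper are hypothetical, and the modulo rule for assigning a LUN to a worker is an illustrative choice, since the message only says that each LUN is assigned to one of the threads.

```python
import queue
import threading


class Lun:
    """Hypothetical LUN with its own lock guarding LUN-specific state."""

    def __init__(self, lun_id):
        self.lun_id = lun_id
        self.lock = threading.Lock()  # models the per-LUN lun_lock
        self.completed = 0


class Worker(threading.Thread):
    """Hypothetical worker thread owning a private request queue."""

    def __init__(self):
        super().__init__(daemon=True)
        # queue.Queue carries its own internal lock, which stands in
        # for the per-thread queue_lock.
        self.requests = queue.Queue()

    def run(self):
        while True:
            lun = self.requests.get()
            if lun is None:  # sentinel: no more work
                break
            with lun.lock:
                lun.completed += 1


def run_requests(num_workers, num_luns, per_lun):
    workers = [Worker() for _ in range(num_workers)]
    luns = [Lun(i) for i in range(num_luns)]
    for worker in workers:
        worker.start()
    for _ in range(per_lun):
        for lun in luns:
            # Assumed policy: a LUN always goes to the same worker.
            workers[lun.lun_id % num_workers].requests.put(lun)
    for worker in workers:
        worker.requests.put(None)
    for worker in workers:
        worker.join()
    return [lun.completed for lun in luns]
```

Because every LUN is pinned to a single worker, requests for different LUNs never contend on a shared queue, and no global lock is taken on the request path.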

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-25 17:02:01 +00:00
Alexander Motin
d309b227c5 Allow iSCSI immediate data to be consumed by several ctl_datamove() calls.
While for the FreeBSD client that is only a minor optimization, the VMware client
doesn't support additional data requests once all data has been sent as
immediate data.

MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2014-06-25 16:12:14 +00:00
Alexander Motin
50fe38b6b8 Execute task management request directly in ctl_queue() context.
On one hand, this allows removing the CTL_FLAG_TASK_PENDING flag, the handling of
which significantly complicates fine-grained locking.  On the other hand, it
reduces task management request latency even further than that flag could.
The downside is that task management code can no longer sleep, but that is
not needed now anyway.

Discussed with:	ken
2014-06-19 13:19:35 +00:00
Alexander Motin
ead2f11724 Add some more CTL_FLAG_ABORT check points.
This should make it possible to abort commands doing mostly disk I/O, such as VERIFY
or WRITE SAME.  Before this change CTL_FLAG_ABORT was only checked around
data moves, which for these commands may not happen for a very long time.

MFC after:	2 weeks
2014-06-19 12:43:41 +00:00
Alexander Motin
28b9e53b7d Increase CTL_DEVID_LEN from 16 to 64 bytes.
SPC-4 recommends that a T10 vendor ID based LUN ID be created by concatenating
the product name and serial number (and istgt follows that).  But the product name
is 16 bytes long by itself, so a total length of 16 bytes is clearly not enough
to fit both.

To keep compatibility with existing configurations, pad short device IDs
to old length of 16, same as before.
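
The compatibility rule can be sketched as follows. This is a Python illustration, not the CTL code: the make_device_id() helper is hypothetical, and padding with spaces is an assumption, since the message does not say which byte is used.

```python
OLD_DEVID_LEN = 16  # previous CTL_DEVID_LEN
NEW_DEVID_LEN = 64  # new CTL_DEVID_LEN


def make_device_id(devid: bytes, pad: bytes = b" ") -> bytes:
    """Pad short IDs to the old length; pass longer ones through unchanged.

    The pad byte is an assumption made for this sketch.
    """
    if len(devid) > NEW_DEVID_LEN:
        raise ValueError("device ID longer than %d bytes" % NEW_DEVID_LEN)
    if len(devid) < OLD_DEVID_LEN:
        return devid.ljust(OLD_DEVID_LEN, pad)
    return devid
```

An ID shorter than 16 bytes therefore looks the same as it did before the change, while a longer product name plus serial number is no longer truncated.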

This change probably breaks CTL user-level ABI, so control tools should
be rebuilt after this change.

MFC after:	2 weeks
2014-06-19 09:46:43 +00:00
Marius Strobl
de6a705e34 Don't denounce peripherals on system shutdown. Together with r267321,
we're now back to the pre-r228483 level of default verbosity. This in
turn again typically allows for reading information that userland might
have printed on the screen before initiating a halt, but still permits
debugging potential device shutdown problems on system shutdown via
CAM_DEBUG etc.

Reviewed by:	mav
MFC after:	3 days
Sponsored by:	Bally Wulff Games & Entertainment GmbH
2014-06-19 09:08:20 +00:00
Alexander Motin
9ad03ef5e7 Add iSCSI Target Name ID descriptor to VPD 83h.
It shall/should be there according to SPC-4, and istgt also provides it.

MFC after:	2 weeks
2014-06-19 08:13:53 +00:00
Edward Tomasz Napierala
f7d6790884 Rework session termination in iSCSI target to actually wait
for any outstanding commands to be properly aborted by CTL.
Without it, in some cases (such as files backing the LUNs
stored on failing disk drives), terminating a busy session
would result in panic.

Reviewed by:	mav@ (earlier version)
Sponsored by:	The FreeBSD Foundation
2014-06-18 17:13:18 +00:00
Edward Tomasz Napierala
2af142cafe Make cs_terminating a bool; no functional changes.
Sponsored by:	The FreeBSD Foundation
2014-06-17 09:02:10 +00:00
Edward Tomasz Napierala
57072b5118 Add comment explaining a potential problem with the just-added LUN ID.
Reminded by:	mav@
Sponsored by:	The FreeBSD Foundation
2014-06-16 19:05:51 +00:00
Edward Tomasz Napierala
a39adbef47 Add LUN-associated name to VPD, to make Hyper-V Failover Cluster happy.
Sponsored by:	The FreeBSD Foundation
2014-06-16 18:14:05 +00:00
Alexander Motin
11b569f7cb Add support for VERIFY(10/12/16) and COMPARE AND WRITE SCSI commands.
Make the backends' data_submit method support not only read and write requests,
but also two new ones: verify and compare.  Verify just checks readability
of the data in the specified location without transferring it outside.
Compare reads the specified data and compares it to the received data,
returning an error if they differ.

VERIFY(10/12/16) commands request either verify or compare from the backend,
depending on the BYTCHK CDB field.  The COMPARE AND WRITE command is executed in two
stages: first it requests compare, and then, if that succeeded, requests write.
Atomicity of the operation is guaranteed by the CTL request ordering code.
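
The two-stage COMPARE AND WRITE flow can be sketched against a toy in-memory backend. Everything below is hypothetical Python written for illustration (MemoryBackend, its methods and compare_and_write() are not CTL names), and the sketch is single-threaded: in CTL the atomicity comes from the request ordering code, which is not modelled here.

```python
class MemoryBackend:
    """Hypothetical backend storing sectors in a bytearray."""

    def __init__(self, num_sectors, sector_size=512):
        self.sector_size = sector_size
        self.data = bytearray(num_sectors * sector_size)

    def _span(self, lba, nbytes):
        if nbytes % self.sector_size != 0:
            raise ValueError("length is not a whole number of sectors")
        start = lba * self.sector_size
        end = start + nbytes
        if end > len(self.data):
            raise ValueError("range is beyond the end of the device")
        return start, end

    def verify(self, lba, count):
        # Only checks that the range is readable; nothing is transferred.
        self._span(lba, count * self.sector_size)
        return True

    def compare(self, lba, expected):
        start, end = self._span(lba, len(expected))
        return bytes(self.data[start:end]) == expected

    def write(self, lba, payload):
        start, end = self._span(lba, len(payload))
        self.data[start:end] = payload


def compare_and_write(backend, lba, expected, new_data):
    """Stage 1: compare. Stage 2: write, only if the compare succeeded."""
    if not backend.compare(lba, expected):
        return False
    backend.write(lba, new_data)
    return True
```

A second identical call fails its compare stage, because the first call has already replaced the data it expects to find.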

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-06-16 11:00:14 +00:00
Alexander Motin
e86a414238 Make backends track completion by processed number of sectors instead of
total transfer size.

Commands such as VERIFY or COMPARE AND WRITE may have a transfer size that
does not directly match the number of sectors.
2014-06-15 20:14:11 +00:00
Alexander Motin
66df9136e3 Remove memcpy() from ctl_private[] accesses.
That union is aligned enough to access data directly.
2014-06-15 18:16:51 +00:00
Alexander Motin
9c71cd5aae Move kern_total_len setting from backend to core code.
2014-06-15 17:14:52 +00:00
Alexander Motin
20a5f2d963 Format Portal Group Tag same as istgt does -- %4.4x instead of %x.
SPC-4 spec tells it should be "two or more hexadecimal digits".
RFC3720 tells it is 16-bit value.
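
The difference between the two format strings is easy to check. The sketch below uses Python's printf-style formatting, which accepts the same %4.4x conversion; the format_portal_group_tag() helper is a hypothetical name used only for this illustration.

```python
def format_portal_group_tag(tag: int) -> str:
    """Render a 16-bit portal group tag as four zero-padded hex digits."""
    if not 0 <= tag <= 0xFFFF:
        raise ValueError("portal group tag must fit in 16 bits")
    return "%4.4x" % tag


old_style = "%x" % 1                      # a single digit for small tags
new_style = format_portal_group_tag(1)    # always four digits
```

With %x a tag of 1 comes out as a single digit, which does not meet the "two or more hexadecimal digits" wording; the fixed-width form always produces four.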

MFC after:	2 weeks
2014-06-15 10:04:44 +00:00
Alexander Motin
eb3687a6a6 Remove custom processing for "file" option.
2014-06-15 09:37:06 +00:00
Alexander Motin
5777f09019 Respect "vendor" option in all places.
MFC after:	2 weeks
2014-06-15 08:43:52 +00:00
Alexander Motin
0c934f7f89 Add "vendor", "product" and "revision" options to control inquiry data.
MFC after:	2 weeks
2014-06-15 06:56:10 +00:00
Alexander Motin
ad9cb3314a Remove non-functional remnants of control LUN -- 18MB of RAM for nothing.
2014-06-14 20:25:14 +00:00
Alexander Motin
57a5db13b7 Implement a small KPI to access LUN options instead of doing it by hand.
MFC after:	2 weeks
2014-06-14 17:47:44 +00:00
Alexander Motin
9e005bbcc9 Fix some leaks on LUN creation error.
MFC after:	2 weeks
2014-06-12 21:50:46 +00:00
Warner Losh
dbb3f5b28b The code that combines adjacent ranges for BIO_DELETEs to optimize
trims to the device assumes the list is sorted. Don't apply the
optimization of not sorting the queue when we have SSDs to the
delete_queue, since it causes more discard traffic to the drive. While
one could argue that the higher levels should coalesce the trims,
that's not done today, so some optimization at this level is needed.
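
Why the sort matters can be shown with a small model of adjacent-range merging. This is a Python sketch, not the CAM code: coalesce_adjacent() is a hypothetical helper that, like the code described above, only merges a range with the one immediately before it in the list, and the (offset, length) tuples are made-up values.

```python
def coalesce_adjacent(ranges):
    """Merge each (offset, length) range with its predecessor when they touch."""
    merged = []
    for offset, length in ranges:
        if merged and merged[-1][0] + merged[-1][1] == offset:
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


# Four contiguous delete requests arriving out of order (illustrative values).
deletes = [(16, 8), (0, 8), (8, 8), (24, 8)]
unsorted_result = coalesce_adjacent(deletes)
sorted_result = coalesce_adjacent(sorted(deletes))
```

In arrival order the four requests collapse to three, while the sorted queue collapses to a single trim covering the whole range.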

CR: https://phabric.freebsd.org/D142
2014-06-05 17:13:42 +00:00
Alexander Motin
94fe9f959c - Add support for SG_GET_SG_TABLESIZE IOCTL to report that we don't support
scatter/gather lists.
- Return error for still unsupported SG 3.x API read/write calls.

MFC after:	1 month
2014-06-04 12:05:47 +00:00
Alexander Motin
fcaf473cfc Overhaul CAM SG driver IOCTL interfaces.
Make it really work for native FreeBSD programs.  Before this it was broken
for years due to different number of pointer dereferences in Linux and
FreeBSD IOCTL paths, permanently returning errors to FreeBSD programs.
This change breaks the driver's FreeBSD IOCTL ABI, making it more strict,
but since it was not working anyway, that hardly matters.

Add shims for 32-bit programs on 64-bit host, translating the argument
of the SG_IO IOCTL for both FreeBSD and Linux ABIs.

With this change I was able to run 32-bit Linux sg3_utils tools and simple
32 and 64-bit FreeBSD test tools on both 32 and 64-bit FreeBSD systems.

MFC after:	1 month
2014-06-02 19:53:53 +00:00
Edward Tomasz Napierala
8cd22f5edf Provide better descriptions for 'struct ctl_scsiio' fields; based mostly
on emails from ken@.
2014-05-04 15:35:04 +00:00
Alexander Motin
51ad63daae Respect MAXIMUM TRANSFER LENGTH field of Block Limits VPD page.
Nobody has yet reported a disk supporting I/Os smaller than our MAXPHYS value, but
since we already have code to read the Block Limits VPD page, honoring it is easy.

MFC after:	2 weeks
2014-04-30 19:44:31 +00:00
Alexander Motin
b28e753c93 Do not reread SCSI disk VPD pages on every device open.
Instead of rereading VPD pages on every device open, do it only on initial
device probe, and in cases when the device reported via UNIT ATTENTIONs that
something has changed.  Capacity is still reread on every open because
it is more critical for operation and more likely to change at run time.

In my tests with Intel 530 SSDs on an mps(4) HBA this change reduces the time
GEOM needs to retaste the device (that includes a few open/close cycles)
from ~150ms to ~30ms.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2014-04-30 17:38:26 +00:00
Alexander Motin
08a7cce543 Remove limits on size of READ/WRITE operations.
Instead of allocating up to 16MB of RAM at once to handle the whole I/O,
allocate up to 1MB at a time, but do multiple ctl_datamove() calls and storage
I/Os if needed.
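
The chunking idea can be sketched in a few lines. This is a Python illustration under stated assumptions: split_io() is a hypothetical helper, and each returned (offset, length) pair stands for one buffer allocation followed by one data move and one storage I/O.

```python
MAX_CHUNK = 1024 * 1024  # 1MB per allocation, as in the message above


def split_io(offset, length, max_chunk=MAX_CHUNK):
    """Split one large transfer into (offset, length) pieces of bounded size."""
    chunks = []
    while length > 0:
        step = min(length, max_chunk)
        chunks.append((offset, step))
        offset += step
        length -= step
    return chunks
```

A 2.5MB request becomes two full 1MB pieces plus a 512KB tail, so memory use stays bounded no matter how large the original request is.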
2014-04-24 16:19:49 +00:00
Alexander Motin
acf5bea460 Make CAM target CTL frontend respect SIM I/O size limitations.
If the datamove size is bigger than the SIM can handle, or it has more segments
than this code can handle -- split it into several CTIO requests.
2014-04-24 15:16:26 +00:00
Edward Tomasz Napierala
a7f6a46874 Modify CTL iSCSI frontend to properly handle situations where datamove
routine is called multiple times per SCSI task.

Sponsored by:	The FreeBSD Foundation
2014-04-24 12:54:35 +00:00
Alexander Motin
0fa3cb336b Disable UNMAP support for STEC 842 SSDs.
In some unknown cases UNMAP commands cause the device firmware to hang.

MFC after:	2 weeks
2014-04-23 19:50:35 +00:00
Edward Tomasz Napierala
8eab95d646 Properly pass the initiator address when running in proxy mode.
Sponsored by:	The FreeBSD Foundation
2014-04-16 11:00:10 +00:00
Edward Tomasz Napierala
6e4f347cd6 Make it possible to interrupt login when running in proxy mode.
Sponsored by:	The FreeBSD Foundation
2014-04-16 10:37:26 +00:00
Edward Tomasz Napierala
8cab2ed4cd Properly identify target portal when running in proxy mode. While here,
remove CTL_ISCSI_CLOSE, it wasn't used or implemented anyway.

Sponsored by:	The FreeBSD Foundation
2014-04-16 10:29:34 +00:00
Edward Tomasz Napierala
2ebde326cb Add some stuff to make it easier to figure out for the system administrator
whether the ICL_KERNEL_PROXY stuff got compiled in correctly.

Sponsored by:	The FreeBSD Foundation
2014-04-16 10:18:44 +00:00
Edward Tomasz Napierala
ba3a2d31c8 Make it possible for the iSCSI target side to operate in both normal
and ICL_KERNEL_PROXY mode, and fix some bit rot so the latter actually
works again.

Sponsored by:	The FreeBSD Foundation
2014-04-16 10:06:37 +00:00
Alexander Motin
2dfdd4ae19 Join CTL worker threads into one process for convenience.
Report their idle state as "-".
2014-04-13 11:10:36 +00:00
Alexander Motin
3710ae64b9 Report more readable state "-" for idle CAM scan thread.
2014-04-13 11:08:57 +00:00
Steven Hartland
43d0f063c2 Fix build breakage caused by r264295
X-MFC-With: r264295
MFC after:	1 week
2014-04-10 05:04:23 +00:00
Alexander Motin
7081bb15b0 Fix three refcounter leaks and lock recursion they covered.
MFC after:	1 week
2014-04-09 19:16:40 +00:00
Alexander Motin
004008d6e6 Introduce new serialization type CTL_SERIDX_UNMAP.
Unfortunately we can't check range collisions for UNMAP commands the way we
do for writes, because they include multiple ranges, which are also passed
in the data block, not in the CDB.  As a result, UNMAP commands have to be treated
as colliding with any other command accessing the media.

On the other hand, all UNMAPs are equal (we don't support the ANCHOR flag),
so we can execute several UNMAPs at the same time.
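
The resulting rule can be written down as a small decision function. This is a simplified Python sketch, not the CTL serialization table: the Cmd class and blocks() are hypothetical, only READ, WRITE and UNMAP are modelled, and the assumption that two overlapping reads do not block each other is made here for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cmd:
    kind: str                                 # "READ", "WRITE" or "UNMAP"
    extent: Optional[Tuple[int, int]] = None  # (lba, count); unused for UNMAP


def blocks(older: Cmd, newer: Cmd) -> bool:
    """Return True if newer must wait for older to finish."""
    if older.kind == "UNMAP" and newer.kind == "UNMAP":
        return False  # all UNMAPs are equal, so they may run together
    if "UNMAP" in (older.kind, newer.kind):
        return True   # ranges live in the data block, so assume a collision
    if older.kind == "READ" and newer.kind == "READ":
        return False  # assumption: concurrent reads are harmless
    # For reads and writes the extent is known from the CDB: check overlap.
    a_lba, a_count = older.extent
    b_lba, b_count = newer.extent
    return a_lba < b_lba + b_count and b_lba < a_lba + a_count
```

The UNMAP branches never look at extents at all, which is the point of the new serialization type: without the ranges in the CDB, the only safe answer against a media access is to wait.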
2014-04-09 10:58:52 +00:00
Alexander Motin
8f5a226a3c When splitting huge unmap requests, do it on a sector boundary.
2014-04-09 10:44:09 +00:00