821 Commits

Author SHA1 Message Date
imp
0dacc9824c MFC:
>r267118 | imp | 2014-06-05 11:13:42 -0600 (Thu, 05 Jun 2014) | 9 lines
>The code that combines adjacent ranges for BIO_DELETEs to optimize
>trims to the device assumes the list is sorted. Don't apply the
>optimization of not sorting the queue when we have SSDs to the
>delete_queue, since it causes more discard traffic to the drive. While
>one could argue that the higher levels should coalesce the trims,
>that's not done today, so some optimization at this level is needed.
>CR: https://phabric.freebsd.org/D142
2014-07-17 23:05:20 +00:00
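
A minimal sketch of why the sorted-list assumption matters for coalescing adjacent delete ranges. The `del_range` type and `coalesce_adjacent()` helper are hypothetical names for illustration and are not taken from the da/ada drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical delete range; the real code works on queued BIO_DELETE bios. */
struct del_range {
	uint64_t offset;	/* first block of the range */
	uint64_t length;	/* number of blocks */
};

/*
 * Merge each range into its predecessor when it starts exactly where the
 * predecessor ends.  Only neighbours in the array are examined, so the
 * result is optimal only if the input is sorted by offset.
 * Returns the number of ranges left in the array.
 */
static size_t
coalesce_adjacent(struct del_range *r, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return (0);
	for (size_t i = 1; i < n; i++) {
		if (r[i].offset == r[out].offset + r[out].length)
			r[out].length += r[i].length;	/* extend previous range */
		else
			r[++out] = r[i];		/* start a new range */
	}
	return (out + 1);
}
```

With the same four ranges, a sorted queue collapses to two ranges while an unsorted one stays at four, which is the extra discard traffic the message describes.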
mav
2c6230a1ad MFC r268240 (by ken):
Add persistent reservation support to camcontrol(8).

camcontrol(8) now supports a new 'persist' subcommand that allows users to
issue SCSI PERSISTENT RESERVE IN / OUT commands.
2014-07-15 17:26:43 +00:00
mav
1e9a14c2c4 MFC r268418:
Enable TAS feature: notify an initiator if its command was aborted by another.

That should make operation friendlier to multi-initiator environments.
Without this, other initiators may find out that something bad happened
to their commands only via command timeout.
2014-07-15 17:18:50 +00:00
mav
247c4d2053 MFC r268309:
Add support for SCSI Ports (88h) VPD page.
2014-07-15 17:09:52 +00:00
mav
75f2a98771 MFC r268103:
Add support for REPORT TIMESTAMP command.
2014-07-15 16:54:04 +00:00
mav
da977d4032 MFC r268096, r268306, r268361:
Add more formal and strict command parsing and validation.

For every supported command, define the CDB length and a mask of bits that
are allowed to be set.  This allows removing a bunch of checks throughout
the code while still making the validation stricter.  To do this properly
for commands supporting multiple service actions, formalize their parsing
by adding subtables for each such command.

As a visible effect, this change allows adding support for the REPORT
SUPPORTED OPERATION CODES command, which reports to the client all the data
about supported SCSI commands, except timeouts.
2014-07-15 16:53:04 +00:00
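
The table-driven validation described above could look roughly like the following sketch. The `cmd_entry` layout, the `cdb_is_valid()` helper, and the mask values are assumptions made for illustration; they are not CTL's actual tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CDB_LEN 16

/* Hypothetical table entry: opcode, CDB length, and per-byte allowed bits. */
struct cmd_entry {
	uint8_t opcode;
	uint8_t length;			/* CDB length in bytes */
	uint8_t usage[MAX_CDB_LEN];	/* bits allowed to be set in each byte */
};

/* Illustrative masks only; real masks come from the SCSI specifications. */
static const struct cmd_entry cmd_table[] = {
	/* TEST UNIT READY (00h), 6-byte CDB */
	{ 0x00, 6,  { 0xff, 0x00, 0x00, 0x00, 0x00, 0x07 } },
	/* READ(10) (28h), 10-byte CDB */
	{ 0x28, 10, { 0xff, 0x1a, 0xff, 0xff, 0xff, 0xff, 0x00, 0xff, 0xff, 0x07 } },
};

/* Reject unknown opcodes, wrong lengths, and any bit outside the mask. */
static bool
cdb_is_valid(const uint8_t *cdb, size_t len)
{
	if (len == 0)
		return (false);
	for (size_t i = 0; i < sizeof(cmd_table) / sizeof(cmd_table[0]); i++) {
		const struct cmd_entry *e = &cmd_table[i];

		if (e->opcode != cdb[0])
			continue;
		if (len != e->length)
			return (false);
		for (size_t b = 1; b < len; b++) {
			if (cdb[b] & (uint8_t)~e->usage[b])
				return (false);
		}
		return (true);
	}
	return (false);
}
```

Because the same table already records length and usage bits for each opcode, it is the kind of data a REPORT SUPPORTED OPERATION CODES response can be built from.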
mav
dcab9610fe MFC r267906:
Allow MODE SENSE commands through Write Exclusive persistent reservation,
as required by SPC-4.

Report that fact in persistent reservation capabilities.
2014-07-12 02:26:11 +00:00
mav
526d9691dc MFC r267051:
- Add support for SG_GET_SG_TABLESIZE IOCTL to report that we don't support
scatter/gather lists.
- Return error for still unsupported SG 3.x API read/write calls.
2014-07-04 15:09:56 +00:00
mav
2019810069 MFC r267537:
Add support for VERIFY(10/12/16) and COMPARE AND WRITE SCSI commands.

Make the backends' data_submit method support not only read and write
requests, but also two new ones: verify and compare.  Verify just checks
the readability of the data in the specified location without transferring
it outside.  Compare reads the specified data and compares it to the
received data, returning an error if they differ.

VERIFY(10/12/16) commands request either verify or compare from the backend,
depending on the BYTCHK CDB field.  The COMPARE AND WRITE command is executed
in two stages: first it requests compare, and then, if that succeeded,
requests write.  Atomicity of the operation is guaranteed by the CTL request
ordering code.

Sponsored by:	iXsystems, Inc.
2014-07-02 10:45:31 +00:00
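
The two-stage COMPARE AND WRITE flow can be sketched as below. This is a simplified userland illustration on a memory buffer with a hypothetical `compare_and_write()` helper; it does not model the CTL request ordering that provides atomicity in the real implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Stage 1: compare the stored data with the data the initiator expects.
 * Stage 2: only if they match, write the new data.
 * Returns 0 on success, -1 on miscompare (media left untouched).
 */
static int
compare_and_write(uint8_t *media, const uint8_t *expected,
    const uint8_t *newdata, size_t len)
{
	if (memcmp(media, expected, len) != 0)
		return (-1);
	memcpy(media, newdata, len);
	return (0);
}
```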
mav
927a52fbbf MFC r266981:
Overhaul CAM SG driver IOCTL interfaces.

Make it really work for native FreeBSD programs.  Before this it was broken
for years due to a different number of pointer dereferences in the Linux and
FreeBSD IOCTL paths, permanently returning errors to FreeBSD programs.
This change breaks the driver's FreeBSD IOCTL ABI, making it stricter,
but since it was not working anyway, that should bother no one.

Add shims for 32-bit programs on 64-bit host, translating the argument
of the SG_IO IOCTL for both FreeBSD and Linux ABIs.

With this change I was able to run 32-bit Linux sg3_utils tools and simple
32 and 64-bit FreeBSD test tools on both 32 and 64-bit FreeBSD systems.
2014-07-02 10:16:12 +00:00
mav
8c42300872 MFC r265159:
Respect MAXIMUM TRANSFER LENGTH field of Block Limits VPD page.

Nobody has yet reported a disk supporting I/Os smaller than our MAXPHYS value,
but since we already have code to read the Block Limits VPD page, that is easy.
2014-05-08 07:13:22 +00:00
mav
878513ca7d MFC r265150:
Do not reread SCSI disk VPD pages on every device open.

Instead of rereading VPD pages on every device open, do it only on initial
device probe, and in cases when the device reported via UNIT ATTENTIONs that
something has changed.  Capacity is still reread on every open because
it is more critical for operation and more likely to change at run time.

In my tests with Intel 530 SSDs on an mps(4) HBA this change reduces the time
GEOM needs to retaste the device (which includes a few open/close cycles)
from ~150ms to ~30ms.
2014-05-08 07:12:06 +00:00
mav
df5d1f3a9b MFC r264834:
Disable UNMAP support for STEC 842 SSDs.

In some unknown cases UNMAP commands cause the device firmware to get stuck.
2014-05-08 07:05:19 +00:00
mav
2d5dc4736b MFC r264274, r264279, r264283, r264296, r264297:
Add support for SCSI UNMAP commands to CTL.

This patch adds support for three new SCSI commands: UNMAP, WRITE SAME(10)
and WRITE SAME(16).  WRITE SAME commands support both normal write mode
and UNMAP flag.  To properly report UNMAP capabilities this patch also adds
support for reporting two new VPD pages: Block limits and Logical Block
Provisioning.

UNMAP support can be enabled per-LUN by adding "-o unmap=on" to `ctladm
create` command line or "option unmap on" to lun sections of /etc/ctl.conf.

At this moment UNMAP is supported for ramdisks and device-backed block LUNs.
It was tested to work great with ZFS ZVOLs.  For file-backed LUNs UNMAP
support is unfortunately missing due to the absence of a respective VFS KPI.

Sponsored by:   iXsystems, Inc
2014-05-08 07:00:45 +00:00
mav
246a5ae3a0 MFC r260509:
Replace several instances of -1 with appropriate CAM_*_WILDCARD and types.

It was equal before r259397, but for good or bad, not any more for LUNs.

This change fixes at least CAM debugging.
2014-05-08 06:55:48 +00:00
mav
71409ea2b1 MFC r264311 (by smh):
Fix build breakage caused by r264295
2014-04-16 15:27:14 +00:00
mav
b3571af592 MFC r264295:
Remove support of LUN-based CD changers from cd(4) driver.

This code was heavily broken a few months ago during the CAM locking changes.
Fixing it would require an almost complete rewrite.  Since there are no
known devices on the market using this interface younger than ~15 years, and
they are CD, not even DVD, I don't see much reason to rewrite it.

This change does not mean those devices won't work.  They will just work
slower due to an inefficient disk load/unload schedule if several LUNs
are accessed at the same time.
2014-04-16 10:04:19 +00:00
mav
3e6f6a1694 MFC r260267 (by smh), r261042:
Correct short delete issue in SCSI UNMAP support
Correct missing \n's in xpt_print's
Correct incorrect count being passed to short delete xpt_print
2014-01-29 02:38:25 +00:00
mav
5f170995ac MFC r260407:
Allow delete_method sysctl to be set to "DISABLE".
2014-01-20 23:56:49 +00:00
mav
204096a62c MFC r260541, r260547:
Take additional reference on SCSI probe periph to cover its freeze count.

Otherwise periph may be invalidated and freed before single-stepping freeze
is dropped, causing use after free panic.
2014-01-14 12:01:36 +00:00
mav
2e2af5808b MFC r256547 (by smh):
Added 4K quirks for Corsair Neutron GTX SSDs
2014-01-09 10:49:14 +00:00
scottl
cd4455d638 MFC Alexander Motin's direct dispatch, multi-queue, and finer-grained
locking support for CAM

r256826:
Fix several target mode SIMs to not blindly clear ccb_h.flags field of
ATIO CCBs.  Not all CCB flags there belong to them.

r256836:
Remove hard limit on number of BIOs handled with one ATA TRIM request.

r256843:
Merge CAM locking changes from the projects/camlock branch to radically
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.

r256888:
Unconditionally acquire periph reference on CCB allocation failure.

r256895:
Fix memory and references leak due to unfreed path.

r256960:
Move CAM_UNQUEUED_INDEX setting to the last moment and under the periph lock.
This fixes race condition with cam_periph_ccbwait(), causing use-after-free.

r256975:
Minor (mostly cosmetic) addition to r256960.

r257054:
Some microoptimizations for da and ada drivers:
 - Replace ordered_tag_count counter with single flag;
 - From da remove outstanding_cmds counter, duplicating pending_ccbs list;
 - From da_softc remove unused links field.

r257482:
Fix lock recursion, triggered by `smartctl -a /dev/adaX`.

r257501:
Make getenv_*() functions and respectively TUNABLE_*_FETCH() macros not
allocate memory and so not require sleepable environment.  getenv() has
already used on-stack temporary storage, so just use it more rationally.
getenv_string() receives buffer as argument, so don't need another one.

r257914:
Some CAM locks polishing:
 - Fix LOR and possible lock recursion when handling high-power commands.
Introduce new lock to protect left power quota and list of frozen devices.
 - Correct locking around xpt periph creation.
 - Remove the seemingly never used XPT_FLAG_OPEN xpt periph flag.

Again, Netflix assisted with testing the merge, but all of the credit goes
to Alexander and iX Systems.

Submitted by:	mav
Sponsored by:	iX Systems
2014-01-07 01:51:48 +00:00
scottl
0a34594b9c MFC Alexander Motin's GEOM direct dispatch work:
r256603:
Introduce new function devstat_end_transaction_bio_bt(), adding new argument
to specify present time.  Use this function to move binuptime() out of lock,
substantially reducing lock congestion when slow timecounter is used.

r256606:
Move g_io_deliver() out of the lock, as required for direct dispatch.
Move g_destroy_bio() out too to reduce lock scope even more.

r256607:
Fix passing uninitialized bio_resid argument to g_trace().

r256610:
Add unmapped I/O support to GEOM RAID.

r256830:
Restore BIO_UNMAPPED and BIO_TRANSIENT_MAPPING in biodone() when unmapping
a temporarily mapped buffer.  That fixes a double unmap if biodone() is called
twice for the same BIO (but with different done methods).

r256880:
Merge GEOM direct dispatch changes from the projects/camlock branch.

When safety requirements are met, it allows I/O requests to bypass the GEOM
g_up/g_down threads and be executed directly in the caller context.
That avoids CPU bottlenecks in the g_up/g_down threads, plus avoids
several context switches per I/O.

r259247:
Fix bug introduced at r256607.  We have to recalculate bp_resid here since
sizes of original and completed requests may differ due to end of media.

Testing of the stable/10 merge was done by Netflix, but all of the credit
goes to Alexander and iX Systems.

Submitted by:   mav
Sponsored by:   iX Systems
2014-01-07 01:32:23 +00:00
mav
53624fa293 MFC r259108:
When comparing device IDs, make sure that they have the same type
(like NAA assigned) and identify the same entity (like device or port).
Otherwise there can be false positives since at least some models of
Seagate disks use same IDs for the whole device and one of its ports.
2013-12-22 13:02:34 +00:00
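
A rough sketch of the stricter device ID comparison described above. The `dev_id` structure and `dev_ids_match()` helper are hypothetical; the type and association values follow the SPC Device Identification VPD page conventions (designator type 3h is NAA, association 0h is the logical unit, 1h is the target port).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, flattened view of one identification designator. */
struct dev_id {
	uint8_t type;	/* designator type, e.g. 3 = NAA */
	uint8_t assoc;	/* association, e.g. 0 = logical unit, 1 = target port */
	uint8_t len;	/* number of valid bytes in id[] */
	uint8_t id[16];
};

/*
 * Two IDs match only if they have the same type, identify the same kind
 * of entity, and carry identical identifier bytes.
 */
static bool
dev_ids_match(const struct dev_id *a, const struct dev_id *b)
{
	return (a->type == b->type && a->assoc == b->assoc &&
	    a->len == b->len && memcmp(a->id, b->id, a->len) == 0);
}
```

Comparing the bytes alone would report a match between a device ID and a port ID that happen to share the same value, which is the false positive the message mentions.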
nwhitehorn
ebd6e46afc MFC r257345,257382,257388:
Implement extended LUN support. If PIM_EXTLUNS is set by a SIM, encode
the upper 32-bits of the LUN, if possible, into the target_lun field as
passed directly from the REPORT LUNs response. This allows extended LUN
support to work for all LUNs with zeros in the lower 32-bits, which covers
most addressing modes without breaking KBI. Behavior for drivers not
setting PIM_EXTLUNS is unchanged. No user-facing interfaces are modified.

Extended LUNs are stored with swizzled 16-bit word order so that, for
devices implementing LUN addressing (like SCSI-2), the numerical
representation of the LUN is identical with and without PIM_EXTLUNS. Thus
setting PIM_EXTLUNS keeps most behavior, and user-facing LUN IDs, unchanged.
This follows the strategy used in Solaris. A macro (CAM_EXTLUN_BYTE_SWIZZLE)
is provided to transform a lun_id_t into a uint64_t ordered for the wire.

This is the second part of work for full 64-bit extended LUN support and is
designed to be a bridge for stable/10 to the final 64-bit LUN code. The
third and final part will involve widening lun_id_t to 64 bits and will
not be MFCed. This third part will break the KBI but will keep the KPI
unchanged so that all drivers that will care about this can be updated now
and not require code changes between HEAD and stable/10.

Reviewed by:	scottl
2013-12-10 22:55:22 +00:00
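
The "swizzled 16-bit word order" can be illustrated with the sketch below. It assumes the transformation is a reversal of the four 16-bit words of a 64-bit value; `lun_word_swizzle()` is a hypothetical stand-in and may differ in detail from the actual CAM_EXTLUN_BYTE_SWIZZLE macro.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reverse the order of the four 16-bit words in a 64-bit LUN.
 * With this layout a simple LUN such as 1 is stored as the integer 1,
 * while on the wire it occupies the first 16-bit word of the 8-byte field.
 */
static uint64_t
lun_word_swizzle(uint64_t lun)
{
	return (((lun & 0xffff000000000000ULL) >> 48) |
	    ((lun & 0x0000ffff00000000ULL) >> 16) |
	    ((lun & 0x00000000ffff0000ULL) << 16) |
	    ((lun & 0x000000000000ffffULL) << 48));
}
```

The operation is its own inverse, so the same helper converts in both directions between the stored and wire representations.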
mav
c896ef2ccc MFC r256552:
Unify periph invalidation and destruction reporting.
Print message containing device model and serial number on invalidation.

Approved by:	re (hrs)
2013-10-24 10:33:31 +00:00
scottl
e2e8dc4dbb Re-do r255853. Along with adding back the API/ABI changes from the
original, this hides the contents of cam_compat.h from ktrace/kdump/truss,
avoiding problems there.  There are no user-serviceable parts in there, so
no need for those tools to be groping around in there.

Approved by:	re
2013-09-25 15:55:56 +00:00
gjb
d965f28ba1 Revert r255853 pending fixes to build errors in usr.bin/kdump
Approved by:	re (implicit)
2013-09-25 01:48:45 +00:00
scottl
108b7070e7 Update the CAM API for FreeBSD 10:
- Remove the timeout_ch field.  It's been deprecated since FreeBSD 7.0;
  MPSAFE drivers should be managing their own timeout storage.  The
  remaining non-MPSAFE drivers have been modified to also manage their own
  storage, and should be considered for updating to MPSAFE (or removal)
  during the FreeBSD 10.x lifecycle.

- Add fields related to soft timeouts and quality of service, to be used
  in upcoming work.

- Add room for more flags in the CCB header and path_inq structures.

- Begin support for extended 64-bit LUNs.

- Bump the CAM version number to 0x18, but add compat shims.  Tested with
  camcontrol and smartctl.

Reviewed by:    nathanw, ken, kib
Approved by:    re
Obtained from:  Netflix
2013-09-24 16:50:53 +00:00
mav
a368e04207 Make SES driver adequately react on simple enclosure devices -- read Short
Enclosure status to enclosure status field, clear previous state and exit.
2013-09-06 15:41:37 +00:00
bryanv
4dc4ea3c0d Add camcontrol support for the SCSI sanitize command
Reviewed by:	ken, mjacob (earlier version)
Sponsored by:	Netapp
2013-09-06 15:19:57 +00:00
mav
9ec6975e73 Fix kernel panic if cache->nelms is zero.
MFC after:	2 weeks
2013-09-06 14:31:52 +00:00
mav
e56875d5c5 Bring legacy CAM target implementation back into an API/KPI-coherent
and even functional state.  While CTL is a far superior target in every
respect, there is no reason why this code should not work.

Tested with ahc(4) as target side HBA.

MFC after:	2 weeks
2013-09-01 13:01:59 +00:00
mav
554edd303f Fix SES_ENABLE_PASSTHROUGH kernel option, unexpectedly broken during driver
overhaul.

MFC after:	3 days
2013-09-01 12:18:44 +00:00
mav
be4931fc75 Fix targbh crash on XPT_IMMED_NOTIFY error during attach.
2013-09-01 11:50:37 +00:00
ken
5a498aa69f Bump up the default timeouts for move commands in the ch(4) driver
to 15 minutes, and 5 minutes for things like READ ELEMENT STATUS.

This is needed to account for the worst case scenarios on at least
some Spectra Logic tape libraries.

Sponsored by:	Spectra Logic
MFC after:	3 days
2013-08-29 21:25:27 +00:00
ken
6c5aea24dd If a drive returns ASC/ASCQ 0x04,0x11 "Logical unit not ready,
notify (enable spinup) required", instead of doing the normal
retries, poll for a change in status.

We will poll every half second for a minute for the status to
change.

Hitachi drives (and likely other SAS drives) return that ASC/ASCQ
when they are waiting to spin up.  What it means is that they are
waiting for the SAS expander to send them the SAS
NOTIFY (ENABLE SPINUP) primitive.

That primitive is the mechanism expanders/enclosures use to
sequence drive spinup to avoid overloading power supplies.

Sponsored by:	Spectra Logic
MFC after:	3 days
2013-08-27 19:47:03 +00:00
ken
281a193b53 Add support to physio(9) for devices that don't want I/O split and
configure sa(4) to request no I/O splitting by default.

For tape devices, the user needs to be able to clearly understand
what blocksize is actually being used when writing to a tape
device.  The previous behavior of physio(9) was that it would split
up any I/O that was too large for the device, or too large to fit
into MAXPHYS.  This means that if, for instance, the user wrote a
1MB block to a tape device, and MAXPHYS was 128KB, the 1MB write
would be split into 8 128K chunks.  This would be done without
informing the user.

This has suboptimal effects, especially when trying to communicate
status to the user.  In the event of an error writing to a tape
(e.g. physical end of tape) in the middle of a 1MB block that has
been split into 8 pieces, the user could have the first two 128K
pieces written successfully, the third returned with an error, and
the last 5 returned with 0 bytes written.  If the user is using
a standard write(2) system call, all he will see is the ENOSPC
error.  He won't have a clue how much actually got written.  (With
a writev(2) system call, he should be able to determine how much
got written in addition to the error.)

The solution is to prevent physio(9) from splitting the I/O.  The
new cdev flag, SI_NOSPLIT, tells physio that the driver does not
want I/O to be split beforehand.

Although the sa(4) driver now enables SI_NOSPLIT by default,
that can be disabled by two loader tunables for now.  It will not
be configurable starting in FreeBSD 11.0.  kern.cam.sa.allow_io_split
allows the user to configure I/O splitting for all sa(4) driver
instances.  kern.cam.sa.%d.allow_io_split allows the user to
configure I/O splitting for a specific sa(4) instance.

There are also now three sa(4) driver sysctl variables that let the
users see some sa(4) driver values.  kern.cam.sa.%d.allow_io_split
shows whether I/O splitting is turned on.  kern.cam.sa.%d.maxio shows
the maximum I/O size allowed by kernel configuration parameters
(e.g. MAXPHYS, DFLTPHYS) and the capabilities of the controller.
kern.cam.sa.%d.cpi_maxio shows the maximum I/O size supported by
the controller.

Note that a better long term solution would be to implement support
for chaining buffers, so that MAXPHYS is no longer a limiting
factor for I/O size to tape and disk devices.  At that point, the
controller and the tape drive would become the limiting factors.

sys/conf.h:	Add a new cdev flag, SI_NOSPLIT, that allows a
		driver to tell physio not to split up I/O.

sys/param.h:	Bump __FreeBSD_version to 1000049 for the addition
		of the SI_NOSPLIT cdev flag.

kern_physio.c:	If the SI_NOSPLIT flag is set on the cdev, fail with
		EFBIG (File too large) any I/O that is larger than
		si_iosize_max or MAXPHYS, has more than one segment,
		or would have to be split because of misalignment.

		In the event of an error, print a console message to
		give the user a clue about what happened.

scsi_sa.c:	Set the SI_NOSPLIT cdev flag on the devices created
		for the sa(4) driver by default.

		Add tunables to control whether we allow I/O splitting
		in physio(9).

		Explain in the comments that allowing I/O splitting
		will be deprecated for the sa(4) driver in FreeBSD
		11.0.

		Add sysctl variables to display the maximum I/O
		size we can do (which could be further limited by
		read block limits) and the maximum I/O size that
		the controller can do.

		Limit our maximum I/O size (recorded in the cdev's
		si_iosize_max) by MAXPHYS.  This isn't strictly
		necessary, because physio(9) will limit it to
		MAXPHYS, but it will provide some clarity for the
		application.

		Record the controller's maximum I/O size reported
		in the Path Inquiry CCB.

sa.4:		Document the block size behavior, and explain that
		the option of allowing physio(9) to split the I/O
		will disappear in FreeBSD 11.0.

Sponsored by:	Spectra Logic
2013-08-24 04:52:22 +00:00
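
The splitting arithmetic and the SI_NOSPLIT size check can be sketched as follows. This is a userland illustration only: `EX_MAXPHYS`, `chunks_needed()`, and `nosplit_check()` are hypothetical names, the 128KB value is the example figure from the message, and the segment-count and alignment conditions of the real check are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_MAXPHYS	(128 * 1024)	/* example MAXPHYS from the message */

/* Number of pieces an I/O is cut into when splitting is allowed. */
static size_t
chunks_needed(size_t io_len, size_t max_io)
{
	return ((io_len + max_io - 1) / max_io);
}

/*
 * Size part of the no-split check: when splitting is disabled, an I/O
 * larger than the smaller of the device limit and EX_MAXPHYS is rejected
 * with EFBIG instead of being silently split.
 */
static int
nosplit_check(size_t io_len, size_t iosize_max, bool nosplit)
{
	size_t limit = iosize_max < EX_MAXPHYS ? iosize_max : EX_MAXPHYS;

	if (nosplit && io_len > limit)
		return (EFBIG);
	return (0);
}
```

For the 1MB write in the example, splitting yields eight 128K pieces, whereas with splitting disabled the same write is refused up front so the application learns the block size is too large.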
trasz
16272df377 Fix the (unused for now) SCSI_PROTO_iSCSI define to match style(9).
2013-08-21 07:45:47 +00:00
ken
435f1b4a02 Add unmapped I/O and larger I/O support to the sa(4) driver.
We now pay attention to the maxio field in the XPT_PATH_INQ CCB,
and if it is set, propagate it up to physio via the si_iosize_max
field in the cdev structure.

We also now pay attention to the PIM_UNMAPPED capability bit in the
XPT_PATH_INQ CCB, and set the new SI_UNMAPPED cdev flag when the
underlying SIM supports unmapped I/O.

scsi_sa.c:	Add unmapped I/O support and propagate the SIM's
		maximum I/O size up.

		Adjust scsi_tape_read_write() in the same way that
		scsi_read_write() was changed to support unmapped
		I/O.  We overload the readop parameter with bits
		that tell us whether it's an unmapped I/O, and we
		need to set the CAM_DATA_BIO CCB flag.  This change
		should be backwards compatible in source and
		binary forms.

MFC after:	1 week
Sponsored by:	Spectra Logic
2013-08-16 16:14:32 +00:00
smh
2c42b706ab Added 4K quirks for:-
* OCZ Agility 2 SSDs
* Marvell SSDs
* Intel X25-M Series SSDs
2013-08-14 15:18:28 +00:00
mav
7ddb89a6c3 Improve r253721 by reporting detected lack of BIO_FLUSH support to GEOM.
That prevents more of such requests from coming and errors from logging.
2013-08-07 08:20:11 +00:00
mav
d5767e96c8 Add NO_RC16 quirk to make da driver avoid using READ CAPACITY(16) command
if possible.  Use it for Kingston JetFlash USB sticks, that are known to
return garbage in response to that command.
2013-07-30 13:00:09 +00:00
mav
5ee60ee612 Fix returning incorrect bio_resid value with failed BIO_DELETE requests.
Neither residual length reported for ATA/SCSI command nor one from another
BIO_DELETE request are in any way related to the value to be returned.
2013-07-28 19:56:08 +00:00
mav
9932d6357c Synchronize device cache on close only if there were some write operations.
While these operations are not really needed otherwise, at least for SCSI
they may cause extra errors if some other initiator holds write exclusive
reservation on the LUN (SYNCHRONIZE CACHE handled as "write" operation).
2013-07-27 22:44:55 +00:00
mav
37cdfcd8aa Oops, revert unwanted part of r253721.
2013-07-27 22:21:10 +00:00
mav
b7dc63ce7a Detect unsupported PREVENT ALLOW MEDIUM REMOVAL and SYNCHRONIZE CACHE(10)
to not spam devices with useless commands and logs with errors.
2013-07-27 22:19:34 +00:00
mav
1f49e221e7 Make some improvements to r253322 to really rescan target, not a bus.
Add there and in two more places checks for NULL on xpt_alloc_ccb_nowait().
2013-07-15 18:17:31 +00:00
ken
b8930f9894 Fix an argument reversal in calls to scsi_read_element_status().
Reported by:	Ulrich Spoerlein <uqs@FreeBSD.org>
MFC after:	3 days
2013-07-15 16:38:48 +00:00
mav
c60dcc15f1 When printing opcode description, map T_NODEVICE to Direct Access Device to
handle REPORT LUNS, etc.
2013-07-13 15:34:37 +00:00