This patch adds support for three new SCSI commands: UNMAP, WRITE SAME(10)
and WRITE SAME(16). WRITE SAME commands support both normal write mode
and UNMAP flag. To properly report UNMAP capabilities this patch also adds
support for reporting two new VPD pages: Block limits and Logical Block
Provisioning.
UNMAP support can be enabled per-LUN by adding "-o unmap=on" to `ctladm
create` command line or "option unmap on" to lun sections of /etc/ctl.conf.
At this moment UNMAP supported for ramdisks and device-backed block LUNs.
It was tested to work great with ZFS ZVOLs. For file-backed LUNs UNMAP
support is unfortunately missing due to absence of respective VFS KPI.
Reviewed by: ken
MFC after: 1 month
Sponsored by: iXsystems, Inc
This avoids extra locking in icl_pdu_queue(); the upper layer needs to call
it while holding its own lock anyway, to avoid sending PDUs out of order.
Sponsored by: The FreeBSD Foundation
them for actual target errors. They can be enabled back by setting
kern.cam.ctl.verbose=1, or booting with bootverbose.
Sponsored by: The FreeBSD Foundation
further refinement is required as some device drivers intended to be
portable over FreeBSD versions rely on __FreeBSD_version to decide whether
to include capability.h.
MFC after: 3 weeks
- Logical sector size is measured in words, not bytes.
- If physical sector is not bigger then logical sector, it does not mean
it should be set equal to 512 bytes, but set to logical sector.
PR: misc/187269
Submitted by: Ravi Pokala <rpokala@panasas.com>
MFC after: 1 week
The latest draft of SBC-3 tells: "A MAXIMUM UNMAP LBA COUNT field set to
a non-zero value indicates the maximum number of LBAs that may be unmapped
by an UNMAP command." To me it does not sound like that limit is set per
single descriptor, but rather per all command. And I have at least one
device that behaves exactly that way. This patch fixes the problem there.
MFC after: 1 week
originally.
I am not sure why exactly have I moved it during one of many refactorings
during camlock project, but obviously it opens race window that may cause
use after free panics during SIM (in reported cases umass(4)) detach.
MFC after: 2 weeks
support all valid SAM-5 LUN IDs. CAM_VERSION is bumped, as the CAM ABI
(though not API) is changed. No behavior is changed relative to r257345
except that LUNs with non-zero high 32 bits will no longer be ignored
during device enumeration for SIMs that have set PIM_EXTLUNS.
Reviewed by: scottl
(like NAA assigned) and identify the same entity (like device or port).
Otherwise there can be false positives since at least some models of
Seagate disks use same IDs for the whole device and one of its ports.
MFC after: 2 weeks
In its stead use the Solaris / illumos approach of emulating '-' (dash)
in probe names with '__' (two consecutive underscores).
Reviewed by: markj
MFC after: 3 weeks
option, unbreak the lock tracing release semantic by embedding
calls to LOCKSTAT_PROFILE_RELEASE_LOCK() direclty in the inlined
version of the releasing functions for mutex, rwlock and sxlock.
Failing to do so skips the lockstat_probe_func invokation for
unlocking.
- As part of the LOCKSTAT support is inlined in mutex operation, for
kernel compiled without lock debugging options, potentially every
consumer must be compiled including opt_kdtrace.h.
Fix this by moving KDTRACE_HOOKS into opt_global.h and remove the
dependency by opt_kdtrace.h for all files, as now only KDTRACE_FRAMES
is linked there and it is only used as a compile-time stub [0].
[0] immediately shows some new bug as DTRACE-derived support for debug
in sfxge is broken and it was never really tested. As it was not
including correctly opt_kdtrace.h before it was never enabled so it
was kept broken for a while. Fix this by using a protection stub,
leaving sfxge driver authors the responsibility for fixing it
appropriately [1].
Sponsored by: EMC / Isilon storage division
Discussed with: rstone
[0] Reported by: rstone
[1] Discussed with: philip
- Fix LOR and possible lock recursion when handling high-power commands.
Introduce new lock to protect left power quota and list of frozen devices.
- Correct locking around xpt periph creation.
- Remove seems never used XPT_FLAG_OPEN xpt periph flag.
the upper 32-bits of the LUN, if possible, into the target_lun field as
passed directly from the REPORT LUNs response. This allows extended LUN
support to work for all LUNs with zeros in the lower 32-bits, which covers
most addressing modes without breaking KBI. Behavior for drivers not
setting PIM_EXTLUNS is unchanged. No user-facing interfaces are modified.
Extended LUNs are stored with swizzled 16-bit word order so that, for
devices implementing LUN addressing (like SCSI-2), the numerical
representation of the LUN is identical with and without PIM_EXTLUNS. Thus
setting PIM_EXTLUNS keeps most behavior, and user-facing LUN IDs, unchanged.
This follows the strategy used in Solaris. A macro (CAM_EXTLUN_BYTE_SWIZZLE)
is provided to transform a lun_id_t into a uint64_t ordered for the wire.
This is the second part of work for full 64-bit extended LUN support and is
designed to a bridge for stable/10 to the final 64-bit LUN code. The
third and final part will involve widening lun_id_t to 64 bits and will
not be MFCed. This third part will break the KBI but will keep the KPI
unchanged so that all drivers that will care about this can be updated now
and not require code changes between HEAD and stable/10.
Reviewed by: scottl
MFC after: 2 weeks
- Replace ordered_tag_count counter with single flag;
- From da remove outstanding_cmds counter, duplicating pending_ccbs list;
- From da_softc remove unused links field.
information.
The existing algorithm selects a preferred leaf vdev based on offset of the zio
request modulo the number of members in the mirror. It assumes the devices are
of equal performance and that spreading the requests randomly over both drives
will be sufficient to saturate them. In practice this results in the leaf vdevs
being under utilized.
The new algorithm takes into the following additional factors:
* Load of the vdevs (number outstanding I/O requests)
* The locality of last queued I/O vs the new I/O request.
Within the locality calculation additional knowledge about the underlying vdev
is considered such as; is the device backing the vdev a rotating media device.
This results in performance increases across the board as well as significant
increases for predominantly streaming loads and for configurations which don't
have evenly performing devices.
The following are results from a setup with 3 Way Mirror with 2 x HD's and
1 x SSD from a basic test running multiple parrallel dd's.
With pre-fetch disabled (vfs.zfs.prefetch_disable=1):
== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 161 seconds @ 95 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 297 seconds @ 51 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 54 seconds @ 284 MB/s
With pre-fetch enabled (vfs.zfs.prefetch_disable=0):
== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 91 seconds @ 168 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 108 seconds @ 142 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 48 seconds @ 320 MB/s
In addition to the performance changes the code was also restructured, with
the help of Justin Gibbs, to provide a more logical flow which also ensures
vdevs loads are only calculated from the set of valid candidates.
The following additional sysctls where added to allow the administrator
to tune the behaviour of the load algorithm:
* vfs.zfs.vdev.mirror.rotating_inc
* vfs.zfs.vdev.mirror.rotating_seek_inc
* vfs.zfs.vdev.mirror.rotating_seek_offset
* vfs.zfs.vdev.mirror.non_rotating_inc
* vfs.zfs.vdev.mirror.non_rotating_seek_inc
These changes where based on work started by the zfsonlinux developers:
https://github.com/zfsonlinux/zfs/pull/1487
Reviewed by: gibbs, mav, will
MFC after: 2 weeks
Sponsored by: Multiplay
cam_periph_acquire() can return error if periph already invalidated, but
that may be unacceptable and cause deadlock if the invalidated periph can't
be destroyed without "executing" the scheduled request.
Coverity CID: 1109822
MFC after: 2 months
When safety requirements are met, it allows to avoid passing I/O requests
to GEOM g_up/g_down thread, executing them directly in the caller context.
That allows to avoid CPU bottlenecks in g_up/g_down threads, plus avoid
several context switches per I/O.
The defined now safety requirements are:
- caller should not hold any locks and should be reenterable;
- callee should not depend on GEOM dual-threaded concurency semantics;
- on the way down, if request is unmapped while callee doesn't support it,
the context should be sleepable;
- kernel thread stack usage should be below 50%.
To keep compatibility with GEOM classes not meeting above requirements
new provider and consumer flags added:
- G_CF_DIRECT_SEND -- consumer code meets caller requirements (request);
- G_CF_DIRECT_RECEIVE -- consumer code meets callee requirements (done);
- G_PF_DIRECT_SEND -- provider code meets caller requirements (done);
- G_PF_DIRECT_RECEIVE -- provider code meets callee requirements (request).
Capable GEOM class can set them, allowing direct dispatch in cases where
it is safe. If any of requirements are not met, request is queued to
g_up or g_down thread same as before.
Such GEOM classes were reviewed and updated to support direct dispatch:
CONCAT, DEV, DISK, GATE, MD, MIRROR, MULTIPATH, NOP, PART, RAID, STRIPE,
VFS, ZERO, ZFS::VDEV, ZFS::ZVOL, all classes based on g_slice KPI (LABEL,
MAP, FLASHMAP, etc).
To declare direct completion capability disk(9) KPI got new flag equivalent
to G_PF_DIRECT_SEND -- DISKFLAG_DIRECT_COMPLETION. da(4) and ada(4) disk
drivers got it set now thanks to earlier CAM locking work.
This change more then twice increases peak block storage performance on
systems with manu CPUs, together with earlier CAM locking changes reaching
more then 1 million IOPS (512 byte raw reads from 16 SATA SSDs on 4 HBAs to
256 user-level threads).
Sponsored by: iXsystems, Inc.
MFC after: 2 months
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones:
- per-LUN locks to protect device and peripheral drivers state;
- per-target locks to protect list of LUNs on target;
- per-bus locks to protect reference counting;
- per-send queue locks to protect queue of CCBs to be sent;
- per-done queue locks to protect queue of completed CCBs;
- remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance
reasons) to take SIM lock. The opposite acquisition order is forbidden.
All the other locks are leaf locks, that can be taken anywhere, but should
not be cascaded. Many functions, such as: xpt_action(), xpt_done(),
xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM
lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all
xpt_async() calls in addition to xpt_done() calls are queued to completion
threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing
before, use multiple (depending on number of CPUs) threads. Load balanced
between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have
sufficient number of completion threads to efficiently scale to multiple
CPUs can use new function xpt_done_direct() to avoid extra context switch.
Make ahci(4) driver to use this mechanism depending on hardware setup.
Sponsored by: iXsystems, Inc.
MFC after: 2 months
registered simultaneously. Due to topology unlock between the ID allocation
and the bus registration there is a chance that two buses may get the
same IDs. That is supposed reason of lock assertion panic in CAM during
initial bus scanning after new iscsid initiates two sessions same time.
Reported by: trasz
Approved by: re (glebus, marius)
MFC after: 2 weeks
CAM_EXTLUN_VALID is not erroneously set. Also add an XPORT_SRP
identifier to the known SCSI transports for the SCSI RDMA protocol, as
used, for example with Infiniband storage.
Reviewed by: scottl
Approved by: re (marius)
original, this hides the contents of cam_compat.h from ktrace/kdump/truss,
avoiding problems there. There are no user-servicable parts in there, so
no need for those tools to be groping around in there.
Approved by: re
- Remove the timeout_ch field. It's been deprecated since FreeBSD 7.0;
MPSAFE drivers should be managing their own timeout storage. The
remaining non-MPSAFE drivers have been modified to also manage their own
storage, and should be considered for updating to MPSAFE (or removal)
during the FreeBSD 10.x lifecycle.
- Add fields related to soft timeouts and quality of service, to be used
in upcoming work.
- Add room for more flags in the CCB header and path_inq structures.
- Begin support for extended 64-bit LUNs.
- Bump the CAM version number to 0x18, but add compat shims. Tested with
camcontrol and smartctl.
Reviewed by: nathanw, ken, kib
Approved by: re
Obtained from: Netflix
functional state. While CTL is much more superior target from all points,
there is no reason why this code should not work.
Tested with ahc(4) as target side HBA.
MFC after: 2 weeks
to 15 minutes, and 5 minutes for things like READ ELEMENT STATUS.
This is needed to account for the worst case scenarios on at least
some Spectra Logic tape libraries.
Sponsored by: Spectra Logic
MFC after: 3 days
notify (enable spinup) required", instead of doing the normal
retries, poll for a change in status.
We will poll every half second for a minute for the status to
change.
Hitachi drives (and likely other SAS drives) return that ASC/ASCQ
when they are waiting to spin up. What it means is that they are
waiting for the SAS expander to send them the SAS
NOTIFY (ENABLE SPINUP) primitive.
That primitive is the mechanism expanders/enclosures use to
sequence drive spinup to avoid overloading power supplies.
Sponsored by: Spectra Logic
MFC after: 3 days
configure sa(4) to request no I/O splitting by default.
For tape devices, the user needs to be able to clearly understand
what blocksize is actually being used when writing to a tape
device. The previous behavior of physio(9) was that it would split
up any I/O that was too large for the device, or too large to fit
into MAXPHYS. This means that if, for instance, the user wrote a
1MB block to a tape device, and MAXPHYS was 128KB, the 1MB write
would be split into 8 128K chunks. This would be done without
informing the user.
This has suboptimal effects, especially when trying to communicate
status to the user. In the event of an error writing to a tape
(e.g. physical end of tape) in the middle of a 1MB block that has
been split into 8 pieces, the user could have the first two 128K
pieces written successfully, the third returned with an error, and
the last 5 returned with 0 bytes written. If the user is using
a standard write(2) system call, all he will see is the ENOSPC
error. He won't have a clue how much actually got written. (With
a writev(2) system call, he should be able to determine how much
got written in addition to the error.)
The solution is to prevent physio(9) from splitting the I/O. The
new cdev flag, SI_NOSPLIT, tells physio that the driver does not
want I/O to be split beforehand.
Although the sa(4) driver now enables SI_NOSPLIT by default,
that can be disabled by two loader tunables for now. It will not
be configurable starting in FreeBSD 11.0. kern.cam.sa.allow_io_split
allows the user to configure I/O splitting for all sa(4) driver
instances. kern.cam.sa.%d.allow_io_split allows the user to
configure I/O splitting for a specific sa(4) instance.
There are also now three sa(4) driver sysctl variables that let the
users see some sa(4) driver values. kern.cam.sa.%d.allow_io_split
shows whether I/O splitting is turned on. kern.cam.sa.%d.maxio shows
the maximum I/O size allowed by kernel configuration parameters
(e.g. MAXPHYS, DFLTPHYS) and the capabilities of the controller.
kern.cam.sa.%d.cpi_maxio shows the maximum I/O size supported by
the controller.
Note that a better long term solution would be to implement support
for chaining buffers, so that that MAXPHYS is no longer a limiting
factor for I/O size to tape and disk devices. At that point, the
controller and the tape drive would become the limiting factors.
sys/conf.h: Add a new cdev flag, SI_NOSPLIT, that allows a
driver to tell physio not to split up I/O.
sys/param.h: Bump __FreeBSD_version to 1000049 for the addition
of the SI_NOSPLIT cdev flag.
kern_physio.c: If the SI_NOSPLIT flag is set on the cdev, return
any I/O that is larger than si_iosize_max or
MAXPHYS, has more than one segment, or would have
to be split because of misalignment with EFBIG.
(File too large).
In the event of an error, print a console message to
give the user a clue about what happened.
scsi_sa.c: Set the SI_NOSPLIT cdev flag on the devices created
for the sa(4) driver by default.
Add tunables to control whether we allow I/O splitting
in physio(9).
Explain in the comments that allowing I/O splitting
will be deprecated for the sa(4) driver in FreeBSD
11.0.
Add sysctl variables to display the maximum I/O
size we can do (which could be further limited by
read block limits) and the maximum I/O size that
the controller can do.
Limit our maximum I/O size (recorded in the cdev's
si_iosize_max) by MAXPHYS. This isn't strictly
necessary, because physio(9) will limit it to
MAXPHYS, but it will provide some clarity for the
application.
Record the controller's maximum I/O size reported
in the Path Inquiry CCB.
sa.4: Document the block size behavior, and explain that
the option of allowing physio(9) to split the I/O
will disappear in FreeBSD 11.0.
Sponsored by: Spectra Logic
We now pay attention to the maxio field in the XPT_PATH_INQ CCB,
and if it is set, propagate it up to physio via the si_iosize_max
field in the cdev structure.
We also now pay attention to the PIM_UNMAPPED capability bit in the
XPT_PATH_INQ CCB, and set the new SI_UNMAPPED cdev flag when the
underlying SIM supports unmapped I/O.
scsi_sa.c: Add unmapped I/O support and propagate the SIM's
maximum I/O size up.
Adjust scsi_tape_read_write() in the same way that
scsi_read_write() was changed to support unmapped
I/O. We overload the readop parameter with bits
that tell us whether it's an unmapped I/O, and we
need to set the CAM_DATA_BIO CCB flag. This change
should be backwards compatible in source and
binary forms.
MFC after: 1 week
Sponsored by: Spectra Logic
Change CCB queue resize logic to be able safely handle overallocations:
- (re)allocate queue space in power of 2 chunks with 64 elements minimum
and never shrink it; with only 4/8 bytes per element size is insignificant.
- automatically reallocate the queue to double size if it is overflowed.
- if queue reallocation failed, store extra CCBs in unsorted TAILQ,
fetching them back as soon as some queue element is freed.
To free space in CCB for TAILQ linking, change highpowerq from keeping
high-power CCBs to keeping devices frozen due to high-power CCBs.
This encloses all pieces of queue resize logic inside of cam_queue.[ch],
removing some not obvious duties from xpt_release_ccb().
While these operations are not really needed otherwise, at least for SCSI
they may cause extra errors if some other initiator holds write exclusive
reservation on the LUN (SYNCHRONIZE CACHE handled as "write" operation).
Add a PIM_NOSCAN flag to the CAM path inquiry CCB. This tells CAM
not to perform a rescan on a bus when it is registered.
We now use this flag in the mps(4) driver. Since it knows what
devices it has attached, it is more efficient for it to just issue
a target rescan on the targets that are attached.
Also, remove the private rescan thread from the mps(4) driver in
favor of the rescan thread already built into CAM. Without this
change, but with the change above, the MPS scanner could run before
or during CAM's initial setup, which would cause duplicate device
reprobes and announcements.
sys/param.h:
Bump __FreeBSD_version to 1000039 for the inclusion of the
PIM_RESCAN CAM path inquiry flag.
sys/cam/cam_ccb.h:
sys/cam/cam_xpt.c:
Added a PIM_NOSCAN flag. If a SIM sets this in the path
inquiry ccb, then CAM won't rescan the bus in
xpt_bus_regsister.
sys/dev/mps/mps_sas.c
For versions of FreeBSD that have the PIM_NOSCAN path
inquiry flag, don't freeze the sim queue during scanning,
because CAM won't be scanning this bus. Instead, hold
up the boot. Don't call mpssas_rescan_target in
mpssas_startup_decrement; it's redundant and I don't
know why it was in there.
Set PIM_NOSCAN in path inquiry CCBs.
Remove methods related to the internal rescan daemon.
Always use async events to trigger a probe for EEDP support.
In older versions of FreeBSD where AC_ADVINFO_CHANGED is
not available, use AC_FOUND_DEVICE and issue the
necessary READ CAPACITY manually.
Provide a path to xpt_register_async() so that we only
receive events for our own SCSI domain.
Improve error reporting in cases where setup for EEDP
detection fails.
sys/dev/mps/mps_sas.h:
Remove softc flags and data related to the scanner thread.
sys/dev/mps/mps_sas_lsi.c:
Unconditionally rescan the target whenever a device is added.
Sponsored by: Spectra Logic
MFC after: 1 week
"Logical unit not supported" errors. First initiates specific target rescan,
second -- destroys specific LUN. That allows to automatically detect changes
in list of device LUNs. This mechanism doesn't work when target is completely
idle, but probably that is all what can be done without active polling.
Reviewed by: ken
Sponsored by: iXsystems, Inc.
changers that don't support the DVCID and CURDATA bits that were
introduced in the SMC spec.
These changers will return an Illegal Request type error if the
bits are set. This causes "chio status" to fail.
The fix is two-fold. First, for changers that claim to be SCSI-2
or older, don't set the DVCID and CURDATA bits for READ ELEMENT
STATUS. For newer changers (SCSI-3 and newer), we default to
setting the new bits, but back off and try the READ ELEMENT STATUS
without the bits if we get an Illegal Request type error.
This has been tested on a Qualstar TLS-8211, which is a SCSI-2
changer that does not support the new bits, and a Spectra T-380,
which is a SCSI-3 changer that does support the new bits. In the
absence of a SCSI-3 changer that does not support the bits, I
tested that with some error injection code. (The SMC spec says
that support for CURDATA is mandatory, and DVCID is optional.)
scsi_ch.c: Add a new quirk, CH_Q_NO_DVCID that gets set for
SCSI-2 and older libraries, or newer libraries that
report errors when the DVCID/CURDATA bits are set.
In chgetelemstatus(), use the new quirk to
determine whether or not to set DVCID and CURDATA.
If we get an error with the bits set, back off and
try without the bits. Set the quirk flag if the
read element status succeeds without the bits set.
Increase the READ ELEMENT STATUS timeout to 60
seconds after testing with a Spectra T-380. The
previous value was 10 seconds, and too short for
the T-380. This may be decreased later after
some additional testing and investigation.
Tested by: Andre Albsmeier <Andre.Albsmeier@siemens.com>
Sponsored by: Spectra Logic
MFC after: 3 days
Ensure that d_delmaxsize is always set, removing init to 0 which could cause
future issues if use cases change.
Allow kern.cam.da.X.delete_max (which maps to d_delmaxsize) to be increased
up to the calculated max after being reduced.
MFC after: 1 day
X-MFC-With: r249940
needed for the last 10 years. Far too much of the internal API is
exposed, and every small adjustment causes applications to stop working.
To kick this off, bump the API version to 0x17 as should have been done
with r246713, but add shims to compensate. Thanks to the shims, there
should be no visible change in application behavior.
I have plans to do a significant overhaul of the API to harnen it for
the future, but until then, I welcome others to add shims for older
versions of the API.
Obtained from: Netflix
SPC-4 specification states that serial number may be property of device,
but not a specific logical unit. People reported about FC storages using
serial number in that way, making it unusable for purposes of LUN multipath
detection. SPC-4 states that designators associated with logical unit from
the VPD page 83h "Device Identification" should be used for that purpose.
Report first of them in the new attribute in such preference order: NAA,
EUI-64, T10 and SCSI name string.
While there, make GEOM DISK properly report GEOM::ident in XML output also
using d_getattr() method, if available. This fixes serial numbers reporting
for SCSI disks in `geom disk list` output and confxml.
Discussed with: gibbs, ken
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
While GEOM in general has provider opened while sending BIO_GETATTR,
GEOM DISK does not really need to open disk to read medium-unrelated
attributes for own use.
Proposed by: ken
Re-ordered SSD quirks alphabetically so they are easier to maintain.
Removed my email and PR reference from comments on each quirk.
Added quirks for more SSDs:
* Crucial M4
* Corsair Force GT
* Intel 520 Series
* Kingston E100 Series
* Samsung 830 Series
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 1 week
This prevents users from selecting a delete method which may cause
corruption e.g. MPS WS16 on pre P14 firmware.
Reviewed by: pjd (mentor)
Approved by: pjd (mentor)
MFC after: 2 days
With "cached read" HDD testing and multiple ports busy on a SATA
host controller, 3726/3826 PMP will very rarely drop a deferred
R_OK that was intended for the host. Symptom will be all 5 drives
under test will timeout, get reset, and recover.
Submitted by: Rich Futyma <rich.futyma@sanmina.com>
MFC after: 2 weeks
- remove DA_FLAG_SAW_MEDIA flag, almost opposite to DA_FLAG_PACK_INVALID,
using the last instead.
- allow opening device with no media present, reporting zero media size
and non-zero sector size, as geom/notes suggests. That allow to read
device attributes and potentially do other things, not related to media.
to query ATA functionality via ATA Pass-Through (16) as this page is defined
as "must" for SATL devices, hence indicating that the device is at least
likely to support Pass-Through (16).
This eliminates errors produced by CTL when ATA Pass-Through (16) fails.
Switch ATA probe daerror call to SF_NO_PRINT to avoid errors printing out
for devices which return invalid errors.
Output details about supported and choosen delete method when verbose booted.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 1 week
Ensure that delete_available is reset so re-probes after a media change,
to one with different delete characteristics, will result in the correct
methods being flagged as available.
Make all ccb state changes use a consistent flow:
* free()
* xpt_release_ccb()
* softc->state = <new state>
* xpt_schedule()
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 1 week
Remove ADA_FLAG_PACK_INVALID flag. Since ATA disks have no concept of media
change it only duplicates CAM_PERIPH_INVALID flag, so we can use last one.
Slightly cleanup DA_FLAG_PACK_INVALID use.
Give periph validity flag own periph reference. That slightly simplifies
the release logic and covers hypothetical case if lock is dropped inside
the periph_oninval() method.
requests.
sys/geom/geom_disk.h:
- Added d_delmaxsize which represents the maximum size of individual
device delete requests in bytes. This can be used by devices to
inform geom of their size limitations regarding delete operations
which are generally different from the read / write limits as data
is not usually transferred from the host to physical device.
sys/geom/geom_disk.c:
- Use new d_delmaxsize to calculate the size of chunks passed through to
the underlying strategy during deletes instead of using read / write
optimised values. This defaults to d_maxsize if unset (0).
- Moved d_maxsize default up so it can be used to default d_delmaxsize
sys/cam/ata/ata_da.c:
- Added d_delmaxsize calculations for TRIM and CFA
sys/cam/scsi/scsi_da.c:
- Added re-calculation of d_delmaxsize whenever delete_method is set.
- Added kern.cam.da.X.delete_max sysctl which allows the max size for
delete requests to be limited. This is useful in preventing timeouts
on devices who's delete methods are slow. It should be noted that
this limit is reset then the device delete method is changed and
that it can only be lowered not increased from the device max.
Reviewed by: mav
Approved by: pjd (mentor)
maximum sizes for said methods, which are used when processing BIO_DELETE
requests. This includes updating UNMAP support discovery to be based on
SBC-3 T10/1799-D Revision 31 specification.
Added ATA TRIM support to cam scsi devices via ATA Pass-Through(16)
sys/cam/scsi/scsi_da.c:
- Added ATA Data Set Management TRIM support via ATA Pass-Through(16)
as a delete_method
- Added four new probe states used to identity available methods and their
limits for the processing of BIO_DELETE commands via both UNMAP and the
new ATA TRIM commands.
- Renamed Probe states to better indicate their use
- Added delete method descriptions used when informing user of issues.
- Added automatic calculation of the optimum delete mode based on which
method presents the largest maximum request size as this is most likely
to result in the best performance.
- Added WRITE SAME max block limits
- Updated UNMAP range generation to mirror that used by ATA TRIM, this
optimises the generation of ranges and fixes a potential overflow
issue in the count when combining multiple BIO_DELETE requests
- Added output of warnings about short deletes. This should only ever
be triggered on devices that fail to correctly advertise their supported
delete modes / max sizes.
- Fixed WS16 requests being incorrectly limited to 65535 in length.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
so its available for use in generic scsi code.
This is a pre-requirement for using VPD queries to determine available SCSI
delete methods within scsi_da.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
commands to an ATA device attached via a SCSI control.
sys/cam/scsi/scsi_all.c:
- Added scsi_ata_identify, scsi_ata_trim
Which use ATA Pass-Through to send commands to the attached disk.
sys/cam/scsi/scsi_all.h:
- Added defines for all missing ATA Pass-Through commands values.
- Added scsi_ata_identify, scsi_ata_trim methods used in ATA TRIM
support.
- Added scsi_vpd_logical_block_prov structure used when querying for
the supported sizes UNMAP commands.
- Added scsi_vpd_block_limits structure used when querying for the
supported sizes of the UNMAP command.
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
This allows users who boot without loader to adjust their environments
around slightly buggy or slow hardware.
PR: kern/161809
Submitted by: rozhuk.im@gmail.com
MFC after: 2 weeks
This allows mapping a tape drive in a changer (as reported by
'chio status') to a sa(4) driver instance by comparing the
serial numbers.
The designators can be ASCII (which is printed out directly), binary
(which is printed in hex format) or UTF-8, which is printed in either
native UTF-8 format if the terminal can support it, or in %XX notation
for non-ASCII characters. Thanks to Hiroki Sato <hrs@> for the
explaining UTF-8 printing and example UTF-8 printing code.
chio.h: Modify the changer_element_status structure to add new
fields and definitions from the SMC3r16 spec.
Rename the original CHIOGSTATUS ioctl to OCHIOGTATUS and
define a new CHIOGSTATUS ioctl.
Clean up some tab/space issues.
chio.c: For the 'status' subcommand, print the designator field
if it is supplied by a device.
scsi_ch.h: Add new flags for DVCID and CURDATA to the READ
ELEMENT STATUS command structure.
Add a read_element_status_device_id structure
for the data fields in the new standard. Add new
unions, dt_or_obsolete and voltage_devid, to hold
and address data from either SCSI-2 or newer devices.
scsi_ch.c: Implement support for fetching device IDs with READ
ELEMENT STATUS data.
Add new arguments to scsi_read_element_status() to
allow the user to request the DVCID and CURDATA bits.
This isn't compiled into libcam (it's only an internal
kernel interface), so we don't need any special
handling for the API change.
If the user issues the new CHIOGSTATUS ioctl, copy all of
the available element status data out. If he issues the
OCHIOGSTATUS ioctl, we don't copy the new fields in the
structure.
Fix a bug in chopen() that would result in the peripheral
never getting unheld if chgetparams() failed.
Sponsored by: Spectra Logic
Submitted by: Po-Li Soong
MFC After: 1 week
Stop abusing xpt_periph in random plases that really have no periph related
to CCB, for example, bus scanning. NULL value is fine in such cases and it
is correctly logged in debug messages as "noperiph". If at some point we
need some real XPT periphs (alike to pmpX now), quite likely they will be
per-bus, and not a single global instance as xpt_periph now.
r248917, r248918, r248978, r249001, r249014, r249030:
Remove multilevel freezing mechanism, implemented to handle specifics of
the ATA/SATA error recovery, when post-reset recovery commands should be
allocated when queues are already full of payload requests. Instead of
removing frozen CCBs with specified range of priorities from the queue
to provide free openings, use simple hack, allowing explicit CCBs over-
allocation for requests with priority higher (numerically lower) then
CAM_PRIORITY_OOB threshold.
Simplify CCB allocation logic by removing SIM-level allocation queue.
After that SIM-level queue manages only CCBs execution, while allocation
logic is localized within each single device.
Suggested by: gibbs
and kern.cam.ctl.disable tunable; those were introduced as a workaround
to make it possible to boot GENERIC on low memory machines.
With ctl(4) being built as a module and automatically loaded by ctladm(8),
this makes CTL work out of the box.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
Some failing disks tend to return vendor-specific ASC/ASCQ codes with
NOT READY sense key. It caused extremely long recovery attempts, repeating
these 120 TURs (it takes at least 1 minute) for every I/O request.
Instead of that use default error handling, doing just few retries.
Reviewed by: ken, gibbs
MFC after: 1 month
references to it.
This is the functional equivalent to change r237518, which added this
functionality to the cd(4) and da(4) drivers.
This fix prevents a panic caused by GEOM calling adaopen() while the device
is going away. We now keep the device around until GEOM has finished
cleaning up its state.
ata_da.c: In adaregister(), add a d_gone callback to the GEOM disk
structure registered for the ada driver. Increment the
peripheral reference count for GEOM.
Add a new callback, adadiskgonecb(), that GEOM calls when
it is done with its resources. This callback releases the
reference acquired in adaregister().
Submitted by: Po-Li Soong
Sponsored by: Spectra Logic
MFC After: 5 days
the LUN was never freed.
ctl.c: Adjust ctl_alloc_lun() to make sure we don't clear the
CTL_LUN_MALLOCED flag.
Reported by: Sreenivasa Honnur <shonnur@chelsio.com>
Sponsored by: Spectra Logic
MFC after: 3 days
option left but actually consumed by ada(4), so move it to opt_ada.h
and get rid of opt_ata.h.
- Fix stand-alone build of atacore(4) by adding opt_cam.h.
- Use __FBSDID.
- Use DEVMETHOD_END.
- Use NULL instead of 0 for pointers.
r249017:
Some cosmetic things:
- Unify device to target insertion inside xpt_alloc_device() instead of
duplicating it three times.
- Remove extra checks for empty lists of devices and targets on release
since zero refcount check also implies it.
- Reformat code to reduce indentation.
r249103:
- Add lock assertions to every point where reference counters are modified.
- When reference counters are reaching zero, add assertions that there are
no children items left.
- Add a bit more locking to the xptpdperiphtraverse().
Move CAM_DEBUG_CDB messages from the point of queuing to the point of
sending to SIM. That allows to inspect real requests execution order,
respecting priorities, freezing, etc.
MFC after: 2 weeks
most kernels before FreeBSD 9.0. Remove such modules and respective kernel
options: atadisk, ataraid, atapicd, atapifd, atapist, atapicam. Remove the
atacontrol utility and some man pages. Remove useless now options ATA_CAM.
No objections: current@, stable@
MFC after: never
copied in from userspace. This fixes instant panic when creating CTL LUN
on sparc64. Not a security problem, since the API is root-only.
Reviewed by: ken
Sponsored by: FreeBSD Foundation
sys/cam/scsi/scsi_all.c:
- Added scsi_ata_pass_16 method
Which use ATA Pass-Through to send commands to the attached disk.
sys/cam/scsi/scsi_all.h:
- Added defines for all missing ATA Pass-Through commands values.
- Added scsi_ata_pass_16 method.
- Fixed a comment typo while I'm here
Reviewed by: mav
Approved by: pjd (mentor)
MFC after: 2 weeks
CAM. This can significantly improve performance particularly for SSDs
which don't suffer from seek latencies.
The sysctl / tunable kern.cam.sort_io_queues provides the systems default
setting where:-
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted (default)
Each device gets its own sysctl kern.cam.<type>.<id>.sort_io_queue
Valid values are:-
-1 = use system default (default)
0 = queued BIOs are NOT sorted
1 = queued BIOs are sorted
Note: Additional patch will look to add automatic use of none sorted queues
for none rotating media e.g. SSD's
Reviewed by: scottl
Approved by: pjd (mentor)
MFC after: 2 weeks
but execute the commands in regular way. There is no any reason to cook CPU
while the system is still fully operational. After this change polling in
CAM is used only for kernel dumping.
driver's periphs, acquiring and releaseing periph references while doing it.
Use it to iterate over the lists of ada and da periphs when flushing caches
and putting devices to sleep on shutdown and suspend. Previous code could
panic in theory if some device disappear in the middle of the process.
Before this change they were just leaked. Fortunately USB sticks now use
only one CCB, and so leak was only 2KB per detach, while other bigger SIMs
with much more allocated CCBs are rarely detached.
MFC after: 2 weeks
for the r248519:
For the cam-attached HBAs, allow the driver to specify that it accepts
the unmapped bio by the PIM_UNMAPPED flag. The CAM passes the
CAM_DATA_BIO data transfer type request for the unmapped bio, and the
driver could use the bus_dmamap_load_ccb() as a helper to
transparently handle the ccb.
Sponsored by: The FreeBSD Foundation
Reviewed by: scottl
Tested by: pho, scottl
The vnode-backed md(4) has to map the unmapped bio because VOP_READ()
and VOP_WRITE() interfaces do not allow to pass unmapped requests to
the filesystem. Vnode-backed md(4) uses pbufs instead of relying on
the bio_transient_map, to avoid usual md deadlock.
Sponsored by: The FreeBSD Foundation
Tested by: pho, scottl
tunable by default.
This will allow GENERIC configurations to boot on small memory boxes, but
not require end users who want to use CTL to recompile their kernel. They
can simply set kern.cam.ctl.disable=0 in loader.conf.
The eventual solution to the memory usage problem is to change the way
CTL allocates memory to be more configurable, but this should fix things
for small memory situations in the mean time.
UPDATING: Explain the change in the CTL configuration, and
how users can enable CTL if they would like to use
it.
sys/conf/options: Add a new option, CTL_DISABLE, that prevents CTL
from initializing.
ctl.c: If CTL_DISABLE is turned on, don't initialize.
i386/conf/GENERIC,
amd64/conf/GENERIC: Re-enable device ctl, and add the CTL_DISABLE
option.
PREVENT ALLOW MEDIUM REMOVAL commands return errors on these devices
without returning sense data. In some cases unrelated following commands
start to return errors too, that makes device to be dropped by CAM.
every architecture's busdma_machdep.c. It is done by unifying the
bus_dmamap_load_buffer() routines so that they may be called from MI
code. The MD busdma is then given a chance to do any final processing
in the complete() callback.
The cam changes unify the bus_dmamap_load* handling in cam drivers.
The arm and mips implementations are updated to track virtual
addresses for sync(). Previously this was done in a type specific
way. Now it is done in a generic way by recording the list of
virtuals in the map.
Submitted by: jeff (sponsored by EMC/Isilon)
Reviewed by: kan (previous version), scottl,
mjacob (isp(4), no objections for target mode changes)
Discussed with: ian (arm changes)
Tested by: marius (sparc64), mips (jmallet), isci(4) on x86 (jharris),
amd64 (Fabian Keil <freebsd-listen@fabiankeil.de>)
Make umass return an error code if SCSI sense retrieval request
has failed. Make sure scsi_error_action honors SF_NO_RETRY and
SF_NO_RECOVERY in all cases, even if it cannot parse sense bytes.
Reviewed by: hselasky (umass), scottl (cam)
to avoid sending extra READ CAPACITY requests by dastart(). Schedule periph
again on reprobe completion, or otherwise it may stuck indefinitely long.
This should fix USB explore thread hanging on device unplug, waiting for
periph destruction.
Reported by: hselasky
and da_default_timeout where their current hardcoded values matched the current
default value for said tunables.
PR: kern/169976
Reviewed by: pjd (mentor)
Approved by: mav
DISKFLAG_CANDELETE. While this change makes this layer consistent
other layers such as UFS and ZFS BIO_DELETE support may not notice
any change made manually via these device sysctls until the device
is reopened via a mount.
Also corrected var order in dadeletemethodsysctl
PR: kern/169801
Reviewed by: pjd (mentor)
Approved by: mav
MFC after: 2 weeks
Previously CTL would leave individual LUNs enabled in the target
driver, whether or not the port as a whole was enabled. It would
also leave the wildcard LUN enabled indefinitely.
This change means that CTL will enable and disable any active LUNs,
as well as the wildcard LUN, when enabling and disabling a port.
Also, fix a bug that could crop up due to an uninitialized CCB
type.
ctl.c: Before calling ctl_frontend_online(), run through
the LUN list and enable all active LUNs.
After calling ctl_frontend_offline(), run through
the LUN list and disble all active LUNs.
scsi_ctl.c: Before bringing a port online, allocate the
wildcard peripheral for that bus. And after taking
a port offline, invalidate the wildcard peripheral
for that bus.
Make sure that we hold the SIM lock around all
calls to xpt_action() and other transport layer
interfaces that require it.
Use CAM_SIM_{LOCK|UNLOCK} consistently to acquire
and release the SIM lock.
Update a number of outdated comments. Some of
these should have been fixed long ago.
Actually do LUN disbables now. The newer drivers
in the tree work correctly for this as far as I
know.
Initialize the CCB type to CTLFE_CCB_DEFAULT to
avoid a panic due to uninitialized memory.
Submitted by: Chuck Tuffli (partially)
MFC after: 1 week
ctl_frontend_cam_sim.c: Coalesce cfcs_online() and cfcs_offline()
into a single function since these were
identical except for one line.
Make sure we hold the SIM lock around path
creation, and calling xpt_rescan().
scsi_ctl.c: In ctlfe_onoffline(), make sure we hold the
SIM lock around path creation and free
calls, as well as xpt_action().
In ctlfe_lun_enable(), hold the SIM lock
around path and peripheral operations that
require it.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
while doing a copyout. That can cause a panic, because copyout
can trigger VM faults, and we can't handle VM faults while holding
a mutex.
The solution here is to malloc a separate buffer to hold the OOA
queue entries, so that we don't risk a VM fault while filling up
the buffer and we don't have to drop the lock. The other solution
would be to wire the user's memory while filling their buffer with
copyout, but that would have been a little more complex.
Also fix a debugging parenthesis issue in ctl_abort_task() pointed
out by Chuck Tuffli.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
drivers.
The bug occurrs when a userland process has the driver instance
open and the underlying device goes away. We get the devfs
callback that the device node has been destroyed, but not all of
the closes necessary to fully decrement the reference count on the
CAM peripheral.
The reason is that once devfs calls back and says the device has
been destroyed, it is moved off to deadfs, and devfs guarantees
that there will be no more open or close calls. So the solution
is to keep track of how many outstanding open calls there are on
the device, and just release that many references when we get the
callback from devfs.
scsi_pass.c,
scsi_enc.c,
scsi_enc_internal.h: Add an open count to the softc in these
drivers. Increment it on open and
decrement it on close.
When we get a devfs callback to say that
the device node has gone away, decrement
the peripheral reference count by the
number of still outstanding opens.
Make sure we don't access the peripheral
with cam_periph_unlock() after what might
be the final call to
cam_periph_release_locked(). The
peripheral might have been freed, and we
will be dereferencing freed memory.
scsi_ch.c,
scsi_sg.c: For the ch(4) and sg(4) drivers, add the
same changes described above, and in
addition, fix another bug that was
previously fixed in the pass(4) and enc(4)
drivers.
These drivers were calling destroy_dev()
from their cleanup routine, but that could
cause a deadlock because the cleanup
routine could be indirectly called from
the driver's close routine. This would
cause a deadlock, because the device node
is being held open by the active close
call, and can't be destroyed.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The problem was a race condition between the EDT traversal used by
things like 'camcontrol devlist', and CAM peripheral driver
removal.
The EDT traversal code holds the CAM topology lock, and wants
to show devices that have been invalidated. It acquires a
reference to the peripheral to make sure the peripheral it is
examining doesn't go away.
However, because the peripheral removal code in camperiphfree()
drops the CAM topology lock to call the peripheral's destructor
routine, we can run into a situation where the EDT traversal
increments the peripheral reference count after free process is
already in progress. At that point, the reference count is
ignored, because it was 0 when we started the process.
Fix this race by setting a flag, CAM_PERIPH_FREE, that I previously
added and checked in xptperiphtraverse() and xptpdperiphtravsere(),
but failed to use. If the EDT traversal code sees that flag,
it will know that the peripheral free process has already started,
and that it should not access that peripheral.
Also, fix an inconsistency in the locking between
xptpdperiphtraverse() and xptperiphtraverse(). They now both
hold the CAM topology lock while calling the peripheral traversal
function.
cam_xpt.c: Change xptperiphtraverse() to hold the CAM topology
lock across calls to the traversal function.
Take out the comment in xptpdperiphtraverse() that
referenced the locking inconsistency.
cam_periph.c: Set the CAM_PERIPH_FREE flag when we are in the
process of freeing a peripheral driver.
Sponsored by: Spectra Logic Corporation
MFC after: 1 week
The device reports support for SATA Asynchronous Notification in its
IDENTIFY data, but returns error on attempt to enable that feature.
Make SATA XPT of CAM only report these errors, but not fail the device.
MFC after: 1 week
Element Descriptor page if it is not supported. This removes one error
message from verbose logs during boot on systems with some enclosures.
Sponsored by: iXsystems, Inc.
safe in some cases to reduce CCB priority after it was scheduled with high
priority. This fixes reproducible deadlock when command sent through the
pass interface while ATA XPT recovers from command timeout.
Instead of that enforce priority at passioctl(). libcam provides no obvious
interface to specify CCB priority and so much (all?) code specifies zero
(highest) priority. This change limits pass CCBs priority to NORMAL run
level, allowing XPT to complete bus and device recovery after reset before
running any payload.
In particular, do not lock Giant conditionally when calling into the
filesystem module, remove the VFS_LOCK_GIANT() and related
macros. Stop handling buffers belonging to non-mpsafe filesystems.
The VFS_VERSION is bumped to indicate the interface change which does
not result in the interface signatures changes.
Conducted and reviewed by: attilio
Tested by: pho
System time is set later on boot process then initial bus scan by CAM.
Until that moment microtime() is equal to microuptime(), and if system
boots quickly, the value can be close to zero. That causes settle time
waiting even for buses that don't use reset during probe.
On my test system this reduces boot time by 1 second if USB enabled, or
by 4 seconds if USB disabled. CAM waited for ctl2cam0 bus "settle".
- Extend the lock to cover xpt_path_release() for the new path.
- While xpt_action() is called while holding right SIM lock for the new
bus, the old path release may require different SIM lock. So we have
to temporary drop the new lock and get the old one.
without holding SIM lock. It really doesn't need that lock, but adding it
removes that specific exception, allowing to assert locking there later.
Submitted by: ken@ (earlier version)
It is required to store extra recovery requests in case of bus resets.
On ATA/SATA this fixes assertion panics on HEAD with INVARIANTS enabled or
possible memory corruptions otherwise if timeout/reset happens when device
CCB queue is already full.
Reported by: gibbs@
MFC after: 1 week
returns zero while request status is not CAM_REQ_CMP. That could cause
partial device attach or other unexpected results.
Found by: Clang Static Analyzer
drivers:
- Remove scsi_low_pisa.*, they were unused.
- Remove <compat/netbsd/physio_proc.h> and calls to the stubs in that
header. They were empty nops.
- Retire sl_xname and use device_get_nameunit() and device_printf() with
the underlying device_t instead.
- Remove unused {ct,ncv,nsp,stg}print() functions.
- Remove empty SOFT_INTR_REQUIRED() macro and the unused sl_irq member.
NetBSD/pc98 was never merged into the main NetBSD tree and is no longer
developed. Adding locking to these drivers would have made the compat
shims hard to impossible to maintain, so remove the shims to ease
future changes.
These changes were verified by md5. Some additional shims can be removed
that do affect the compiled results that I will probably do in another
round.
Approved by: nyan (tentatively)
of this hardware still running (close to twenty years now).
2. Quiesece and use ENC_VLOG instead of ENC_LOG for most
complaints. That is, they're visible with bootverbose, but
otherwise quiesced and not repeatedly spamming messages
with constant reminders that hardware in this space is
rarely fully compliant.
MFC after: 1 month
'encapsulating interface' used with IPsec and has nothing to do with
storage 'enclosure' services.
MFC after: 3 days
Noticed while: debugging why enc(4) is no longer automatically created
It includes three parts:
1) Modifications to CAM to detect media media changes and report them to
disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes
Asynchronous Notification mechanism to receive events from hardware.
Active polling with TEST UNIT READY commands with 3 seconds period is used
for incapable hardware. After that both CD and DA drivers work the same way,
detecting two conditions: "NOT READY: Medium not present" after medium was
detected previously, and "UNIT ATTENTION: Not ready to ready change, medium
may have changed". First one reported to disk(9) as media removal, second
as media insert/change. To reliably receive second event new
AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by
generic error handling code in cam_periph_error().
2) Modifications to GEOM core to handle media remove and change events.
Media removal handled by spoiling all consumers attached to the provider.
Media change event also schedules provider retaste after spoiling to probe
new media. New flag G_CF_ORPHAN was added to consumers to reflect that
consumer is in process of destruction. It allows retaste to create new
geom instance of the same class, while previous one is still dying.
3) Modifications to some GEOM classes: DEV -- to report media change
events to devd; VFS -- to handle spoiling same as orphan to prevent
accessing replaced media. PART class already handles spoiling alike to
orphan.
Reviewed by: silence on geom@ and scsi@
Tested by: avg
Sponsored by: iXsystems, Inc. / PC-BSD
MFC after: 2 months
data pointer. This is a temp fix that resubmits the
command, adjusted, so that the backend can fetch the
data again.
Sponsored by: Spectralogic
MFC after: 1 month
Just free inclomplete daemon cache instead to let it retry next time.
Premature ses_softc_cleanup() caused NULL dereference when freed softc
was accessed later.
Renamed the kern.cam.ada.ada_send_ordered sysctl and tunable to
kern.cam.ada.send_ordered, more in line with the other da sysctls/tunables.
Suggested by: kib
kern.cam.da.send_ordered, more in line with the other da sysctls/tunables.
PR: 169765
Submitted by: Steven Hartland <steven.hartland@multiplay.co.uk>
Reviewed by: mav
a CD or DVD drive with a damaged disc often benefit from a shorter
timeout. Also, when retries are set to 0, an application is expecting
errors and recovering them so do not print the error into the log.
The number of expected errors can literally be in the hundreds of
thousands which significantly slows data recovery.
Reviewed by: ken@ (but quite some time ago).
to attach to target capable HBAs that implement the old immediate
notify (XPT_IMMED_NOTIFY) and notify acknowledge (XPT_NOTIFY_ACK)
CCBs. The new API has been in place since SVN change 196008 in
2009.
The solution is two-fold: fix CTL to handle the responses from the
HBAs, and convert the HBA drivers in question to use the new API.
These drivers have not been tested with CTL, so how well they will
interoperate with CTL is unknown.
scsi_target.c: Update the userland target example code to use the
new immediate notify API.
scsi_ctl.c: Detect when an immediate notify CCB is returned
with CAM_REQ_INVALID or CAM_PROVIDE_FAIL status,
and just free it.
Fix a duplicate assignment.
aic79xx.c,
aic79xx_osm.c: Update the aic79xx driver to use the new API.
Target mode is not enabled on for this driver, so
the changes will have no practical effect.
aic7xxx.c,
aic7xxx_osm.c: Update the aic7xxx driver to use the new API.
sbp_targ.c: Update the firewire target code to work with the
new API.
mpt_cam.c: Update the mpt(4) driver to work with the new API.
Target mode is only enabled for Fibre Channel
mpt(4) devices.
MFC after: 3 days
a da(4) instance going away while GEOM is still probing it.
In this case, the GEOM disk class instance has been created by
disk_create(), and the taste of the disk is queued in the GEOM
event queue.
While that event is queued, the da(4) instance goes away. When the
open call comes into the da(4) driver, it dereferences the freed
(but non-NULL) peripheral pointer provided by GEOM, which results
in a panic.
The solution is to add a callback to the GEOM disk code that is
called when all of its resources are cleaned up. This is
implemented inside GEOM by adding an optional callback that is
called when all consumers have detached from a provider, and the
provider is about to be deleted.
scsi_cd.c,
scsi_da.c: In the register routine for the cd(4) and da(4)
routines, acquire a reference to the CAM peripheral
instance just before we call disk_create().
Use the new GEOM disk d_gone() callback to register
a callback (dadiskgonecb()/cddiskgonecb()) that
decrements the peripheral reference count once GEOM
has finished cleaning up its resources.
In the cd(4) driver, clean up open and close
behavior slightly. GEOM makes sure we only get one
open() and one close call, so there is no need to
set an open flag and decrement the reference count
if we are not the first open.
In the cd(4) driver, use cam_periph_release_locked()
in a couple of error scenarios to avoid extra mutex
calls.
geom.h: Add a new, optional, providergone callback that
is called when a provider is about to be deleted.
geom_disk.h: Add a new d_gone() callback to the GEOM disk
interface.
Bump the DISK_VERSION to version 2. This probably
should have been done after a couple of previous
changes, especially the addition of the d_getattr()
callback.
geom_disk.c: Add a providergone callback for the disk class,
g_disk_providergone(), that calls the user's
d_gone() callback if it exists.
Bump the DISK_VERSION to 2.
geom_subr.c: In g_destroy_provider(), call the providergone
callback if it has been provided.
In g_new_geomf(), propagate the class's
providergone callback to the new geom instance.
blkfront.c: Callers of disk_create() are supposed to pass in
DISK_VERSION, not an explicit disk API version
number. Update the blkfront driver to do that.
disk.9: Update the disk(9) man page to include information
on the new d_gone() callback, as well as the
previously added d_getattr() callback, d_descr
field, and HBA PCI ID fields.
MFC after: 5 days
defect information it has before grabbing the full defect list.
This works around a bug with some Hitachi drives that generate data overrun
errors when they are asked for more defect data than they have.
The change is done in a spec-compliant way, so it should have no negative
impact on drives that don't have this issue.
This is based on work originally done at Sandvine.
scsi_da.h: Add a define for the maximum amount of data that can be
contained in a defect list.
camcontrol.c: Update the readdefects() function to issue an initial
command to determine the length of the defect list, and
then use that length in the request for the full defect
list.
camcontrol.8: Add a note that some drives will report 0 defects available
if you don't request either the PLIST or GLIST.
Submitted by: Mark Johnston <markjdb@gmail.com> (original version)
MFC after: 3 days
done queue. Clearing it before caused extra SIM queueing in some cases.
It was invisible during normal operation, but during USB device unplug and
respective SIM destruction it could keep pointer on SIM without having
counted reference and as result crash the system by use afer free.
Reported by: hselasky
MFC after: 1 week
invalidated while open, cam_periph_hold() will return error and won't
get the reference. Following reference release will crash the system.
Sponsored by: iXsystems, Inc.
MFC after: 3 days
the pass(4) and enc(4) drivers and devfs.
The pass(4) driver uses the destroy_dev_sched() routine to
schedule its device node for destruction in a separate thread
context. It does this because the passcleanup() routine can get
called indirectly from the passclose() routine, and that would
cause a deadlock if the close routine tried to destroy its own
device node.
In any case, once a particular passthrough driver number, e.g.
pass3, is destroyed, CAM considers that unit number (3 in this
case) available for reuse.
The problem is that devfs may not be done cleaning up the previous
instance of pass3, and will panic if isn't done cleaning up the
previous instance.
The solution is to get a callback from devfs when the device node
is removed, and make sure we hold a reference to the peripheral
until that happens.
Testing exposed some other cases where we have reference counting
issues, and those were also fixed in the pass(4) driver.
cam_periph.c: In camperiphfree(), reorder some of the operations.
The peripheral destructor needs to be called before
the peripheral is removed from the peripheral is
removed from the list. This is because once we
remove the peripheral from the list, and drop the
topology lock, the peripheral number may be reused.
But if the destructor hasn't been called yet, there
may still be resources hanging around (like devfs
nodes) that haven't been fully cleaned up.
cam_xpt.c: Add an argument to xpt_remove_periph() to indicate
whether the topology lock is already held.
scsi_enc.c: Acquire an extra reference to the peripheral during
registration, and release it once we get a callback
from devfs indicating that the device node is gone.
Call destroy_dev_sched_cb() in enc_oninvalidate()
instead of calling destroy_dev() in the cleanup
routine.
scsi_pass.c: Add reference counting to handle peripheral and
devfs object lifetime issues.
Add a reference to the peripheral and the devfs
node in the peripheral registration.
Don't attempt to add a physical path alias if the
peripheral has been marked invalid.
Release the devfs reference once the initial
physical path alias taskqueue run has completed.
Schedule devfs node destruction in the
passoninvalidate(), and release our peripheral
reference in a new routine, passdevgonecb() once
the devfs node is gone. This allows the peripheral
to fully go away, and the peripheral destructor,
passcleanup(), will get called.
MFC after: 3 days
Sponsored by: Spectra Logic
reporting. It includes:
- removing of error messages controlled by bootverbose, replacing them
with more universal and informative debugging on CAM_DEBUG_INFO level,
that is now built into the kernel by default;
- more close following to the arguments submitted by caller, such as
SF_PRINT_ALWAYS, SF_QUIET_IR and SF_NO_PRINT; consumer knows better which
errors are usual/expected at this point and which are really informative;
- adding two new flags SF_NO_RECOVERY and SF_NO_RETRY to allow caller
specify how much assistance it needs at this point; previously consumers
controlled that by not calling cam_periph_error() at all, but that made
behavior inconsistent and debugging complicated;
- tuning debug messages and taken actions order to make debugging output
more readable and cause-effect relationships visible;
- making camperiphdone() (common device recovery completion handler) to
also use cam_periph_error() in most cases, instead of own dumb code;
- removing manual sense fetching code from cam_periph_error(); I was told
by number of people that it is SIM obligation to fetch sense data, so this
code is useless and only significantly complicates recovery logic;
- making ada, da and pass driver to use cam_periph_error() with new limited
recovery options to handle error recovery and debugging in common way;
as one of results, CAM_REQUEUE_REQ and other retrying statuses are now
working fine with pass driver, that caused many problems before.
- reverting r186891 by raj@ to avoid burning few seconds in tight DELAY()
loops on device probe, while device simply loads media; I think that problem
may already be fixed in other way, and even if it is not, solution must be
different.
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
CAM_DEBUG_CDB, CAM_DEBUG_PERIPH and CAM_DEBUG_PROBE) by default.
List of these flags can be modified with CAM_DEBUG_COMPILE kernel option.
CAMDEBUG kernel option still enables all possible debug, if not overriden.
Additional 50KB of kernel size is a good price for the ability to debug
problems without rebuilding the kernel. In case where size is important,
debugging can be compiled out by setting CAM_DEBUG_COMPILE option to 0.
until transport will do some probe actions (at least soft reset).
Make ATA/SATA SIMs to not report bogus and confusing PROTO_ATA protocol.
Make ATA/SATA transport to fill that gap by reporting protocol to SIM with
XPT_SET_TRAN_SETTINGS and patching XPT_GET_TRAN_SETTINGS results if needed.
figure out domain, etc..
Zero ATIO and INOTify allocations. It makes for much
less guesswork when looking at the structure and
seeing 'deadc0de' present.
Reviewed by: kdm
MFC after: 2 weeks
Sponsored by: Spectralogic
via `camcontrol tags ... -N ...`. There is no need to tune it in
usual cases, but some users want to have it for debugging purposes.
MFC after: 2 weeks
are handled in most CAM peripheral drivers that are not handled by
GEOM's disk class.
The usual character driver open and close semantics are that the
driver gets N open calls, but only one close, when the last caller
closes the device.
CAM peripheral drivers expect that behavior to be honored to the
letter, and the CAM peripheral driver code (specifically
cam_periph_release_locked_busses()) panics if it is done incorrectly.
Since devfs has to drop its locks while it calls a driver's close
routine, and it does not have a way to delay or prevent open calls
while it is calling the close routine, there is a race.
The sequence of events, simplified a bit, is:
- devfs acquires a lock
- devfs checks the reference count, and if it is 1, continues to close.
- devfs releases the lock
- 2nd process open call on the device happens here
- devfs calls the driver's close routine
- devfs acquires a lock
- devfs decrements the reference count
- devfs releases the lock
- 2nd process close call on the device happens here
At the second close, we get a panic in
cam_periph_release_locked_busses(), complaining that peripheral
has been released when the reference count is already 0. This is
because we have gotten two closes in a row, which should not
happen.
The fix is to add the D_TRACKCLOSE flag to the driver's cdevsw, so
that we get a close() call for each open(). That does happen
reliably, so we can make sure that our reference counts are
correct.
Note that the sa(4) and pt(4) drivers only allow one context
through the open routine. So these drivers aren't exposed to the
same race condition.
scsi_ch.c,
scsi_enc.c,
scsi_enc_internal.h,
scsi_pass.c,
scsi_sg.c:
For these drivers, change the open() routine to
increment the reference count for every open, and
just decrement the reference count in the close.
Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
scsi_pt.c: Call cam_periph_release_locked() in some scenarios
to avoid additional lock and unlock calls.
MFC after: 3 days
PMP ports such as PMP configuration or SEMB should be exposed or hidden.
These ports were always hidden before as useless and sometimes promatic.
But with updated ses driver supporting SEMB it is no longer so straight.
Keep ports hidden by default to avoid probe request ttimeouts if SEP is
not connected to PMP's SEMB via I2C, that is very often situation.
process exit. Instead use CAM's standard reference counting to prevent
periph going away until process won't complete. I think that sleep in
single CAM SWI thread is not a good idea and may lead to deadlocks if
daemon process waits for some command completion. Combined with recent
patch avoiding use of CAM SWI for ATA it just causes panics because of
sleeps prohibited in interrupt thread context.
Revamp the CAM enclosure services driver.
This updated driver uses an in-kernel daemon to track state changes and
publishes physical path location information\for disk elements into the
CAM device database.
Sponsored by: Spectra Logic Corporation
Sponsored by: iXsystems, Inc.
Submitted by: gibbs, will, mav
- Add low-level support for SATA Enclosure Management Bridge (SEMB)
devices -- SATA equivalents of the SCSI SES/SAF-TE devices.
- Add some utility functions for SCSI SAF-TE devices access.
Sponsored by: iXsystems, Inc.
to allow drivers to handle request completion directly without passing
them to the CAM SWI thread removing extra context switch.
Modify all ATA/SATA drivers to use them.
Reviewed by: gibbs, ken
MFC after: 2 weeks
Olympus FE-210 camera
LG UP3S MP3 player
Laser MP3-2GA13 MP3
PR: usb/119201
Submitted by: Peter Jeremy <peterjeremy@optushome.com.au>
Approved by: cperciva
MFC after: 1 week
that don't exist.
Anecdotal evidence indicates that it is better to return 011b (bad LUN)
than 001b (LUN offline). However, this change also gives the user a
sysctl/tunable, kern.cam.ctl.inquiry_pq_no_lun, to override the change
and return to the previous behavior. (The previous behavior was to
return 001b, or LUN offline.)
ctl.c: Change the default inquiry peripheral qualifier to 011b,
and add a sysctl and tunable to allow the user to change
it back to 001b if needed.
Don't insert a Copan copyright statement in the inquiry
data. The copyright statements on the files are
sufficient.
ctl_private.h: Add sysctl variable context to the CTL softc.
ctl_cmd_table.c,
ctl_frontend_internal.c,
ctl_frontend.c,
ctl_backend.c,
ctl_error.c: Include sys/sysctl.h.
MFC after: 3 days
checked PROTECT bit in INQUIRY data for all SPC devices, while it is defined
only since SPC-3. But there are some SPC-2 USB devices were reported, that
have PROTECT bit set, return no error for READ CAPACITY(16) command, but
return wrong sector count value in response.
MFC after: 3 days
of the default one.
Without this change setting kern.cam.ada.default_timeout to 1 instead of 30
allowed me to trigger several false positive command timeouts under heavy
ZFS load on a SiI3132 siis(4) controller with 5 HDDs on a port multiplier.
MFC after: 1 week
Even having more specific hint.ata.X.mode controls, global ones are
still could be useful from some points, including compatibility.
PR: kern/164651
MFC after: 1 week
data changes.
cam_ccb.h: Add a new advanced information type, CDAI_TYPE_RCAPLONG,
for long read capacity data.
cam_xpt_internal.h:
Add a read capacity data pointer and length to struct cam_ed.
cam_xpt.c: Free the read capacity buffer when a device goes away.
While we're here, make sure we don't leak memory for other
malloced fields in struct cam_ed.
scsi_all.c: Update the scsi_read_capacity_16() to take a uint8_t * and
a length instead of just a pointer to the parameter data
structure. This will hopefully make this function somewhat
immune to future changes in the parameter data.
scsi_all.h: Add some extra bit definitions to struct
scsi_read_capacity_data_long, and bump up the structure
size to the full size specified by SBC-3.
Change the prototype for scsi_read_capacity_16().
scsi_da.c: Register changes in read capacity data with the transport
layer. This allows the transport layer to send out an
async notification to interested parties. Update the
dasetgeom() API.
Use scsi_extract_sense_len() instead of
scsi_extract_sense().
scsi_xpt.c: Add support for the new CDAI_TYPE_RCAPLONG advanced
information type.
Make sure we set the physpath pointer to NULL after freeing
it. This allows blindly freeing it in the struct cam_ed
destructor.
sys/param.h: Bump __FreeBSD_version from 1000005 to 1000006 to make it
easier for third party drivers to determine that the read
capacity data async notification is available.
camcontrol.c,
mptutil/mpt_cam.c:
Update these for the new scsi_read_capacity_16() argument
structure.
Sponsored by: Spectra Logic
in response to CAM_DEV_NOT_THERE, instead of just the LUN in question.
This will now just eliminate the specified LUN in response to
CAM_DEV_NOT_THERE.
Reported by: Richard Todd <rmtodd@servalan.servalan.com>
MFC after: 3 days
ctl_error.c,
ctl_error.h: Take out the ctl_sense_format enumeration, and use
scsi_sense_data_type instead.
Remove ctl_get_sense_format() and switch ctl_build_ua()
over to using scsi_sense_data_type.
ctl_backend_ramdisk.c,
ctl_backend_block.c:
Use C99 structure initializers instead of GNU initializers.
ctl.c: Switch over to using the SCSI sense format enumeration
instead of the CTL-specific enumeration.
Submitted by: dim (partially)
MFC after: 1 month
Depending on device capabilities use different methods to implement it.
Currently used method can be read/set via kern.cam.da.X.delete_method
sysctls. Possible values are:
NONE - no provisioning support reported by the device;
DISABLE - provisioning support was disabled because of errors;
ZERO - use WRITE SAME (10) command to write zeroes;
WS10 - use WRITE SAME (10) command with UNMAP bit set;
WS16 - use WRITE SAME (16) command with UNMAP bit set;
UNMAP - use UNMAP command (equivalent of the ATA DSM TRIM command).
The last two methods (UNMAP and WS16) are defined by SBC specification and
the UNMAP method is the most advanced one. The rest of methods I've found
supported in Linux, and as soon as they were trivial to implement, then
why not? Hope they will be useful in some cases.
Unluckily I have no devices properly reporting parameters of the logical
block provisioning support via respective VPD pages (0xB0 and 0xB2). So
all info I have/use now is the flag telling whether logical block
provisioning is supported or not. As result, specific methods chosen now
by trying different ones in order (UNMAP, WS16, DISABLE) and checking
completion status to fallback if needed. I don't expect problems from this,
as if something go wrong, it should just disable itself. It may disable
even too aggressively if only some command parameter misfit.
Unlike Linux, which executes each delete with separate request, I've
implemented here the same request aggregation as implemented in ada driver.
Tests on SSDs I have show much better results doing it this way: above
8GB/s of the linear delete on Intel SATA SSD on LSI SAS HBA (mps).
Reviewed by: silence on scsi@
MFC after: 2 month
Sponsored by: iXsystems, Inc.
in the CAM XPT bus traversal code, and a number of other periph level
issues.
cam_periph.h,
cam_periph.c: Modify cam_periph_acquire() to test the CAM_PERIPH_INVALID
flag prior to allowing a reference count to be gained
on a peripheral. Callers of this function will receive
CAM_REQ_CMP_ERR status in the situation of attempting to
reference an invalidated periph. This guarantees that
a peripheral scheduled for a deferred free will not
be accessed during its wait for destruction.
Panic during attempts to drop a reference count on
a peripheral that already has a zero reference count.
In cam_periph_list(), use a local sbuf with SBUF_FIXEDLEN
set so that mallocs do not occur while the xpt topology
lock is held, regardless of the allocation policy of the
passed in sbuf.
Add a new routine, cam_periph_release_locked_buses(),
that can be called when the caller already holds
the CAM topology lock.
Add some extra debugging for duplicate peripheral
allocations in cam_periph_alloc().
Treat CAM_DEV_NOT_THERE much the same as a selection
timeout (AC_LOST_DEVICE is emitted), but forgo retries.
cam_xpt.c: Revamp the way the EDT traversal code does locking
and reference counting. This was broken, since it
assumed that the EDT would not change during
traversal, but that assumption is no longer valid.
So, to prevent devices from going away while we
traverse the EDT, make sure we properly lock
everything and hold references on devices that
we are using.
The two peripheral driver traversal routines should
be examined. xptpdperiphtraverse() holds the
topology lock for the entire time it runs.
xptperiphtraverse() is now locked properly, but
only holds the topology lock while it is traversing
the list, and not while the traversal function is
running.
The bus locking code in xptbustraverse() should
also be revisited at a later time, since it is
complex and should probably be simplified.
scsi_da.c: Pay attention to the return value from cam_periph_acquire().
Return 0 always from daclose() even if the disk is now gone.
Add some rudimentary error injection support.
scsi_sg.c: Fix reference counting in the sg(4) driver.
The sg driver was calling cam_periph_release() on close,
but never called cam_periph_acquire() (which increments
the reference count) on open.
The periph code correctly complained that the sg(4)
driver was trying to decrement the refcount when it
was already 0.
Sponsored by: Spectra Logic
MFC after: 2 weeks
CTL is a disk and processor device emulation subsystem originally written
for Copan Systems under Linux starting in 2003. It has been shipping in
Copan (now SGI) products since 2005.
It was ported to FreeBSD in 2008, and thanks to an agreement between SGI
(who acquired Copan's assets in 2010) and Spectra Logic in 2010, CTL is
available under a BSD-style license. The intent behind the agreement was
that Spectra would work to get CTL into the FreeBSD tree.
Some CTL features:
- Disk and processor device emulation.
- Tagged queueing
- SCSI task attribute support (ordered, head of queue, simple tags)
- SCSI implicit command ordering support. (e.g. if a read follows a mode
select, the read will be blocked until the mode select completes.)
- Full task management support (abort, LUN reset, target reset, etc.)
- Support for multiple ports
- Support for multiple simultaneous initiators
- Support for multiple simultaneous backing stores
- Persistent reservation support
- Mode sense/select support
- Error injection support
- High Availability support (1)
- All I/O handled in-kernel, no userland context switch overhead.
(1) HA Support is just an API stub, and needs much more to be fully
functional.
ctl.c: The core of CTL. Command handlers and processing,
character driver, and HA support are here.
ctl.h: Basic function declarations and data structures.
ctl_backend.c,
ctl_backend.h: The basic CTL backend API.
ctl_backend_block.c,
ctl_backend_block.h: The block and file backend. This allows for using
a disk or a file as the backing store for a LUN.
Multiple threads are started to do I/O to the
backing device, primarily because the VFS API
requires that to get any concurrency.
ctl_backend_ramdisk.c: A "fake" ramdisk backend. It only allocates a
small amount of memory to act as a source and sink
for reads and writes from an initiator. Therefore
it cannot be used for any real data, but it can be
used to test for throughput. It can also be used
to test initiators' support for extremely large LUNs.
ctl_cmd_table.c: This is a table with all 256 possible SCSI opcodes,
and command handler functions defined for supported
opcodes.
ctl_debug.h: Debugging support.
ctl_error.c,
ctl_error.h: CTL-specific wrappers around the CAM sense building
functions.
ctl_frontend.c,
ctl_frontend.h: These files define the basic CTL frontend port API.
ctl_frontend_cam_sim.c: This is a CTL frontend port that is also a CAM SIM.
This frontend allows for using CTL without any
target-capable hardware. So any LUNs you create in
CTL are visible in CAM via this port.
ctl_frontend_internal.c,
ctl_frontend_internal.h:
This is a frontend port written for Copan to do
some system-specific tasks that required sending
commands into CTL from inside the kernel. This
isn't entirely relevant to FreeBSD in general,
but can perhaps be repurposed.
ctl_ha.h: This is a stubbed-out High Availability API. Much
more is needed for full HA support. See the
comments in the header and the description of what
is needed in the README.ctl.txt file for more
details.
ctl_io.h: This defines most of the core CTL I/O structures.
union ctl_io is conceptually very similar to CAM's
union ccb.
ctl_ioctl.h: This defines all ioctls available through the CTL
character device, and the data structures needed
for those ioctls.
ctl_mem_pool.c,
ctl_mem_pool.h: Generic memory pool implementation used by the
internal frontend.
ctl_private.h: Private data structres (e.g. CTL softc) and
function prototypes. This also includes the SCSI
vendor and product names used by CTL.
ctl_scsi_all.c,
ctl_scsi_all.h: CTL wrappers around CAM sense printing functions.
ctl_ser_table.c: Command serialization table. This defines what
happens when one type of command is followed by
another type of command.
ctl_util.c,
ctl_util.h: CTL utility functions, primarily designed to be
used from userland. See ctladm for the primary
consumer of these functions. These include CDB
building functions.
scsi_ctl.c: CAM target peripheral driver and CTL frontend port.
This is the path into CTL for commands from
target-capable hardware/SIMs.
README.ctl.txt: CTL code features, roadmap, to-do list.
usr.sbin/Makefile: Add ctladm.
ctladm/Makefile,
ctladm/ctladm.8,
ctladm/ctladm.c,
ctladm/ctladm.h,
ctladm/util.c: ctladm(8) is the CTL management utility.
It fills a role similar to camcontrol(8).
It allow configuring LUNs, issuing commands,
injecting errors and various other control
functions.
usr.bin/Makefile: Add ctlstat.
ctlstat/Makefile
ctlstat/ctlstat.8,
ctlstat/ctlstat.c: ctlstat(8) fills a role similar to iostat(8).
It reports I/O statistics for CTL.
sys/conf/files: Add CTL files.
sys/conf/NOTES: Add device ctl.
sys/cam/scsi_all.h: To conform to more recent specs, the inquiry CDB
length field is now 2 bytes long.
Add several mode page definitions for CTL.
sys/cam/scsi_all.c: Handle the new 2 byte inquiry length.
sys/dev/ciss/ciss.c,
sys/dev/ata/atapi-cam.c,
sys/cam/scsi/scsi_targ_bh.c,
scsi_target/scsi_cmds.c,
mlxcontrol/interface.c: Update for 2 byte inquiry length field.
scsi_da.h: Add versions of the format and rigid disk pages
that are in a more reasonable format for CTL.
amd64/conf/GENERIC,
i386/conf/GENERIC,
ia64/conf/GENERIC,
sparc64/conf/GENERIC: Add device ctl.
i386/conf/PAE: The CTL frontend SIM at least does not compile
cleanly on PAE.
Sponsored by: Copan Systems, SGI and Spectra Logic
MFC after: 1 month
sector size same as acd driver does. Together with r228808 and r228847 this
allows existing multimedia/vlc to play Audio CDs via CAM cd driver.
PR: ports/162190
MFC after: 1 week
cam_periph_runccb() since the beginning checks it and releases device queue.
After r203108 it even clears CAM_DEV_QFRZN flag after that to avoid double
release, so removed code is unreachable now.
MFC after: 1 month
As soon as not all devices support READ CAPACITY(16), automatically fall
back to READ CAPACITY(10) if CAM_REQ_INVALID or SSD_KEY_ILLEGAL_REQUEST
status returned.
It also provides first bits of information about Logical Block Provisioning
(aka UNMAP/TRIM) support by the device.
connected via SAS or USB. Unluckily I've found that SAS (mps) and USB-SATA
I have translate models in different ways, requiring twice more quirks.
Unluckily for Hitachi, their model names are trimmed on SAS, making
impossible to identify 4K sector drives that way.
GEOM and using READ CD command for reading data, same as acd driver does.
Audio CDs identified by checking respective bit of the control field of
the first track in TOC.
This fixes bunch of error messages during boot (GEOM taste) with Audio CD
inserted and allows to grab Audio CD image using just dd.
MFC after: 1 month
the 16-bit cylinders field of the VTOC8 disk label (at around 502GB). The
geometry chosen for disks above that limit allows to use disks up to 2TB,
which is the limit of the extended VTOC8 format. The geometry used for
disks smaller than the 16-bit cylinders limit stays the same as used by
cam_calc_geometry(9) for extended translation.
Thanks to Hans-Joerg Sirtl for providing hardware for testing this change.
MFC after: 3 days
It blocks CAM SWI usage on requests completion, unneeded because of polling
and denied during kernel dumping because of blocked scheduler.
Before r198899 there was periph flag CAM_PERIPH_POLLED, but that was wrong,
because there is whole SIM is polled or handled by SWI, not a single periph.
Tested by: kib
MFC after: 1 month
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
CAM.
Desriptor sense is a new sense data format that originated in SPC-3. Among
other things, it allows for an 8-byte info field, which is necessary to
pass back block numbers larger than 4 bytes.
This change adds a number of new functions to scsi_all.c (and therefore
libcam) that abstract out most access to sense data.
This includes a bump of CAM_VERSION, because the CCB ABI has changed.
Userland programs that use the CAM pass(4) driver will need to be
recompiled.
camcontrol.c: Change uses of scsi_extract_sense() to use
scsi_extract_sense_len().
Use scsi_get_sks() instead of accessing sense key specific
data directly.
scsi_modes: Update the control mode page to the latest version (SPC-4).
scsi_cmds.c,
scsi_target.c: Change references to struct scsi_sense_data to struct
scsi_sense_data_fixed. This should be changed to allow the
user to specify fixed or descriptor sense, and then use
scsi_set_sense_data() to build the sense data.
ps3cdrom.c: Use scsi_set_sense_data() instead of setting sense data
manually.
cam_periph.c: Use scsi_extract_sense_len() instead of using
scsi_extract_sense() or accessing sense data directly.
cam_ccb.h: Bump the CAM_VERSION from 0x15 to 0x16. The change of
struct scsi_sense_data from 32 to 252 bytes changes the
size of struct ccb_scsiio, but not the size of union ccb.
So the version must be bumped to prevent structure
mis-matches.
scsi_all.h: Lots of updated SCSI sense data and other structures.
Add function prototypes for the new sense data functions.
Take out the inline implementation of scsi_extract_sense().
It is now too large to put in a header file.
Add macros to calculate whether fields are present and
filled in fixed and descriptor sense data
scsi_all.c: In scsi_op_desc(), allow the user to pass in NULL inquiry
data, and we'll assume a direct access device in that case.
Changed the SCSI RESERVED sense key name and description
to COMPLETED, as it is now defined in the spec.
Change the error recovery action for a number of read errors
to prevent lots of retries when the drive has said that the
block isn't accessible. This speeds up reconstruction of
the block by any RAID software running on top of the drive
(e.g. ZFS).
In scsi_sense_desc(), allow for invalid sense key numbers.
This allows calling this routine without checking the input
values first.
Change scsi_error_action() to use scsi_extract_sense_len(),
and handle things when invalid asc/ascq values are
encountered.
Add a new routine, scsi_desc_iterate(), that will call the
supplied function for every descriptor in descriptor format
sense data.
Add scsi_set_sense_data(), and scsi_set_sense_data_va(),
which build descriptor and fixed format sense data. They
currently default to fixed format sense data.
Add a number of scsi_get_*() functions, which get different
types of sense data fields from either fixed or descriptor
format sense data, if the data is present.
Add a number of scsi_*_sbuf() functions, which print
formatted versions of various sense data fields. These
functions work for either fixed or descriptor sense.
Add a number of scsi_sense_*_sbuf() functions, which have a
standard calling interface and print the indicated field.
These functions take descriptors only.
Add scsi_sense_desc_sbuf(), which will print a formatted
version of the given sense descriptor.
Pull out a majority of the scsi_sense_sbuf() function and
put it into scsi_sense_only_sbuf(). This allows callers
that don't use struct ccb_scsiio to easily utilize the
printing routines. Revamp that function to handle
descriptor sense and use the new sense fetching and
printing routines.
Move scsi_extract_sense() into scsi_all.c, and implement it
in terms of the new function, scsi_extract_sense_len().
The _len() version takes a length (which should be the
sense length - residual) and can indicate which fields are
present and valid in the sense data.
Add a couple of new scsi_get_*() routines to get the sense
key, asc, and ascq only.
mly.c: Rename struct scsi_sense_data to struct
scsi_sense_data_fixed.
sbp_targ.c: Use the new sense fetching routines to get sense data
instead of accessing it directly.
sbp.c: Change the firewire/SCSI sense data transformation code to
use struct scsi_sense_data_fixed instead of struct
scsi_sense_data. This should be changed later to use
scsi_set_sense_data().
ciss.c: Calculate the sense residual properly. Use
scsi_get_sense_key() to fetch the sense key.
mps_sas.c,
mpt_cam.c: Set the sense residual properly.
iir.c: Use scsi_set_sense_data() instead of building sense data by
hand.
iscsi_subr.c: Use scsi_extract_sense_len() instead of grabbing sense data
directly.
umass.c: Use scsi_set_sense_data() to build sense data.
Grab the sense key using scsi_get_sense_key().
Calculate the sense residual properly.
isp_freebsd.h: Use scsi_get_*() routines to grab asc, ascq, and sense key
values.
Calculate and set the sense residual.
MFC after: 3 days
Sponsored by: Spectra Logic Corporation
target reference miscounts. It also adds a helper function to get
the current reference counts for components of cam_path for debug
aid. One minor style(9) change.
Partially Obtained from: Chuck Tuffli (Emulex)
Reviewed by: scsi@ (ken)
Approved by: re (kib)
MFC after: 1 month
respond to any commands. I've found that because of multiple command
retries, each of which cause 30s timeout, bus reset and another retry or
requeue for many commands, it may take ages to eventually drop the
failed device. The odd thing is that those retries continue even after
XPT considered device as dead and invalidated it.
This patch makes cam_periph_error() to block any command retries after
periph was marked as invalid. With that patch all activity completes in
1-2 minutes, just after several timeouts, required to consider device
death. This should make ZFS, gmirror, graid, etc. operation more robust.
Reviewed by: mjacob@ on scsi@
Approved by: re (kib)
In cdregister(), hold the periph lock semaphore during changer
probe/configuration. This removes a window where an open of the
cd device may succeed before probe processing has completed.
In camisr_runqueue(), we need to run the sims queue regardless of
whether or not the current peripheral has more work to do. This
reverts a change mistakenly made in revision 223081.
Reported by: ache
This was previously done only for SCSI XPT in r223081, on which the change
in r223089 depended in order to respond to serial number requests. As a
result of r223089, da(4) and ada(4) devices register a d_getattr for geom to
use to obtain the information.
Reported by: ache
Reviewed by: ken