It includes three parts:
1) Modifications to CAM to detect media media changes and report them to
disk(9) layer. For modern SATA (and potentially UAS) devices it utilizes
Asynchronous Notification mechanism to receive events from hardware.
Active polling with TEST UNIT READY commands with 3 seconds period is used
for incapable hardware. After that both CD and DA drivers work the same way,
detecting two conditions: "NOT READY: Medium not present" after medium was
detected previously, and "UNIT ATTENTION: Not ready to ready change, medium
may have changed". First one reported to disk(9) as media removal, second
as media insert/change. To reliably receive second event new
AC_UNIT_ATTENTION async added to make UAs broadcasted to all periphs by
generic error handling code in cam_periph_error().
2) Modifications to GEOM core to handle media remove and change events.
Media removal handled by spoiling all consumers attached to the provider.
Media change event also schedules provider retaste after spoiling to probe
new media. New flag G_CF_ORPHAN was added to consumers to reflect that
consumer is in process of destruction. It allows retaste to create new
geom instance of the same class, while previous one is still dying.
3) Modifications to some GEOM classes: DEV -- to report media change
events to devd; VFS -- to handle spoiling same as orphan to prevent
accessing replaced media. PART class already handles spoiling alike to
orphan.
Reviewed by: silence on geom@ and scsi@
Tested by: avg
Sponsored by: iXsystems, Inc. / PC-BSD
MFC after: 2 months
Renamed the kern.cam.ada.ada_send_ordered sysctl and tunable to
kern.cam.ada.send_ordered, more in line with the other da sysctls/tunables.
Suggested by: kib
reporting. It includes:
- removing of error messages controlled by bootverbose, replacing them
with more universal and informative debugging on CAM_DEBUG_INFO level,
that is now built into the kernel by default;
- more close following to the arguments submitted by caller, such as
SF_PRINT_ALWAYS, SF_QUIET_IR and SF_NO_PRINT; consumer knows better which
errors are usual/expected at this point and which are really informative;
- adding two new flags SF_NO_RECOVERY and SF_NO_RETRY to allow caller
specify how much assistance it needs at this point; previously consumers
controlled that by not calling cam_periph_error() at all, but that made
behavior inconsistent and debugging complicated;
- tuning debug messages and taken actions order to make debugging output
more readable and cause-effect relationships visible;
- making camperiphdone() (common device recovery completion handler) to
also use cam_periph_error() in most cases, instead of own dumb code;
- removing manual sense fetching code from cam_periph_error(); I was told
by number of people that it is SIM obligation to fetch sense data, so this
code is useless and only significantly complicates recovery logic;
- making ada, da and pass driver to use cam_periph_error() with new limited
recovery options to handle error recovery and debugging in common way;
as one of results, CAM_REQUEUE_REQ and other retrying statuses are now
working fine with pass driver, that caused many problems before.
- reverting r186891 by raj@ to avoid burning few seconds in tight DELAY()
loops on device probe, while device simply loads media; I think that problem
may already be fixed in other way, and even if it is not, solution must be
different.
Sponsored by: iXsystems, Inc.
MFC after: 2 weeks
until transport will do some probe actions (at least soft reset).
Make ATA/SATA SIMs to not report bogus and confusing PROTO_ATA protocol.
Make ATA/SATA transport to fill that gap by reporting protocol to SIM with
XPT_SET_TRAN_SETTINGS and patching XPT_GET_TRAN_SETTINGS results if needed.
via `camcontrol tags ... -N ...`. There is no need to tune it in
usual cases, but some users want to have it for debugging purposes.
MFC after: 2 weeks
PMP ports such as PMP configuration or SEMB should be exposed or hidden.
These ports were always hidden before as useless and sometimes promatic.
But with updated ses driver supporting SEMB it is no longer so straight.
Keep ports hidden by default to avoid probe request ttimeouts if SEP is
not connected to PMP's SEMB via I2C, that is very often situation.
- Add low-level support for SATA Enclosure Management Bridge (SEMB)
devices -- SATA equivalents of the SCSI SES/SAF-TE devices.
- Add some utility functions for SCSI SAF-TE devices access.
Sponsored by: iXsystems, Inc.
of the default one.
Without this change setting kern.cam.ada.default_timeout to 1 instead of 30
allowed me to trigger several false positive command timeouts under heavy
ZFS load on a SiI3132 siis(4) controller with 5 HDDs on a port multiplier.
MFC after: 1 week
Even having more specific hint.ata.X.mode controls, global ones are
still could be useful from some points, including compatibility.
PR: kern/164651
MFC after: 1 week
cam_periph_runccb() since the beginning checks it and releases device queue.
After r203108 it even clears CAM_DEV_QFRZN flag after that to avoid double
release, so removed code is unreachable now.
MFC after: 1 month
The SYSCTL_NODE macro defines a list that stores all child-elements of
that node. If there's no SYSCTL_DECL macro anywhere else, there's no
reason why it shouldn't be static.
This was previously done only for SCSI XPT in r223081, on which the change
in r223089 depended in order to respond to serial number requests. As a
result of r223089, da(4) and ada(4) devices register a d_getattr for geom to
use to obtain the information.
Reported by: ache
Reviewed by: ken
DEVFS, and make it accessible via the diskinfo utility.
Extend GEOM's generic attribute query mechanism into generic disk consumers.
sys/geom/geom_disk.c:
sys/geom/geom_disk.h:
sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Allow disk providers to implement a new method which can override
the default BIO_GETATTR response, d_getattr(struct bio *). This
function returns -1 if not handled, otherwise it returns 0 or an
errno to be passed to g_io_deliver().
sys/cam/scsi/scsi_da.c:
sys/cam/ata/ata_da.c:
- Don't copy the serial number to dp->d_ident anymore, as the CAM XPT
is now responsible for returning this information via
d_getattr()->(a)dagetattr()->xpt_getatr().
sys/geom/geom_dev.c:
- Implement a new ioctl, DIOCGPHYSPATH, which returns the GEOM
attribute "GEOM::physpath", if possible. If the attribute request
returns a zero-length string, ENOENT is returned.
usr.sbin/diskinfo/diskinfo.c:
- If the DIOCGPHYSPATH ioctl is successful, report physical path
data when diskinfo is executed with the '-v' option.
Submitted by: will
Reviewed by: gibbs
Sponsored by: Spectra Logic Corporation
Add generic attribute change notification support to GEOM.
sys/sys/geom/geom.h:
Add a new attrchanged method field to both g_class
and g_geom.
sys/sys/geom/geom.h:
sys/geom/geom_event.c:
- Provide the g_attr_changed() function that providers
can use to advertise attribute changes.
- Perform delivery of attribute change notifications
from a thread context via the standard GEOM event
mechanism.
sys/geom/geom_subr.c:
Inherit the attrchanged method from class to geom (class instance).
sys/geom/geom_disk.c:
Provide disk_attr_changed() to provide g_attr_changed() access
to consumers of the disk API.
sys/cam/scsi/scsi_pass.c:
sys/cam/scsi/scsi_da.c:
sys/geom/geom_dev.c:
sys/geom/geom_disk.c:
Use attribute changed events to track updates to physical path
information.
sys/cam/scsi/scsi_da.c:
Add AC_ADVINFO_CHANGED to the registered asynchronous CAM
events for this driver. When this event occurs, and
the updated buffer type references our physical path
attribute, emit a GEOM attribute changed event via the
disk_attr_changed() API.
sys/cam/scsi/scsi_pass.c:
Add AC_ADVINFO_CHANGED to the registered asynchronous CAM
events for this driver. When this event occurs, update
the physical patch devfs alias for this pass instance.
Submitted by: gibbs
Sponsored by: Spectra Logic Corporation
(up to 2048 instead of 256 or even 64) of them with single TRIM request.
OCZ Vertex2/Vertex3 SSDs can handle no more then 64 ranges per TRIM request.
Due to lack of BIO_DELETE clustering now, it means that we could delete no
more then 2MB per request (on FS with 32K block) with limited request rate.
This change increases delete rate on Vertex2 from 250MB/s to 950MB/s.
reporting it properly (none? of known disks now).
Hitachi and WDC AF disks seem could be identified more or less formally.
For Seagate and Samsung enumerate some found models/series.
For other disks it can be forced with kern.cam.ada.X.quirks=1 tunable.
device in /dev/ create symbolic link with adY name, trying to mimic old ATA
numbering. Imitation is not complete, but should be enough in most cases to
mount file systems without touching /etc/fstab.
- To know what behavior to mimic, restore ATA_STATIC_ID option in cases
where it was present before.
- Add some more details to UPDATING.
- make SATA SIMs announce capabilities to handle SDB with Notification bit;
- make PMP driver honor this SIMs capability;
- make SATA XPT to negotiate and enable this feature for ATAPI devices.
This feature allows supporting SATA ATAPI devices to inform system about
some events happened, that may require attention. In my case this allows
LG GH22LS50 SATA DVR-RW drive to report tray open/close events. Events
reported to CAM in form of AC_SCSI_AEN async. Further they could be used
as a hints for checking device status and reporting media change to upper
layers, for example, via spoiling mechanism of GEOM.
on per-device basis.
- While adding support for per-device sysctls, merge from graid branch
support for ADA_TEST_FAILURE kernel option, which opens few more sysctl,
allowing to simulate read and write errors for testing purposes.
existing code caused problems with some SCSI controllers.
A new sysctl kern.cam.ada.spindown_shutdown has been added that controls
whether or not to spin-down disks when shutting down.
Spinning down the disks unloads/parks the heads - this is
much better than removing power when the disk is still
spinning because otherwise an Emergency Unload occurs which may cause damage
to the actuator.
PR: kern/140752
Submitted by: olli
Reviewed by: arundel
Discussed with: mav
MFC after: 2 weeks
Add the BIO_ORDERED flag for struct bio and update bio clients to use it.
The barrier semantics of bioq_insert_tail() were broken in two ways:
o In bioq_disksort(), an added bio could be inserted at the head of
the queue, even when a barrier was present, if the sort key for
the new entry was less than that of the last queued barrier bio.
o The last_offset used to generate the sort key for newly queued bios
did not stay at the position of the barrier until either the
barrier was de-queued, or a new barrier (which updates last_offset)
was queued. When a barrier is in effect, we know that the disk
will pass through the barrier position just before the
"blocked bios" are released, so using the barrier's offset for
last_offset is the optimal choice.
sys/geom/sched/subr_disk.c:
sys/kern/subr_disk.c:
o Update last_offset in bioq_insert_tail().
o Only update last_offset in bioq_remove() if the removed bio is
at the head of the queue (typically due to a call via
bioq_takefirst()) and no barrier is active.
o In bioq_disksort(), if we have a barrier (insert_point is non-NULL),
set prev to the barrier and cur to it's next element. Now that
last_offset is kept at the barrier position, this change isn't
strictly necessary, but since we have to take a decision branch
anyway, it does avoid one, no-op, loop iteration in the while
loop that immediately follows.
o In bioq_disksort(), bypass the normal sort for bios with the
BIO_ORDERED attribute and instead insert them into the queue
with bioq_insert_tail(). bioq_insert_tail() not only gives
the desired command order during insertion, but also provides
barrier semantics so that commands disksorted in the future
cannot pass the just enqueued transaction.
sys/sys/bio.h:
Add BIO_ORDERED as bit 4 of the bio_flags field in struct bio.
sys/cam/ata/ata_da.c:
sys/cam/scsi/scsi_da.c
Use an ordered command for SCSI/ATA-NCQ commands issued in
response to bios with the BIO_ORDERED flag set.
sys/cam/scsi/scsi_da.c
Use an ordered tag when issuing a synchronize cache command.
Wrap some lines to 80 columns.
sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_geom.c
sys/geom/geom_io.c
Mark bios with the BIO_FLUSH command as BIO_ORDERED.
Sponsored by: Spectra Logic Corporation
MFC after: 1 month
by timeout/error of one of probe commands, process may continue infinitely.
Make CAM ATA more robust to faulty devices and false positive detections,
abort probe after two restarts on timeouts or ten on other errors.
whole bus (XPT_SCAN_BUS) and a single lun on that bus (XPT_SCAN_LUN).
It's less resource comsumptive than scanning a whole bus when the
caller knows only one target has changes.
Reviewed by: scsi@
Sponsored by: Panasas
MFC after: 1 month
hook it up to ada(4) also. While at it, rename *ad_firmware_geom_adjust()
to *ata_disk_firmware_geom_adjust() etc now that these are no longer
limited to ad(4).
Reviewed by: mav
MFC after: 3 days
- device initiated power management (some devices support only this way);
- Automatic Partial to Slumber Transition (more power saving);
- DMA auto-activation (expected to slightly improve performance).
More features could be added later, when hardware supports.
Add Power Up In Stand-by feature support. Device with PUIS enabled
require explicit command to do initial spin-up. Mark that command
with CAM_HIGH_POWER flag, to allow CAM manage staggered spin-up.
- Unify bus reset/probe sequence. Whenever bus attached at boot or later,
CAM will automatically reset and scan it. It allows to remove duplicate
code from many drivers.
- Any bus, attached before CAM completed it's boot-time initialization,
will equally join to the process, delaying boot if needed.
- New kern.cam.boot_delay loader tunable should help controllers that
are still unable to register their buses in time (such as slow USB/
PCCard/ CardBus devices), by adding one more event to wait on boot.
- To allow synchronization between different CAM levels, concept of
requests priorities was extended. Priorities now split between several
"run levels". Device can be freezed at specified level, allowing higher
priority requests to pass. For example, no payload requests allowed,
until PMP driver enable port. ATA XPT negotiate transfer parameters,
periph driver configure caching and so on.
- Frozen requests are no more counted by request allocation scheduler.
It fixes deadlocks, when frozen low priority payload requests occupying
slots, required by higher levels to manage theit execution.
- Two last changes were holding proper ATA reinitialization and error
recovery implementation. Now it is done: SATA controllers and Port
Multipliers now implement automatic hot-plug and should correctly
recover from timeouts and bus resets.
- Improve SCSI error recovery for devices on buses without automatic sense
reporting, such as ATAPI or USB. For example, it allows CAM to wait, while
CD drive loads disk, instead of immediately return error status.
- Decapitalize diagnostic messages and make them more readable and sensible.
- Teach PMP driver to limit maximum speed on fan-out ports.
- Make boot wait for PMP scan completes, and make rescan more reliable.
- Fix pass driver, to return CCB to user level in case of error.
- Increase number of retries in cd driver, as device may return several UAs.
- For SSDs use TRIM feature of DATA SET MANAGEMENT command, as defined by
ACS-2 specification working draft.
- For CompactFlash use CFA ERASE command, same as ad(4) does.
With this patch, `newfs -E /dev/ada1` was able to restore write speed of
my heavily weared OCZ Vertex SSD (firmware 1.4) up to the initial level
for the most part of it's capacity. Previous 1.3 firmware, even reportiong
TRIM capabilty bit set, was not working, reporting ABORT error for every
DSM command.
I have no idea whether it is normal, but for some reason it takes 200ms
to handle any TRIM command on this drive, that was making delete extremely
slow. But TRIM command is able to accept long list of LBAs and the length of
that list seems doesn't affect it's execution time. Implemented request
clusting algorithm allowed me to rise delete rate up to reasonable numbers,
when many parallel DELETE requests running.
- Cleanup kernel messages, mostly PMP.
- Took references on devices, while PMP reinitializes them, to not let them
go and distort freeze reference counting.
Introduce ATA_CAM kernel option, turning ata(4) controller drivers into
cam(4) interface modules. When enabled, this options deprecates all ata(4)
peripheral drivers (ad, acd, ...) and interfaces and allows cam(4) drivers
(ada, cd, ...) and interfaces to be natively used instead.
As side effect of this, ata(4) mode setting code was completely rewritten
to make controller API more strict and permit above change. While doing
this, SATA revision was separated from PATA mode. It allows DMA-incapable
SATA devices to operate and makes hw.ata.atapi_dma tunable work again.
Also allow ata(4) controller drivers (except some specific or broken ones)
to handle larger data transfers. Previous constraint of 64K was artificial
and is not really required by PCI ATA BM specification or hardware.
Submitted by: nwitehorn (powerpc part)
- Extend XPT-SIM transfer settings control API. Now it allows to report to
SATA SIM number of tags supported by each device, implement ATA mode and
SATA revision negotiation for both SATA and PATA SIMs.
- Make ahci(4) and siis(4) to use submitted maximum tag number, when
scheduling requests. It allows to support NCQ on devices with lower tags
count then controller supports.
- Make PMP driver to report attached devices connection speeds.
- Implement ATA mode negotiation between user settings, device and
controller capabilities.
- Move tagged queueing control from ADA to ATA XPT. It allows to control
device command queue length correctly. First step to support < 32 tags.
- Limit queue for non-tagged devices by 2 slots for ahci(4) and siis(4).
- Implement quirk matching for ATA devices.
- Move xpt_schedule_dev_sendq() from header to source file.
- Move delayed queue shrinking to the more expected place - element freeing.
- Remove some SCSIsms in ATA.
- Remove CAM_PERIPH_POLLED flag. It is broken by design. Polling can't be
periph flag. May be SIM, may be CCB, but now it works fine just without it.
- Remove check unused for at least five years. If we will ever have non-BIO
devices in CAM, this check is smallest of what we will need.
- If several controllers complete requests same time, call swi_sched()
only once.
- Add support for sector size > 512 bytes and physical sector of several
logical sectors, introduced by ATA-7 specification.
- Remove some obsoleted code.
Fix reference counting bug, when device unreferenced before then
invalidated. To do it, do not handle validity flag as another
reference, but explicitly modify reference count each time flag is
modified.
Discovered by: thompsa
- Reduce code duplication in ATA XPT and PMP driver.
- Move PIO size setting from ada driver to ATA XPT. It is XPT business
to negotiate transfer details. ada driver is now stateless.
- Report PIO size to SIM. It is required for correct PATA SIM operation.
- Tune PMP scan timings. It workarounds some problems with SiI.
- If reset hapens during PMP initialization - restart it.
- Introduce early-initialized periph drivers, which are used during initial
scan process. Use it for xpt, probe, aprobe and pmp. It gives pmp chance
to finish scan before mountroot and numerate devices in right order.
Separate CAM_DEV_IDENTIFY_DATA_VALID flag from CAM_DEV_INQUIRY_DATA_VALID.
Add workaround for very old devices without support for mode setting.
Add some PATA bus scanning support.
Remove some SCSIsms.
Add support for PIO-only devices.
Fix maxio values and 256 sectors transactions for 28bits commands.
Implement periodic ordered commands insertion, sames as da driver does.
Remove some SCSIsms.
for ATA_SETFEATURES/ATA_SF_SETXFER command which by definition transfers no
data. Most of controllers are irrelevant to this bug, but some nVidia's
doesn't.
Tested on: current@
Approved by: re (kib)
modularize it so that new transports can be created.
Add a transport for SATA
Add a periph+protocol layer for ATA
Add a driver for AHCI-compliant hardware.
Add a maxio field to CAM so that drivers can advertise their max
I/O capability. Modify various drivers so that they are insulated
from the value of MAXPHYS.
The new ATA/SATA code supports AHCI-compliant hardware, and will override
the classic ATA driver if it is loaded as a module at boot time or compiled
into the kernel. The stack now support NCQ (tagged queueing) for increased
performance on modern SATA drives. It also supports port multipliers.
ATA drives are accessed via 'ada' device nodes. ATAPI drives are
accessed via 'cd' device nodes. They can all be enumerated and manipulated
via camcontrol, just like SCSI drives. SCSI commands are not translated to
their ATA equivalents; ATA native commands are used throughout the entire
stack, including camcontrol. See the camcontrol manpage for further
details. Testing this code may require that you update your fstab, and
possibly modify your BIOS to enable AHCI functionality, if available.
This code is very experimental at the moment. The userland ABI/API has
changed, so applications will need to be recompiled. It may change
further in the near future. The 'ada' device name may also change as
more infrastructure is completed in this project. The goal is to
eventually put all CAM busses and devices until newbus, allowing for
interesting topology and management options.
Few functional changes will be seen with existing SCSI/SAS/FC drivers,
though the userland ABI has still changed. In the future, transports
specific modules for SAS and FC may appear in order to better support
the topologies and capabilities of these technologies.
The modularization of CAM and the addition of the ATA/SATA modules is
meant to break CAM out of the mold of being specific to SCSI, letting it
grow to be a framework for arbitrary transports and protocols. It also
allows drivers to be written to support discrete hardware without
jeopardizing the stability of non-related hardware. While only an AHCI
driver is provided now, a Silicon Image driver is also in the works.
Drivers for ICH1-4, ICH5-6, PIIX, classic IDE, and any other hardware
is possible and encouraged. Help with new transports is also encouraged.
Submitted by: scottl, mav
Approved by: re