Make CAM to stop all attached devices on system shutdown.
It allows devices to park heads, reducing stress on power loss.
Add `kern.cam.power_down` tunable and sysctl to controll it.
- Unify bus reset/probe sequence. Whenever bus attached at boot or later,
CAM will automatically reset and scan it. It allows to remove duplicate
code from many drivers.
- Any bus, attached before CAM completed it's boot-time initialization,
will equally join to the process, delaying boot if needed.
- New kern.cam.boot_delay loader tunable should help controllers that
are still unable to register their buses in time (such as slow USB/
PCCard/ CardBus devices), by adding one more event to wait on boot.
- To allow synchronization between different CAM levels, concept of
requests priorities was extended. Priorities now split between several
"run levels". Device can be freezed at specified level, allowing higher
priority requests to pass. For example, no payload requests allowed,
until PMP driver enable port. ATA XPT negotiate transfer parameters,
periph driver configure caching and so on.
- Frozen requests are no more counted by request allocation scheduler.
It fixes deadlocks, when frozen low priority payload requests occupying
slots, required by higher levels to manage theit execution.
- Two last changes were holding proper ATA reinitialization and error
recovery implementation. Now it is done: SATA controllers and Port
Multipliers now implement automatic hot-plug and should correctly
recover from timeouts and bus resets.
- Improve SCSI error recovery for devices on buses without automatic sense
reporting, such as ATAPI or USB. For example, it allows CAM to wait, while
CD drive loads disk, instead of immediately return error status.
- Decapitalize diagnostic messages and make them more readable and sensible.
- Teach PMP driver to limit maximum speed on fan-out ports.
- Make boot wait for PMP scan completes, and make rescan more reliable.
- Fix pass driver, to return CCB to user level in case of error.
- Increase number of retries in cd driver, as device may return several UAs.
- Extend XPT-SIM transfer settings control API. Now it allows to report to
SATA SIM number of tags supported by each device, implement ATA mode and
SATA revision negotiation for both SATA and PATA SIMs.
- Make ahci(4) and siis(4) to use submitted maximum tag number, when
scheduling requests. It allows to support NCQ on devices with lower tags
count then controller supports.
- Make PMP driver to report attached devices connection speeds.
- Implement ATA mode negotiation between user settings, device and
controller capabilities.
Remove code that years ago was closing race between request submission
to SIM and device/SIM freeze. That race become impossible after moving from
spl to mutex locking, while this workaround causes some unexpected effects.
- Move tagged queueing control from ADA to ATA XPT. It allows to control
device command queue length correctly. First step to support < 32 tags.
- Limit queue for non-tagged devices by 2 slots for ahci(4) and siis(4).
- Implement quirk matching for ATA devices.
- Move xpt_schedule_dev_sendq() from header to source file.
- Move delayed queue shrinking to the more expected place - element freeing.
- Remove some SCSIsms in ATA.
- Remove CAM_PERIPH_POLLED flag. It is broken by design. Polling can't be
periph flag. May be SIM, may be CCB, but now it works fine just without it.
- Remove check unused for at least five years. If we will ever have non-BIO
devices in CAM, this check is smallest of what we will need.
- If several controllers complete requests same time, call swi_sched()
only once.
Fix reference counting bug, when device unreferenced before then
invalidated. To do it, do not handle validity flag as another
reference, but explicitly modify reference count each time flag is
modified.
Discovered by: thompsa
- Reduce code duplication in ATA XPT and PMP driver.
- Move PIO size setting from ada driver to ATA XPT. It is XPT business
to negotiate transfer details. ada driver is now stateless.
- Report PIO size to SIM. It is required for correct PATA SIM operation.
- Tune PMP scan timings. It workarounds some problems with SiI.
- If reset hapens during PMP initialization - restart it.
- Introduce early-initialized periph drivers, which are used during initial
scan process. Use it for xpt, probe, aprobe and pmp. It gives pmp chance
to finish scan before mountroot and numerate devices in right order.
Ensure target/lun passed from user-level supported on this bus.
Scanning unsupported IDs causes different issues from duplicate
devices to system crash.
firmware loading bugs.
Target mode support has received some serious attention to make it
more usable and stable.
Some backward compatible additions to CAM have been made that make
target mode async events easier to deal with have also been put
into place.
Further refinement and better support for NP-IV (N-port Virtualization)
is now in place.
Code for release prior to RELENG_7 has been stripped away for code clarity.
Sponsored by: Copan Systems
Reviewed by: scottl, ken, jung-uk kim
Approved by: re
modularize it so that new transports can be created.
Add a transport for SATA
Add a periph+protocol layer for ATA
Add a driver for AHCI-compliant hardware.
Add a maxio field to CAM so that drivers can advertise their max
I/O capability. Modify various drivers so that they are insulated
from the value of MAXPHYS.
The new ATA/SATA code supports AHCI-compliant hardware, and will override
the classic ATA driver if it is loaded as a module at boot time or compiled
into the kernel. The stack now support NCQ (tagged queueing) for increased
performance on modern SATA drives. It also supports port multipliers.
ATA drives are accessed via 'ada' device nodes. ATAPI drives are
accessed via 'cd' device nodes. They can all be enumerated and manipulated
via camcontrol, just like SCSI drives. SCSI commands are not translated to
their ATA equivalents; ATA native commands are used throughout the entire
stack, including camcontrol. See the camcontrol manpage for further
details. Testing this code may require that you update your fstab, and
possibly modify your BIOS to enable AHCI functionality, if available.
This code is very experimental at the moment. The userland ABI/API has
changed, so applications will need to be recompiled. It may change
further in the near future. The 'ada' device name may also change as
more infrastructure is completed in this project. The goal is to
eventually put all CAM busses and devices until newbus, allowing for
interesting topology and management options.
Few functional changes will be seen with existing SCSI/SAS/FC drivers,
though the userland ABI has still changed. In the future, transports
specific modules for SAS and FC may appear in order to better support
the topologies and capabilities of these technologies.
The modularization of CAM and the addition of the ATA/SATA modules is
meant to break CAM out of the mold of being specific to SCSI, letting it
grow to be a framework for arbitrary transports and protocols. It also
allows drivers to be written to support discrete hardware without
jeopardizing the stability of non-related hardware. While only an AHCI
driver is provided now, a Silicon Image driver is also in the works.
Drivers for ICH1-4, ICH5-6, PIIX, classic IDE, and any other hardware
is possible and encouraged. Help with new transports is also encouraged.
Submitted by: scottl, mav
Approved by: re
a serial number, fall through to the next case so that initial negotiation
still happens. Without this, devices were showing up with only 1 available
tag opening, leading to observations of very poor I/O performance.
This should fix problems reported with VMWare Fusion and ESX. Early
generation MPT-SAS controllers with SATA disks might also be affected.
HP CISS controllers are also likely affected, as are many other
pseudo-scsi disk subsystems.
created by atapicam is being kept opened or mounted. This is probably just
a temporary solution until we invent something better.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
Reported by: Jaakko Heinonen
This fixes problems with discovering some USB devices that are very slow to
respond during initialisation.
When a USB device is inserted, CAM performs the sequence:
1) INQUIRY
2) INQUIRY (second time with other parameters)
3) TEST UNIT READY
4) READ CAPACITY
Before this change CAM didn't check if TEST UNIT READY was successful and went
on blindly to the next state and sent READ CAPACITY. If the device was still
not ready by then, CAM ended with error message. This patch adds checking for the
status of TEST UNIT READY command and retrying up to 10 times with 0.5 sec
interval.
Submitted by: Grzegorz Bernacki gjb ! semihalf dot com
Reviewed by: scottl
does - in DragonFly, it's cam_sim_release() what actually frees the
SIM; cam_sim_free does nothing more than calling cam_sim_release().
Here, we drain in cam_sim_free, waiting for refcount to drop to zero.
We cannot do the same think DragonFly does, because after cam_sim_free
returns, client would destroy the sim->mtx, and CAM would trip over
an initialized mutex.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
to actually use it would panic on mtx operation, as dead_sim doesn't
have a proper mutex. Even if it had a properly initialized mutex,
it wouldn't have properly locked and owned one.
Reviewed by: scottl
Approved by: rwatson (mentor)
Sponsored by: FreeBSD Foundation
device supports retrieving a serial number. Instead, first query the
list of VPD pages it does support, and only query the serial number if
it's supported, else silently move on. This eliminates a lot of noise
during verbose booting, and will likely eliminate the need for most
NOSERIAL quirks.
to kproc_xxx as they actually make whole processes.
Thos makes way for us to add REAL kthread_create() and friends
that actually make theads. it turns out that most of these
calls actually end up being moved back to the thread version
when it's added. but we need to make this cosmetic change first.
I'd LOVE to do this rename in 7.0 so that we can eventually MFC the
new kthread_xxx() calls.
now takes a device_t to be the parent of the bus that is being created.
Most SIMs have been updated with a reasonable argument, but a few exceptions
just pass NULL for now. This argument isn't used yet and the newbus
integration likely won't be ready until after 7.0-RELEASE.
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.
use to synchornize and protect all data objects that are used for that
SIM. Drivers that are not yet MPSAFE register Giant and operate as
usual. RIght now, no drivers are MPSAFE, though a few will be changed
in the coming week as this work settles down.
The driver API has changed, so all CAM drivers will need to be recompiled.
The userland API has not changed, so tools like camcontrol do not need to
be recompiled.
rescan requests. The purpose of this is to allow a SIM
(or other entities) to request a bus rescan and have it
then fielded in a different (process) context from the
caller.
There are probably better ways to accomplish this, but
it's a very small change that helps solve a number of
problems.
Reviewed by: Justin, Ken and Scott.
MFC after: 2 weeks
what to do with it.
This forces us to scan targets sequentially, not in parallel.
The reason we might want to do this is that SPI negotiation
might not work right at the SIM level if we try to do it
in parallel. We *could* fix this for each SIM where this is
broken, but it's a lot harder to do that when we can simply
ask CAM to probe sequentially.
If PIM_SEQSCAN is not set (default), the original behaviour for
probing is unchanged.
LUN probing is still done in parallel for each target in either
case.
While we're at it, clean up some resource leakage for error
cases.
Reviewed by: ken, scott, scsi@
MFC after: 1 week
usage as of SPC2r20. Specifically, handle the BQueue
flag which will indicate that a device supports the
Basic Queueing model (no Head of Queue or Ordered tags).
When this flag is set, SID_CmdQueue is clear. This has
causes FreeBSD to assume that the device did not support
tagged operations.
MFC after: 1 month
operations before returning. Point the bus at a dummy cam_sim
structure so that any CCBs will complete immediately with a
CAM_DEV_NOT_THERE status, and ensure that any xpt_schedule() calls
on the bus's devices will immediately call the peripheral's
periph_start() routine. Also repeat the async messages because
devices that were part of the way through being probed may appear
after the original AC_LOST_DEVICE was sent, and would otherwise
never go away.
These changes make it possible to deregister a bus and free the SIM
at most stages during bus probing without the usual crashes in
camisr(). In particular, plugging in a umass device and then
unplugging it as soon as the first probe messages appeared would
almost always result in a crash. Now the device just goes away with
a few CAM errors and all references to the CAM bus, target and
device are dropped correctly.
tunable (until we get REPORT LUNS in place).
If we're probing luns, and each probe succeeds, we keep going past
lun 7 if we're a SCSI3 or better device (until we fail to probe).
If we're probing luns, and a probe fails, we only keep going if
we're quirked *for* it (CAM_QUIRK_HILUNS), and if we're not quirked
*against* it (CAM_QUIRK_NOHILUNS), or we're a SCSI3 or better device
and the tunable (kern.cam.cam_srch_hi) is set non-zero.
Reviewed by: nate@rootlabs.org, gibbs@scsiguy.com, ken@kdm.com, scottl@samsco.org
MFC after: 1 week
module-specific malloc types. These should help us to pinpoint the
possible memory leakage in the future.
- Implementing xpt_alloc_ccb_nowait() and replacing all malloc/free based
CCB memory management with xpt_alloc_ccb[_nowait]/xpt_free_ccb. Hopefully
this would be helpful if someday we move the CCB allocator to use UMA
instead of malloc().
Encouraged by: jeffr, rwatson
Reviewed by: gibbs, scottl
Approved by: re (scottl)
(depends on how many memory you have) observed through "tar -tvf /dev/sa0."
Without this patch, RELENG_5 and HEAD panics with something like:
kmem_malloc(4096): kmem_map too small: 42258432 total allocated
RELENG_4 doesn't panic but spews following errors:
camq_init: - cannot malloc array!
Reviewed by: gibbs, scottl
Approved by: re (scottl)
MFC after: 3 days
disables tag queuing temporarily in order to allow controllers a window
to safely perform transfer negotiation with non-compliant devices. Before
this change, CAM would restore the queue depth to the controller specified
maximum or device quirk level rather than any depth determined by reactions
to QUEUE FULL/BUSY events or an explicit user setting.
During device probe, initialize the flags field for XPT_SCAN_BUS.
The uninitialized value often confused CAM into not bothering to
issue an AC_FOUND_DEVICE async event for new devices. The reason
this bug wasn't reported earlier is that CAM manually announces
devices after the initial system bus scans.
MFC: 3 days
Giant held. In camisr(), move the ccb_bioq elements to a temporary local list
and then process the elements off of that list. This enables the list to be
processed by only taking the ccb_bioq_lock once and only for a very short
time.
ccb_bioq_lock is a leaf mutex, so it's fine to call xpt_done() with other
locks held. This is just a very minor step in the work to lock CAM, but
it allows us to avoid some messy locking/unlock dances in certain drivers.
It reports itself as SCSI-3 but doesnt like getting probed on high luns
because it hangs hard after finding itself again on lun 32...
Suggested by: Kenneth Merry
its ability to automatically scan and attach luns for modern storage
which has luns in the 0..1000 range, not 0..7.
The correct thing would be to do REPORT LUNS for devices whose LUN0
version shows a version >= SCSI3, but lacking that we should be able
to search higher than LUN 7 if we're >= SCSI3 with no ill effects.
This change keeps all of the QUIRK_HILUNS quirks, obeys the QUIRK_NOLUNS,
and introduces a QUIRK_NOHILUNS which will keep searches above LUN 7
happening for devices that report >= SCSI3 compliance. I doubt the latter
will be needed, but you never know.
This allowed me to randomly scan and attach > 500 disks at a time in
a situation where quirking for QUIRK_HILUNS wasn't practical (the
vendor id and product id changes of the virtualization changes
constantly).
Reviewed by: ken@freebsd.org, scottl@freebsd.org, gibbs@freebsd.org
MFC after: 2 weeks
to request from devices during the "long inquiry" portion of our probe.
This same bug was fixed in the 4.x stream a few years ago, but the fix
was never propogated to -current.
This fix is slightly different than in -stable:
o Use offsetof() instead of a hard coded constant so as the make
the code more self-explainatory.
o Round odd long inquiry lengths up so as to avoid tickling ignore
wide residue bugs in broken parallel SCSI devices running with a
wide transfer negotiation.
MFC: 3 days
for unknown events.
A number of modules return EINVAL in this instance, and I have left
those alone for now and instead taught MOD_QUIESCE to accept this
as "didn't do anything".