devq_openings counter lost its meaning after allocation queues has gone.
held counter is still meaningful, but problematic to update due to separate
locking of CCB allocation and queuing.
To fix that replace devq_openings counter with allocated counter. held is
now calculated on request as difference between number of allocated, queued
and active CCBs.
MFC after: 1 month
reduce lock congestion and improve SMP scalability of the SCSI/ATA stack,
preparing the ground for the coming next GEOM direct dispatch support.
Replace big per-SIM locks with bunch of smaller ones:
- per-LUN locks to protect device and peripheral drivers state;
- per-target locks to protect list of LUNs on target;
- per-bus locks to protect reference counting;
- per-send queue locks to protect queue of CCBs to be sent;
- per-done queue locks to protect queue of completed CCBs;
- remaining per-SIM locks now protect only HBA driver internals.
While holding LUN lock it is allowed (while not recommended for performance
reasons) to take SIM lock. The opposite acquisition order is forbidden.
All the other locks are leaf locks, that can be taken anywhere, but should
not be cascaded. Many functions, such as: xpt_action(), xpt_done(),
xpt_async(), xpt_create_path(), etc. are no longer require (but allow) SIM
lock to be held.
To keep compatibility and solve cases where SIM lock can't be dropped, all
xpt_async() calls in addition to xpt_done() calls are queued to completion
threads for async processing in clean environment without SIM lock held.
Instead of single CAM SWI thread, used for commands completion processing
before, use multiple (depending on number of CPUs) threads. Load balanced
between them using "hash" of the device B:T:L address.
HBA drivers that can drop SIM lock during completion processing and have
sufficient number of completion threads to efficiently scale to multiple
CPUs can use new function xpt_done_direct() to avoid extra context switch.
Make ahci(4) driver to use this mechanism depending on hardware setup.
Sponsored by: iXsystems, Inc.
MFC after: 2 months
Change CCB queue resize logic to be able safely handle overallocations:
- (re)allocate queue space in power of 2 chunks with 64 elements minimum
and never shrink it; with only 4/8 bytes per element size is insignificant.
- automatically reallocate the queue to double size if it is overflowed.
- if queue reallocation failed, store extra CCBs in unsorted TAILQ,
fetching them back as soon as some queue element is freed.
To free space in CCB for TAILQ linking, change highpowerq from keeping
high-power CCBs to keeping devices frozen due to high-power CCBs.
This encloses all pieces of queue resize logic inside of cam_queue.[ch],
removing some not obvious duties from xpt_release_ccb().
r248917, r248918, r248978, r249001, r249014, r249030:
Remove multilevel freezing mechanism, implemented to handle specifics of
the ATA/SATA error recovery, when post-reset recovery commands should be
allocated when queues are already full of payload requests. Instead of
removing frozen CCBs with specified range of priorities from the queue
to provide free openings, use simple hack, allowing explicit CCBs over-
allocation for requests with priority higher (numerically lower) then
CAM_PRIORITY_OOB threshold.
Simplify CCB allocation logic by removing SIM-level allocation queue.
After that SIM-level queue manages only CCBs execution, while allocation
logic is localized within each single device.
Suggested by: gibbs
- Unify bus reset/probe sequence. Whenever bus attached at boot or later,
CAM will automatically reset and scan it. It allows to remove duplicate
code from many drivers.
- Any bus, attached before CAM completed it's boot-time initialization,
will equally join to the process, delaying boot if needed.
- New kern.cam.boot_delay loader tunable should help controllers that
are still unable to register their buses in time (such as slow USB/
PCCard/ CardBus devices), by adding one more event to wait on boot.
- To allow synchronization between different CAM levels, concept of
requests priorities was extended. Priorities now split between several
"run levels". Device can be freezed at specified level, allowing higher
priority requests to pass. For example, no payload requests allowed,
until PMP driver enable port. ATA XPT negotiate transfer parameters,
periph driver configure caching and so on.
- Frozen requests are no more counted by request allocation scheduler.
It fixes deadlocks, when frozen low priority payload requests occupying
slots, required by higher levels to manage theit execution.
- Two last changes were holding proper ATA reinitialization and error
recovery implementation. Now it is done: SATA controllers and Port
Multipliers now implement automatic hot-plug and should correctly
recover from timeouts and bus resets.
- Improve SCSI error recovery for devices on buses without automatic sense
reporting, such as ATAPI or USB. For example, it allows CAM to wait, while
CD drive loads disk, instead of immediately return error status.
- Decapitalize diagnostic messages and make them more readable and sensible.
- Teach PMP driver to limit maximum speed on fan-out ports.
- Make boot wait for PMP scan completes, and make rescan more reliable.
- Fix pass driver, to return CCB to user level in case of error.
- Increase number of retries in cd driver, as device may return several UAs.
is an application space macro and the applications are supposed to be free
to use it as they please (but cannot). This is consistant with the other
BSD's who made this change quite some time ago. More commits to come.
Move handling of CAM_AUTOSENSE_FAIL into block dealing with
all other scsi status errors.
cam_queue.c:
cam_queue.h:
Fix 'off by one' heap bug in a more efficient manner. Since
heap algorithms like to deal with indexes started from 1,
offset our heap array pointer at allocation time to make this
so for a C environment. This makes the implementation of the
algorithm a bit more efficient.
cam_xpt.c:
Use macros for accessing the head of the heap so that code
is isolated from implementation details of the heap.