ioat(4): Fix race between process_events and reset_hw
In the case where a hardware error is detected during ioat_process_events, hardware may advance (by one descriptor, probably) and a subsequent ioat_process_events may race the intended ioat_reset_hw followup. In that case, the second process_events would observe a completion update that does not match the software "last_seen" status, and attempt to successfully complete already-failed descriptors.

Guard against this race with the resetting_cleanup flag.

Reviewed by:	bdrewery, markj
Sponsored by:	Dell EMC Isilon
parent 836055542f
commit 3a37091931
@@ -765,6 +765,15 @@ out:
 	mtx_lock(&ioat->submit_lock);
 	mtx_lock(&ioat->cleanup_lock);
 	ioat->quiescing = TRUE;
+	/*
+	 * This is safe to do here because we have both locks and the submit
+	 * queue is quiesced. We know that we will drain all outstanding
+	 * events, so ioat_reset_hw can't deadlock. It is necessary to
+	 * protect other ioat_process_event threads from racing ioat_reset_hw,
+	 * reading an indeterminate hw state, and attempting to continue
+	 * issuing completions.
+	 */
+	ioat->resetting_cleanup = TRUE;
 
 	chanerr = ioat_read_4(ioat, IOAT_CHANERR_OFFSET);
 	if (1 <= g_ioat_debug_level)
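The hunk shown here only sets resetting_cleanup on the error path. The consumer side is not visible in this excerpt. As a rough illustration of how such a flag can keep a later process_events call from completing descriptors against stale hardware state, here is a minimal user-space sketch. Everything in it is hypothetical: struct chan_model, the model_* functions, and the pthread mutex standing in for cleanup_lock are assumptions made for illustration, not the driver's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver softc; not the real struct. */
struct chan_model {
	pthread_mutex_t	cleanup_lock;
	bool		resetting_cleanup;	/* true while a reset is pending */
	uint32_t	last_seen;		/* last completion software handled */
	uint32_t	hw_status;		/* completion index the "hardware" reports */
	unsigned	completed_ok;		/* descriptors completed successfully */
};

static void
model_init(struct chan_model *c)
{
	pthread_mutex_init(&c->cleanup_lock, NULL);
	c->resetting_cleanup = false;
	c->last_seen = 0;
	c->hw_status = 0;
	c->completed_ok = 0;
}

/*
 * Returns true if the completion ring was examined, false if the call
 * backed off because a reset is pending.
 */
static bool
model_process_events(struct chan_model *c)
{
	pthread_mutex_lock(&c->cleanup_lock);
	if (c->resetting_cleanup) {
		/* Hardware state is indeterminate; leave it to the reset path. */
		pthread_mutex_unlock(&c->cleanup_lock);
		return (false);
	}
	while (c->last_seen != c->hw_status) {
		c->last_seen++;
		c->completed_ok++;
	}
	pthread_mutex_unlock(&c->cleanup_lock);
	return (true);
}

/* Error path: mark the reset as pending while holding the lock. */
static void
model_begin_reset(struct chan_model *c)
{
	pthread_mutex_lock(&c->cleanup_lock);
	c->resetting_cleanup = true;
	pthread_mutex_unlock(&c->cleanup_lock);
}

/* Assumed end of the reset: resynchronize state, then clear the flag. */
static void
model_finish_reset(struct chan_model *c)
{
	pthread_mutex_lock(&c->cleanup_lock);
	assert(c->resetting_cleanup);
	c->last_seen = 0;
	c->hw_status = 0;
	c->resetting_cleanup = false;
	pthread_mutex_unlock(&c->cleanup_lock);
}
```

In this model, a call to model_process_events that arrives between model_begin_reset and model_finish_reset returns without touching completed_ok, even if hw_status has moved on by one descriptor. Where the real driver clears the flag is not shown in this commit excerpt; clearing it at the end of the reset is an assumption of the sketch.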