9d9fd8b79f
- Move ctl_get_cmd_entry() calls from every OOA traversal to when the requests first inserted, storing seridx in struct ctl_scsiio. - Move some checks out of the loop in ctl_check_ooa(). - Replace checks for errors that can not happen with asserts. - Transpose ctl_serialize_table, so that any OOA traversal accessed only one row (cache line). Compact it from enum to uint8_t. - Optimize static branch predictions in hottest places. Due to O(n) nature on deep LUN queues this can be the hottest code path in CTL, and additional 20% of IOPS I see in some 4KB I/O tests are good to have in reserve. About 50% of CPU time here according to the profiles is now spent in two memory accesses per traversed request in OOA. Sponsored by: iXsystems, Inc. MFC after: 2 weeks