Micro-optimize OOA queue processing.

Description

  • Move ctl_get_cmd_entry() calls from every OOA traversal to the point where the request is first inserted, storing seridx in struct ctl_scsiio.
  • Move some checks out of the loop in ctl_check_ooa().
  • Replace checks for errors that cannot happen with asserts.
  • Transpose ctl_serialize_table, so that any OOA traversal accesses only one row (cache line). Compact its entries from enum to uint8_t.
  • Optimize static branch predictions in the hottest places.
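
The items above can be illustrated with a minimal, self-contained sketch. It is not the CTL code: the table contents, the three serialization classes, struct request, lookup_seridx(), enqueue() and check_blocked() are all hypothetical stand-ins, and the sketch assumes the table row is selected by the new request and the column by each queued one. The unlikely() macro is likewise an assumption built on the GCC/Clang __builtin_expect builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed branch hint: GCC/Clang builtin, no-op elsewhere. */
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

/* Hypothetical serialization classes; a real table would have more. */
enum { SER_READ, SER_WRITE, SER_SYNC, SER_COUNT };

/* Hypothetical actions, stored as uint8_t instead of as an enum. */
enum { ACT_PASS, ACT_BLOCK };

/*
 * Indexed [new request][queued request], so a traversal done on behalf
 * of one new request only ever reads the row ser_table[new->seridx].
 */
static const uint8_t ser_table[SER_COUNT][SER_COUNT] = {
	/* queued:      READ       WRITE      SYNC */
	[SER_READ]  = { ACT_PASS,  ACT_BLOCK, ACT_BLOCK },
	[SER_WRITE] = { ACT_BLOCK, ACT_BLOCK, ACT_BLOCK },
	[SER_SYNC]  = { ACT_BLOCK, ACT_BLOCK, ACT_BLOCK },
};

struct request {
	struct request *prev;	/* next older request, or NULL */
	uint8_t seridx;		/* cached once, at insertion time */
};

/* Toy stand-in for the per-command lookup; runs once per request. */
static uint8_t
lookup_seridx(uint8_t opcode)
{
	switch (opcode) {
	case 0x28:		/* READ(10) */
		return (SER_READ);
	case 0x2a:		/* WRITE(10) */
		return (SER_WRITE);
	default:		/* everything else, for this toy only */
		return (SER_SYNC);
	}
}

/* Append req as the newest entry and cache its table index. */
static void
enqueue(struct request **tail, struct request *req, uint8_t opcode)
{
	req->seridx = lookup_seridx(opcode);
	req->prev = *tail;
	*tail = req;
}

/* Return the newest older request that blocks req, or NULL if none. */
static struct request *
check_blocked(struct request *req)
{
	const uint8_t *row;
	struct request *q;

	/* An out-of-range index cannot happen here, so assert on it. */
	assert(req->seridx < SER_COUNT);
	row = ser_table[req->seridx];	/* hoisted out of the loop */

	for (q = req->prev; q != NULL; q = q->prev) {
		assert(q->seridx < SER_COUNT);
		if (unlikely(row[q->seridx] == ACT_BLOCK))
			return (q);
	}
	return (NULL);
}
```

In this sketch each step of the walk touches only the queued request itself and one byte of an already selected table row, which is the kind of access pattern the transposition and the cached seridx are meant to produce.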

Because OOA traversal is O(n) in queue depth, on deep LUN queues this
can be the hottest code path in CTL, and the additional 20% of IOPS I
see in some 4KB I/O tests is good to have in reserve. According to the
profiles, about 50% of the CPU time here is now spent in two memory
accesses per traversed request in the OOA queue.

Sponsored by: iXsystems, Inc.
MFC after: 2 weeks

Details

Provenance
mav authored on Feb 27 2021, 3:14 PM
Parents
rGbecaac3972f1: loader: use display pixel density for font autoselection