[PowerPC64] Fix OPAL IPMI driver
ClosedPublic

Authored by luporl on Mar 25 2020, 5:27 PM.

Details

Summary

This change fixes a couple of issues with the OPAL IPMI driver and
implements a mechanism to detect timeouts and discard stale messages left
in the receive queue, so that they are not mistaken for replies to new
requests.

Details:

  • Implemented a mechanism to discard old messages left in the receive queue after timeouts.
  • Added proper handling for requests with timeout == 0.
  • Fixed wrong error logic in opal_ipmi_loop.
  • Fixed an issue when getting ipmi-interface-id from the device tree.
  • Added some debugging printfs.

Test Plan

Issue a couple of ipmitool commands on a POWER9 machine and verify that they return the expected results.
Examples:

ipmitool power status
ipmitool sensor

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 30100
Build 27907: arc lint + arc unit

Event Timeline

sys/dev/ipmi/ipmi_opal.c
89–90

I'm treating a timeout of 0 as meaning no timeout (that is, MAX_TIMEOUT); otherwise most requests end up timing out.
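
A minimal sketch of that mapping, for illustration only. The helper name `effective_timeout` and the cap `EXAMPLE_MAX_TIMEOUT_S` are hypothetical; the driver's actual MAX_TIMEOUT value and the units it uses are not shown in this review.

```c
/*
 * Hypothetical cap, in seconds. The real MAX_TIMEOUT in ipmi_opal.c
 * may differ; this value is only for the example.
 */
#define EXAMPLE_MAX_TIMEOUT_S 30u

/*
 * Map the timeout carried by a request to the value the wait loop
 * should use. A request timeout of 0 is read as "no timeout" and is
 * replaced by the cap instead of expiring immediately.
 */
static unsigned int
effective_timeout(unsigned int timeout_s)
{
	if (timeout_s == 0 || timeout_s > EXAMPLE_MAX_TIMEOUT_S)
		return (EXAMPLE_MAX_TIMEOUT_S);
	return (timeout_s);
}
```

With this mapping, `effective_timeout(0)` yields the cap and a nonzero timeout below the cap passes through unchanged.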

137

The 100 ms wait for the reply to a previously timed-out request is a guess.
One second seemed too long to me (especially on attach), while less than 100 ms seemed too short.

If 100 ms of wait time is not enough, stale messages may remain in the queue and be returned instead of replies to new requests.
My guess is that timeouts should not occur very often and a reply to a single request will not take a long time to arrive.

All timeouts I've noticed while debugging this driver occurred with requests that specified a timeout value of zero, which was being treated as a zero-second timeout instead of no timeout.
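
The trade-off described above can be sketched in userspace roughly as follows. This is not the driver's code: `drain_stale`, `recv_fn`, `fake_recv`, and the 10 ms polling step are hypothetical, and only the 100 ms budget comes from the comment. The sketch polls a receive callback for the whole budget and counts every message it discards; a reply that arrives after the budget stays in the queue, which is the failure mode mentioned above.

```c
#include <stdbool.h>

#define DRAIN_BUDGET_MS	100	/* wait discussed in the comment above */
#define POLL_STEP_MS	10	/* hypothetical polling interval */

/* Receive callback: returns true if one message was dequeued at now_ms. */
typedef bool (*recv_fn)(void *ctx, int now_ms);

/*
 * After a timeout, poll the receive queue for up to DRAIN_BUDGET_MS and
 * throw away everything that shows up. Returns the number of discarded
 * messages. Time is simulated by the loop counter.
 */
static int
drain_stale(recv_fn recv, void *ctx)
{
	int discarded = 0;

	for (int waited = 0; waited <= DRAIN_BUDGET_MS; waited += POLL_STEP_MS) {
		while (recv(ctx, waited))
			discarded++;
	}
	return (discarded);
}

/* Simulated queue holding a single late reply to a timed-out request. */
struct late_reply {
	int	arrives_at_ms;	/* simulated arrival time */
	bool	delivered;	/* set once the reply has been dequeued */
};

static bool
fake_recv(void *ctx, int now_ms)
{
	struct late_reply *r = ctx;

	if (!r->delivered && now_ms >= r->arrives_at_ms) {
		r->delivered = true;
		return (true);
	}
	return (false);
}
```

In this model a reply arriving 40 ms after the timeout is discarded, while one arriving after 250 ms is missed and would still be queued when the next request is sent.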

282

The previous code would fill only the first 4 bytes of sc_interface, which is not what we want on big-endian hosts.
This issue doesn't manifest itself when ipmi-interface-id is 0.
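
To illustrate why this only shows up for nonzero IDs, here is a small standalone sketch. It assumes sc_interface is a 64-bit field and the property is a 32-bit cell; the function names are hypothetical and do not come from the driver.

```c
#include <stdint.h>
#include <string.h>

/*
 * Problematic pattern: copy a 32-bit value over the first 4 bytes of a
 * 64-bit field. On a big-endian host those are the most significant
 * bytes, so the stored value is shifted left by 32 bits.
 */
static uint64_t
store_first_four_bytes(uint32_t id)
{
	uint64_t field = 0;

	memcpy(&field, &id, sizeof(id));
	return (field);
}

/*
 * Endian-neutral pattern: keep the value in a 32-bit temporary and
 * widen it by plain assignment.
 */
static uint64_t
store_by_assignment(uint32_t id)
{
	uint64_t field = id;

	return (field);
}

/* Runtime probe so the example can be checked on either kind of host. */
static int
host_is_big_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return (first == 0);
}
```

An ID of 0 produces 0 with either pattern, which matches the observation that the problem stays hidden when ipmi-interface-id is 0.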

sys/dev/ipmi/ipmi_opal.c
72

If you make this #else case '#define EPRINTF(fmt, ...) ((void)0)' you can avoid the '#if OPAL_IPMI_DEBUG > 0' below.

Might be able to avoid those regardless.
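
A rough sketch of the suggestion, outside the kernel. The macro names OPAL_IPMI_DEBUG and EPRINTF are taken from the comment above; the "opal_ipmi: " prefix, the default debug level, and `example_send` are assumptions made for the example. `##__VA_ARGS__` is a GNU extension accepted by GCC and Clang.

```c
#include <stdio.h>

#ifndef OPAL_IPMI_DEBUG
#define OPAL_IPMI_DEBUG 0	/* assumed default: debugging off */
#endif

#if OPAL_IPMI_DEBUG > 0
#define EPRINTF(fmt, ...) printf("opal_ipmi: " fmt, ##__VA_ARGS__)
#else
/* Expands to a no-op expression, so call sites need no #if guard. */
#define EPRINTF(fmt, ...) ((void)0)
#endif

/* Hypothetical call site: the same code compiles with debugging on or off. */
static int
example_send(int fail)
{
	if (fail) {
		EPRINTF("send failed: %d\n", fail);
		return (-1);
	}
	return (0);
}
```

Because the disabled form is still a valid expression statement, it also behaves correctly inside unbraced if/else bodies.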

This revision is now accepted and ready to land. Mar 25 2020, 8:18 PM
This revision was automatically updated to reflect the committed changes.