[PowerPC64] Fix OPAL IPMI driver
Closed, Public

Authored by luporl on Mar 25 2020, 5:27 PM.

Details

Summary

This change fixes a couple of issues with the OPAL IPMI driver and
implements a mechanism to detect timeouts and discard old messages left
in the receive queue, so that stale messages are not mistaken for
replies to new requests.

Details:

  • Implemented a mechanism to discard old messages left in the receive queue after timeouts.
  • Added proper handling for requests with timeout == 0.
  • Fixed wrong error logic in opal_ipmi_loop.
  • Fixed an issue when getting ipmi-interface-id from the device tree.
  • Added some debugging printf's.
Test Plan

Issue a couple of ipmitool commands on a POWER9 machine and verify that they return the expected results.
Examples:

ipmitool power status
ipmitool sensor

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/dev/ipmi/ipmi_opal.c
89–90 ↗(On Diff #69863)

I'm treating a timeout of 0 as meaning either no timeout or MAX_TIMEOUT; otherwise most requests end up timing out.
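
As a rough illustration of the handling described in this comment, a minimal sketch might look like the following. The helper name and the cap value are hypothetical; the driver's actual MAX_TIMEOUT value and units are not shown in this review.

```c
/*
 * Hypothetical cap, in milliseconds. The real driver's MAX_TIMEOUT
 * value is not shown in this review.
 */
#define SKETCH_MAX_TIMEOUT_MS 30000

/*
 * Map a caller-supplied timeout to an effective one.
 * A value of 0 is taken to mean "no timeout", modeled here as the cap,
 * rather than "give up immediately".
 */
static int
sketch_effective_timeout(int timo_ms)
{
    if (timo_ms <= 0 || timo_ms > SKETCH_MAX_TIMEOUT_MS)
        return (SKETCH_MAX_TIMEOUT_MS);
    return (timo_ms);
}
```

With this mapping, a request passing 0 waits up to the cap, while a request passing 500 keeps its own 500 ms limit.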

137 ↗(On Diff #69863)

Waiting 100 ms for the reply to a previously timed-out request is a guess.
One second seemed too long to me (especially on attach), while less than 100 ms seemed too short.

If 100 ms of wait time is not enough, stale messages may remain in the queue and be returned instead of replies to new requests.
My guess is that timeouts should not occur very often and a reply to a single request will not take a long time to arrive.

All timeouts I've noticed while debugging this driver occurred with requests that specified a timeout value of zero, which was being treated as a zero-second timeout instead of no timeout.
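
The idea of draining stale replies after a timeout can be sketched roughly as below. This is a simplified model, not the driver's code: the queue structure, function names, and poll count are all hypothetical, and a real driver would sleep between polls so that the loop covers roughly the 100 ms window.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_QLEN 8

/* Hypothetical stand-in for the firmware receive queue. */
struct sketch_queue {
    int    msgs[SKETCH_QLEN];   /* pending reply ids, oldest first */
    size_t head;
    size_t count;
};

/* Pop the oldest message; returns false when the queue is empty. */
static bool
sketch_recv(struct sketch_queue *q, int *id)
{
    if (q->count == 0)
        return (false);
    *id = q->msgs[q->head];
    q->head = (q->head + 1) % SKETCH_QLEN;
    q->count--;
    return (true);
}

/*
 * After a request has timed out, poll a bounded number of times and
 * throw away whatever arrives, so a late reply is not taken for the
 * answer to the next request. Returns the number of discarded messages.
 */
static int
sketch_discard_stale(struct sketch_queue *q, int max_polls)
{
    int discarded = 0;
    int id;

    for (int i = 0; i < max_polls; i++) {
        if (sketch_recv(q, &id))
            discarded++;
        /* A real driver would pause here before polling again. */
    }
    return (discarded);
}
```

Bounding the number of polls mirrors the trade-off in the comment: too small a window may leave stale messages behind, too large a window delays every request that follows a timeout.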

286 ↗(On Diff #69863)

The previous code would fill the first 4 bytes of sc_interface, which is not what we want on big-endian hosts.
This issue doesn't manifest itself when ipmi-interface-id is 0.
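
To see why this only shows up for nonzero ids, here is a small host-independent sketch. It assumes sc_interface is wider than 4 bytes (modeled as a 64-bit field) and emulates the big-endian byte layout explicitly; all function names are hypothetical.

```c
#include <stdint.h>

/* Interpret 8 bytes as a big-endian 64-bit integer, whatever the host is. */
static uint64_t
sketch_be64_value(const uint8_t b[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return (v);
}

/*
 * Problematic pattern: a 32-bit value is stored into the first 4 bytes
 * of a 64-bit field. On a big-endian host those are the high-order bytes.
 */
static uint64_t
sketch_fill_first4(uint32_t id)
{
    uint8_t field[8] = { 0 };

    field[0] = (uint8_t)(id >> 24);
    field[1] = (uint8_t)(id >> 16);
    field[2] = (uint8_t)(id >> 8);
    field[3] = (uint8_t)id;
    return (sketch_be64_value(field));
}

/* Safer pattern: read into a 32-bit temporary, then widen by assignment. */
static uint64_t
sketch_widen(uint32_t id)
{
    uint32_t tmp = id;

    return ((uint64_t)tmp);
}
```

In this model an id of 1 stored the first way reads back as 0x100000000, whereas an id of 0 reads back as 0 either way, which matches the observation that the problem stays hidden when ipmi-interface-id is 0.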

sys/dev/ipmi/ipmi_opal.c
72 ↗(On Diff #69863)

If you make this #else case '#define EPRINTF(fmt, ...) ((void)0)' you can avoid the '#if OPAL_IPMI_DEBUG > 0' below.

Might be able to avoid those regardless.
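
A minimal sketch of the suggested macro shape follows. The message prefix and the sketch_do_work helper are hypothetical, and `##__VA_ARGS__` is a GNU extension accepted by GCC and Clang; the driver's actual macro body is not shown in this review.

```c
#include <stdio.h>

#ifndef OPAL_IPMI_DEBUG
#define OPAL_IPMI_DEBUG 0
#endif

#if OPAL_IPMI_DEBUG > 0
/* Debug build: forward to printf with a hypothetical prefix. */
#define EPRINTF(fmt, ...) printf("ipmi_opal: " fmt, ##__VA_ARGS__)
#else
/* Non-debug build: expands to a no-op expression statement. */
#define EPRINTF(fmt, ...) ((void)0)
#endif

/* Hypothetical caller: no '#if OPAL_IPMI_DEBUG > 0' is needed at the call site. */
static int
sketch_do_work(int rc)
{
    EPRINTF("request finished, rc=%d\n", rc);
    return (rc);
}
```

Because the no-op variant is still a valid expression, call sites compile unchanged in both configurations, which is what lets the surrounding conditionals be dropped.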

This revision is now accepted and ready to land. Mar 25 2020, 8:18 PM
This revision was automatically updated to reflect the committed changes.