powerpc: Implement RTAS event log support
Needs Review, Public

Authored by ivy on Wed, May 13, 10:32 AM.

Details

Summary

The IBM Run-Time Abstraction Services (RTAS) provides support for an
event log that allows the OS to receive hardware and firmware events.

Two types of event log are provided: a general log that needs to be
scanned at regular intervals (typically once per second), and an
exception log which is interrupt-triggered. Handle the log scan in
rtasdev itself. For the exception logs, add a new rtas_esrc driver
which attaches to each event source.

For most event types, we just log the event. For shutdown request
events, call shutdown_nice(RB_POWEROFF), which allows pSeries VM
shutdown requests to work.

This implementation is based on "Linux on Power Architecture Reference"
revision 2.9 (August 12, 2020).

MFC after: 2 weeks

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 73074
Build 69957: arc lint + arc unit

Event Timeline

ivy requested review of this revision. Wed, May 13, 10:32 AM

fix big-endian systems

also, remove some dead code

adrian added inline comments.
sys/powerpc/include/rtas.h
133

is "unsigned" guaranteed to always be a byte here?

use uint8_t for bitfields

i'm not sure how much difference it makes in practice, but since we're treating
these fields as individual bytes anyway, it's probably more correct.
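
To illustrate the point about bit-field base types, here is a minimal sketch, not taken from the diff. The struct and field names are hypothetical, and the bit positions in the accessors are assumptions made only for the example. Bit-fields of type `uint8_t` are implementation-defined in ISO C, although GCC and Clang accept them and pack them into a single byte, whereas a plain `unsigned` bit-field usually makes the struct as large as an `unsigned`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-byte header; names are illustrative, not from rtas.h. */
struct hdr_unsigned {
	unsigned severity : 3;
	unsigned disposition : 2;
	unsigned extended : 1;
	unsigned reserved : 2;
};

struct hdr_u8 {
	uint8_t severity : 3;
	uint8_t disposition : 2;
	uint8_t extended : 1;
	uint8_t reserved : 2;
};

/* With GCC and Clang the uint8_t variant occupies exactly one byte. */
static_assert(sizeof(struct hdr_u8) == 1, "header must be one byte");

/*
 * Bit order inside a bit-field is also implementation-defined, so
 * explicit shifts and masks are a portable alternative.  The layout
 * assumed here (severity in the top three bits) is an example only.
 */
static inline unsigned
hdr_severity(uint8_t b)
{
	return (b >> 5);
}

static inline unsigned
hdr_disposition(uint8_t b)
{
	return ((b >> 3) & 0x3);
}
```

The shift-and-mask accessors behave the same on big-endian and little-endian hosts, which bit-fields do not guarantee.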

ivy marked an inline comment as done. Wed, May 13, 2:44 PM
sys/powerpc/pseries/rtas_dev.c
585–588

Do you expect to need to extend this in the future, adding other event type handling? If so, maybe consider a jump table struct, or a switch here, to make it easier.

Also, that buffer pointer math looks error-prone, so consider simplifying that. You could simply pass the whole buffer down instead, and use a union to get the vendor data you need.
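
A rough sketch of what the switch-plus-union suggestion could look like follows. Everything in it is hypothetical: the type codes, the payload structs, and the action enum are placeholders invented for the example, not values from the diff or from the RTAS specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder type codes, not the real RTAS values. */
enum evt_type { EVT_EPOW = 1, EVT_HOTPLUG = 2 };

/* What the caller should do after dispatch. */
enum evt_action { ACT_LOG_ONLY, ACT_SHUTDOWN };

/* Hypothetical EPOW payload: non-zero means a shutdown was requested. */
struct evt_epow {
	uint8_t shutdown_requested;
};

struct evt {
	uint8_t type;
	union {
		struct evt_epow epow;	/* valid when type == EVT_EPOW */
		uint8_t raw[16];	/* untyped view of the payload */
	} u;
};

static inline enum evt_action
evt_handle_epow(const struct evt_epow *ep)
{
	return (ep->shutdown_requested ? ACT_SHUTDOWN : ACT_LOG_ONLY);
}

/* The handler receives the whole event and picks the union member. */
static inline enum evt_action
evt_dispatch(const struct evt *ev)
{
	assert(ev != NULL);
	switch (ev->type) {
	case EVT_EPOW:
		return (evt_handle_epow(&ev->u.epow));
	case EVT_HOTPLUG:	/* not handled yet; just log it */
	default:
		return (ACT_LOG_ONLY);
	}
}
```

Supporting another event type then means adding one `case` and one union member, with no pointer arithmetic in the caller.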

666

Is the EPOW shutdown request a request, or an order? Unless this is the hypervisor saying that the partition will be shut down, I think this could instead be propagated up to devd (devctl_notify()) to let the user decide how to handle it, and add an entry to devd.conf (or, better yet, create a pseries devd.conf) to default to shutdown.

sys/powerpc/pseries/rtas_dev.c
585–588

the only other one i think we definitely want to handle is hotplug events, but i'm not planning on looking at that any time soon. so a jump table is probably overkill, but i'll try a switch and see if it looks nicer.

666

it depends on the type/reason code, but i think in all cases, the user will want to shut down here:

  • if the user interacted with the power button or the BMC, they obviously want the system to shut down
  • if a temperature threshold is exceeded or the UPS requested a shutdown, we are going to turn off anyway (because either the hardware is about to melt or the UPS battery is about to run out), so we should do it cleanly
  • if the VM host is shutting down / rebooting, we are going to turn off anyway, so as above

i'm not opposed to making this a devd event, but i don't think any other architecture works that way, and we'd want to make sure the event was at least roughly portable between platforms, so it's probably out of scope for this diff.

incidentally, OPAL does this the same way, see sys/powerpc/powernv/opal_dev.c:opal_handle_shutdown_message().

cc: @imp

sys/powerpc/pseries/rtas_dev.c
666

OPAL does it this way because the message is reporting that the machine is going down shortly, and there's no way to stop it. The only safe thing to do in that case is to shut down from the kernel, and hope that it cleans up before power is removed.

rf

re: pointer arithmetic, i have a better idea of how to do this, but let's
see if this is a better way to handle shutdown in the meantime.

improve buffer handling

instead of parsing the buffer in rtas_handle_epow(), add a new function
to find a section by ID within the buffer. check the entire log event
fits within the buffer before doing anything else.

also, stylise and fix some endianness confusion in the platform structs.
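
As a sketch of the section lookup described in this update, the helper below walks a buffer of variable-length sections and returns the one with a matching ID. It is not the code from the diff. It assumes each section starts with a 2-byte big-endian ID followed by a 2-byte big-endian length that includes the header, and the names `find_section`, `be16_at` and `SECT_HDR_LEN` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	SECT_HDR_LEN	4	/* assumed: 2-byte id + 2-byte length */

/* Decode a big-endian 16-bit value regardless of host byte order. */
static inline uint16_t
be16_at(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Return a pointer to the first section whose id matches, storing its
 * length in *sect_len, or NULL if it is absent or the log is malformed.
 * `len` is the number of valid bytes in `buf`.
 */
static inline const uint8_t *
find_section(const uint8_t *buf, size_t len, uint16_t id, uint16_t *sect_len)
{
	size_t off = 0;

	assert(buf != NULL && sect_len != NULL);
	while (len - off >= SECT_HDR_LEN) {
		uint16_t sid = be16_at(buf + off);
		uint16_t slen = be16_at(buf + off + 2);

		/* Reject sections that are too short or overrun the buffer. */
		if (slen < SECT_HDR_LEN || slen > len - off)
			return (NULL);
		if (sid == id) {
			*sect_len = slen;
			return (buf + off);
		}
		off += slen;
	}
	return (NULL);
}
```

Because every section length is checked against the remaining bytes before the offset advances, `off` never exceeds `len`, and a handler that receives the returned pointer can trust that `*sect_len` bytes are readable.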

remove an unused prototype