acpi: ignore wake button press replayed by firmware on resume
Needs Review
Public

Authored by dteske on Sat, Jun 20, 4:17 PM.
Tags
None
Referenced Files
F160295712: D57712.diff
Tue, Jun 23, 12:45 AM
F160113671: IMG_1089.jpeg
Sun, Jun 21, 12:54 PM
Unknown Object (File)
Sun, Jun 21, 7:54 AM

Details

Summary

Some firmware delivers the power or sleep button press that woke the
system as an ordinary button press (Notify 0x80) shortly after resume,
rather than as the wakeup notification (Notify 0x02) the ACPI
specification requires for a button that is also a wake source.

On affected machines (e.g. the Framework Laptop 12, Intel Alder Lake-P)
the power button is a control-method device behind the embedded
controller. The EC latches the key press that woke the system across the
sleep transition and flushes it through its normal _Qxx query path as
soon as it is reinitialized on resume. The replayed press is
indistinguishable from a genuine one, so the kernel honors it as a fresh
suspend request and the machine immediately sleeps again -- an endless
resume->suspend loop that otherwise can only be broken by an s2idle
cycle.

The event cannot be filtered at its source: it arrives over the same EC
query path that also carries legitimate events (lid, AC, thermal,
battery), so suppressing the drain would lose real notifications.
Instead, record the time of resume and ignore a button-initiated suspend
that arrives within a short, sub-second window of it. The replay is
delivered synchronously with resume -- measured at a consistent ~162 ms
across many cycles on a Framework Laptop 12 -- whereas a deliberate press
cannot occur that quickly, as it happens well after the display is back.
This separates the replayed wake event from genuine input without
ignoring real presses for any perceptible time.

Spec-compliant firmware reports the wake as Notify 0x02, which is handled
on a different path and never reaches this check, so there is no change
in behavior on such systems.
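
A minimal sketch of the check described above, written as userspace C so it compiles on its own. It is not the patch itself: the names (`record_resume`, `handle_notify`, `REPLAY_WINDOW_US`, the `ACTION_*` values) are hypothetical, and the 500 ms window is an assumption, since the actual constant is not quoted in this summary. It only illustrates how Notify 0x02 bypasses the check while Notify 0x80 is compared against the recorded resume time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Notify codes as described in the summary. */
#define NOTIFY_BUTTON_WAKE   0x02   /* button was the wake source */
#define NOTIFY_BUTTON_PRESS  0x80   /* ordinary button press */

/* Assumed window length; the patch's real constant is not shown here. */
#define REPLAY_WINDOW_US     500000

enum button_action {
    ACTION_NONE,      /* unrelated notify code */
    ACTION_WAKE,      /* wake notification, handled on its own path */
    ACTION_SUSPEND,   /* genuine press: honor the suspend request */
    ACTION_IGNORED    /* press arrived inside the post-resume window */
};

struct button_state {
    int64_t resume_us;  /* monotonic time of the last resume, in us */
    bool    resumed;    /* false until a resume has been recorded */
};

/* Called from the (hypothetical) resume path. */
static void
record_resume(struct button_state *st, int64_t now_us)
{
    assert(st != NULL);
    st->resume_us = now_us;
    st->resumed = true;
}

/* Decide what to do with a button notification received at now_us. */
static enum button_action
handle_notify(const struct button_state *st, uint32_t notify, int64_t now_us)
{
    assert(st != NULL);
    switch (notify) {
    case NOTIFY_BUTTON_WAKE:
        /* Spec-compliant wake: never reaches the window check. */
        return (ACTION_WAKE);
    case NOTIFY_BUTTON_PRESS:
        if (st->resumed && now_us - st->resume_us < REPLAY_WINDOW_US)
            return (ACTION_IGNORED);
        return (ACTION_SUSPEND);
    default:
        return (ACTION_NONE);
    }
}
```

With a resume recorded at time T, a 0x80 notification at T + 162 ms yields `ACTION_IGNORED`, while the same notification two seconds later yields `ACTION_SUSPEND`. The window needs no explicit clearing, because it expires once enough time has passed.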

The replay window is a fixed compile-time constant rather than a tunable
on purpose: it tracks a hardware characteristic -- the EC's post-resume
replay latency -- not a user policy, so there is no value a user would
meaningfully choose. A deterministic in-kernel check is also preferable
to the time-based userspace holdoff that other systems rely on (for
example systemd-logind's 30 s HoldoffTimeoutSec). That said, if
reviewers would rather it be adjustable, exposing it as a hw.acpi.* OID
is a trivial follow-up.

Test Plan

Framework Laptop 12, FreeBSD main, hw.acpi.power_button_state=fw_suspend.

Before: enter S3 with the power button, then wake with the power button; the machine resumes and ~0.2 s later suspends itself again, looping indefinitely until broken with an s2idle (S2) cycle.

After: the same sequence resumes and stays up. Each resume logs, e.g.:

acpi0: ignoring power button press 162 ms after resume (firmware replayed the wake event)

A deliberate power-button press well after resume still suspends normally. Behavior is unchanged on spec-compliant firmware that reports the wake as Notify 0x02 (handled on a different path that never reaches this check).

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 74062
Build 70945: arc lint + arc unit

Event Timeline

Looks good. Some small suggestions to make things better; nothing serious.

sys/dev/acpica/acpi.c
4193

I'd put this behind bootverbose, and I suspect we'd be happier with microsecond elapsed time....

This revision is now accepted and ready to land.
Sat, Jun 20, 5:35 PM
sys/dev/acpica/acpi.c
4193

Agreed -- I questioned whether to leave it in but erred on the side of keeping it for its value. I'm more than happy to hide it behind verbose boot and move to microsecond resolution.

Without verbose boot, the message is gone.

With verbose boot on (at the loader: 7 (Options) -> 6 (Verbose) -> Enter), the message appears.

I put the laptop into S3 10 times, here is the debug line for each return from S3:

acpi0: ignoring power button press 162037 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 53012 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 199048 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 157035 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 173038 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 47010 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 46009 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 141031 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 157035 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 159037 us after resume (firmware replayed the wake event)
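
To make the spread of these ten samples easier to read, here is a small sketch that summarizes them. The values are copied from the log lines above; the helper name `summarize` and the struct are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The ten elapsed times from the log lines above, in microseconds. */
static const int64_t replay_us[] = {
    162037, 53012, 199048, 157035, 173038,
    47010, 46009, 141031, 157035, 159037,
};
#define N_SAMPLES (sizeof(replay_us) / sizeof(replay_us[0]))

struct replay_stats {
    int64_t min_us;
    int64_t max_us;
    int64_t sum_us;
};

/* Compute the minimum, maximum and sum of n samples (n must be > 0). */
static struct replay_stats
summarize(const int64_t *v, size_t n)
{
    struct replay_stats s;
    size_t i;

    assert(v != NULL && n > 0);
    s.min_us = s.max_us = s.sum_us = v[0];
    for (i = 1; i < n; i++) {
        if (v[i] < s.min_us)
            s.min_us = v[i];
        if (v[i] > s.max_us)
            s.max_us = v[i];
        s.sum_us += v[i];
    }
    return (s);
}
```

Calling `summarize(replay_us, N_SAMPLES)` gives a minimum of 46009 us, a maximum of 199048 us and a sum of 1294292 us, a mean of roughly 129 ms. Every sample is under 200 ms.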

(update to code is inbound soon)

Address review (imp): hide the ignore message behind bootverbose and report elapsed time in microseconds.
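
A rough userspace sketch of what this update amounts to, assuming the message text shown in the log lines above. The function `format_ignore_msg` and its `verbose` parameter are hypothetical stand-ins; in the kernel the gate would presumably be the global `bootverbose` and the output would go through `device_printf()`, but the revised diff is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format the "ignoring" message into buf only when verbose is set.
 * Returns the number of characters snprintf() reports, or 0 when the
 * message is suppressed (buf is then left as an empty string).
 */
static int
format_ignore_msg(char *buf, size_t len, bool verbose, int64_t elapsed_us)
{
    assert(buf != NULL && len > 0);
    if (!verbose) {
        buf[0] = '\0';
        return (0);
    }
    return (snprintf(buf, len,
        "acpi0: ignoring power button press %lld us after resume "
        "(firmware replayed the wake event)", (long long)elapsed_us));
}
```

With `verbose` false the buffer stays empty, matching "Without verbose boot, the message is gone." With `verbose` true and an elapsed time of 162037, the buffer holds the first log line quoted above.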

This revision now requires review to proceed.
Sat, Jun 20, 7:06 PM

I'm not fully convinced this is a firmware issue.

While debugging ACPI, I have seen cases where handlers appear to be registered repeatedly (I observed counts around 131 registrations) and a fair amount of event spam after resume.

I haven't proven a connection to this bug, but it suggests we may also want to investigate our ACPI handling before assuming the replayed event is entirely a firmware problem.

Also, do other operating systems have to carry a similar quirk for this hardware? This laptop is well tested on Arch Linux, so someone must have noticed this.

do other operating systems have to carry a similar quirk for this hardware? This laptop is well tested on Arch Linux, so someone must have noticed this.

Linux systemd HoldoffTimeoutSec defaults to 30 seconds where all events are ignored after a resume for a spell.

I didn’t want to do that for multiple reasons.

What if you actually needed or wanted to put the machine back to sleep by pressing the button again, but didn’t want to be forced to either (a) wait 30s or (b) configure some setting?

I imagined how I would feel if, after resuming, I saw a very low battery, panicked, pressed the button, and was astonished when it did nothing.

So instead of doing what Linux does, and ignoring events for so long after resume, I tested.

I was pleasantly surprised by the consistency of low (sub-500ms) timing and was quite pleased we could get by without the hack Linux uses.

That being said, we could do what Linux does, and we could also make the wait configurable, but ultimately I decided “just doing what Linux does” is a low bar and that we could do better.

With this patch, if I bring it out of S3 with the button press, subsequently determine I need to put it back into S3 and dash to a power outlet, I have that option since I can hit the button again in less than 1s after querying the battery state.

I'm not fully convinced this is a firmware issue.

While debugging ACPI, I have seen cases

This is less an “issue” with firmware than a reality of modern firmware. Older hardware and firmware provide enough information to differentiate between suspend and resume off the same button press. Modern hardware hides the button behind a state that we cannot filter.

I did not intend to imply this was something that should be fixed in firmware; it is on us to come up with a fix in our own ACPI code, as I have done here.

I don’t think we should copy Linux’s 30-second holdoff either. If 500ms is enough, that’s better than ignoring legitimate button presses for half a minute.

One question: is the Framework running the latest BIOS/EC firmware available? A lot of suspend/resume bugs end up being fixed in firmware updates, especially around button handling, debounce logic or device reattachment during resume.

Framework has also been moving parts of its lineup toward coreboot, while others still use Insyde firmware.

Do you know which firmware stack your machine is running?

One question: is the Framework running the latest BIOS/EC firmware available? A lot of suspend/resume bugs end up being fixed in firmware updates, especially around button handling, debounce logic or device reattachment during resume.

I did not upgrade the BIOS yet, but the laptop is only 5 weeks old. That being said, it is still possible that the BIOS is out of date; though this is a relatively new laptop model as well.

I will check the BIOS type and version in a moment.

Do you know which firmware stack your machine is running?

It’s an Insyde BIOS (picture below); InsydeH2O Version LFR20.03.04

IMG_1089.jpeg (3×4 px, 2 MB)

You are 3 versions behind: https://knowledgebase.frame.work/framework-laptop-12-bios-and-driver-releases-13th-gen-intel-core-HyrqeX2ex

The changelog of the EC is opaque, but it looks like there was a code update.

Framework has also been moving parts of its lineup toward coreboot, while others still use AMI firmware.

All Framework Computer systems ship with Insyde BIOS and a fork of chrome EC firmware.
Coreboot is just an experiment at the moment.
EC firmware is open source: https://github.com/FrameworkComputer/EmbeddedController/tree/fwk-sunflower-26784/zephyr/program/framework/sunflower

Thanks for the clarification Daniel and the link to the EC.

I'm going to test this on some computers I have here.

I checked the firmware behavior: it first triggers a wake-up via the PWRBTN# pin.

Here is how Linux handles it: https://github.com/torvalds/linux/commit/e71eeb2a6bcc
It does not have a timeout; it just ignores power button ACPI notify events during sleep, and it requires a wake source.

Linux systemd HoldoffTimeoutSec defaults to 30 seconds where all events are ignored after a resume for a spell.

The documentation says only lid events are ignored, not the power button, so this is probably not the same logic.
https://www.freedesktop.org/software/systemd/man/latest/logind.conf.html#HoldoffTimeoutSec=

Daniel — thank you; this is exactly the kind of authoritative input that makes the change better. The PWRBTN# detail and the EC source link are much appreciated, and your correction on HoldoffTimeoutSec (lid-only) is well taken — I had mischaracterized it.

On the Linux handling (e71eeb2a6bcc): that mechanism fits the freeze/thaw case it was written for, where the press is delivered synchronously while the button is still marked suspended. On this machine the EC latches the press and replays it through its _Qxx path after resume has completed — consistently tens-to-hundreds of ms later — so a flag set at suspend and cleared on resume would already be clear by the time the replay lands. A short elapsed-time check catches it without depending on resume ordering, and it has the gentler failure mode: it reverts to accepting input unconditionally within the window, leaving no state that could wedge the button if a future wake path behaves differently.

Worth underscoring for anyone following along: the check only ever sees the replayed Notify 0x80 and never spec-compliant Notify 0x02, so on firmware that reports the wake correctly it does nothing at all — there's no firmware version to chase. The aim is simply to make the machine work as it ships; absorbing this in the OS is the friendliest path for someone who just bought the hardware, and I'm glad to be doing it alongside Framework rather than pointing at anyone.

@guest-seuros — thanks for digging into this, and for offering to test on hardware you have on hand; more data points across machines would be genuinely valuable, so please do share what you find.

You are 3 versions behind ... looks like there was a code update.

I don't think we need to chase a specific BIOS/EC build, because the check is gated on behavior, not firmware. Spec-compliant firmware reports the wake as Notify 0x02, which never reaches this code, so the change is inert there and acts only when the press is replayed as an ordinary Notify 0x80. The intent is to make the machine usable as it ships, without asking the owner to flash firmware first just to get a working system.

While debugging ACPI, I have seen cases where handlers appear to be registered repeatedly (I observed counts around 131 registrations) and a fair amount of event spam after resume.

That sounds like it may be its own thing, separate from this resume→suspend loop. If you can capture a trace (dmesg with bootverbose, plus the conditions), I'm glad to look; if there's a real registration leak it deserves its own bug rather than riding on this one.

@obiwac — great seeing you at BSDCan! Looks like your comment came through empty on my end (which is all my "??" meant — sorry if it read as terse). I'd really value your take on this one; you know this corner of ACPI better than I do, having written the s2idle loop, and I'd rather shape it with your input than land it without. Whenever you get a sec, mind re-posting whatever didn't make it through?

so a flag set at suspend and cleared on resume would already be clear by the time the replay lands

Hmm we don't have any issues on Linux though.
I think the system would still be asleep by the time you receive the power button event.

spec-compliant Notify 0x02

Huh, you are right: https://uefi.org/specs/ACPI/6.5/04_ACPI_Hardware_Specification.html#control-method-power-button
"Upon waking from a G1 sleeping state, the AML event handler generates a notify command with the code of 0x2 to indicate it was responsible for waking the system."

I don't expect that we are the only firmware with this behavior though, so even if we change it, it's good to handle the case of only 0x80 notification.

Hmm we don't have any issues on Linux though.
I think the system would still be asleep by the time you receive the power button event.

That's the part worth pinning down, because it's measurable. On this machine the press doesn't arrive during firmware sleep — it comes up through the EC's _Qxx query drain after the kernel is already running again, which is why I can timestamp it at all: 46–199 ms past the wake point, consistently, across many cycles. Linux's suspended-flag handles its case cleanly because the button device stays marked suspended through thaw; the FreeBSD analog would have to remain set until after that post-resume EC drain, and clearing it in the natural resume path would race the replay. I'd rather not hinge correctness on that ordering, so a short, self-clearing window absorbs it instead — it reverts to accepting input no matter what. Happy to share the raw trace if it's useful.
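
The ordering argument above can be sketched as a toy model. This is an illustration only, not FreeBSD or Linux code: both structs and both predicates are hypothetical, the flag model assumes the flag is cleared in the resume path before the EC drain runs, and the 500 ms window is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REPLAY_WINDOW_US 500000   /* assumed; not the patch's actual value */

/* Approach 1: a flag set at suspend and cleared in the resume path. */
struct flag_model {
    bool suspended;
};

/* Approach 2: remember when resume happened and compare elapsed time. */
struct window_model {
    int64_t resume_us;
};

/* The flag model accepts a press whenever the flag is already clear. */
static bool
flag_accepts_press(const struct flag_model *m)
{
    assert(m != NULL);
    return (!m->suspended);
}

/* The window model accepts a press only once the window has elapsed. */
static bool
window_accepts_press(const struct window_model *m, int64_t now_us)
{
    assert(m != NULL);
    return (now_us - m->resume_us >= REPLAY_WINDOW_US);
}
```

If resume happens at t = 0 and clears the flag, a replay at 46009 us is accepted by the flag model (the loop case) but rejected by the window model. A press at one second is accepted by both.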

... even if we change it, it's good to handle the case of only 0x80 notification.

Exactly — and thank you; that's the crux. Other firmware will land here too, so handling the 0x80-only case is the durable thing to do. I appreciate you confirming the spec language.

I've also pulled obiwac in on the mechanism — he wrote our s2idle loop — so I'm glad to align the shape with whatever the ACPI maintainers prefer. Either way, your input has made this better.