Page MenuHomeFreeBSD

acpi: ignore wake button press replayed by firmware on resume
AcceptedPublic

Authored by dteske on Sat, Jun 20, 4:17 PM.
Tags
None
Referenced Files
F160428136: D57712.id180185.diff
Wed, Jun 24, 8:11 AM
F160414683: D57712.id180171.diff
Wed, Jun 24, 6:33 AM
F160406141: D57712.diff
Wed, Jun 24, 4:45 AM
F160398908: D57712.diff
Wed, Jun 24, 3:02 AM
Unknown Object (File)
Tue, Jun 23, 8:08 PM
Unknown Object (File)
Tue, Jun 23, 12:45 AM
F160113671: IMG_1089.jpeg
Sun, Jun 21, 12:54 PM
Unknown Object (File)
Sun, Jun 21, 7:54 AM

Details

Summary

Some firmware delivers the power or sleep button press that woke the system as an ordinary button press (Notify 0x80) shortly after resume, rather than as the wakeup notification (Notify 0x02) the ACPI specification requires for a button that is also a wake source.

On affected machines (e.g. the Framework Laptop 12, Intel Raptor Lake-P) the power button is a control-method device behind the embedded controller. The EC latches the key press that woke the system across the sleep transition and flushes it through its normal _Qxx query path as soon as it is reinitialized on resume. The replayed press is indistinguishable from a genuine one, so the kernel honors it as a fresh suspend request and the machine suspends again immediately after waking; it cannot be kept awake with the button.

The event cannot be filtered at its source: it arrives over the same EC query path that also carries legitimate events (lid, AC, thermal, battery), so suppressing the drain would lose real notifications. Instead, record the time of resume and ignore a button-initiated suspend that arrives within a short grace window of it. The timestamp is taken before DEVICE_RESUME() re-initializes the EC, so it is set before the replay can be processed on the ACPICA notify taskqueue; otherwise the replay can be evaluated before the timestamp is written and slip through. Measured from that point, the replay lands at ~600 ms across many cycles on a Framework Laptop 12, whereas a deliberate press cannot occur that quickly -- it happens well after the display is back -- so a one-second window separates the two without ignoring real presses for any perceptible time.

Spec-compliant firmware reports the wake as Notify 0x02, which is handled on a different path and never reaches this check, so there is no change in behavior on such systems.

The replay window is a fixed compile-time constant rather than a tunable on purpose: it tracks a hardware characteristic -- the EC's post-resume replay latency -- not a user policy, so there is no value a user would meaningfully choose.

PR: 296243
MFC after: 2 weeks

Test Plan

Hardware: Framework Laptop 12 (Intel Raptor Lake-P). GENERIC kernel built from this change, booted with bootverbose.

  1. Enter S3 with the power button; resume with the power button. Repeat for 5 cycles.
  2. Each cycle, the firmware-replayed press is ignored and the machine stays awake. dmesg shows, e.g.: acpi0: ignoring power button press 615204 us after resume (firmware replayed the wake event) with measured latencies of ~603-615 ms across the cycles.
  3. The deliberate press that begins each subsequent cycle (well over 1 s after the prior resume, after the display is back) suspends normally -- confirming genuine presses outside the window are honored.
  4. No dmesg errors; suspend/resume otherwise unaffected.

Before the rework, the same procedure re-suspended immediately on resume and the machine could not be kept awake with the button.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 74178
Build 71061: arc lint + arc unit

Event Timeline

looks good, some small suggestions to make things better, nothing that serious.

sys/dev/acpica/acpi.c
4186

I'd put this behind bootverbose, and I suspect we'd be happier with microsecond elapsed time....

This revision is now accepted and ready to land.Sat, Jun 20, 5:35 PM
sys/dev/acpica/acpi.c
4186

Agree -- I was questioning myself whether to leave it in but erred on embracing the value, though more than happy to hide it behind verbose boot and move to microsecond resolution

Without verbose boot, the message is gone.

With verbose boot on; at loader, 7 (Options) -> 6 (Verbose) -> Enter ...

I put the laptop into S3 10 times, here is the debug line for each return from S3:

acpi0: ignoring power button press 162037 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 53012 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 199048 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 157035 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 173038 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 47010 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 46009 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 141031 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 157035 us after resume (firmware replayed the wake event)
acpi0: ignoring power button press 159037 us after resume (firmware replayed the wake event)

(update to code is inbound soon)

Address review (imp): hide the ignore message behind bootverbose and report elapsed time in microseconds.

This revision now requires review to proceed.Sat, Jun 20, 7:06 PM

I m not fully convinced this is a firmware issue.

While debugging ACPI, I have seen cases where handlers appear to be registered repeatedly (I observed counts around 131 registrations) and a fair amount of event spam after resume.

I haven't proven a connection to this bug, but it suggests we may also want to investigate our ACPI handling before assuming the replayed event is entirely a firmware problem.

Also, do other operating systems have to carry a similar quirk for this hardware? This laptop is well tested in arch linux, so someone must have noticed this.

do other operating systems have to carry a similar quirk for this hardware? This laptop is well tested in arch linux, so someone must have noticed this.

Linux systemd HoldoffTimeoutSec defaults to 30 seconds where all events are ignored after a resume for a spell.

I didn’t want to do that for multiple reasons.

What if you actually needed/wanted to put the thing back to sleep by pressing the button again but didn’t want to be forced to either (a) wait 30s or (b) configure some setting.

I imagined how I would feel if, after resuming and seeing very low battery and panicking, pressing the button, then being astonished that the button did nothing when I pressed it.

So instead of doing what Linux does, and ignoring events for so long after resume, I tested.

I was pleasantly surprised by the consistency of low (sub-500ms) timing and was quite pleased we could get by without the hack Linux uses.

That being said, we could do what Linux does, and we could also make the wait configurable, but ultimately I decided “just doing what Linux does” is a low bar and that we could do better.

With this patch, if I bring it out of S3 with the button press, subsequently determine I need to put it back into S3 and dash to a power outlet, I have that option since I can hit the button again in less than 1s after querying the battery state.

I m not fully convinced this is a firmware issue.

While debugging ACPI, I have seen cases

This is less of an “issue” with firmware but a reality of modern firmwares. Older hardware/firmware provides enough info to differentiate between suspend and resume off the same button press. Modern hardware hides the button behind a state that we cannot filter.

I did not intend to imply this was something that should be fixed in firmware — it is on us to come up with a fix in our own ACPI code; as I have done here

I don’t think we should copy Linux 30-second holdoff either. If 500ms is enough, that’s better than ignoring legitimate button presses for half a minute.

One question: is the Framework running the latest BIOS/EC firmware available? A lot of suspend/resume bugs end up being fixed in firmware updates, especially around button handling, debounce logic or device reattachment during resume.

Framework has also been moving parts of its lineup toward coreboot, while others still use Insyde firmware.

Do you know which firmware stack your machine is running ?

One question: is the Framework running the latest BIOS/EC firmware available? A lot of suspend/resume bugs end up being fixed in firmware updates, especially around button handling, debounce logic or device reattachment during resume.

I did not upgrade the BIOS yet, but the laptop is only 5 weeks old. That being said, it is still possible that the BIOS is out of date; though this is a relatively new laptop model as well.

I will check the BIOS type and version in a moment.

Do you know which firmware stack your machine is running ?

It’s an Insyde BIOS (picture below); InsydeH2O Version LFR20.03.04

IMG_1089.jpeg (3×4 px, 2 MB)

You are 3 versions behind : https://knowledgebase.frame.work/framework-laptop-12-bios-and-driver-releases-13th-gen-intel-core-HyrqeX2ex

The changelog of the EC is opaque, but it looks like there was code update.

Framework has also been moving parts of its lineup toward coreboot, while others still use AMI firmware.

All Framework Computer systems ship with Insyde BIOS and a fork of chrome EC firmware.
Coreboot is just an experiment at the moment.
EC firmware is open source: https://github.com/FrameworkComputer/EmbeddedController/tree/fwk-sunflower-26784/zephyr/program/framework/sunflower

Thanks for the clarification Daniel and the link to the EC.

I'm going to test this on some computers i have here.

I checked the firmware behavior, it first triggers a wake-up via PWRBTN# pin

Here is how linux handles it: https://github.com/torvalds/linux/commit/e71eeb2a6bcc
It does not have a timeout, it just ignores powerbutton ACPI notify events in sleep, it requires a wake source.

Linux systemd HoldoffTimeoutSec defaults to 30 seconds where all events are ignored after a resume for a spell.

Documentation says just lid events are ignored, so not power button, so this is probably not the same logic.
https://www.freedesktop.org/software/systemd/man/latest/logind.conf.html#HoldoffTimeoutSec=

I checked the firmware behavior, it first triggers a wake-up via PWRBTN# pin

Here is how linux handles it: https://github.com/torvalds/linux/commit/e71eeb2a6bcc
It does not have a timeout, it just ignores powerbutton ACPI notify events in sleep, it requires a wake source.

Linux systemd HoldoffTimeoutSec defaults to 30 seconds where all events are ignored after a resume for a spell.

Documentation says just lid events are ignored, so not power button, so this is probably not the same logic.
https://www.freedesktop.org/software/systemd/man/latest/logind.conf.html#HoldoffTimeoutSec=

Daniel — thank you; this is exactly the kind of authoritative input that makes the change better. The PWRBTN# detail and the EC source link are much appreciated, and your correction on HoldoffTimeoutSec (lid-only) is well taken — I had mischaracterized it.

On the Linux handling (e71eeb2a6bcc): that mechanism fits the freeze/thaw case it was written for, where the press is delivered synchronously while the button is still marked suspended. On this machine the EC latches the press and replays it through its _Qxx path after resume has completed — consistently tens-to-hundreds of ms later — so a flag set at suspend and cleared on resume would already be clear by the time the replay lands. A short elapsed-time check catches it without depending on resume ordering, and it has the gentler failure mode: it reverts to accepting input unconditionally within the window, leaving no state that could wedge the button if a future wake path behaves differently.

Worth underscoring for anyone following along: the check only ever sees the replayed Notify 0x80 and never spec-compliant Notify 0x02, so on firmware that reports the wake correctly it does nothing at all — there's no firmware version to chase. The aim is simply to make the machine work as it ships; absorbing this in the OS is the friendliest path for someone who just bought the hardware, and I'm glad to be doing it alongside Framework rather than pointing at anyone.

@guest-seuros — thanks for digging into this, and for offering to test on hardware you have on hand; more data points across machines would be genuinely valuable, so please do share what you find.

You are 3 versions behind ... looks like there was code update.

I don't think we need to chase a specific BIOS/EC build, because the check is gated on behavior, not firmware. Spec-compliant firmware reports the wake as Notify 0x02, which never reaches this code, so the change is inert there and acts only when the press is replayed as an ordinary Notify 0x80. The intent is to make the machine usable as it ships, without asking the owner to flash firmware first just to get a working system.

While debugging ACPI, I have seen cases where handlers appear to be registered repeatedly (I observed counts around 131 registrations) and a fair amount of event spam after resume.

That sounds like it may be its own thing, separate from this resume→suspend loop. If you can capture a trace (dmesg with bootverbose, plus the conditions), I'm glad to look; if there's a real registration leak it deserves its own bug rather than riding on this one.

@obiwac — great seeing you at BSDCan! Looks like your comment came through empty on my end (which is all my "??" meant — sorry if it read as terse). I'd really value your take on this one; you know this corner of ACPI better than I do, having written the s2idle loop, and I'd rather shape it with your input than land it without. Whenever you get a sec, mind re-posting whatever didn't make it through?

so a flag set at suspend and cleared on resume would already be clear by the time the replay lands

Hmm we don't have any issues on Linux though.
I think the system would still be asleep by the time you receive the power button event.

spec-compliant Notify 0x02

Huh, you are right: https://uefi.org/specs/ACPI/6.5/04_ACPI_Hardware_Specification.html#control-method-power-button
"Upon waking from a G1 sleeping state, the AML event handler generates a notify command with the code of 0x2 to indicate it was responsible for waking the system."

I don't expect that we are the only firmware with this behavior though, so even if we change it, it's good to handle the case of only 0x80 notification.

Hmm we don't have any issues on Linux though.
I think the system would still be asleep by the time you receive the power button event.

That's the part worth pinning down, because it's measurable. On this machine the press doesn't arrive during firmware sleep — it comes up through the EC's _Qxx query drain after the kernel is already running again, which is why I can timestamp it at all: 46–199 ms past the wake point, consistently, across many cycles. Linux's suspended-flag handles its case cleanly because the button device stays marked suspended through thaw; the FreeBSD analog would have to remain set until after that post-resume EC drain, and clearing it in the natural resume path would race the replay. I'd rather not hinge correctness on that ordering, so a short, self-clearing window absorbs it instead — it reverts to accepting input no matter what. Happy to share the raw trace if it's useful.

... even if we change it, it's good to handle the case of only 0x80 notification.

Exactly — and thank you; that's the crux. Other firmware will land here too, so handling the 0x80-only case is the durable thing to do. I appreciate you confirming the spec language.

I've also pulled obiwac in on the mechanism — he wrote our s2idle loop — so I'm glad to align the shape with whatever the ACPI maintainers prefer. Either way, your input has made this better.

▎ The intent is to make the machine usable as it ships, without asking the owner to flash firmware first just to get a working system.

The premise is wrong. A firmware update doesn't just fix one ACPI quirk: it ships microcode updates, fan curve corrections, thermal calibration, security patches. It's basic maintenance, not an unreasonable ask. I've literally salvaged working computers that were headed for recycling, fixed by a single BIOS update. Framework users specifically chose the brand because they can maintain and upgrade their hardware, it's the entire value proposition, the Ship of Theseuros of laptops. Asking them to run a firmware update is not a burden, its the product working as designed.

@git_danielschaefer.me You already confirmed the spec requires Notify 0x02 for wake. To your point that you don't expect Framework to be the only firmware with this behavior, I tested this patch on Apple, MSI, Gigabyte, Intel, Lenovo, Sony, Panasonic, Toshiba, AZW, Minisforum, Qotom and Beelink machines across multiple generations. I also tested on machines running coreboot (System76, and a bunch i corebooted). None of them replay the wake as 0x80. Those vendors ship AMI, Phoenix in stock which all get the button status bits right. Framework ships Insyde with your own fork of chrome-ec. This may be a Framework-only problem.

I don't have Framework hardware yet, but I would prefer to see this fixed on their side rather than patching FreeBSD. It costs the end user a BIOS update on one machine vs. FreeBSD maintaining this spec violation forever. If Framework fixes the EC, future maintainers will have zero way to reproduce what this code guards against, it becomes permanent dead code with no test path.

Fixing it here also means every BSD flavor needs to carry the same quirk (NetBSD, OpenBSD, DragonFly), let alone OSes like 9front that have a minuscule maintenance team. One firmware fix eliminates the problem for all of them.

I spent a few months bringing Apple hardware support to FreeBSD, so I have seen this pattern play out. I get an old machine, fix a bug, message @adrian that i found a bug and i will open a fix, then I boot to macOS, which auto-updates the firmware. The bug fix becomes 100% hallucination, because it depended on behavior that was already fixed 7 years ago by Apple. I can't reproduce it anymore.

The difference is that Apple is hostile to other OSes, before D54762 landed, some hardware would not even enumerate unless the OS claim to be Darwin..

Framework engineering could be ship the fix in days/weeks. FreeBSD 16 is months away from release.

That said, if we do want to handle non-compliant firmware gracefully in the OS without fixing the EC, the fix belongs in acpi_button.c, not acpi.c.

FreeBSD's acpi_button_suspend() / acpi_button_resume are empty stubs, the button driver has no suspend awareness at all.

Framework will get their wall of shame file acpi_framework.c where we document all the spec violations, like in acpi_panasonic.c and others. (@dteske , you already documented it in this file).

But I believe Framework will prefer to fix on their side.

In the end, FreeBSD's acpi_button is lacking generic suspend/resume awareness: the handlers are just stubs. If there is no diff open about it, maybe @dteske you can take a stab at it. If you do, I will test it on all the machines I have here. A correct implementation could be backported to 14-15.x .

@guest-seuros — thank you for the testing; validating across Apple, MSI, Gigabyte, Intel, Lenovo, Sony, Panasonic, Toshiba, AZW, Minisforum, Qotom, Beelink and coreboot is real work, and it's useful: it scopes the replayed-0x80 behavior tightly, which makes a guard easy to reason about.

I'd gently reframe one thing, though: this isn't either/or. A firmware update is good maintenance, and I'd genuinely welcome Framework improving the EC — but "update first, then the OS works" and "the OS works as shipped" are different promises. Not everyone who installs FreeBSD on these machines is a kernel developer; many simply expect it to come up when they open the box, and firmware getting better over time doesn't change what a person needs on the machine in front of them today. Both paths can be true at once, and pursuing ours asks nothing of Framework.

On maintenance cost — your strongest point — I think the right answer is your other suggestion, which is the better idea: put this in acpi_button.c. You're right that acpi_button_suspend()/acpi_button_resume() are empty stubs, and a button that is also a wake source should have suspend/resume awareness regardless of any one vendor. So I'd record the resume time in acpi_button_resume() and gate the replayed press in the PRESSED_FOR_SLEEP (0x80) path of acpi_button_notify_handler(). That fills a real gap that stands on its own merit, and the guard rides on top — self-deactivating on firmware that reports the spec's 0x02, so if the behavior ever disappears it simply goes quiet rather than lingering as a liability or dead code.

I'd lean toward that generic button awareness over a vendor-specific quirk file — partly because Daniel expects this may not stay Framework-only, and partly because the generic improvement helps more people and, as you note, is backportable to 14/15. I'd keep the self-clearing window rather than a persistent suspended-flag, for the fail-safe reasons Daniel and I were just discussing.

I'll gladly take you up on testing across your fleet — confirmation that it's inert on compliant firmware and correct on the outliers is the best validation there is. Let me sync with obiwac on the final shape, then I'll post the reworked diff. Thank you again; this is a better change for your push on it.

[snip]

So I agree with everyone in the thread, which makes it rough to find consensus, heh. :-)

I do agree right now it's a platform quirk. I don't yet see how we could handle it in the acpi_button path since it's an event that we get and process way earlier than button processing would be able to filter it, no?

Also (and I should've asked you to do this for the i2c suspend/resume issue heh) we should create a bug first and dump the relevant debugging related stuff in there
and link to it from the comment block you have there. Otherwise many years from now people will have to go grovel through the commit history of the file to
understand why a specific weird thing is happening and it may be lost from our collective immediate term memory.

Not everyone who installs FreeBSD on these machines is a kernel developer; many simply expect it to come up when they open the box, and firmware getting better over time doesn't change what a person needs on the machine in front of them today.

I never claimed everybody should be a kernel dev.
In fact this bug is just cosmetic, not vital .

People using FreeBSD currently don't find it pre-installed in their laptop, they do it willingly. They spend at least 1 day with the installer deciding the layout and getting confused if they should use zfs or ufs, and 2-3 days asking where is the WIFI/GPU driver.
Then a lifetime learning the OS capability.

As for acpi_button.c... it should have the spec compliant implementation, the quirks go in ${brand}_acpi.c.

Framework should aim to have quirkless computers that aligns with their right-to-repair philosophy also to be PNP with any OS.


Looking forward to test the driver. It will be a good opportunity for me to learn more about the ACPI. (I touched it slightly when i fixed the Panasonic toughbooks before i corebooted them out of frustration.)

@adrian -- good call on the PR; I'll open one and dump the bootverbose traces, the EC _Qxx timing, and the per-cycle timestamps there, then link it from the comment block so the "why" survives long after our short-term memory fades. (Noted for the i2c one too.)

On location, you may well be right that the button path can't see it. Let me confirm which route the replay actually takes on this box -- if it arrives via the fixed-feature/SCI path rather than as a control-method Notify(0x80) on the button device, then acpi_button.c never sees it and acpi.c is the right home after all. I'll put the trace in the PR so we decide on evidence rather than taste, and I'm glad to land it wherever you and obiwac think it belongs.

@guest-seuros -- one clarification, since it changes how it's weighed: this isn't cosmetic. The replayed press re-enters the sleep path, so the machine suspends again immediately after resume; without the guard it won't stay awake, which makes suspend/resume effectively unusable rather than untidy. (And no worries -- I wasn't suggesting you'd argued otherwise; we're after the same baseline.) Where the fix ultimately lives -- button driver, a per-platform file, or acpi.c -- is exactly what adrian and obiwac are weighing now, so I'll follow the evidence and their lead. Thanks again for offering to test; exercising it across your fleet will make whatever shape we choose more trustworthy.

Maybe i misexplained myself. I did not mean the work you doing is cosmetic.

What i mean is that the resume/suspend flow is cosmetic.

FreeBSD was without that feature because its target audience was servers that take a beating 24/7 and people that recompile the kernel and reboot every day.

Now that a new audience is targeted, i will prefer if we don't rush and collaborate with vendors to provide proper support.

PS: i myself want to suspend/resume some computers instead of turning them off.

olce requested changes to this revision.Tue, Jun 23, 9:47 PM

I'm doing significant ACPI work right now, so please keep me posted. Actually, the current diff does not apply anymore (but rebasing it is easy).

As it's a matter of just waiting an additional half-second in order to avoid receiving the spurious power/sleep button press notification, really this is a no brainer and we should just go for it. This has no drawbacks (nobody will need to re-suspend that early, and even if that is needed waiting for half a second seems negligible). We cannot presume manufacturers will patch their firmware, nor that users will timely install them. And we'll have releases much earlier than 16.

The acpi.c file is a monster that needs slimming. I'd really want to see everything related to power/sleep buttons moved into acpi_button.c. And yes, suspend/wakeup presses (in FreeBSD parlance) behaviors must be factored out in a single function (actually, they are already, functions are acpi_event_{power,sleep}_button_{sleep,wakeup}()), with an additional guard if the corresponding operation is in flight (as it may be triggered by different ACPI sources, e.g., fixed buttons and buttons in the namespace which there may be multiple instances of). I'd move all this functions to acpi_button.c (in a separate commit).

Requesting changes just because of a problem, see inline comments.

sys/dev/acpica/acpi.c
3797–3800

Seems too detailed. Please just say that we record the time after wakeup to be able to ignore spurious sleep events for a grace period. The precise details of the mechanism, in particular the functions realizing it, is too level and may change without the comment being updated. Refer to the herald comment before ACPI_BUTTON_REPLAY_WINDOW.

4149–4180

Again, seems pretty long and with some perhaps not that relevant information.

How about something like (probably can be condensed even further):
"Grace window after wakeup where spurious power/sleep button press for suspend are ignored. Some firmwares indeed wrongly report the depress causing the wakeup as a sleep event (notify value of 0x80 instead of 0x02). On XXX laptop, the corresponding event appears before 200ms after the wakeup, so a value of half a second was chosen."

4197–4200

Must be before the #if defined(...) just above.

This revision now requires changes to proceed.Tue, Jun 23, 9:47 PM
sys/dev/acpica/acpi.c
4197–4200

Well, forget it, as other platforms anyway don't support suspend/resume, that could as well stay here.

Rework per review: take the resume timestamp before DEVICE_RESUME() to close a race the instrumentation exposed (PR 296243); window 1s (replay measured ~600 ms from that anchor); condense comments; move the guard ahead of the arch #if so it also covers the non-x86 path; rebased on current main.

Move the replay guard back inside the #if defined(amd64) || defined(i386) block in acpi_event_power_button_sleep(), per olce's note that the non-x86 paths don't support suspend/resume anyway. No functional change on x86.

dteske edited the test plan for this revision. (Show Details)
dteske edited the summary of this revision. (Show Details)

@adrian -- thanks for pushing on this; asking for real traces was the right call and it turned up more than I expected.

I filed PR 296243 with the full evidence: the device_printf capturing the replayed Notify value, a kdb_backtrace() showing the path through the ACPICA notify taskqueue, and the per-cycle timing data.

Two things came out of it. First, the replayed press arrives as an ordinary button event (Notify 0x80), not the Notify 0x02 the spec requires for a wake source, so it's indistinguishable from a genuine press at the point we evaluate it. Second, the trace exposed a race in my original timestamp placement: I recorded the resume time too late — after DEVICE_RESUME() had already reinitialized the EC — so the replay could be evaluated on the taskqueue before the timestamp was written and slip through. Moving the timestamp ahead of DEVICE_RESUME() makes it deterministic. Measured from that anchor the replay lands at ~600 ms across many cycles, so the window is now a fixed 1 s.

The bug has the traces, the timing table, and the backtrace if you want the raw data.

@olce -- thanks for the review; I've worked in all of it. Summary of this update:

  • Rebased onto main (picks up the reworked event-handler signatures).
  • Condensed the comments — both the recording-site note and the herald above ACPI_BUTTON_REPLAY_WINDOW are tightened, with the full traces, timing, and analysis moved to PR 296243 rather than living in the source.
  • Window is now SBT_1S; measured replay latency is ~600 ms (per-cycle data in the bug).
  • Per your follow-up ("forget it, as other platforms anyway don't support suspend/resume, that could as well stay here"), the replay guard is back inside the #if defined(__amd64__) || defined(__i386__) block in acpi_event_power_button_sleep().

On relocating the button-handling functions into acpi_button.c: agreed that's the right home. I'd like to keep it as a separate, mechanical commit so this change stays focused on the fix and the move is easy to review on its own — I'll follow up with that once this lands, unless you'd prefer it bundled here.

Restore the pre-existing 'ACPICA Event Handlers' section comment to its function (acpi_invoke_sleep_eventhandler); the replay-window comment, define, and helper now sit above that header as a self-contained block. No functional change.

Add 'MFC after: 2 weeks' to give the change soak time on HEAD before merging back.

On relocating the button-handling functions into acpi_button.c: agreed that's the right home. I'd like to keep it as a separate, mechanical commit so this change stays focused on the fix and the move is easy to review on its own — I'll follow up with that once this lands, unless you'd prefer it bundled here.

Oh, no, I agree a separate change for such a kind of mechanical move is much better.

I was slightly worried about having a call to getsbinuptime() before acpi_resync_clock(), but the latter does something different (system clock reset) and the earlier resumeclock() is what actually sets up the timers back. Your experiments also give confidence that this is working correctly.

Also, it seemed to me that having the assignment of acpi_resume_sbt after interrupts are re-enabled could not completely close the race (if the event is raised very close to the wakeup), but thanks to the mechanics around acpi_next_stype, we are good.

So looks good to go.

This revision is now accepted and ready to land.Wed, Jun 24, 1:29 PM