Page MenuHomeFreeBSD

acpi: Implement s2idle loop
Needs ReviewPublic

Authored by obiwac on Mon, Dec 29, 9:03 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 20, 3:17 AM
Unknown Object (File)
Sat, Jan 17, 6:42 PM
Unknown Object (File)
Sat, Jan 17, 11:05 AM
Unknown Object (File)
Sat, Jan 17, 9:08 AM
Unknown Object (File)
Fri, Jan 16, 6:55 PM
Unknown Object (File)
Fri, Jan 16, 2:47 PM
Unknown Object (File)
Thu, Jan 8, 3:20 AM
Unknown Object (File)
Thu, Jan 8, 1:41 AM
Subscribers

Details

Reviewers
jkim
olce
markj
Summary

When entering s2idle, we might still receive spurious SCIs from devices we can't mask GPEs from (well, we can mask them but that would mean masking out wake GPEs too). To solve this, put everything in a loop where every CPU except BSP is idled, and, when BSP is woken up, check if we are meant to break out of this loop and wake all other CPUs or if we should immediately go back to sleep.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 69921
Build 66804: arc lint + arc unit

Event Timeline

Don't use new IPI_IDLE/IPI_UNIDLE, but use smp_rendezvous_cpus() calling sched_do_idle() instead.

Remove unused acpi_s2idle_looping

This approach looks weird, and I'm still unclear on what is really needed. More generally, I'd like to know why we have to do that, and I question its safety.

So I have some questions:

  1. Which kind of interrupts do you still see that prevent staying in S0ix? I think you mentioned some IPIs at some point, but which ones? Can we determine their cause?

(In passing, I think we should settle on a common terminology and use it everywhere; S0ix instead of s2idle? something else? the ACPI spec calls it Low-Power (S0) Idle.)

  1. What does control routing of these interrupts to the different CPUs? Are we sure these are routed to CPU 0/the BSP?
  2. In general, shouldn't we service interrupts quite quickly after they are triggered? Could delaying interrupt servicing for arbitrary amount of times cause problems (hardware going mad, possibly bringing down the whole platform; driver queues becoming full and the driver exploding at wakeup)?

Additionally, forcing execution of the idle threads on other CPUs, as we discussed offline, might lead to strange problems where kernel threads hold mutexes that might be needed by GPE servicing.

sys/dev/acpica/acpi.c
3648–3650

You could simplify by comparing directly pc_cpuid with curcpu and getting rid of other_cpus.

Which kind of interrupts do you still see that prevent staying in S0ix? I think you mentioned some IPIs at some point, but which ones? Can we determine their cause?

There's two parts to this: what's interrupting the BSP, and what's interrupting the other cores.

The s2idle loop is necessary for working around the BSP being interrupted. We mask out only SCIs, and we try to mask out the (wake) GPEs we care about, but unfortunately some wake events are put under the same GPE number as noisy devices, such as the battery on Framework laptops. This seems to be expected behaviour and the same thing happens on Linux. One thing entering S0i3 does on the Framework laptops is reduce the frequency of battery events, to once a second to once a minute (except if plugged in), so we're not interrupting S0i3 that often. I believe the actual code to do this is in common/charge_state_v2.c's charger_task() function. update_base_battery_info() is called in the infinite loop there, which in turn calls host_set_single_event(EC_HOST_EVENT_BATTERY), which is one of the things that can trigger an SCI according to SCI_HOST_EVENT_MASK in board/hx30/board.h (Framework board).

The actual code that decides the timing is the following:

/* How long to sleep? */
if (problems_exist)
	/* If there are errors, don't wait very long. */
	sleep_usec = CHARGE_POLL_PERIOD_SHORT;
else if (sleep_usec <= 0) {
	/* default values depend on the state */
	if (!curr.ac &&
		 (curr.state == ST_IDLE ||
		 curr.state == ST_DISCHARGE)) {
#ifdef CONFIG_CHARGER_OTG
		int output_current = curr.output_current;
#else
		int output_current = 0;
#endif
		/*
		 * If AP is off and we do not provide power, we
		 * can sleep a long time.
		 */
		if (chipset_in_state(CHIPSET_STATE_ANY_OFF |
					  CHIPSET_STATE_ANY_SUSPEND)
				&& output_current == 0)
			sleep_usec =
				CHARGE_POLL_PERIOD_VERY_LONG;
		else
			/* Discharging, not too urgent */
			sleep_usec = CHARGE_POLL_PERIOD_LONG;
	} else {
		/* AC present, so pay closer attention */
		sleep_usec = CHARGE_POLL_PERIOD_CHARGE;
	}
}

And this tracks with what's observed (CHARGE_POLL_PERIOD_VERY_LONG is a minute).

(Disclaimer that what I'm saying may not be entirely accurate; I have not spent much time with the EC code.)

I'm not getting IPIs I'm pretty sure. Verified by comparing IPI counts before and after the s2idle loops.

(In passing, I think we should settle on a common terminology and use it everywhere; S0ix instead of s2idle? something else? the ACPI spec calls it Low-Power (S0) Idle.)

Well s2idle and S0ix are 2 different things really. s2idle is just the idea that you're suspending the CPUs to an idle state, and in theory can be supported on any machine, so long as it can idle its CPUs. Whereas S0ix are states the firmware puts the whole CPU in when it detects the CPUs are idle + some other very platform-specific conditions (on modern AMD CPUs, the only relevant state is S0i3). So you're "doing" s2idle (+ some other stuff) in order to "enter" an S0ix state.

And low-power idle is kind of too generic a term for what we're attempting to support here, as in there's a lot of very S0ix-specific stuff that we're doing so I think its worth sticking with that name.

What does control routing of these interrupts to the different CPUs? Are we sure these are routed to CPU 0/the BSP?

For SCIs, yes. I seem to remember reading this somewhere in the ACPI docs but can't find it at the minute. Either way this can be experimentally shown and those should be the only enabled interrupts.

In general, shouldn't we service interrupts quite quickly after they are triggered? Could delaying interrupt servicing for arbitrary amount of times cause problems (hardware going mad, possibly bringing down the whole platform; driver queues becoming full and the driver exploding at wakeup)?

I guess by nature of suspend we should always put the system in a state where none of this matters anymore before disabling interrupts, which a priori we are already doing. Otherwise our CPUs would constantly exit idle and thus the package would constantly exit S0i3.

Additionally, forcing execution of the idle threads on other CPUs, as we discussed offline, might lead to strange problems where kernel threads hold mutexes that might be needed by GPE servicing.

Absolutely. Though no idea what the solution to this could be except for making sure our GPE service routines don't try to acquire any of these resources.