Page MenuHomeFreeBSD

x86/xen: fix accounted interrupt time
AbandonedPublic

Authored by royger on Mar 7 2024, 9:02 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Jul 3, 10:48 PM
Unknown Object (File)
Sun, Jun 23, 3:34 PM
Unknown Object (File)
Sun, Jun 23, 12:20 PM
Unknown Object (File)
Sun, Jun 23, 4:01 AM
Unknown Object (File)
Sat, Jun 22, 2:50 PM
Unknown Object (File)
Jun 7 2024, 3:20 PM
Unknown Object (File)
Jun 3 2024, 4:17 PM
Unknown Object (File)
May 28 2024, 1:08 PM

Details

Summary

The current addition to the interrupt nesting level in
xen_arch_intr_handle_upcall() needs to be compensated in
xen_intr_handle_upcall(), otherwise interrupts dispatched by the upcall handler
end up seeing a td_intr_nesting_level of 2 or more, which makes them assume
there's been an interrupt nesting.

Such extra interrupt nesting count lead to statclock() reporting idle time as
interrupt, as the call from interrupt context will always be seen as a nested
one (td->td_intr_nesting_level >= 2) due to the nesting count increase done by
both xen_arch_intr_handle_upcall() and intr_execute_handlers().

Fix this by adjusting the nested interrupt count before dispatching interrupts
from xen_intr_handle_upcall().

PR: 277231
Reported by: Matthew Grooms <mgrooms@shrew.net>
Fixes: af610cabf1f4 ('xen/intr: adjust xen_intr_handle_upcall() to match driver filter')
Sponsored by: Cloud Software Group

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

royger requested review of this revision.Mar 7 2024, 9:02 AM
royger updated this revision to Diff 135495.

If you check your archive of e-mail you will find I expressed uncertainty over whether curthread->td_intr_nesting_level should be incremented/decremented. Removing that is fine, but the rest is wrong.

sys/dev/xen/bus/xen_intr.c
344–349

This portion is wrong. Passing the trap_frame via the argument is still allowed due to compatibility with mid-1990s i386 FreeBSD, but this is deprecated. I hope this will be removed before FreeBSD 15.0 (D37688). This doesn't work on any non-x86.

sys/x86/xen/xen_arch_intr.c
92–93 ↗(On Diff #135495)

I believe what is needed is simply removing the increment/decrement of curthread->td_intr_nesting_level.

sys/dev/xen/bus/xen_intr.c
344–349

Hm, it seems to me that setting td_intr_frame (and storing the previous one) require bumping the td_intr_nesting_level for logical correctness. Whether that's mandatory or not it's possibly to be argued about.

I'm not sure why you say this is deprecated, I guess it would be deprecated once it's also deprecated in intr_execute_handlers()?

sys/x86/xen/xen_arch_intr.c
92–93 ↗(On Diff #135495)

Actually I'm unsure whether the old value of curthread->td_intr_frame needs to be preserved here. This is an ISR and I'm left wondering whether the old value may well be garbage or NULL.

sys/dev/xen/bus/xen_intr.c
344–349

Might be for consistency, but I'm pretty sure the result will still be correct. Alternatively perhaps the accounting numbers were wrong before and the new numbers are actually correct?

There is only a single remaining use of passing trap_frame via the argument. Once that is gone, D37688 can go in and the mechanism will no longer be available. Issue is removing that single use so miniscule, Phabricator seems overkill.

sys/x86/xen/xen_arch_intr.c
92–93 ↗(On Diff #135495)

Why do you think interrupts cannot nest?

sys/dev/xen/bus/xen_intr.c
344–349

No, new numbers are incorrect, as a non-nested interrupt dispatched from xen_intr_handle_upcall has the handler seeing td_intr_nesting_level == 2 when it should be 1.

344–349

Can't you just batch the removal of the trap_frame here from the commit that will also remove it from intr_execute_handlers() that would be the more natural way forward IMO.

xen_intr_handle_upcall() only requires the trap_frame because intr_execute_handlers() requires it, if the plan is to remove it from intr_execute_handlers() let's remove it from xen_intr_handle_upcall() at the same time.

sys/dev/xen/bus/xen_intr.c
344–349

Do the Xen handlers truly qualify as non-nested? td_intr_nesting_level == 2 might be a change, but I question whether that is actually wrong. This is not a strong objection, just I'm wondering. I am okay with removing the increment/decrement of td_intr_nesting_level.

Presently, passing the trap frame via the argument is deprecated, but still nominally allowed. The natural way forward is to not reintroduce use of deprecated functionality.

sys/x86/xen/xen_arch_intr.c
92–93 ↗(On Diff #135495)

Possible. My only strongly held view is xen_intr_handle_upcall(NULL); is correct. Just I'm left wondering whether Xen would prevent this circumstance.

sys/dev/xen/bus/xen_intr.c
344–349

It's wrong, there's no nesting, just double accounting, once in xen_arch_intr_handle_upcall() and the other in intr_execute_handlers().

I'm not sure why you say it's deprecated, all callers of intr_execute_handlers() pass it a proper trapframe except Xen, see lapic_handle_intr() for example.

sys/x86/xen/xen_arch_intr.c
92–93 ↗(On Diff #135495)

If you remove the trap_frame argument from intr_execute_handlers(), sure.

sys/dev/xen/bus/xen_intr.c
344–349

I'm not sure why you say it's deprecated, all callers of intr_execute_handlers() pass it a proper trapframe except Xen, see lapic_handle_intr() for example.

You've responded backwards. Of the functions used as a driver_filter_t (passed to intr_event_add_handler()/BUS_SETUP_INTR()), only sys/arm/arm/pmu.c:pmu_intr() still casts its argument to a trapframe. All other functions called by intr_event_execute_handlers() (via intr_execute_handlers()) retrieve the trapframe via curthread->td_intr_frame.

xen_arch_intr_handle_upcall() and lapic_handle_intr() receiving the trapframe via argument is not deprecated since they are called by x86-assembly language (sys/amd64/amd64/apic_vector.S/sys/i386/i386/apic_vector.S), not intr_event_execute_handlers().

What is deprecated is driver_filter_t functions (xen_intr_handle_upcall()) with a NULL context expecting to receive the trapframe as their argument.

Most of what is presently here is definitely wrong. xen_intr_handle_upcall() retrieving the trapframe from curthread->td_intr_frame is definitely right. Issue is xen_intr_handle_upcall() needs to behave as a proper driver_filter_t for the XenPCI support. ARM(64) and RISC-V also need this, likely PowerPC would need similar. I do see two ways forward though:

  1. I mentioned this before handing the patch for af610cabf1f off that I was uncertain about the handling of curthread->td_intr_nesting_level. As already mentioned, perhaps xen_arch_intr_handle_upcall() should be leaving the value alone?
  2. I'm wondering about the indirect call to xen_intr_handle_upcall() via the XenPCI implementation. During that call curthread->td_intr_nesting_level would start out as 1, but would then end up as 2 when xen_intr_handle_upcall() calls xen_arch_intr_execute_handlers() => intr_execute_handlers() => intr_event_execute_handlers(). Perhaps xen_intr_handle_upcall() should be decrementing curthread->td_intr_nesting_level before it calls handlers?

I've noticed the load average seeming a bit on the high side on my ARM64 test machine. I'm unsure whether that might be this bug versus simply being a debugging kernel. I can state I've tried the second and it appeared to work.

This revision now requires changes to proceed.Mar 8 2024, 11:22 PM

Most of what is presently here is definitely wrong. xen_intr_handle_upcall() retrieving the trapframe from curthread->td_intr_frame is definitely right. Issue is xen_intr_handle_upcall() needs to behave as a proper driver_filter_t for the XenPCI support. ARM(64) and RISC-V also need this, likely PowerPC would need similar. I do see two ways forward though:

  1. I mentioned this before handing the patch for af610cabf1f off that I was uncertain about the handling of curthread->td_intr_nesting_level. As already mentioned, perhaps xen_arch_intr_handle_upcall() should be leaving the value alone?
  2. I'm wondering about the indirect call to xen_intr_handle_upcall() via the XenPCI implementation. During that call curthread->td_intr_nesting_level would start out as 1, but would then end up as 2 when xen_intr_handle_upcall() calls xen_arch_intr_execute_handlers() => intr_execute_handlers() => intr_event_execute_handlers(). Perhaps xen_intr_handle_upcall() should be decrementing curthread->td_intr_nesting_level before it calls handlers?

I've noticed the load average seeming a bit on the high side on my ARM64 test machine. I'm unsure whether that might be this bug versus simply being a debugging kernel. I can state I've tried the second and it appeared to work.

Are you planning on putting your alternative for review in phab? Or should I attempt to do so based on our email conversation?

I think the usual is to let the differential owner do the update. Otherwise the steps end up being "Commandeer Revision", update, "Foist Upon".

royger edited the summary of this revision. (Show Details)

This is acceptable to me, but I believe the desire is have an okay by @markj. I believe resigning should make this more visible to @markj, rather than accepting.

This is acceptable to me, but I believe the desire is have an okay by @markj. I believe resigning should make this more visible to @markj, rather than accepting.

If you are fine with it I will commit it with your reviewed by, this being purely Xen internal code I'm happy with that. (not to refuse markj review if he wishes to take a look). Will wait until tomorrow.

Hopefully commenting doesn't cause this to reopen.

As of rG:38c35248fe3 the approach of diff 1 and 2 is now broken. Finally finished with mhorne's wanting to get the old i386 compatibility removed.