Page MenuHomeFreeBSD

linuxkpi: fix mod_delayed_work() with running tasks
AcceptedPublic

Authored by kevans on Wed, Apr 8, 2:03 PM.
Tags
None
Referenced Files
F152474617: D56308.diff
Wed, Apr 15, 4:44 AM
Unknown Object (File)
Sat, Apr 11, 5:42 PM

Details

Reviewers
markj
kib
manu
wulf
Group Reviewers
Klara
Summary

Extensive exercising of irdma reveals a problem with the current
implementation of mod_delayed_work: if the task is currently running,
then it is not cancellable and mod_delayed_work() will return false.

This is most obviously a problem in irdma_sched_qp_flush_work(), which
avoids dropping the refcount if it returns false because the expectation
is that the task will be scheduled and this thread 'owns' that. After
a closer look at the Linux implementation, this seems to be fine: they
wait until the task is either pending or idle to act on it.

This could instead be rewritten to loop on the current
linux_cancel_delayed_work / linux_queue_delayed_work_on sequence as long
as they both return false, but inlining the logic feels like the more
correct thing to do.

This patch has run fine on my laptop, but i915kms doesn't stress this
enough to hit a scenario where the task was already running and the
SDT probes won't fire (nor does a thread get stuck in that loop). It's
also been run through typical workloads in irdma, but we haven't
specifically confirmed that the busy loop has been hit.

(Is maybe_yield() wrong here?)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 72122
Build 69005: arc lint + arc unit

Event Timeline

kevans requested review of this revision.Wed, Apr 8, 2:03 PM
des added inline comments.
sys/compat/linuxkpi/common/src/linux_work.c
243–244

Should we assert that the lock is held?

sys/compat/linuxkpi/common/src/linux_work.c
540

What is the purpose of __aligned? I see it was copied.

659
kevans added inline comments.
sys/compat/linuxkpi/common/src/linux_work.c
540

I'll admit to not having noticed, and I don't see a reason for it. linux_update_state just passes individual states on through atomic_cmpxchg, so presumably on all of our platforms that's loaded into a register and there's no reaon for the table to over-aligned.

I'm not seeing any clues in rGca2ad6bd779fae857f27b028d736f506b66a528a that introduced the other ones, either. Maybe I'll just remove this one, then do another pass and strip the other ones out?

sys/compat/linuxkpi/common/src/linux_work.c
540

Sure.

  • Assert in the _locked variant
  • Drop mysterious __aligned
  • Fix typo
This revision is now accepted and ready to land.Fri, Apr 10, 6:52 PM
sys/compat/linuxkpi/common/src/linux_work.c
568

Why is it ok to ignore the result?

619

This assignment is redundant.