Page MenuHomeFreeBSD

linuxkpi: Do not perform unbound sleeps in del_timer_sync()
AbandonedPublic

Authored by wulf on May 7 2020, 7:57 PM.

Details

Reviewers
hselasky
zeising
Summary

Linux version of callout_drain() does not perform unbound sleep while
waiting for timer to stop. Instead of that, it does non-blockable spins.
Implement the same logic in linuxkpi with spinning around callout_stop().
It uses 3-rd callout reset/stop syncronization method from callout(9).

This fixes following witness warning in drm-kmod:

calling _callout_stop_safe with the following non-sleepable locks held:
exclusive sleep mutex lnxspin (lnxspin) r = 0 (0xffffffff82140be0) locked @ /usr/ports/graphics/drm-devel-kmod/work/kms-drm-43fe25f/linuxkpi/gplv2/src/linux_dmafence.c:323
stack backtrace:
#0 0xffffffff8078e8b1 at witness_debugger+0x71
#1 0xffffffff8078f84d at witness_warn+0x40d
#2 0xffffffff8073efb7 at _callout_stop_safe+0x47
#3 0xffffffff81fac71c at dma_i915_sw_fence_wake_timer+0xac
#4 0xffffffff8221b75b at dma_fence_signal+0x16b
#5 0xffffffff81f5ac09 at i915_clflush_work+0x89
#6 0xffffffff81f1bc4f at linux_work_fn+0xdf
#7 0xffffffff807813aa at taskqueue_run_locked+0xaa
#8 0xffffffff80782434 at taskqueue_thread_loop+0x94
#9 0xffffffff806e1d70 at fork_exit+0x80
#10 0xffffffff809b4fae at fork_trampoline+0xe

Test Plan

Run drm-kmod with this witness warning disappeared.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

wulf requested review of this revision.May 7 2020, 7:57 PM
hselasky requested changes to this revision.May 7 2020, 8:39 PM

This patch is not correct. Del timer sync means that when this function returns, the callout structure shall no longer be in use. The only way to guarantee that is to use callout_drain()!
It is not good enough to only block the timer callback!

If you look at the code causing this, it is incorrectly patched:

static void dma_i915_sw_fence_wake_timer(struct dma_fence *dma,
                                         struct dma_fence_cb *data)
{
        struct i915_sw_dma_fence_cb_timer *cb =
                container_of(data, typeof(*cb), base.base);
        struct i915_sw_fence *fence;
                                  
        fence = xchg(&cb->base.fence, NULL);
        if (fence)
                i915_sw_fence_complete(fence);
#ifdef __linux__
        irq_work_queue(&cb->work);
#else
        del_timer_sync(&cb->timer);
        dma_fence_put(cb->dma);
        kfree_rcu(cb, rcu);
#endif
}

Please just use a regular work_queue() instead of work_irq_queue() to solve this. Keeping the ifdef Linux there!

--HPS

This revision now requires changes to proceed.May 7 2020, 8:39 PM

Hm, linux_dmafence.c dma_fence_array_cb_func is patched to not do the irq work thing at all.. o_0

This patch is not correct. Del timer sync means that when this function returns, the callout structure shall no longer be in use. The only way to guarantee that is to use callout_drain()!
It is not good enough to only block the timer callback!

The patch does not just block the timer callbacks. It tries to take into account threads that have passed the "point of no return" in softclock_call_cc() but not called back yet. It does it with spinning on callout_stop() until it return non-zero value. I am not 100% sure that KPI guarantee correctness of this approach though.

Please just use a regular work_queue() instead of work_irq_queue() to solve this. Keeping the ifdef Linux there!

That way suits me too. Unfortunately, It will take me some days to test your patch.

Depending on how it is used, it may have to change too. If you don't see witness warnings it is fine!

Witness warnings are gone with your patch. Thank you.