LinuxKPI: Reimplement irq_work queue on top of fast taskqueue
Summary:
Linux's irq_work queue was created for asynchronous execution of code from contexts where spin_lock's are not available like "hardware interrupt context". FreeBSD's fast taskqueues was created for the same purposes.
Drm-kmod 5.4 uses irq_work_queue() at least in one place to schedule execution of task/work from the critical section that triggers following INVARIANTS-induced panic:
panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) linuxkpi_short_wq @ /usr/src/sys/kern/subr_taskqueue.c:281 cpuid = 6 time = 1605048416 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe006b538c90 vpanic() at vpanic+0x182/frame 0xfffffe006b538ce0 panic() at panic+0x43/frame 0xfffffe006b538d40 witness_checkorder() at witness_checkorder+0xf3e/frame 0xfffffe006b538f00 __mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfffffe006b538f50 taskqueue_enqueue() at taskqueue_enqueue+0x42/frame 0xfffffe006b538f70 linux_queue_work_on() at linux_queue_work_on+0xe9/frame 0xfffffe006b538fb0 irq_work_queue() at irq_work_queue+0x21/frame 0xfffffe006b538fd0 semaphore_notify() at semaphore_notify+0xb2/frame 0xfffffe006b539020 __i915_sw_fence_notify() at __i915_sw_fence_notify+0x2e/frame 0xfffffe006b539050 __i915_sw_fence_complete() at __i915_sw_fence_complete+0x63/frame 0xfffffe006b539080 i915_sw_fence_complete() at i915_sw_fence_complete+0x8e/frame 0xfffffe006b5390c0 dma_i915_sw_fence_wake() at dma_i915_sw_fence_wake+0x4f/frame 0xfffffe006b539100 dma_fence_signal_locked() at dma_fence_signal_locked+0x105/frame 0xfffffe006b539180 dma_fence_signal() at dma_fence_signal+0x72/frame 0xfffffe006b5391c0 dma_fence_is_signaled() at dma_fence_is_signaled+0x80/frame 0xfffffe006b539200 dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0xb3/frame 0xfffffe006b539270 i915_vma_move_to_active() at i915_vma_move_to_active+0x18a/frame 0xfffffe006b5392b0 eb_move_to_gpu() at eb_move_to_gpu+0x3ad/frame 0xfffffe006b539320 eb_submit() at eb_submit+0x15/frame 0xfffffe006b539350 i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x7d4/frame 0xfffffe006b539570 i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1c1/frame 0xfffffe006b539600 drm_ioctl_kernel() at drm_ioctl_kernel+0xd9/frame 0xfffffe006b539670 drm_ioctl() at drm_ioctl+0x5cd/frame 0xfffffe006b539820 linux_file_ioctl() at linux_file_ioctl+0x323/frame 0xfffffe006b539880 kern_ioctl() at kern_ioctl+0x1f4/frame 0xfffffe006b5398f0 sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe006b5399c0 amd64_syscall() at amd64_syscall+0x121/frame 0xfffffe006b539af0 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe006b539af0 --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800a6f09a, rsp = 0x7fffffffe588, rbp = 0x7fffffffe640 --- KDB: enter: panic
Here, the dma_resv_add_shared_fence() performs a critical_enter() and following call of schedule_work() from semaphore_notify() triggers 'acquiring blockable sleep lock with spinlock or critical section held' panic.
Switching irq_work implementation to fast taskqueue fixes the panic for me.
Other report with the similar bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247166
Reviewed By: hselasky
Differential Revision: https://reviews.freebsd.org/D27171