Page MenuHomeFreeBSD

firewire/sbp: try to improve locking, plus a few style nits
ClosedPublic

Authored by avg on Mar 5 2017, 1:40 PM.
Tags
None
Referenced Files
F83760276: D9898.id.diff
Tue, May 14, 1:10 PM
Unknown Object (File)
Sun, May 12, 3:42 AM
Unknown Object (File)
Sun, May 12, 3:38 AM
Unknown Object (File)
Thu, Apr 25, 9:47 PM
Unknown Object (File)
Apr 10 2024, 7:59 AM
Unknown Object (File)
Apr 9 2024, 9:11 PM
Unknown Object (File)
Dec 20 2023, 1:34 AM
Unknown Object (File)
Dec 15 2023, 9:25 PM
Subscribers

Details

Summary

This change tries to fix the most obvious locking problems.

sbp_cam_scan_lun() is never called with the sbp lock held, so the lock
needs to be acquired internally (if it's needed at all).
Without this change a kernel with INVARIANTS panics when a firewire disk
is connected:

panic: mutex sbp not owned at /usr/src/sys/dev/firewire/sbp.c:967
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff80420bbb = db_trace_self_wrapper+0x2b/frame 0xfffffe0504df0930
kdb_backtrace() at 0xffffffff80670359 = kdb_backtrace+0x39/frame 0xfffffe0504df09e0
vpanic() at 0xffffffff8063986c = vpanic+0x14c/frame 0xfffffe0504df0a20
panic() at 0xffffffff806395b3 = panic+0x43/frame 0xfffffe0504df0a80
__mtx_assert() at 0xffffffff8061c40d = __mtx_assert+0xed/frame 0xfffffe0504df0ac0
sbp_cam_scan_lun() at 0xffffffff80474667 = sbp_cam_scan_lun+0x37/frame 0xfffffe0504df0af0
xpt_done_process() at 0xffffffff802aacfa = xpt_done_process+0x2da/frame 0xfffffe0504df0b30
xpt_done_td() at 0xffffffff802ac2e5 = xpt_done_td+0xd5/frame 0xfffffe0504df0b80
fork_exit() at 0xffffffff805ff72f = fork_exit+0xdf/frame 0xfffffe0504df0bf0
fork_trampoline() at 0xffffffff8082483e = fork_trampoline+0xe/frame
0xfffffe0504df0bf0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

Also, I tried to reduce the scope of the sbp lock to avoid holding it
while doing bus_dma allocations. Without the change there is mirriad
of warnings like:

kernel: bus_dma_tag_create with the following non-sleepable locks held:
kernel: exclusive sleep mutex sbp (sbp) r = 0 (0xfffff8000a099748) locked @ /usr/src/sys/modules/firewire/sbp/../../../dev/firewire/sbp.c:802
kernel: stack backtrace:
kernel: #0 0xffffffff80aa1250 at witness_debugger+0x70
kernel: #1 0xffffffff80aa2537 at witness_warn+0x3d7
kernel: #2 0xffffffff810064dd at bus_dma_tag_create+0x3d
kernel: #3 0xffffffff83703402 at fwdma_malloc+0x82
kernel: #4 0xffffffff836ed759 at sbp_post_explore+0x849
kernel: #5 0xffffffff836f9e76 at fw_bus_probe_thread+0x906
kernel: #6 0xffffffff80a05674 at fork_exit+0x84
kernel: #7 0xffffffff80e988be at fork_trampoline+0xe
kernel: bus_dmamem_alloc with the following non-sleepable locks held:
kernel: exclusive sleep mutex sbp (sbp) r = 0 (0xfffff8000a099748) locked @ /usr/src/sys/modules/firewire/sbp/../../../dev/firewire/sbp.c:802
kernel: stack backtrace:
kernel: #0 0xffffffff80aa1250 at witness_debugger+0x70
kernel: #1 0xffffffff80aa2537 at witness_warn+0x3d7
kernel: #2 0xffffffff810065c3 at bus_dmamem_alloc+0x33
kernel: #3 0xffffffff83703424 at fwdma_malloc+0xa4
kernel: #4 0xffffffff836ed759 at sbp_post_explore+0x849
kernel: #5 0xffffffff836f9e76 at fw_bus_probe_thread+0x906
kernel: #6 0xffffffff80a05674 at fork_exit+0x84
kernel: #7 0xffffffff80e988be at fork_trampoline+0xe
kernel: bus_dmamap_create with the following non-sleepable locks held:
kernel: exclusive sleep mutex sbp (sbp) r = 0 (0xfffff8000a099748) locked @ /usr/src/sys/modules/firewire/sbp/../../../dev/firewire/sbp.c:802
kernel: stack backtrace:
kernel: #0 0xffffffff80aa1250 at witness_debugger+0x70
kernel: #1 0xffffffff80aa2537 at witness_warn+0x3d7
kernel: #2 0xffffffff8100655f at bus_dmamap_create+0x2f
kernel: #3 0xffffffff836ed7d2 at sbp_post_explore+0x8c2
kernel: #4 0xffffffff836f9e76 at fw_bus_probe_thread+0x906
kernel: #5 0xffffffff80a05674 at fork_exit+0x84
kernel: #6 0xffffffff80e988be at fork_trampoline+0xe

Also, there were a couple of LORs:

kernel: lock order reversal:
kernel: 1st 0xfffff8000a099748 sbp (sbp) @ /usr/src/sys/kern/kern_mutex.c:220
kernel: 2nd 0xfffffe0001b80870 firewire (firewire) @ /usr/src/sys/modules/firewire/firewire/../../../dev/firewire/firewire.c:302
kernel: stack backtrace:
kernel: #0 0xffffffff80aa1250 at witness_debugger+0x70
kernel: #1 0xffffffff80aa1144 at witness_checkorder+0xe54
kernel: #2 0xffffffff80a22a04 at __mtx_lock_flags+0xa4
kernel: #3 0xffffffff836f6423 at fw_asyreq+0x2d3
kernel: #4 0xffffffff80a580dc at softclock_call_cc+0x19c
kernel: #5 0xffffffff80a584d7 at softclock+0x47
kernel: #6 0xffffffff80a07f26 at intr_event_execute_handlers+0x96
kernel: #7 0xffffffff80a085a6 at ithread_loop+0xa6
kernel: #8 0xffffffff80a05674 at fork_exit+0x84
kernel: #9 0xffffffff80e988be at fork_trampoline+0xe
kernel: lock order reversal:
kernel: 1st 0xfffff8000a099748 sbp (sbp) @ /usr/src/sys/kern/kern_mutex.c:220
kernel: 2nd 0xfffff8000ff37460 CAM device lock (CAM device lock) @ /usr/src/sys/cam/scsi/scsi_xpt.c:2356
kernel: stack backtrace:
kernel: #0 0xffffffff80aa1250 at witness_debugger+0x70
kernel: #1 0xffffffff80aa1144 at witness_checkorder+0xe54
kernel: #2 0xffffffff80a22a04 at __mtx_lock_flags+0xa4
kernel: #3 0xffffffff80319f12 at scsi_scan_lun+0x122
kernel: #4 0xffffffff836eef40 at sbp_cam_scan_target+0x100
kernel: #5 0xffffffff80a580dc at softclock_call_cc+0x19c
kernel: #6 0xffffffff80a584d7 at softclock+0x47
kernel: #7 0xffffffff80a07f26 at intr_event_execute_handlers+0x96
kernel: #8 0xffffffff80a085a6 at ithread_loop+0xa6
kernel: #9 0xffffffff80a05674 at fork_exit+0x84
kernel: #10 0xffffffff80e988be at fork_trampoline+0xe

The code badly needs some re-engineering. SBP really should implement
a CAM transport, so that it avoids control flow inversion when re-scanning
the bus. Also, the sbp lock seems to be too coarse.

Also, the change includes some changes to related to locking.

  • sbp_cam_scan_lun: restore CAM_DEV_QFREEZE before re-queueing the ccb because xpt_setup_ccb resets ccb_h.flags
  • sbp_post_busreset: call xpt_release_simq only if it's actually frozen
  • don't place private SIMQ_FREEZED flag (sic, "freezed") into sim->flags, use sbp->flags for that
  • some style fixes and control flow enhancements
Test Plan

Tested with an external firewire disk:
sbp0: sbp_show_sdev_info: sbp0:0:0: ordered:1 type:14 EUI:0050770e00071002 node:0 speed:2 maxrec:8
sbp0: sbp_show_sdev_info: sbp0:0:0 'Prolific PL3507 Combo Device' '(1394-ATAPI rev1.10)' '012804'
da0 at sbp0 bus 0 scbus7 target 0 lun 0
da0: <Prolific (1394-ATAPI rev1 2804> Fixed Simplified Direct Access SPC-2 SCSI device
da0: 50.000MB/s transfers
da0: 305244MB (625140335 512 byte sectors)
da0: quirks=0x2<NO_6_BYTE>

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

avg retitled this revision from to firewire/sbp: try to improve locking, plus a few style nits.
avg updated this object.
avg edited the test plan for this revision. (Show Details)
avg added reviewers: mav, sbruno, scottl.
sbruno edited edge metadata.

Go for it. I took a look at what is being restructured, and this looks good.

This revision is now accepted and ready to land.Mar 6 2017, 1:49 PM
This revision was automatically updated to reflect the committed changes.