iscsi(4) freeze vs action race workaround
ClosedPublic

Authored by trasz on Oct 12 2020, 1:29 PM.
Tags
None

Details

Summary

If the SIM freezes the queue at exactly the wrong moment, after
another thread has started to send in a CCB and has already checked
that the queue wasn't frozen, we would end up with iscsi_action()
being called even though the queue is now frozen.

Add a check to make sure this doesn't happen. Perhaps this should
be fixed at the CAM level instead, but given how the send queue
and SIM are governed by two separate mutexes, it could be somewhat
hard to do.
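
The race and the driver-side check can be modeled in user space. What follows is a minimal sketch, not the code from this revision: every name in it (model_devq, model_session, model_may_dispatch(), model_freeze(), model_action(), model_replay_race(), and the MODEL_* status values) is hypothetical, and pthread mutexes stand in for the kernel's send-queue and SIM-side locks. It replays the losing interleaving one step at a time, so the outcome does not depend on thread scheduling.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical status codes, loosely modeled on CAM's; not the real values. */
enum model_status { MODEL_REQ_INPROG, MODEL_REQUEUE_REQ };

/* Stand-in for the send queue, guarded by its own mutex. */
struct model_devq {
	pthread_mutex_t	send_mtx;
	int		frozen_cnt;
};

/* Stand-in for the driver's per-session state, guarded by a second mutex. */
struct model_session {
	pthread_mutex_t	lock;
	bool		simq_frozen;	/* driver's own record of the freeze */
	int		sent;		/* requests actually handed to the wire */
};

/* Dispatcher side: may a request be passed to the driver right now? */
static bool
model_may_dispatch(struct model_devq *dq)
{
	bool ok;

	pthread_mutex_lock(&dq->send_mtx);
	ok = (dq->frozen_cnt == 0);
	pthread_mutex_unlock(&dq->send_mtx);
	return (ok);
}

/* Driver side: freeze the queue, for example when the connection drops. */
static void
model_freeze(struct model_devq *dq, struct model_session *s)
{
	pthread_mutex_lock(&s->lock);
	s->simq_frozen = true;
	pthread_mutex_unlock(&s->lock);

	pthread_mutex_lock(&dq->send_mtx);
	dq->frozen_cnt++;
	pthread_mutex_unlock(&dq->send_mtx);
}

/* Driver entry point: re-check the freeze under the driver's own lock. */
static enum model_status
model_action(struct model_session *s)
{
	enum model_status st;

	pthread_mutex_lock(&s->lock);
	if (s->simq_frozen) {
		/* The dispatcher lost the race; hand the request back. */
		st = MODEL_REQUEUE_REQ;
	} else {
		s->sent++;
		st = MODEL_REQ_INPROG;
	}
	pthread_mutex_unlock(&s->lock);
	return (st);
}

/* Replay the losing interleaving: check, then freeze, then action. */
static enum model_status
model_replay_race(struct model_devq *dq, struct model_session *s)
{
	bool ok = model_may_dispatch(dq);	/* 1: queue not frozen yet */

	assert(ok);
	(void)ok;
	model_freeze(dq, s);			/* 2: freeze sneaks in */
	return (model_action(s));		/* 3: action is called anyway */
}
```

With both structures initialized unfrozen, model_replay_race() takes the requeue branch and leaves the sent counter untouched. Without the re-check in model_action(), the same interleaving would send a request on a frozen queue. The sketch keeps the two mutexes separate and never holds both at once, which mirrors the reason given above for doing the check in the driver rather than in the dispatcher.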

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

trasz requested review of this revision. Oct 12 2020, 1:29 PM

It seems like a real race window, opened by my fine-grained locking years ago. The freeze is protected by devq->send_mtx, while the driver code is not; it is protected either by the SIM lock or possibly by nothing at all. So for this specific driver your solution seems correct. For the general case, though, it is more difficult.

This revision is now accepted and ready to land. Oct 12 2020, 2:13 PM

There are many races here around frozen queues, and I'll wager it's best to check in the SIM whether the race was lost, as this change does, rather than introduce a new lock that would contend.

Having said that, Scott is making the case that the whole recovery model needs a good overhaul, given the diversity of practice in existing SIMs.