Page MenuHomeFreeBSD

Protect smbus ioctls in ig4 driver using a shared lock
ClosedPublic

Authored by grembo on Jun 6 2015, 5:57 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Jan 19, 6:20 AM
Unknown Object (File)
Mon, Jan 6, 9:21 PM
Unknown Object (File)
Sun, Jan 5, 1:08 AM
Unknown Object (File)
Sat, Jan 4, 5:45 AM
Unknown Object (File)
Thu, Jan 2, 3:48 PM
Unknown Object (File)
Mon, Dec 30, 2:31 AM
Unknown Object (File)
Mon, Dec 30, 2:13 AM
Unknown Object (File)
Mon, Dec 30, 1:52 AM
Subscribers

Details

Summary

In https://reviews.freebsd.org/D2372 one important finding wasn't
addressed. Multiple calls over the smbus interface implemented by
the driver could trample over each other as the lock protecting
them was lifted in mtx_sleep to allow the interrupt handler to
do its work. The solution implemented is to add a shared lock (sx)
to protect calls/transactions (ig4_call_lock), while keeping the
mutex to protect I/O and allow synchronization of the interrupt handler
and wait_status (ig4_io_lock). I hope this makes sense.

Test Plan

Install, load on boot, load after boot, make sure everything still works.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

grembo retitled this revision from to Protect smbus ioctls in ig4 driver using a shared lock.
grembo updated this object.
grembo edited the test plan for this revision. (Show Details)
grembo added reviewers: jhb, adrian.

I am not sure that what you describe as a race between simultaneous ioctls really possible, esp. after the r281828.

I think that clarification of the mutex/io_lock semantic is required as the first step.

I think it is possible for calls into the smbus interface to get interleaved right now. If they come from smb(4) ioctls, they could get interleaved because that driver does not hold its own sx like iic(4) does (and did even before r281828). But those calls could also be made by another driver directly calling something like smbus_bwrite(), without going through smb(4).

It might be better to instead require callers to use smbus_request_bus(), which smb(4) already does. But for that to work, smbus_request_bus() needs some cleanup similar to what was done in iicbus_request_bus(). It also might be a pain to fixup the drivers that directly call smbus_* but don't use request_bus().

sys/dev/ichiic/ig4_iic.c
636โ€“640 โ†—(On Diff #5995)

Since this function doesn't do any bus locking, it's probably better to just return 0 here. Callers that pass SMB_DONTWAIT might not be able to sleep.

grembo edited edge metadata.

Remove locks in callback and simplify unsupported smb_quick call.

I think that clarification of the mutex/io_lock semantic is required as the first step.

call_lock protects a series of calls to the controller (set_controller/slave/wait_status) that form an ioctl so they won't interleave, while io_lock is used to coordinate the interrupt handler with other I/O to the controller, especially wait_status.

In D2744#53712, @grembo wrote:

I think that clarification of the mutex/io_lock semantic is required as the first step.

call_lock protects a series of calls to the controller (set_controller/slave/wait_status) that form an ioctl so they won't interleave, while io_lock is used to coordinate the interrupt handler with other I/O to the controller, especially wait_status.

That far I can understand myself from reading the code. What I cannot get from the code alone, without using some kind of the mind-reading low-tech solution, is what is the intent for the bounds of transactions manipulating the hardware state. E.g., why is it fine to allow interrupt handler to intervene while waiting for i/o completion: what hw state should be re-acquired after the sleep ? Does interrupt handler need exclusion with the top half of the driver, if needed, why ? ... etc

In D2744#53837, @kib wrote:
In D2744#53712, @grembo wrote:

I think that clarification of the mutex/io_lock semantic is required as the first step.

call_lock protects a series of calls to the controller (set_controller/slave/wait_status) that form an ioctl so they won't interleave, while io_lock is used to coordinate the interrupt handler with other I/O to the controller, especially wait_status.

That far I can understand myself from reading the code. What I cannot get from the code alone, without using some kind of mind-reading low-tech solution, is what is the intent for the bounds of transactions manipulating the hardware state. E.g., why is it fine to allow the interrupt handler to intervene while waiting for i/o completion:

Why do you think this wouldn't be the case?

what hw state should be re-acquired after the sleep ? Does the interrupt handler need exclusion with the top half of the driver, if needed, why ? ... etc

Which sleep are you referring to (mtx_sleep in wait_status maybe)? Why would a hardware state need to be reacquired? And where does the "top half of the driver" end?

In D2744#53895, @grembo wrote:
In D2744#53837, @kib wrote:
In D2744#53712, @grembo wrote:

I think that clarification of the mutex/io_lock semantic is required as the first step.

call_lock protects a series of calls to the controller (set_controller/slave/wait_status) that form an ioctl so they won't interleave, while io_lock is used to coordinate the interrupt handler with other I/O to the controller, especially wait_status.

That far I can understand myself from reading the code. What I cannot get from the code alone, without using some kind of mind-reading low-tech solution, is what is the intent for the bounds of transactions manipulating the hardware state. E.g., why is it fine to allow the interrupt handler to intervene while waiting for i/o completion:

Why do you think this wouldn't be the case?

what hw state should be re-acquired after the sleep ? Does the interrupt handler need exclusion with the top half of the driver, if needed, why ? ... etc

Which sleep are you referring to (mtx_sleep in wait_status maybe)? Why would a hardware state need to be reacquired? And where does the "top half of the driver" end?

Or do you mean mtx_sleep in set_controller (which allows the interrupt handler to read, which might not be good)? Stating the obvious, the interrupt handler can only read data if no ig4iic_smb_* call is in progress or during mtx_sleep in wait_status/set_controller. Exclusion is required in other places to make rpos/rnext predictable (e.g. whenever data_read is called and in smb_transaction).

I don't feel like I really understand what you mean, so more details/hints would be useful.

In D2744#53943, @grembo wrote:
In D2744#53895, @grembo wrote:
In D2744#53837, @kib wrote:
In D2744#53712, @grembo wrote:

I think that clarification of the mutex/io_lock semantic is required as the first step.

call_lock protects a series of calls to the controller (set_controller/slave/wait_status) that form an ioctl so they won't interleave, while io_lock is used to coordinate the interrupt handler with other I/O to the controller, especially wait_status.

That far I can understand myself from reading the code. What I cannot get from the code alone, without using some kind of mind-reading low-tech solution, is what is the intent for the bounds of transactions manipulating the hardware state. E.g., why is it fine to allow the interrupt handler to intervene while waiting for i/o completion:

Why do you think this wouldn't be the case?

what hw state should be re-acquired after the sleep ? Does the interrupt handler need exclusion with the top half of the driver, if needed, why ? ... etc

Which sleep are you referring to (mtx_sleep in wait_status maybe)? Why would a hardware state need to be reacquired? And where does the "top half of the driver" end?

Or do you mean mtx_sleep in set_controller (which allows the interrupt handler to read, which might not be good)? Stating the obvious, the interrupt handler can only read data if no ig4iic_smb_* call is in progress or during mtx_sleep in wait_status/set_controller. Exclusion is required in other places to make rpos/rnext predictable (e.g. whenever data_read is called and in smb_transaction).

Actually, this is a good explanation of what I asked for. If you add a comment somewhere stating that the interrupt handler can do that, and only interrupt handler can run when mtx is dropped (due to sx), it would provide the context which was missed when I tried to understand the locking mode.

I don't feel like I really understand what you mean, so more details/hints would be useful.

Add comment that aims to clarify locking semantics.

Same as before, but this time with the actual change: Add comment
to clarify locking semantics.

Sorry.

sys/dev/ichiic/ig4_iic.c
920 โ†—(On Diff #6364)

The more logical place for this comment is in ig4_var.h, somewhere near sc definition.
Also, it would be really nice to include explicit language like 'it is safe to drop io_lock in the wait routine because interrupt handler only access that and that registers ...' .

Use of ig4_ prefix for locks, while all other members of the softc structure are not prefixed, is strange.

kib added a reviewer: kib.
kib added inline comments.
sys/dev/ichiic/ig4_iic.c
184 โ†—(On Diff #6372)

BTW, PZERO is not same as 0. I do not see why would this wait should change the thread priority.

664 โ†—(On Diff #6372)

According to style, you need empty line after '{' if no local vars are defined.

This revision is now accepted and ready to land.Jun 22 2015, 8:24 AM
grembo edited edge metadata.

Add extra newline to comply to style(9), change PZERO to 0 in
mtx_sleep as suggestedi. Tested the change it works fine.

This revision now requires review to proceed.Jun 22 2015, 8:54 AM
grembo edited edge metadata.

Add extra newline to comply to style(9), change PZERO to 0 in mtx_sleep

This revision was automatically updated to reflect the committed changes.