led: call led state functions in a context with no locks held
AcceptedPublic
Actions

Authored by avg on Jul 19 2021, 8:01 AM.

Details

Reviewers

phk
gonzo
loos
adrian
mav
ganbold

Summary

This is to allow for LEDs controllable via transports such USB and I2C
where waiting (sleeping) is expected.
Previously the calls were made with a mutex held or even from a callout
context.

The easing of restrictions comes at a cost of having a dedicated kernel
thread for LED blinking duties.

I have not decided yet whether it's okay to "permanently" set a LED via
a direct call or if it would be better to defer the setting to the thread
as well. The former would have an advantage of decoupling the LED control
logic from the caller's context. The disadvantage is obvious, the
operation would become asynchronous.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 40522
Build 37411: arc lint + arc unit

Event Timeline

avg created this revision.Jul 19 2021, 8:01 AM

Herald added a subscriber: imp. · View Herald TranscriptJul 19 2021, 8:01 AM

avg requested review of this revision.Jul 19 2021, 8:01 AM

Harbormaster completed remote builds in B40522: Diff 92412.Jul 19 2021, 8:01 AM

avg added reviewers: phk, gonzo, loos, adrian, mav, ganbold.Jul 19 2021, 8:04 AM

Tested this with D30787 where LEDs are controlled via I²C and had no issues with turning the LEDs on / off and making then blink in various combinations and sequences.
No crashes, no LORs, no WITNESS warnings.

I can see the point, but I cringe at the thought of adding a kthread to just to blink LEDs...

Yeah, me too... but doing it directly from a callout puts a lot of restrictions.
I considered creating and destroying the thread dynamically as "blinkers" are added and removed.
I can implement that if you think it would be better than having the thread permanently.

I think no matter what we agree here, people will probably take notice once the commit goes in, so it may be one of those cases where starting the party pre-commit in arch@ is warranted ?

If additional thread worries, may be it could be postponed to first LED creation. There should be systems without LEDs.

This revision is now accepted and ready to land.Jul 19 2021, 1:52 PM

in general I agree with "push it into a thread", especially since we want to support LED blinking on things like behind an i2c bus, but the unlock;func;lock thing without checking the list has changed or the LED itself has been deleted is problematic to me. Yes, sure, the race is tiny and it only affects things if we're modifying the list at the same time we're doing some work, but surely we can handle that case. :-) That's why it was locked in the first place!

sys/dev/led/led.c
90	can we not keep adding unlock; thing; lock as a pattern? That means that things like the led list can be modified whilst the func() is being called.

@adrian , I thought that I took care of that via the in_use flag.
Could you please look again?

In D31215#702973, @avg wrote:

@adrian , I thought that I took care of that via the in_use flag.
Could you please look again?

hm, maybe? I didn't see that, I only saw the "ugh, not again" unlock/lock pattern. :-)

You're not using in_use during adding a new LED to the list, hm. I wonder if there's some weirdness there where the list changes in between iterations (ie, it grows a new node at the end) that confuses things.

Also, how's this work if there's a non-blink call (eg an immediate state change call through led_state()) whilst destroy() is called? It doesn't look like wakeup() is being called there?

@adrian , in my opinion two things guarantee correct operation:

the current element can never be removed (thanks to in_use)
an iteration is always done with the lock held

So, whatever perturbations might happen while the lock is dropped, they should settle down before the next iteration.

You are right about the messing wakeup, I'll add it. Thanks!

Finally, I did consider removing led_mtx and using led_sx for everything.
led_sx is sleepable, so we wouldn't have to drop it.
But I decided against it because I did not want a USB timeout or an I2C one to block operations like adding or removing a LED.
Maybe there is no reason to worry about that and I should just make the code simpler.

In D31215#702999, @avg wrote:

@adrian , in my opinion two things guarantee correct operation:

the current element can never be removed (thanks to in_use)

an iteration is always done with the lock held

So, whatever perturbations might happen while the lock is dropped, they should settle down before the next iteration.

I mean, we really should have a design pattern for this in our kernel :-) This seems like something we all do often.

You are right about the messing wakeup, I'll add it. Thanks!

You're welcome!

Finally, I did consider removing led_mtx and using led_sx for everything.
led_sx is sleepable, so we wouldn't have to drop it.
But I decided against it because I did not want a USB timeout or an I2C one to block operations like adding or removing a LED.
Maybe there is no reason to worry about that and I should just make the code simpler.

I'm ... not a big fan for sx locks everywhere, I've had bad experiences with them when callbacks are involved due to LORs. :(

+1 for me to land it with that missing wakeup() :-) Thanks for poking at this!

-adrian

The easing of restrictions comes at a cost of having a dedicated kernel thread for LED blinking duties.

You may use timeout interface for taskqueue to avoid extra kernel thread creation

@wulf , I considered that well. But if we use a dedicate taskqueue then it means having a thread anyway. And I didnd't want to put potentially slow LED operations on one of the global taskqueues.

Re task queues: One of the ways to drive musicians right up the wall, is to have a LED flashing erratically, but not randomly. (See also: "Vetinari Clock") I think that counts against the task-queues here.

jlduran_gmail.com mentioned this in D32388: led: Eliminate sleepable mutex.Oct 9 2021, 4:59 AM