
net80211: improve logging about state transitions lost
Closed, Public

Authored by bz on Nov 12 2023, 11:58 PM.

Details

Summary

It is possible that we call ieee80211_new_state_locked() again before
a previous task has run to completion (not run yet or unlocked in
between) since 5efea30f039c4 (and follow-up).
In either case we would overwrite the new state and argument in the vap.
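
To illustrate the overwrite described above, here is a minimal userspace
sketch. It is not the net80211 code: every name in it (sketch_vap,
sketch_new_state(), sketch_run_task(), sketch_demo()) is hypothetical, the
deferred task is modelled as a plain flag, and there is no locking. It only
shows how a second request that arrives before the deferred task has run
replaces the pending state and argument, so the first transition is never
carried out.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; not the net80211 definitions. */
enum sketch_state { S_INIT, S_SCAN, S_AUTH, S_ASSOC, S_RUN };

struct sketch_vap {
	enum sketch_state cur;		/* state last applied by the task */
	enum sketch_state pending;	/* requested state, not yet applied */
	int pending_arg;		/* argument stored with the request */
	int task_queued;		/* deferred task has not run yet */
	int lost;			/* requests overwritten before running */
};

static const char *
sketch_state_name(enum sketch_state s)
{
	static const char *names[] = { "INIT", "SCAN", "AUTH", "ASSOC", "RUN" };

	return (names[s]);
}

/* Record a state change request; the actual work is deferred to a task. */
static void
sketch_new_state(struct sketch_vap *vap, enum sketch_state ns, int arg)
{
	if (vap->task_queued) {
		/* The earlier request never ran and is about to be replaced. */
		vap->lost++;
		printf("state: pending %s (arg %d) overwritten by %s (arg %d)\n",
		    sketch_state_name(vap->pending), vap->pending_arg,
		    sketch_state_name(ns), arg);
	}
	vap->pending = ns;
	vap->pending_arg = arg;
	vap->task_queued = 1;
}

/* The deferred task: apply whatever request is stored at this point. */
static void
sketch_run_task(struct sketch_vap *vap)
{
	if (!vap->task_queued)
		return;
	vap->cur = vap->pending;
	vap->task_queued = 0;
}

/* Two requests before the task runs; returns the number of lost requests. */
int
sketch_demo(void)
{
	struct sketch_vap vap = { .cur = S_SCAN };

	sketch_new_state(&vap, S_AUTH, 1);
	sketch_new_state(&vap, S_ASSOC, 2);	/* arrives before the task ran */
	sketch_run_task(&vap);

	/* The task only ever saw ASSOC; the AUTH request was dropped. */
	assert(vap.cur == S_ASSOC);
	return (vap.lost);
}
```

Calling sketch_demo() from a small main() reports one lost request: the
AUTH step is skipped, which is the kind of missing transition a consumer
tracking every state change would trip over.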

While most drivers somehow deal with that (or not), LinuxKPI 802.11 compat
code has KASSERTs to keep net80211, LinuxKPI and driver/firmware state in
sync and they may trigger due to a missing transition (or more likely a
changed ni/lsta).

Enhance the wlandebug +state logging for these cases so they
are easier to debug.

While here, remove the unconditional logging to the message buffer;
it has been there for a good decade but has not helped to actually
identify and sort out the problem.

Sponsored by: The FreeBSD Foundation

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bz requested review of this revision. Nov 12 2023, 11:58 PM

remove two extra comments stuck in a rebase in the other branch

Hi @bz and @cc,

I have a laptop which suffers from the state transition-related panics. When I got the machine earlier this year, I debugged it a bit to find basically what you are describing; new state transitions being initiated before the previous one could finish. I haven't really looked at it since.

Is there any way I can help you evaluate this issue? I guess I can start by capturing the output with this change applied and state debugging enabled. If there is something else, let me know :)

Mitchell

We are currently pondering how to deal with them; I admit I was mostly focused on LinuxKPI, but reading your message (after replying to cc@) I realised that the native drivers also suffer from this, though in different ways and with lower likelihood currently. So maybe we will have to find a solution in net80211 long-term. The problem may be that for now we need to fix this for the iwlwifi users, as it remains the main obstacle, and in addition the KASSERTs trigger panics in main.

@cc any further thoughts on this one?

Thanks for the reminder.

This revision is now accepted and ready to land. Dec 21 2023, 1:18 PM