Page MenuHomeFreeBSD

LinuxKPI w/ LKPI_80211_HW_CRYPTO: fix non-sleepable mutex use.
AbandonedPublic

Authored by cc on Feb 16 2024, 9:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Apr 8, 10:37 PM
Unknown Object (File)
Mar 25 2024, 10:41 PM
Unknown Object (File)
Feb 22 2024, 4:52 AM
Unknown Object (File)
Feb 21 2024, 2:25 AM

Details

Reviewers
bz
Summary

PR: 277095
Sponsored by: The FreeBSD Foundation
MFC after: 3 days

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 56068
Build 52957: arc lint + arc unit

Event Timeline

cc requested review of this revision.Feb 16 2024, 9:44 PM
bz requested changes to this revision.Feb 16 2024, 10:32 PM
bz added inline comments.
sys/net80211/ieee80211_node.c
2209

That leaves all the following ni->*key* checks unprotected.
You really want to fix that problem you are seeing in the "driver" which in your case is LinuxKPI.

This revision now requires changes to proceed.Feb 16 2024, 10:32 PM
sys/net80211/ieee80211_node.c
2209

That leaves all the following ni->*key* checks unprotected.
You really want to fix that problem you are seeing in the "driver" which in your case is LinuxKPI.

The original node lock in net80211/ seems to protect the node table only.

Look around, it looks, for example the "ni->ni_ucastkey" is not protected in ieee80211_ioctl_getkey() and ieee80211_ioctl_setkey(). Does it mean we need to add additional lock to protect the ni->*key*?

sys/net80211/ieee80211_node.c
2209

Ok, bad explanation of me.

It makes sure your ni doesn't go away or more importantly prevents other invariants of re-use which exist in net80211.
That's probably why the lock is called NODE_LOCK and not NODE_TABLE_LOCK.

The malloc issue from the PR on this path is fixed in D43648.
The downcall isn't; but as the comment above suggests you can come into this function with the lock held already. That means even with your change you are still calling ieee80211_crypto_delkey possibly with a lock held and trigger the panic? Or is the comment stale?

I wonder how wpi or rtwn or others do these "downcalls"? A quick glance suggests that they are expecting to sleep in that call path too?

sys/net80211/ieee80211_node.c
2209

Ok, bad explanation of me.

It makes sure your ni doesn't go away or more importantly prevents other invariants of re-use which exist in net80211.
That's probably why the lock is called NODE_LOCK and not NODE_TABLE_LOCK.

The malloc issue from the PR on this path is fixed in D43648.
The downcall isn't; but as the comment above suggests you can come into this function with the lock held already. That means even with your change you are still calling ieee80211_crypto_delkey possibly with a lock held and trigger the panic? Or is the comment stale?

I wonder how wpi or rtwn or others do these "downcalls"? A quick glance suggests that they are expecting to sleep in that call path too?

OK. Looks wpi does that.

Also look at ieee80211_drain(), if I remember correctly, m_freem() may sleep, right?

{
....
    IEEE80211_NODE_LOCK(nt);
    ....
    m_freem()
    IEEE80211_NODE_UNLOCK
....
}
cc added inline comments.
sys/net80211/ieee80211_node.c
2209

Got some feedback and said that m_freem() is not sleepable.

I think I will be focusing on D43648, which also fixes PR277095.