Page MenuHomeFreeBSD

iflib: prevent possible infinite loop in iflib_encap
ClosedPublic

Authored by jacob.e.keller_intel.com on Mar 5 2019, 7:12 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 12, 12:00 PM
Unknown Object (File)
Tue, Dec 10, 10:41 AM
Unknown Object (File)
Sun, Dec 8, 1:28 PM
Unknown Object (File)
Sat, Dec 7, 8:19 PM
Unknown Object (File)
Wed, Nov 20, 9:05 AM
Unknown Object (File)
Oct 31 2024, 9:27 PM
Unknown Object (File)
Oct 18 2024, 9:59 PM
Unknown Object (File)
Oct 16 2024, 10:53 PM
Subscribers

Details

Summary

iflib_encap calls bus_dmamap_load_mbuf_sg. Upon it returning EFBIG, an
m_collapse and an m_defrag are attempted to shrink the mbuf cluster to
fit within the DMA segment limitations.

However, if we call m_defrag, and then bus_dmamap_load_mbuf_sg returns
EFBIG on the now defragmented mbuf, we will continuously re-call
bus_dmamap_load_mbuf_sg over and over.

This happens because m_head isn't NULL, and remap is >1, so we don't try
to m_collapse or m_defrag again. The only way we exit the loop is if
m_head is NULL. However, m_head can't be modified by the call to
bus_dmamap_load_mbuf_sg, because we don't pass it as a double pointer.

I believe this will be an incredibly rare occurrence, because it is
unlikely that bus_dmamap_load_mbuf_sg will actually fail on the second
defragment with an EFBIG error. However, it still seems like
a possibility that we should account for.

Fix the exit check to ensure that if remap is >1, we will also exit,
even if m_head is not NULL.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 22900
Build 21986: arc lint + arc unit

Event Timeline

I found this during code inspection while looking at a similar construction for a legacy device driver. It's possible this can never trigger, depending on how bus_dmamap_load_mbuf_sg returns EFBIG, but it definitely seemed plausible that it might happen, and potentially locking up a CPU is not a happy thing to do.

Move the remap++ to after the error check

sys/net/iflib.c
3288

Actually, I'm pretty sure this is incorrect. because remap will be >1 after the first call to m_defrag. Woops.

I'm definitely open to better/alternative solutions, and no, I don't actually have a live replication of the issue.

gallatin added a subscriber: gallatin.

I think this is a mostly theoretical issue, as you'll have had a chain which was impossible to send (eg, far too large, etc) even after defrag.

This revision is now accepted and ready to land.Mar 13 2019, 6:42 PM

add an MPASS so we trigger a panic when in debug

This revision now requires review to proceed.Mar 13 2019, 7:05 PM

I think this is a mostly theoretical issue, as you'll have had a chain which was impossible to send (eg, far too large, etc) even after defrag.

Yea, I agree it's theoretical. It's more of a "the consequences of this is a locked up CPU, so I'd rather avoid that if it ever does happen"

sys/net/iflib.c
3292

Shouldn't this be moved to the top of the block? No need to actually call m_defrag() before the MPASS() I think.

gallatin added inline comments.
sys/net/iflib.c
3292

if remap is >1, then neither defrag nor collapse will be called. so i don't think the placement matters, as long is it is prior to the increment (which it is)

sys/net/iflib.c
3292

Yea, we don't call either one if the MPASS is ever triggered.

This revision is now accepted and ready to land.Mar 14 2019, 10:25 PM
This revision was automatically updated to reflect the committed changes.