Differential D19468

iflib: prevent possible infinite loop in iflib_encap
ClosedPublic
Actions

Authored by jacob.e.keller_intel.com on Mar 5 2019, 7:12 PM.

Details

Reviewers

shurd
erj
gallatin

Group Reviewers

Intel Networking
iflib

Commits

rS345303: iflib: prevent possible infinite loop in iflib_encap

Summary

iflib_encap calls bus_dmamap_load_mbuf_sg. Upon it returning EFBIG, an
m_collapse and an m_defrag are attempted to shrink the mbuf cluster to
fit within the DMA segment limitations.

However, if we call m_defrag, and then bus_dmamap_load_mbuf_sg returns
EFBIG on the now defragmented mbuf, we will continuously re-call
bus_dmamap_load_mbuf_sg over and over.

This happens because m_head isn't NULL, and remap is >1, so we don't try
to m_collapse or m_defrag again. The only way we exit the loop is if
m_head is NULL. However, m_head can't be modified by the call to
bus_dmamap_load_mbuf_sg, because we don't pass it as a double pointer.

I believe this will be an incredibly rare occurrence, because it is
unlikely that bus_dmamap_load_mbuf_sg will actually fail on the second
defragment with an EFBIG error. However, it still seems like
a possibility that we should account for.

Fix the exit check to ensure that if remap is >1, we will also exit,
even if m_head is not NULL.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

jacob.e.keller_intel.com created this revision.Mar 5 2019, 7:12 PM

Herald added a reviewer: shurd. · View Herald TranscriptMar 5 2019, 7:12 PM

Herald added a subscriber: ae. · View Herald Transcript

Harbormaster completed remote builds in B22899: Diff 54738.Mar 5 2019, 7:12 PM

I found this during code inspection while looking at a similar construction for a legacy device driver. It's possible this can never trigger, depending on how bus_dmamap_load_mbuf_sg returns EFBIG, but it definitely seemed plausible that it might happen, and potentially locking up a CPU is not a happy thing to do.

Move the remap++ to after the error check

Harbormaster completed remote builds in B22900: Diff 54740.Mar 5 2019, 7:18 PM

jacob.e.keller_intel.com added inline comments.Mar 5 2019, 7:18 PM

sys/net/iflib.c
3289 ↗	(On Diff #54738)	Actually, I'm pretty sure this is incorrect. because remap will be >1 after the first call to m_defrag. Woops.

I'm definitely open to better/alternative solutions, and no, I don't actually have a live replication of the issue.

jacob.e.keller_intel.com added reviewers: Intel Networking, erj.Mar 7 2019, 12:24 AM

I think this is a mostly theoretical issue, as you'll have had a chain which was impossible to send (eg, far too large, etc) even after defrag.

This revision is now accepted and ready to land.Mar 13 2019, 6:42 PM

add an MPASS so we trigger a panic when in debug

This revision now requires review to proceed.Mar 13 2019, 7:05 PM

Harbormaster completed remote builds in B23066: Diff 55033.Mar 13 2019, 7:05 PM

In D19468#418973, @gallatin wrote:

I think this is a mostly theoretical issue, as you'll have had a chain which was impossible to send (eg, far too large, etc) even after defrag.

Yea, I agree it's theoretical. It's more of a "the consequences of this is a locked up CPU, so I'd rather avoid that if it ever does happen"

shurd added inline comments.Mar 14 2019, 5:17 PM

sys/net/iflib.c
3283 ↗	(On Diff #55033)	Shouldn't this be moved to the top of the block? No need to actually call m_defrag() before the MPASS() I think.

Herald added a reviewer: iflib. · View Herald TranscriptMar 14 2019, 5:17 PM

gallatin accepted this revision.Mar 14 2019, 5:56 PM

gallatin added inline comments.

sys/net/iflib.c
3283 ↗	(On Diff #55033)	if remap is >1, then neither defrag nor collapse will be called. so i don't think the placement matters, as long is it is prior to the increment (which it is)