bridge: Fix a potential memory leak in bridge_enqueue()
ClosedPublic
Actions

Authored by markj on Dec 8 2022, 8:34 PM.

Details

Reviewers

kp
zlei

Group Reviewers

network

Commits

rGebf66b2a3b39: bridge: Fix a potential memory leak in bridge_enqueue()
rG51088797305c: bridge: Fix a potential memory leak in bridge_enqueue()

Summary

A comment at the beginning of the function notes that we may be
transmitting multiple fragments as distinct fragments. So, the function
loops over all fragments, transmitting each mbuf chain. If if_transmit
fails, we need to free all of the fragments, but m_freem() only frees an
mbuf chain - it doesn't follow m_nextpkt.

Change the error handler to free each packet fragment, and count each
fragment as a separate error since we increment OPACKETS once per
fragment when transmission is successful.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 48653
Build 45539: arc lint + arc unit

Event Timeline

markj created this revision.Dec 8 2022, 8:34 PM

Herald added subscribers: glebius, melifaro, ae, imp. · View Herald TranscriptDec 8 2022, 8:34 PM

markj requested review of this revision.Dec 8 2022, 8:34 PM

Harbormaster completed remote builds in B48653: Diff 113951.Dec 8 2022, 8:34 PM

We could presumably also fix the mbuf leak by doing 'continue' in the if_transmit case rather than 'break'. Then we'd try to transmit the other packets too.
I wonder if it's reasonable to assume that if one packet fails the entire following chain will fail too.
If it likely will then your approach is better. My very quick look at two random drivers leads me to believe that'd be the case.

Another (very subtle) distinction in those parameters is if the bridge member interface gets to count those packets as errors or not.

sys/net/if_bridge.c
2069	Is there any point to this if m_freem() doesn't follow m_nextpkt anyway?

In D37635#855754, @kp wrote:

We could presumably also fix the mbuf leak by doing 'continue' in the if_transmit case rather than 'break'. Then we'd try to transmit the other packets too.
I wonder if it's reasonable to assume that if one packet fails the entire following chain will fail too.
If it likely will then your approach is better. My very quick look at two random drivers leads me to believe that'd be the case.

Generally I'd expect them all to fail, since the most likely cause is that we're hitting a queue limit or buffer allocation failure somewhere up the stack. It also doesn't seem all that useful to me to continue trying to transmit fragments when we know that at least one will be missing. Either a retransmit will cause bridge_fragment() to re-fragment the whole packet again, or the successfully transmitted fragments will eventually be dropped by the receiver. I'm happy to change it, but it doesn't seem too consequential? FWIW I only noticed this by code inspection.

Another (very subtle) distinction in those parameters is if the bridge member interface gets to count those packets as errors or not.

Hmm, are you suggesting that the member interface should be responsible for counting the failed packet, i.e., the loop should initialize n to 0 and not 1?

sys/net/if_bridge.c
2069	No, I'll just remove it.

In D37635#855756, @markj wrote:

It also doesn't seem all that useful to me to continue trying to transmit fragments when we know that at least one will be missing. Either a retransmit will cause bridge_fragment() to re-fragment the whole packet again, or the successfully transmitted fragments will eventually be dropped by the receiver.

Yes, that's another good reason to prefer your approach.
I think I was just mostly thinking through the difference between the two.
Let's go with this.

Another (very subtle) distinction in those parameters is if the bridge member interface gets to count those packets as errors or not.

Hmm, are you suggesting that the member interface should be responsible for counting the failed packet, i.e., the loop should initialize n to 0 and not 1?

No, it's part of me thinking through this. It's not worth trying to do anything about this.
My thinking was that we'd end up dropping more packets than just the first one, and it is technically the fault of the interface, not the bridge, so that'd be the reason to want the errors counted there as well. We're going to get some error counts there anyway, even if not always as many as we'd want.

(bridge_fragment() should be an unusual code path as well. I've recently changed the defaults for if_bridge to ensure that, because there are too many gotchas with playing layering games to firewall on the bridge.)