Paths

Table of Contentst

ng_l2tp: Fix callout synchronization in the RACK timeout handler
ClosedPublic
Actions

Authored by markj on Sep 24 2020, 6:40 PM.

Details

Reviewers

glebius
bz
donner

Commits

rS366167: ng_l2tp: Fix callout synchronization in the rexmit timeout handler

Summary

The synchronization between ng_l2tp_seq_recv_nr() and
ng_l2tp_seq_rack_timeout() is buggy. ng_l2tp_seq_recv_nr()
releases ack'ed packets from the transmit queue, stops the retransmit
callout, checks whether the queue is empty, and arms the callout again
if not. All of this is done with the node mutex held.

ng_l2tp_seq_rack_timeout() checks to see if it was cancelled by
ng_l2tp_seq_recv_nr(), and *then* acquires the node mutex and
retransmits outstanding packets. So there is a race:

CPU 1:

clears send queue

CPU 2:

schedules callout
callout blocks on node mutex

CPU 1:

stops callout
finds that the send queue is empty

CPU 2:

tries to transmit the first packet from the send queue
panic()

Fix it by modifying the callout handler to check whether it has been
cancelled after aquiring the node lock. We could perhaps further
"harden" the code by making the handler check for an empty queue, but I
believe it should not be necessary.

Test Plan

None, this is based on some bug reports forwarded to me.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision.Sep 24 2020, 6:40 PM

Herald added subscribers: melifaro, donner. · View Herald TranscriptSep 24 2020, 6:40 PM

markj requested review of this revision.Sep 24 2020, 6:40 PM

Harbormaster completed remote builds in B33803: Diff 77495.Sep 24 2020, 6:40 PM

markj edited the test plan for this revision. (Show Details)Sep 24 2020, 6:41 PM

markj added a reviewer: glebius.

bz requested changes to this revision.Sep 24 2020, 9:59 PM

bz added a subscriber: bz.

bz added inline comments.

sys/netgraph/ng_l2tp.c
1463 ↗	(On Diff #77495)	Don't you leak the mutex in that early return case?

This revision now requires changes to proceed.Sep 24 2020, 9:59 PM

markj added inline comments.Sep 24 2020, 10:28 PM

sys/netgraph/ng_l2tp.c
1463 ↗	(On Diff #77495)	D'oh, yes. Will fix.

afedorov added a subscriber: afedorov.Sep 25 2020, 1:40 AM

Can you please point me to the bug report?
We are running this node in a large scale production environment (with remote LAC at different carriers) and did not observe such an issue before.

Is this PR: 241133 ?

Fix lock leak.

Harbormaster completed remote builds in B33813: Diff 77521.Sep 25 2020, 1:10 PM

In D26548#591305, @lutz_donnerhacke.de wrote:

Can you please point me to the bug report?

The PR that Aleksandr noted, https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241133

A pfsense developer also reported seeing a null pointer dereference in ng_l2tp_seq_rack_timeout(), apparently because seq->xwin[0] == 0. Apparently the panic was observed even after this commit: https://svnweb.freebsd.org/base?view=revision&revision=353027 .

We are running this node in a large scale production environment (with remote LAC at different carriers) and did not observe such an issue before.

How common are retransmits in this environment?

Is the bug reproducible in a test environment?
Did someone try to compile with INVARIANTS (in order to bring L2TP_SEQ_CHECK into life)?

In D26548#591401, @lutz_donnerhacke.de wrote:

Is the bug reproducible in a test environment?

No, it is apparently triggered by mpd5 running in a production environment but only rarely, once every few months.

Did someone try to compile with INVARIANTS (in order to bring L2TP_SEQ_CHECK into life)?

I don't believe so. Indeed, L2TP_SEQ_CHECK should be able to detect the problem since it effectively asserts that seq->xwin[0] != NULL if and only if the retransmit timer is active.

In D26548#591373, @markj wrote:

In D26548#591305, @lutz_donnerhacke.de wrote:

Can you please point me to the bug report?

The PR that Aleksandr noted, https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241133

A pfsense developer also reported seeing a null pointer dereference in ng_l2tp_seq_rack_timeout(), apparently because seq->xwin[0] == 0. Apparently the panic was observed even after this commit: https://svnweb.freebsd.org/base?view=revision&revision=353027 .

Thank you for the info.

We are running this node in a large scale production environment (with remote LAC at different carriers) and did not observe such an issue before.

How common are retransmits in this environment?

We had no problems with the L2TP tunnel since PR146082 was solved. There we run into long time stability problems with L2TP sequence numbers.
Yes we have retransmits, but I did not set up an special monitoring for this, because we use sequence numbers in L2TP only for the control channel, but not for the data channel.
Typically we see a permanent difference in sequence numbers of more than one (always sending data). May be the heavy traffic hides the problem, because we usually do not run into any timeout.
Of course, we have loss of connectivity between LAC and LNS, but never had a panic with a similar stack trace.

In D26548#591438, @lutz_donnerhacke.de wrote:

In D26548#591373, @markj wrote:

In D26548#591305, @lutz_donnerhacke.de wrote:

Can you please point me to the bug report?

The PR that Aleksandr noted, https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241133

A pfsense developer also reported seeing a null pointer dereference in ng_l2tp_seq_rack_timeout(), apparently because seq->xwin[0] == 0. Apparently the panic was observed even after this commit: https://svnweb.freebsd.org/base?view=revision&revision=353027 .

Thank you for the info.

We are running this node in a large scale production environment (with remote LAC at different carriers) and did not observe such an issue before.

How common are retransmits in this environment?

We had no problems with the L2TP tunnel since PR146082 was solved. There we run into long time stability problems with L2TP sequence numbers.
Yes we have retransmits, but I did not set up an special monitoring for this, because we use sequence numbers in L2TP only for the control channel, but not for the data channel.
Typically we see a permanent difference in sequence numbers of more than one (always sending data). May be the heavy traffic hides the problem, because we usually do not run into any timeout.
Of course, we have loss of connectivity between LAC and LNS, but never had a panic with a similar stack trace.

If the pending transmit queue is usually non-empty, then the problem should indeed be hidden. The race looks pretty hard to hit, there are several ingredients:

some pending ACKs to arm the 1s timeout
ACKs for pending transmits are received at the same time as the timeout fires
received ACKs clear the transmit queue

So if there is a lot of control traffic I expect the panic to be rare. Note that r353027 fixed a more severe problem with the timeout handler.

My problem is, that I do not understand, how this race condition can happen at all.
Sure, the fix seems obvious and does not harm, so let's go with it.

OTOH almost all handling of callouts (at least in ng_l2tp) is prone to the same error, if the reasoning is true.

In D26548#591451, @lutz_donnerhacke.de wrote:

My problem is, that I do not understand, how this race condition can happen at all.

You mean, there is some high-level reason the race should not occur? Or, are the details of the race not clear? I can try to explain my reasoning further.

Sure, the fix seems obvious and does not harm, so let's go with it.

OTOH almost all handling of callouts (at least in ng_l2tp) is prone to the same error, if the reasoning is true.

Indeed, there seems to be a race in ng_l2tp_seq_xack_timeout() as well, I started looking at it. It'll be a separate diff though.

In D26548#591453, @markj wrote:

Indeed, there seems to be a race in ng_l2tp_seq_xack_timeout() as well.

This looks like it could cause transmission of a ZLB with stale NR or NS, again in rare circumstances. I don't think it will cause a kernel panic.

bz accepted this revision.Sep 25 2020, 6:25 PM

This revision is now accepted and ready to land.Sep 25 2020, 6:25 PM

glebius accepted this revision.Sep 25 2020, 6:34 PM

In D26548#591459, @markj wrote:

In D26548#591453, @markj wrote:

Indeed, there seems to be a race in ng_l2tp_seq_xack_timeout() as well.

This looks like it could cause transmission of a ZLB with stale NR or NS, again in rare circumstances. I don't think it will cause a kernel panic.

There is not problem with an additional packet, so let this unchanged,
I've to think about the issue in the context of the netgraph framework itself.
I clearly do not understand the interaction of the components within the framework (lesson learned)

Closed by commit rS366167: ng_l2tp: Fix callout synchronization in the rexmit timeout handler (authored by markj). · Explain WhySep 25 2020, 6:56 PM

This revision was automatically updated to reflect the committed changes.

markj added a commit: rS366167: ng_l2tp: Fix callout synchronization in the rexmit timeout handler.

Herald added a subscriber: imp. · View Herald TranscriptSep 25 2020, 6:56 PM