
Properly stop timer before freeing link level entries for IPv4 and IPv6
ClosedPublic

Authored by hselasky on Dec 17 2015, 7:37 PM.

Details

Summary

Running the following commands leaks link level entries.
In particular, the leak causes a panic when a USB Ethernet device is ejected.

DEF_ROUTE=x.x.x.x # replace with IP of your default route
vmstat -m | grep llt
arp -d $DEF_ROUTE
ping $DEF_ROUTE
vmstat -m | grep llt
arp -d $DEF_ROUTE
ping $DEF_ROUTE

Fix the problem by adding the missing callout_stop() calls.
Strictly speaking, a callout_drain() call is required, but this patch is better than nothing.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

hselasky updated this revision to Diff 11394. Dec 17 2015, 7:37 PM
hselasky retitled this revision to Fix races managing link level tables related to use of callouts.
hselasky updated this object.
hselasky edited the test plan for this revision.
hselasky added reviewers: rrs, glebius, hiren, sbruno, adrian, jhb, bz.
hselasky set the repository for this revision to rS FreeBSD src repository.
sbruno added a reviewer: ae. Dec 18 2015, 9:32 PM
sbruno removed a reviewer: sbruno.
sbruno added a subscriber: sbruno.
hselasky updated this revision to Diff 11436. Dec 19 2015, 9:59 AM

Need to check for LLE_LINKED on entry to the callout function. It seems some code path is not stopping the callouts!
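
The guard described in this update can be illustrated with a small userspace model. This is only a sketch and not the kernel code from the patch: struct model_lle, MODEL_LINKED and model_timer_handler() are hypothetical names, and dropping the timer's reference when the entry is found unlinked is an assumption about what the handler would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_LINKED 0x1	/* hypothetical flag standing in for LLE_LINKED */

/* Hypothetical stand-in for a link level entry; not the kernel structure. */
struct model_lle {
	unsigned flags;		/* MODEL_LINKED while the entry is in the table */
	int refcnt;		/* references held on the entry */
	int work_done;		/* how many times the handler did real work */
};

/*
 * Model of a timer handler that checks the linked flag on entry.
 * Returns true if it processed the entry, false if it bailed out.
 */
static bool
model_timer_handler(struct model_lle *lle)
{
	assert(lle != NULL);

	if ((lle->flags & MODEL_LINKED) == 0) {
		/*
		 * The entry was unlinked while the timer was still armed.
		 * Assumption: release the timer's reference and do nothing else.
		 */
		lle->refcnt--;
		return (false);
	}

	/* Entry is still linked: normal expiry processing would go here. */
	lle->work_done++;
	return (true);
}
```

In this model a handler that fires on an unlinked entry never touches the entry beyond releasing its reference, which is the behaviour the check is meant to guarantee.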

hselasky added a comment. Edited Dec 19 2015, 10:09 AM

I'm currently investigating why arptimer is called when LLE_LINKED is not set, i.e. which code path is not clearing the timer.

Here goes:

lltable_delete_addr() will unlink the entry and try to delete it, but doesn't stop the callout.

For example, to reproduce using the default route x.x.x.x:

vmstat -m | grep llt
arp -d x.x.x.x
ping x.x.x.x
vmstat -m | grep llt
arp -d x.x.x.x
ping x.x.x.x

I observe that the number of llt allocations doesn't drop after the deletion. This means the timer still holds one reference to the unlinked entry, which appears to be what causes the arptimer panic.
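
The reference accounting behind this observation can be sketched in a few lines of userspace C. This is a simplified model under stated assumptions, not the kernel implementation: struct model_entry and all model_*() functions are hypothetical names, and model_timer_stop() only imitates the "returns true if a pending timer was cancelled" aspect of callout_stop().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a link level entry; not the kernel structure. */
struct model_entry {
	int refcnt;		/* references held on the entry */
	bool linked;		/* models the LLE_LINKED flag */
	bool timer_pending;	/* models a scheduled callout */
	bool freed;		/* set when the last reference is dropped */
};

static void
model_entry_init(struct model_entry *e)
{
	e->refcnt = 1;		/* reference owned by the table */
	e->linked = true;
	e->timer_pending = false;
	e->freed = false;
}

/* Drop one reference; the entry counts as freed when none remain. */
static void
model_remref(struct model_entry *e)
{
	assert(e != NULL && e->refcnt > 0);
	if (--e->refcnt == 0)
		e->freed = true;
}

/* Arm the timer; a pending timer holds its own reference. */
static void
model_timer_arm(struct model_entry *e)
{
	if (!e->timer_pending) {
		e->timer_pending = true;
		e->refcnt++;
	}
}

/* Imitates callout_stop(): true only if a pending timer was cancelled. */
static bool
model_timer_stop(struct model_entry *e)
{
	if (!e->timer_pending)
		return (false);
	e->timer_pending = false;
	return (true);
}

/* Delete without stopping the timer: the timer's reference stays behind. */
static void
model_delete_leaky(struct model_entry *e)
{
	e->linked = false;
	model_remref(e);	/* table's reference */
}

/* Delete after stopping the timer: both references are released. */
static void
model_delete_fixed(struct model_entry *e)
{
	e->linked = false;
	if (model_timer_stop(e))
		model_remref(e);	/* reference the timer was holding */
	model_remref(e);		/* table's reference */
}
```

With an armed timer, model_delete_leaky() leaves the entry unlinked but still allocated with one reference and a pending timer, which corresponds to the allocation count that never drops. model_delete_fixed() brings the count to zero.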

hselasky updated this revision to Diff 11437. Dec 19 2015, 11:20 AM

Need to stop timer when deleting arp table entries.

Ping - any reviewers active on this one?

melifaro edited edge metadata. Jan 7 2016, 9:29 AM

Ping - any reviewers active on this one?

Sorry, will take a look today.

hselasky updated this revision to Diff 18333. Jul 12 2016, 2:49 PM
hselasky edited edge metadata.

Update patch to match 12-current.

jch added a subscriber: jch. Aug 25 2016, 6:51 AM

Ping - this is still an issue in 12-current when yanking USB network devices.

hselasky updated this revision to Diff 37969. Jan 15 2018, 2:30 PM
hselasky retitled this revision from Fix races managing link level tables related to use of callouts to Properly stop timer before freeing link level entries for IPv4 and IPv6.
hselasky edited the summary of this revision.
hselasky edited the test plan for this revision.
hselasky added a reviewer: gallatin.

Update patch for 12-current

loos accepted this revision. Jan 23 2018, 8:26 PM
loos added a subscriber: loos.

We have been running a similar fix in pfSense for quite some time now.

This revision is now accepted and ready to land. Jan 23 2018, 8:26 PM
This revision was automatically updated to reflect the committed changes.