
Fix rib generation count for fib algo.
ClosedPublic

Authored by melifaro on Apr 17 2021, 6:58 PM.

Details

Summary

Currently, the PCB caching mechanism relies on the rib generation counter (rnh_gen) to invalidate cached nhops/LLE entries.

With certain fib algorithms, it is now possible that the datapath lookup state applies RIB changes with some delay.
In that scenario, the PCB cache will be invalidated on the RIB change, but the new lookup may still return the same (old) nexthop, because the datapath has not applied the change yet.
When fib algo finally gets in sync with the RIB changes, the PCB cache will not receive any notification and will end up caching stale data.

To fix this, introduce an additional counter, rnh_gen_rib, which is used only when FIB_ALGO is enabled.
This counter is incremented by the control plane. Each time fib algo synchronises with the RIB, it updates rnh_gen to the current rnh_gen_rib value.
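
As a rough illustration of the two-counter scheme described above, here is a minimal user-space sketch. Only the counter names rnh_gen and rnh_gen_rib come from this revision; struct rib_model, struct pcb_cache_model and the helper functions are hypothetical simplifications, not the kernel's actual structures or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified model (assumption): NOT the kernel's struct rib_head. */
struct rib_model {
	uint32_t rnh_gen;     /* generation the datapath/PCB caches compare against */
	uint32_t rnh_gen_rib; /* generation of the control-plane RIB */
};

/* Hypothetical PCB-style cache entry remembering the generation it was filled at. */
struct pcb_cache_model {
	uint32_t cached_gen;
	int      cached_nhop;
};

/* Control plane changes the RIB: only the RIB-side counter moves. */
static void
rib_change(struct rib_model *rh)
{
	rh->rnh_gen_rib++;
}

/* fib algo has caught up with the RIB: publish the generation to the datapath. */
static void
fib_algo_sync(struct rib_model *rh)
{
	rh->rnh_gen = rh->rnh_gen_rib;
}

/* A cached entry is usable only while the datapath generation is unchanged. */
static bool
cache_is_valid(const struct rib_model *rh, const struct pcb_cache_model *c)
{
	return (c->cached_gen == rh->rnh_gen);
}
```

In this model a RIB change alone does not invalidate the cache; invalidation happens at the moment the lookup state actually reflects the change, which is the point where a fresh lookup can return a different nexthop.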

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

melifaro added a reviewer: network.
donner added inline comments.
sys/net/route/fib_algo.c
821

Why is it before "fib_unref_nhop", while it is after that some lines earlier?

sys/net/route/fib_algo.c
821

It has no connection to the mechanics of the fib_unref_nhop(), so the ordering doesn't matter.
Though, I'll update the one in the apply_rtable_changes() to make it consistent.

donner added inline comments.
sys/net/route/fib_algo.c
931

As I understand the logic, the rib-table version is incremented each time the rib is modified. This call is made after dumping the table, without modifying it. If this is correct, the increment is not necessary. Otherwise LGTM.

This revision is now accepted and ready to land. Apr 20 2021, 11:55 AM
melifaro edited the summary of this revision. (Show Details)

Reflect donner@ comments.

This revision now requires review to proceed. Apr 20 2021, 10:12 PM
sys/net/route/fib_algo.c
931

Yep, you're right w.r.t. the table dump. The nuance here is that the framework can re-instantiate the algorithm instance at any point in time.
For example, the following can happen:

  • inet.0 rib backed by lradix4 receives a route update. As lradix4 is immutable, the framework schedules an algorithm rebuild (i.e. a build of a new instance).

Until the rebuild is executed and complete, the datapath will run with an older version of the rib snapshot.
So, when the rebuild is complete, the framework needs to update the datapath generation to reflect the fact that it's now synced to the latest rib.
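
The rebuild case can be sketched roughly as follows. This is a simplified user-space model under stated assumptions: struct fib_model, struct algo_instance_model and both helpers are hypothetical names, and the sketch assumes no further route updates arrive while the rebuild runs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical immutable lookup instance: a snapshot of the rib at build time. */
struct algo_instance_model {
	uint32_t built_from_gen;
};

/* Simplified model (assumption): not the real framework state. */
struct fib_model {
	uint32_t rnh_gen;     /* datapath generation */
	uint32_t rnh_gen_rib; /* control-plane rib generation */
	const struct algo_instance_model *active; /* instance used by the datapath */
};

/* A route update only bumps the rib-side counter; the old snapshot stays active. */
static void
route_update(struct fib_model *f)
{
	f->rnh_gen_rib++;
}

/*
 * Rebuild: dump the rib into a fresh instance, switch the datapath to it,
 * and only then publish the generation. The dump itself does not modify
 * the rib, but the datapath has just caught up with it.
 */
static void
rebuild_instance(struct fib_model *f, struct algo_instance_model *fresh)
{
	fresh->built_from_gen = f->rnh_gen_rib;
	f->active = fresh;
	f->rnh_gen = f->rnh_gen_rib;
}
```

Between route_update() and rebuild_instance() the model keeps serving lookups from the old snapshot with the old rnh_gen, which is why the generation update belongs at the end of the rebuild rather than at the time of the route change.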

This revision was not accepted when it landed; it landed in state Needs Review. Apr 20 2021, 10:42 PM
This revision was automatically updated to reflect the committed changes.