[FIB algo] Add support for the batched updates.
Closed, Public

Authored by melifaro on Apr 5 2021, 4:41 PM.

Details

Summary

The initial FIB algo implementation was built on a very simple set of principles with respect to updates:

  • an algorithm is either able to apply the change synchronously (DXR) or requires a full rebuild (bsearch, lradix).
  • fall back to a rebuild on every error (memory allocation, nhg limit, other internal algo errors, etc).

This change introduces a new "intermediate" concept: batched updates.
An algorithm can indicate that a particular update has to be handled in a batched fashion (FLM_BATCH).
The framework will then write this update and subsequent updates to a temporary buffer instead of pushing them to the algo callback.
Depending on the update rate, the framework will batch 50..1024 ms of updates and submit them to a different algo callback.
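
The flow described above can be sketched roughly as follows. This is a simplified userland illustration, not the code from sys/net/route/fib_algo.c: every `sketch_` / `SK_` name, the fixed-size buffer, and the demo callbacks are assumptions made for the example; only the idea of an algorithm answering "batch this" (FLM_BATCH in the summary) comes from the revision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; SK_BATCH stands in for FLM_BATCH. */
enum sketch_result { SK_SUCCESS, SK_REBUILD, SK_BATCH };

struct sketch_update { int prefix_id; int nhop_id; };

#define SK_QUEUE_MAX 8

/* Hypothetical framework state: temporary buffer plus a rebuild flag. */
struct sketch_fw {
	struct sketch_update queue[SK_QUEUE_MAX];
	size_t queued;
	int rebuild_needed;
};

typedef enum sketch_result (*sketch_change_cb)(const struct sketch_update *);
typedef enum sketch_result (*sketch_batch_cb)(const struct sketch_update *,
    size_t);

/* Per-update path: call the algo, or buffer once a batch is open. */
void
sketch_handle_change(struct sketch_fw *fw, sketch_change_cb cb,
    const struct sketch_update *u)
{
	enum sketch_result res;

	assert(fw != NULL && cb != NULL && u != NULL);
	if (fw->rebuild_needed)
		return;		/* a full rebuild is already pending */
	/* While a batch is open, bypass the per-update callback. */
	res = (fw->queued > 0) ? SK_BATCH : cb(u);
	if (res == SK_SUCCESS)
		return;
	if (res == SK_REBUILD || fw->queued == SK_QUEUE_MAX) {
		/* Fall back to a rebuild, dropping any buffered updates. */
		fw->queued = 0;
		fw->rebuild_needed = 1;
		return;
	}
	fw->queue[fw->queued++] = *u;
}

/* Flush path: hand the whole buffer to the separate batch callback. */
size_t
sketch_flush(struct sketch_fw *fw, sketch_batch_cb cb)
{
	size_t n = fw->queued;

	if (n == 0)
		return (0);
	if (cb(fw->queue, n) != SK_SUCCESS)
		fw->rebuild_needed = 1;
	fw->queued = 0;
	return (n);
}

/* Demo algo: applies "small" changes in place, asks to batch the rest. */
size_t sketch_last_batch;

enum sketch_result
sketch_demo_change(const struct sketch_update *u)
{
	return ((u->nhop_id < 100) ? SK_SUCCESS : SK_BATCH);
}

enum sketch_result
sketch_demo_batch(const struct sketch_update *q, size_t n)
{
	(void)q;
	sketch_last_batch = n;
	return (SK_SUCCESS);
}
```

In this sketch the first update that returns the batch code opens the buffer, and later updates are appended without consulting the per-update callback, so the order of changes is preserved until the flush.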

This functionality is handy for slow-to-rebuild algorithms like DXR.

While here, update the batch/rebuild policy. Instead of rebuilding after 50 ms or 1k route updates (whichever comes first), do the following:

  • bucket the number of updates in 50 ms buckets
  • if the number of updates exceeds the threshold rate (1k routes/sec), delay the update
  • repeat until the maximum update delay is reached (1 sec)
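
One way to read the policy above is sketched below. It is an illustration under stated assumptions rather than the committed logic: the per-bucket threshold of 50 updates is derived here from 1k routes/sec over a 50 ms bucket, the 1000 ms cap follows the "1sec" in the list, and the function name and its array-of-counts interface are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SK_BUCKET_MS		50	/* bucket width from the summary */
#define SK_MAX_DELAY_MS		1000	/* maximum update delay (1 sec) */
#define SK_RATE_PER_SEC		1000	/* threshold rate, routes/sec */
/* Assumed per-bucket threshold: 1000 * 50 / 1000 = 50 updates. */
#define SK_BUCKET_THRESHOLD	(SK_RATE_PER_SEC * SK_BUCKET_MS / 1000)

/*
 * bucket_counts[i] is the number of updates seen in the i-th 50 ms bucket
 * after the first pending update; buckets past nbuckets count as quiet.
 * Returns how long (in ms) the pending updates would be held back.
 */
unsigned
sketch_sync_delay_ms(const unsigned *bucket_counts, size_t nbuckets)
{
	unsigned delay = 0;
	size_t i;

	assert(bucket_counts != NULL || nbuckets == 0);
	for (i = 0; ; i++) {
		unsigned count = (i < nbuckets) ? bucket_counts[i] : 0;

		delay += SK_BUCKET_MS;
		/* Quiet bucket: the burst is over, stop delaying. */
		if (count <= SK_BUCKET_THRESHOLD)
			return (delay);
		/* Busy bucket, but the cap has been reached. */
		if (delay >= SK_MAX_DELAY_MS)
			return (SK_MAX_DELAY_MS);
	}
}
```

Under these assumptions a single small change is applied after one 50 ms bucket, while a sustained burst keeps postponing the update in 50 ms steps until the 1 sec cap.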

Diff Detail

Repository: rS FreeBSD src repository - subversion
Lint: Lint Passed
Unit: No Test Coverage
Build Status: Buildable 38473 (Build 35362: arc lint + arc unit)

Event Timeline

melifaro retitled this revision from "Add support for the batched update. Update delay policy to support both small and large-scale changes." to "[FIB algo] Add support for the batched updates." Apr 5 2021, 4:59 PM
melifaro edited the summary of this revision.
melifaro added reviewers: network, zec.
sys/net/route/fib_algo.c

Line 674: This one doesn't get called unless immediate_sync is set in handle_rtable_change_cb(), which means it is never called.

Line 682: Shouldn't we unref .nh_old here instead, if != NULL?

melifaro retitled this revision from [FIB algo] Add support for the batched updates. to [FIB algo] Add support for the batched updates..
melifaro edited the summary of this revision.

Update to reflect the committed parts, address comments.

Fix case with small number of updates.

Works much better now, thanks!

Once in a while I hit messages like this one:

[fib_algo] inet.0 (dxr#21) schedule_fd_rebuild: Scheduling rebuild: batch queue failed (failures=0)

after which a panic reliably follows soon afterwards, in an unrelated, seemingly random piece of code. Will investigate further and report back...

Do not update size of route change queue on failure.
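
The note above is terse, so the following is only one plausible reading of the bug class, not the actual patch: if a growable queue records its new size before (or regardless of) a failed reallocation, later appends write past the real buffer, which would match panics in unrelated code. The queue layout, the growth factor, and the injectable allocator are all hypothetical names introduced for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_rc { int prefix_id; };

struct sketch_queue {
	struct sketch_rc *items;
	size_t count;		/* entries in use */
	size_t size;		/* entries allocated */
};

/* Hypothetical allocator hook so that a failure can be simulated. */
typedef void *(*sketch_realloc_fn)(void *, size_t);

/* Returns 0 on success, -1 if the queue could not be grown. */
int
sketch_queue_append(struct sketch_queue *q, struct sketch_rc rc,
    sketch_realloc_fn ra)
{
	assert(q != NULL && ra != NULL);
	if (q->count == q->size) {
		size_t new_size = (q->size == 0) ? 4 : q->size * 2;
		struct sketch_rc *p = ra(q->items, new_size * sizeof(*p));

		if (p == NULL)
			return (-1);	/* size is NOT updated on failure */
		/* Commit the new size only after the allocation succeeded. */
		q->items = p;
		q->size = new_size;
	}
	q->items[q->count++] = rc;
	return (0);
}

/* Allocator stub that always fails, for demonstration. */
void *
sketch_failing_realloc(void *p, size_t n)
{
	(void)p;
	(void)n;
	return (NULL);
}
```

On failure the caller can fall back to a rebuild, and the queue still describes exactly the memory it owns.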

This fixes the panics I observed yesterday, the change looks fine to me now.

This revision is now accepted and ready to land. Apr 15 2021, 9:29 AM