Diff Detail

- Repository: rS FreeBSD src repository (subversion)
- Lint: Lint Passed
- Unit: No Test Coverage
- Build Status: Buildable 38747, Build 35636 (arc lint + arc unit)
Event Timeline
LGTM!
Please see some minor comments in the diff.

sys/modules/Makefile

Line | Comment
---|---
112 | Should be built only when FIB_ALGO is enabled.

sys/netinet/dxr.c

Line | Comment
---|---
63 | This header is internal to the routing subsystem and should not be included.
333 | I'll provide a wrapper (publish rib_lookup_bysa()).
336 | You shouldn't access rt fields directly; please consider using rt_get_inet_prefix_plen() (see the sketch below this table).
383 | Nit: maybe bool?
424 | Generic question on the lack of M_ZERO in the memory allocations: if the concern here is performance, would it be possible to quantify the performance loss?
642 | Please don't use rt fields directly; consider switching to rt_get_inet_prefix_plen() or rt_get_inet_prefix_pmask().
1143 | Nit: maybe if (new) update_delta++; if (old) update_delta--; ?
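
For readers who haven't seen the accessor suggested in the comment on line 336, here is a minimal, hypothetical sketch of how a lookup module can obtain an IPv4 prefix and its length without reaching into struct rtentry fields. The prototype of rt_get_inet_prefix_plen() is assumed to live in net/route/route_ctl.h; the helper name and surrounding code are invented for illustration and are not part of this diff.

```c
#include <sys/param.h>
#include <sys/socket.h>
#include <net/route.h>
#include <net/route/route_ctl.h>	/* assumed home of rt_get_inet_prefix_plen() */
#include <netinet/in.h>

/*
 * Hypothetical helper: build a host-order (address, prefix length) key for
 * a route without touching rtentry internals.  Assumed accessor prototype:
 *   void rt_get_inet_prefix_plen(const struct rtentry *rt,
 *       struct in_addr *paddr, int *plen, uint32_t *pscopeid);
 */
static void
example_prefix_key(const struct rtentry *rt, uint32_t *dst, int *plen)
{
	struct in_addr addr;
	uint32_t scopeid;

	rt_get_inet_prefix_plen(rt, &addr, plen, &scopeid);
	*dst = ntohl(addr.s_addr);	/* host byte order, convenient for DXR-style keys */
}
```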

sys/netinet/dxr.c

Line | Comment
---|---
63 | Not possible (yet), because we need to resolve rnh_walktree_from(), and to a lesser extent rnh_matchaddr(), inside struct rib_head, which is defined in net/route/route_var.h.
383 | That change would have to be propagated elsewhere, so I'm not explicitly against it, but maybe later :)
424 | In principle I'm against M_ZERO, since it acts as a cache flush with huge structures, in addition to wasting cycles when it's not really needed. The main struct dxr_aux (malloc()ed at line 809), as well as range_tbl and x_tbl which hang off it and which may occasionally be realloc()ed, are repeatedly recycled at each FIB update, so having them bzeroed at allocation time buys us nothing in terms of preventing potential bugs after the first update. The lookup structures (malloc()ed at line 945) are exact-fit, tailored to the currently required size, and are completely filled bottom to top via several memcpy()s, so there is no reason to M_ZERO those either. Range chunk and trie trackers, which are stored in separate UMA zone pools, may be recycled many times, so M_ZERO at uma_zalloc() time buys us nothing there as well. Plus, the code has been thoroughly stress-tested in a private userspace app over several years, including more than a year of continuous uninterrupted uptime in a production environment. (The allocation pattern is sketched below this table.)
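
To make the trade-off discussed in the 424 thread concrete, here is a minimal malloc(9) sketch (not the actual dxr.c code; the struct, helper, and malloc-type names are invented). A large, repeatedly recycled buffer is zeroed only on its first allocation so that embedded pointers start out NULL; later rebuilds overwrite every field themselves, so passing M_ZERO on every allocation would only add a cache-thrashing memset over a potentially large region.

```c
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>

/* Dedicated malloc type: any leak would show up per-type in vmstat -m. */
static MALLOC_DEFINE(M_EXAMPLE_AUX, "example_aux", "recycled auxiliary build state");

struct example_aux {
	uint32_t	*range_tbl;	/* regrown via realloc() as the FIB grows */
	size_t		 range_sz;
	/* ... many more fields, all rewritten on every rebuild ... */
};

/*
 * Return the cached auxiliary structure, allocating it on first use.
 * M_ZERO only on that first allocation; subsequent rebuilds reuse the
 * buffer as-is because every field is reinitialized by the rebuild.
 */
static struct example_aux *
example_aux_get(struct example_aux **cached)
{
	if (*cached == NULL)
		*cached = malloc(sizeof(struct example_aux), M_EXAMPLE_AUX,
		    M_NOWAIT | M_ZERO);
	return (*cached);
}
```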

- kldload dxr
- sysctl net.route.algo should list dxr
- sysctl net.route.algo.inet.algo=dxr should forcibly select it

Also, there's quite a lot of debug output if you raise the net.route debug level to 5 or 6, showing the actual events.
LGTM, please see some comments inline :-)
Also: do you have any preference/requirements on the commit message?

sys/netinet/in_fib_dxr.c (On Diff #88357)

Line | Comment
---|---
1026 | The framework provides a guarantee that the old instance won't get deleted until this instance's setup is finished (successfully or not).
1248 | So now it looks like we've solved all the easy problems and the hardest one in CS remains: naming :-)

sys/netinet/in_fib_dxr.c (On Diff #88357)

Line | Comment
---|---
62–63 | Nuked route_var.h, but can't get rid of fib_algo.h.
1026 | The refcount is for an auxiliary structure which may get shared by several main dxr instances (generations), so it must stay. Once the refcount on the aux struct is bumped, dxr_init() can't fail. If the dump later fails, the framework will call the dxr_destroy() destructor, which will DTRT WRT dropping the refcount and freeing the aux structure if required. If there were a scenario where the aux structure could leak, it would be easy to track, given that it is tagged as M_DXRAUX. (The pattern is sketched below this table.)
1248 | I've settled on "fib_dxr" as the module name, while retaining simply "dxr" as the label for net.route.algo.inet.algo.
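
A minimal sketch (with invented names) of the refcounting scheme described in the 1026 reply above: one auxiliary structure shared by several algorithm instances (generations), released only when the last instance drops its reference, using the stock refcount(9) primitives. This illustrates the pattern, not the code in this diff.

```c
#include <sys/param.h>
#include <sys/kernel.h>
#include <sys/malloc.h>
#include <sys/refcount.h>

static MALLOC_DEFINE(M_EXAMPLE_REF, "example_ref", "shared auxiliary state");

struct example_shared_aux {
	volatile u_int	 refcnt;	/* refcount_init(&refcnt, 1) at creation */
	/* ... precomputed state reused across instances ... */
};

/* A new instance (generation) that reuses the aux takes a reference. */
static void
example_aux_ref(struct example_shared_aux *aux)
{
	refcount_acquire(&aux->refcnt);
}

/* Each instance's destructor drops its reference; the last one frees. */
static void
example_aux_unref(struct example_shared_aux *aux)
{
	if (refcount_release(&aux->refcnt))
		free(aux, M_EXAMPLE_REF);
}
```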

sys/netinet/in_fib_dxr.c (On Diff #88357)

Line | Comment
---|---
62–63 | fib_algo.h is actually fine. It was moved below in the follow-up comment.

It's great to see another performant route search algo in FreeBSD's reworked routing stack, especially since an MFC is planned after one week. I have to admit that the name "dxr_lpm4" would probably be more recognizable, since at a glance fib_dxr looks to me like an addition or supplement to dpdk_lpm4 and the other routing algos (bsearch4, radix4, radix4_lockless), but from the above review I assume that it's another independent routing algo module and should be considered a full replacement, not a supplement.
As a potential tester on the stable/13 branch, I have a question regarding FIB_ALGO auto-selection: as far as I understand, it's neither enabled nor planned yet? I also want to ask whether any fib_dxr6 for IPv6 is planned?
...
> I assume that it's another independent routing algo module and should be considered a full replacement, not a supplement.
Right, DXR has no dependencies on any of the other algos.
> As a potential tester on the stable/13 branch, I have a question regarding FIB_ALGO auto-selection: as far as I understand, it's neither enabled nor planned yet?
If you kldload fib_dxr and start adding routes, at some point the FIB_ALGO framework will automagically switch to dxr. The automatic switchover can always be overridden manually.
> I also want to ask whether any fib_dxr6 for IPv6 is planned?
DXR is tailored to IPv4's densely populated address space. IPv6 address space utilization is less uniform, so DXR with IPv6 is less likely to yield stellar results. To be more specific, I don't have an IPv6 version of DXR in the works.
I have cherry-picked "2aca58e16f50 - main - Introduce DXR as an IPv4 longest prefix matching / FIB module" together with "aad59c79f5f2 - main - Fix panic when trying to delete non-existent gateway in multipath route" to stable/13 and tested for a few hours under light load without problems. Everything was going fine until I tried to switch manually to dpdk_lpm4, which immediately triggered a panic. Unfortunately I have no core dump of this panic. After a reboot I tried to reproduce it, but it doesn't seem to be easily reproducible. DXR has been working fine; the function dxr_build with caller dxr_change_rib_batch still seems to be active, but is not overloading this 8-core Atom.
If you can reproduce the problem, please also try to provoke it with sysctl net.route.multipath=0.
It doesn't seem to be easily reproducible. I have been testing it on stable/13 for more than a month, and so far I have had only two panics while switching from dxr to dpdk_lpm4, though neither one was properly recorded (no core dumps). Those panics are likely more related to algo switching than to any specific algo, but for some time now I have had a USB key plugged in and configured as a dump device.
The overall performance of the DXR algo seems to be decent, but the MFC to stable/13 might help bring it to a wider audience and more intensive testing.