Details

Reviewers

melifaro
glebius
markj

Group Reviewers

network

Commits

rG8ed5e67e2165: routing: Include <sys/eventhandler.h> in nhop_ctl.h
rGd05d1f256082: routing: Subscribe nhops to ifnet link events

Summary

Update nexthop flags on interface link status events.
Instead of checking the interface's link status for every packet,
check only the reachability flag of the final nexthop.

Q: Why not listen in the rib instead of in nhops?
A: There are far fewer nhops than routes.

Updating nhops directly is much faster and more efficient.
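
The idea in the summary can be sketched in userland C. This is a toy model, not the code in the diff: `toy_ifnet`, `toy_nhop`, `toy_route`, `TOY_NHF_UNREACHABLE`, `toy_link_event()` and `toy_route_usable()` are all hypothetical names. It only illustrates that a link event walks the (few) nexthops, while the per-packet path reads a single flag and never looks at the interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit; not a real NHF_* value from the kernel. */
#define TOY_NHF_UNREACHABLE 0x1u

struct toy_ifnet {
	const char *name;
};

struct toy_nhop {
	struct toy_ifnet *ifp;	/* interface this nexthop transmits on */
	unsigned int flags;
};

struct toy_route {
	unsigned int prefix;	/* stand-in for a real prefix */
	struct toy_nhop *nh;	/* many routes share one nexthop */
};

/*
 * Link event callback: walk the nexthops only, never the routes.
 * Returns the number of nexthops whose flag was updated.
 */
size_t
toy_link_event(struct toy_nhop *nhops, size_t n, struct toy_ifnet *ifp,
    bool link_up)
{
	size_t touched = 0;

	assert(nhops != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (nhops[i].ifp != ifp)
			continue;
		if (link_up)
			nhops[i].flags &= ~TOY_NHF_UNREACHABLE;
		else
			nhops[i].flags |= TOY_NHF_UNREACHABLE;
		touched++;
	}
	return (touched);
}

/* Per-packet check: one flag test on the final nexthop. */
bool
toy_route_usable(const struct toy_route *rt)
{
	return ((rt->nh->flags & TOY_NHF_UNREACHABLE) == 0);
}
```

In this model a link-down event on one interface costs one pass over the nexthop array, however many routes point at those nexthops.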

Test Plan

[root@ftsr1] [~] # ping 9.9.9.9
PING 9.9.9.9 (9.9.9.9): 56 data bytes
64 bytes from 9.9.9.9: icmp_seq=0 ttl=57 time=87.410 ms
64 bytes from 9.9.9.9: icmp_seq=1 ttl=57 time=95.969 ms
64 bytes from 9.9.9.9: icmp_seq=3 ttl=57 time=96.876 ms
[nhop_ctl] inet.0 nhop_free: deleting nh#2/inet/vtnet0/resolve
ping: sendto: No route to host
ping: sendto: No route to host
ping: sendto: No route to host
^C

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

pouria created this revision. May 31 2026, 9:19 PM

Herald added subscribers: ae, imp. May 31 2026, 9:19 PM

pouria requested review of this revision. May 31 2026, 9:19 PM

Harbormaster completed remote builds in B73595: Diff 179042. May 31 2026, 9:19 PM

I updated my servers with this patch and got good results.
Before the patch, when I pinged routed addresses and a nexthop became unreachable, my pings got stuck.
Now ping immediately reports "Network is down" and recovers faster once the nexthop is reachable again.
Note that I have more than 2M routes.

% ping 3fff::10:15::1
PING(56=40+8+8 bytes) 3fff::2::fbf8:3f4 --> 3fff::10:15::1
16 bytes from 3fff::10:15::1, icmp_seq=0 hlim=64 time=4.100 ms
16 bytes from 3fff::10:15::1, icmp_seq=1 hlim=64 time=4.273 ms
16 bytes from 3fff::10:15::1, icmp_seq=2 hlim=64 time=1.290 ms
16 bytes from 3fff::10:15::1, icmp_seq=3 hlim=64 time=5.433 ms
16 bytes from 3fff::10:15::1, icmp_seq=4 hlim=64 time=2.654 ms
ping: sendmsg: Network is down
ping: wrote 3fff::10:15::1 16 chars, ret=-1
ping: sendmsg: Network is down
ping: wrote 3fff::10:15::1 16 chars, ret=-1
ping: sendmsg: Network is down
ping: wrote 3fff::10:15::1 16 chars, ret=-1
16 bytes from 3fff::10:15::1, icmp_seq=8 hlim=64 time=2.928 ms
16 bytes from 3fff::10:15::1, icmp_seq=9 hlim=64 time=5.729 ms
16 bytes from 3fff::10:15::1, icmp_seq=10 hlim=64 time=2.608 ms

pouria added a child revision: D57389: routing: Replace unreachable nhops in nhgrp. Jun 1 2026, 7:39 PM

pouria edited the summary of this revision.

glebius added inline comments. Jun 1 2026, 7:51 PM

sys/net/route/nhop_ctl.c
1344–1347	Shouldn't this be an atomic(9) operation?
1367–1370	What happened without this crutch?

pouria added inline comments. Jun 1 2026, 7:58 PM

sys/net/route/nhop_ctl.c
1344–1347	AFAICU no. We set nhop flags without any kind of protection in netlink, rtsock, and other subsystems.
1367–1370	The bridge tests panic. Why? We compute rib_head (rh) from fibnum + V_tables in rt_tables_get_rnh, and by then the VNET has been destroyed, so rh points to garbage memory.

Move the one line that updates rib_head from D57389 into this revision.

Harbormaster completed remote builds in B73616: Diff 179079. Jun 1 2026, 8:04 PM

I believe this is an architecturally correct change, but I'm not sure about all details.

P.S. I'll move my locking concerns discussion to D57389.

sys/net/route/nhop_ctl.c
1367–1370	Please add this info into a comment above VNET_IS_SHUTTING_DOWN().

This revision is now accepted and ready to land. Jun 11 2026, 4:01 PM

ping

markj added inline comments. Mon, Jul 6, 3:00 PM

sys/net/route/nhop.h
149
sys/net/route/nhop_ctl.c
117	Indentation on continuing lines should be by four spaces.
1344–1347	Are you sure? That seems quite incorrect. In some cases it is okay because the nhop object is newly allocated and not visible to any other threads, but that is not the case here.
1362

Address @markj comments.
Add requested code comment on vnet shutdown for @glebius. Does the comment look good?

This revision now requires review to proceed. Wed, Jul 8, 8:10 PM

Harbormaster completed remote builds in B74640: Diff 181572. Wed, Jul 8, 8:10 PM

pouria added inline comments. Wed, Jul 8, 8:10 PM

sys/net/route/nhop_ctl.c
1344–1347	> Are you sure? That seems quite incorrect. In some cases it is okay because the nhop object is newly allocated and not visible to any other threads, but that is not the case here.
That's also the atomic case I talked about in D57669. It's under 32 bits and it can't be atomic anyway. From atomic(9):
> Certain architectures also provide operations for types smaller than “int”.
>   char   unsigned character
>   short  unsigned short integer
>   8      unsigned 8-bit integer
>   16     unsigned 16-bit integer
> These types must not be used in machine-independent code.

markj added inline comments. Thu, Jul 9, 2:43 PM

sys/net/route/nhop_ctl.c
1344–1347	The paragraph you quoted in D57669 absolutely does not apply here. Please take some time to read about these topics, e.g., chapter 3 of https://arxiv.org/pdf/1701.00854 In any case, the right approach is most likely to use the nhop write lock. That is, you need an iterator which acquires the nhop write lock instead of the read lock.

Address @glebius and @markj comments by using wlock.

Harbormaster completed remote builds in B74654: Diff 181619. Thu, Jul 9, 5:05 PM

pouria added inline comments. Thu, Jul 9, 5:08 PM

sys/net/route/nhop_ctl.c
1344–1347	> The paragraph you quoted in D57669 absolutely does not apply here. Please take some time to read about these topics, e.g., chapter 3 of https://arxiv.org/pdf/1701.00854
Thank you so much for the link, I will.
> In any case, the right approach is most likely to use the nhop write lock. That is, you need an iterator which acquires the nhop write lock instead of the read lock.
Done.

Let me suggest a better KPI for the wlock iterator. Instead of adding a wrapper function, just add a const bool wlock member to struct nhop_iter. Then all you need to do is set it to true in the initializer. Use the same functions to start and stop the iterator in both modes.