Page MenuHomeFreeBSD

Convert debugnet to the new routing KPI.
ClosedPublic

Authored by melifaro on Apr 24 2020, 7:57 AM.
Tags
None
Referenced Files
F103581174: D24555.id71013.diff
Tue, Nov 26, 6:09 PM
Unknown Object (File)
Fri, Nov 22, 11:46 PM
Unknown Object (File)
Sun, Nov 17, 4:23 AM
Unknown Object (File)
Sun, Nov 17, 2:49 AM
Unknown Object (File)
Oct 18 2024, 9:54 AM
Unknown Object (File)
Oct 2 2024, 1:36 PM
Unknown Object (File)
Sep 30 2024, 2:52 PM
Unknown Object (File)
Sep 29 2024, 1:54 PM
Subscribers

Details

Summary

Introduce new fib[46]_lookup_debugnet() functions serving as a
special interface for the crash-time operations. Underlying
implementation will try to return lookup result if
datastructures are not corrupted, avoding locking.

Convert debugnet to use fib4_lookup_debugnet() and switch it
to use nexthops instead of rtentries.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Looks great to me, thanks! I have some questions due to my limited understanding of the routing/nexthop KPIs below.

sys/net/debugnet.c
675 ↗(On Diff #70935)

Are the semantics for same-link routes still the same? I.e., the API produces a nexthop object?

690–691 ↗(On Diff #70935)

What are the lifetime semantics of nexthop objects?

For example, if this was called from sysctl debug.kdb.enter=1 before returning to userspace.

This revision is now accepted and ready to land.Apr 24 2020, 4:25 PM
sys/net/debugnet.c
675 ↗(On Diff #70935)

Yes, the lookup result is _always_ nhop object (if the route is found). However, it will not contain the specific destination in the GW field - if no RTF_GATEWAY is specified, you have to use original dst to determine the link-level rewrite.

e.g. for "interface" route: no RTF_GATEWAY, gw family==AF_LINK, which is exactly the same as it was before.

If you're interested, some examples are available here.

690–691 ↗(On Diff #70935)

Yeah :-) I posted the review to have this kind of discussion.

nexthops are immutable and are reclaimed when refcount goes back to 0 by epoch(9) callback.
So, nearly all of the customers can just use nexthops within the current network epoch.

In the panic case (not sure at what moment do we stop other threads) what can be done is asking the kpi to return referenced object (NHR_REF instead of NHS_NONE) which will guarantee this nhop won't disappear. It also guarantees relevant ifp/ifa are valid, as nhop references them.

sys/net/debugnet.c
675 ↗(On Diff #70935)

Re: nexthop objects in my question, I meant that a route is found, rather than producing NULL. I.e., we still have routes for link-local subnets. Sounds like yes, I think.

I hope to expand this interface to IPv6 at some time, but haven't gotten there yet. (I still only have IPv4 at home, and that's what my familiarity is with. I took a CCNA course in the early 2000s.)

Thanks again for taking the time to update debugnet to the new nexthop interface!

690–691 ↗(On Diff #70935)

Ah, good :-). So I guess in panic and debugger context we won't have concurrent threads modifying the nexthop table and don't need a ref to prevent the object from being reclaimed?

I guess my only question is: could NIC tx / rx paths potentially change the nexthop table / associated objects out from under us? The debugnet connection PCB is essentially a very crappy version of a TCP PCB and lives for the lifetime of, say, an entire netdump, or a netgdb session. I guess this concern is not new to nexthop really, as we referenced the dest_rt members well after RTFREE_LOCKED() previously.

sys/netinet6/in6_fib.c
415–416 ↗(On Diff #70935)

I know debugnet doesn't actually support v6 yet, but what sort of changes are needed corresponding to this TODO?

423–424 ↗(On Diff #70935)

As an API consumer in the future, how would I determine the right scopeid to provide here? For v4 we pass 0, and apparently the value is unused. (I am IPv6-naive.)

If a user specifies a v6 linklocal destination address during panic time, should we accept the address's s6_addr16[1] provided (pass ntohs(s6_addr16[1]) for scopeid), or something else?

sys/net/debugnet.c
690–691 ↗(On Diff #70935)

Nexthops are reclaimed based on refcount. As one of the refcounts is always some route, route change is required to schedule deletion.
It can be either userland-induced change, or timer-based (for temporal routes), or something weird like received ICMP redirect which changes the existing redirected route (though the possibility is quite low).

In the cases above the returned nexthop can potentially get its refcount down to 0 and scheduled for the deletion by epoch callback.

Re concern: potentially you can keep the reference for the duration of the entire session, similar to the route caching, and free the nexthop afterwards.

sys/netinet6/in6_fib.c
415–416 ↗(On Diff #70935)

It's just an internal reminder on committing something like D23051 after converting everything to the new KPI, it does not affect functionality.

423–424 ↗(On Diff #70935)

Re IPv4: I hope to implement rfc3927 link-local address scope.

Re IDs: for IPv6 unicast addresses, the only scope is link-local, so the ID is the link interface index.

In IPv6 world it is typically specified by fe80::5054:ff:fe42:fef%vtnet0 postfix.

Thanks!

sys/net/debugnet.c
690–691 ↗(On Diff #70935)

We don't have to worry about userspace or timers running during debugnet. ICMP redirects are probably the concerning thing, but I think we hijack rx in debugnet_pkt_in, and just discard non-UDP IP packets in debugnet_handle_ip. So probably it is ok as-is and we don't need to keep the reference in the PCB, although that might be more correct.

sys/netinet6/in6_fib.c
415–416 ↗(On Diff #70935)

Ah ok, thanks.

423–424 ↗(On Diff #70935)

Ok, thanks. If user specifies an explicit outbound interface like vtnet0, we should pass it's scopeid to this interface, but if not, should we pass 0? I guess I'm not sure what the suffix does — select a specific interface when multiple interfaces have the same "subnet"?

sys/netinet6/in6_fib.c
423–424 ↗(On Diff #70935)

Yeah, if interface is not specified, then it's 0.
Suffix says 'please lookup this address in scope of interface vtnet0' (assuming the address itself has link scope - fe80::/10 for IPv6 or 169.254.0.0/16 for IPv4). Note the latter is not implemented yet.

So yes, addresses/networks can be the same on multiple links for the addresses within the local scope.

IPv6 case
18:43 [0] m@devel0 netstat -6rn | grep /64
fe80::%vtnet0/64                  link#1                        U        vtnet0
fe80::%lo0/64                     link#2                        U           lo0
fe80::%epair1a/64                 link#5                        U       epair1a
This revision was automatically updated to reflect the committed changes.
sys/netinet6/in6_fib.c
423–424 ↗(On Diff #70935)

Ah, thanks!