Change the way of attaching IPv6 link-layer header.
ClosedPublic

Authored by melifaro on Jan 9 2015, 2:21 PM.

Details

Summary

Problem description:

How layer 2 resolution and header imposition are currently performed:

For IPv4 we have the following simple chain:

ip_output() -> (ether|atm|whatever)_output() -> arpresolve()

The lookup is done in the proper place (the link-layer output routine), and it is possible to
provide cached lle data.

For IPv6 the situation is more complex:

ip6_output() -> nd6_output() -> (ether|whatever)_output() -> nd6_storelladdr()

We have ip6_output(), which calls nd6_output() instead of the link output routine.
nd6_output() does the following:

  • checks if the lle exists and creates it if needed (similar to arpresolve())
  • performs lle state transitions (similar to arpresolve())
  • pushes packets to the link output routine, running the SeND/MAC hooks along the way regardless of lle state (i.e. it works as a run-hooks placeholder).

After that, an interface output routine such as ether_output() calls nd6_storelladdr(), which performs the lle lookup once again.

As a result, we perform the lookup twice for each outgoing packet on most types of interfaces.
We also perform a needless lookup in the case of IP tunnels.

This does not look like an optimal code path.
From an architectural point of view, ND lookups are called in the wrong place, so we need to add hacks for non-trivial cases like p2p interfaces.
Additionally, it is hard to propagate the lookup result, so we end up with a separate function used only for doing additional lookups, which complicates the usage pattern.
As a result, we already have toe_nd6_resolve(), which mimics the arpresolve() approach for TOE L2 resolution.
Another significant reason is performance: doing 2 lookups instead of 1 (or 1 instead of 0) without using the result of the first one is wasted work.
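
To make the cost concrete, here is a minimal userland sketch of the two call shapes. It is a toy model, not kernel code: struct toy_lle, toy_table, toy_lookup() and the toy_*_path() functions are hypothetical names invented for illustration, and an int stands in for the IPv6 destination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy neighbor entry; NOT the kernel's struct llentry. */
struct toy_lle {
	int		dst;		/* stand-in for an IPv6 destination */
	unsigned char	lladdr[6];	/* link-layer address */
};

static struct toy_lle toy_table[] = {
	{ 1, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 } },
	{ 2, { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 } },
};

static int toy_lookups;	/* number of table walks performed so far */

static struct toy_lle *
toy_lookup(int dst)
{
	size_t i;

	toy_lookups++;
	for (i = 0; i < sizeof(toy_table) / sizeof(toy_table[0]); i++)
		if (toy_table[i].dst == dst)
			return (&toy_table[i]);
	return (NULL);
}

/*
 * Old shape: the ND layer looks the entry up for state handling,
 * then the link layer looks it up again to fetch the address.
 */
static int
toy_old_path(int dst, unsigned char *hdr)
{
	struct toy_lle *lle;

	lle = toy_lookup(dst);		/* nd6_output()-like step */
	if (lle == NULL)
		return (-1);
	lle = toy_lookup(dst);		/* nd6_storelladdr()-like step */
	assert(lle != NULL);
	memcpy(hdr, lle->lladdr, sizeof(lle->lladdr));
	return (0);
}

/* New shape: one lookup inside the link-layer output routine. */
static int
toy_new_path(int dst, unsigned char *hdr)
{
	struct toy_lle *lle;

	lle = toy_lookup(dst);		/* nd6_resolve()-like step */
	if (lle == NULL)
		return (-1);
	memcpy(hdr, lle->lladdr, sizeof(lle->lladdr));
	return (0);
}
```

In this model toy_old_path() walks the table twice per packet and toy_new_path() walks it once, while both fill in the same header bytes.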

Proposed solution:
Make things simpler and return to the approach used in IPv4. To be more specific:

  • ip6_output() and ip6_forward() will still call nd6_output(), but its only task will be calling the MAC/DTrace/SeND hooks prior to calling ifp->if_output(). It could be inlined (exactly as in the IPv4 code), but the SeND check makes this inconvenient. I hope that we will be able to move the SeND hook out of the hot path after merging the projects/routing or user/ae/inet6 branches.
  • Both ND state keeping and link-layer resolution will happen in the 'case AF_INET6' part of the output function (similar to AF_INET). As a result, we will have a single externally visible function (nd6_resolve()) for IPv6 consumers.

This can be achieved by a number of small changes, such as renaming or slightly changing existing functions. In fact, we need to:

  • rename nd6_output[_slow]() to nd6_resolve[_slow]()
  • convert nd6_resolve() and nd6_resolve_slow() to arpresolve() semantics, i.e. copy the L2 address to a buffer instead of pushing the packet towards lower layers
  • make all nd6_output() consumers use the newly added nd6_output_ifp() (created in r276844) instead. This function is responsible solely for calling all necessary hooks and ifp->if_output().
  • make all nd6_storelladdr() users use nd6_resolve()
  • eliminate nd6_storelladdr()
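
The "arpresolve() semantics" mentioned in the list above can be sketched as follows. This is a simplified userland model under stated assumptions, not the actual nd6_resolve() from the diff: struct toy_neigh, enum toy_state and toy_resolve() are hypothetical, and the real function takes different arguments and handles more states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define	TOY_ADDR_LEN	6

enum toy_state { TOY_INCOMPLETE, TOY_REACHABLE };

/* Toy neighbor cache entry; NOT the kernel's struct llentry. */
struct toy_neigh {
	enum toy_state	state;
	unsigned char	lladdr[TOY_ADDR_LEN];
	const void	*held;	/* packet queued while resolution is pending */
};

/*
 * Resolver contract modelled here:
 *   0            desten is filled in; the caller builds the link-layer
 *                header and transmits the packet itself.
 *   EWOULDBLOCK  the packet was queued on the entry; the caller must
 *                not transmit it.
 * The resolver never pushes the packet to a lower layer on its own.
 */
static int
toy_resolve(struct toy_neigh *n, const void *pkt, unsigned char *desten)
{
	assert(n != NULL && desten != NULL);

	if (n->state == TOY_REACHABLE) {
		memcpy(desten, n->lladdr, TOY_ADDR_LEN);
		return (0);
	}
	n->held = pkt;		/* hold until the address is known */
	return (EWOULDBLOCK);
}
```

The point of this shape is that the caller already holds the lookup result (the copied address) when it builds the header, so no second lookup is needed.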

Error handling:
Currently, sending a packet to a non-existent la results in ip6_<output|forward> -> nd6_output() -> nd6_output_lle(), which returns 0.
In the new scenario, the packet will be propagated to <ether|whatever>_output() -> nd6_resolve(), which will return EWOULDBLOCK, and that will be converted to 0.

(It looks like we don't need the LLE resolve routine returning EWOULDBLOCK in all cases except TOE/IB, but that is worth a separate discussion.) For now we just copy the existing IPv4 LLE behavior.
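
The error conversion described above could look roughly like this in a link-layer output routine. It is only a sketch: toy_link_output() is a hypothetical function, and the resolver's return value is passed in as a parameter instead of calling a real resolver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Toy link-layer output step. resolver_rc stands in for the value a
 * resolve call would have returned; *transmitted is set to 1 only
 * when the packet would actually go out on the wire.
 */
static int
toy_link_output(int resolver_rc, int *transmitted)
{
	assert(transmitted != NULL);

	*transmitted = 0;
	if (resolver_rc != 0) {
		/*
		 * EWOULDBLOCK means the packet is held pending
		 * resolution, which is not an error for the upper
		 * layer. Anything else is passed through unchanged.
		 */
		return (resolver_rc == EWOULDBLOCK ? 0 : resolver_rc);
	}
	*transmitted = 1;
	return (0);
}
```

With this shape the upper layer sees 0 both when the packet was sent and when it was queued for resolution, which matches the behavior the current code already has for an unresolved destination.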

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 201
Build 201: arc lint + arc unit

Event Timeline

melifaro retitled this revision from to Change the way of attaching IPv6 link-layer header..
melifaro updated this object.
melifaro edited the test plan for this revision. (Show Details)

Update diff to sync to recent commits.

bz added a subscriber: bz.

I'll look at this once I am done with D1454

sys/net/if_iso88025subr.c
296

It looks like the is_gw argument is missing here.

bz requested changes to this revision.Jan 13 2015, 2:36 PM
bz edited edge metadata.

Can you get the unrelated const changes out of the diff for a start please (along with the header changes where needed)?

I am still trying to dissect the rest.

sys/netinet6/nd6.c
921

Unrelated const change; can you break this out of this patch?

964

Unrelated const change; can you break this out of this patch?

1065

And this const one.

This revision now requires changes to proceed.Jan 13 2015, 2:36 PM
melifaro edited edge metadata.

Update diff to sync to recent -HEAD.

Remove 'const' changes (had to leave one in nd6_resolve()).

melifaro edited edge metadata.

Address ae@ comment on fixing if_iso88025subr.c.

ae edited edge metadata.

Looks good to me.

melifaro edited edge metadata.

Do pre-commit sync.

This revision was automatically updated to reflect the committed changes.