Page MenuHomeFreeBSD

Fix IPv6 checksums when exthdrs are present.
ClosedPublic

Authored by bz on Feb 19 2020, 6:40 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 6, 10:40 PM
Unknown Object (File)
Sun, Jan 5, 1:34 AM
Unknown Object (File)
Dec 8 2024, 8:58 PM
Unknown Object (File)
Nov 25 2024, 1:33 PM
Unknown Object (File)
Nov 18 2024, 4:33 AM
Unknown Object (File)
Nov 16 2024, 10:46 AM
Unknown Object (File)
Nov 7 2024, 1:33 PM
Unknown Object (File)
Oct 22 2024, 5:55 PM

Details

Summary
In two places in ip6_output we are doing (delayed) checksum calculations.
The initial logic came from SCTP in r205075,205104 and was later copied
and adjusted for the TCP|UDP case in r235958.
The problem was that the original SCTP offsets were already wrong for any
case with extension headers present given IPv6 extension headers are not
part of the pseudo checksum calculations.
The later changes do not help in case there is checksum offloading as for
extension headers (incl. fragments) we do currrently never offload as we
have no infrastructure to know whether the NIC can handle these cases.

Correct the offsets for delayed checksum calculations and properly handle
mbuf flags.  In addition harmonize the almost identical duplicate code.

Submitted by: Francis Dupont (fdupont isc.org)
Reported by: Francis Dupont (fdupont isc.org)
PR: 243675
Test Plan

TODO, write tests cases; we cannot easily use vnet/epair or other local
network tests as these "circumvent" the checksum calculations and set
correct checksums on the mbufs; we'd need at least a bhyve or multiple
real host scenario.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

This check is weird and I do not fully understand why the original changes from
https://svnweb.freebsd.org/base/head/sys/netinet6/ip6_output.c?r1=349528&r2=349529&
did not have the extra IFCAP_NOMAP block in the second (frag) case.

I dropped @jhb an email a few minutes ago asking. I hope we can remove the "!frag &&" here.

I'm 1/2 joking, but what would you think about not supporting extension headers at all? They are the worst part of IPv6 and make everything complicated and add lots of hairy cases. What benefit are they?
(I'm legitimately curious)

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

I believe it is because any packtets that were originally unmapped would have already been previously expanded by mb_unmapped_to_ext(), when fragmented so there is no reason to check

markj added inline comments.
sys/netinet6/ip6_output.c
215 ↗(On Diff #68552)

You might KASSERT that plen >= optlen.

221 ↗(On Diff #68552)

Indeed, I suspect this is an oversight.

1097 ↗(On Diff #68552)

Is there any reason to have hlen? It should always have the same value as unfragpartlen I think.

This revision is now accepted and ready to land.Feb 19 2020, 9:00 PM

I'm 1/2 joking, but what would you think about not supporting extension headers at all? They are the worst part of IPv6 and make everything complicated and add lots of hairy cases. What benefit are they?
(I'm legitimately curious)

I'd love to. Sadly that train left more than 20 years ago and we are seeing more and more protocol(s | extensions) making use of them. I think as a general purpose OS we cannot do much about it.

I do have a big hammer prototype check in a branch at the beginning of ip6_output() to do all essentially mandatory checks for the default case of just sending out a normal packet (ignoring firewalls, etc).

Let's keep cleaning bits by bits up there making the code simpler and factoring parts (such as ext hdrs) out more and more and we'll see if we can have a "fast path"; but also see if that "fast path" is actually any faster.

I'm 1/2 joking, but what would you think about not supporting extension headers at all? They are the worst part of IPv6 and make everything complicated and add lots of hairy cases. What benefit are they?
(I'm legitimately curious)

https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml#extension-header

Active usage here are IPSec (50/51) (used by various VPNs), Fragment (44) (used in general, i.e. DNS(sec)) and Mobility (135) (used by Microsoft for DirectAccess) .

Leave comments and questions (and another possible bugfix comment for later).

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

I am not sure I can entirely follow. Fragmentation happens afterwards.
Are you trying to say that the first incarnation of the call would have caught it already and done the mb_unmapped_to_ext() and so the 2nd one doesn't need to anymore? In that case I believe a lot of the logic would be duplicate.
Can you please try to explain it again for me?

1097 ↗(On Diff #68552)

Sounds right. I'll leave that for a follow-up commit if you don't mind.

1132 ↗(On Diff #68552)

There is and was for a long time a problem here in that if we called mb_unmapped_to_ext(m); and it was successful we'd get a new chain but never updated the ip6 pointer here doing

ip6 = mtod(m, struct ip6_hdr *);

if I am not wrong?

Add KASSERT as suggested by @markj .
Remove hlen variable which always matches unfragpartlen as suggested by @markj.
Fix a case where we possibly do not update the ip6 pointer.
Leave a comment about an unclear header clearing line for future contemplation (which was removed in the previous revision).

This revision now requires review to proceed.Feb 20 2020, 11:26 AM
bz marked 2 inline comments as done and an inline comment as not done.Feb 20 2020, 11:28 AM
bz added inline comments.
sys/netinet6/ip6_output.c
1026 ↗(On Diff #68552)

I am still contemplating this one. I keep wondering if the original I added years ago should have been:

m->m_pkthdr.csum_flags &= ~ifp->if_hwassist;

clearing all further hwassist flags given we've done the delayed checksum calculations?

1097 ↗(On Diff #68552)

Actually done it now given I updated the revision anyway.

gallatin added inline comments.
sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

Yeah, i see what you mean. Its just a bug. This code came from me originally (netflx tree ~2017) and the bug has been there since the initial commit. We just never noticed because we don't use SCTP. Adding the check here is correct.

This revision is now accepted and ready to land.Feb 20 2020, 1:48 PM
sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

@gallatin: Now you confuse me even more given this is unrelated to SCTP.

It's the

(ifp->if_capenable & IFCAP_NOMAP) == 0

check which is the one in question if it is needed for the second call (now around line 1129 in this review). The cases "/* Try to fragment the packet. Cases 1-b and 3. */". We'd calculate the checksum if not already done before doing fragmentation.

If we do not do a delayed checksum for TCP|UDP|SCTP (for whatever reason) and did not run into the "ifp->if_capenable & IFCAP_NOMAP == 0" case the first time (for whatever reason), then the mbuf was never run through mb_unmapped_to_ext().

I assume there are two questions:

(1) can this happen at all or would we always run through the case the first time anyway and thus it is not needed at all the 2nd time?

(1b) in that case would we actually need the 2nd case?

(2) if it can happen, then do we need to add it to the 2nd time?

In other words: do you suggest I do remove the "!frag &&" from the change?

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

I'm utterly confused at what you're asking.

Can you point me to the line in the original code where you think there should be a check?

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

Should there be a "} else if ((ifp->if_capenable & IFCAP_NOMAP) == 0) {" check here

https://svnweb.freebsd.org/base/head/sys/netinet6/ip6_output.c?annotate=358167#l1127

as well as it is further up here:

https://svnweb.freebsd.org/base/head/sys/netinet6/ip6_output.c?annotate=358167#l1019

?

sys/netinet6/ip6_output.c
221 ↗(On Diff #68552)

By the time we've gotten past the code around line 1029, we have examined the ifp, and converted from unmapped to mapped if the ifp is not capable of handling unmapped.

How can we get to line 1127 without having done the check at 1029? Or is there something that I'm missing, like we are using a different ifp?

The other checks there are needed because we're doing software checksums because of fragmentation, so we must convert unmapped to mapped in that case. But I do not see a need to re-check the interface caps.

Pass the proper csum flags into the 2nd call of ip6_output_delayed_csum().

This revision now requires review to proceed.Feb 20 2020, 6:53 PM
bz marked 5 inline comments as done.Feb 20 2020, 6:56 PM

Seems all sorted now. Thanks especially to @gallatin for bearing with me.

sys/netinet6/ip6_output.c
235 ↗(On Diff #68607)

I'm sure I"m asking dumb questions again, but why is performing the checksum tied to an interface not having IFCAP_NOMAP?

@gallatin I hope this makes sense ..

sys/netinet6/ip6_output.c
235 ↗(On Diff #68607)

I think this time you got the logic wrong.
If we have to compute the checksum (based on flags) we will. For that we call mb_unmapped_to_ext().
If we do not have to computer a checksum we still want to call mb_unmapped_to_ext() if IFCAP_NOMAP is not set.
It's just in the same location.

We could always do the double work of calling mb_unmapped_to_ext() twice (and in fact for SCTP we might have done this in the old code because there was a missing else inside the #ifdef SCTP block). Now that is at least folded into one call only.

There is still the case which was there before in that for the (2nd case, the before fragmentation) we'd call mb_unmapped_to_ext() again if in the first case we did not compute the checksum but only did the mb_unmapped_to_ext() because IFCAP_NOMAP was not set. So at least we are not off worse. We could optimise that out as well if we want. The main purpose of this change was to fix IPv6 delayed checksum computations though and I only stumbled over the IFCAP_NOMAP thing not fully realising why the code always was as-is.

If we'd want to further optimise the code we'd put the entire:

m = mb_unmapped_to_ext(m);
if (m == NULL) {
   ...
}

block up there under an extra

if (frag && (ifp->if_capenable & IFCAP_NOMAP) == 0) { ... }

block as in that case we'd done the mb_unmapped_to_ext() already in the !frag case.

That wasn't my logic before but was exactly the problem of my confusion initially.

gallatin added inline comments.
sys/netinet6/ip6_output.c
235 ↗(On Diff #68607)

Ah, sorry, yes. I confess it is because there are so many comments on the review that the first hunk scrolled off my screen..

This revision is now accepted and ready to land.Feb 24 2020, 6:28 PM
This revision was automatically updated to reflect the committed changes.