Page MenuHomeFreeBSD

NAT64: compute checksum for locally generated packets
AbandonedPublic

Authored by 2khramtsov_gmail.com on Jul 23 2020, 3:13 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Dec 18, 10:59 PM
Unknown Object (File)
Wed, Dec 11, 6:54 PM
Unknown Object (File)
Sat, Dec 7, 9:32 PM
Unknown Object (File)
Sat, Dec 7, 11:46 AM
Unknown Object (File)
Fri, Dec 6, 2:08 AM
Unknown Object (File)
Fri, Nov 29, 11:53 AM
Unknown Object (File)
Nov 21 2024, 3:05 AM
Unknown Object (File)
Nov 18 2024, 4:15 PM

Details

Reviewers
ae
donner
Group Reviewers
network

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

2khramtsov_gmail.com retitled this revision from WIP: CLAT fix TCP and UDP checksums for UE host to ipfw NAT64 CLAT: calculate TCP and UDP checksums for UE.
2khramtsov_gmail.com edited the summary of this revision. (Show Details)
2khramtsov_gmail.com edited the test plan for this revision. (Show Details)

I am new to Phabricator so please tell me if I specified anything wrong.
I did not want a WIP version review, but Phabricator for some reason asked for it.
I do not know why it specifies specific people, but I will leave it as is for now.

I also don't know anything about MFCs and how they work, but I think this
change should be backported to 12 (how?).

I also don't have hardware that is capable of ISR and I don't know if "local"
is needed without nat64_direct_output sysctl. ISR should not be broken anyway
as "local" is not set by default.

2khramtsov_gmail.com retitled this revision from ipfw NAT64 CLAT: calculate TCP and UDP checksums for UE to ipfw 464XLAT CLAT: calculate TCP and UDP checksums for UE.Jul 24 2020, 4:22 AM
2khramtsov_gmail.com edited the summary of this revision. (Show Details)

Thank you for your contribution, especially for using the documentation prefixes in your examples.
Because your test plan is so detailed, you may consider to write is as a regression test for constant revalidation.

sbin/ipfw/ipfw.8
3562 ↗(On Diff #74868)

May you consider the more generic wording "external" instead of "CPE", please?
Can you document the default behavior, if the keywords are missing at all?

sbin/ipfw/ipfw2.h
303–304 ↗(On Diff #74868)

Inserting new tokens will "renumber" later tokens, causing ABI breaks.
There is no problem with an ABI change, but it's necessary to note this more prominently.

sbin/ipfw/nat64clat.c
235–240 ↗(On Diff #74868)

Is it possible to include this to the config flags?

sys/netinet6/ip_fw_nat64.h
117 ↗(On Diff #74868)

Adding a new value into the struct causes an ABI break.
Adding a new flag to the existing flags variable (below), maintains backward compatibility.

sys/netpfil/ipfw/nat64/nat64_translate.c
1343

I do not understand this comment.
How can HW checksum helps for locally (software) generated packets?

sys/netpfil/ipfw/nat64/nat64_translate.h
144–145 ↗(On Diff #74868)

I can't check the context of this lines, because the patch was not generated by the recommended procedure, so please check yourself if the argument is part of the cfg argument in the same call.

donner requested changes to this revision.Jul 24 2020, 8:15 AM

Because the ABI break spans the kernel / userland barrier, the update procedure for the FreeBSD base system is harmed.
An old ipfw will send the old TOK_xxx values, which will be misinterpreted by a new kernel.
Please find a way to keep the old ipfw binary working during the upgrade.

This revision now requires changes to proceed.Jul 24 2020, 8:15 AM
2khramtsov_gmail.com edited the test plan for this revision. (Show Details)

Added context, changed man page, switched to flags, removed comment and added a check for CSUM_DELAY_DATA.

How can HW checksum helps for locally (software) generated packets?

I looked up how ip_fw_nat.c computes full checksum and did almost the same here.
I do not yet understand how hardware checksum offload works, so I removed this comment.

Also, now I check for CSUM_DELAY_DATA and set it off if it is set.
Because for some (?) reason CSUM_DELAY_DATA is set even if interface does not have
checksum offload support (like epair).

it's necessary to note this more prominently.

Sorry, how do I do it?

Good work, thank you.
If possible convert your test into a real test file for regressions.

I think it should be possible solve the problem without introducing extra configuration parameter. I'll take a look.

2khramtsov_gmail.com retitled this revision from ipfw 464XLAT CLAT: calculate TCP and UDP checksums for UE to ipfw 464XLAT CLAT: calculate L4 checksums for locally generated packets.Jul 27 2020, 11:23 AM
2khramtsov_gmail.com edited the summary of this revision. (Show Details)
2khramtsov_gmail.com edited the test plan for this revision. (Show Details)

This test causes panic on current in somewhere in if_epair.
For some reason something odd was written to my Wi-Fi NICs ROM,
and now it does not work and MAC address is odd.

EDIT: NIC is fine, it was just system file damage.

The original test file was also lost (0 B file now).
I've just rewritten the test file as I remember it, and I will test it once I will get more time.

In D25789#572104, @ae wrote:

I think it should be possible solve the problem without introducing extra configuration parameter. I'll take a look.

I've tried to do it this way now.
It will take some time for me to build and test this diff, as my keepassxc binary is apparently also corrupted
and I will have to spend some time checking for other file corruptions before I will get to test the diff.

2khramtsov_gmail.com edited the summary of this revision. (Show Details)

It works.
Workaround test panic by being slower in the test.
Still runs under 2s on my laptop.

I see that some .orig files were created after svn add.
Do I need to remove them?

Again, thank you for your patience.

sys/netpfil/ipfw/nat64/nat64clat.c
115 ↗(On Diff #75014)

Should this be reverted?

2khramtsov_gmail.com marked an inline comment as done.
2khramtsov_gmail.com added inline comments.
sys/netpfil/ipfw/nat64/nat64clat.c
115 ↗(On Diff #75014)

I removed these CSUM_DELAY_DATA bit off/on assignments and the test still passes.
I will update diff shortly.

2khramtsov_gmail.com marked an inline comment as done.

Improve test and do the same with kernel module as pf tests do.

Next time I will ask in mailing lists or IRC
and not be overly excited before submitting anything to avoid numerous diff changes.

Improve test and do the same with kernel module as pf tests do.

Thank you.

Next time I will ask in mailing lists or IRC
and not be overly excited before submitting anything to avoid numerous diff changes.

Software development is more an art of desillustration and expectation management than about writing real code.

This revision is now accepted and ready to land.Jul 28 2020, 12:34 PM
This revision now requires review to proceed.Jul 28 2020, 2:15 PM

I deployed this on "real" 12.1 host (KVM VDS vtnet NIC and jails with epairs) today:

Without CSUM_DELAY_DATA bit on/off assignment PLAT (host) gets correct checksum from CLAT (jail) (from 201:db8:11::192.168.1.1)
*and* IPv4 interface on PLAT (vtnet0) sends correct IPv4 checksum to the Internet, as noted by tcpdump.

But the destination host still gets wrong checksum... Bring back this CSUM_DELAY_DATA bit magic,
as without it the Internet will get wrong checksum even if tcpdump on PLAT's side said outgoing checksum was OK.

It seems that VNET jail likely passes mbuf to the host, and not just IPv6 packets.
Removing CSUM_DELAY_DATA bit assignment from mbuf will result in breakage with VNET.
It seems that CSUM_DELAY_DATA bit is critical in deciding whether checksum was computed or not.

While here remove this test as it is naive, thus misleading.
Also it belongs to ipfw/nat64, not ipfw. Later I should write ipfw test suite to justify creating dirs.

As I said before, I have no real hardware with working checksum offload to test this
on a real network, not virtualized one. However I believe it should now work with both of them.

On previous test panic: after I reinstalled the world from freshly svn co'ed HEAD,
I still could not build freshly svn co'ed 12.1 with this diff due to undefined reference to "yydebug".

The corruption from that if_epair panic was broad enough to create new ZFS pool for OS from scratch.
I don't recommend running the first test version on bare metal at all.
I may also look into the reason of that panic later, it is interesting.

I sent a more generic patch in the reply to your email a week ago, can you check your spam folder and test it?

In D25789#574377, @ae wrote:

I sent a more generic patch in the reply to your email a week ago, can you check your spam folder and test it?

Gmail inbox does not have any mail even in spam. Seems to be a gmail problem.
Please resend to evgeniy@khramtsov.org, rspamd is saner than whatever gmail runs.

2khramtsov_gmail.com retitled this revision from ipfw 464XLAT CLAT: calculate L4 checksums for locally generated packets to NAT64: compute checksum for locally generated packets.
2khramtsov_gmail.com edited the summary of this revision. (Show Details)
2khramtsov_gmail.com removed a reviewer: manpages.

The same issue exists for other NAT64 instances. Update this diff to the version from E-Mail.

  1. Move in_delayed_cksum to nat64_translate.c and add in6_delayed_cksum.
  2. Don't check for rcvif. CSUM_DELAY_DATA(_IPV6) is set only for locally generated packets.

On nat64_output (netisr):

In ip(6)_output, locally generated packets are marked in csum_flags and csum_data
fields before netisr_queue(). Locally generated packets should also have a valid pseudo-header checksum.

nat64_output also calls netisr_queue(), but does not mark locally generated packets before.
The efficient way to handle locally generated packets with nat64_output, is to ensure that mbuf will have valid
converted pseudo-header checksum, then mark locally generated packets the same as in ip(6)_output.

I have an idea and I will test it later with nat64_output and em(4).
My netisr issues were most likely caused by using if_epair, which has its own parameter for netisr_queue().

in(6)_delayed approach also works for netisr, but not as efficiently as it could be.