Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
I am new to Phabricator so please tell me if I specified anything wrong.
I did not want a WIP version review, but Phabricator for some reason asked for it.
I do not know why it specifies specific people, but I will leave it as is for now.
I also don't know anything about MFCs and how they work, but I think this
change should be backported to 12 (how?).
I also don't have hardware that is capable of ISR and I don't know if "local"
is needed without nat64_direct_output sysctl. ISR should not be broken anyway
as "local" is not set by default.
Thank you for your contribution, especially for using the documentation prefixes in your examples.
Because your test plan is so detailed, you may consider to write is as a regression test for constant revalidation.
sbin/ipfw/ipfw.8 | ||
---|---|---|
3562 ↗ | (On Diff #74868) | May you consider the more generic wording "external" instead of "CPE", please? |
sbin/ipfw/ipfw2.h | ||
303–304 ↗ | (On Diff #74868) | Inserting new tokens will "renumber" later tokens, causing ABI breaks. |
sbin/ipfw/nat64clat.c | ||
235–240 ↗ | (On Diff #74868) | Is it possible to include this to the config flags? |
sys/netinet6/ip_fw_nat64.h | ||
117 ↗ | (On Diff #74868) | Adding a new value into the struct causes an ABI break. |
sys/netpfil/ipfw/nat64/nat64_translate.c | ||
1343 | I do not understand this comment. | |
sys/netpfil/ipfw/nat64/nat64_translate.h | ||
144–145 ↗ | (On Diff #74868) | I can't check the context of this lines, because the patch was not generated by the recommended procedure, so please check yourself if the argument is part of the cfg argument in the same call. |
Because the ABI break spans the kernel / userland barrier, the update procedure for the FreeBSD base system is harmed.
An old ipfw will send the old TOK_xxx values, which will be misinterpreted by a new kernel.
Please find a way to keep the old ipfw binary working during the upgrade.
Added context, changed man page, switched to flags, removed comment and added a check for CSUM_DELAY_DATA.
How can HW checksum helps for locally (software) generated packets?
I looked up how ip_fw_nat.c computes full checksum and did almost the same here.
I do not yet understand how hardware checksum offload works, so I removed this comment.
Also, now I check for CSUM_DELAY_DATA and set it off if it is set.
Because for some (?) reason CSUM_DELAY_DATA is set even if interface does not have
checksum offload support (like epair).
Good work, thank you.
If possible convert your test into a real test file for regressions.
I think it should be possible solve the problem without introducing extra configuration parameter. I'll take a look.
This test causes panic on current in somewhere in if_epair.
For some reason something odd was written to my Wi-Fi NICs ROM,
and now it does not work and MAC address is odd.
EDIT: NIC is fine, it was just system file damage.
The original test file was also lost (0 B file now).
I've just rewritten the test file as I remember it, and I will test it once I will get more time.
I've tried to do it this way now.
It will take some time for me to build and test this diff, as my keepassxc binary is apparently also corrupted
and I will have to spend some time checking for other file corruptions before I will get to test the diff.
It works.
Workaround test panic by being slower in the test.
Still runs under 2s on my laptop.
I see that some .orig files were created after svn add.
Do I need to remove them?
Again, thank you for your patience.
sys/netpfil/ipfw/nat64/nat64clat.c | ||
---|---|---|
115 ↗ | (On Diff #75014) | Should this be reverted? |
sys/netpfil/ipfw/nat64/nat64clat.c | ||
---|---|---|
115 ↗ | (On Diff #75014) | I removed these CSUM_DELAY_DATA bit off/on assignments and the test still passes. |
Improve test and do the same with kernel module as pf tests do.
Next time I will ask in mailing lists or IRC
and not be overly excited before submitting anything to avoid numerous diff changes.
Thank you.
Next time I will ask in mailing lists or IRC
and not be overly excited before submitting anything to avoid numerous diff changes.
Software development is more an art of desillustration and expectation management than about writing real code.
I deployed this on "real" 12.1 host (KVM VDS vtnet NIC and jails with epairs) today:
Without CSUM_DELAY_DATA bit on/off assignment PLAT (host) gets correct checksum from CLAT (jail) (from 201:db8:11::192.168.1.1)
*and* IPv4 interface on PLAT (vtnet0) sends correct IPv4 checksum to the Internet, as noted by tcpdump.
But the destination host still gets wrong checksum... Bring back this CSUM_DELAY_DATA bit magic,
as without it the Internet will get wrong checksum even if tcpdump on PLAT's side said outgoing checksum was OK.
It seems that VNET jail likely passes mbuf to the host, and not just IPv6 packets.
Removing CSUM_DELAY_DATA bit assignment from mbuf will result in breakage with VNET.
It seems that CSUM_DELAY_DATA bit is critical in deciding whether checksum was computed or not.
While here remove this test as it is naive, thus misleading.
Also it belongs to ipfw/nat64, not ipfw. Later I should write ipfw test suite to justify creating dirs.
As I said before, I have no real hardware with working checksum offload to test this
on a real network, not virtualized one. However I believe it should now work with both of them.
On previous test panic: after I reinstalled the world from freshly svn co'ed HEAD,
I still could not build freshly svn co'ed 12.1 with this diff due to undefined reference to "yydebug".
The corruption from that if_epair panic was broad enough to create new ZFS pool for OS from scratch.
I don't recommend running the first test version on bare metal at all.
I may also look into the reason of that panic later, it is interesting.
I sent a more generic patch in the reply to your email a week ago, can you check your spam folder and test it?
Gmail inbox does not have any mail even in spam. Seems to be a gmail problem.
Please resend to evgeniy@khramtsov.org, rspamd is saner than whatever gmail runs.
The same issue exists for other NAT64 instances. Update this diff to the version from E-Mail.
- Move in_delayed_cksum to nat64_translate.c and add in6_delayed_cksum.
- Don't check for rcvif. CSUM_DELAY_DATA(_IPV6) is set only for locally generated packets.
On nat64_output (netisr):
In ip(6)_output, locally generated packets are marked in csum_flags and csum_data
fields before netisr_queue(). Locally generated packets should also have a valid pseudo-header checksum.
nat64_output also calls netisr_queue(), but does not mark locally generated packets before.
The efficient way to handle locally generated packets with nat64_output, is to ensure that mbuf will have valid
converted pseudo-header checksum, then mark locally generated packets the same as in ip(6)_output.
I have an idea and I will test it later with nat64_output and em(4).
My netisr issues were most likely caused by using if_epair, which has its own parameter for netisr_queue().
in(6)_delayed approach also works for netisr, but not as efficiently as it could be.