Page MenuHomeFreeBSD

Duplicate frames only once when using dup-to pf rule
ClosedPublic

Authored by yannis.planus_alstomgroup.com on Oct 30 2020, 9:05 AM.
Tags
None
Referenced Files
F107367212: D27018.id81087.diff
Mon, Jan 13, 3:27 AM
Unknown Object (File)
Sat, Jan 11, 11:44 AM
Unknown Object (File)
Wed, Jan 1, 12:21 PM
Unknown Object (File)
Wed, Jan 1, 2:53 AM
Unknown Object (File)
Wed, Dec 25, 9:44 PM
Unknown Object (File)
Wed, Dec 25, 9:00 PM
Unknown Object (File)
Wed, Dec 25, 8:27 AM
Unknown Object (File)
Nov 28 2024, 1:28 PM

Details

Summary

When using DUP-TO rule, frames are duplicated 3 times on both output interfaces and duplication interface. Add a flag to not duplicate a duplicated frame.

Inspired by a patch from Miłosz Kaniewski milosz.kaniewski at gmail.com
https://lists.freebsd.org/pipermail/freebsd-pf/2015-November/007886.html

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Can you describe a setup that demonstrates the problem this fixes?
(Ideally in the form of a test case, but if you can describe a setup I can probably write the test case.)

Hi Kristof,

I don't know which is the format of a test case for you but I reproduced this problem with the following architecture/steps:

+-----------------+       +-----------------------+        +-----------------+
|                 |       |                       |        |                 |
|    TCP client   |-------|     FreeBSD router    |--------|   TCP Server    |
|                 |       |em1                 em2|        |                 |
+-----------------+       +-----------------------+        +-----------------+
  • Add dup-to rule in PF (ex: pass in quick on { em1 } dup-to (lo0 127.0.0.1) inet proto tcp from any to any flags S/SA keep state . Behavior is the same with a pass out rule)
  • Start tcpdump captures on:
    • TCP client side
    • TCP server side
    • lo0 on FreeBSD router
  • Start a TCP connection between Client and Server

Captures will show TCP duplication after going through the FreeBSD router

Let me know if you need more information.

I can reproduce the problem at least. I'll see if I can get a sane test case out of it as well.

Do you know why we see three duplicates, as opposed to an endless loop of dupes?

sys/netpfil/pf/pf.c
5681

Why this check? Why not set the PF_DUPLICATED flag in the case we actually duplicate?

(Also, why not just oifp == s->rt_kif->pfik_ifp?)

Also, it appears to panic:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer     = 0x20:0xffffffff80d94355
stack pointer           = 0x28:0xfffffe001bd89920
frame pointer           = 0x28:0xfffffe001bd899a0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi1: netisr 0)
trap number             = 9
panic: general protection fault
cpuid = 2
time = 1604412691
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe001bd89630
vpanic() at vpanic+0x182/frame 0xfffffe001bd89680
panic() at panic+0x43/frame 0xfffffe001bd896e0
trap_fatal() at trap_fatal+0x387/frame 0xfffffe001bd89740
trap() at trap+0xa4/frame 0xfffffe001bd89850
calltrap() at calltrap+0x8/frame 0xfffffe001bd89850
--- trap 0x9, rip = 0xffffffff80d94355, rsp = 0xfffffe001bd89920, rbp = 0xfffffe001bd899a0 ---
ip_input() at ip_input+0x4d5/frame 0xfffffe001bd899a0
swi_net() at swi_net+0x1a1/frame 0xfffffe001bd89a20
ithread_loop() at ithread_loop+0x279/frame 0xfffffe001bd89ab0
fork_exit() at fork_exit+0x80/frame 0xfffffe001bd89af0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001bd89af0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---

It looks like this happens because we enqueue an mbuf and free it later. ip_input() has an mbuf pointer to a freed mbuf here.

In D27018#604230, @kp wrote:

Do you know why we see three duplicates, as opposed to an endless loop of dupes?

Yes, the loop of dupes is stopped by the check of pd->pf_mtag->routed in this part :

        if ((pd->pf_mtag == NULL &&
	    ((pd->pf_mtag = pf_get_mtag(*m)) == NULL)) ||
	    pd->pf_mtag->routed++ > 3) {
		m0 = *m;
		*m = NULL;
		goto bad_locked;
	}
In D27018#604255, @kp wrote:

Also, it appears to panic:
...
It looks like this happens because we enqueue an mbuf and free it later. ip_input() has an mbuf pointer to a freed mbuf here.

Oh, I missed it... I'll have a look at it, how do you produce this panic? Do you know if the mbuf pointer of ip_input() corresponds to the real one or the dupliacted one?

sys/netpfil/pf/pf.c
5681

Why this check? Why not set the PF_DUPLICATED flag in the case we actually duplicate?

For me there are three different cases:
1- the real frame which should be duplicated (oifp corresponding to em2 itf in this example)
2- the duplicated frame which should pass (oifp corresponding to lo0 itf in this example)
3- the others frames which should be dropped. (oifp corresponding to lo0 itf in this example)

the PF_DUPLICATED flag is used to check if we are in case 2 or in case 3.

(Also, why not just oifp == s->rt_kif->pfik_ifp?)

I usually compare the if_index variables but you're right, we can compare oifp directly

In D27018#604255, @kp wrote:

Also, it appears to panic:
...
It looks like this happens because we enqueue an mbuf and free it later. ip_input() has an mbuf pointer to a freed mbuf here.

Oh, I missed it... I'll have a look at it, how do you produce this panic? Do you know if the mbuf pointer of ip_input() corresponds to the real one or the dupliacted one?

I've got this start of a test case and then ping -c 1 198.51.100.3 from the host machine. I also have tcpdump running in the jail, but that's not likely to be relevant.

dup_to_body()
{
        pft_init

        epair_send=$(vnet_mkepair)
        ifconfig ${epair_send}a 192.0.2.1/24 up

        epair_recv=$(vnet_mkepair)
        ifconfig ${epair_recv}a up

        vnet_mkjail alcatraz ${epair_send}b ${epair_recv}b
        jexec alcatraz ifconfig ${epair_send}b 192.0.2.2/24 up
        jexec alcatraz ifconfig ${epair_recv}b 198.51.100.2/24 up
        jexec alcatraz sysctl net.inet.ip.forwarding=1
        jexec alcatraz arp -s 198.51.100.3 00:01:02:03:04:05
        route add -net 198.51.100.0/24 192.0.2.2

        jexec alcatraz pfctl -e

        pft_set_rules alcatraz "pass in on { ${epair_send}b } dup-to (lo0 127.0.0.1)"

        sleep 3600
}

When I find time and motivation I'll see about making the test actually check to make sure we don't have extra duplications, but that has to be done manually for now.

In D27018#604562, @kp wrote:
In D27018#604255, @kp wrote:

Also, it appears to panic:
...
It looks like this happens because we enqueue an mbuf and free it later. ip_input() has an mbuf pointer to a freed mbuf here.

Oh, I missed it... I'll have a look at it, how do you produce this panic? Do you know if the mbuf pointer of ip_input() corresponds to the real one or the dupliacted one?

I've got this start of a test case and then ping -c 1 198.51.100.3 from the host machine. I also have tcpdump running in the jail, but that's not likely to be relevant.

dup_to_body()
{
        pft_init

        epair_send=$(vnet_mkepair)
        ifconfig ${epair_send}a 192.0.2.1/24 up

        epair_recv=$(vnet_mkepair)
        ifconfig ${epair_recv}a up

        vnet_mkjail alcatraz ${epair_send}b ${epair_recv}b
        jexec alcatraz ifconfig ${epair_send}b 192.0.2.2/24 up
        jexec alcatraz ifconfig ${epair_recv}b 198.51.100.2/24 up
        jexec alcatraz sysctl net.inet.ip.forwarding=1
        jexec alcatraz arp -s 198.51.100.3 00:01:02:03:04:05
        route add -net 198.51.100.0/24 192.0.2.2

        jexec alcatraz pfctl -e

        pft_set_rules alcatraz "pass in on { ${epair_send}b } dup-to (lo0 127.0.0.1)"

        sleep 3600
}

When I find time and motivation I'll see about making the test actually check to make sure we don't have extra duplications, but that has to be done manually for now.

Sorry for the delay.
I developed/tested this patch on FreeBSD 11.2. I just tried it on FreeBSD 12.1 RELEASE and don't get any Panic... Which version are you using for your tests?
Also, what is the expected behavior of dup-to (lo0 127.0.0.1) in case of "set skip on lo0" is set in the configuration file?

I developed/tested this patch on FreeBSD 11.2. I just tried it on FreeBSD 12.1 RELEASE and don't get any Panic... Which version are you using for your tests?

I'm testing on CURRENT, because that's where changes go in first.

Also, what is the expected behavior of dup-to (lo0 127.0.0.1) in case of "set skip on lo0" is set in the configuration file?

I'd expect duplicated packets to still go to 127.0.0.1, because the set skip is applied on the inbound packet, not to dup'ed packets.
We check for 'set skip' on the entry point in pf_test(), and if the interface is marked as skipped we unconditionally PF_PASS.

Hi,

Because the duplicated packet goes to lo0 (with a dup-to (lo0 127.0.0.1)), it will be read as a new input packet.
If the dup-to is used to get routed packet, it must be applied on the "pass out" . Otherwise, the duplicated packets will be routed to the external interface again.
If the dup-to is applied on the "pass out", this packet will be consumed/removed by this patch before being routed.

So given the test script above, doing a ping -c 1 198.51.100.3 produces one packet on lo0 and two on ${epair_recv}b. Is that expected?

In D27018#620990, @kp wrote:

So given the test script above, doing a ping -c 1 198.51.100.3 produces one packet on lo0 and two on ${epair_recv}b. Is that expected?

Yes with the following rule : pass in on { ${epair_send}b } dup-to (lo0 127.0.0.1)
With the following rule on the out interface, only one frame will be produced on ${epair_recv}b:
pass out on { ${epair_recv}b } dup-to (lo0 127.0.0.1)

Sorry, not yet. It's high on my todo list, but I've not yet found the time.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 28 2021, 4:49 PM
This revision was automatically updated to reflect the committed changes.