Draft: pf: Switch pf_route() to PACKET_TAG_IPFORWARD tag
AbandonedPublic
Actions

Authored by vegeta_tuxpowered.net on Aug 20 2023, 6:30 PM.

Details

Reviewers

Summary

The pf_route() functions bypass the normal routing mechanism thus need to re-implement some functionalities on their own, like fragmentation, ICMP signalling and testing the outbound firewall. Because of this for route-to targets functions depending on pf_send_tcp(), like syncookies, won't work and the dtrace probe for pf_test() is never called and packets delayed by dummynet need to be routed and emitted again. The current approach also skips things like ipsec.

This draft proposes the following solution: have the pf_route() and pf_route6() functions use the PACKET_TAG_IPFORWARD tag and let the FreeBSD IP stack handle the routing. This unfortunately does not work out of the box, see D41479. Furthermore the change breaks the ability to specify the interface for the redirection pools - redirection target must be a reachable address and a matching interface will be used. This patch does not deal with removing the interface specification from pf.conf and other structures like pfsync. OpenBSD has already removed the ability to force the interface and greatly simplified the redirection pool logic some years ago. I've already looked into that as a part of my NAT64 backport effort and when I have time to get back to it I'd like to port the better redirection pools and remove the interface from them anyway.

At the moment of writing this the patch breaks one test: pf/route_to:icmp_nat. After some investigation I must say that handling of post-NAT ICMP is broken in pf in general. A ruleset with no route-to, just plain forwarding and a NAT rule on an outbound interface of the router jail with small MTU will cause exactly the same issue. The function icmp_error eventually results in a call to ip_output which will only call pfil_mbuf_out which would only match the inbound pf state of the router jail. Maybe we should call pfil_mbuf_in from icmp_send?

Also I don't understand why the test uses pass out route-to instead of pass in route-to, as with such approach the gateway and possible the interface will be changed after a state is already created on another interface. With set state-policy if-bound such ruleset could not work anyway if the interface is changed.

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

vegeta_tuxpowered.net created this revision.Aug 20 2023, 6:30 PM

Herald added subscribers: glebius, melifaro, farrokhi and 2 others. · View Herald TranscriptAug 20 2023, 6:30 PM

vegeta_tuxpowered.net requested review of this revision.Aug 20 2023, 6:30 PM

vegeta_tuxpowered.net edited the summary of this revision. (Show Details)

I'm generally in agreement that pf_route() approach isn't the best, but I'm also very, very afraid of making major changes there, because a lot of rulesets out there rely on it, and any change we make is going to break things and come with a pile of bug reports. My enthusiasm for wading through dozens of bug reports trying to understand if it's a configuration error, misguided setup or real bug is not particularly high.

Iirc the pf/route_to:icmp_nat test mimics something pfsense does, and I expect I replicated the core rules of that setup.

In D41517#946200, @kp wrote:

I'm generally in agreement that pf_route() approach isn't the best, but I'm also very, very afraid of making major changes there, because a lot of rulesets out there rely on it, and any change we make is going to break things and come with a pile of bug reports. My enthusiasm for wading through dozens of bug reports trying to understand if it's a configuration error, misguided setup or real bug is not particularly high.

I could make it similar to what I've done with the scrub rules. If an user uses the route-to ( iface <pool> ) syntax, they get the old, hacky pathway. If the modern OpenBSD syntax is used, the new pathway is used. Would that be acceptable for you? I've started working on the syntax change some months ago, I could resurrect that project.

In D41517#947063, @vegeta_tuxpowered.net wrote:

I could make it similar to what I've done with the scrub rules. If an user uses the route-to ( iface <pool> ) syntax, they get the old, hacky pathway. If the modern OpenBSD syntax is used, the new pathway is used. Would that be acceptable for you? I've started working on the syntax change some months ago, I could resurrect that project.

That sounds like a good approach.

I owe you a more in-depth look at the current version too, but I don't quite have cycles for a deep look in the next few days.

Yeah, I think you're taking the right direction here, but sadly I don't think we can get rid of the old behaviour right now.
One of the big users of route-to is pfSense's multi-wan support, where we basically have two default routes and pf is used to direct traffic down one or the other link.

In D41517#953290, @kp wrote:

Yeah, I think you're taking the right direction here, but sadly I don't think we can get rid of the old behaviour right now.
One of the big users of route-to is pfSense's multi-wan support, where we basically have two default routes and pf is used to direct traffic down one or the other link.

I think you should be able to use gateways on different interfaces just fine with this. It's just that the interface can't be explicitly specified and is automatically found from the gateway IP address.

A more complete version of this change is https://reviews.freebsd.org/D8877 which has been in OPNsense for many years. The most problems we have is not being in the FreeBSD tree and subtle breakage due to related changes in the netinet code. If anyone wants to review and commit I'm happy to update it, but I need a commitment and you will have mine. :)

vegeta_tuxpowered.net abandoned this revision.Sep 17 2025, 3:48 PM