Page MenuHomeFreeBSD

icmp6: Improve validation of PMTU
ClosedPublic

Authored by kd on Jul 21 2022, 7:23 AM.
Tags
None
Referenced Files
F52275173: D35871.diff
Mon, Nov 28, 9:49 PM
Unknown Object (File)
Sun, Nov 27, 2:50 AM
F45357640: rcv-invalid-ptb-ipv4.pkt
Jul 21 2022, 7:42 PM
F45357643: rcv-valid-ptb-ipv4.pkt
Jul 21 2022, 7:42 PM
F45357641: rcv-invalid-ptb-ipv6.pkt
Jul 21 2022, 7:42 PM
F45357642: rcv-valid-ptb-ipv6.pkt
Jul 21 2022, 7:42 PM

Details

Summary

Currently we accept any pmtu between IPV6_MMTU(1280B) and the link mtu.
In some network topologies could allow a bad actor to perform a DOS attack.
Contrary to IPv4 in IPv6 oversized packets are dropped, and a ICMP PACKET_TOO_BIG message is sent back to the sender.
After receiving an ICMPv6 packet with pmtu bigger than the current one the victim will start sending frames that will be dropped a router with reduced MTU.
Although it will eventually receive another message with correct pmtu, an attacker can still just inject their spoofed packets frequently enough to overwrite the correct value.
his issue is described in detail in RFC8201, section 6.
Fix this by checking the current pmtu, and accepting the new one only if it's smaller.

Discussed with: emaste, markj

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kd requested review of this revision.Jul 21 2022, 7:23 AM

Would it be difficult to add a regression test case for this using scapy and utilities in tests/sys/netinet6?

Would it be difficult to add a regression test case for this using scapy and utilities in tests/sys/netinet6?

Good idea. I'll try to figure something out.

In D35871#814578, @kd wrote:

Would it be difficult to add a regression test case for this using scapy and utilities in tests/sys/netinet6?

Good idea. I'll try to figure something out.

You could also use packetdrill... I can help (meaning provide some packetdrill scripts), if wanted.

Just a comment, actually not related to this patch, but related to the handling of PTB messages.
There is a fundamental difference between IPv4 and IPv6, which I don't understand.

For IPv4, some input validation is done in the ICMP layer and the further processing is delegated to the transport layer. SCTP and TCP then does some further validation (checking the verification tag in case of SCTP, checking the sequence number in case of TCP). Only after this check, the TCP hostcache is updated.
Of course, only a reduction of the PMTU is performed.

For IPv6, some input validation is done by the ICMP layer. However, then the TCP host cache is updated without any further validation by the transport layer and finally the processing is delegated to the transport layer. So an attacker does not have to guess the transport layer information to reduce the PMTU, since the ICMPv6 performs the action even if the transport layer detects the attack and does not perform the action (again).

So why is the modification of the TCP hostcache performed by the ICMPv6 stack at all? In my view, this just allows attacks...

For IPv6, some input validation is done by the ICMP layer. However, then the TCP host cache is updated without any further validation by the transport layer and finally the processing is delegated to the transport layer. So an attacker does not have to guess the transport layer information to reduce the PMTU, since the ICMPv6 performs the action even if the transport layer detects the attack and does not perform the action (again).

So why is the modification of the TCP hostcache performed by the ICMPv6 stack at all? In my view, this just allows attacks...

That's because PMTU is sort is mandatory in IPv6 due to L3 forwarding differences.

In IPv4 a packet that is too big is going to be fragmented by the L3 forwarding logic, unless it has a DF flag set.
In the DF case it'll be dropped and a PMTU ICMP message will be sent back to the sender.

Now when it comes to forwarding IPv6 always drops oversized packets.
The L3 fragmentation is supposed to be done only by the sender, who therefore has to know the PMTU.
Because of that the PMTU update logic has to be run for all protocols.

I suppose that in order to mitigate attacks a rather high lower-bound was set: 1280B.

In other words in IPv4 PMTU information is used in transport layer, e.g. TCP uses it to lower MSS.
In IPv6 world the ip6_output needs to know PMTU in order to fragment the packet.

In D35871#814666, @kd wrote:

In other words in IPv4 PMTU information is used in transport layer, e.g. TCP uses it to lower MSS.
In IPv6 world the ip6_output needs to know PMTU in order to fragment the packet.

But when the protocol inside is TCP / SCTP, a validation would be possible. Especially when the hostcache of a protocol other than IP is being modified. The returned PTB ICMP message is supposed to hold as much of the offending packet's header as possible.

At least with stateful protocols, this could also dramatically reduce the attack surface of spoofed PTB error messages, not?

But when the protocol inside is TCP / SCTP, a validation would be possible. Especially when the hostcache of a protocol other than IP is being modified. The returned PTB ICMP message is supposed to hold as much of the offending packet's header as possible.

At least with stateful protocols, this could also dramatically reduce the attack surface of spoofed PTB error messages, not?

The problem here is that an attacker could always just choose to generate a PTB message with some random UDP header inside.
In that case I don't think there is a way for us to validate it.

Here are some packetdrill tests showing the issue I was referring to:

, , , and .
In the case of an invalid PTB for IPv6, the TCP stack rejects it, but the host cache was modified anyway.

In D35871#814669, @kd wrote:

But when the protocol inside is TCP / SCTP, a validation would be possible. Especially when the hostcache of a protocol other than IP is being modified. The returned PTB ICMP message is supposed to hold as much of the offending packet's header as possible.

At least with stateful protocols, this could also dramatically reduce the attack surface of spoofed PTB error messages, not?

The problem here is that an attacker could always just choose to generate a PTB message with some random UDP header inside.
In that case I don't think there is a way for us to validate it.

I think such a packet should never be used to change the host cache. For SCTP over UDP or TCP over UDP, we do the same validation as done for SCTP or TCP. In case an application is running on top of UDP, the application must validate that the reflected packet is OK before changing the host cache. Right now, this is not possible. But one could, for connected UDP socket, provide the encapsulated packet using MSG_NOTIFICATION, let the application do its verification and if that is successful allow a socket option to put the data into the host cache. Willing to work on that...

Looking at the code: icmp6.c:1087 pretends that some validation is done, which is not true. This is done as part of the upper layer call in icmp6.c:1093. In addition, the ICMPv6 code calls tcp_hc_updatemtu() in icmp6.c:1155, which is then called again in the SCTP and TCP code. So just reming it from ICMPv6 code just works.

What do you think the validated parameter of icmp6_mtudisc_update() is about? For me it looks like an indication about the upper layer check performed. The XXX in icmp6.c:1087 is a hint...

But my main point is that IPv6 should not be substantially more risky than IPv4.

I think such a packet should never be used to change the host cache. For SCTP over UDP or TCP over UDP, we do the same validation as done for SCTP or TCP. In case an application is running on top of UDP, the application must validate that the reflected packet is OK before changing the host cache. Right now, this is not possible. But one could, for connected UDP socket, provide the encapsulated packet using MSG_NOTIFICATION, let the application do its verification and if that is successful allow a socket option to put the data into the host cache. Willing to work on that...

The problem here is that in the case of IPv6 we need to know the PMTU, regardless of the underlying protocol.
Take a look at ip6_output.c:1109.
The transmitted packets are fragmented to fit in PMTU.
That's because in IPv6 routers are not supposed to do any fragmentation by themselves.
So if we stumble upon one with reduced MTU our UDP packet will be dropped: ip6_forward.c:374.
IMO we can't really expect every application that uses UDP(v6) to figure out the potentially fluid PMTU on it's own.

Looking at the code: icmp6.c:1087 pretends that some validation is done, which is not true. This is done as part of the upper layer call in icmp6.c:1093. In addition, the ICMPv6 code calls tcp_hc_updatemtu() in icmp6.c:1155, which is then called again in the SCTP and TCP code. So just reming it from ICMPv6 code just works.

What do you think the validated parameter of icmp6_mtudisc_update() is about? For me it looks like an indication about the upper layer check performed. The XXX in icmp6.c:1087 is a hint...

IMO we can't really "validate" a packet just by parsing L2/L3 headers, i.e. IP address can always be spoofed.
In order to do a proper authentication we'd need establish trust, probably doing something similar to what TLS/IPSEC does.
But this can't really be done in ICMP layer.

But my main point is that IPv6 should not be substantially more risky than IPv4.

From what I understand it actually is, because routers that forward a packet can't fragment it anymore.

In D35871#814673, @kd wrote:

I think such a packet should never be used to change the host cache. For SCTP over UDP or TCP over UDP, we do the same validation as done for SCTP or TCP. In case an application is running on top of UDP, the application must validate that the reflected packet is OK before changing the host cache. Right now, this is not possible. But one could, for connected UDP socket, provide the encapsulated packet using MSG_NOTIFICATION, let the application do its verification and if that is successful allow a socket option to put the data into the host cache. Willing to work on that...

The problem here is that in the case of IPv6 we need to know the PMTU, regardless of the underlying protocol.
Take a look at ip6_output.c:1109.
The transmitted packets are fragmented to fit in PMTU.
That's because in IPv6 routers are not supposed to do any fragmentation by themselves.
So if we stumble upon one with reduced MTU our UDP packet will be dropped: ip6_forward.c:374.
IMO we can't really expect every application that uses UDP(v6) to figure out the potentially fluid PMTU on it's own.

If an application chooses UDP as its transport, there are two choices (see RFC 8085):

  1. Don't perform any kind of PMTUD and ensure that no packet is larger than 1280 bytes.
  2. Perform PLPMTUD. RFC 8899 provides a framework for doing it.

In any case, the application should avoid sending packets needing to be fragmented. See RFC8900.

Looking at the code: icmp6.c:1087 pretends that some validation is done, which is not true. This is done as part of the upper layer call in icmp6.c:1093. In addition, the ICMPv6 code calls tcp_hc_updatemtu() in icmp6.c:1155, which is then called again in the SCTP and TCP code. So just reming it from ICMPv6 code just works.

What do you think the validated parameter of icmp6_mtudisc_update() is about? For me it looks like an indication about the upper layer check performed. The XXX in icmp6.c:1087 is a hint...

IMO we can't really "validate" a packet just by parsing L2/L3 headers, i.e. IP address can always be spoofed.

Correct. And that is why you need to delegate any action to the upper layer. I guess this is why we have the pr_ctlinput() functions, which allow exactly that.

In order to do a proper authentication we'd need establish trust, probably doing something similar to what TLS/IPSEC does.
But this can't really be done in ICMP layer.

But my main point is that IPv6 should not be substantially more risky than IPv4.

From what I understand it actually is, because routers that forward a packet can't fragment it anymore.

I do not understand why that is a consequence.

In my view, inserting an entry in the TCP hostcache with NO validation is a security issue, since you shortcut any checks in the transport layer. I think this needs to be changed.
For example, RFC 9260 uses MUST to ensure that the validation happens before processing the packets. For TCP, RFC 5927 suggests to check the port numbers (giving 14 bit randomness) and checking the SEG.SEQ. This is not as good as SCTP, but much better what is currently done: an off-path attacker only need to know the IP address of the host to get malicious entries in the host cache and the IP address of the destination. I don't think it is acceptable that this impacts SCTP and TCP...

Hmm, ok I see.
So if I understand correctly you propose to remove the PMTU handling in icmpv6, and parse it only in the protocol layer?
I suppose that it will reduce the attack surface.
Also the RFC9260 you linked says that we SHOULD, and not MUST process PMTUD.
So I suppose that in some cases it's ok to ignore this packet.

On the other hand parsing PMTU in the upper layers doesn't completely solve the problem.
It's still possible to inject a spoofed packet that will pass, e.g. TCP layer checks.
I wonder if perhaps my change is enough here, i.e. as long as:

  1. The smallest accepted PMTU is 1280B
  2. We only reduce and not increase PMTU

spoofed ICMP packets won't do much damage.

I'll look into how this is handled in other OSes.

In D35871#815291, @kd wrote:

Hmm, ok I see.
So if I understand correctly you propose to remove the PMTU handling in icmpv6, and parse it only in the protocol layer?
I suppose that it will reduce the attack surface.
Also the RFC9260 you linked says that we SHOULD, and not MUST process PMTUD.
So I suppose that in some cases it's ok to ignore this packet.

Yes. Since the PTB messages are not transferred reliably, a data sender cannot assume that it receives them. So requesting a MUST does not make sense.

On the other hand parsing PMTU in the upper layers doesn't completely solve the problem.
It's still possible to inject a spoofed packet that will pass, e.g. TCP layer checks.
I wonder if perhaps my change is enough here, i.e. as long as:

  1. The smallest accepted PMTU is 1280B
  2. We only reduce and not increase PMTU

spoofed ICMP packets won't do much damage.

I agree. As far as I know, the SCTP and TCP handler enforce these rules.

I'll look into how this is handled in other OSes.

Great, thank you. As someone in my lab works on doing PMTUD for the UDP based transport QUIC, we are going to look into the API needed to perform PMTUD in the application layer for a UDP based transport protocol. Some extensions might be needed, for example to do the validation, but maybe also to send probe packets, I will come up potentially with some extensions to the socket API.

So in Linux icmpv6 logic the PTB message is processed in two steps:

  1. First the higher level protocol handler is called. This is equivalent to how our ctlinput scheme works. For example the esp6(ipsec) handler looks for a tunnel that matches the header from icmp packet and updates the pmtu.
  2. A separate logic for raw sockets is called. If a socket is found based on the ip addresses the pmtu is updated.

I suppose that we could do something similar.
Basically ctlinput routines for all the protocols should call tcp_hc_updatemtu after performing some basic validation.
Once that's done icmp6_mtudisc_update can safely be removed.
After taking a quick look I've found that at least the following functions will need to be updated:

  • ipsec6_ctlinput
  • udp6_common_ctlinput
  • rip6_ctlinput

As for this patch I think it should be committed as is, since it's a simple fix that can be MFC-ed, contrary to the bigger changes described above.

@tuexen What do you think?

In D35871#816337, @kd wrote:

So in Linux icmpv6 logic the PTB message is processed in two steps:

  1. First the higher level protocol handler is called. This is equivalent to how our ctlinput scheme works. For example the esp6(ipsec) handler looks for a tunnel that matches the header from icmp packet and updates the pmtu.
  2. A separate logic for raw sockets is called. If a socket is found based on the ip addresses the pmtu is updated.

I suppose that we could do something similar.
Basically ctlinput routines for all the protocols should call tcp_hc_updatemtu after performing some basic validation.

That makes sense. We should discuss the degree of checking somewhere else. As I said, I can come up with a framework for UDP.

Once that's done icmp6_mtudisc_update can safely be removed.
After taking a quick look I've found that at least the following functions will need to be updated:

  • ipsec6_ctlinput
  • udp6_common_ctlinput
  • rip6_ctlinput

As for this patch I think it should be committed as is, since it's a simple fix that can be MFC-ed, contrary to the bigger changes described above.

I agree.

@tuexen What do you think?

Thank you very much for looking into it.

This revision is now accepted and ready to land.Jul 27 2022, 10:53 AM
This revision was automatically updated to reflect the committed changes.