Page MenuHomeFreeBSD

Replace the fastforward path with tryforward which does not require a sysctl and will always be on. The former split between default and fast forwarding is removed by this commit while preserving the ability to use all network stack features.
AbandonedPublic

Authored by gnn on Sep 27 2015, 12:54 AM.

Details

Reviewers
melifaro
olivier_cochard.me
Group Reviewers
network
Test Plan

Tested using iperf3 and Conductor. Results presented
at vBSDCon 2015 in the talk "Made to Measure."

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 978
Build 978: arc lint + arc unit

Event Timeline

gnn updated this revision to Diff 8948.Sep 27 2015, 12:54 AM
gnn retitled this revision from to Replace the fastforward path with tryforward which does not require a sysctl and will always be on. The former split between default and fast forwarding is removed by this commit while preserving the ability to use all network stack features..
gnn updated this object.
gnn edited the test plan for this revision. (Show Details)
eri added a subscriber: eri.Sep 27 2015, 2:38 PM

This still breaks IPSEC scenarios at least.

ae added a subscriber: ae.Sep 27 2015, 4:03 PM

I want to note, that this patch can significantly change the "fastforward path".
Currently, for example, we can configure cxgbe(4) with 30 queues bound to 30 CPU cores and with fastforwarding enabled they will work in parallel.
With this patch all 30 queues will go through netisr. And depending to dispatch policy this configuration will work very differently.

gnn added a comment.Sep 27 2015, 9:51 PM

Understood, but that is an issue to be handled in the netisr and related layers, and not in the IP layer.

I admit I substantially prefer this approach to the prior approach, given that much of the fast-forwarding win was from direct dispatch -- something we now do by default for local-destination traffic as well (and have done for ten years).

Looking at the other comments, it seems that one concern is that one might want to use netisr for local traffic but direct dispatch for forwarded traffic -- I'm not sure I understand the use case for that, but if there is an important use case, one could potentially try to use separate netisr queues for forwarding vs local delivery for IP. In which case you could do a netisr defer point in ip_tryforward() itself, taking some care to properly handle the case where both are set to defer, not wanting to defer twice (perhaps via an m_tag?).

In general, though, I continue to take the view that direct dispatch continues to be the right work distribution policy for the majority of workloads. The main exception I encountered previously was in IPSEC processing, which may need more support for deferred execution. We will need to continue to improve lock scalability as parallelism opportunities continue to expand.

In your performance measurements, were you able to look at the impact of this change on lower-end MIPS router hardware?

ae added a comment.Sep 28 2015, 9:11 AM

I think if you will resurrect fastforwarding sysctl, this will help with IPSEC and system will not do unexpected routing by default.

eri added a comment.Sep 28 2015, 10:53 AM

The problem with IPSec is that this might leak traffic that otherwise would be used by ipsec.
Probably even looped traffic might leak with this scenarios from proxies....

melifaro edited edge metadata.Sep 28 2015, 11:26 AM

(side note: With more parallelism coming to kernel and future work on batching, fast and simple packet processing functions like ip_fastfwd are really needed.)
Given that, it would be great to attach ip_fastfwd() in a standard way (e.g. either a netisr handler or call from netisr handler function).

There was a discussion on different approach: switch NETISR_IP handler based on ip_forward sysctl (e.g. set ip_fastfwd() as handler for router), but
it didn't ended with a (good enough) patch.

There are several concerns for embedding ip_fastfwd() inside ip_input() :

  • IPSEC processing should be converted to use PFIL hooks (or use any other low-overhead method) before committing this
  • ip_tryforward() should provide feedback if packet is to-us (otherwise we would add in_localip() call for non-routers)

I'll (and hopefull Olivier will also) perform some benches to see how this impacts forwarding)

Btw, the review description is a bit misleading since we are not eliminating "traditional" forwarding (handling packets with options, etc..) (and currently we're not preserving IPSEC functionality)

feld added a subscriber: feld.Sep 28 2015, 7:58 PM

I've used a head r287478 and applied the patch from this review to it.
This new mode is just a very very little slower than the previous fastforwarding mode. The 5% error rate of the result didn't allow to precisely measure this difference.

My bench rresults on 3 differents hardware here: (check the ministat result more than the graphical results)
https://github.com/ocochard/netbenches/blob/master/AMD_G-T40E_2Cores_RTL8111E/fastforwarding-pf-ipfw/results/fbsd11-head.r287478-D3737/README.md

https://github.com/ocochard/netbenches/blob/master/Atom_C2558_4Cores-Intel_i350/fastforwarding-pf-ipfw/results/fbsd11-head.r287478-D3737/README.md

https://github.com/ocochard/netbenches/blob/master/Xeon_E5-2650-8Cores-Chelsio_T540-CR/fastforwarding-pf-ipfw.4nxq10g/results/fbsd11-head.r287478-D3737/README.md

loos added a subscriber: loos.Oct 4 2015, 12:44 AM
gnn updated this revision to Diff 9823.Oct 30 2015, 3:49 PM
gnn edited edge metadata.
  • Only update header size if we have any outbound security policies.
  • Short circuit expensive IPSEC calls when there are no security policies.

Updating D3737: Replace the fastforward path with tryforward which does not require

a sysctl and will always be on. The former split between default and
fast forwarding is removed by this commit while preserving the ability
to use all network stack features.

gnn abandoned this revision.Oct 30 2015, 8:27 PM

Replaced by this review which includes code to deal with the IPSEC case.

https://reviews.freebsd.org/D4042