Page MenuHomeFreeBSD

Reduce overhead of per-packet processing by ipfw(4)
ClosedPublic

Authored by ae on Dec 30 2018, 6:10 PM.

Details

Summary

ipfw(4) can be invoked by pfil(9) framework for each packet several times. Each call uses on-stack variable of type struct ip_fw_args to keep the state of ipfw(4) processing. Currently this variable has 240 bytes size on amd64. And each time ipfw(4) does bzero() on it, and then it initializes some fields. glebius@ has reported that they at Netflix discovered, that initialization of this variable produces significant overhead on packet processing. I did some investigations and after my patch I managed to increase performance of packet processing on simple routing with ipfw(4) firewalling to about 11%.
What contains the patch:

  • the size of struct ip_fw_args reduced up to 128 bytes. The dummypar field seems unused, thus I removed it.
  • next_hop* fields are grouped into one union. Ethernet header pointer also stored into this union, since forwarding doesn't work on layer2.
  • new field flags introduced, it keeps flags that are used to avoid access and initialization of some fields of struct ip_fw_args.
  • the rest of code modified to honor these flags.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ae created this revision.Dec 30 2018, 6:10 PM
ae edited the summary of this revision. (Show Details)Dec 30 2018, 6:26 PM
gallatin accepted this revision.Dec 30 2018, 9:10 PM

I really like collapsing all those pointers, and making the NULL checks flags. Great work.

I wonder if more collapsing could be done. For example, it seems like src_ip and dst_ip could share unions with src_ip6 and dst_ip6 in struct ipfw_flow_id(). That would take us down to 136 bytes. Only 8 more bytes and it would fit in 2 64 byte cachelines.

Another comment is that in some experiments I did, there was benefit to move f_id and oif to the bottom of the structure, though that is likely very workload dependent.

This revision is now accepted and ready to land.Dec 30 2018, 9:10 PM
ae added a comment.Dec 31 2018, 3:30 PM

I really like collapsing all those pointers, and making the NULL checks flags. Great work.
I wonder if more collapsing could be done. For example, it seems like src_ip and dst_ip could share unions with src_ip6 and dst_ip6 in struct ipfw_flow_id(). That would take us down to 136 bytes. Only 8 more bytes and it would fit in 2 64 byte cachelines.

I think it is possible to change type of hopstore6. Something like this will fit ip_fw_args into 128 bytes:

struct ipfw_flow_id {
        union {
                uint32_t        dst_ip;
                struct in6_addr dst_ip6;
        };  
        union {
                uint32_t        src_ip;
                struct in6_addr src_ip6;
        };  
        uint16_t        dst_port;
        uint16_t        src_port;
        uint8_t         fib;    /* XXX: must be uint16_t */
        uint8_t         proto;
        uint8_t         _flags; /* protocol-specific flags */
        uint8_t         addr_type; /* 4=ip4, 6=ip6, 1=ether ? */
        uint32_t        flow_id6;
        uint32_t        extra; /* queue/pipe or frag_id */
};

struct ip_fw_nh6 {
        struct in6_addr         sin6_addr;
        uint32_t                sin6_scope_id;
        uint16_t                sin6_port;
};
struct ip_fw_args {
        struct mbuf             *m;     /* the mbuf chain               */
        struct ifnet            *oif;   /* output interface             */
        uint32_t                flags;
        struct ipfw_flow_id     f_id;   /* grabbed from IP header       */
        struct inpcb            *inp;
        union {
                struct ether_header     *eh;    /* for bridged packets  */
                struct sockaddr_in      *next_hop;
                struct sockaddr_in6     *next_hop6;
                /* ipfw next hop storage */
                struct sockaddr_in      hopstore;
                struct ip_fw_nh6        hopstore6;
        };
        struct ipfw_rule_ref    rule;   /* match/restart info           */
};

Another comment is that in some experiments I did, there was benefit to move f_id and oif to the bottom of the structure, though that is likely very workload dependent.

I'll try test this after several days... :)

ae edited the summary of this revision. (Show Details)Jan 1 2019, 9:40 AM
ae updated this revision to Diff 52466.
  • reordered some fields and reduced the size of hopstore6, now ip_fw_args fits into 128 bytes.
  • the fwd code adjusted to use new hopstore6.
This revision now requires review to proceed.Jan 1 2019, 9:40 AM
gallatin accepted this revision.Jan 1 2019, 9:10 PM
This revision is now accepted and ready to land.Jan 1 2019, 9:10 PM

Here are my classical DoS benches with ipfw (stateless with one rule "allow ip from any to any") results on 3 different setup:

PC Engines APU 2 (small AMD GX 4 core 1Ghz, intel NIC), +4.65%:

x r342759: inet4 packets per second
+ r342759 and D18690: inet4 packets per second
+--------------------------------------------------------------------------+
|xx x x  x                                              +     +    +  +   +|
||__A___|                                                                  |
|                                                         |_______AM_____| |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        598521        602106        599859        600100     1408.9658
+   5        623489        631854      628724.5      628019.9     3262.8851
Difference at 95.0% confidence
        27919.9 +/- 3665.25
        4.65254% +/- 0.615326%
        (Student's t, pooled s = 2513.13)

Netgate RCC-VE (small Atom C2558 4 core 2.40GHz, intel NIC), +3.57%:

x r342759: inet4 packets per second
+ r342759 and D18690: inet4 packets per second
+--------------------------------------------------------------------------+
|x x              x  x         x      +                    ++           ++ |
| |____________A__M_________|                                              |
|                                             |_____________A_____________||
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5        821429      841078.5        832432      830433.1     8366.0775
+   5        845870        868142        859685      860085.3     8989.1664
Difference at 95.0% confidence
        29652.2 +/- 12664
        3.57069% +/- 1.55049%
        (Student's t, pooled s = 8683.21)

Medium Xeon E5-2650 v2 - 8 core at 2.60GHz, Chelsio 10G NIC, + 0.3%

x r342759: inet4 packets per second
+ r342759 and D18690: inet4 packets per second
+--------------------------------------------------------------------------+
|      x  x x  x                     + +         x    +            +      +|
||__________M_____A_________________|                                      |
|                                    |________________A________________|   |
+--------------------------------------------------------------------------+
    N           Min           Max        Median           Avg        Stddev
x   5       7358553     7384995.5       7361597     7365790.3     10885.603
+   5       7377384       7400871       7387990     7388270.3       10453.8
Difference at 95.0% confidence
        22480 +/- 15564.3
        0.305195% +/- 0.211641%
        (Student's t, pooled s = 10671.9)
melifaro added inline comments.Jan 6 2019, 12:03 AM
sys/netpfil/ipfw/ip_fw_pfil.c
175 ↗(On Diff #52466)

Would it be possible to retain len variable here? Mixing error code and length in one variable in the same piece of code will eventually lead to one bug or another..

This revision was automatically updated to reflect the committed changes.
ae marked an inline comment as done.Jan 10 2019, 9:36 AM