A macro ipv6_masklen() in sys/netgraph/netflow/netflow.c is using four bitcount32() calls. It is inefficient on 64-bit platforms. This patch re-implements it with bit_count(). On amd64, for example, both Clang and GCC generate two popcntq instructions instead of four popcntl instructions when POPCNT is available.
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
sys/netgraph/netflow/netflow.c | ||
---|---|---|
153 | mask will always be of the form /^1*0*$/, right? That is, a bunch of ones followed by a bunch of zeros? If so, you don't need bit_count. You could use bit_ffc. bit_ffc is probably faster, especially on pre-Nehalem platforms that don't have POPCNT. However, you'd have to be careful to make sure that it works on all endiannesses and word sizes. | |
155 | I really don't like overloading len here. It has two very different meanings. Could you use separate variables instead? |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
157 | Can this be a constant instead of a hardcoded value? |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
17 | mlen should be initialized to the computed value too, not to a hard-coded 128. |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
157 | Sure, I can go back to the original code, i.e., reusing mlen, if that's what you mean. :-) mlen = sizeof(mask->sin6_addr) * NBBY; if (info->rti_addrs & RTA_NETMASK) bit_count((bitstr_t *)&mask->sin6_addr, 0, mlen, &mlen); |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
157 | No, please don't assign two different meanings to mlen. I think that the best solution is something like the following, because it avoids magic numbers and also assigns only a single meaning for each variable: if (info->rti_addrs & RTA_NETMASK) bit_count((bitstr_t *)&mask->sin6_addr, 0, sizeof(mask->sin6_addr) * NBBY, &mlen); else mlen = sizeof(mask->sin6_addr) * NBBY; |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
157 | In fact, these two numbers are essentially the same thing, i.e., when netmask is not set use full 128-bit. I really don't want to duplicate sizeof(mask->sin6_addr) * NBBY. :-( |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
153 | Looking at the problem in more detail, I see that using bit_ffc as I suggested would require some ntohl operations on LE architectures. I don't know whether that's faster than using bit_count, but I do know that it's uglier. I would approve of using bit_count. | |
157 | It's fine if you want to assign sizeof(mask->sin6_addr) * NBBY to some other const variable. I just don't like assigning two separate meanings to mlen (the size of the bitstr, and also the size of the network). I find it hard to reason about code when variables' meanings change. It's also not how most compilers work. I try to think in the single static assignment paradigm, because it's what my compiler will do. How about this? const int sin6_addr_size = sizeof(mask->sin6_addr) * NBBY; if (info->rti_addrs & RTA_NETMASK) bit_count((bitstr_t *)&mask->sin6_addr, 0, sin6_addr_size, &mlen); else mlen = sin6_addr_size; |
sys/netgraph/netflow/netflow.c | ||
---|---|---|
157 | I think making mask->sin6_addr into a #define would result in the biggest net win, since it's used three times. Also, sin6_addr_size should be num_bits_in_sin6_addr. |