Page MenuHomeFreeBSD

siftr: remove barely used hash generation per record
ClosedPublic

Authored by cc on Apr 26 2023, 3:26 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 9, 9:08 AM
Unknown Object (File)
Thu, Jan 9, 8:54 AM
Unknown Object (File)
Thu, Jan 9, 8:40 AM
Unknown Object (File)
Thu, Jan 9, 6:08 AM
Unknown Object (File)
Wed, Jan 8, 6:11 PM
Unknown Object (File)
Wed, Jan 8, 5:06 PM
Unknown Object (File)
Dec 13 2024, 7:53 AM
Unknown Object (File)
Nov 24 2024, 1:10 PM

Details

Summary

Main reasons to remove this hash generation per record are:
(1) It is not enabled in default.

(2) IPv6 path does not have this.

(3) TCP blackbox can do a better job.

(4) TCP header fields can be used if needed.

The man page is updated as well.

Test Plan

Tested in Emulab testbed:
before:
enable_time_secs=1682519173 enable_time_usecs=838336 siftrver=1.3.0 sysname=FreeBSD sysver=1400088 ipmode=4
o,0x00000000,1682519173.838365,...,1073725440,14552,1026,66048,66048,9,9,4,1460,37937,1,16778212,260000,33580,148,65700,0,104,0,0,0
o,0x00000000,1682519173.838379,...,1073725440,14552,1026,66048,66048,9,9,4,1460,37937,1,16778212,260000,33580,184,65700,0,148,0,0,0
...
disable_time_secs=1682519176 disable_time_usecs=079243 num_inbound_tcp_pkts=9 num_outbound_tcp_pkts=8 total_tcp_pkts=17 num_inbound_skipped_pkts_malloc=0 num_outbound_skipped_pkts_malloc=0 num_inbound_skipped_pkts_tcpcb=0 num_outbound_skipped_pkts_tcpcb=0 num_inbound_skipped_pkts_inpcb=0 num_outbound_skipped_pkts_inpcb=0 total_skipped_tcp_pkts=0 flow_list=...,

after:
enable_time_secs=1682518987 enable_time_usecs=256874 siftrver=1.3.0 sysname=FreeBSD sysver=1400088 ipmode=4
i,1682518987.256902,...,1073725440,14480,1026,66048,66048,9,9,4,1460,39593,1,16778212,263000,33580,104,65700,0,104,0,0,0
o,1682518987.256915,...,1073725440,14584,1026,66048,66048,9,9,4,1460,35656,1,16778212,285000,33580,44,65700,0,0,0,0,0
...
disable_time_secs=1682518997 disable_time_usecs=159312 num_inbound_tcp_pkts=23 num_outbound_tcp_pkts=18 total_tcp_pkts=41 num_inbound_skipped_pkts_malloc=0 num_outbound_skipped_pkts_malloc=0 num_inbound_skipped_pkts_tcpcb=0 num_outbound_skipped_pkts_tcpcb=0 num_inbound_skipped_pkts_inpcb=0 num_outbound_skipped_pkts_inpcb=0 total_skipped_tcp_pkts=0 flow_list=...,

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 51218
Build 48109: arc lint + arc unit

Event Timeline

cc requested review of this revision.Apr 26 2023, 3:26 PM

Add date update in man page.

This revision is now accepted and ready to land.Apr 26 2023, 6:11 PM

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

This feature is not used AFAIK. We have less carbon print (at least 10 chars of "0x00000000") per record after removing this feature.

In D39835#906795, @cc wrote:

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

This feature is not used AFAIK. We have less carbon print (at least 10 chars of "0x00000000") per record after removing this feature.

OK, so I understand that it makes sense not to print it, if the feature is not used. But I have (in general) a hard time to tell, which features are used and which are not... So how do you know that you are not removing a feature considered important by some people?
An argument I see in favor of this change is that I'm not aware of any tool reading a .pcap file and computing the hashes for the packets.

In D39835#906795, @cc wrote:

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

This feature is not used AFAIK. We have less carbon print (at least 10 chars of "0x00000000") per record after removing this feature.

OK, so I understand that it makes sense not to print it, if the feature is not used. But I have (in general) a hard time to tell, which features are used and which are not... So how do you know that you are not removing a feature considered important by some people?
An argument I see in favor of this change is that I'm not aware of any tool reading a .pcap file and computing the hashes for the packets.

Let's assume this is a dead feature, or the original authors do not need it any longer. Also, a hash value is hard to use for a packet identification. I will consider tcp.seq, tcp.ack, and tcp.len in the tcphdr for such identification. Besides, the inconsistency between IPv4 and IPv6 for this feature does not make sense.

In D39835#906901, @cc wrote:
In D39835#906795, @cc wrote:

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

This feature is not used AFAIK. We have less carbon print (at least 10 chars of "0x00000000") per record after removing this feature.

OK, so I understand that it makes sense not to print it, if the feature is not used. But I have (in general) a hard time to tell, which features are used and which are not... So how do you know that you are not removing a feature considered important by some people?
An argument I see in favor of this change is that I'm not aware of any tool reading a .pcap file and computing the hashes for the packets.

Let's assume this is a dead feature, or the original authors do not need it any longer. Also, a hash value is hard to use for a packet identification. I will consider tcp.seq, tcp.ack, and tcp.len in the tcphdr for such identification. Besides, the inconsistency between IPv4 and IPv6 for this feature does not make sense.

On a second thought, if there is really someone who uses this feature. That also means this user knows how/where to add this missing code to study TCP, under the BSD license.

In D39835#906912, @cc wrote:
In D39835#906901, @cc wrote:
In D39835#906795, @cc wrote:

What do we gain from removing this feature? My understanding is that if it is not enabled, it does not have a substantial CPU overhead. I have no idea who uses siftr and if they use this feature...

This feature is not used AFAIK. We have less carbon print (at least 10 chars of "0x00000000") per record after removing this feature.

OK, so I understand that it makes sense not to print it, if the feature is not used. But I have (in general) a hard time to tell, which features are used and which are not... So how do you know that you are not removing a feature considered important by some people?
An argument I see in favor of this change is that I'm not aware of any tool reading a .pcap file and computing the hashes for the packets.

Let's assume this is a dead feature, or the original authors do not need it any longer. Also, a hash value is hard to use for a packet identification. I will consider tcp.seq, tcp.ack, and tcp.len in the tcphdr for such identification. Besides, the inconsistency between IPv4 and IPv6 for this feature does not make sense.

On a second thought, if there is really someone who uses this feature. That also means this user knows how/where to add this missing code to study TCP, under the BSD license.

Some more thinking about this at my end: Since the hash algorithm is not documented, someone who uses this feature must be able to read the kernel code and write the corresponding application code. Such people can handle the removal of the code.
Regarding the IPv4/IPv6 difference: I don't see why today we don't enable IPv6 support, if compiled in the kernel.