TCP Blackbox Recorder
Closed, Public

Authored by jtl on Jun 7 2017, 10:28 PM.

Details

Summary

This is the blackbox recorder code we discussed in the transport session at the BSDCan Developer Summit.

It allows you to capture events on a TCP connection in a ring buffer, storing metadata with each event. It can optionally store the TCP header associated with an event (if the event is associated with a packet), and it can also optionally store information about the socket.
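As a rough illustration of the ring-buffer behaviour described above, here is a minimal user-space sketch in C. It is not the code from this revision: every name (bb_event, bb_ring, bb_ring_record, and so on), the tiny slot count, and the choice of metadata fields are assumptions made for the example. It only shows the general idea of keeping the most recent events, each with some metadata and an optional copy of a TCP header.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All names here are hypothetical; they are not the identifiers used in the patch. */
#define BB_RING_SLOTS 4   /* tiny capacity so wraparound is easy to see */
#define BB_HDR_MAX    60  /* largest possible TCP header: 20 bytes + 40 bytes of options */

struct bb_event {
    uint64_t timestamp;        /* metadata: when the event happened (assumed field) */
    uint8_t  event_id;         /* metadata: which event this is (assumed field) */
    bool     has_hdr;          /* true only if a TCP header was captured */
    uint8_t  hdr_len;          /* number of valid bytes in hdr */
    uint8_t  hdr[BB_HDR_MAX];  /* optional copy of the TCP header */
};

struct bb_ring {
    struct bb_event slots[BB_RING_SLOTS];
    size_t next;               /* index of the slot that will be written next */
    size_t count;              /* number of valid events, at most BB_RING_SLOTS */
};

static void
bb_ring_init(struct bb_ring *r)
{
    memset(r, 0, sizeof(*r));
}

/*
 * Record one event. When the ring is full, the oldest event is overwritten.
 * Pass hdr == NULL for events that are not associated with a packet.
 */
static void
bb_ring_record(struct bb_ring *r, uint64_t ts, uint8_t id,
    const void *hdr, size_t hdr_len)
{
    struct bb_event *e = &r->slots[r->next];

    memset(e, 0, sizeof(*e));
    e->timestamp = ts;
    e->event_id = id;
    if (hdr != NULL && hdr_len > 0) {
        if (hdr_len > BB_HDR_MAX)
            hdr_len = BB_HDR_MAX;  /* truncate rather than overflow */
        memcpy(e->hdr, hdr, hdr_len);
        e->hdr_len = (uint8_t)hdr_len;
        e->has_hdr = true;
    }
    r->next = (r->next + 1) % BB_RING_SLOTS;
    if (r->count < BB_RING_SLOTS)
        r->count++;
}

/* Return the i-th oldest event (0 is the oldest), or NULL if out of range. */
static const struct bb_event *
bb_ring_get(const struct bb_ring *r, size_t i)
{
    size_t oldest;

    if (i >= r->count)
        return (NULL);
    oldest = (r->next + BB_RING_SLOTS - r->count) % BB_RING_SLOTS;
    return (&r->slots[(oldest + i) % BB_RING_SLOTS]);
}
```

With four slots, recording six events leaves events three through six in the ring; walking bb_ring_get() from index 0 upward returns them oldest first, which is the order a dump would want.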

It supports setting a log ID on a TCP connection and using this to correlate multiple connections that share a common log ID.

You can program the system to put connections in different modes. If we are doing a coordinated test with a particular connection, we may tell the system to put it in mode 4 (continuous dump). Or, if we just want to monitor for errors, we can put it in mode 1 (ring buffer) and dump all the ring buffers associated with the log ID when we receive an error signal for that log ID. You can set a default mode that will be applied to a configurable ratio of incoming connections. You can also manually set a mode using a socket option.
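The mode selection described above can be sketched roughly as follows. This is an illustrative user-space model, not the kernel code from this revision. Only the mode numbers 1 (ring buffer) and 4 (continuous dump) come from the summary; the value 0 for "off", the "one in every N connections" interpretation of the ratio, and all names (bb_policy, bb_conn, bb_conn_init, bb_conn_set_mode) are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical names. Only modes 1 and 4 are named in the summary;
 * treating 0 as "off" is an assumption for this sketch.
 */
enum bb_mode {
    BB_MODE_OFF = 0,         /* assumption: no logging */
    BB_MODE_RING = 1,        /* ring buffer, dumped on demand */
    BB_MODE_CONTINUOUS = 4   /* continuous dump */
};

struct bb_policy {
    enum bb_mode default_mode;  /* mode given to sampled connections */
    uint32_t one_in;            /* sample 1 of every one_in connections; 0 disables */
    uint32_t seen;              /* incoming connections counted so far */
};

struct bb_conn {
    enum bb_mode mode;
};

/* Called once per new incoming connection: apply the default mode to a sample. */
static void
bb_conn_init(struct bb_policy *p, struct bb_conn *c)
{
    c->mode = BB_MODE_OFF;
    if (p->one_in == 0)
        return;
    p->seen++;
    if (p->seen % p->one_in == 0)
        c->mode = p->default_mode;
}

/*
 * Manual override, standing in for the socket option mentioned above.
 * Returns false and leaves the mode unchanged for values this sketch
 * does not know about.
 */
static bool
bb_conn_set_mode(struct bb_conn *c, int mode)
{
    switch (mode) {
    case BB_MODE_OFF:
    case BB_MODE_RING:
    case BB_MODE_CONTINUOUS:
        c->mode = (enum bb_mode)mode;
        return (true);
    default:
        return (false);
    }
}
```

With one_in set to 3 and a default of ring-buffer mode, the third and sixth incoming connections are put in mode 1 and the rest stay off; a coordinated test could then move one specific connection to mode 4 through the override.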

Also, this is a fairly simplistic example, since it only provides the most basic of probes. @rrs has added quite an abundance of probes in his work.

There is a user-space program which we plan to commit as a port. It reads data from the log device and outputs pcapng files. (Write me for a copy of the user-space program.)

Test Plan

This has been widely used in development and production at a large content provider.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
jtl created this revision. Jun 7 2017, 10:28 PM
swills added a subscriber: swills. Jun 8 2017, 2:26 PM
kbowling added inline comments. Jun 16 2017, 2:26 AM
sys/netinet/tcp_var.h
663 ↗(On Diff #29304)

Also need in xtcpcb:

	int32_t		t_logstate;		/* (s) */

and decrease spares by 1

kbowling added inline comments. Jun 16 2017, 4:20 PM
sys/dev/tcp_log/tcp_log_dev.c
469 ↗(On Diff #29304)

In another code dump, you initialized wakeup_needed to false here.

gnn accepted this revision. Jun 21 2017, 8:29 PM
This revision is now accepted and ready to land. Jun 21 2017, 8:29 PM
jtl updated this revision to Diff 40491. Mar 20 2018, 10:53 AM

Update to the latest Netflix sources.

Add TCP stack IDs and use them in the black box source.

Fix nits caught by @kevin.bowling_kev009.com during the review.

This revision now requires review to proceed. Mar 20 2018, 10:53 AM
jtl updated this revision to Diff 40492. Mar 20 2018, 11:02 AM

Update t_logstate handling in struct xtcpcb.

(It is used by 3rd parties, and it would help if we actually set it.)

jtl added a comment. Mar 20 2018, 11:02 AM

FYI, planning to commit in ~28 hours (and after a tinderbox build). If you have concerns, speak now.

jtl updated this revision to Diff 40493. Mar 20 2018, 11:13 AM

Copyright/SPDX updates.

jtl updated this revision to Diff 40548. Mar 21 2018, 4:02 PM

Fix compilation errors on various architectures found by tinderbox.

This revision was not accepted when it landed; it landed in state Needs Review. Mar 22 2018, 9:40 AM
This revision was automatically updated to reflect the committed changes.