Details

Reviewers

gallatin
tuexen

Group Reviewers

transport

Commits

rS351951: This adds in the missing counter initialization which
rS351934: This adds the final tweaks to LRO that will now allow me

Summary

BBR requires precise timing information to correctly handle delays in LRO. This
is obtained with a new option to LRO where in a list of mbufs is passed up
to LRO. This hopefully is the final preparation patch for my next commit which
will bring BBRv1 into the FreeBSD tree yeah!

Test Plan

We have tested this extensively at NF running in production now for close to
1 Year :)

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

rrs created this revision.Aug 1 2019, 2:33 PM

rrs added reviewers: gallatin, tuexen, transport.

The LRO patch was missing data and length limits for input processing via
MBUF_QUEUE methods. This update adds that and an elaborate set of
comments to the rack_bbr_common.c module describing in detail what
a transport designer using MBUF_QUEUEING needs to contemplate.

One more update, mainly comments but we also fix a couple of things

The timestamp provided via the ctf common functions should be the real time not something from the mbuf, we let the transports look at those flags (which both BBR and the latest Rack do).
We also for LRO add a gating of the number of acks (currently set to infinity) that can also cause a wakeup.

gallatin accepted this revision.Aug 21 2019, 12:42 PM

gallatin added inline comments.

netinet/tcp_lro.c
862 ↗	(On Diff #60829)	Would it make sense to have a sysctl to disable the in_pcblookup? Eg, to just assume the lookup will tell you that can_queue is going to be 0. In fact, I'd further suggest that when you load a TCP stack, it is marked with whether or not it supports queued lro, and when a TCP stack is loaded or unloaded, a global "tcp_stack_can_queue" is recalculated, base on the OR of all loaded TCP stacks's values for supporting queued lro. So you'd have something like: if (!queued_lro_forbidden && tcp_stack_can_queue_lro) { /* do in_pcblookups .. ? } else { / can_queue = 0 */ }

This revision is now accepted and ready to land.Aug 21 2019, 12:42 PM

rrs added inline comments.Sep 3 2019, 11:38 AM

netinet/tcp_lro.c
862 ↗	(On Diff #60829)	Yeah thats a great idea, I think we could add a "tcp_lro_register()/tcp_lro_unregister()" that a stack could do to increment/decrement an atomic that says we are going to use the MBUF queue mechanism. Then based on the atomic skip the lookup and go right to the compress methods (assuming no one as registered for it). I will spin that up and update the patch.

This adds a simple registration mechanism that stacks can use as they successfully register/deregister to
make the LRO code aware if any stack wants to use mbuf-queueing per Drew's suggestion.

This revision now requires review to proceed.Sep 3 2019, 11:54 AM

First version was missing a (void) in the registration function.. opps

This revision was not accepted when it landed; it landed in state Needs Review.Sep 6 2019, 2:25 PM

Closed by commit rS351934: This adds the final tweaks to LRO that will now allow me (authored by rrs). · Explain Why

This revision was automatically updated to reflect the committed changes.

rrs added a commit: rS351934: This adds the final tweaks to LRO that will now allow me.

Herald added 1 blocking reviewer(s): transport. · View Herald TranscriptSep 6 2019, 2:25 PM

Herald added a subscriber: imp. · View Herald Transcript

This crashes for me on boot.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff807a0c9e
stack pointer = 0x28:0xfffffe00d39fe870
frame pointer = 0x28:0xfffffe00d39fe890
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (if_io_tqg_0)
trap number = 12
panic: page fault
cpuid = 0
time = 1567793282
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d39fe530
vpanic() at vpanic+0x19d/frame 0xfffffe00d39fe580
panic() at panic+0x43/frame 0xfffffe00d39fe5e0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00d39fe640
trap_pfault() at trap_pfault+0x62/frame 0xfffffe00d39fe690
trap() at trap+0x2b2/frame 0xfffffe00d39fe7a0
calltrap() at calltrap+0x8/frame 0xfffffe00d39fe7a0

trap 0xc, rip = 0xffffffff807a0c9e, rsp = 0xfffffe00d39fe870, rbp = 0xfffffe00d39fe890 ---

tcp_lro_flush() at tcp_lro_flush+0x2e/frame 0xfffffe00d39fe890
tcp_lro_rx_done() at tcp_lro_rx_done+0x82/frame 0xfffffe00d39fe8b0
tcp_lro_flush_all() at tcp_lro_flush_all+0x9c/frame 0xfffffe00d39fe8f0
_task_fn_rx() at _task_fn_rx+0xd22/frame 0xfffffe00d39fe9f0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0xf9/frame 0xfffffe00d39fea40
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe00d39fea70
fork_exit() at fork_exit+0x84/frame 0xfffffe00d39feab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00d39feab0

trap 0, rip = 0, rsp = 0, rbp = 0 ---

KDB: enter: panic
[ thread pid 0 tid 100050 ]
Stopped at kdb_enter+0x3b: movq $0,kdb_why

igb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=e527bb<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,TSO4,TSO6,LRO,WOL_MAGIC,VLAN_HWFILTER,VLAN_HWTSO,RXCSUM_IPV6,TXCSUM_IPV6>
	ether 00:25:90:ca:6b:b4
	inet 192.168.5.154 netmask 0xffffff00 broadcast 192.168.5.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
	nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>

the box in question is lynx4 in zoo, I can give you access if needed.

Looks like it's counters which don't get allocated.

addr2line -e kernel.debug 0xffffffff807a0c9e
/tank/users/mjg/src/freebsd/sys/amd64/include/counter.h:85

ffffffff807a0c70 <tcp_lro_flush>:
ffffffff807a0c70:       55                      push   %rbp
ffffffff807a0c71:       48 89 e5                mov    %rsp,%rbp
ffffffff807a0c74:       41 57                   push   %r15
ffffffff807a0c76:       41 56                   push   %r14
ffffffff807a0c78:       53                      push   %rbx
ffffffff807a0c79:       50                      push   %rax
ffffffff807a0c7a:       49 89 f7                mov    %rsi,%r15
ffffffff807a0c7d:       49 89 fe                mov    %rdi,%r14
ffffffff807a0c80:       48 8b 46 20             mov    0x20(%rsi),%rax
ffffffff807a0c84:       48 3b 46 30             cmp    0x30(%rsi),%rax
ffffffff807a0c88:       0f 84 b5 02 00 00       je     ffffffff807a0f43 <tcp_lro_flush+0x2d3>
ffffffff807a0c8e:       48 8b 04 25 38 28 00    mov    0xffffffff81002838,%rax
ffffffff807a0c95:       81 
ffffffff807a0c96:       48 2b 04 25 98 2e 02    sub    0xffffffff81022e98,%rax
ffffffff807a0c9d:       81 
ffffffff807a0c9e:       65 48 83 00 01          addq   $0x1,%gs:(%rax)

The counter is tcp_inp_lro_compressed

This revision was not accepted when it landed; it landed in state Needs Review.Sep 6 2019, 6:29 PM

Closed by commit rS351951: This adds in the missing counter initialization which (authored by rrs). · Explain Why

This revision was automatically updated to reflect the committed changes.

rrs added a commit: rS351951: This adds in the missing counter initialization which.

Final prep patch for BBR
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 61751

head/sys/netinet/tcp_subr.c

head/sys/netinet/tcp_var.h

Final prep patch for BBRClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 61751

head/sys/netinet/tcp_subr.c

head/sys/netinet/tcp_var.h

Final prep patch for BBR
ClosedPublic
Actions

Revision Contents
Changeset List