Details

Reviewers

jhb
• hselasky
rrs
jtl
np
scottl

Commits

rS359474: KTLS: Coalesce adjacent TLS trailers & headers to improve PCIe bus efficiency

Summary

KTLS uses the embedded header and trailer fields of unmapped mbufs. This can lead to "silly" buffer lengths, where we have an mbuf chain that will create a scatter/gather list of the following lengths (ignoring offsets in the the pages for simplicity) when using AES-GCM:

13..4096..4096..4096..4096..16..13..4096..4096..4096..4096..16..13..4096..4096..4096..16..13..4096..4096..4096.......

Notice how we have 13 bytes followed by 16 bytes between each adjacent TLS record.

For software ktls we typically wind up with a pattern where we have several TLS records encrypted, and made ready at once. When these records are made ready, we can coalesce these silly buffers in sbready_compress by copying 13b TLS header of the next record into the 16b TLS trailer of the current record. After doing so, we have:

13..4096..4096..4096..4096..29..4096..4096..4096..4096..29..4096..4096..4096..29..4096..4096..4096.......

This marginally increases PCIe bus efficiency. We've seen an almost 1Gb/s increase in peak throughput on Broadwell based Xeons running a 100% software TLS workload with Mellanox ConnectX-4 NICs.

Note that this change is ifdef'ed for KTLS, as KTLS is currently the only user of the hdr/trailer feature of unmapped mbufs, and peeking into them is expensive, since the ext_pgs struct lives in separately allocated memory, and may be cold in cache.

This optimization is not applicable to HW ("NIC") TLS, as that depends on having the entire TLS record described by a single unmapped mbuf, so we cannot shift parts of the record between mbufs for HW TLS.

Test Plan

Monitor the sizes passed to bounce_bus_dmmap_load_buffer() and observe peaks at 13b and 16b be transformed to 29b using the following dtrace

dtrace -n 'fbt::bounce_bus_dmamap_load_buffer:entry{@=lquantize(arg3,12,30,1);}'

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 30191

Event Timeline

gallatin created this revision.Mar 27 2020, 3:14 PM

Herald added a subscriber: imp. · View Herald TranscriptMar 27 2020, 3:14 PM

Harbormaster completed remote builds in B30136: Diff 69921.Mar 27 2020, 3:14 PM

gallatin added reviewers: jhb, • hselasky, rrs, jtl.Mar 27 2020, 3:15 PM

gallatin added a reviewer: np.

gallatin added a reviewer: scottl.Mar 27 2020, 7:45 PM

Looks really good.

sys/kern/uipc_sockbuf.c
109	You could move these declarations inside the `if..` loop, to add a little bit of compactness.
161–162	The only thing I'd mildly recommend is noting in this comment that `n` is assigned higher up (or maybe break this comment up and one part of it up). It's far enough up that adding some context would be helpful to the reader.

This revision is now accepted and ready to land.Mar 28 2020, 6:38 PM

Change looks good.

One comment:
Style is bcopy() and not memcpy() ?

Remove ifdef KERN_TLS from locals' declarations and move those into the ifdef'ed block as suggested by scottl
Add comment regarding where n comes from as suggested by scottl

This revision now requires review to proceed.Mar 30 2020, 3:41 PM

Harbormaster completed remote builds in B30183: Diff 70001.Mar 30 2020, 3:41 PM

switch from bcopy to memcpy as suggested by Hans

Harbormaster completed remote builds in B30185: Diff 70003.Mar 30 2020, 3:43 PM

In D24204#532903, @hselasky wrote:

Change looks good.

One comment:
Style is bcopy() and not memcpy() ?

I'm just old-fashioned.. I changed it to memcy.

• hselasky accepted this revision.Mar 30 2020, 4:05 PM

This revision is now accepted and ready to land.Mar 30 2020, 4:05 PM

jhb accepted this revision.Mar 30 2020, 4:48 PM

jhb added inline comments.

sys/kern/uipc_sockbuf.c
117	Style nit to format the new comments as sentences (e.g. "Try ... trailers.")
161–162	I would actually move this entire comment block up with the assignment to 'n' since it is true for the new block as well.