Page MenuHomeFreeBSD

netlink: Fix overallocation of netlink message buffers
Needs ReviewPublic

Authored by jhb on Tue, Nov 11, 7:58 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Nov 19, 9:15 AM
Unknown Object (File)
Wed, Nov 19, 9:14 AM
Unknown Object (File)
Sat, Nov 15, 8:09 PM
Unknown Object (File)
Sat, Nov 15, 12:39 AM
Unknown Object (File)
Wed, Nov 12, 3:57 AM
Unknown Object (File)
Wed, Nov 12, 3:39 AM
Unknown Object (File)
Tue, Nov 11, 10:42 PM
Unknown Object (File)
Tue, Nov 11, 10:04 PM

Details

Reviewers
kp
igoro
melifaro
Summary

Prior to commit 0c511bafdd5b309505c13c8dc7c6816686d1e103, each time
snl_realloc_msg_buffer was called, it called snl_allocz to request a
new buffer. If an existing linear buffer was used, then after the
call, the linear buffer effectively contained the old buffer contents
followed by the new buffer (so there was definitely wasted memory),
but the linear buffer state was consistent (lb->offset correctly
accounted for both copies). For example, if the initial linear buffer
was 256 bytes in size, lb->size would be 256. Using 16 bytes followed
by 32 bytes would first set lb->offset to 16, then the second realloc
would allocate 48 bytes (16 + 32) setting lb->offset to 64 (16 + 48).

Commit 0c511bafdd5b309505c13c8dc7c6816686d1e103 aimed to avoid this
memory waste by resetting the base pointer to the start of the
existing linear buffer if the new allocation was later in the same
linear buffer. This avoided some of the waste, but actually broke the
accounting. Using the same example above, the second realloc would
reuse the pointer at an offset of 0, but the linear buffer would still
claim that 64 bytes was allocated via lb->offset rather than the true
allocation of 48 bytes.

One approach would be to add some code to "extend" the allocation of
an existing linear buffer (the #if 0'd code in the diff) where a
realloc would try to increase lb->offset without setting a new base
pointer so long as there was still room remaining in the linear buffer
for the new request.

However, this change takes a simpler tack. If snl_allocz() returned
an allocation from a new linear buffer, just claim the entire linear
buffer for use by the snl_writer ensuring the accounting is correct in
both the linear buffer and the snl writer. With this approach, the
initial snl_writer size would be 256 bytes for a 256 byte linear
buffer and would only grow if it needs to allocate an entirely new
linear buffer.

Fixes: 0c511bafdd5b ("netlink: fix snl_writer and linear_buffer re-allocation logic")
Sponsored by: AFRL, DARPA

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 68766
Build 65649: arc lint + arc unit

Event Timeline

jhb requested review of this revision.Tue, Nov 11, 7:58 PM

Functionally, this is equivalent to setting snl_writer.size() to the equivalent of malloc_usable_size of the current linear buffer each time, and then you know that each time you grow, you are always going to allocate a new linear buffer as you have fully consumed the current one.

The only thing that would be nicer would be to free the old linear buffer instead. Arguably, this code should not be using a linear buffer at all in snl_writer and just be using realloc() along with a memset() to 0 of the new region. That would probably be a lot simpler and more optimal, but that is a bigger change to make.

sys/netlink/netlink_snl.h
1094–1095

The #if 0 code is here to show what that type of fix would look like, but I will not include it in the actual thing merged (and would tweak the commit log as well).

1100–1105

Possibly we should assert that this is true instead. The current code assumes that there is never any other data in the linear-buffer chain so that it is always ok to jump back to the start of the current linear buffer.

Now I remember that the latest fix's key goal was to add missing nw->size maintenance not to realloc again-and-again when it's actually not needed, i.e. after each realloc we still think that nw->size is only 256. And the motivation was to avoid overlapping buffers appeared to be there as a consequence of missing nw->size maintenance.

In addition it provided a little optimization (yeah, could be a separate commit) to avoid extra copying each time we need to grow the "window" (nw) in case if we are staying at the beginning of the same linear_buffer, i.e. it was previously re-allocated for us. Maybe it's a risky assumption that the following area is also for us and there is nothing in between, i.e. snl users are not expected to have multiple writers at the same time. Even if snl users seem to be like that today.

Anyway, avoiding extra copy still leaves a room for further improvement, because by design we still allocate more than actually needed and that's why we have lb->offset far away from our needs. Probably, we do not have an "extend" mechanism for reasons, anyway it makes me think that the assumption of linear consumption of a buffer by the same writer should be avoided.

And now I see this patch as that further improvement, when even if by design we still may "eat" more space than needed (as lb grows faster by doubling 1024 vs. nw's 256 + lb's header might trigger extra doubling) the extra will be smaller as nw growing within the same buffer is not an action anymore, and the important improvement, as for me, is to avoid that assumption.

Yes, some "extend" could be considered in the future, but it feels like getting back to the same assumption what might need extra logic to be ready for other cases. The proposed solution is much simpler and still keeps an opportunity for multiple writers (I have no use case examples in my head right now, I just like the idea not to limit the possibilities).

This revision is now accepted and ready to land.Tue, Nov 11, 11:54 PM

@igoro would you be able to test this on your workload (armv7) to ensure it still does the correct thing?

jrtc27 added inline comments.
sys/netlink/netlink_snl.h
1101

Hoist declaration to start of block?

In D53697#1226388, @jhb wrote:

@igoro would you be able to test this on your workload (armv7) to ensure it still does the correct thing?

VM-based armv7 tests on my side passed. I believe it's enough as pfctl was unusable before the fix (after switching to Netlink).
It's up to @kp whether it needs additional run on actual armv7-based appliance.

In D53697#1226388, @jhb wrote:

@igoro would you be able to test this on your workload (armv7) to ensure it still does the correct thing?

VM-based armv7 tests on my side passed. I believe it's enough as pfctl was unusable before the fix (after switching to Netlink).
It's up to @kp whether it needs additional run on actual armv7-based appliance.

Unfortunately I don't have easy access to armv7 hardware. Fortunately it's not critical for us.

This revision now requires review to proceed.Thu, Nov 20, 2:45 PM