Page MenuHomeFreeBSD

netlink: Don't overwrite existing data in a linear buffer in snl_writer
ClosedPublic

Authored by jhb on Mon, Dec 8, 10:10 PM.
Tags
None
Referenced Files
F139425263: D54148.id167750.diff
Thu, Dec 11, 10:50 PM
Unknown Object (File)
Wed, Dec 10, 2:09 PM
Unknown Object (File)
Wed, Dec 10, 2:02 PM
Unknown Object (File)
Wed, Dec 10, 8:37 AM
Unknown Object (File)
Tue, Dec 9, 9:17 AM
Unknown Object (File)
Tue, Dec 9, 8:46 AM
Unknown Object (File)
Tue, Dec 9, 8:33 AM
Unknown Object (File)
Tue, Dec 9, 6:53 AM

Details

Summary

First, a bit of background on some of the data structures netlink uses
to manage data associated with a netlink connection.

  • struct linear_buffer contains a single virtually-contiguous buffer of bytes. Regions of this buffer are suballocated via lb_allocz() which uses a simple "bump" where the buffer is split into an allocated region at the start and a free region at the end. Each allocation "bumps" the boundary (lb->offset) forward by the allocation size.

    Individual allocations are not freed. Instead, the entire buffer is freed once all of the allocations are no longer in use.

    Linear buffers also contain an embedded link to permit chaining buffers together.
  • snl_state contains various state for a netlink connection including a chain of linear buffers. This chain of linear buffers can contain allocations for netlink messages as well as other ancillary data buffers such as socket address structures. The chain of linear buffers are freed once the connection is torn down.
  • snl_writer is used to construct a message written on a netlink connection. It contains a single virtually-contiguous buffer (nw->base) allocated from the associated snl_state's linear buffer chain. The buffer distinguishes between the amount of space reserved from the underlying allocator (nw->size) and the current message length actually written (nw->offset). As new chunks of data (e.g. netlink attributes) are added to the write buffer, the buffer is grown by snl_realloc_msg_buffer by reallocating a larger buffer from the associated snl_state and copying over the current message data to the new buffer.

Commit 0c511bafdd5b309505c13c8dc7c6816686d1e103 aimed to fix two bugs
in snl_realloc_msg_buffer.

The first bug is that snl_realloc_msg_buffer originally failed to
update nw->size after growing the buffer which could result in
spurious re-allocations when growing in the future. It also probably
could eventually lead to overflowing the buffer since each
reallocation request was just adding the new bytes needed for a chunk
to the original 'nw->size' while 'nw->offset' kept growing.
Eventually the new 'nw->offset' would be larger than 'nw->size + sz'
causing routines like snl_reserve_msg_data_raw() to return an
out-of-bounds pointer.

The second change in this commit I think was trying to fix the buffer
overflows due to 'nw->size' being wrong, but instead introduced a new
set of bugs. The second change ignored the returned pointer from
snl_allocz() and instead assumed it could use all of the
currently-allocated data in the current linear buffer. This is only
ok if the only data in the linear buffer chain for the associated
snl_state is the snl_writer's message buffer. If there is any other
data allocated from the snl_state, it could be earlier in the current
linear buffer, so resetting new_base to nw->ss->lb->base can result in
overwriting that other data. The second change was also
over-allocating storage from the underlying chain of linear buffers
(e.g. a writer allocation of 256 followed by 512 would end up using
the first 512 bytes, but 768 bytes would be reserved in the underlying
linear buffer).

To fix, revert the second change keeping only the fix for 'nw->size'
being wrong.

Fixes: 0c511bafdd5b ("netlink: fix snl_writer and linear_buffer re-allocation logic")
Sponsored by: AFRL, DARPA

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jhb requested review of this revision.Mon, Dec 8, 10:10 PM

After reverting my earlier fix (which I don't intend to MFC and by which I was distracted by the "optimization" of reusing the existing buffer), this fixes pfctl for me on CHERI. In effect, it makes Igor's original commit just add the missing 'nw->size = new_size'. I think that probably explains both of the issues Igor noted in the original commit log. The buffer overflows were caused by nw->offset exceeding nw->size and eventually the realloc'd buffer was simply too small as it was only 'nw->size + <size of current object needed>' which could easily be smaller than 'nw->offset' once you've appended a few things larger than the initial buffer size.

I do have a separate change to add an optimization to permit extending the current buffer in the lb if it is at the end of the chain, but I figured that belongs in a separate commit (and it will need a fix for CHERI whereas this does not).

LGTM. That's an important point regarding the other data allocated from the snl_state.

This revision is now accepted and ready to land.Tue, Dec 9, 9:34 AM

What's this based on? It doesn't seem to want to apply to FreeBSD main (f9500e75791cf793904c80ca4a52433afd585a23).

I manually applied it to my cheri tree, and that does indeed work.

In D54148#1236974, @kp wrote:

What's this based on? It doesn't seem to want to apply to FreeBSD main (f9500e75791cf793904c80ca4a52433afd585a23).

I manually applied it to my cheri tree, and that does indeed work.

I have reverted my earlier patch first before this (I just didn't post the revert for review). So revert commit 828df4d36d9d5a6ca0dcc294d65572b4a0474142. in main first and then this should apply fine. The effective diff between the code before Igor's first commit and the end result of this change for this function is (note there was a commit after Igor's to fix the build with C++ that I needed to maintain hence some of the diff):

@@ -1029,6 +1083,7 @@ static inline bool
 snl_realloc_msg_buffer(struct snl_writer *nw, size_t sz)
 {
        uint32_t new_size = nw->size * 2;
+       char *new_base;
 
        while (new_size < nw->size + sz)
                new_size *= 2;
@@ -1036,7 +1091,7 @@ snl_realloc_msg_buffer(struct snl_writer *nw, size_t sz)
        if (nw->error)
                return (false);
 
-       void *new_base = snl_allocz(nw->ss, new_size);
+       new_base = snl_allocz(nw->ss, new_size);
        if (new_base == NULL) {
                nw->error = true;
                return (false);
@@ -1046,9 +1101,10 @@ snl_realloc_msg_buffer(struct snl_writer *nw, size_t sz)
        if (nw->hdr != NULL) {
                int hdr_off = (char *)(nw->hdr) - nw->base;
 
-               nw->hdr = (struct nlmsghdr *)(void *)((char *)new_base + hdr_off);
+               nw->hdr = (struct nlmsghdr *)(void *)(new_base + hdr_off);
        }
        nw->base = new_base;
+       nw->size = new_size;
 
        return (true);
 }

Igor, are you able to test this again on your armv7 setup btw?

Ah, thanks. With the original patch reverted this applies and works as expected.
I'm not all that familiar with this code, but it works and I don't see any obvious problems (and it addresses the problem CHERI found, being that we used more than 'new_size' from 'new_base'.)

In D54148#1237012, @jhb wrote:

Igor, are you able to test this again on your armv7 setup btw?

My testing passed. Now it's clear that my attempt to avoid extra copying on each reallocation was based on an assumption that isn't always true. It's a good plan to get back to always copying on each reallocation, as it was before, and think about an actual "buffer extension" mechanism that could be triggered when applicable.

This makes sense to me.

I applied this patch to cheribsd and don't see any test failures related to netlink, at least seemingly.

In D54148#1237012, @jhb wrote:

Igor, are you able to test this again on your armv7 setup btw?

My testing passed. Now it's clear that my attempt to avoid extra copying on each reallocation was based on an assumption that isn't always true. It's a good plan to get back to always copying on each reallocation, as it was before, and think about an actual "buffer extension" mechanism that could be triggered when applicable.

I have a patch to do extension I can post for review and testing.