Page MenuHomeFreeBSD

x86 busdma_bounce: do not make assumptions about alignment of malloc(9) results.
AbandonedPublic

Authored by kib on Jan 11 2021, 11:46 PM.
Tags
None
Referenced Files
F103550371: D28108.id82215.diff
Tue, Nov 26, 10:08 AM
Unknown Object (File)
Sat, Nov 23, 4:32 AM
Unknown Object (File)
Fri, Nov 22, 8:59 AM
Unknown Object (File)
Fri, Nov 22, 6:10 AM
Unknown Object (File)
Fri, Nov 22, 3:27 AM
Unknown Object (File)
Tue, Nov 19, 12:33 AM
Unknown Object (File)
Mon, Nov 18, 9:52 PM
Unknown Object (File)
Sun, Nov 17, 10:54 AM
Subscribers

Details

Summary

Add malloc_domainset_aligned() that force-align pointer-aligned result from malloc, copied from the rtld code.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Jan 11 2021, 11:46 PM
kib created this revision.
This revision is now accepted and ready to land.Jan 12 2021, 12:02 AM

Tested this, and the "bus_dmamem_alloc failed to align memory properly" messages went away!

kib edited the summary of this revision. (Show Details)
kib added reviewers: andrew, markj.

Basically, previous version worked because malloc(9) returns result aligned on the zone size, assuming the used zone size is power of two. So there were no realignment of result and free(9) was passed the original pointer.

This can be codified in malloc_aligned() directly, but I highly dislike it. Instead I do the usual trick of storing pointer to the original allocation right before the returned memory.

sys/kern/kern_malloc.c
778 ↗(On Diff #82215)

I'm probably just missing something here, but why does the padding need to be '3 * align' instead of 'sizeof(void*) + align'?

kib marked an inline comment as done.

Lower amount of wasted memory.

sys/kern/kern_malloc.c
785 ↗(On Diff #82237)

Nit: could a simple store be used here instead of memcpy()? Since align must be at least sizeof(void*) , I don't think you'll end up with unaligned access (for the architectures that care about that sort of thing).

sys/kern/kern_malloc.c
785 ↗(On Diff #82237)

I remember from the rtld patch that a compiler does not know about alignment of the pointer.

sys/kern/kern_malloc.c
785 ↗(On Diff #82237)

I remember from the rtld patch that a compiler does not know about alignment of the pointer.

To clarify, are you saying that a compiler might choose to emit the store as something like a series of 1-byte stores (or maybe even a call to memcpy() since compilers these days can do that), because the pointer arithmetic here will make it believe that it can't guarantee natural alignment of the access?

sys/kern/kern_malloc.c
785 ↗(On Diff #82237)

No, I mean that compiler might issue a warning that the alignment of the casted pointer cannot be known.

This revision is now accepted and ready to land.Jan 14 2021, 1:22 AM
sys/kern/kern_malloc.c
800 ↗(On Diff #82237)

Is there any meaningful assertion that could be made here that 'mem' is within a certain range below 'addr', to catch corruption?

Another solution would be to require malloc() to return self-aligned memory when the allocation size is a power of 2 (up to the page size). This is already true because of how UMA lays out slabs, we could simply set the required alignment when creating malloc zones and add assertions. Then, malloc_aligned() would just round up the allocation size to the next power of 2 and free_aligned() is not needed. With this solution there is no wasted memory when the allocation size is a power of 2. I don't object to this patch, just proposing an alternative.

The man page should be updated as well.

sys/kern/kern_malloc.c
782 ↗(On Diff #82237)

This could be roundup2().

Store alignment in the header, and assert that mem and addr are close enough on free().

This revision now requires review to proceed.Jan 14 2021, 3:05 AM

Another solution would be to require malloc() to return self-aligned memory when the allocation size is a power of 2 (up to the page size). This is already true because of how UMA lays out slabs, we could simply set the required alignment when creating malloc zones and add assertions. Then, malloc_aligned() would just round up the allocation size to the next power of 2 and free_aligned() is not needed. With this solution there is no wasted memory when the allocation size is a power of 2. I don't object to this patch, just proposing an alternative.

Yes but malloc zones only request pointer-aligned allocation. I suspect that the only place in the kernel that depends on the undocumented feature of over-alignment is this busdma code.

Do we want to make the undocumented feature mandatory? If we ever want to change malloc underlying implementation, this requirement would either waste memory or tie hands of the new author. I assumed that being explicit there is more optimal.

kib marked an inline comment as done.

roundup2

In D28108#629024, @kib wrote:

Another solution would be to require malloc() to return self-aligned memory when the allocation size is a power of 2 (up to the page size). This is already true because of how UMA lays out slabs, we could simply set the required alignment when creating malloc zones and add assertions. Then, malloc_aligned() would just round up the allocation size to the next power of 2 and free_aligned() is not needed. With this solution there is no wasted memory when the allocation size is a power of 2. I don't object to this patch, just proposing an alternative.

Yes but malloc zones only request pointer-aligned allocation. I suspect that the only place in the kernel that depends on the undocumented feature of over-alignment is this busdma code.

I found another place by coincidence: struct aesni_session requires 16-byte alignment for key schedules. There, the allocation is handled by the generic opencrypto framework so aesni cannot easily specify the required alignment. I will post a patch soon.

Do we want to make the undocumented feature mandatory? If we ever want to change malloc underlying implementation, this requirement would either waste memory or tie hands of the new author. I assumed that being explicit there is more optimal.

I am proposing to change malloc to request self-alignment from UMA for power-of-2 zones, so the malloc_aligned() implementation can be optimized. External callers should still use malloc_aligned() if they have specific alignment requirements. Perhaps it is premature, let's see if new malloc_aligned() callers appear.

This revision is now accepted and ready to land.Jan 14 2021, 3:24 AM
sys/kern/kern_malloc.c
787 ↗(On Diff #82240)

for the INVARIANTS case, should this be '(uintptr_t)mem + sizeof(void*) + sizeof(size_t)' ?

kib marked an inline comment as done.