Details

Reviewers

markj
glebius
jeff

Commits

rS355137: uma: trash memory when ctor/dtor supplied too

Summary

On INVARIANTS kernels, UMA has a use-after-free detection mechanism.
This mechanism previously required that all of the ctor/dtor/uminit/fini
arguments to uma_zcreate() be NULL in order to function. Now, it only
requires that uminit and fini be NULL; now, the trash ctor and dtor will
be called in addition to any supplied ctor or dtor.

Also do a little refactoring for readability of the resulting logic.

This enables use-after-free detection for more zones, and will allow for
simplification of some callers that worked around the previous
restriction (see kern_mbuf.c).

Test Plan

kyua test -k /usr/tests/sys/Kyuafile

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 27806
Build 25986: arc lint + arc unit

Event Timeline

rlibby created this revision.Jun 22 2019, 3:31 AM

Harbormaster completed remote builds in B24979: Diff 58887.Jun 22 2019, 3:31 AM

I don't quite understand how this can work in all cases. Consider vm_radix_node_zone for instance. When a radix node is freed, we expect all of its child pointers to have been cleared, and INVARIANTS kernels we define a dtor which verifies that. Then, when allocating a radix node we do not pass M_ZERO because we expect that the child pointers are already zeroed. With this change I think we will return trashed nodes, which violates the assumptions of the vm radix code.

sys/vm/uma_core.c
2309–2310	This comment is a bit stale now.

In D20722#450529, @markj wrote:

I don't quite understand how this can work in all cases. Consider vm_radix_node_zone for instance. When a radix node is freed, we expect all of its child pointers to have been cleared, and INVARIANTS kernels we define a dtor which verifies that. Then, when allocating a radix node we do not pass M_ZERO because we expect that the child pointers are already zeroed. With this change I think we will return trashed nodes, which violates the assumptions of the vm radix code.

Apologies, I will check this specific example when I get back to my machine this evening, but a generic response based on my understanding:

It's not legal to expect the allocated memory to be zeroed without either M_ZERO or an init (uminit) or ctor procedure that actually zeros it. If a uminit is supplied, the automatic trash ctor/dtor shouldn't run. If just a ctor (and/or dtor) is supplied, it runs after the trash ctor (and the dtor runs before the trash dtor). If M_ZERO is supplied, it also runs after the trash ctor.

For what it's worth, I would also add that I have been running this code on a GENERIC kernel without having noticed a bug like the one described above.

I'll investigate in more depth this evening.

In D20722#450544, @rlibby wrote:

In D20722#450529, @markj wrote:

I don't quite understand how this can work in all cases. Consider vm_radix_node_zone for instance. When a radix node is freed, we expect all of its child pointers to have been cleared, and INVARIANTS kernels we define a dtor which verifies that. Then, when allocating a radix node we do not pass M_ZERO because we expect that the child pointers are already zeroed. With this change I think we will return trashed nodes, which violates the assumptions of the vm radix code.

Apologies, I will check this specific example when I get back to my machine this evening, but a generic response based on my understanding:

It's not legal to expect the allocated memory to be zeroed without either M_ZERO or an init (uminit) or ctor procedure that actually zeros it. If a uminit is supplied, the automatic trash ctor/dtor shouldn't run. If just a ctor (and/or dtor) is supplied, it runs after the trash ctor (and the dtor runs before the trash dtor). If M_ZERO is supplied, it also runs after the trash ctor.

For what it's worth, I would also add that I have been running this code on a GENERIC kernel without having noticed a bug like the one described above.

I'll investigate in more depth this evening.

Ah, sorry, I missed that the radix node zone uses an init function that zeroes the node.

Thanks for the explanation. With that the change makes sense to me, and the cleanup is nice too.

This revision is now accepted and ready to land.Jun 30 2019, 6:31 PM

Ryan, are you planning to commit this at some point?

In D20722#488420, @markj wrote:

Ryan, are you planning to commit this at some point?

Sorry, got busy for a bit. Yes, let me make sure test and make sure it is still current, and then I'll try to get it in by this weekend.

In D20722#488436, @rlibby wrote:

In D20722#488420, @markj wrote:

Ryan, are you planning to commit this at some point?

Sorry, got busy for a bit. Yes, let me make sure test and make sure it is still current, and then I'll try to get it in by this weekend.

Thanks! I ask only because I was looking at some UMA code today and started to try and clean up the ifdefs, only to remember that you had already done that. :)

Rebase and fixup comment.

This revision now requires review to proceed.Nov 23 2019, 5:41 AM

Harbormaster completed remote builds in B27723: Diff 64753.Nov 23 2019, 5:41 AM

rlibby mentioned this in D22470: Partial refactoring of uma_zalloc_arg()..Nov 23 2019, 8:55 PM

Steal from jeffr's D22470 & D22491:

Match procedure signatures, s/uma_zctor/item_ctor/ etc.
Incorporate cleanup on constructor failure.

Harbormaster completed remote builds in B27731: Diff 64787.Nov 23 2019, 11:38 PM

jeff accepted this revision.Nov 26 2019, 7:39 PM

jeff added inline comments.

sys/vm/uma_core.c
2319–2320	I have recently tightened the semantics and use of OFFPAGE zones. They should now only be used when the memory can not be directly touched by the allocator. For example, they don't assume vtoslab() will work. We should probably skip them here. I will also need to remember to disable this for SMR zones.

This revision is now accepted and ready to land.Nov 26 2019, 7:39 PM

rlibby added inline comments.Nov 26 2019, 8:54 PM

sys/vm/uma_core.c
2319–2320	Okay. Would you like that now, or in follow up? I have a couple questions: Re OFFPAGE, it seems we may also just go OFFPAGE for size or fragmentation reasons. Do we need to avoid trashing the memory for those cases too? I'd prefer to continue trashing there if possible. Is there any narrower distinction for when we can't touch the memory? While we're on the subject, I'm not sure I understand why we can't trash for NOFREE. Is it that this is a proxy for client may be doing type-stable reads-after-free?

jeff added inline comments.Nov 26 2019, 8:57 PM

sys/vm/uma_core.c
2319–2320	Right sorry, I forgot one caveat. OFFPAGE coming from the API means don't touch the memory. OFFPAGE coming internally means we moved the slab off for fragmentation. I really should make these two flags. You are right about nofree. I have an in progress patch to implement safe memory reclamation so that we can avoid that for more zones but as is it let's you keep pointers and revalidate state without risk of use after free.

Resolve conflicts after r355121.

Also, revert sending memguard addrs through item_ctor/item_dtor.
memguard doesn't want trash_ctor/trash_dtor anyway.

Patch is now much smaller.

This revision now requires review to proceed.Nov 27 2019, 4:23 PM

Harbormaster completed remote builds in B27806: Diff 64944.Nov 27 2019, 4:23 PM

rlibby added inline comments.Nov 27 2019, 4:28 PM

sys/vm/uma_core.c
2481	Whitespace here. Will fix.

markj added inline comments.Nov 27 2019, 4:37 PM

sys/vm/uma_core.c
2317	I think we can handle this simply by checking for _OFFPAGE here. If it's set here it means that the caller requested _OFFPAGE (i.e., "don't touch my memory"), as opposed to it having been set internally to avoid fragmentation.

rlibby added inline comments.Nov 27 2019, 4:40 PM

sys/vm/uma_core.c
2317	Yes, I think so too. Jeff said he planned to do some flag cleanup and asked me just to leave it for now. I added this cookie crumb so that it doesn't get missed then.