Page MenuHomeFreeBSD

Fixes for dreaded assert in jemalloc page allocator AKA mmap(MAP_ANON) providing non-zeroed pages
Needs ReviewPublic

Authored by kib on Sun, Nov 23, 9:51 AM.
Tags
None
Referenced Files
F137501764: D53891.id166990.diff
Mon, Nov 24, 12:27 AM
F137476938: D53891.id.diff
Sun, Nov 23, 8:03 PM
F137476935: D53891.id166990.diff
Sun, Nov 23, 8:03 PM
F137476904: D53891.id166987.diff
Sun, Nov 23, 8:03 PM
F137476458: D53891.diff
Sun, Nov 23, 7:56 PM
F137474748: D53891.id166990.diff
Sun, Nov 23, 7:35 PM
F137474408: D53891.diff
Sun, Nov 23, 7:32 PM
F137470259: D53891.id166987.diff
Sun, Nov 23, 6:45 PM

Details

Reviewers
alc
markj
Summary
vm_object_coalesce(): any of these conditions must prevent coalesce

When either there is more than one reference to the object, or the
object size does not exactly match the next start, or OBJ_ONEMAPPING is
not set, we cannot safely coalesce.  We might corrupt some other valid
mapping or a shadow chain doing it.

Debugging help from:    mmel


vm_object_coalesce(): simplify

Checked condition for the object size to coalesce make follow-up
re-checking of the object size redundant.  Esp., call
vm_object_page_remove() always if we coalesce.

Debugging help from:    mmel


vm_object_page_remove(): clear pager even if there is no resident pages

Swap pager might still carry the data.

Debugging help from:    mmel


vm_map_insert(): add check that coalescing does not revive stale pages

Debugging help from:    mmel

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Sun, Nov 23, 9:51 AM
kib retitled this revision from Fixes for dreaded assert in jemalloc page allocator (AKA mmap(MAP_ANON) providing non-zeroed pages) to Fixes for dreaded assert in jemalloc page allocator AKA mmap(MAP_ANON) providing non-zeroed pages.
markmi_dsl-only.net added inline comments.
sys/vm/vm_object.c
2201

Just an FYI: Between the KASSERT and the prior if we have: . . . && prev_object->size == next_pindex && . . . so the KASSERT is effectively checking next_pindex > 0 as things are here.

kib marked an inline comment as done.

Drop KASSERT() checking the layout in vm_object_coalesce(). It really outlived its usefulness.

Can you elaborate on what you saw when debugging this problem? In particular, whether the map entries had OBJ_ONEMAPPING set? In principle, the ref_count and size shouldn't be a concern if the object has OBJ_ONEMAPPING set. For example, suppose there is an anonymous mapping (with OBJ_ONEMAPPING set) for the range [A, D). Further, suppose we punch a hole in the middle of that mapping, by calling munmap() on the range [B, C) where A < B < C < D. Now, we have two mappings, [A, B) and [C, D), that both reference the original object, and that object should still have OBJ_ONEMAPPING set. Because OBJ_ONEMAPPING is set, the munmap() should have freed any physical pages and swap space from the object that fell within the range [B, C). So, if a new anonymous mapping is created starting at either B or D, we should be able to safely coalesce it.

In D53891#1231399, @alc wrote:

Can you elaborate on what you saw when debugging this problem? In particular, whether the map entries had OBJ_ONEMAPPING set? In principle, the ref_count and size shouldn't be a concern if the object has OBJ_ONEMAPPING set. For example, suppose there is an anonymous mapping (with OBJ_ONEMAPPING set) for the range [A, D). Further, suppose we punch a hole in the middle of that mapping, by calling munmap() on the range [B, C) where A < B < C < D. Now, we have two mappings, [A, B) and [C, D), that both reference the original object, and that object should still have OBJ_ONEMAPPING set. Because OBJ_ONEMAPPING is set, the munmap() should have freed any physical pages and swap space from the object that fell within the range [B, C). So, if a new anonymous mapping is created starting at either B or D, we should be able to safely coalesce it.

I did not have access to the object state there (all debugging was done remotely).

The situation you described is one of the cases that concerned me. I am not sure that we have a guarantee that doing the coalesce on the object with OBJ_ONEMAPPING flag but ref_count > 1 would not corrupt some other mapping. We need to do vm_object_page_remove(), and in principle that could remove pages which belong to other fragment.