Page MenuHomeFreeBSD

mips _pmap_unwire_ptp races MipsDoTLBMiss

Authored by on Apr 18 2020, 11:11 PM.



MipsDoTLBMiss will load a segmap entry or pde, check that it isn't zero,
and then chase that pointer to a physical page. If that page has been freed
in the interim, it will read garbage and go on to populate the TLB with it.

This can happen because pmap_unwire_ptp zeros out the pde and
vm_page_free_zero()s the ptp (or, recursively, zeros out the segmap
entry and vm_page_free_zero()s the pdp) without interlocking against
MipsDoTLBMiss. The pmap is locked, and pvh_global_lock may or may not
be held, but this is not enough. Solve this issue by inserting TLB
shootdowns within _pmap_unwire_ptp; as MipsDoTLBMiss runs with IRQs
deferred, the IPIs involved in TLB shootdown are sufficient to ensure
that MipsDoTLBMiss sees either a zero segmap entry / pde or a non-zero
entry and the pointed-to page still not freed.

Thanks to Konstantin Belousov and Mark Johnston for confirming the bug.

Diff Detail

rS FreeBSD src repository
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

I believe it is better to split the review into two. The pmap_emulate_modified() looks good to me, and I do not see why the other part should block this one. retitled this revision from Fix two races in the MIPS pmap to mips _pmap_unwire_ptp races MipsDoTLBMiss.Apr 21 2020, 6:40 PM edited the summary of this revision. (Show Details)
1751 ↗(On Diff #70851)

Should it be "segmap entry, PD, and PT"?

1858 ↗(On Diff #70851)

Couldn't this block of changes be made much smaller by simply assigning va = va_next before breaking from the loop here?

1861 ↗(On Diff #70851)

style(9) encourages having all local variable declarations at the beginning of the function.

1990 ↗(On Diff #70851)

I would suggest going through all of the pmap_unuse_pt() calls in this file and adding an explicit (void) cast to calls where we don't care about the return value. added inline comments.
1858 ↗(On Diff #70851)

I won't claim that my changes are the most elegant, so there's almost surely a better way of achieving what I wanted, but...

The proposed change attempts to defer TLB shootdowns until it is certain that it must issue one, whereas the existing code may fire off several for narrow slices of the PT being traversed (repeated triggering of line 1850) and then may discover that it did one for the whole PT as part of pte_remove_page (well, pmap_unwire_pt) anyway.

That said, I seem to have considered the end of the page being a series of !PTE_V entries but neglected the possibility that the page begins with !PTE_V entries. I've amended the patch.

This looks reasonable to me aside from the style nits.

1858 ↗(On Diff #70851)

I see, thanks.

1513 ↗(On Diff #70902)

There should be no space between the cast and function name.

1922 ↗(On Diff #70902)

I'd find the code slightly clearer if PAGE_SIZE were added here instead of at the pmap_invalidate_range() call.

This revision is now accepted and ready to land.Apr 23 2020, 2:20 PM

Note that for instance x86 has similar issue, because hardware is free to speculatively walk page tables and load any valid PTE or intermediate paging structure. There all page table entries clearing functions take the free list pointer, where the freed page table is put. Only after the caller did IPI to invalidate TLB for cleared PTE, the pages from the list can be freed. marked 2 inline comments as done.

More respose to markj feedback

This revision now requires review to proceed.Apr 24 2020, 2:40 PM
This revision was not accepted when it landed; it landed in state Needs Review.Apr 24 2020, 9:21 PM
This revision was automatically updated to reflect the committed changes.